1
Howard JD, Edmonds D, Schoenbaum G, Kahnt T. Distributed midbrain responses signal the content of positive identity prediction errors. Curr Biol 2024; 34:4240-4247.e4. PMID: 39197457; PMCID: PMC11421979; DOI: 10.1016/j.cub.2024.07.105.
Abstract
Recent work across species has shown that midbrain dopamine neurons signal not only errors in the prediction of reward value but also errors in the prediction of value-neutral sensory features. To support learning of associative structures in downstream areas, identity prediction errors (iPEs) should signal specific information about the mis-predicted outcome. Here, we used pattern-based analysis of functional magnetic resonance imaging (fMRI) data acquired during reversal learning to characterize the information content of iPE responses in the human midbrain. We find that fMRI responses to value-neutral identity errors contain information about the identity of the unexpectedly received reward (positive iPE+) but not about the identity of the omitted reward (negative iPE-). Exploratory analyses revealed representations of iPE- in the dorsomedial prefrontal cortex. These results demonstrate that ensemble midbrain responses to value-neutral identity errors convey information about the identity of unexpectedly received outcomes, which could shape the formation of novel stimulus-outcome associations that constitute cognitive maps.
Affiliation(s)
- James D Howard
- Department of Psychology, Brandeis University, Waltham, MA 02453, USA
- Donnisa Edmonds
- Department of Neurology, Northwestern University, Chicago, IL 60611, USA
- Geoffrey Schoenbaum
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD 21224, USA
- Thorsten Kahnt
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD 21224, USA
2
Gershman SJ, Assad JA, Datta SR, Linderman SW, Sabatini BL, Uchida N, Wilbrecht L. Explaining dopamine through prediction errors and beyond. Nat Neurosci 2024; 27:1645-1655. PMID: 39054370; DOI: 10.1038/s41593-024-01705-4.
Abstract
The most influential account of phasic dopamine holds that it reports reward prediction errors (RPEs). The RPE-based interpretation of dopamine signaling is, in its original form, probably too simple and fails to explain all the properties of phasic dopamine observed in behaving animals. This Perspective helps to resolve some of the conflicting interpretations of dopamine that currently exist in the literature. We focus on the following three empirical challenges to the RPE theory of dopamine: why does dopamine (1) ramp up as animals approach rewards, (2) respond to sensory and motor features and (3) influence action selection? We argue that the prediction error concept, once it has been suitably modified and generalized based on an analysis of each computational problem, answers each challenge. Nonetheless, there are a number of additional empirical findings that appear to demand fundamentally different theoretical explanations beyond encoding RPE. Therefore, looking forward, we discuss the prospects for a unifying theory that respects the diversity of dopamine signaling and function as well as the complex circuitry that both underlies and responds to dopaminergic transmission.
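The RPE computation at the center of these challenges is compact enough to state in a few lines. As a hedged sketch (the two-state cue-reward task and parameter values are illustrative assumptions, not taken from this Perspective), a tabular TD(0) learner computes delta = r + gamma * V(s') - V(s); across repeated paired trials, the error at reward time vanishes as the cue comes to predict the reward:

```python
# Minimal tabular TD(0) sketch of the RPE account discussed above:
# delta = r + gamma * V(s') - V(s). Illustrative only: the two-state
# cue-then-reward task, learning rate, and discount are assumptions.

GAMMA = 1.0   # no discounting over this short trial
ALPHA = 0.1   # learning rate

def td0_update(V, s, r, s_next):
    """Apply one TD(0) update to V[s] and return the prediction error."""
    delta = r + GAMMA * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + ALPHA * delta
    return delta

V = {}
reward_errors = []
for _ in range(200):                                  # repeated paired trials
    td0_update(V, "cue", 0.0, "reward")               # cue, no reward yet
    reward_errors.append(td0_update(V, "reward", 1.0, "end"))  # reward r = 1

# As the cue comes to predict the reward, the error at reward time shrinks
# toward zero: the canonical phasic-dopamine signature the RPE theory explains.
```

Under this classic formulation, a fully predicted reward elicits no error, which is the baseline against which the ramping, sensory/motor, and action-selection challenges above are posed.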
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- John A Assad
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Scott W Linderman
- Department of Statistics and Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Bernardo L Sabatini
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Linda Wilbrecht
- Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
3
Taira M, Millard SJ, Verghese A, DiFazio LE, Hoang IB, Jia R, Sias A, Wikenheiser A, Sharpe MJ. Dopamine Release in the Nucleus Accumbens Core Encodes the General Excitatory Components of Learning. J Neurosci 2024; 44:e0120242024. PMID: 38969504; PMCID: PMC11358529; DOI: 10.1523/jneurosci.0120-24.2024.
Abstract
Dopamine release in the nucleus accumbens core (NAcC) is generally considered a proxy for phasic firing of ventral tegmental area dopamine (VTADA) neurons. Thus, dopamine release in NAcC is hypothesized to reflect a unitary role in reward prediction error signaling. However, recent studies reveal more diverse roles of dopamine neurons, which support an emerging idea that dopamine regulates learning differently in distinct circuits. To understand whether the NAcC might regulate a unique component of learning, we recorded dopamine release in NAcC while male rats performed a backward conditioning task in which a reward is followed by a neutral cue. We used this task because we can delineate different components of learning, which include sensory-specific inhibitory and general excitatory components. Furthermore, we have shown that VTADA neurons are necessary for both the specific and general components of backward associations. Here, we found that dopamine release in NAcC increased to the reward across learning while decreasing to the cue that followed it as the cue became more expected. This mirrors the dopamine prediction error signal seen during forward conditioning and cannot be accounted for by temporal-difference reinforcement learning. Subsequent tests allowed us to dissociate these learning components and revealed that dopamine release in NAcC reflects the general excitatory component of backward associations, but not their sensory-specific component. These results emphasize the importance of examining distinct functions of different dopamine projections in reinforcement learning.
Affiliation(s)
- Masakazu Taira
- Department of Psychology, University of Sydney, Camperdown, New South Wales 2006, Australia
- Department of Psychology, University of California, Los Angeles, California 90095
- Samuel J Millard
- Department of Psychology, University of California, Los Angeles, California 90095
- Anna Verghese
- Department of Psychology, University of California, Los Angeles, California 90095
- Lauren E DiFazio
- Department of Psychology, University of California, Los Angeles, California 90095
- Ivy B Hoang
- Department of Psychology, University of California, Los Angeles, California 90095
- Ruiting Jia
- Department of Psychology, University of California, Los Angeles, California 90095
- Ana Sias
- Department of Psychology, University of California, Los Angeles, California 90095
- Andrew Wikenheiser
- Department of Psychology, University of California, Los Angeles, California 90095
- Melissa J Sharpe
- Department of Psychology, University of Sydney, Camperdown, New South Wales 2006, Australia
- Department of Psychology, University of California, Los Angeles, California 90095
4
Lehmann CM, Miller NE, Nair VS, Costa KM, Schoenbaum G, Moussawi K. Generalized cue reactivity in dopamine neurons after opioids. bioRxiv 2024:2024.06.02.597025. PMID: 38853878; PMCID: PMC11160774; DOI: 10.1101/2024.06.02.597025.
Abstract
Cue reactivity is the maladaptive neurobiological and behavioral response upon exposure to drug cues and is a major driver of relapse. The leading hypothesis is that dopamine release by addictive drugs represents a persistently positive reward prediction error that causes runaway enhancement of dopamine responses to drug cues, leading to their pathological overvaluation compared to non-drug reward alternatives. However, this hypothesis has not been directly tested. Here we developed Pavlovian and operant procedures to measure firing responses, within the same dopamine neurons, to drug versus natural reward cues, which we found to be similarly enhanced relative to responses to cues predicting natural rewards in drug-naïve controls. This enhancement was associated with increased behavioral reactivity to the drug cue, suggesting that dopamine release is still critical to cue reactivity, albeit not as previously hypothesized. These results challenge the prevailing hypothesis of cue reactivity, warranting new models of dopaminergic function in drug addiction, and provide critical insights into the neurobiology of cue reactivity with potential implications for relapse prevention.
Affiliation(s)
- Collin M. Lehmann
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, 15219, USA
- Nora E. Miller
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, 15219, USA
- Varun S. Nair
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, 15219, USA
- Kauê M. Costa
- Department of Psychology, University of Alabama at Birmingham, Birmingham, 35233, USA
- Geoffrey Schoenbaum
- National Institute on Drug Abuse, National Institutes of Health, Baltimore, 21224, USA
- Khaled Moussawi
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, 15219, USA
- Department of Neurology, University of California San Francisco, San Francisco, 94158, USA
5
Piccin A, Plat H, Wolff M, Coutureau E. Adaptive Responding to Stimulus-Outcome Associations Requires Noradrenergic Transmission in the Medial Prefrontal Cortex. J Neurosci 2024; 44:e0078242024. PMID: 38684363; PMCID: PMC11140671; DOI: 10.1523/jneurosci.0078-24.2024.
Abstract
A dynamic environment, such as the one we inhabit, requires organisms to continuously update their knowledge of the setting. While the prefrontal cortex is recognized for its pivotal role in regulating such adaptive behavior, the specific contribution of each prefrontal area remains elusive. In the current work, we investigated the direct involvement of two major prefrontal subregions, the medial prefrontal cortex (mPFC, A32D + A32V) and the orbitofrontal cortex (OFC, VO + LO), in updating Pavlovian stimulus-outcome (S-O) associations following contingency degradation in male rats. Specifically, animals had to learn that a particular cue, previously fully predicting the delivery of a specific reward, was no longer a reliable predictor. First, we found that chemogenetic inhibition of mPFC, but not of OFC, neurons altered the rats' ability to adaptively respond to degraded and non-degraded cues. Next, given the growing evidence pointing to noradrenaline (NA) as a main neuromodulator of adaptive behavior, we investigated the possible involvement of NA projections to the two subregions in this higher-order cognitive process. Employing a pair of novel retrograde vectors, we traced NA projections from the locus ceruleus (LC) to both structures and observed an equivalent yet relatively segregated amount of inputs. Then, we showed that chemogenetic inhibition of NA projections to the mPFC, but not to the OFC, also impaired the rats' ability to adaptively respond to the degradation procedure. Altogether, our findings provide important evidence of functional parcellation within the prefrontal cortex and point to mPFC NA as key for updating Pavlovian S-O associations.
Affiliation(s)
- Hadrien Plat
- Univ. Bordeaux, CNRS, INCIA, UMR 5287, Bordeaux F-33000, France
- Mathieu Wolff
- Univ. Bordeaux, CNRS, INCIA, UMR 5287, Bordeaux F-33000, France
6
Wurm F, Ernst B, Steinhauser M. Surprise-minimization as a solution to the structural credit assignment problem. PLoS Comput Biol 2024; 20:e1012175. PMID: 38805546; PMCID: PMC11175464; DOI: 10.1371/journal.pcbi.1012175.
Abstract
The structural credit assignment problem arises when the causal structure between actions and subsequent outcomes is hidden from direct observation. To solve this problem and enable goal-directed behavior, an agent has to infer that structure and form a representation thereof. In this study, we investigate a possible solution in the human brain. We recorded behavioral and electrophysiological data from human participants in a novel variant of the bandit task, where multiple actions lead to multiple outcomes. Crucially, the mapping between actions and outcomes was hidden and not instructed to the participants. Human choice behavior revealed clear hallmarks of credit assignment and learning. Moreover, a computational model that formalizes action selection as the competition between multiple representations of the hidden structure was fit to the participants' data. Starting in a state of uncertainty about the correct representation, the central mechanism of this model is the arbitration of action control toward the representation that minimizes surprise about outcomes. Crucially, single-trial latent-variable analysis reveals that the neural patterns clearly support central quantitative predictions of this surprise-minimization model. The results suggest that neural activity not only relates to reinforcement learning under both correct and incorrect task representations but also reflects central mechanisms of credit assignment and behavioral arbitration.
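The arbitration mechanism this abstract describes can be sketched in a few lines. The toy example below is entirely illustrative (the two-hypothesis setup, noise level, and softmax weighting are assumptions, not the authors' fitted model): control shifts toward whichever candidate representation of the hidden action-outcome mapping accumulates the least surprise.

```python
# Hedged sketch of arbitration by surprise minimization. Two candidate
# representations map actions to outcome channels; the agent weights control
# toward the representation whose observations are least surprising
# (lowest cumulative negative log-likelihood). All names are illustrative.

import math

def surprise(p_observed):
    """Shannon surprise of an observed outcome under a model."""
    return -math.log(max(p_observed, 1e-12))

# True hidden mapping: action 0 -> outcome channel 0, action 1 -> channel 1.
# Hypothesis "A" assumes that mapping; hypothesis "B" assumes the swapped one.
def predict(hypothesis, action, outcome_channel, noise=0.1):
    mapped = action if hypothesis == "A" else 1 - action
    return 1.0 - noise if mapped == outcome_channel else noise

cum_surprise = {"A": 0.0, "B": 0.0}
for trial in range(100):
    action = trial % 2
    outcome_channel = action           # the world follows hypothesis A
    for h in ("A", "B"):
        cum_surprise[h] += surprise(predict(h, action, outcome_channel))

# Softmax arbitration: the weight on the correct representation grows to ~1.
w_A = math.exp(-cum_surprise["A"]) / (
    math.exp(-cum_surprise["A"]) + math.exp(-cum_surprise["B"]))
```

The same idea scales to larger hypothesis spaces; the key design choice mirrored from the abstract is that arbitration is driven by prediction quality, not by reward magnitude.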
Affiliation(s)
- Franz Wurm
- Catholic University of Eichstätt-Ingolstadt, Eichstätt, Germany
- Leiden University, Leiden, the Netherlands
- Leiden Institute for Brain and Cognition, Leiden University, Leiden, the Netherlands
- Benjamin Ernst
- Catholic University of Eichstätt-Ingolstadt, Eichstätt, Germany
7
Sias AC, Jafar Y, Goodpaster CM, Ramírez-Armenta K, Wrenn TM, Griffin NK, Patel K, Lamparelli AC, Sharpe MJ, Wassum KM. Dopamine projections to the basolateral amygdala drive the encoding of identity-specific reward memories. Nat Neurosci 2024; 27:728-736. PMID: 38396258; PMCID: PMC11110430; DOI: 10.1038/s41593-024-01586-7.
Abstract
To make adaptive decisions, we build an internal model of the associative relationships in an environment and use it to make predictions and inferences about specific available outcomes. Detailed, identity-specific cue-reward memories are a core feature of such cognitive maps. Here we used fiber photometry, cell-type and pathway-specific optogenetic manipulation, Pavlovian cue-reward conditioning and decision-making tests in male and female rats, to reveal that ventral tegmental area dopamine (VTADA) projections to the basolateral amygdala (BLA) drive the encoding of identity-specific cue-reward memories. Dopamine is released in the BLA during cue-reward pairing; VTADA→BLA activity is necessary and sufficient to link the identifying features of a reward to a predictive cue but does not assign general incentive properties to the cue or mediate reinforcement. These data reveal a dopaminergic pathway for the learning that supports adaptive decision-making and help explain how VTADA neurons achieve their emerging multifaceted role in learning.
Affiliation(s)
- Ana C Sias
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Yousif Jafar
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Caitlin M Goodpaster
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Tyler M Wrenn
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Nicholas K Griffin
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Keshav Patel
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Melissa J Sharpe
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA
- Integrative Center for Learning and Memory, University of California, Los Angeles, Los Angeles, CA, USA
- Integrative Center for Addictive Disorders, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Psychology, University of Sydney, Sydney, New South Wales, Australia
- Kate M Wassum
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA
- Integrative Center for Learning and Memory, University of California, Los Angeles, Los Angeles, CA, USA
- Integrative Center for Addictive Disorders, University of California, Los Angeles, Los Angeles, CA, USA
8
Chow JJ, Pitts KM, Schoenbaum A, Costa KM, Schoenbaum G, Shaham Y. Different Effects of Peer Sex on Operant Responding for Social Interaction and Striatal Dopamine Activity. J Neurosci 2024; 44:e1887232024. PMID: 38346894; PMCID: PMC10919252; DOI: 10.1523/jneurosci.1887-23.2024.
Abstract
When rats are given discrete choices between social interactions with a peer and opioid or psychostimulant drugs, they choose social interaction, even after extensive drug self-administration experience. Studies show that, like drug and nondrug food reinforcers, social interaction is an operant reinforcer and induces dopamine release. However, these studies were conducted with same-sex peers. We examined whether peer sex influences operant social interaction and the role of the estrous cycle and striatal dopamine in same- versus opposite-sex social interaction. We trained male and female rats (n = 13 responders/12 peers) to lever-press (fixed-ratio 1 [FR1] schedule) for 15 s access to a same- or opposite-sex peer for 16 d (8 d/sex) while tracking females' estrous cycle. Next, we transfected GRAB-DA2m and implanted optic fibers into nucleus accumbens (NAc) core and dorsomedial striatum (DMS). We then retrained the rats for 15 s social interaction (FR1 schedule) for 16 d (8 d/sex) and recorded striatal dopamine during operant responding for a peer for 8 d (4 d/sex). Finally, we assessed economic demand by manipulating FR requirements for a peer (10 d/sex). In male, but not female, rats, operant responding was higher for the opposite-sex peer. Females' estrous cycle fluctuations had no effect on operant social interaction. Striatal dopamine signals for operant social interaction were dependent on the peer's sex and striatal region (NAc core vs DMS). Results indicate that estrous cycle fluctuations did not influence operant social interaction and that NAc core and DMS dopamine activity reflect sex-dependent features of volitional social interaction.
Affiliation(s)
- Jonathan J Chow
- Intramural Research Program, NIDA, NIH, Baltimore, Maryland 21230
- Kayla M Pitts
- Intramural Research Program, NIDA, NIH, Baltimore, Maryland 21230
- Ansel Schoenbaum
- Intramural Research Program, NIDA, NIH, Baltimore, Maryland 21230
- Kauê M Costa
- Intramural Research Program, NIDA, NIH, Baltimore, Maryland 21230
- Yavin Shaham
- Intramural Research Program, NIDA, NIH, Baltimore, Maryland 21230
9
Qian L, Burrell M, Hennig JA, Matias S, Murthy VN, Gershman SJ, Uchida N. The role of prospective contingency in the control of behavior and dopamine signals during associative learning. bioRxiv 2024:2024.02.05.578961. PMID: 38370735; PMCID: PMC10871210; DOI: 10.1101/2024.02.05.578961.
Abstract
Associative learning depends on contingency, the degree to which a stimulus predicts an outcome. Despite its importance, the neural mechanisms linking contingency to behavior remain elusive. Here we examined dopamine activity in the ventral striatum, a signal implicated in associative learning, in a Pavlovian contingency degradation task in mice. We show that both anticipatory licking and dopamine responses to a conditioned stimulus decreased when additional rewards were delivered uncued, but remained unchanged if additional rewards were cued. These results conflict with contingency-based accounts using a traditional definition of contingency or a novel causal learning model (ANCCR), but can be explained by temporal difference (TD) learning models equipped with an appropriate inter-trial-interval (ITI) state representation. Recurrent neural networks trained within a TD framework develop state representations like our best 'handcrafted' model. Our findings suggest that the TD error can serve as a single measure that describes both contingency and dopaminergic activity.
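The "traditional definition of contingency" that this abstract tests is commonly expressed as Delta-P = P(outcome | cue) - P(outcome | no cue). A minimal sketch (the probabilities below are illustrative assumptions, not the study's values) shows why uncued extra rewards degrade contingency under this definition while cued extra rewards do not:

```python
# Illustrative Delta-P contingency computation. Probabilities are assumed
# values for illustration, not taken from the study.

def delta_p(p_reward_given_cue, p_reward_given_no_cue):
    """Traditional contingency: P(outcome | cue) - P(outcome | no cue)."""
    return p_reward_given_cue - p_reward_given_no_cue

baseline   = delta_p(0.9, 0.0)  # rewards only ever follow the cue
degraded   = delta_p(0.9, 0.9)  # extra rewards delivered uncued in the ITI
cued_extra = delta_p(0.9, 0.0)  # extra rewards signaled by a second cue, so
                                # P(reward | no cue) for the first cue stays 0

# Uncued rewards drive Delta-P to zero (degraded contingency); cued extras
# leave the first cue's Delta-P intact, matching the behavioral dissociation
# the study reports.
```

The study's point is that a suitably structured TD model reproduces this same dissociation without computing Delta-P explicitly.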
Affiliation(s)
- Lechen Qian
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- These authors contributed equally
- Mark Burrell
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- These authors contributed equally
- Jay A. Hennig
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Sara Matias
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Venkatesh N. Murthy
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Samuel J. Gershman
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Center for Brain Science, Harvard University, Cambridge, MA, USA
10
Amo R. Prediction error in dopamine neurons during associative learning. Neurosci Res 2024; 199:12-20. PMID: 37451506; DOI: 10.1016/j.neures.2023.07.003.
Abstract
Dopamine neurons have long been thought to facilitate learning by broadcasting reward prediction error (RPE), a teaching signal used in machine learning, but more recent work has advanced alternative models of dopamine's computational role. Here, I revisit this critical issue and review new experimental evidence that tightens the link between dopamine activity and RPE. First, I introduce the recent observation of a gradual backward shift of dopamine activity that had eluded researchers for over a decade. I also discuss several other findings, such as dopamine ramping, that were initially interpreted as conflicting with RPE but were later found to be consistent with it. These findings improve our understanding of neural computation in dopamine neurons.
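The gradual backward shift reviewed here falls out of TD learning once the cue-reward interval is modeled as a chain of states. The simulation below is a hedged illustration (chain length, learning rate, and discount are assumptions, not the review's parameters): the TD error peaks at reward time on the first trial and peaks progressively earlier in the interval as training proceeds.

```python
# Hedged TD(0) simulation of the backward shift of the prediction error.
# States 0 (cue onset) .. 5 span the cue-reward interval; reward arrives at
# the final state. Parameters are illustrative assumptions.

ALPHA, GAMMA, N_STATES = 0.2, 0.98, 6

def run_trial(V):
    """One cue-to-reward trial; returns the TD error at each interval state."""
    deltas = []
    for s in range(N_STATES):
        r = 1.0 if s == N_STATES - 1 else 0.0            # reward at the end
        v_next = V[s + 1] if s + 1 < N_STATES else 0.0   # terminal afterward
        delta = r + GAMMA * v_next - V[s]
        V[s] += ALPHA * delta
        deltas.append(delta)
    return deltas

V = [0.0] * N_STATES
history = [run_trial(V) for _ in range(300)]

# For each state, find the trial on which its TD error peaked. The peak occurs
# immediately at the reward state and progressively later for earlier states:
# the error bump migrates backward from reward toward cue across training.
peak_trial = [max(range(len(history)), key=lambda t: history[t][s])
              for s in range(N_STATES)]
```

Because value propagates roughly one state per trial in this tabular scheme, the shift is gradual rather than an abrupt jump from reward to cue, which is the signature the review highlights.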
Affiliation(s)
- Ryunosuke Amo
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
11
de Jong JW, Liang Y, Verharen JPH, Fraser KM, Lammel S. State and rate-of-change encoding in parallel mesoaccumbal dopamine pathways. Nat Neurosci 2024; 27:309-318. PMID: 38212586; DOI: 10.1038/s41593-023-01547-6.
Abstract
The nervous system uses fast- and slow-adapting sensory detectors in parallel to enable neuronal representations of external states and their temporal dynamics. It is unknown whether this dichotomy also applies to internal representations that have no direct correlate in the physical world. Here we find that two distinct dopamine (DA) neuron subtypes encode either a state or its rate-of-change. In mice performing a reward-seeking task, we found that the animal's behavioral state and its rate-of-change were encoded by sustained activity of medial ventral tegmental area (VTA) DA neurons and transient activity of lateral VTA DA neurons, respectively. The neural activity patterns of VTA DA cell bodies matched DA release patterns within anatomically defined mesoaccumbal pathways. Based on these results, we propose a model in which the DA system uses two parallel lines for proportional-differential encoding of a state variable and its temporal dynamics.
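The proportional-differential scheme proposed here has a simple signal-processing analogue. As a hedged sketch (the ramp-and-plateau trace and unit scaling are assumptions for illustration), one channel passes the state variable through unchanged while the other takes its discrete-time derivative, like the P and D terms of a PD controller:

```python
# Toy proportional-differential encoding of a state variable. The trace and
# scaling are illustrative assumptions, not data from the study.

def pd_encode(signal, dt=1.0):
    """Split a state trace into a 'state' channel and a 'rate' channel."""
    proportional = list(signal)                       # sustained state channel
    derivative = [0.0] + [
        (signal[i] - signal[i - 1]) / dt for i in range(1, len(signal))
    ]                                                 # transient rate channel
    return proportional, derivative

# A state that ramps up, holds, then drops (e.g., engagement in a task epoch).
state = [0, 1, 2, 3, 4, 4, 4, 4, 0]
p, d = pd_encode(state)
# p mirrors the plateau (sustained activity); d is transient, nonzero only
# while the state is changing and sharply negative at the drop.
```

The analogy captures the paper's core contrast: the "state" channel stays elevated through the plateau, while the "rate" channel fires only at transitions.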
Affiliation(s)
- Johannes W de Jong
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
- Yilan Liang
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
- Jeroen P H Verharen
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
- Kurt M Fraser
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
- Stephan Lammel
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
12
Fraser KM, Collins VL, Wolff AR, Ottenheimer DJ, Bornhoft KN, Pat F, Chen BJ, Janak PH, Saunders BT. Contexts facilitate dynamic value encoding in the mesolimbic dopamine system. bioRxiv 2023:2023.11.05.565687. PMID: 37961363; PMCID: PMC10635154; DOI: 10.1101/2023.11.05.565687.
Abstract
Adaptive behavior in a dynamic environment often requires rapid revaluation of stimuli in ways that deviate from well-learned associations. The divergence between stable value-encoding and appropriate behavioral output remains a critical test for theories of dopamine's function in learning, motivation, and motor control. Yet how dopamine neurons are involved in the revaluation of cues when the world changes to alter our behavior remains unclear. Here we make use of pharmacology, in vivo electrophysiology, fiber photometry, and optogenetics to resolve the contributions of the mesolimbic dopamine system to the dynamic reorganization of reward-seeking. Male and female rats were trained to discriminate when a conditioned stimulus would be followed by sucrose reward by exploiting the prior, non-overlapping presentation of a separate discrete cue, an occasion setter. Only when the occasion setter's presentation preceded the conditioned stimulus did the conditioned stimulus predict sucrose delivery. As a result, in this task we were able to dissociate the average value of the conditioned stimulus from its immediate expected value on a trial-to-trial basis. Both the activity of ventral tegmental area dopamine neurons and dopamine signaling in the nucleus accumbens were essential for rats to successfully update behavioral responding in response to the occasion setter. Moreover, dopamine release in the nucleus accumbens following the conditioned stimulus occurred only when the occasion setter indicated it would predict reward. Downstream of dopamine release, we found that single neurons in the nucleus accumbens dynamically tracked the value of the conditioned stimulus. Together these results reveal a novel mechanism within the mesolimbic dopamine system for the rapid revaluation of motivation.
Affiliation(s)
- Kurt M Fraser
- Department of Psychological and Brain Sciences, Johns Hopkins University
- Amy R Wolff
- Department of Neuroscience, University of Minnesota
- Fiona Pat
- Department of Psychological and Brain Sciences, Johns Hopkins University
- Bridget J Chen
- Department of Psychological and Brain Sciences, Johns Hopkins University
- Patricia H Janak
- Department of Psychological and Brain Sciences, Johns Hopkins University
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins University
- Benjamin T Saunders
- Department of Neuroscience, University of Minnesota
- Medical Discovery Team on Addiction, University of Minnesota
13
Iglesias AG, Chiu AS, Wong J, Campus P, Li F, Liu ZN, Bhatti JK, Patel SA, Deisseroth K, Akil H, Burgess CR, Flagel SB. Inhibition of Dopamine Neurons Prevents Incentive Value Encoding of a Reward Cue: With Revelations from Deep Phenotyping. J Neurosci 2023; 43:7376-7392. PMID: 37709540; PMCID: PMC10621773; DOI: 10.1523/jneurosci.0848-23.2023.
Abstract
The survival of an organism is dependent on its ability to respond to cues in the environment. Such cues can attain control over behavior as a function of the value ascribed to them. Some individuals have an inherent tendency to attribute reward-paired cues with incentive motivational value, or incentive salience. For these individuals, termed sign-trackers, a discrete cue that precedes reward delivery becomes attractive and desirable in its own right. Prior work suggests that the behavior of sign-trackers is dopamine-dependent, and cue-elicited dopamine in the NAc is believed to encode the incentive value of reward cues. Here we exploited the temporal resolution of optogenetics to determine whether selective inhibition of ventral tegmental area (VTA) dopamine neurons during cue presentation attenuates the propensity to sign-track. Using male tyrosine hydroxylase (TH)-Cre Long Evans rats, it was found that, under baseline conditions, ∼84% of TH-Cre rats tend to sign-track. Laser-induced inhibition of VTA dopamine neurons during cue presentation prevented the development of sign-tracking behavior, without affecting goal-tracking behavior. When laser inhibition was terminated, these same rats developed a sign-tracking response. Video analysis using DeepLabCut revealed that, relative to rats that received laser inhibition, rats in the control group spent more time near the location of the reward cue even when it was not present and were more likely to orient toward and approach the cue during its presentation. These findings demonstrate that cue-elicited dopamine release is critical for the attribution of incentive salience to reward cues.

SIGNIFICANCE STATEMENT Activity of dopamine neurons in the ventral tegmental area (VTA) during cue presentation is necessary for the development of a sign-tracking, but not a goal-tracking, conditioned response in a Pavlovian task. We capitalized on the temporal precision of optogenetics to pair cue presentation with inhibition of VTA dopamine neurons. A detailed behavioral analysis with DeepLabCut revealed that cue-directed behaviors do not emerge without dopamine neuron activity in the VTA. Importantly, however, when optogenetic inhibition is lifted, cue-directed behaviors increase, and a sign-tracking response develops. These findings confirm the necessity of dopamine neuron activity in the VTA during cue presentation to encode the incentive value of reward cues.
Collapse
Affiliation(s)
- Amanda G Iglesias
- Neuroscience Graduate Program, University of Michigan, Ann Arbor, Michigan 48104
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Alvin S Chiu
- Neuroscience Graduate Program, University of Michigan, Ann Arbor, Michigan 48104
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Jason Wong
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, Michigan 48104
| | - Paolo Campus
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Fei Li
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Zitong Nemo Liu
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Jasmine K Bhatti
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Shiv A Patel
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Karl Deisseroth
- Department of Bioengineering, Stanford University, Stanford, California 94305
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, California 94305
- Howard Hughes Medical Institute, Stanford University, Stanford, California 94305
| | - Huda Akil
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
- Department of Psychiatry, University of Michigan, Ann Arbor, Michigan 48104
| | - Christian R Burgess
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Shelly B Flagel
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
- Department of Psychiatry, University of Michigan, Ann Arbor, Michigan 48104
| |
Collapse
|
14
|
Rouhani N, Niv Y, Frank MJ, Schwabe L. Multiple routes to enhanced memory for emotionally relevant events. Trends Cogn Sci 2023; 27:867-882. [PMID: 37479601 DOI: 10.1016/j.tics.2023.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2022] [Revised: 06/13/2023] [Accepted: 06/15/2023] [Indexed: 07/23/2023]
Abstract
Events associated with aversive or rewarding outcomes are prioritized in memory. This memory boost is commonly attributed to the elicited affective response, closely linked to noradrenergic and dopaminergic modulation of hippocampal plasticity. Herein we review and compare this 'affect' mechanism to an additional, recently discovered, 'prediction' mechanism whereby memories are strengthened by the extent to which outcomes deviate from expectations, that is, by prediction errors (PEs). The mnemonic impact of PEs is separate from the affective outcome itself and has a distinct neural signature. While both routes enhance memory, these mechanisms are linked to different - and sometimes opposing - predictions for memory integration. We discuss new findings that highlight mechanisms by which emotional events strengthen, integrate, and segment memory.
Collapse
Affiliation(s)
- Nina Rouhani
- Division of Biology and Biological Engineering and Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
| | - Yael Niv
- Department of Psychology and Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
| | - Michael J Frank
- Department of Cognitive, Linguistic & Psychological Sciences and Carney Institute for Brain Science, Brown University, Providence, RI, USA
| | - Lars Schwabe
- Department of Cognitive Psychology, Institute of Psychology, Universität Hamburg, Hamburg, Germany.
| |
Collapse
|
15
|
McNally GP, Jean-Richard-Dit-Bressel P, Millan EZ, Lawrence AJ. Pathways to the persistence of drug use despite its adverse consequences. Mol Psychiatry 2023; 28:2228-2237. [PMID: 36997610 PMCID: PMC10611585 DOI: 10.1038/s41380-023-02040-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/20/2022] [Revised: 03/10/2023] [Accepted: 03/15/2023] [Indexed: 04/01/2023]
Abstract
The persistence of drug taking despite its adverse consequences plays a central role in the presentation, diagnosis, and impacts of addiction. Eventual recognition and appraisal of these adverse consequences is central to decisions to reduce or cease use. However, the most appropriate ways of conceptualizing persistence in the face of adverse consequences remain unclear. Here we review evidence that there are at least three pathways to persistent use despite the negative consequences of that use: a cognitive pathway for recognition of adverse consequences, a motivational pathway for valuation of these consequences, and a behavioral pathway for responding to them. These pathways are dynamic, not linear, with multiple possible trajectories between them, and each is sufficient to produce persistence. We describe these pathways, their characteristics, and their brain cellular and circuit substrates, and we highlight their relevance to different pathways to self- and treatment-guided behavior change.
Collapse
Affiliation(s)
- Gavan P McNally
- School of Psychology, UNSW Sydney, Sydney, NSW, 2052, Australia.
| | | | - E Zayra Millan
- School of Psychology, UNSW Sydney, Sydney, NSW, 2052, Australia
| | - Andrew J Lawrence
- Florey Institute of Neuroscience and Mental Health, Parkville, VIC, 3010, Australia
- Florey Department of Neuroscience and Mental Health, University of Melbourne, Melbourne, VIC, 3010, Australia
| |
Collapse
|
16
|
Gyawali U, Martin DA, Sun F, Li Y, Calu D. Dopamine in the dorsal bed nucleus of stria terminalis signals Pavlovian sign-tracking and reward violations. eLife 2023; 12:e81980. [PMID: 37232554 PMCID: PMC10219648 DOI: 10.7554/elife.81980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2022] [Accepted: 05/05/2023] [Indexed: 05/27/2023] Open
Abstract
Midbrain and striatal dopamine signals have been extremely well characterized over the past several decades, yet novel dopamine signals and functions in reward learning and motivation continue to emerge. A similar characterization of real-time sub-second dopamine signals in areas outside of the striatum has been limited. Recent advances in fluorescent sensor technology and fiber photometry permit the measurement of dopamine binding correlates, which can divulge basic functions of dopamine signaling in non-striatal dopamine terminal regions, like the dorsal bed nucleus of the stria terminalis (dBNST). Here, we record GRABDA signals in the dBNST during a Pavlovian lever autoshaping task. We observe greater Pavlovian cue-evoked dBNST GRABDA signals in sign-tracking (ST) compared to goal-tracking/intermediate (GT/INT) rats and the magnitude of cue-evoked dBNST GRABDA signals decreases immediately following reinforcer-specific satiety. When we deliver unexpected rewards or omit expected rewards, we find that dBNST dopamine signals encode bidirectional reward prediction errors in GT/INT rats, but only positive prediction errors in ST rats. Since sign- and goal-tracking approach strategies are associated with distinct drug relapse vulnerabilities, we examined the effects of experimenter-administered fentanyl on dBNST dopamine associative encoding. Systemic fentanyl injections do not disrupt cue discrimination but generally potentiate dBNST dopamine signals. These results reveal multiple dBNST dopamine correlates of learning and motivation that depend on the Pavlovian approach strategy employed.
Collapse
Affiliation(s)
- Utsav Gyawali
- Program in Neuroscience, University of Maryland School of Medicine, Baltimore, United States
- Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, United States
| | - David A Martin
- Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, United States
| | - Fangmiao Sun
- State Key Laboratory of Membrane Biology, Peking University School of Life Sciences; PKU-IDG/McGovern Institute for Brain Research; Peking-Tsinghua Center for Life Sciences, Beijing, China
| | - Yulong Li
- State Key Laboratory of Membrane Biology, Peking University School of Life Sciences; PKU-IDG/McGovern Institute for Brain Research; Peking-Tsinghua Center for Life Sciences, Beijing, China
| | - Donna Calu
- Program in Neuroscience, University of Maryland School of Medicine, Baltimore, United States
- Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, United States
| |
Collapse
|
17
|
Iglesias AG, Chiu AS, Wong J, Campus P, Li F, Liu Z(N), Patel SA, Deisseroth K, Akil H, Burgess CR, Flagel SB. Inhibition of dopamine neurons prevents incentive value encoding of a reward cue: With revelations from deep phenotyping. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.03.539324. [PMID: 37205506 PMCID: PMC10187226 DOI: 10.1101/2023.05.03.539324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/21/2023]
Abstract
The survival of an organism is dependent on its ability to respond to cues in the environment. Such cues can attain control over behavior as a function of the value ascribed to them. Some individuals have an inherent tendency to attribute reward-paired cues with incentive motivational value, or incentive salience. For these individuals, termed sign-trackers, a discrete cue that precedes reward delivery becomes attractive and desirable in its own right. Prior work suggests that the behavior of sign-trackers is dopamine-dependent, and cue-elicited dopamine in the nucleus accumbens is believed to encode the incentive value of reward cues. Here we exploited the temporal resolution of optogenetics to determine whether selective inhibition of ventral tegmental area (VTA) dopamine neurons during cue presentation attenuates the propensity to sign-track. Using male tyrosine hydroxylase (TH)-Cre Long Evans rats, it was found that, under baseline conditions, ∼84% of TH-Cre rats tend to sign-track. Laser-induced inhibition of VTA dopamine neurons during cue presentation prevented the development of sign-tracking behavior, without affecting goal-tracking behavior. When laser inhibition was terminated, these same rats developed a sign-tracking response. Video analysis using DeepLabCut revealed that, relative to rats that received laser inhibition, rats in the control group spent more time near the location of the reward cue even when it was not present and were more likely to orient towards and approach the cue during its presentation. These findings demonstrate that cue-elicited dopamine release is critical for the attribution of incentive salience to reward cues.
Significance Statement: Activity of dopamine neurons in the ventral tegmental area (VTA) during cue presentation is necessary for the development of a sign-tracking, but not a goal-tracking, conditioned response in a Pavlovian task. We capitalized on the temporal precision of optogenetics to pair cue presentation with inhibition of VTA dopamine neurons. A detailed behavioral analysis with DeepLabCut revealed that cue-directed behaviors do not emerge without VTA dopamine. Importantly, however, when optogenetic inhibition is lifted, cue-directed behaviors increase, and a sign-tracking response develops. These findings confirm the necessity of VTA dopamine during cue presentation to encode the incentive value of reward cues.
Collapse
Affiliation(s)
- Amanda G. Iglesias
- Neuroscience Graduate Program, University of Michigan, Ann Arbor, Michigan 48104
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Alvin S. Chiu
- Neuroscience Graduate Program, University of Michigan, Ann Arbor, Michigan 48104
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Jason Wong
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, Michigan 48104
| | - Paolo Campus
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Fei Li
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Zitong (Nemo) Liu
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Shiv A. Patel
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Karl Deisseroth
- Department of Bioengineering, Stanford University, Stanford, California 94305
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, California 94305
- Howard Hughes Medical Institute, Stanford University, Stanford, California 94305
| | - Huda Akil
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
- Department of Psychiatry, University of Michigan, Ann Arbor, Michigan 48104
| | - Christian R. Burgess
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
| | - Shelly B. Flagel
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan 48104
- Department of Psychiatry, University of Michigan, Ann Arbor, Michigan 48104
| |
Collapse
|
18
|
Takahashi YK, Stalnaker TA, Mueller LE, Harootonian SK, Langdon AJ, Schoenbaum G. Dopaminergic prediction errors in the ventral tegmental area reflect a multithreaded predictive model. Nat Neurosci 2023; 26:830-839. [PMID: 37081296 PMCID: PMC10646487 DOI: 10.1038/s41593-023-01310-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2021] [Accepted: 03/16/2023] [Indexed: 04/22/2023]
Abstract
Dopamine neuron activity is tied to the prediction error in temporal difference reinforcement learning models. These models make significant simplifying assumptions, particularly with regard to the structure of the predictions fed into the dopamine neurons, which consist of a single chain of timepoint states. Although this predictive structure can explain error signals observed in many studies, it cannot cope with settings where subjects might infer multiple independent events and outcomes. In the present study, we recorded dopamine neurons in the ventral tegmental area in such a setting to test the validity of the single-stream assumption. Rats were trained in an odor-based choice task, in which the timing and identity of one of several rewards delivered in each trial changed across trial blocks. This design revealed an error signaling pattern that requires the dopamine neurons to access and update multiple independent predictive streams reflecting the subject's belief about timing and potentially unique identities of expected rewards.
Collapse
Affiliation(s)
- Yuji K Takahashi
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA.
| | - Thomas A Stalnaker
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
| | - Lauren E Mueller
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
| | | | - Angela J Langdon
- Intramural Research Program, National Institute of Mental Health, Bethesda, MD, USA.
| | - Geoffrey Schoenbaum
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA.
| |
Collapse
|
19
|
Grahek I, Frömer R, Prater Fahey M, Shenhav A. Learning when effort matters: neural dynamics underlying updating and adaptation to changes in performance efficacy. Cereb Cortex 2023; 33:2395-2411. [PMID: 35695774 PMCID: PMC9977373 DOI: 10.1093/cercor/bhac215] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2022] [Revised: 05/06/2022] [Accepted: 05/08/2022] [Indexed: 11/13/2022] Open
Abstract
To determine how much cognitive control to invest in a task, people need to consider whether exerting control matters for obtaining rewards. In particular, they need to account for the efficacy of their performance: the degree to which rewards are determined by performance or by independent factors. Yet it remains unclear how people learn about their performance efficacy in an environment. Here we combined computational modeling with measures of task performance and EEG to provide a mechanistic account of how people (i) learn and update efficacy expectations in a changing environment and (ii) proactively adjust control allocation based on current efficacy expectations. Across 2 studies, subjects performed an incentivized cognitive control task while their performance efficacy (the likelihood that rewards are performance-contingent or random) varied over time. We show that people update their efficacy beliefs based on prediction errors, leveraging similar neural and computational substrates as those that underpin reward learning, and adjust how much control they allocate according to these beliefs. Using computational modeling, we show that these control adjustments reflect changes in information processing rather than the speed-accuracy tradeoff. These findings demonstrate the neurocomputational mechanism through which people learn how worthwhile their cognitive control is.
Collapse
Affiliation(s)
- Ivan Grahek
- Department of Cognitive, Linguistic, & Psychological Sciences, Carney Institute for Brain Science, Brown University, Box 1821, Providence, RI 02912, United States
| | - Romy Frömer
- Department of Cognitive, Linguistic, & Psychological Sciences, Carney Institute for Brain Science, Brown University, Box 1821, Providence, RI 02912, United States
| | - Mahalia Prater Fahey
- Department of Cognitive, Linguistic, & Psychological Sciences, Carney Institute for Brain Science, Brown University, Box 1821, Providence, RI 02912, United States
| | - Amitai Shenhav
- Department of Cognitive, Linguistic, & Psychological Sciences, Carney Institute for Brain Science, Brown University, Box 1821, Providence, RI 02912, United States
| |
Collapse
|
20
|
Desch S, Schweinhardt P, Seymour B, Flor H, Becker S. Evidence for dopaminergic involvement in endogenous modulation of pain relief. eLife 2023; 12:e81436. [PMID: 36722857 PMCID: PMC9988263 DOI: 10.7554/elife.81436] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2022] [Accepted: 01/31/2023] [Indexed: 02/02/2023] Open
Abstract
Relief of ongoing pain is a potent motivator of behavior, directing actions to escape from or reduce potentially harmful stimuli. Whereas endogenous modulation of pain events is well characterized, relatively little is known about the modulation of pain relief and its corresponding neurochemical basis. Here, we studied pain modulation during a probabilistic relief-seeking task (a 'wheel of fortune' gambling task), in which people actively or passively received reduction of a tonic thermal pain stimulus. We found that relief perception was enhanced by active decisions and unpredictability, and greater in high novelty-seeking trait individuals, consistent with a model in which relief is tuned by its informational content. We then probed the roles of dopaminergic and opioidergic signaling, both of which are implicated in relief processing, by embedding the task in a double-blinded cross-over design with administration of the dopamine precursor levodopa and the opioid receptor antagonist naltrexone. We found that levodopa enhanced each of these information-specific aspects of relief modulation, whereas the opioidergic manipulation had no significant effects. These results show that dopaminergic signaling has a key role in modulating the perception of pain relief to optimize motivation and behavior.
Collapse
Affiliation(s)
- Simon Desch
- Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Clinical Psychology, Department of Experimental Psychology, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Petra Schweinhardt
- Integrative Spinal Research, Department of Chiropractic Medicine, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
| | - Ben Seymour
- Wellcome Centre for Integrative Neuroimaging, John Radcliffe Hospital, Oxford, United Kingdom
| | - Herta Flor
- Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Susanne Becker
- Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Clinical Psychology, Department of Experimental Psychology, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Integrative Spinal Research, Department of Chiropractic Medicine, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
| |
Collapse
|
21
|
Mikus N, Korb S, Massaccesi C, Gausterer C, Graf I, Willeit M, Eisenegger C, Lamm C, Silani G, Mathys C. Effects of dopamine D2/3 and opioid receptor antagonism on the trade-off between model-based and model-free behaviour in healthy volunteers. eLife 2022; 11:e79661. [PMID: 36468832 PMCID: PMC9721617 DOI: 10.7554/elife.79661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2022] [Accepted: 11/22/2022] [Indexed: 12/11/2022] Open
Abstract
Human behaviour requires flexible arbitration between actions we do out of habit and actions that are directed towards a specific goal. Drugs that target opioid and dopamine receptors are notorious for inducing maladaptive habitual drug consumption; yet, how the opioidergic and dopaminergic neurotransmitter systems contribute to the arbitration between habitual and goal-directed behaviour is poorly understood. By combining pharmacological challenges with a well-established decision-making task and a novel computational model, we show that the administration of the dopamine D2/3 receptor antagonist amisulpride led to an increase in goal-directed or 'model-based' relative to habitual or 'model-free' behaviour, whereas the non-selective opioid receptor antagonist naltrexone had no appreciable effect. The effect of amisulpride on model-based/model-free behaviour did not scale with drug serum levels in the blood. Furthermore, participants with higher amisulpride serum levels showed higher explorative behaviour. These findings highlight the distinct functional contributions of dopamine and opioid receptors to goal-directed and habitual behaviour and support the notion that even small doses of amisulpride promote flexible application of cognitive control.
Collapse
Affiliation(s)
- Nace Mikus
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Interacting Minds Centre, Aarhus University, Aarhus, Denmark
| | - Sebastian Korb
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Department of Psychology, University of Essex, Colchester, United Kingdom
| | - Claudia Massaccesi
- Department of Clinical and Health Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
| | - Christian Gausterer
- FDZ‐Forensisches DNA Zentrallabor GmbH, Medical University of Vienna, Vienna, Austria
| | - Irene Graf
- Department of Psychiatry and Psychotherapy, Medical University of Vienna, Vienna, Austria
| | - Matthäus Willeit
- Department of Psychiatry and Psychotherapy, Medical University of Vienna, Vienna, Austria
| | - Christoph Eisenegger
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
| | - Claus Lamm
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
| | - Giorgia Silani
- Department of Clinical and Health Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
| | - Christoph Mathys
- Interacting Minds Centre, Aarhus University, Aarhus, Denmark
- Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
| |
Collapse
|
22
|
Monosov IE, Ogasawara T, Haber SN, Heimel JA, Ahmadlou M. The zona incerta in control of novelty seeking and investigation across species. Curr Opin Neurobiol 2022; 77:102650. [PMID: 36399897 DOI: 10.1016/j.conb.2022.102650] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2022] [Revised: 10/02/2022] [Accepted: 10/06/2022] [Indexed: 11/17/2022]
Abstract
Many organisms rely on a capacity to rapidly replicate, disperse, and evolve when faced with uncertainty and novelty. But mammals do not evolve and replicate quickly. They rely on a sophisticated nervous system to generate predictions and select responses when confronted with these challenges. An important component of their behavioral repertoire is the adaptive context-dependent seeking or avoiding of perceptually novel objects, even when their values have not yet been learned. Here, we outline recent cross-species breakthroughs that shed light on how the zona incerta (ZI), a relatively evolutionarily conserved brain area, supports novelty-seeking and novelty-related investigations. We then conjecture how the architecture of the ZI's anatomical connectivity - the wide-ranging top-down cortical inputs to the ZI, and its notably strong outputs both to the brainstem action controllers and to brain areas involved in action value learning - places the ZI in a unique role at the intersection of cognitive control and learning.
Collapse
Affiliation(s)
- Ilya E Monosov
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, 63110, USA.
| | - Takaya Ogasawara
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, 63110, USA
| | - Suzanne N Haber
- Department of Pharmacology and Physiology, University of Rochester School of Medicine & Dentistry, Rochester, NY, 14642, USA; Department of Psychiatry, McLean Hospital, Harvard Medical School, Belmont, MA, 02478, USA
| | - J Alexander Heimel
- Circuits Structure and Function Group, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA, Amsterdam, the Netherlands
| | - Mehran Ahmadlou
- Circuits Structure and Function Group, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA, Amsterdam, the Netherlands; Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, 25 Howland St., W1T4JG London, UK
| |
Collapse
|
23
|
Mahr JB, Fischer B. Internally Triggered Experiences of Hedonic Valence in Nonhuman Animals: Cognitive and Welfare Considerations. PERSPECTIVES ON PSYCHOLOGICAL SCIENCE 2022; 18:688-701. [PMID: 36288434 DOI: 10.1177/17456916221120425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Do any nonhuman animals have hedonically valenced experiences not directly caused by stimuli in their current environment? Do they, like us humans, experience anticipated or previously experienced pains and pleasures as respectively painful and pleasurable? We review evidence from comparative neuroscience about hippocampus-dependent simulation in relation to this question. Hippocampal sharp-wave ripples and theta oscillations have been found to instantiate previous and anticipated experiences. These hippocampal activations coordinate with neural reward and fear centers as well as sensory and cortical areas in ways that are associated with conscious episodic mental imagery in humans. Moreover, such hippocampal “re- and preplay” has been found to contribute to instrumental decision making, the learning of value representations, and the delay of rewards in rats. The functional and structural features of hippocampal simulation are highly conserved across mammals. This evidence makes it reasonable to assume that internally triggered experiences of hedonic valence (IHVs) are pervasive across (at least) all mammals. This conclusion has important welfare implications. Most prominently, IHVs act as a kind of “welfare multiplier” through which the welfare impacts of any given experience of pain or pleasure are increased through each future retrieval. However, IHVs also have practical implications for welfare assessment and cause prioritization.
Collapse
Affiliation(s)
| | - Bob Fischer
- Department of Philosophy, Texas State University
| |
Collapse
|
24
|
Colas JT, Dundon NM, Gerraty RT, Saragosa‐Harris NM, Szymula KP, Tanwisuth K, Tyszka JM, van Geen C, Ju H, Toga AW, Gold JI, Bassett DS, Hartley CA, Shohamy D, Grafton ST, O'Doherty JP. Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T. Hum Brain Mapp 2022; 43:4750-4790. [PMID: 35860954 PMCID: PMC9491297 DOI: 10.1002/hbm.25988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2022] [Revised: 05/20/2022] [Accepted: 06/10/2022] [Indexed: 11/12/2022] Open
Abstract
The model-free algorithms of "reinforcement learning" (RL) have gained clout across disciplines, but so too have model-based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This "generalized reinforcement learning" (GRL) model, a frugal extension of RL, parsimoniously retains the single reward-prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal-learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high-resolution high-field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value-based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations.
Collapse
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
| | - Neil M. Dundon
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Department of Child and Adolescent Psychiatry, Psychotherapy, and Psychosomatics, University of Freiburg, Freiburg im Breisgau, Germany
| | - Raphael T. Gerraty
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Center for Science and Society, Columbia University, New York, New York, USA
| | - Natalie M. Saragosa‐Harris
- Department of Psychology, New York University, New York, New York, USA
- Department of Psychology, University of California, Los Angeles, California, USA
| | - Karol P. Szymula
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Koranis Tanwisuth
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Department of Psychology, University of California, Berkeley, California, USA
| | - J. Michael Tyszka
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
| | - Camilla van Geen
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Harang Ju
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Arthur W. Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
| | - Joshua I. Gold
- Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Dani S. Bassett
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Santa Fe Institute, Santa Fe, New Mexico, USA
| | - Catherine A. Hartley
- Department of Psychology, New York University, New York, New York, USA
- Center for Neural Science, New York University, New York, New York, USA
| | - Daphna Shohamy
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Kavli Institute for Brain Science, Columbia University, New York, New York, USA
| | - Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
| | - John P. O'Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
| |
Collapse
|
25
|
Lan DCL, Browning M. What Can Reinforcement Learning Models of Dopamine and Serotonin Tell Us about the Action of Antidepressants? COMPUTATIONAL PSYCHIATRY (CAMBRIDGE, MASS.) 2022; 6:166-188. [PMID: 38774776 PMCID: PMC11104395 DOI: 10.5334/cpsy.83] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/30/2021] [Accepted: 06/29/2022] [Indexed: 11/20/2022]
Abstract
Although evidence suggests that antidepressants are effective at treating depression, the mechanisms behind antidepressant action remain unclear, especially at the cognitive/computational level. In recent years, reinforcement learning (RL) models have increasingly been used to characterise the roles of neurotransmitters and to probe the computations that might be altered in psychiatric disorders like depression. Hence, RL models might present an opportunity for us to better understand the computational mechanisms underlying antidepressant effects. Moreover, RL models may also help us shed light on how these computations may be implemented in the brain (e.g., in midbrain, striatal, and prefrontal regions) and how these neural mechanisms may be altered in depression and remediated by antidepressant treatments. In this paper, we evaluate the ability of RL models to help us understand the processes underlying antidepressant action. To do this, we review the preclinical literature on the roles of dopamine and serotonin in RL, draw links between these findings and clinical work investigating computations altered in depression, and appraise the evidence linking modification of RL processes to antidepressant function. Overall, while there is no shortage of promising ideas about the computational mechanisms underlying antidepressant effects, there is insufficient evidence directly implicating these mechanisms in the response of depressed patients to antidepressant treatment. Consequently, future studies should investigate these mechanisms in samples of depressed patients and assess whether modifications in RL processes mediate the clinical effect of antidepressant treatments.
Collapse
Affiliation(s)
- Denis C. L. Lan
- Department of Experimental Psychology, University of Oxford, Oxford, GB
| | | |
Collapse
|
26
|
de Jong JW, Fraser KM, Lammel S. Mesoaccumbal Dopamine Heterogeneity: What Do Dopamine Firing and Release Have to Do with It? Annu Rev Neurosci 2022; 45:109-129. [PMID: 35226827 PMCID: PMC9271543 DOI: 10.1146/annurev-neuro-110920-011929] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Ventral tegmental area (VTA) dopamine (DA) neurons are often thought to uniformly encode reward prediction errors. Conversely, DA release in the nucleus accumbens (NAc), the prominent projection target of these neurons, has been implicated in reinforcement learning, motivation, aversion, and incentive salience. This contrast between heterogeneous functions of DA release versus a homogeneous role for DA neuron activity raises numerous questions regarding how VTA DA activity translates into NAc DA release. Further complicating this issue is increasing evidence that distinct VTA DA projections into defined NAc subregions mediate diverse behavioral functions. Here, we evaluate evidence for heterogeneity within the mesoaccumbal DA system and argue that frameworks of DA function must incorporate the precise topographic organization of VTA DA neurons to clarify their contribution to health and disease.
Collapse
Affiliation(s)
- Johannes W de Jong
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, California, USA;
| | - Kurt M Fraser
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, California, USA;
| | - Stephan Lammel
- Department of Molecular and Cell Biology and Helen Wills Neuroscience Institute, University of California, Berkeley, California, USA;
| |
Collapse
|
27
|
Seitz BM, Hoang IB, DiFazio LE, Blaisdell AP, Sharpe MJ. Dopamine errors drive excitatory and inhibitory components of backward conditioning in an outcome-specific manner. Curr Biol 2022; 32:3210-3218.e3. [PMID: 35752165 DOI: 10.1016/j.cub.2022.06.035] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2022] [Revised: 04/29/2022] [Accepted: 06/13/2022] [Indexed: 01/06/2023]
Abstract
For over two decades, phasic activity in midbrain dopamine neurons was considered synonymous with the prediction error in temporal-difference reinforcement learning.1-4 Central to this proposal is the notion that reward-predictive stimuli become endowed with the scalar value of predicted rewards. When these cues are subsequently encountered, their predictive value is compared to the value of the actual reward received, allowing for the calculation of prediction errors.5,6 Phasic firing of dopamine neurons was proposed to reflect this computation,1,2 facilitating the backpropagation of value from the predicted reward to the reward-predictive stimulus, thus reducing future prediction errors. There are two critical assumptions of this proposal: (1) that dopamine errors can only facilitate learning about scalar value and not more complex features of predicted rewards, and (2) that the dopamine signal can only be involved in anticipatory cue-reward learning in which cues or actions precede rewards. Recent work7-15 has challenged the first assumption, demonstrating that phasic dopamine signals across species are involved in learning about more complex features of the predicted outcomes, in a manner that transcends this value computation. Here, we tested the validity of the second assumption. Specifically, we examined whether phasic midbrain dopamine activity would be necessary for backward conditioning, when a neutral cue reliably follows a rewarding outcome.16-20 Using a specific Pavlovian-to-instrumental transfer (PIT) procedure,21-23 we show rats learn both excitatory and inhibitory components of a backward association, and that this association entails knowledge of the specific identity of the reward and cue. We demonstrate that brief optogenetic inhibition of VTA DA neurons timed to the transition between the reward and cue reduces both of these components of backward conditioning. These findings suggest VTA DA neurons are capable of facilitating associations between contiguously occurring events, regardless of the content of those events. We conclude that these data may be in line with suggestions that the VTA DA error acts as a universal teaching signal. This may provide insight into why dopamine function has been implicated in myriad psychological disorders that are characterized by very distinct reinforcement-learning deficits.
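The classical scalar computation this paper interrogates, delta = r + gamma*V(s') - V(s), takes only a few lines to write down. A minimal sketch under stated assumptions: the dictionary value table, the cue-then-reward episode, and the parameter values are illustrative, not taken from the paper.

```python
def td_prediction_errors(values, episode, alpha=0.1, gamma=0.95):
    """Scalar temporal-difference errors for one episode of (state, reward) steps.

    Implements the computation long attributed to phasic dopamine:
    delta = r + gamma * V(s') - V(s). Over repeated episodes, value
    backpropagates from the reward to the reward-predictive cue.
    """
    deltas = []
    padded = episode + [(None, 0.0)]            # terminal pseudo-state
    for (s, r), (s_next, _) in zip(episode, padded[1:]):
        delta = r + gamma * values.get(s_next, 0.0) - values.get(s, 0.0)
        values[s] = values.get(s, 0.0) + alpha * delta
        deltas.append(delta)
    return deltas

# Repeated cue -> reward episodes: the cue acquires value and the
# prediction error at reward delivery shrinks toward zero.
V = {}
for _ in range(2000):
    deltas = td_prediction_errors(V, [("cue", 0.0), ("reward", 1.0)])
```

Note what this formulation cannot express, which is the paper's point: the error is a single scalar, carrying no information about reward identity and no mechanism for learning about a cue that *follows* the reward.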
Collapse
Affiliation(s)
- Benjamin M Seitz
- Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
| | - Ivy B Hoang
- Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
| | - Lauren E DiFazio
- Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
| | - Aaron P Blaisdell
- Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
| | - Melissa J Sharpe
- Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA.
| |
Collapse
|
28
|
Witkowski PP, Park SA, Boorman ED. Neural mechanisms of credit assignment for inferred relationships in a structured world. Neuron 2022; 110:2680-2690.e9. [PMID: 35714610 DOI: 10.1016/j.neuron.2022.05.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2021] [Revised: 04/12/2022] [Accepted: 05/18/2022] [Indexed: 10/18/2022]
Abstract
Animals abstract compact representations of a task's structure, which supports accelerated learning and flexible behavior. Whether and how such abstracted representations may be used to assign credit for inferred, but unobserved, relationships in structured environments are unknown. We develop a hierarchical reversal-learning task and Bayesian learning model to assess the computational and neural mechanisms underlying how humans infer specific choice-outcome associations via structured knowledge. We find that the medial prefrontal cortex (mPFC) efficiently represents hierarchically related choice-outcome associations governed by the same latent cause, using a generalized code to assign credit for both experienced and inferred outcomes. Furthermore, the mPFC and lateral orbitofrontal cortex track the current "position" within a latent association space that generalizes over stimuli. Collectively, these findings demonstrate the importance of both tracking the current position in an abstracted task space and efficient, generalizable representations in the prefrontal cortex for supporting flexible learning and inference in structured environments.
Collapse
Affiliation(s)
- Phillip P Witkowski
- Center for Mind and Brain, University of California, Davis, Davis, CA 95618; Department of Psychology, University of California, Davis, Davis, CA 95618.
| | - Seongmin A Park
- Center for Mind and Brain, University of California, Davis, Davis, CA 95618
| | - Erie D Boorman
- Center for Mind and Brain, University of California, Davis, Davis, CA 95618; Department of Psychology, University of California, Davis, Davis, CA 95618.
| |
Collapse
|
29
|
Rybicki AJ, Sowden SL, Schuster B, Cook JL. Dopaminergic challenge dissociates learning from primary versus secondary sources of information. eLife 2022; 11:74893. [PMID: 35289748 PMCID: PMC9023054 DOI: 10.7554/elife.74893] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2021] [Accepted: 03/14/2022] [Indexed: 11/13/2022] Open
Abstract
Some theories of human cultural evolution posit that humans have social-specific learning mechanisms that are adaptive specialisations moulded by natural selection to cope with the pressures of group living. However, the existence of neurochemical pathways that are specialised for learning from social information and individual experience is widely debated. Cognitive neuroscientific studies present mixed evidence for social-specific learning mechanisms: some studies find dissociable neural correlates for social and individual learning, whereas others find the same brain areas and dopamine-mediated computations involved in both. Here, we demonstrate that, like individual learning, social learning is modulated by the dopamine D2 receptor antagonist haloperidol when social information is the primary learning source, but not when it comprises a secondary, additional element. Two groups (total N = 43) completed a decision-making task which required primary learning, from own experience, and secondary learning from an additional source. For one group, the primary source was social and the secondary was individual; for the other group, this was reversed. Haloperidol affected primary learning irrespective of its social/individual nature, with no effect on learning from the secondary source. Thus, we illustrate that dopaminergic mechanisms underpinning learning can be dissociated along a primary-secondary but not a social-individual axis. These results resolve conflict in the literature and support an expanding field showing that, rather than being specialised for particular inputs, neurochemical pathways in the human brain can process both social and non-social cues and arbitrate between the two depending upon which cue is primarily relevant for the task at hand.
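The primary-secondary dissociation reported here can be schematized as two delta-rule learners with separate learning rates, with the drug applied only to the primary rate. This is not the authors' fitted model; the function, the rates `alpha_p`/`alpha_s`, and the multiplicative `drug_factor` are all hypothetical illustrations of the dissociation's logic.

```python
def two_source_update(v_primary, v_secondary, outcome_p, outcome_s,
                      alpha_p=0.4, alpha_s=0.2, drug_factor=1.0):
    """Schematic two-source learner (illustrative, not the fitted model).

    Each information source (social or individual, whichever is primary for
    a given group) gets its own delta-rule update; the D2 antagonist is
    modeled, purely illustratively, as scaling only the primary rate.
    """
    v_primary += drug_factor * alpha_p * (outcome_p - v_primary)
    v_secondary += alpha_s * (outcome_s - v_secondary)
    return v_primary, v_secondary

# Under the drug, only the primary source's update shrinks.
vp_drug, vs_drug = two_source_update(0.0, 0.0, 1.0, 1.0, drug_factor=0.5)
vp_plac, vs_plac = two_source_update(0.0, 0.0, 1.0, 1.0, drug_factor=1.0)
```

The key design point is that `drug_factor` attaches to whichever source is primary, not to the social source per se, which is exactly the axis along which the haloperidol effect dissociated.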
Collapse
|
30
|
Aguilar-Canto F, Calvo H. A Hebbian Approach to Non-Spatial Prelinguistic Reasoning. Brain Sci 2022; 12:brainsci12020281. [PMID: 35204044 PMCID: PMC8870645 DOI: 10.3390/brainsci12020281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2022] [Revised: 02/12/2022] [Accepted: 02/15/2022] [Indexed: 02/01/2023] Open
Abstract
This research integrates key concepts of Computational Neuroscience, including the Bienenstock-Cooper-Munro (BCM) rule, Spike-Timing-Dependent Plasticity (STDP) rules, and the Temporal Difference Learning algorithm, with an important structure of Deep Learning (Convolutional Networks) to create an architecture with the potential of replicating observations of some cognitive experiments (particularly, those that provided some basis for sequential reasoning) while sharing the advantages already achieved by the previous proposals. In particular, we present Ring Model B, which is capable of associating visual with auditory stimuli, performing sequential predictions, and predicting reward from experience. Despite its simplicity, we consider such abilities to be a first step towards the formulation of more general models of prelinguistic reasoning.
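The BCM rule named in this abstract can be sketched in a few lines. This toy, rate-based, single-neuron version (all parameter values invented here) shows only the sliding-threshold computation, not the paper's full architecture: postsynaptic activity above the threshold potentiates active synapses, activity below it depresses them, and the threshold itself tracks recent activity.

```python
import numpy as np

def bcm_step(w, x, theta, eta=0.01, tau=0.1):
    """One step of a simplified Bienenstock-Cooper-Munro (BCM) rule.

    y above the sliding threshold theta potentiates the active weights;
    y below it depresses them. theta tracks a running average of y**2,
    which is what stabilizes the rule against runaway potentiation.
    """
    y = float(np.dot(w, x))                # postsynaptic activity
    w = w + eta * y * (y - theta) * x      # Hebbian term gated by threshold
    theta = theta + tau * (y * y - theta)  # sliding modification threshold
    return w, y, theta

w = np.array([0.5, 0.5])
x = np.array([1.0, 0.0])                   # only the first input is active
w, y, theta = bcm_step(w, x, theta=0.0)
```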
Collapse
|
31
|
Millard SJ, Bearden CE, Karlsgodt KH, Sharpe MJ. The prediction-error hypothesis of schizophrenia: new data point to circuit-specific changes in dopamine activity. Neuropsychopharmacology 2022; 47:628-640. [PMID: 34588607 PMCID: PMC8782867 DOI: 10.1038/s41386-021-01188-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/04/2021] [Revised: 08/23/2021] [Accepted: 09/07/2021] [Indexed: 02/07/2023]
Abstract
Schizophrenia is a severe psychiatric disorder affecting 21 million people worldwide. People with schizophrenia suffer from symptoms including psychosis and delusions, apathy, anhedonia, and cognitive deficits. Strikingly, schizophrenia is characterised by a learning paradox involving difficulties learning from rewarding events, whilst simultaneously 'overlearning' about irrelevant or neutral information. While dysfunction in dopaminergic signalling has long been linked to the pathophysiology of schizophrenia, a cohesive framework that accounts for this learning paradox remains elusive. Recently, there has been an explosion of new research investigating how dopamine contributes to reinforcement learning, which illustrates that midbrain dopamine contributes to reinforcement learning in complex ways not previously envisioned. These new data bring new possibilities for how dopamine signalling contributes to the symptomatology of schizophrenia. Building on recent work, we present a new neural framework for how we might envision specific dopamine circuits contributing to this learning paradox in schizophrenia in the context of models of reinforcement learning. Further, we discuss avenues of preclinical research with the use of cutting-edge neuroscience techniques where aspects of this model may be tested. Ultimately, it is hoped that this review will spur more research utilising specific reinforcement learning paradigms in preclinical models of schizophrenia, to reconcile seemingly disparate symptomatology and develop more efficient therapeutics.
Collapse
Affiliation(s)
- Samuel J. Millard
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
| | - Carrie E. Bearden
- Department of Psychology, University of California, Los Angeles, CA 90095, USA; Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA 90095, USA
| | - Katherine H. Karlsgodt
- Department of Psychology, University of California, Los Angeles, CA 90095, USA; Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA 90095, USA
| | - Melissa J. Sharpe
- Department of Psychology, University of California, Los Angeles, CA 90095, USA
| |
Collapse
|
32
|
Deserno L, Moran R, Michely J, Lee Y, Dayan P, Dolan RJ. Dopamine enhances model-free credit assignment through boosting of retrospective model-based inference. eLife 2021; 10:e67778. [PMID: 34882092 PMCID: PMC8758138 DOI: 10.7554/elife.67778] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2021] [Accepted: 12/08/2021] [Indexed: 11/13/2022] Open
Abstract
Dopamine is implicated in representing model-free (MF) reward prediction errors as well as influencing model-based (MB) credit assignment and choice. Putative cooperative interactions between MB and MF systems include a guidance of MF credit assignment by MB inference. Here, we used a double-blind, placebo-controlled, within-subjects design to test the hypothesis that enhancing dopamine levels boosts the guidance of MF credit assignment by MB inference. In line with this, we found that levodopa enhanced guidance of MF credit assignment by MB inference, without impacting MF and MB influences directly. This drug effect correlated negatively with a dopamine-dependent change in purely MB credit assignment, possibly reflecting a trade-off between these two MB components of behavioural control. Our findings of a dopamine boost in MB inference guidance of MF learning highlight a novel DA influence on MB-MF cooperative interactions.
Collapse
Affiliation(s)
- Lorenz Deserno
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, University of Würzburg, Würzburg, Germany
- Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
| | - Rani Moran
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
| | - Jochen Michely
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Department of Psychiatry and Psychotherapy, Charité Universitätsmedizin Berlin, Berlin, Germany
| | - Ying Lee
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
| | - Peter Dayan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- University of Tübingen, Tübingen, Germany
| | - Raymond J Dolan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
Collapse
|
33
|
Liakoni V, Lehmann MP, Modirshanechi A, Brea J, Lutti A, Gerstner W, Preuschoff K. Brain signals of a Surprise-Actor-Critic model: Evidence for multiple learning modules in human decision making. Neuroimage 2021; 246:118780. [PMID: 34875383 DOI: 10.1016/j.neuroimage.2021.118780] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2021] [Revised: 08/03/2021] [Accepted: 12/04/2021] [Indexed: 11/25/2022] Open
Abstract
Learning how to reach a reward over long series of actions is a remarkable capability of humans, and potentially guided by multiple parallel learning modules. Current brain imaging of learning modules is limited by (i) simple experimental paradigms, (ii) entanglement of brain signals of different learning modules, and (iii) a limited number of computational models considered as candidates for explaining behavior. Here, we address these three limitations and (i) introduce a complex sequential decision making task with surprising events that allows us to (ii) dissociate correlates of reward prediction errors from those of surprise in functional magnetic resonance imaging (fMRI); and (iii) we test behavior against a large repertoire of model-free, model-based, and hybrid reinforcement learning algorithms, including a novel surprise-modulated actor-critic algorithm. Surprise, derived from an approximate Bayesian approach for learning the world-model, is extracted in our algorithm from a state prediction error. Surprise is then used to modulate the learning rate of a model-free actor, which itself learns via the reward prediction error from model-free value estimation by the critic. We find that action choices are well explained by pure model-free policy gradient, but reaction times and neural data are not. We identify signatures of both model-free and surprise-based learning signals in blood oxygen level dependent (BOLD) responses, supporting the existence of multiple parallel learning modules in the brain. Our results extend previous fMRI findings to a multi-step setting and emphasize the role of policy gradient and surprise signalling in human learning.
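The model family described here, a critic updating on reward-prediction error plus an actor whose step size is modulated by surprise from the world model, can be sketched schematically. Everything concrete below is a stand-in for the paper's Bayesian formulation: Shannon surprise (-log p of the observed transition) as the surprise measure, the linear scaling of the actor's learning rate, and the parameter values are all assumptions made for illustration.

```python
import numpy as np

def surprise_actor_critic_step(actor_pref, critic_v, v_next, reward,
                               p_observed, gamma=0.9, alpha_c=0.1, alpha_a=0.2):
    """One schematic update of a surprise-modulated actor-critic.

    The critic learns from the usual reward-prediction error (RPE); the
    actor's learning rate is scaled by the surprise of the observed state
    transition under the learner's world model (here simply -log p).
    """
    surprise = -np.log(p_observed)                 # state-prediction surprise
    rpe = reward + gamma * v_next - critic_v       # critic's RPE
    critic_v += alpha_c * rpe                      # model-free value update
    actor_pref += alpha_a * (1.0 + surprise) * rpe # surprise scales actor step
    return actor_pref, critic_v, rpe, surprise

a, v, rpe, s = surprise_actor_critic_step(
    actor_pref=0.0, critic_v=0.0, v_next=0.0, reward=1.0, p_observed=0.5)
```

The point of the sketch is the dissociation the fMRI analysis exploits: `rpe` and `surprise` are computed from different quantities (reward versus state transitions), so their neural correlates can be separated.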
Collapse
Affiliation(s)
- Vasiliki Liakoni
- École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland.
| | - Marco P Lehmann
- École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
| | - Alireza Modirshanechi
- École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
| | - Johanni Brea
- École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
| | - Antoine Lutti
- Laboratoire de recherche en neuroimagerie (LREN), Department of Clinical Neurosciences, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
| | - Wulfram Gerstner
- École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
| | - Kerstin Preuschoff
- Geneva Finance Research Institute & Interfaculty Center for Affective Sciences, University of Geneva, Geneva, Switzerland
| |
Collapse
|
34
|
Prével A, Krebs RM. Higher-Order Conditioning With Simultaneous and Backward Conditioned Stimulus: Implications for Models of Pavlovian Conditioning. Front Behav Neurosci 2021; 15:749517. [PMID: 34858147 PMCID: PMC8632485 DOI: 10.3389/fnbeh.2021.749517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2021] [Accepted: 10/18/2021] [Indexed: 11/23/2022] Open
Abstract
In a new environment, humans and animals can detect and learn that cues predict meaningful outcomes, and use this information to adapt their responses. This process is termed Pavlovian conditioning. Pavlovian conditioning is also observed for stimuli that predict outcome-associated cues; a second type of conditioning is termed higher-order Pavlovian conditioning. In this review, we will focus on higher-order conditioning studies with simultaneous and backward conditioned stimuli. We will examine how the results from these experiments pose a challenge to models of Pavlovian conditioning like the Temporal Difference (TD) models, in which learning is mainly driven by reward prediction errors. Contrasting with this view, the results suggest that humans and animals can form complex representations of the (temporal) structure of the task, and use this information to guide behavior, which seems consistent with model-based reinforcement learning. Future investigations involving these procedures could result in important new insights on the mechanisms that underlie Pavlovian conditioning.
Collapse
Affiliation(s)
- Arthur Prével
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
| | - Ruth M Krebs
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
| |
Collapse
|
35
|
McDougle SD, Ballard IC, Baribault B, Bishop SJ, Collins AGE. Executive Function Assigns Value to Novel Goal-Congruent Outcomes. Cereb Cortex 2021; 32:231-247. [PMID: 34231854 PMCID: PMC8634563 DOI: 10.1093/cercor/bhab205] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Revised: 05/10/2021] [Accepted: 06/04/2021] [Indexed: 11/14/2022] Open
Abstract
People often learn from the outcomes of their actions, even when these outcomes do not involve material rewards or punishments. How does our brain provide this flexibility? We combined behavior, computational modeling, and functional neuroimaging to probe whether learning from abstract novel outcomes harnesses the same circuitry that supports learning from familiar secondary reinforcers. Behavior and neuroimaging revealed that novel images can act as a substitute for rewards during instrumental learning, producing reliable reward-like signals in dopaminergic circuits. Moreover, we found evidence that prefrontal correlates of executive control may play a role in shaping flexible responses in reward circuits. These results suggest that learning from novel outcomes is supported by an interplay between high-level representations in prefrontal cortex and low-level responses in subcortical reward circuits. This interaction may allow for human reinforcement learning over arbitrarily abstract reward functions.
Collapse
Affiliation(s)
| | - Ian C Ballard
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720, USA
| | - Beth Baribault
- Department of Psychology, University of California, Berkeley, CA 94704, USA
| | - Sonia J Bishop
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720, USA
- Department of Psychology, University of California, Berkeley, CA 94704, USA
| | - Anne G E Collins
- Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720, USA
- Department of Psychology, University of California, Berkeley, CA 94704, USA
| |
Collapse
|
36
|
Subramanian A, Chitlangia S, Baths V. Reinforcement learning and its connections with neuroscience and psychology. Neural Netw 2021; 145:271-287. [PMID: 34781215 DOI: 10.1016/j.neunet.2021.10.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2021] [Revised: 09/26/2021] [Accepted: 10/01/2021] [Indexed: 11/19/2022]
Abstract
Reinforcement learning methods have recently been very successful at performing complex sequential tasks like playing Atari games, Go and Poker. These algorithms have outperformed humans in several tasks by learning from scratch, using only scalar rewards obtained through interaction with their environment. While there certainly has been considerable independent innovation to produce such results, many core ideas in reinforcement learning are inspired by phenomena in animal learning, psychology and neuroscience. In this paper, we comprehensively review a large number of findings in both neuroscience and psychology that evidence reinforcement learning as a promising candidate for modeling learning and decision making in the brain. In doing so, we construct a mapping between various classes of modern RL algorithms and specific findings in both neurophysiological and behavioral literature. We then discuss the implications of this observed relationship between RL, neuroscience and psychology and its role in advancing research in both AI and brain science.
Affiliation(s)
- Ajay Subramanian
- Department of Psychology, New York University, New York, New York, 10003, USA; Cognitive Neuroscience Lab, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India.
| | - Sharad Chitlangia
- Amazon; Cognitive Neuroscience Lab, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India.
| | - Veeky Baths
- Cognitive Neuroscience Lab, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India; Department of Biological Sciences, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India.
| |
|
37
|
Langdon A, Botvinick M, Nakahara H, Tanaka K, Matsumoto M, Kanai R. Meta-learning, social cognition and consciousness in brains and machines. Neural Netw 2021; 145:80-89. [PMID: 34735893 DOI: 10.1016/j.neunet.2021.10.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2021] [Revised: 09/20/2021] [Accepted: 10/01/2021] [Indexed: 12/11/2022]
Abstract
The intersection between neuroscience and artificial intelligence (AI) research has created synergistic effects in both fields. While neuroscientific discoveries have inspired the development of AI architectures, new ideas and algorithms from AI research have produced new ways to study brain mechanisms. A well-known example is the case of reinforcement learning (RL), which has stimulated neuroscience research on how animals learn to adjust their behavior to maximize reward. In this review article, we cover recent collaborative work between the two fields in the context of meta-learning and its extension to social cognition and consciousness. Meta-learning refers to the ability to learn how to learn, such as learning to adjust hyperparameters of existing learning algorithms and how to use existing models and knowledge to efficiently solve new tasks. This meta-learning capability is important for making existing AI systems more adaptive and flexible to efficiently solve new tasks. Since this is one of the areas where there is a gap between human performance and current AI systems, successful collaboration should produce new ideas and progress. Starting from the role of RL algorithms in driving neuroscience, we discuss recent developments in deep RL applied to modeling prefrontal cortex functions. Even from a broader perspective, we discuss the similarities and differences between social cognition and meta-learning, and finally conclude with speculations on the potential links between intelligence as endowed by model-based RL and consciousness. For future work we highlight data efficiency, autonomy and intrinsic motivation as key research areas for advancing both fields.
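The hyperparameter-adjustment sense of meta-learning described above ("learning to learn") can be caricatured in a few lines: an outer loop evaluates how well an inner delta-rule learner does under each candidate learning rate and keeps the best. The candidate set and inner task are assumptions for illustration, not the systems discussed in the article:

```python
# Toy "learning to learn": an outer loop selects the hyperparameter
# (learning rate) under which an inner learner performs best.

def inner_error(lr, target=1.0, steps=20):
    """Final absolute error of a delta-rule learner run with this learning rate."""
    est = 0.0
    for _ in range(steps):
        est += lr * (target - est)
    return abs(target - est)

def meta_learn_lr(candidates=(0.01, 0.1, 0.5)):
    """Outer (meta) loop: pick the hyperparameter that makes inner learning best."""
    return min(candidates, key=inner_error)

best = meta_learn_lr()
```

Real meta-learning systems adapt such hyperparameters online rather than by grid search, but the two-level structure (inner learner, outer adaptation of the learner itself) is the same.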
Affiliation(s)
- Angela Langdon
- Princeton Neuroscience Institute, Princeton University, USA
| | - Matthew Botvinick
- DeepMind, London, UK; Gatsby Computational Neuroscience Unit, University College London, London, UK
| | | | - Keiji Tanaka
- RIKEN Center for Brain Science, Wako, Saitama, Japan
| | - Masayuki Matsumoto
- Division of Biomedical Science, Faculty of Medicine, University of Tsukuba, Ibaraki, Japan; Graduate School of Comprehensive Human Sciences, University of Tsukuba, Ibaraki, Japan; Transborder Medical Research Center, University of Tsukuba, Ibaraki, Japan
| | | |
|
38
|
Value-free reinforcement learning: policy optimization as a minimal model of operant behavior. Curr Opin Behav Sci 2021; 41:114-121. [DOI: 10.1016/j.cobeha.2021.04.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
|
39
|
Knowlton CJ, Ziouziou TI, Hammer N, Roeper J, Canavier CC. Inactivation mode of sodium channels defines the different maximal firing rates of conventional versus atypical midbrain dopamine neurons. PLoS Comput Biol 2021; 17:e1009371. [PMID: 34534209 PMCID: PMC8480832 DOI: 10.1371/journal.pcbi.1009371] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2021] [Revised: 09/29/2021] [Accepted: 08/23/2021] [Indexed: 12/21/2022] Open
Abstract
Two subpopulations of midbrain dopamine (DA) neurons are known to have different dynamic firing ranges in vitro that correspond to distinct projection targets: the originally identified conventional DA neurons project to the dorsal striatum and the lateral shell of the nucleus accumbens, whereas an atypical DA population with higher maximum firing frequencies projects to prefrontal regions and other limbic regions including the medial shell of the nucleus accumbens. Using a computational model, we show that previously identified differences in biophysical properties do not fully account for the larger dynamic range of the atypical population and predict that the major difference is that originally identified conventional cells have larger occupancy of voltage-gated sodium channels in a long-term inactivated state that recovers slowly; stronger sodium and potassium conductances during action potential firing are also predicted for the conventional compared to the atypical DA population. These differences in sodium channel gating imply that longer intervals between spikes are required in the conventional population for full recovery from long-term inactivation induced by the preceding spike, hence the lower maximum frequency. These same differences can also change the bifurcation structure to account for distinct modes of entry into depolarization block: abrupt versus gradual. The model predicted that, in cells that have entered depolarization block, an additional depolarization is much more likely to evoke an action potential in the conventional DA population. New experiments comparing lateral- to medial-shell-projecting neurons confirmed this model prediction, with implications for differential synaptic integration in the two populations.
We developed a theoretical and mathematical framework that could explain the major electrophysiological differences between the conventional midbrain dopamine (DA) neurons with a low maximum firing rate, and the more recently identified atypical DA neurons. Testable predictions from this framework were then verified with in vitro patch-clamp recordings from DA neurons with identified phenotypes and projection targets. Since different subpopulations of DA neurons participate in different circuits, and these circuits are likely differentially dysregulated in diseases such as addiction, Parkinson disease, and schizophrenia, it is important to identify the differences of their intrinsic electrophysiological properties as a prelude to developing more precisely targeted therapies.
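The proposed mechanism, slow recovery from long-term sodium-channel inactivation capping the maximum firing rate, can be sketched with a toy occupancy model: each spike pushes a fraction of channels into a long-term inactivated pool that recovers exponentially between spikes. The time constant and per-spike inactivation fraction below are illustrative placeholders, not fitted values from the paper:

```python
# Toy model: steady-state fraction of sodium channels trapped in a
# long-term inactivated state when firing at a fixed inter-spike interval.
# tau (ms) and frac_per_spike are illustrative, not fitted parameters.
import math

def steady_state_inactivated(isi, tau=50.0, frac_per_spike=0.2):
    r = math.exp(-isi / tau)  # fraction of inactivation surviving one interval
    # Fixed point of I_next = (I + frac_per_spike * (1 - I)) * r
    return frac_per_spike * r / (1 - r * (1 - frac_per_spike))

short_isi = steady_state_inactivated(10.0)   # fast firing
long_isi = steady_state_inactivated(100.0)   # slow firing
# Shorter intervals trap more channels, reducing availability at high rates.
```

A population with a larger per-spike inactivation fraction or slower recovery thus needs longer inter-spike intervals to keep enough channels available, which is the qualitative logic behind the conventional cells' lower maximum frequency.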
Affiliation(s)
- Christopher J. Knowlton
- Department of Cell Biology and Anatomy, School of Medicine, Louisiana State University Health Sciences Center, New Orleans, Louisiana, United States of America
| | | | - Niklas Hammer
- Institut für Neurophysiologie, Goethe University, Frankfurt, Germany
| | - Jochen Roeper
- Institut für Neurophysiologie, Goethe University, Frankfurt, Germany
| | - Carmen C. Canavier
- Department of Cell Biology and Anatomy, School of Medicine, Louisiana State University Health Sciences Center, New Orleans, Louisiana, United States of America
- * E-mail:
| |
|
40
|
Yu LQ, Wilson RC, Nassar MR. Adaptive learning is structure learning in time. Neurosci Biobehav Rev 2021; 128:270-281. [PMID: 34144114 PMCID: PMC8422504 DOI: 10.1016/j.neubiorev.2021.06.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2020] [Revised: 04/19/2021] [Accepted: 06/11/2021] [Indexed: 10/21/2022]
Abstract
People use information flexibly. They often combine multiple sources of relevant information over time in order to inform decisions with little or no interference from intervening irrelevant sources. They adjust the degree to which they use new information over time rationally in accordance with environmental statistics and their own uncertainty. They can even use information gained in one situation to solve a problem in a very different one. Learning flexibly rests on the ability to infer the context at a given time, and therefore knowing which pieces of information to combine and which to separate. We review the psychological and neural mechanisms behind adaptive learning and structure learning to outline how people pool together relevant information, demarcate contexts, prevent interference between information collected in different contexts, and transfer information from one context to another. By examining all of these processes through the lens of optimal inference we bridge concepts from multiple fields to provide a unified multi-system view of how the brain exploits structure in time to optimize learning.
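One common formalization of such adaptive learning is a delta rule whose learning rate grows with surprise, so that outcomes suggesting a context change drive near-complete belief updates. This simplified sketch illustrates the idea only; it is not one of the normative models reviewed here, and its parameters are arbitrary:

```python
# Illustrative delta rule with a surprise-modulated learning rate.

def adaptive_update(belief, outcome, base_lr=0.1, surprise_scale=2.0):
    error = outcome - belief
    # Larger unsigned errors are treated as evidence of a context change,
    # earning a larger effective learning rate (capped at 1).
    lr = min(1.0, base_lr * (1.0 + surprise_scale * abs(error)))
    return belief + lr * error

b = 0.0
b_small = adaptive_update(b, 0.1)    # expected outcome: small step
b_big = adaptive_update(b, 10.0)     # surprising outcome: full update
```

Normative treatments derive the learning rate from changepoint probability and estimation uncertainty rather than raw error size, but the qualitative behavior, small steps in stable contexts and resets after surprises, is the same.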
Affiliation(s)
- Linda Q Yu
- Carney Institute for Brain Sciences, Brown University, 164 Angell Street, Providence, RI, 02912, USA.
| | - Robert C Wilson
- Department of Psychology, University of Arizona, Tucson, AZ, 85721, USA
| | - Matthew R Nassar
- Carney Institute for Brain Sciences, Brown University, 164 Angell Street, Providence, RI, 02912, USA
| |
|
41
|
Chen Y. Neural Representation of Costs and Rewards in Decision Making. Brain Sci 2021; 11:1096. [PMID: 34439715 PMCID: PMC8391424 DOI: 10.3390/brainsci11081096] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Revised: 08/17/2021] [Accepted: 08/18/2021] [Indexed: 11/16/2022] Open
Abstract
Decision making is crucial for animal survival because the choices animals make based on their current situation influence their future rewards and carry potential costs. This review summarises recent developments in decision making, discusses how rewards and costs could be encoded in the brain, and how different options are compared such that the optimal one is chosen. Rewards and costs are mainly encoded by forebrain structures (e.g., anterior cingulate cortex, orbitofrontal cortex), and their value is updated through learning. Recent developments on the roles of dopamine and the lateral habenula in reporting prediction errors and instructing learning will be emphasised. The importance of dopamine in powering the choice and accounting for the internal state will also be discussed. While the orbitofrontal cortex is the place where the state values are stored, the anterior cingulate cortex is more important when the environment is volatile. All of these structures compare different attributes of the task simultaneously, and the local competition of different neuronal networks allows for the selection of the most appropriate one. Therefore, the total value of the task is not encoded as a scalar quantity in the brain but, instead, as an emergent phenomenon, arising from the computation at different brain regions.
Affiliation(s)
- Yixuan Chen
- Queens' College, University of Cambridge, Cambridgeshire CB3 9ET, UK
| |
|
42
|
McDannald MA. Decision making: Serotonin goes for goal. Curr Biol 2021; 31:R726-R727. [PMID: 34102122 DOI: 10.1016/j.cub.2021.04.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Abstract
Decision making is adaptive when our actions align with our goals. A new study shows that activity of dorsal raphe serotonin neurons is essential to adaptive decision making, permitting actions to reflect the current goal value.
Affiliation(s)
- Michael A McDannald
- Department of Psychology and Neuroscience, Boston College, 514 McGuinn Hall, Chestnut Hill, MA, USA.
| |
|
43
|
Wood AN. New roles for dopamine in motor skill acquisition: lessons from primates, rodents, and songbirds. J Neurophysiol 2021; 125:2361-2374. [PMID: 33978497 DOI: 10.1152/jn.00648.2020] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Motor learning is a core aspect of human life and appears to be ubiquitous throughout the animal kingdom. Dopamine, a neuromodulator with a multifaceted role in synaptic plasticity, may be a key signaling molecule for motor skill learning. Though typically studied in the context of reward-based associative learning, dopamine appears to be necessary for some types of motor learning. Mesencephalic dopamine structures are highly conserved among vertebrates, as are some of their primary targets within the basal ganglia, a subcortical circuit important for motor learning and motor control. With a focus on the benefits of cross-species comparisons, this review examines how "model-free" and "model-based" computational frameworks for understanding dopamine's role in associative learning may be applied to motor learning. The hypotheses that dopamine could drive motor learning either by functioning as a reward prediction error, through passive facilitating of normal basal ganglia activity, or through other mechanisms are examined in light of new studies using humans, rodents, and songbirds. Additionally, new paradigms that could enhance our understanding of dopamine's role in motor learning by bridging the gap between the theoretical literature on motor learning in humans and other species are discussed.
Affiliation(s)
- A N Wood
- Department of Biology and Graduate Program in Neuroscience, Emory University, Atlanta, Georgia
| |
|
44
|
Frömer R, Nassar MR, Bruckner R, Stürmer B, Sommer W, Yeung N. Response-based outcome predictions and confidence regulate feedback processing and learning. eLife 2021; 10:e62825. [PMID: 33929323 PMCID: PMC8121545 DOI: 10.7554/elife.62825] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Accepted: 04/30/2021] [Indexed: 12/30/2022] Open
Abstract
Influential theories emphasize the importance of predictions in learning: we learn from feedback to the extent that it is surprising, and thus conveys new information. Here, we explore the hypothesis that surprise depends not only on comparing current events to past experience, but also on online evaluation of performance via internal monitoring. Specifically, we propose that people leverage insights from response-based performance monitoring - outcome predictions and confidence - to control learning from feedback. In line with predictions from a Bayesian inference model, we find that people who are better at calibrating their confidence to the precision of their outcome predictions learn more quickly. Further in line with our proposal, EEG signatures of feedback processing are sensitive to the accuracy of, and confidence in, post-response outcome predictions. Taken together, our results suggest that online predictions and confidence serve to calibrate neural error signals to improve the efficiency of learning.
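The core proposal, weighing external feedback against an internal, response-based outcome prediction according to confidence, can be caricatured as a precision-weighted average. This toy function is an assumption for illustration, not the authors' Bayesian inference model:

```python
# Toy precision-weighted combination of internal prediction and feedback.
# "confidence" stands in for the precision of the outcome prediction.

def integrate_feedback(prediction, feedback, confidence):
    """confidence in [0, 1] is the weight on the internal prediction;
    low confidence means the learner defers to external feedback."""
    return confidence * prediction + (1.0 - confidence) * feedback

confident = integrate_feedback(prediction=1.0, feedback=0.0, confidence=0.9)
unsure = integrate_feedback(prediction=1.0, feedback=0.0, confidence=0.1)
# A confident learner discounts contradictory feedback; an unsure one follows it.
```

On this view, miscalibrated confidence is costly in both directions: overconfidence ignores informative feedback, while underconfidence lets noisy feedback overwrite accurate predictions.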
Affiliation(s)
- Romy Frömer
- Humboldt-Universität zu Berlin, Berlin, Germany
- Brown University, Providence, United States
| | | | - Rasmus Bruckner
- Freie Universität Berlin, Berlin, Germany
- Max Planck School of Cognition, Leipzig, Germany
- International Max Planck Research School LIFE, Berlin, Germany
| | | | | | - Nick Yeung
- University of Oxford, Oxford, United Kingdom
| |
|
45
|
Takei K, Fujita K, Kashimori Y. A Neural Mechanism of Cue-Outcome Expectancy Generated by the Interaction Between Orbitofrontal Cortex and Amygdala. Chem Senses 2021; 45:15-26. [PMID: 31599930 DOI: 10.1093/chemse/bjz066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Taste perception is important for animals to obtain adequate nutrients and avoid toxins. Appetitive and aversive behaviors are produced by value evaluation of taste and by taste expectations caused by other sensations. The value evaluation, coupled with a cue presentation, produces outcome expectations and guides flexible behavior when the environment changes. Experimental studies have demonstrated distinct functional roles of the basolateral amygdala (ABL) and orbitofrontal cortex (OFC) in value evaluation and adaptive behavior: ABL is involved in generating cue-outcome associations, whereas OFC contributes to generating the cue-triggered expectations that guide adaptive behavior. However, it remains unclear how ABL and OFC acquire these functional roles during the learning of adaptive behavior. To address this issue, we focus on an odor discrimination task in rats and develop a computational model consisting of OFC and ABL interacting with reward and decision systems. We present the neural mechanisms underlying the rapid formation of cue-outcome associations in ABL and the later behavioral adaptation mediated by OFC. Moreover, we propose two functions of cue-selective neurons in OFC: first, their activation transmits value information to the decision area to guide behavior; second, their persistent activity evokes weak activity in taste-sensitive OFC neurons, leading to cue-outcome expectation. Our model further accounts for the ABL and OFC responses caused by lesions of these areas. The results provide a computational framework for how ABL and OFC are functionally linked through their interactions with the reward and decision systems.
Affiliation(s)
- Kenji Takei
- Department of Engineering Science, University of Electro-Communications, Chofu, Tokyo, Japan
| | - Kazuhisa Fujita
- Department of Engineering Science, University of Electro-Communications, Chofu, Tokyo, Japan; Department of Clinical Engineering, Faculty of Health Sciences, Komatsu University, Komatsu, Ishikawa, Japan
| | - Yoshiki Kashimori
- Department of Engineering Science, University of Electro-Communications, Chofu, Tokyo, Japan
| |
|
46
|
Abstract
Experiments have implicated dopamine in model-based reinforcement learning (RL). These findings are unexpected, as dopamine is thought to encode a reward prediction error (RPE), the key teaching signal in model-free RL. Here we examine two possible accounts of dopamine's involvement in model-based RL: the first, that dopamine neurons carry a prediction error used to update a type of predictive state representation called a successor representation; the second, that two well-established aspects of dopaminergic activity, RPEs and surprise signals, can together explain dopamine's involvement in model-based RL.
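The first account can be made concrete with a minimal successor-representation (SR) learner, in which a vector-valued TD error updates expected discounted future state occupancies; the ring environment and parameters below are illustrative:

```python
# Minimal successor-representation learner: M[s][j] estimates the expected
# discounted future occupancy of state j starting from state s, learned
# with a vector-valued TD error. Environment and parameters are illustrative.

GAMMA, ALPHA, N = 0.9, 0.1, 3

def sr_update(M, s, s_next):
    onehot = [1.0 if j == s else 0.0 for j in range(N)]
    delta = [onehot[j] + GAMMA * M[s_next][j] - M[s][j] for j in range(N)]
    for j in range(N):
        M[s][j] += ALPHA * delta[j]
    return delta

M = [[0.0] * N for _ in range(N)]
# Deterministic ring 0 -> 1 -> 2 -> 0: the SR converges to known occupancies.
for _ in range(2000):
    for s, s_next in [(0, 1), (1, 2), (2, 0)]:
        sr_update(M, s, s_next)
# Values are then linear in reward: V(s) = sum(M[s][j] * r[j] for j in range(N)),
# so a learned SR supports rapid revaluation when rewards change.
```

The vector-valued error here is what distinguishes the SR account from a classic scalar RPE: it signals which states were mis-predicted, not just how much reward was.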
|
47
|
|
48
|
|
49
|
|
50
|
Rmus M, McDougle SD, Collins AGE. The role of executive function in shaping reinforcement learning. Curr Opin Behav Sci 2021; 38:66-73. [DOI: 10.1016/j.cobeha.2020.10.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|