1
Burwell SC, Yan H, Lim SS, Shields BC, Tadross MR. Natural phasic inhibition of dopamine neurons signals cognitive rigidity. bioRxiv 2024:2024.05.09.593320. [PMID: 38766037; PMCID: PMC11100816; DOI: 10.1101/2024.05.09.593320]
Abstract
When animals unexpectedly fail, their dopamine neurons undergo phasic inhibition that canonically drives extinction learning, a cognitive-flexibility mechanism for discarding outdated strategies. However, the existing evidence equates natural and artificial phasic inhibition, despite their spatiotemporal differences. Addressing this gap, we targeted a GABAA-receptor antagonist precisely to dopamine neurons, yielding three unexpected findings. First, this intervention blocked natural phasic inhibition selectively, leaving tonic activity unaffected. Second, blocking natural phasic inhibition accelerated extinction learning, the opposite of the canonical mechanism. Third, our approach selectively benefited perseverative mice, restoring rapid extinction without affecting new reward learning. Our findings reveal that extinction learning is rapid by default and slowed by natural phasic inhibition, challenging foundational learning theories while delineating a synaptic mechanism and therapeutic target for cognitive rigidity.
Affiliation(s)
- Sasha C.V. Burwell
  - Department of Neurobiology, Duke University, Durham, NC
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD
- Haidun Yan
  - Department of Biomedical Engineering, Duke University, Durham, NC
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD
- Shaun S.X. Lim
  - Department of Biomedical Engineering, Duke University, Durham, NC
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD
- Brenda C. Shields
  - Department of Biomedical Engineering, Duke University, Durham, NC
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD
- Michael R. Tadross
  - Department of Neurobiology, Duke University, Durham, NC
  - Department of Biomedical Engineering, Duke University, Durham, NC
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD
2
Jeong H, Taylor A, Floeder JR, Lohmann M, Mihalas S, Wu B, Zhou M, Burke DA, Namboodiri VMK. Mesolimbic dopamine release conveys causal associations. Science 2022; 378:eabq6740. [PMID: 36480599; PMCID: PMC9910357; DOI: 10.1126/science.abq6740]
Abstract
Learning to predict rewards based on environmental cues is essential for survival. It is believed that animals learn to predict rewards by updating predictions whenever the outcome deviates from expectations, and that such reward prediction errors (RPEs) are signaled by the mesolimbic dopamine system, a key controller of learning. However, instead of learning prospective predictions from RPEs, animals can infer predictions by learning the retrospective causes of rewards. Hence, whether mesolimbic dopamine instead conveys a causal associative signal that sometimes resembles RPE has remained unknown. We developed an algorithm for retrospective causal learning and found that mesolimbic dopamine release conveys causal associations but not RPE, thereby challenging the dominant theory of reward learning. Our results reshape the conceptual and biological framework for associative learning.
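To make the prospective-versus-retrospective contrast concrete, here is a toy simulation, not the authors' actual algorithm: a prospective TD-style value is learned from RPEs, while a retrospective estimate simply tracks how often the cue preceded reward. All probabilities and the learning rate are invented for illustration.

```python
import random

random.seed(0)

# Toy event stream: a cue appears with p=0.5, and reward follows the cue
# with p=0.8 (hypothetical numbers chosen only for this sketch).
trials = []
for _ in range(10_000):
    cue = random.random() < 0.5
    reward = cue and (random.random() < 0.8)
    trials.append((cue, reward))

# Prospective view: learn V(cue) from reward prediction errors.
alpha, v_cue = 0.05, 0.0
for cue, reward in trials:
    if cue:
        rpe = float(reward) - v_cue   # delta = r - V
        v_cue += alpha * rpe

# Retrospective view: estimate P(cue | reward), i.e. how often the cue
# was the antecedent when a reward actually occurred.
rewards = [t for t in trials if t[1]]
p_cue_given_reward = sum(1 for t in rewards if t[0]) / len(rewards)

print(round(v_cue, 2))               # fluctuates around 0.8
print(p_cue_given_reward)            # 1.0 here, since reward requires the cue
```

In this toy world the two quantities dissociate: the prospective value reflects the reward probability given the cue, while the retrospective estimate reflects how diagnostic the cue is of reward.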
Affiliation(s)
- Huijeong Jeong
  - Department of Neurology, University of California, San Francisco, CA, USA
- Annie Taylor
  - Neuroscience Graduate Program, University of California, San Francisco, CA, USA
- Joseph R Floeder
  - Neuroscience Graduate Program, University of California, San Francisco, CA, USA
- Stefan Mihalas
  - Allen Institute for Brain Science, Seattle, WA, USA
  - Department of Applied Mathematics, University of Washington, Seattle, WA, USA
- Brenda Wu
  - Department of Neurology, University of California, San Francisco, CA, USA
- Mingkang Zhou
  - Department of Neurology, University of California, San Francisco, CA, USA
  - Neuroscience Graduate Program, University of California, San Francisco, CA, USA
- Dennis A Burke
  - Department of Neurology, University of California, San Francisco, CA, USA
- Vijay Mohan K Namboodiri
  - Department of Neurology, University of California, San Francisco, CA, USA
  - Neuroscience Graduate Program, University of California, San Francisco, CA, USA
  - Weill Institute for Neuroscience, Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
3
Ferguson LM, Ahrens AM, Longyear LG, Aldridge JW. Neurons of the Ventral Tegmental Area Encode Individual Differences in Motivational "Wanting" for Reward Cues. J Neurosci 2020; 40:8951-8963. [PMID: 33046552; PMCID: PMC7659453; DOI: 10.1523/jneurosci.2947-19.2020]
Abstract
It has been argued that the dopaminergic system is involved in the attribution of motivational value to reward-predictive cues as well as prediction error. To evaluate this, dopamine neurons were recorded from male rats performing a Pavlovian approach task containing cues with both "predictive" and "incentive" properties. All animals learned the predictive nature of the cue (illuminated lever entry into the cage), but some also found the cue attractive and were motivated toward it ("sign-trackers," STs), whereas "goal-trackers" (GTs) predominantly approached the location of the reward receptacle. Rats were implanted with tetrodes for electrophysiological recordings in the ventral tegmental area (VTA). Cells were characterized by spike waveform shape and firing rate, and firing rates and response magnitudes were assessed in relation to Pavlovian behaviors, cue presentation, and reward delivery. We identified 103 dopamine and 141 non-dopamine neurons. GTs and STs both responded to the initial lever presentation (CS1) and lever retraction (CS2); however, higher firing rates were sustained during the lever-interaction period only in STs. Further, dopamine cells of STs showed a significantly higher proportion of cells responding to both CS1 and CS2. These are the first results to show that VTA neurons encode both predictive and incentive cues; they support an important role for dopamine neurons in the attribution of incentive salience to reward-paired cues and underscore the consequences of individual differences in motivational behavior.

SIGNIFICANCE STATEMENT This project determines whether dopamine neurons encode differences in cued approach behaviors and incentive salience. How VTA neurons affect signaling through the nucleus accumbens and subsequent dopamine release is still not well known. All cues that precede a reward are predictive in nature; some, however, also have incentive value, in that they elicit approach toward them. We quantified the attribution of incentive salience through cue-approach behavior and cue interaction, and the corresponding magnitude of VTA neural firing. We found that dopamine neurons of the VTA encode the strength of incentive salience of reward cues, suggesting that dopamine neurons specifically in the VTA encode motivation.
Affiliation(s)
- Lindsay M Ferguson
  - Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109
  - Department of Neurosurgery, University of California Los Angeles, Los Angeles, California 90025
- Allison M Ahrens
  - Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109
- Lauren G Longyear
  - Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109
- J Wayne Aldridge
  - Department of Psychology, University of Michigan, Ann Arbor, Michigan 48109
4
Mollick JA, Hazy TE, Krueger KA, Nair A, Mackie P, Herd SA, O'Reilly RC. A systems-neuroscience model of phasic dopamine. Psychol Rev 2020; 127:972-1021. [PMID: 32525345; PMCID: PMC8453660; DOI: 10.1037/rev0000199]
Abstract
We describe a neurobiologically informed computational model of phasic dopamine signaling that accounts for a wide range of findings, including many considered inconsistent with the simple reward prediction error (RPE) formalism. The central feature of this PVLV framework is a distinction between a primary value (PV) system for anticipating primary rewards (unconditioned stimuli, USs) and a learned value (LV) system for learning about stimuli associated with such rewards (conditioned stimuli, CSs). The LV system represents the amygdala, which drives phasic bursting in midbrain dopamine areas, while the PV system represents the ventral striatum, which drives shunting inhibition of dopamine for expected USs (via direct inhibitory projections) and phasic pausing for expected USs (via the lateral habenula). Our model accounts for data supporting the separability of these systems, including individual differences in CS-based (sign-tracking) versus US-based (goal-tracking) learning. Both systems use competing opponent-processing pathways representing evidence for and against specific USs, which can explain data dissociating the processes involved in acquisition versus extinction conditioning. Further, opponent processing proved critical in accounting for the full range of conditioned-inhibition phenomena and the closely related paradigm of second-order conditioning. Finally, we show how additional separable pathways representing aversive USs, largely mirroring those for appetitive USs, also have important differences from the positive-valence case, allowing the model to account for several important phenomena in aversive conditioning. Overall, accounting for all of these phenomena strongly constrains the model, providing a well-validated framework for understanding phasic dopamine signaling.
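As a loose, hypothetical sketch of the PV/LV separation described above (not the published PVLV equations; the single-trial structure and learning rates are invented for illustration), one can let an LV-like association drive the dopamine response at cue time while a PV-like expectation cancels the response at reward time:

```python
# Minimal sketch, assuming a delta-rule stand-in for each system:
# the LV system (amygdala-like) learns a CS association that drives the
# burst at cue onset; the PV system (ventral-striatum-like) learns to
# expect the US and suppresses the dopamine response at reward time.
alpha = 0.1
lv_cs = 0.0   # learned value of the CS
pv_us = 0.0   # primary-value expectation of the US

for trial in range(200):
    r = 1.0                        # US delivered on every trial
    da_cue = lv_cs                 # cue-evoked burst grows with learning
    da_us = r - pv_us              # US-evoked response shrinks as PV learns
    pv_us += alpha * (r - pv_us)   # PV trained by the US itself
    lv_cs += alpha * (r - lv_cs)   # LV trained at US time (CS-US pairing)

print(round(da_cue, 2), round(da_us, 2))  # cue response near 1, US response near 0
```

The point of the sketch is only the division of labor: with training, the burst migrates from the US to the CS because two separate learned quantities, not one, shape the dopamine signal.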
Affiliation(s)
- Jessica A Mollick
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Thomas E Hazy
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Kai A Krueger
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Ananta Nair
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Prescott Mackie
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Seth A Herd
  - Department of Psychology and Neuroscience, University of Colorado Boulder
- Randall C O'Reilly
  - Department of Psychology and Neuroscience, University of Colorado Boulder
5
Aquili L, Bowman EM, Schmidt R. Occasion setters determine responses of putative DA neurons to discriminative stimuli. Neurobiol Learn Mem 2020; 173:107270. [PMID: 32565408; DOI: 10.1016/j.nlm.2020.107270]
Abstract
Midbrain dopamine (DA) neurons are involved in the processing of rewards and reward-predicting stimuli, possibly analogous to reinforcement-learning reward prediction errors. Here we studied the activity of putative DA neurons (n = 37) recorded in the ventral tegmental area of rats (n = 6) performing a behavioural task involving occasion setting. In this task an occasion setter (OS) indicated that the relationship between a discriminative stimulus (DS) and reinforcement was in effect, so that reinforcement of bar pressing occurred only when the OS (tone or houselight) was followed by the DS (houselight or tone). We found that responses of putative DA cells to the DS were enhanced when preceded by the OS, as were behavioural responses to obtain rewards. Surprisingly, though, we did not find a homogeneous increase in the mean activity of the population of putative DA neurons to the OS, contrary to the predictions of standard temporal-difference models of DA neurons. Instead, putative DA neurons exhibited heterogeneous responses at the single-unit level, such that some units increased and others decreased their activity in response to the OS. Similarly, putative non-DA cells did not show a homogeneous response to the DS at the population level, but also had heterogeneous responses at the single-unit level. The heterogeneity of responses in the ventral tegmental area may reflect how DA neurons encode context and points to local differences in DA signalling.
Affiliation(s)
- Luca Aquili
  - School of Psychology and Neuroscience, University of St Andrews, St Mary's Quadrangle, South Street, St Andrews, Scotland KY16 9JP, UK
- Eric M Bowman
  - School of Psychology and Neuroscience, University of St Andrews, St Mary's Quadrangle, South Street, St Andrews, Scotland KY16 9JP, UK
- Robert Schmidt
  - Department of Psychology, University of Sheffield, Western Bank, Sheffield S10 2TP, UK
6
Hunger L, Kumar A, Schmidt R. Abundance Compensates Kinetics: Similar Effect of Dopamine Signals on D1 and D2 Receptor Populations. J Neurosci 2020; 40:2868-2881. [PMID: 32071139; PMCID: PMC7117896; DOI: 10.1523/jneurosci.1951-19.2019]
Abstract
The neuromodulator dopamine plays a key role in motivation, reward-related learning, and normal motor function. The different affinities of striatal D1 and D2 dopamine receptor types have been argued to constrain the D1 and D2 signaling pathways to phasic and tonic dopamine signals, respectively. However, this view assumes that dopamine receptor kinetics are instantaneous, so that the time courses of changes in dopamine concentration and changes in receptor occupation are essentially identical. Here we developed a neurochemical model of dopamine receptor binding taking into account the different kinetics and abundance of D1 and D2 receptors in the striatum. Testing a large range of behaviorally relevant dopamine signals, we found that the D1 and D2 dopamine receptor populations responded very similarly to tonic and phasic dopamine signals. Furthermore, because of slow unbinding rates, both receptor populations integrated dopamine signals over a timescale of minutes. Our model provides a description of how physiological dopamine signals translate into changes in dopamine receptor occupation in the striatum, and explains why dopamine ramps are an effective signal for occupying dopamine receptors. Overall, our model points to the importance of taking receptor kinetics into account in functional considerations of dopamine signaling.

SIGNIFICANCE STATEMENT Current models of basal ganglia function are often based on a distinction between two types of dopamine receptors, D1 and D2, with low and high affinity, respectively. Phasic dopamine signals are thereby believed to mostly affect striatal neurons with D1 receptors, and tonic dopamine signals to mostly affect striatal neurons with D2 receptors. This view does not take into account the rates of binding and unbinding of dopamine to D1 and D2 receptors. By incorporating these kinetics into a computational model, we show that D1 and D2 receptors both respond to phasic and tonic dopamine signals. This has implications for the processing of reward-related and motivational signals in the basal ganglia.
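A minimal sketch of the kind of kinetic model this abstract describes, with made-up rate constants (the paper's fitted parameters differ): receptor occupancy follows d(occ)/dt = kon·D·(1 − occ) − koff·occ, and a slow unbinding rate for the high-affinity population makes it integrate a phasic burst over a much longer timescale.

```python
# Euler simulation of fractional receptor occupancy for two hypothetical
# receptor populations driven by the same dopamine signal: a tonic floor
# with one brief phasic burst. Rate constants are placeholders chosen so
# the "D2-like" population has a much smaller Kd (= koff/kon) than "D1-like".
def simulate(kon, koff, dt=0.01, t_end=120.0):
    occ, trace = 0.0, []
    for step in range(int(t_end / dt)):
        t = step * dt
        d = 1.0 if 10.0 <= t < 10.5 else 0.01  # burst on a tonic floor (uM)
        occ += dt * (kon * d * (1.0 - occ) - koff * occ)
        trace.append(occ)
    return trace

d1 = simulate(kon=1.0, koff=0.5)    # Kd = 0.5 uM: low affinity, fast unbinding
d2 = simulate(kon=1.0, koff=0.005)  # Kd = 5 nM: high affinity, slow unbinding

# Two minutes after the burst, the low-affinity occupancy has relaxed back
# to its tonic level, while the high-affinity occupancy still carries the
# integrated trace of the burst.
print(round(d1[-1], 3), round(d2[-1], 3))
```

The slow koff is what produces the minutes-long integration the abstract emphasizes: unbinding, not binding, sets the memory of the occupancy signal.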
Affiliation(s)
- Lars Hunger
  - Department of Psychology, University of Sheffield, Sheffield S1 2LT, United Kingdom
- Arvind Kumar
  - Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden
- Robert Schmidt
  - Department of Psychology, University of Sheffield, Sheffield S1 2LT, United Kingdom
7
Sheynin J, Baetu I, Collins-Praino LE, Myers CE, Winwood-Smith R, Moustafa AA. Maladaptive avoidance patterns in Parkinson's disease are exacerbated by symptoms of depression. Behav Brain Res 2020; 382:112473. [PMID: 31935419; DOI: 10.1016/j.bbr.2020.112473]
Abstract
Parkinson's disease (PD) is a chronic, progressive neurodegenerative disorder characterized by a loss of dopaminergic neurons in the substantia nigra pars compacta. Given that dopamine is critically involved in learning and other cognitive processes, such as working memory, dopamine loss in PD has been linked both to learning abnormalities and to cognitive dysfunction more generally in the disease. It is unclear, however, whether avoidance behavior is affected in PD. This is significant, as this type of instrumental behavior plays an important role in both decision-making and emotional (dys)function. Consequently, the aim of the present study was to examine avoidance learning and operant extinction in PD using a computer-based task. In this task, participants control a spaceship and attempt to shoot an enemy spaceship to gain points; they also learn to hide in safe areas to protect against (i.e., avoid) aversive events (on-screen explosions and point loss). The results showed that patients with PD (N = 25) acquired an avoidance response during aversive periods to the same extent as healthy age-matched controls (N = 19); however, patients demonstrated greater hiding during safe periods not associated with aversive events, which could represent maladaptive generalization of the avoidance response. Furthermore, this impairment was more pronounced during the extinction phase and in patients who reported higher levels of depression. These results demonstrate for the first time that PD is associated with maladaptive avoidance patterns, which could contribute to the emergence of depression in the disease.
Affiliation(s)
- Jony Sheynin
  - Veterans Affairs Ann Arbor Healthcare System, Ann Arbor, MI, USA
  - Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Irina Baetu
  - School of Psychology, University of Adelaide, Adelaide, SA, Australia
- Lyndsey E Collins-Praino
  - Department of Medical Sciences, Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
- Catherine E Myers
  - Department of Veterans Affairs, New Jersey Health Care System, East Orange, NJ, USA
  - Department of Pharmacology, Physiology & Neuroscience, New Jersey Medical School, Rutgers University, Newark, NJ, USA
- Robyn Winwood-Smith
  - School of Social Sciences and Psychology, Western Sydney University, Sydney, NSW, Australia
- Ahmed A Moustafa
  - School of Social Sciences and Psychology, Western Sydney University, Sydney, NSW, Australia
  - The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, NSW, Australia
8
Song MR, Lee SW. Dynamic resource allocation during reinforcement learning accounts for ramping and phasic dopamine activity. Neural Netw 2020; 126:95-107. [PMID: 32203877; DOI: 10.1016/j.neunet.2020.03.005]
Abstract
For an animal to learn about its environment with limited motor and cognitive resources, it should focus its resources on potentially important stimuli. However, too narrow a focus is disadvantageous for adaptation to environmental changes. Midbrain dopamine neurons are excited by potentially important stimuli, such as reward-predicting or novel stimuli, and allocate resources to these stimuli by modulating how an animal approaches, exploits, explores, and attends. The current study examined the theoretical possibility that dopamine activity reflects the dynamic allocation of resources for learning. Dopamine activity may transition between two patterns: (1) phasic responses to cues and rewards, and (2) ramping activity arising as the agent approaches the reward. Phasic excitation has been explained by prediction errors generated by experimentally inserted cues; however, when and why dopamine activity transitions between the two patterns remains unknown. By parsimoniously modifying a standard temporal difference (TD) learning model to accommodate a mixed presentation of both experimental and environmental stimuli, we simulated dopamine transitions and compared them with experimental data from four different studies. The results suggested that dopamine transitions from ramping to phasic patterns as the agent focuses its resources on a small number of reward-predicting stimuli, leading to task dimensionality reduction. The opposite occurs when the agent redistributes its resources to adapt to environmental changes, resulting in task dimensionality expansion. This research elucidates the role of dopamine in a broader context, providing a potential explanation for the diverse repertoire of dopamine activity that cannot be explained solely by prediction error.
Affiliation(s)
- Minryung R Song
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, South Korea
- Sang Wan Lee
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, South Korea
  - Program of Brain and Cognitive Engineering, Daejeon, 34141, South Korea
  - KAIST Institute for Health, Science, and Technology, Daejeon, 34141, South Korea
  - KAIST Institute for Artificial Intelligence, Daejeon, 34141, South Korea
  - KAIST Center for Neuroscience-inspired AI, Daejeon, 34141, South Korea
9
Fischbach S, Janak PH. Decreases in Cued Reward Seeking After Reward-Paired Inhibition of Mesolimbic Dopamine. Neuroscience 2019; 412:259-269. [PMID: 31029728; DOI: 10.1016/j.neuroscience.2019.04.035]
Abstract
Reward-paired optogenetic manipulation of dopamine neurons can increase or decrease behavioral responding to antecedent cues when subjects have the opportunity for new learning, in accordance with a dopamine-mediated error learning signal. Here we examined the impact of reward-paired dopamine neuron inhibition on behavioral responding to reward-predictive cues after subjects had learned. We trained male TH-IRES-Cre mice to lever press for food reward in a progressive ratio procedure, a 2-cue choice procedure, or under continuous reinforcement; in all procedures, completion of the response requirement was signaled by an auditory cue presented prior to food delivery. After training, mice underwent successive sessions in which optogenetic inhibition of dopamine neurons was triggered during food receipt. Rather than mimic the brief inhibitions associated with negative reward prediction errors, we applied inhibition throughout the ingestion period on each trial. We found in all procedures that optogenetic inhibition of dopamine neurons during reward receipt decreased behavioral responding to the preceding reward-predictive cue over days, a behavioral change observed during time periods without optogenetic neuronal inhibition. This extinction-like behavioral responding was selective for learned associations: it was observed in the 2-cue choice procedure, in which each subject was trained on two associations and inhibition was paired with reward for only one of them. Thus, inhibition during reward receipt can decrease responding to reward-predictive cues, sharing some features of behavioral extinction. These findings suggest that changes in mesolimbic dopaminergic transmission at the time of experienced reward impact subsequent responding to cues in well-trained subjects, as predicted for a learning signal.
Affiliation(s)
- Sarah Fischbach
  - Neuroscience Graduate Program, University of California at San Francisco, San Francisco, CA 94158, USA
- Patricia H Janak
  - Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD 21218, USA
  - The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
  - Kavli Neuroscience Discovery Institute, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
10
Gillis ZS, Morrison SE. Sign Tracking and Goal Tracking Are Characterized by Distinct Patterns of Nucleus Accumbens Activity. eNeuro 2019; 6:ENEURO.0414-18.2019. [PMID: 30886890; PMCID: PMC6419996; DOI: 10.1523/eneuro.0414-18.2019]
Abstract
During Pavlovian conditioning, if a cue (e.g., lever extension) predicts reward delivery in a different location (e.g., a food magazine), some individuals will come to approach and interact with the cue, a behavior known as sign tracking (ST), and others will approach the site of reward, a behavior known as goal tracking (GT). In rats, the acquisition of ST versus GT behavior is associated with distinct profiles of dopamine release in the nucleus accumbens (NAc), but it is unknown whether it is associated with different patterns of accumbens neural activity. Therefore, we recorded from individual neurons in the NAc core during the acquisition, maintenance, and extinction of ST and GT behavior. Even though NAc dopamine is specifically important for the acquisition and expression of ST, we found that cue-evoked excitatory responses encode the vigor of both ST and GT behavior. In contrast, among sign trackers only, there was a prominent decrease in reward-related activity over the course of training, which may reflect the decreasing reward prediction error encoded by phasic dopamine. Finally, both behavior and cue-evoked activity were relatively resistant to extinction in sign trackers, as compared with goal trackers, although a subset of neurons in both groups retained their cue-evoked responses. Overall, the results point to the convergence of multiple forms of reward learning in the NAc.
Affiliation(s)
- Zachary S. Gillis
  - Department of Neuroscience, University of Pittsburgh, Pittsburgh, PA 15260
- Sara E. Morrison
  - Department of Neuroscience, University of Pittsburgh, Pittsburgh, PA 15260
11
The timing of action determines reward prediction signals in identified midbrain dopamine neurons. Nat Neurosci 2018; 21:1563-1573. [PMID: 30323275; PMCID: PMC6226028; DOI: 10.1038/s41593-018-0245-7]
Abstract
Animals adapt behavior in response to informative sensory cues using multiple brain circuits. The activity of midbrain dopamine (mDA) neurons is thought to convey a critical teaching signal: reward prediction error (RPE). Although RPE signals are thought to be essential to learning, little is known about the dynamic changes in mDA neuron activity as animals learn about novel sensory cues and appetitive rewards. Here we describe a large dataset of cell-attached recordings of identified dopaminergic neurons as naïve mice learned a novel cue-reward association. During learning, mDA neuron activity results from the summation of sensory cue-related and movement initiation-related response components. These components are both a function of reward expectation, yet are dissociable. Learning produces increasingly precise coordination of action initiation following sensory cues, which results in apparent RPE correlates. Our data thus provide new insights into the circuit mechanisms underlying a critical computation in a highly conserved learning circuit.
12
Langdon AJ, Sharpe MJ, Schoenbaum G, Niv Y. Model-based predictions for dopamine. Curr Opin Neurobiol 2018; 49:1-7. [PMID: 29096115; PMCID: PMC6034703; DOI: 10.1016/j.conb.2017.10.006]
Abstract
Phasic dopamine responses are thought to encode a prediction-error signal consistent with model-free reinforcement learning theories. However, a number of recent findings highlight the influence of model-based computations on dopamine responses, and suggest that dopamine prediction errors reflect more dimensions of an expected outcome than scalar reward value. Here, we review a selection of these recent results and discuss the implications and complications of model-based predictions for computational theories of dopamine and learning.
Affiliation(s)
- Angela J Langdon
  - Princeton Neuroscience Institute & Department of Psychology, Princeton University, Princeton, NJ 08540, United States
- Melissa J Sharpe
  - Princeton Neuroscience Institute & Department of Psychology, Princeton University, Princeton, NJ 08540, United States
  - National Institute on Drug Abuse, Baltimore, MD 21224, United States
  - School of Psychology, University of New South Wales, Australia
- Yael Niv
  - Princeton Neuroscience Institute & Department of Psychology, Princeton University, Princeton, NJ 08540, United States
13
Kato A, Morita K. Forgetting in Reinforcement Learning Links Sustained Dopamine Signals to Motivation. PLoS Comput Biol 2016; 12:e1005145. [PMID: 27736881; PMCID: PMC5063413; DOI: 10.1371/journal.pcbi.1005145]
Abstract
It has been suggested that dopamine (DA) represents reward-prediction-error (RPE) defined in reinforcement learning and therefore DA responds to unpredicted but not predicted reward. However, recent studies have found DA response sustained towards predictable reward in tasks involving self-paced behavior, and suggested that this response represents a motivational signal. We have previously shown that RPE can sustain if there is decay/forgetting of learned-values, which can be implemented as decay of synaptic strengths storing learned-values. This account, however, did not explain the suggested link between tonic/sustained DA and motivation. In the present work, we explored the motivational effects of the value-decay in self-paced approach behavior, modeled as a series of ‘Go’ or ‘No-Go’ selections towards a goal. Through simulations, we found that the value-decay can enhance motivation, specifically, facilitate fast goal-reaching, albeit counterintuitively. Mathematical analyses revealed that underlying potential mechanisms are twofold: (1) decay-induced sustained RPE creates a gradient of ‘Go’ values towards a goal, and (2) value-contrasts between ‘Go’ and ‘No-Go’ are generated because while chosen values are continually updated, unchosen values simply decay. Our model provides potential explanations for the key experimental findings that suggest DA's roles in motivation: (i) slowdown of behavior by post-training blockade of DA signaling, (ii) observations that DA blockade severely impairs effortful actions to obtain rewards while largely sparing seeking of easily obtainable rewards, and (iii) relationships between the reward amount, the level of motivation reflected in the speed of behavior, and the average level of DA. These results indicate that reinforcement learning with value-decay, or forgetting, provides a parsimonious mechanistic account for the DA's roles in value-learning and motivation. 
Our results also suggest that when biological systems for value learning remain active even though learning has apparently converged, those systems may be in a state of dynamic equilibrium, where learning and forgetting are balanced. Dopamine (DA) has been suggested to have two reward-related roles: (1) representing reward-prediction error (RPE), and (2) providing motivational drive. Role (1) is based on physiological results showing that DA responds to unpredicted but not predicted reward, whereas role (2) is supported by pharmacological results showing that blockade of DA signaling causes motivational impairments such as slowdown of self-paced behavior. These two roles have been attributed to different temporal patterns of DA signals: role (1) to phasic signals and role (2) to tonic/sustained signals. However, recent studies have found sustained DA signals with features indicative of both roles, complicating this picture. Meanwhile, whereas the synaptic/circuit mechanisms for role (1) have now become clearer (how RPE is calculated upstream of DA neurons, and how RPE-dependent updating of learned values occurs through DA-dependent synaptic plasticity), the mechanisms for role (2) remain unclear. In this work, we modeled self-paced behavior as a series of 'Go' or 'No-Go' selections in a reinforcement-learning framework assuming DA's role (1), and demonstrated that incorporating decay/forgetting of learned values, presumably implemented as decay of the synaptic strengths storing them, provides a potential unified mechanistic account of DA's two roles and their various temporal patterns.
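The core mechanism described in this abstract (sustained RPE arising from decay of learned values) can be illustrated with a minimal sketch. This is not the authors' full 'Go'/'No-Go' model, just a single learned value updated by prediction error and decayed between trials; the parameter values are arbitrary:

```python
import numpy as np

def simulate(alpha=0.5, decay=0.1, reward=1.0, trials=200):
    """Value learning for a fully predicted reward, with optional
    decay (forgetting) of the learned value between trials."""
    V = 0.0
    rpes = []
    for _ in range(trials):
        rpe = reward - V      # reward-prediction error on this trial
        V += alpha * rpe      # standard prediction-error update
        V *= (1.0 - decay)    # forgetting: learned value decays
        rpes.append(rpe)
    return np.array(rpes)

with_decay = simulate(decay=0.1)
no_decay = simulate(decay=0.0)

# Without decay, the RPE to a fully predicted reward vanishes; with
# decay, it settles at a sustained positive value, i.e. a dynamic
# equilibrium where learning and forgetting are balanced.
print(no_decay[-1], with_decay[-1])
```

At the fixed point, the per-trial gain from learning exactly offsets the between-trial decay, so the learned value stays below the reward and a positive RPE persists indefinitely.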
Collapse
Affiliation(s)
- Ayaka Kato
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
| | - Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
| |
Collapse
|
14
|
Li Y, Nakae K, Ishii S, Naoki H. Uncertainty-Dependent Extinction of Fear Memory in an Amygdala-mPFC Neural Circuit Model. PLoS Comput Biol 2016; 12:e1005099. [PMID: 27617747 PMCID: PMC5019407 DOI: 10.1371/journal.pcbi.1005099] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2016] [Accepted: 08/11/2016] [Indexed: 11/29/2022] Open
Abstract
Uncertainty of fear conditioning is crucial for the acquisition and extinction of fear memory. Fear memory acquired through partial pairings of a conditioned stimulus (CS) and an unconditioned stimulus (US) is more resistant to extinction than that acquired through full pairings; this effect is known as the partial reinforcement extinction effect (PREE). Although the PREE has been explained by psychological theories, the neural mechanisms underlying the PREE remain largely unclear. Here, we developed a neural circuit model based on three distinct types of neurons (fear, persistent and extinction neurons) in the amygdala and medial prefrontal cortex (mPFC). In the model, the fear, persistent and extinction neurons encode predictions of net severity, of unconditioned stimulus (US) intensity, and of net safety, respectively. Our simulation successfully reproduces the PREE. We revealed that unpredictability of the US during extinction was represented by the combined responses of the three types of neurons, which are critical for the PREE. In addition, we extended the model to include amygdala subregions and the mPFC to address a recent finding that the ventral mPFC (vmPFC) is required for consolidating extinction memory but not for memory retrieval. Furthermore, model simulations led us to propose a novel procedure to enhance extinction learning through re-conditioning with a stronger US; strengthened fear memory up-regulates the extinction neuron, which, in turn, further inhibits the fear neuron during re-extinction. Thus, our models increased the understanding of the functional roles of the amygdala and vmPFC in the processing of uncertainty in fear conditioning and extinction. Animals live in environments that contain uncertainty. To adapt to uncertain situations, they flexibly learn to associate environmental cues with rewards and punishments. Understanding how the brain processes uncertainty has remained an important issue in neuroscience. 
To address this question, we focused on neural processing in the amygdala and mPFC during fear conditioning and extinction. We developed a neural circuit model that incorporates distinct neural populations in the amygdala and mPFC. Our model first successfully reproduced uncertainty-dependent resistance to the extinction of fear memory. An extension of the model provided a possible explanation for observations made during optogenetic manipulation of the ventral mPFC. Finally, we proposed a procedure, based on our model, to enhance the efficacy of subsequent extinction.
Collapse
Affiliation(s)
- Yuzhe Li
- Graduate School of Biostudies, Kyoto University, Kyoto, Japan
| | - Ken Nakae
- Graduate School of Informatics, Kyoto University, Kyoto, Japan
| | - Shin Ishii
- Graduate School of Informatics, Kyoto University, Kyoto, Japan
| | - Honda Naoki
- Imaging Platform of Spatio-temporal Information, Graduate School of Medicine, Kyoto University, Kyoto, Japan
| |
Collapse
|
15
|
Modelling ADHD: A review of ADHD theories through their predictions for computational models of decision-making and reinforcement learning. Neurosci Biobehav Rev 2016; 71:633-656. [PMID: 27608958 DOI: 10.1016/j.neubiorev.2016.09.002] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2015] [Revised: 08/31/2016] [Accepted: 09/04/2016] [Indexed: 01/13/2023]
Abstract
Attention deficit hyperactivity disorder (ADHD) is characterized by altered decision-making (DM) and reinforcement learning (RL), for which competing theories propose alternative explanations. Computational modelling contributes to understanding DM and RL by integrating behavioural and neurobiological findings, and could elucidate pathogenic mechanisms behind ADHD. This review of neurobiological theories of ADHD describes predictions for the effect of ADHD on DM and RL as described by the drift-diffusion model of DM (DDM) and a basic RL model. Empirical studies employing these models are also reviewed. While theories often agree on how ADHD should be reflected in model parameters, each theory implies a unique combination of predictions. Empirical studies agree with the theories' assumptions of a lowered DDM drift rate in ADHD, while findings are less conclusive for boundary separation. The few studies employing RL models support a lower choice sensitivity in ADHD, but not an altered learning rate. The discussion outlines research areas for further theoretical refinement in the ADHD field.
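The drift-diffusion account of a lowered drift rate in ADHD, summarized above, can be illustrated with a generic DDM simulation (a sketch, not any specific model from the reviewed studies): evidence accumulates noisily toward one of two decision boundaries, and lowering the drift rate both slows responses and reduces accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

def ddm_trial(drift, boundary=1.0, noise=1.0, dt=0.001, max_t=5.0):
    """One drift-diffusion trial: evidence x starts at 0 and accumulates
    until it hits +boundary (correct) or -boundary (error)."""
    x, t = 0.0, 0.0
    while abs(x) < boundary and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return t, x >= boundary

def summarize(drift, n=500):
    rts, correct = zip(*(ddm_trial(drift) for _ in range(n)))
    return np.mean(rts), np.mean(correct)

rt_ctrl, acc_ctrl = summarize(drift=2.0)  # control-like drift rate
rt_low, acc_low = summarize(drift=0.8)    # lowered drift rate

# Lower drift -> slower responses and lower accuracy.
print(rt_ctrl, acc_ctrl, rt_low, acc_low)
```

Boundary separation, the other parameter discussed in the review, would be varied via the `boundary` argument; widening it trades speed for accuracy rather than degrading both.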
Collapse
|
16
|
Kim Y, Simon NW, Wood J, Moghaddam B. Reward Anticipation Is Encoded Differently by Adolescent Ventral Tegmental Area Neurons. Biol Psychiatry 2016; 79:878-86. [PMID: 26067679 PMCID: PMC4636980 DOI: 10.1016/j.biopsych.2015.04.026] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/24/2015] [Revised: 04/23/2015] [Accepted: 04/28/2015] [Indexed: 11/23/2022]
Abstract
BACKGROUND Elucidating the neurobiology of the adolescent brain is fundamental to our understanding of the etiology of psychiatric disorders such as schizophrenia and addiction, the symptoms of which often manifest during this developmental period. Dopamine neurons in the ventral tegmental area (VTA) are strongly implicated in adolescent behavioral and psychiatric vulnerabilities, but little is known about how adolescent VTA neurons encode information during motivated behavior. METHODS We recorded daily from VTA neurons in adolescent and adult rats during learning and maintenance of a cued, reward-motivated instrumental task and extinction from this task. RESULTS During performance of the same motivated behavior, identical events were encoded differently by adult and adolescent VTA neurons. Adolescent VTA neurons with dopamine-like characteristics lacked a reward anticipation signal and showed a smaller response to reward delivery compared with adults. After extinction, however, these neurons maintained a strong phasic response to cues formerly predictive of reward opportunity. CONCLUSIONS Anticipatory neuronal activity in the VTA supports preparatory attention and is implicated in error prediction signaling. Absence of this activity, combined with persistent representations of previously rewarded experiences, may provide a mechanism for rash decision making in adolescents.
Collapse
Affiliation(s)
- Yunbok Kim
- Department of Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Nicholas W Simon
- Department of Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Jesse Wood
- Department of Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Bita Moghaddam
- Department of Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania
| |
Collapse
|
17
|
Degoulet M, Stelly CE, Ahn KC, Morikawa H. L-type Ca²⁺ channel blockade with antihypertensive medication disrupts VTA synaptic plasticity and drug-associated contextual memory. Mol Psychiatry 2016; 21:394-402. [PMID: 26100537 PMCID: PMC4689680 DOI: 10.1038/mp.2015.84] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/02/2014] [Revised: 04/14/2015] [Accepted: 04/30/2015] [Indexed: 02/08/2023]
Abstract
Drug addiction is driven, in part, by powerful and enduring memories of sensory cues associated with drug intake. As such, relapse to drug use during abstinence is frequently triggered by an encounter with drug-associated cues, including the drug itself. L-type Ca(2+) channels (LTCCs) are known to regulate different forms of synaptic plasticity, the major neural substrate for learning and memory, in various brain areas. Long-term potentiation (LTP) of NMDA receptor (NMDAR)-mediated glutamatergic transmission in the ventral tegmental area (VTA) may contribute to the increased motivational valence of drug-associated cues triggering relapse. In this study, using rat brain slices, we found that isradipine, a general LTCC antagonist used as antihypertensive medication, not only blocks the induction of NMDAR LTP but also promotes the reversal of previously induced LTP in the VTA. In behaving rats, isradipine injected into the VTA suppressed the acquisition of cocaine-paired contextual cue memory assessed using a conditioned place preference (CPP) paradigm. Furthermore, administration of isradipine or a CaV1.3 subtype-selective LTCC antagonist (systemic or intra-VTA) before a single extinction or reinstatement session, while having no immediate effect at the time of administration, abolished previously acquired cocaine and alcohol (ethanol) CPP on subsequent days. Notably, CPP thus extinguished cannot be reinstated by drug re-exposure, even after 2 weeks of withdrawal. These results suggest that LTCC blockade during exposure to drug-associated cues may cause unlearning of the increased valence of those cues, presumably via reversal of glutamatergic synaptic plasticity in the VTA.
Collapse
Affiliation(s)
- Mickael Degoulet
- Department of Neuroscience and Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, Austin, TX 78712, USA
| | - Claire E. Stelly
- Department of Neuroscience and Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, Austin, TX 78712, USA
| | - Kee-Chan Ahn
- Department of Neuroscience and Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, Austin, TX 78712, USA
| | - Hitoshi Morikawa
- Department of Neuroscience and Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, Austin, TX 78712, USA (corresponding author)
| |
Collapse
|
18
|
A Specific Component of the Evoked Potential Mirrors Phasic Dopamine Neuron Activity during Conditioning. J Neurosci 2015. [PMID: 26203140 DOI: 10.1523/jneurosci.4096-14.2015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open
Abstract
Midbrain dopamine (DA) neurons are thought to be a critical node in the circuitry that mediates reward learning. DA neurons receive diverse inputs from regions distributed throughout the neuraxis from frontal neocortex to the mesencephalon. While a great deal is known about changes in the activity of individual DA neurons during learning, much less is known about the functional changes in the microcircuits in which DA neurons are embedded. Here we used local field potentials recorded from the midbrain of behaving mice to show that the midbrain evoked potential (mEP) faithfully reflects the temporal and spatial structure of the phasic response of midbrain neuron populations during conditioning. By comparing the mEP to simultaneously recorded single units, we identified specific components of the mEP that corresponded to phasic DA and non-DA responses to salient stimuli. The DA component of the mEP emerged with the acquisition of a conditioned stimulus, was extinguished following changes in reinforcement contingency, and could be inhibited by pharmacological manipulations that attenuate the phasic responses of DA neurons. In contrast to single-unit recordings, the mEP permitted relatively dense sampling of the midbrain circuit during conditioning and thus could be used to reveal the spatiotemporal structure of multiple intermingled midbrain circuits. Finally, the mEP response was stable for months and thus provides a new approach to study long-term changes in the organization of ventral midbrain microcircuits during learning. Significance statement: Neurons that synthesize and release the neurotransmitter dopamine play a critical role in voluntary reward-seeking behavior. Much of our insight into the function of dopamine neurons comes from recordings of individual cells in behaving animals; however, it is notoriously difficult to record from dopamine neurons due to their sparsity and depth, as well as the presence of intermingled non-dopaminergic neurons. 
Here we show that much of the information that can be learned from recordings of individual dopamine and non-dopamine neurons is also revealed by changes in specific components of the local field potential. This technique provides an accessible measurement that could prove critical to our burgeoning understanding of the molecular, functional, and anatomical diversity of neuron populations in the midbrain.
Collapse
|
19
|
Alfei JM, Ferrer Monti RI, Molina VA, Bueno AM, Urcelay GP. Prediction error and trace dominance determine the fate of fear memories after post-training manipulations. Learn Mem 2015; 22:385-400. [PMID: 26179232 PMCID: PMC4509917 DOI: 10.1101/lm.038513.115] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2015] [Accepted: 06/05/2015] [Indexed: 01/15/2023]
Abstract
Different mnemonic outcomes have been observed when associative memories are reactivated by CS exposure and followed by amnestics. These outcomes include mere retrieval, destabilization-reconsolidation, a transitional period (which is insensitive to amnestics), and extinction learning. However, little is known about the interaction between initial learning conditions and these outcomes during a reinforced or nonreinforced reactivation. Here we systematically combined temporally specific memories with different reactivation parameters to observe whether these four outcomes are determined by the conditions established during training. First, we validated two training regimens with different temporal expectations about US arrival. Then, using midazolam (MDZ) as an amnestic agent, fear memories from both learning conditions were subjected to retraining under parameters either identical to or different from the original training. Destabilization (i.e., susceptibility to MDZ) occurred when reactivation was reinforced, provided that a temporal prediction error about US arrival occurred. In subsequent experiments, both treatments were systematically reactivated by nonreinforced context exposure of different lengths, which allowed us to explore the interaction between training and reactivation lengths. These results suggest that temporal prediction error and trace dominance determine the extent to which reactivation produces the different outcomes.
Collapse
Affiliation(s)
- Joaquín M Alfei
- Laboratorio de Psicología Experimental, Facultad de Psicología
| | | | - Victor A Molina
- Departamento de Farmacología, Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Córdoba, 5000, Argentina
| | - Adrián M Bueno
- Laboratorio de Psicología Experimental, Facultad de Psicología
| | - Gonzalo P Urcelay
- Department of Psychology and Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge CB2 3EB, United Kingdom
| |
Collapse
|
20
|
Actigraph evaluation of acupuncture for treating restless legs syndrome. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE 2015; 2015:343201. [PMID: 25763089 PMCID: PMC4339862 DOI: 10.1155/2015/343201] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/09/2014] [Accepted: 11/26/2014] [Indexed: 01/01/2023]
Abstract
We evaluated the effects of acupuncture in patients with restless legs syndrome (RLS) using actigraph recordings. Of the 38 patients with RLS enrolled, 31 (M = 12, F = 19; mean age, 47.2 ± 9.7 years) completed the study. Patients were treated with either standard acupuncture (n = 15) or randomized acupuncture (n = 16) in a single-blind manner for 6 weeks. Changes in nocturnal activity (NA) and early sleep activity (ESA) between week 0 (baseline), week 2, week 4, and week 6 were assessed using leg actigraph recordings, the International Restless Legs Syndrome Rating Scale (IRLSRS), and the Epworth Sleepiness Scale (ESS). Standard, but not randomized, acupuncture significantly reduced the abnormal leg activity (NA and ESA) at weeks 2, 4, and 6, and improved the clinical scores for IRLSRS and ESS at weeks 4 and 6 relative to baseline. No side effects were observed. The results indicate that standard acupuncture might improve the abnormal leg activity in RLS patients and is thus a potentially suitable integrative treatment for long-term use.
Collapse
|
21
|
Paladini C, Roeper J. Generating bursts (and pauses) in the dopamine midbrain neurons. Neuroscience 2014; 282:109-21. [DOI: 10.1016/j.neuroscience.2014.07.032] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2014] [Revised: 07/18/2014] [Accepted: 07/21/2014] [Indexed: 01/01/2023]
|
22
|
Sunsay C, Rebec GV. Extinction and reinstatement of phasic dopamine signals in the nucleus accumbens core during Pavlovian conditioning. Behav Neurosci 2014; 128:579-87. [PMID: 25111335 PMCID: PMC4172664 DOI: 10.1037/bne0000012] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The prediction-error model of dopamine (DA) signaling has largely been confirmed with various appetitive Pavlovian conditioning procedures and has been supported in tests of Pavlovian extinction. Studies have repeatedly shown, however, that extinction does not erase the original memory of conditioning as the prediction-error model presumes, putting the model at odds with contemporary views that treat extinction as an episode of learning rather than unlearning of conditioning. Here, we combined fast-scan cyclic voltammetry (FSCV) with appetitive Pavlovian conditioning to assess DA release directly during extinction and reinstatement. DA was monitored in the nucleus accumbens core, which plays a key role in reward processing. Following at least 4 daily sessions of 16 tone-food pairings, FSCV was performed while rats received additional tone-food pairings followed by tone-alone presentations (i.e., extinction). Acquisition memory was reinstated with noncontingent presentations of reward and then tested with cue presentation. Tone-food pairings produced transient (1- to 3-s) DA release in response to the tone. During extinction, the amplitude of the DA response decreased significantly. Following presentation of 2 noncontingent food pellets, subsequent tone presentation reinstated the DA signal. Our results support the prediction-error model for appetitive Pavlovian extinction but not for reinstatement.
Collapse
Affiliation(s)
- Ceyhun Sunsay
- Department of Psychology, Indiana University Northwest
| | - George V Rebec
- Department of Psychological and Brain Sciences, Indiana University
| |
Collapse
|
23
|
Puig MV, Rose J, Schmidt R, Freund N. Dopamine modulation of learning and memory in the prefrontal cortex: insights from studies in primates, rodents, and birds. Front Neural Circuits 2014; 8:93. [PMID: 25140130 PMCID: PMC4122189 DOI: 10.3389/fncir.2014.00093] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2014] [Accepted: 07/18/2014] [Indexed: 02/02/2023] Open
Abstract
In this review, we provide a brief overview over the current knowledge about the role of dopamine transmission in the prefrontal cortex during learning and memory. We discuss work in humans, monkeys, rats, and birds in order to provide a basis for comparison across species that might help identify crucial features and constraints of the dopaminergic system in executive function. Computational models of dopamine function are introduced to provide a framework for such a comparison. We also provide a brief evolutionary perspective showing that the dopaminergic system is highly preserved across mammals. Even birds, following a largely independent evolution of higher cognitive abilities, have evolved a comparable dopaminergic system. Finally, we discuss the unique advantages and challenges of using different animal models for advancing our understanding of dopamine function in the healthy and diseased brain.
Collapse
Affiliation(s)
- M. Victoria Puig
- The Picower Institute for Learning and Memory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Jonas Rose
- The Picower Institute for Learning and Memory, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Animal Physiology, Institute of Neurobiology, University of Tübingen, Tübingen, Germany
| | - Robert Schmidt
- BrainLinks-BrainTools, Department of Biology, Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
| | - Nadja Freund
- Department of Psychiatry and Psychotherapy, University of Tübingen, Tübingen, Germany
| |
Collapse
|
24
|
Burkhardt JM, Adermark L. Locus of onset and subpopulation specificity of in vivo ethanol effect in the reciprocal ventral tegmental area-nucleus accumbens circuit. Neurochem Int 2014; 76:122-30. [PMID: 25058792 DOI: 10.1016/j.neuint.2014.07.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2014] [Revised: 07/10/2014] [Accepted: 07/14/2014] [Indexed: 12/13/2022]
Abstract
Elevated levels of dopamine in the nucleus accumbens (nAc), as a consequence of increased activation of dopaminergic neurons in the VTA, are associated with the reinforcing properties of ethanol consumption, but whether the initiation of the drug response involves direct activation of dopaminergic cell bodies in the VTA or GABAergic neurons in the VTA and/or the nAc is unclear. To this end, neuronal firing rate was recorded simultaneously in the VTA and nAc of awake, freely moving C57BL6/J mice receiving an intraperitoneal (i.p.) injection of ethanol (0.75, 2.0, or 3.5 g/kg) or saline. Recorded units were classified, based on electrophysiological properties and the pharmacological response to the dopamine D2 receptor agonist quinpirole, into putative dopaminergic (DA) neurons and fast-spiking or slow-spiking putative GABAergic neurons. Our data show that ethanol acutely decreases the firing frequency of GABAergic units in both the VTA and nAc in a dose-dependent manner, and enhances the firing rate of DA neurons. To define the onset of ethanol-induced rate changes, normalized population vectors describing the collective firing rate of classes of neurons over time were generated and compared with saline treatment. Population vectors of DA neurons in the VTA and GABAergic units in the nAc showed a significant deviation from the saline condition within 40 s of ethanol administration (2.0 g/kg), whereas inhibition of GABAergic units in the VTA had a slower onset. In conclusion, these data suggest that ethanol exerts a direct effect on DA firing frequency, and that decreased firing of inhibitory neurons in the VTA and nAc contributes to the dopamine-elevating properties of ethanol.
Collapse
Affiliation(s)
- John M Burkhardt
- Centre for Molecular Medicine Norway, Nordic EMBL Partnership, University of Oslo, Oslo, Norway; Champalimaud Neuroscience Programme, Champalimaud Center for the Unknown, Lisbon, Portugal
| | - Louise Adermark
- Addiction Biology Unit, Institute of Neuroscience and Physiology, Gothenburg University, Sweden.
| |
Collapse
|
25
|
Aquili L. The causal role between phasic midbrain dopamine signals and learning. Front Behav Neurosci 2014; 8:139. [PMID: 24795588 PMCID: PMC4007013 DOI: 10.3389/fnbeh.2014.00139] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2014] [Accepted: 04/04/2014] [Indexed: 12/22/2022] Open
Affiliation(s)
- Luca Aquili
- Department of Psychology, Sunway University Bandar Sunway, Petaling Jaya, Malaysia
| |
Collapse
|
26
|
Song MR, Fellous JM. Value learning and arousal in the extinction of probabilistic rewards: the role of dopamine in a modified temporal difference model. PLoS One 2014; 9:e89494. [PMID: 24586823 PMCID: PMC3935866 DOI: 10.1371/journal.pone.0089494] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/27/2013] [Accepted: 01/23/2014] [Indexed: 11/30/2022] Open
Abstract
Because most rewarding events are probabilistic and changing, the extinction of probabilistic rewards is important for survival. It has been proposed that the extinction of probabilistic rewards depends on arousal and the amount of learning of reward values. Midbrain dopamine neurons were suggested to play a role in both arousal and learning reward values. Despite extensive research on modeling dopaminergic activity in reward learning (e.g. temporal difference models), few studies have been done on modeling its role in arousal. Although temporal difference models capture key characteristics of dopaminergic activity during the extinction of deterministic rewards, they have been less successful at simulating the extinction of probabilistic rewards. By adding an arousal signal to a temporal difference model, we were able to simulate the extinction of probabilistic rewards and its dependence on the amount of learning. Our simulations propose that arousal allows the probability of reward to have lasting effects on the updating of reward value, which slows the extinction of low probability rewards. Using this model, we predicted that, by signaling the prediction error, dopamine determines the learned reward value that has to be extinguished during extinction and participates in regulating the size of the arousal signal that controls the learning rate. These predictions were supported by pharmacological experiments in rats.
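The qualitative idea in this abstract can be illustrated with a minimal sketch (not the authors' actual modified temporal difference model): if an arousal signal tied to the learned reward value scales the effective learning rate during extinction, then values learned under low reward probability extinguish more slowly in relative terms. The use of the learned value itself as the arousal signal is a crude stand-in chosen for brevity.

```python
import numpy as np

def train_then_extinguish(p_reward, alpha=0.2, train=200, extinct=50, seed=0):
    """Learn a reward value under partial reinforcement, then extinguish,
    with an arousal signal gating the extinction learning rate."""
    rng = np.random.default_rng(seed)
    V = 0.0
    for _ in range(train):
        r = float(rng.random() < p_reward)
        V += alpha * (r - V)              # standard prediction-error update
    arousal = V                           # crude stand-in for the arousal signal
    values = [V]
    for _ in range(extinct):
        V += alpha * arousal * (0.0 - V)  # arousal scales the extinction rate
        values.append(V)
    return np.array(values)

high = train_then_extinguish(p_reward=1.0)
low = train_then_extinguish(p_reward=0.25)

# Relative to its value at extinction onset, the low-probability memory
# shows a smaller fractional decline, i.e. slower extinction.
print(high[-1] / high[0], low[-1] / low[0])
```

The point of the sketch is only the direction of the effect: letting reward probability shape the arousal signal gives it a lasting influence on how fast the value is unlearned.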
Collapse
Affiliation(s)
- Minryung R. Song
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
| | - Jean-Marc Fellous
- Graduate Interdisciplinary Program in Neuroscience, University of Arizona, Tucson, Arizona, United States of America
- Department of Psychology, University of Arizona, Tucson, Arizona, United States of America
- Department of Applied Mathematics, University of Arizona, Tucson, Arizona, United States of America
| |
Collapse
|
27
|
Dopamine and extinction: a convergence of theory with fear and reward circuitry. Neurobiol Learn Mem 2013; 108:65-77. [PMID: 24269353 DOI: 10.1016/j.nlm.2013.11.007] [Citation(s) in RCA: 147] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2013] [Revised: 11/01/2013] [Accepted: 11/08/2013] [Indexed: 01/11/2023]
Abstract
Research on dopamine lies at the intersection of sophisticated theoretical and neurobiological approaches to learning and memory. Dopamine has been shown to be critical for many processes that drive learning and memory, including motivation, prediction error, incentive salience, memory consolidation, and response output. Theories of dopamine's function in these processes have, for the most part, been developed from behavioral approaches that examine learning mechanisms in reward-related tasks. A parallel and growing literature indicates that dopamine is involved in fear conditioning and extinction. These studies are consistent with long-standing ideas about appetitive-aversive interactions in learning theory and they speak to the general nature of cellular and molecular processes that underlie behavior. We review the behavioral and neurobiological literature showing a role for dopamine in fear conditioning and extinction. At a cellular level, we review dopamine signaling and receptor pharmacology, cellular and molecular events that follow dopamine receptor activation, and brain systems in which dopamine functions. At a behavioral level, we describe theories of learning and dopamine function that could describe the fundamental rules underlying how dopamine modulates different aspects of learning and memory processes.
Collapse
|
28
|
An elemental model of retrospective revaluation without within-compound associations. Learn Behav 2013; 42:22-38. [DOI: 10.3758/s13420-013-0112-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
29
|
Corlett PR, Cambridge V, Gardner JM, Piggot JS, Turner DC, Everitt JC, Arana FS, Morgan HL, Milton AL, Lee JL, Aitken MRF, Dickinson A, Everitt BJ, Absalom AR, Adapa R, Subramanian N, Taylor JR, Krystal JH, Fletcher PC. Ketamine effects on memory reconsolidation favor a learning model of delusions. PLoS One 2013; 8:e65088. [PMID: 23776445 PMCID: PMC3680467 DOI: 10.1371/journal.pone.0065088] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2012] [Accepted: 04/19/2013] [Indexed: 11/19/2022] Open
Abstract
Delusions are the persistent and often bizarre beliefs that characterise psychosis. Previous studies have suggested that their emergence may be explained by disturbances in prediction error-dependent learning. Here we set up complementary studies in order to examine whether such a disturbance also modulates memory reconsolidation and hence explains their remarkable persistence. First, we quantified individual brain responses to prediction error in a causal learning task in 18 human subjects (8 female). Next, a placebo-controlled within-subjects study of the impact of ketamine was set up on the same individuals. We determined the influence of this NMDA receptor antagonist (previously shown to induce aberrant prediction error signal and lead to transient alterations in perception and belief) on the evolution of a fear memory over a 72 hour period: they initially underwent Pavlovian fear conditioning; 24 hours later, during ketamine or placebo administration, the conditioned stimulus (CS) was presented once, without reinforcement; memory strength was then tested again 24 hours later. Re-presentation of the CS under ketamine led to a stronger subsequent memory than under placebo. Moreover, the degree of strengthening correlated with individual vulnerability to ketamine's psychotogenic effects and with prediction error brain signal. This finding was partially replicated in an independent sample with an appetitive learning procedure (in 8 human subjects, 4 female). These results suggest a link between altered prediction error, memory strength and psychosis. They point to a core disruption that may explain not only the emergence of delusional beliefs but also their persistence.
Collapse
Affiliation(s)
- Philip R Corlett
- Department of Psychiatry, Ribicoff Research Facility, Yale University, New Haven, Connecticut, United States of America.
Collapse
|
30
|
Porter-Stransky KA, Seiler JL, Day JJ, Aragona BJ. Development of behavioral preferences for the optimal choice following unexpected reward omission is mediated by a reduction of D2-like receptor tone in the nucleus accumbens. Eur J Neurosci 2013; 38:2572-88. [PMID: 23692625 DOI: 10.1111/ejn.12253] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2013] [Revised: 04/28/2013] [Accepted: 04/12/2013] [Indexed: 11/30/2022]
Abstract
To survive in a dynamic environment, animals must identify changes in resource availability and rapidly apply adaptive strategies to obtain resources that promote survival. We have utilised a behavioral paradigm to assess differences in foraging strategy when resource (reward) availability unexpectedly changes. When reward magnitude was reduced by 50% (receive one reward pellet instead of two), male and female rats developed a preference for the optimal choice by the second session. However, when an expected reward was omitted (receive no reward pellets instead of one), subjects displayed a robust preference for the optimal choice during the very first session. Previous research shows that, when an expected reward is omitted, dopamine neurons phasically decrease their firing rate, which is hypothesised to decrease dopamine release, preferentially affecting D2-like receptors. As robust changes in behavioral preference were specific to reward omission, we tested this hypothesis and the functional role of D1- and D2-like receptors in the nucleus accumbens in mediating the rapid development of a behavioral preference for the rewarded option during reward omission in male rats. Blockade of both receptor types had no effect on this behavior; however, holding D2-like, but not D1-like, receptor tone constant via infusion of dopamine receptor agonists prevented the development of the preference for the rewarded option during reward omission. These results demonstrate that avoiding an outcome that has been tagged with aversive motivational properties is facilitated through decreased dopamine transmission and subsequent functional disruption of D2-like, but not D1-like, receptor tone in the nucleus accumbens.
Collapse
Affiliation(s)
- Kirsten A Porter-Stransky
- Department of Psychology, Biopsychology Area, University of Michigan, 530 Church Street, Ann Arbor, 48109 MI, USA.
Collapse
|
31
|
Over-expectation generated in a complex appetitive goal-tracking task is capable of inducing memory reconsolidation. Psychopharmacology (Berl) 2013; 226:649-58. [PMID: 23239132 DOI: 10.1007/s00213-012-2934-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/12/2012] [Accepted: 11/19/2012] [Indexed: 10/27/2022]
Abstract
RATIONALE Discrepancies in an expected outcome have been demonstrated to result in modification of behaviour in both appetitive and aversive conditioning settings. OBJECTIVES In this study, we sought to establish whether overexpectation generated from compound conditioning with two previously rewarded stimuli was able to induce memory destabilisation and subsequent reconsolidation in a Pavlovian conditioned approach setting. RESULTS It was shown that 4 days, but not 1 day, of overexpectation training was required to induce memory reconsolidation, and this was disrupted by application of the NMDA subtype of glutamate receptor antagonist MK-801 prior to overexpectation training, but not by MK-801 application 6 h post-training. CONCLUSIONS These data provide evidence that the memories underlying Pavlovian conditioned approach do undergo reconsolidation and that such reconsolidation can be triggered by overexpectation. Therefore, the updating of appetitive conditioned stimulus and unconditioned stimulus associations underpinning conditioned responding in manners other than extinction training is likely achieved through memory reconsolidation.
Collapse
|
32
|
Li Y, Dalphin N, Hyland BI. Association with reward negatively modulates short latency phasic conditioned responses of dorsal raphe nucleus neurons in freely moving rats. J Neurosci 2013; 33:5065-78. [PMID: 23486976 PMCID: PMC6618993 DOI: 10.1523/jneurosci.5679-12.2013] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2012] [Revised: 01/23/2013] [Accepted: 02/02/2013] [Indexed: 01/23/2023] Open
Abstract
The dorsal raphe nucleus (DRN) is implicated in mood regulation, control of impulsive behavior, and in processing aversive and reward-related signals. DRN neurons show phasic responses to sensory stimuli, but whether association with reward modulates these responses is unknown. We recorded DRN neurons from rats in a contextual conditioned approach paradigm in which an auditory cue was either followed or not followed by reward, depending on a global context signal. Conditioned approach (licking) occurred after cues in the reward context, but was suppressed in the no-reward context. Many DRN neurons showed short-latency phasic activations in response to the cues. There was striking contextual bias, with more and stronger excitations in the no-reward context than in the reward context. Therefore, DRN activity scaled inversely with cue salience and with the probability of subsequent conditioned approach. Tonic changes were similarly discriminatory, with increases being dominant after cues in the no-reward context, when licking was suppressed, and tonic decreases in rate dominant after reward-predictive cues during expression of conditioned licking. Phasic and tonic DRN responses thus provide signals of consistent valence but over different timescales. The tonic changes in activity are consistent with previous data and hypotheses relating DRN activity to response suppression and impulse control. Phasic responses could contribute to this via online modulation of attention allocation through projections to sensory-processing regions.
Collapse
Affiliation(s)
- Yuhong Li
- Department of Physiology, School of Medical Sciences, and Brain Health Research Centre, University of Otago, Dunedin 9054, New Zealand
| | - Neil Dalphin
- Department of Physiology, School of Medical Sciences, and Brain Health Research Centre, University of Otago, Dunedin 9054, New Zealand
| | - Brian I. Hyland
- Department of Physiology, School of Medical Sciences, and Brain Health Research Centre, University of Otago, Dunedin 9054, New Zealand
Collapse
|
33
|
Whitaker LR, Degoulet M, Morikawa H. Social deprivation enhances VTA synaptic plasticity and drug-induced contextual learning. Neuron 2013; 77:335-45. [PMID: 23352169 PMCID: PMC3559005 DOI: 10.1016/j.neuron.2012.11.022] [Citation(s) in RCA: 120] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/20/2012] [Indexed: 01/08/2023]
Abstract
Drug addiction is driven, in part, by powerful drug-related memories. Deficits in social life, particularly during adolescence, increase addiction vulnerability. Social isolation in rodents has been used extensively to model the effects of deficient social experience, yet its impact on learning and memory processes underlying addiction remains elusive. Here, we show that social isolation of rats during a critical period of adolescence (postnatal days 21-42) enhances long-term potentiation of NMDA receptor (NMDAR)-mediated glutamatergic transmission in the ventral tegmental area (VTA). This enhancement, which is caused by an increase in metabotropic glutamate receptor-dependent Ca(2+) signaling, cannot be reversed by subsequent resocialization. Notably, memories of amphetamine- and ethanol-paired contextual stimuli are acquired faster and, once acquired, amphetamine-associated contextual memory is more resistant to extinction in socially isolated rats. We propose that NMDAR plasticity in the VTA may represent a neural substrate by which early life deficits in social experience increase addiction vulnerability.
Collapse
Affiliation(s)
- Leslie R. Whitaker
- Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
- Section of Neurobiology, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
- Institute for Neuroscience, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
| | - Mickael Degoulet
- Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
- Section of Neurobiology, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
- Institute for Neuroscience, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
| | - Hitoshi Morikawa
- Waggoner Center for Alcohol and Addiction Research, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
- Section of Neurobiology, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
- Institute for Neuroscience, University of Texas at Austin, 2400 Speedway, Austin, TX, 78712, USA
Collapse
|
34
|
Abstract
The temporal-difference (TD) algorithm from reinforcement learning provides a simple method for incrementally learning predictions of upcoming events. Applied to classical conditioning, TD models suppose that animals learn a real-time prediction of the unconditioned stimulus (US) on the basis of all available conditioned stimuli (CSs). In the TD model, similar to other error-correction models, learning is driven by prediction errors--the difference between the change in US prediction and the actual US. With the TD model, however, learning occurs continuously from moment to moment and is not artificially constrained to occur in trials. Accordingly, a key feature of any TD model is the assumption about the representation of a CS on a moment-to-moment basis. Here, we evaluate the performance of the TD model with a heretofore unexplored range of classical conditioning tasks. To do so, we consider three stimulus representations that vary in their degree of temporal generalization and evaluate how the representation influences the performance of the TD model on these conditioning tasks.
Collapse
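The TD learning rule described in this abstract can be sketched in a few lines. This is a minimal illustration only: the tapped-delay-line ("complete serial compound") stimulus representation, trial layout, and parameter values below are assumptions for demonstration, not details taken from the paper.

```python
# Minimal sketch of the temporal-difference (TD) prediction-error rule:
# delta(t) = r(t+1) + gamma * V(t+1) - V(t), applied to a single trial
# with a CS turning on at step 2 and a US (reward) delivered at step 8.
# One weight per time step since CS onset (a tapped-delay-line
# representation); all parameter values are illustrative assumptions.

def td_learn(n_steps, cs_onset, us_time, n_trials, alpha=0.1, gamma=0.95):
    w = [0.0] * n_steps  # US prediction carried by each post-CS time step
    for _ in range(n_trials):
        for t in range(n_steps - 1):
            v_t = w[t] if t >= cs_onset else 0.0        # V(t): 0 before CS
            v_next = w[t + 1] if t + 1 >= cs_onset else 0.0
            r = 1.0 if t + 1 == us_time else 0.0        # US as unit reward
            delta = r + gamma * v_next - v_t            # TD prediction error
            if t >= cs_onset:                           # only active units learn
                w[t] += alpha * delta
    return w

weights = td_learn(n_steps=10, cs_onset=2, us_time=8, n_trials=500)
# After training, the US prediction builds up between CS onset and the US.
```

Learning happens at every time step rather than once per trial, which is the feature the abstract emphasizes as distinguishing TD from trial-level error-correction models.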
|
35
|
Rose J, Schiffer AM, Güntürkün O. Striatal dopamine D1 receptors are involved in the dissociation of learning based on reward-magnitude. Neuroscience 2013; 230:132-8. [DOI: 10.1016/j.neuroscience.2012.10.064] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2012] [Revised: 10/22/2012] [Accepted: 10/25/2012] [Indexed: 11/28/2022]
|
36
|
Saito Y, Matsumoto M, Yanagawa Y, Hiraide S, Inoue S, Kubo Y, Shimamura KI, Togashi H. Facilitation of fear extinction by the 5-HT(1A) receptor agonist tandospirone: possible involvement of dopaminergic modulation. Synapse 2012; 67:161-70. [PMID: 23152167 DOI: 10.1002/syn.21621] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2012] [Revised: 09/20/2012] [Accepted: 10/07/2012] [Indexed: 12/30/2022]
Abstract
Fear extinction-based exposure treatment is an important component of psychotherapy for anxiety disorders such as posttraumatic stress disorder (PTSD). Recent studies have focused on pharmacological approaches combined with exposure therapy to augment extinction. In this study, we elucidated the therapeutic potential of the serotonin 1A (5-HT(1A) ) receptor agonist tandospirone compared with the effects of the N-methyl-D-aspartate partial agonist D-cycloserine (DCS), focusing on the possible involvement of dopaminergic mechanisms. We used a rat model of juvenile stress [aversive footshock (FS)] exposure during the third postnatal week (3wFS). The 3wFS group exhibited extinction deficit reflected in sustained fear-related behavior and synaptic dysfunction in the hippocampal CA1 field and medial prefrontal cortex (mPFC), which are responsible for extinction processes. Tandospirone administration (5 mg/kg, i.p.) before and after the extinction trials ameliorated both the behavioral deficit and synaptic dysfunction, i.e., synaptic efficacy in the CA1 field and mPFC associated with extinction training and retrieval, respectively, was potentiated in the tandospirone-treated 3wFS group. Extracellular dopamine release in the mPFC was increased by extinction retrieval in the non-FS control group. This facilitation was not observed in the 3wFS group; however, tandospirone treatment increased cortical dopamine levels after extinction retrieval. DCS (15 mg/kg, i.p.) also ameliorated the extinction deficit in the 3wFS group, but impaired extinction in the non-FS control group. These results suggest that tandospirone has therapeutic potential for enhancing synaptic efficacy associated with extinction processes by involving dopaminergic mechanisms. Pharmacological agents that target cortical dopaminergic systems may provide new insights into the development of therapeutic treatments of anxiety disorders, including PTSD.
Collapse
Affiliation(s)
- Yasuhiro Saito
- Department of Pharmacology, School of Pharmaceutical Science, Health Sciences University of Hokkaido, Ishikari-Tobetsu 061-0293, Japan
Collapse
|
37
|
Neural signals of extinction in the inhibitory microcircuit of the ventral midbrain. Nat Neurosci 2012; 16:71-8. [PMID: 23222913 PMCID: PMC3563090 DOI: 10.1038/nn.3283] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2012] [Accepted: 11/16/2012] [Indexed: 11/24/2022]
Abstract
Midbrain dopaminergic (DA) neurons are thought to guide learning via phasic elevations of firing in response to reward-predicting stimuli. The circuit mechanism for these signals remains unclear. Using extracellular recording during associative learning, we show that inhibitory neurons in the ventral midbrain of mice respond to salient auditory stimuli with a burst of activity that occurs prior to the onset of the phasic response of DA neurons. This population of inhibitory neurons exhibited enhanced responses during extinction and was anticorrelated with the phasic response of simultaneously recorded DA neurons. Optogenetic stimulation suggested that this population was in part derived from inhibitory projection neurons of the substantia nigra that provide a robust monosynaptic inhibition of DA neurons. Our results thus elaborate upon the dynamic upstream circuits that shape the phasic activity of DA neurons and suggest that the inhibitory microcircuit of the midbrain is critical for new learning in extinction.
Collapse
|
38
|
Moustafa AA, Gilbertson MW, Orr SP, Herzallah MM, Servatius RJ, Myers CE. A model of amygdala-hippocampal-prefrontal interaction in fear conditioning and extinction in animals. Brain Cogn 2012; 81:29-43. [PMID: 23164732 DOI: 10.1016/j.bandc.2012.10.005] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2012] [Revised: 09/26/2012] [Accepted: 10/09/2012] [Indexed: 02/06/2023]
Abstract
Empirical research has shown that the amygdala, hippocampus, and ventromedial prefrontal cortex (vmPFC) are involved in fear conditioning. However, the functional contribution of each brain area and the nature of their interactions are not clearly understood. Here, we extend existing neural network models of the functional roles of the hippocampus in classical conditioning to include interactions with the amygdala and prefrontal cortex. We apply the model to fear conditioning, in which animals learn physiological (e.g. heart rate) and behavioral (e.g. freezing) responses to stimuli that have been paired with a highly aversive event (e.g. electrical shock). The key feature of our model is that learning of these conditioned responses in the central nucleus of the amygdala is modulated by two separate processes, one from basolateral amygdala and signaling a positive prediction error, and one from the vmPFC, via the intercalated cells of the amygdala, and signaling a negative prediction error. In addition, we propose that hippocampal input to both vmPFC and basolateral amygdala is essential for contextual modulation of fear acquisition and extinction. The model is sufficient to account for a body of data from various animal fear conditioning paradigms, including acquisition, extinction, reacquisition, and context specificity effects. Consistent with studies on lesioned animals, our model shows that damage to the vmPFC impairs extinction, while damage to the hippocampus impairs extinction in a different context (e.g., a different conditioning chamber from that used in initial training in animal experiments). We also discuss model limitations and predictions, including the effects of number of training trials on fear conditioning.
Collapse
Affiliation(s)
- Ahmed A Moustafa
- School of Social Sciences and Psychology, Marcs Institute for Brain and Behaviour, University of Western Sydney, Sydney, NSW, Australia.
Collapse
|
39
|
Aggarwal M, Hyland BI, Wickens JR. Neural control of dopamine neurotransmission: implications for reinforcement learning. Eur J Neurosci 2012; 35:1115-23. [DOI: 10.1111/j.1460-9568.2012.08055.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
40
|
Li W, Doyon WM, Dani JA. Quantitative unit classification of ventral tegmental area neurons in vivo. J Neurophysiol 2012; 107:2808-20. [PMID: 22378178 DOI: 10.1152/jn.00575.2011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Neurons in the ventral tegmental area (VTA) synthesize several major neurotransmitters, including dopamine (DA), GABA, and glutamate. To classify VTA single-unit neural activity from freely moving rats, we used hierarchical agglomerative clustering and probability distributions as quantitative methods. After many parameters were examined, a firing rate of 10 Hz emerged as a transition frequency between clusters of low-firing and high-firing neurons. To form a subgroup identified as high-firing neurons with GABAergic characteristics, the high-firing classification was sorted by spike duration. To form a subgroup identified as putative DA neurons, the low-firing classification was sorted by DA D2-type receptor pharmacological responses to quinpirole and eticlopride. Putative DA neurons were inhibited by the D2-type receptor agonist quinpirole and returned to near-baseline firing rates or higher following the D2-type receptor antagonist eticlopride. Other unit types showed different responses to these D2-type receptor drugs. A multidimensional comparison of neural properties indicated that these subgroups often clustered independently of each other with minimal overlap. Firing pattern variability reliably distinguished putative DA neurons from other unit types. A combination of phasic burst properties and a low skew in the interspike interval distribution produced a neural population that was comparable to the one sorted by D2 pharmacology. These findings provide a quantitative statistical approach for the classification of VTA neurons in unanesthetized animals.
Collapse
Affiliation(s)
- Wei Li
- Center on Addiction, Learning, Memory, Department of Neuroscience, Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA
Collapse
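The classification logic in the abstract above (a 10 Hz firing-rate boundary, with high-firing units sub-sorted by spike duration and low-firing units requiring D2-receptor pharmacology) can be sketched as a simple decision rule. The duration threshold, label strings, and function signature below are illustrative assumptions, not values from the paper.

```python
# Sketch of the VTA unit-classification scheme described above.
# A 10 Hz rate boundary separates low- and high-firing clusters;
# spike duration sub-sorts the high-firing group. The 1.0 ms duration
# cutoff and the labels are assumptions for illustration only.

def classify_unit(firing_rate_hz, spike_duration_ms):
    if firing_rate_hz >= 10.0:
        # Narrow-waveform, fast-firing units: GABAergic characteristics.
        if spike_duration_ms < 1.0:
            return "high-firing (putative GABA)"
        return "high-firing (other)"
    # Rate alone cannot identify putative DA neurons; the paper sorts
    # low-firing units by D2-type responses to quinpirole/eticlopride.
    return "low-firing (pharmacology required)"
```

The point the sketch captures is that rate and waveform give a coarse partition, while the putative-DA label depends on an additional pharmacological criterion.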
|
41
|
Attenuating GABA(A) receptor signaling in dopamine neurons selectively enhances reward learning and alters risk preference in mice. J Neurosci 2012; 31:17103-12. [PMID: 22114279 DOI: 10.1523/jneurosci.1715-11.2011] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open
Abstract
Phasic dopamine (DA) transmission encodes the value of reward-predictive stimuli and influences both learning and decision-making. Altered DA signaling is associated with psychiatric conditions characterized by risky choices such as pathological gambling. These observations highlight the importance of understanding how DA neuron activity is modulated. While excitatory drive onto DA neurons is critical for generating phasic DA responses, emerging evidence suggests that inhibitory signaling also modulates these responses. To address the functional importance of inhibitory signaling in DA neurons, we generated mice lacking the β3 subunit of the GABA(A) receptor specifically in DA neurons (β3-KO mice) and examined their behavior in tasks that assessed appetitive learning, aversive learning, and risk preference. DA neurons in midbrain slices from β3-KO mice exhibited attenuated GABA-evoked IPSCs. Furthermore, electrical stimulation of excitatory afferents to DA neurons elicited more DA release in the nucleus accumbens of β3-KO mice as measured by fast-scan cyclic voltammetry. β3-KO mice were more active than controls when given morphine, which correlated with potential compensatory upregulation of GABAergic tone onto DA neurons. β3-KO mice learned faster in two food-reinforced learning paradigms, but extinguished their learned behavior normally. Enhanced learning was specific for appetitive tasks, as aversive learning was unaffected in β3-KO mice. Finally, we found that β3-KO mice had enhanced risk preference in a probabilistic selection task that required mice to choose between a small certain reward and a larger uncertain reward. Collectively, these findings identify a selective role for GABA(A) signaling in DA neurons in appetitive learning and decision-making.
Collapse
|
42
|
Penner MR, Mizumori SJY. Neural systems analysis of decision making during goal-directed navigation. Prog Neurobiol 2011; 96:96-135. [PMID: 21964237 DOI: 10.1016/j.pneurobio.2011.08.010] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2011] [Revised: 08/06/2011] [Accepted: 08/29/2011] [Indexed: 10/17/2022]
Abstract
The ability to make adaptive decisions during goal-directed navigation is a fundamental and highly evolved behavior that requires continual coordination of perceptions, learning and memory processes, and the planning of behaviors. Here, a neurobiological account for such coordination is provided by integrating current literatures on spatial context analysis and decision-making. This integration includes discussions of our current understanding of the role of the hippocampal system in experience-dependent navigation, how hippocampal information comes to impact midbrain and striatal decision making systems, and finally the role of the striatum in the implementation of behaviors based on recent decisions. These discussions extend across cellular to neural systems levels of analysis. Not only are key findings described, but also fundamental organizing principles within and across neural systems, as well as between neural systems functions and behavior, are emphasized. It is suggested that studying decision making during goal-directed navigation is a powerful model for studying interactive brain systems and their mediation of complex behaviors.
Collapse
Affiliation(s)
- Marsha R Penner
- Department of Psychology, University of Washington, Seattle, WA 98195-1525, United States
Collapse
|
43
|
Chorley P, Seth AK. Dopamine-signaled reward predictions generated by competitive excitation and inhibition in a spiking neural network model. Front Comput Neurosci 2011; 5:21. [PMID: 21629770 PMCID: PMC3099399 DOI: 10.3389/fncom.2011.00021] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2010] [Accepted: 04/26/2011] [Indexed: 11/13/2022] Open
Abstract
Dopaminergic neurons in the mammalian substantia nigra display characteristic phasic responses to stimuli which reliably predict the receipt of primary rewards. These responses have been suggested to encode reward prediction-errors similar to those used in reinforcement learning. Here, we propose a model of dopaminergic activity in which prediction-error signals are generated by the joint action of short-latency excitation and long-latency inhibition, in a network undergoing dopaminergic neuromodulation of both spike-timing dependent synaptic plasticity and neuronal excitability. In contrast to previous models, sensitivity to recent events is maintained by the selective modification of specific striatal synapses, efferent to cortical neurons exhibiting stimulus-specific, temporally extended activity patterns. Our model shows, in the presence of significant background activity, (i) a shift in dopaminergic response from reward to reward-predicting stimuli, (ii) preservation of a response to unexpected rewards, and (iii) a precisely timed below-baseline dip in activity observed when expected rewards are omitted.
Collapse
Affiliation(s)
- Paul Chorley
- Neurodynamics and Consciousness Laboratory, School of Informatics, University of Sussex Brighton, UK
Collapse
|
44
|
Parush N, Tishby N, Bergman H. Dopaminergic Balance between Reward Maximization and Policy Complexity. Front Syst Neurosci 2011; 5:22. [PMID: 21603228 PMCID: PMC3093748 DOI: 10.3389/fnsys.2011.00022] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2010] [Accepted: 04/20/2011] [Indexed: 11/17/2022] Open
Abstract
Previous reinforcement-learning models of the basal ganglia network have highlighted the role of dopamine in encoding the mismatch between prediction and reality. Far less attention has been paid to the computational goals and algorithms of the main-axis (actor). Here, we construct a top-down model of the basal ganglia with emphasis on the role of dopamine as both a reinforcement learning signal and as a pseudo-temperature signal controlling the general level of basal ganglia excitability and motor vigilance of the acting agent. We argue that the basal ganglia endow the thalamic-cortical networks with the optimal dynamic tradeoff between two constraints: minimizing the policy complexity (cost) and maximizing the expected future reward (gain). We show that this multi-dimensional optimization processes results in an experience-modulated version of the softmax behavioral policy. Thus, as in classical softmax behavioral policies, probability of actions are selected according to their estimated values and the pseudo-temperature, but in addition also vary according to the frequency of previous choices of these actions. We conclude that the computational goal of the basal ganglia is not to maximize cumulative (positive and negative) reward. Rather, the basal ganglia aim at optimization of independent gain and cost functions. Unlike previously suggested single-variable maximization processes, this multi-dimensional optimization process leads naturally to a softmax-like behavioral policy. We suggest that beyond its role in the modulation of the efficacy of the cortico-striatal synapses, dopamine directly affects striatal excitability and thus provides a pseudo-temperature signal that modulates the tradeoff between gain and cost. The resulting experience and dopamine modulated softmax policy can then serve as a theoretical framework to account for the broad range of behaviors and clinical states governed by the basal ganglia and dopamine systems.
Collapse
Affiliation(s)
- Naama Parush
- The Interdisciplinary Center for Neural Computation, The Hebrew University Jerusalem, Israel
Collapse
|
45
|
Gerdjikov TV, Baker TW, Beninger RJ. Amphetamine-induced enhancement of responding for conditioned reward in rats: interactions with repeated testing. Psychopharmacology (Berl) 2011; 214:891-9. [PMID: 21107536 DOI: 10.1007/s00213-010-2099-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/05/2009] [Accepted: 11/08/2010] [Indexed: 12/28/2022]
Abstract
RATIONALE The mesolimbic dopamine system underlies the ability of reward-related stimuli to control operant behavior. Previous work has shown that amphetamine potentiates operant responding for conditioned rewards (CRs). OBJECTIVES Here, we asked whether the profile of this amphetamine-produced potentiation changes with repeated CR presentation, i.e., as the CR is being extinguished. METHODS Amphetamine (0-1.0 mg/kg, i.p.), administered over four daily sessions using a Latin square design, dose-dependently increased lever pressing for a 'lights-off' stimulus previously paired with food in rats. RESULTS The amphetamine-produced enhancement of responding for CR was significantly modulated with repeated CR exposure: it was strongest on day 1 and became less pronounced in subsequent sessions whereas the CR effect persisted. In further experiments, rats receiving LiCl devaluation of the primary reward failed to show a significant reduction in the amphetamine-produced enhancement of responding for CR. CONCLUSIONS The nature of the dissociable effects of amphetamine on responding for CR versus the CR effect itself remains to be elucidated.
Affiliation(s)
- Todor V Gerdjikov
- Department of Psychology, Queen's University, Kingston, ON, K7L 3N6, Canada.
|
46
|
Kim YB, Matthews M, Moghaddam B. Putative γ-aminobutyric acid neurons in the ventral tegmental area have a similar pattern of plasticity as dopamine neurons during appetitive and aversive learning. Eur J Neurosci 2010; 32:1564-72. [PMID: 21040517 DOI: 10.1111/j.1460-9568.2010.07371.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
Dopamine influences affective, motor and cognitive processing, and multiple forms of learning and memory. This multifaceted functionality, which operates across long temporal windows, is broader than the narrow and temporally constrained role often ascribed to dopamine neurons as reward prediction error detectors. Given the modulatory nature of dopamine neurotransmission, that dopamine release is activated by both aversive and appetitive stimuli, and that dopamine receptors are often localized extrasynaptically, a role for dopamine in transmitting precise error signals has been questioned. Here we recorded from ventral tegmental area (VTA) neurons, while exposing rats to novel stimuli that were predictive of an appetitive or aversive outcome in the same behavioral session. The VTA contains dopamine and γ-aminobutyric acid (GABA) neurons that project to striatal and cortical regions and are strongly implicated in learning and affective processing. The response of VTA neurons, regardless of whether they had putative dopamine or GABA waveforms, transformed flexibly as animals learned to associate novel stimuli from different sensory modalities to appetitive or aversive outcomes. Learning the appetitive association led to larger excitatory VTA responses, whereas acquiring the aversive association led to a biphasic response of brief excitation followed by sustained inhibition. These responses shifted rapidly as outcome contingencies changed. These data suggest that VTA neurons interface sensory information with representational memory of aversive and appetitive events. This pattern of plasticity was not selective for putative dopamine neurons and generalized to other cells, suggesting that the temporally precise information transfer from the VTA may be mediated by faster acting GABA neurons.
Affiliation(s)
- Yun-Bok Kim
- Department of Neuroscience, University of Pittsburgh, A210 Langley Hall, Pittsburgh, PA 15260, USA
|
47
|
Bromberg-Martin ES, Matsumoto M, Hikosaka O. Dopamine in motivational control: rewarding, aversive, and alerting. Neuron 2010; 68:815-34. [PMID: 21144997 DOI: 10.1016/j.neuron.2010.11.022] [Citation(s) in RCA: 1476] [Impact Index Per Article: 105.4] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/29/2010] [Indexed: 01/18/2023]
Abstract
Midbrain dopamine neurons are well known for their strong responses to rewards and their critical role in positive motivation. It has become increasingly clear, however, that dopamine neurons also transmit signals related to salient but nonrewarding experiences such as aversive and alerting events. Here we review recent advances in understanding the reward and nonreward functions of dopamine. Based on these data, we propose that dopamine neurons come in multiple types that are connected with distinct brain networks and have distinct roles in motivational control. Some dopamine neurons encode motivational value, supporting brain networks for seeking, evaluation, and value learning. Others encode motivational salience, supporting brain networks for orienting, cognition, and general motivation. Both types of dopamine neurons are augmented by an alerting signal involved in rapid detection of potentially important sensory cues. We hypothesize that these dopaminergic pathways for value, salience, and alerting cooperate to support adaptive behavior.
Affiliation(s)
- Ethan S Bromberg-Martin
- Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
48
|
Frank MJ, Fossella JA. Neurogenetics and pharmacology of learning, motivation, and cognition. Neuropsychopharmacology 2011; 36:133-52. [PMID: 20631684 PMCID: PMC3055524 DOI: 10.1038/npp.2010.96] [Citation(s) in RCA: 146] [Impact Index Per Article: 10.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/01/2010] [Revised: 06/09/2010] [Accepted: 06/10/2010] [Indexed: 02/07/2023]
Abstract
Many of the individual differences in cognition, motivation, and learning-and the disruption of these processes in neurological conditions-are influenced by genetic factors. We provide an integrative synthesis across human and animal studies, focusing on a recent spate of evidence implicating a role for genes controlling dopaminergic function in frontostriatal circuitry, including COMT, DARPP-32, DAT1, DRD2, and DRD4. These genetic effects are interpreted within theoretical frameworks developed in the context of the broader cognitive and computational neuroscience literature, constrained by data from pharmacological, neuroimaging, electrophysiological, and patient studies. In this framework, genes modulate the efficacy of particular neural computations, and effects of genetic variation are revealed by assays designed to be maximally sensitive to these computations. We discuss the merits and caveats of this approach and outline a number of novel candidate genes of interest for future study.
Affiliation(s)
- Michael J Frank
- Department of Cognitive, Linguistic and Psychological Sciences, Brown Institute for Brain Science, Brown University, Providence, RI 02912-1978, USA.
|
49
|
Corlett PR, Taylor JR, Wang XJ, Fletcher PC, Krystal JH. Toward a neurobiology of delusions. Prog Neurobiol 2010; 92:345-69. [PMID: 20558235 PMCID: PMC3676875 DOI: 10.1016/j.pneurobio.2010.06.007] [Citation(s) in RCA: 257] [Impact Index Per Article: 17.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2010] [Revised: 05/06/2010] [Accepted: 06/08/2010] [Indexed: 12/21/2022]
Abstract
Delusions are the false and often incorrigible beliefs that can cause severe suffering in mental illness. We cannot yet explain them in terms of underlying neurobiological abnormalities. However, by drawing on recent advances in the biological, computational and psychological processes of reinforcement learning, memory, and perception it may be feasible to account for delusions in terms of cognition and brain function. The account focuses on a particular parameter, prediction error--the mismatch between expectation and experience--that provides a computational mechanism common to cortical hierarchies, fronto-striatal circuits and the amygdala as well as parietal cortices. We suggest that delusions result from aberrations in how brain circuits specify hierarchical predictions, and how they compute and respond to prediction errors. Defects in these fundamental brain mechanisms can vitiate perception, memory, bodily agency and social learning such that individuals with delusions experience an internal and external world that healthy individuals would find difficult to comprehend. The present model attempts to provide a framework through which we can build a mechanistic and translational understanding of these puzzling symptoms.
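The prediction-error quantity at the center of this account is simple to state: the mismatch between expectation and experience, used to update the expectation. A minimal Rescorla-Wagner-style sketch of one such update, not the authors' full hierarchical model:

```python
def update_belief(expected, observed, learning_rate=0.1):
    """Move a belief toward an observation in proportion to the
    mismatch (the prediction error, delta). The learning rate is an
    arbitrary illustrative value."""
    delta = observed - expected  # prediction error
    return expected + learning_rate * delta, delta
```

On this view, aberrant belief updating can be framed as prediction errors that are miscomputed or misweighted, so that expectations are revised when they should not be.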
Affiliation(s)
- P R Corlett
- Department of Psychiatry, Yale University School of Medicine, Connecticut Mental Health Centre, Abraham Ribicoff Research Facility, 34 Park Street, New Haven, CT 06519, USA.
|
50
|
Bromberg-Martin ES, Matsumoto M, Nakahara H, Hikosaka O. Multiple timescales of memory in lateral habenula and dopamine neurons. Neuron 2010; 67:499-510. [PMID: 20696385 DOI: 10.1016/j.neuron.2010.06.031] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/08/2010] [Indexed: 01/10/2023]
Abstract
Midbrain dopamine neurons are thought to signal predictions about future rewards based on the memory of past rewarding experience. Little is known about the source of their reward memory and the factors that control its timescale. Here we recorded from dopamine neurons, as well as one of their sources of input, the lateral habenula, while animals predicted upcoming rewards based on the past reward history. We found that lateral habenula and dopamine neurons accessed two distinct reward memories: a short-timescale memory expressed at the start of the task and a near-optimal long-timescale memory expressed when a future reward outcome was revealed. The short- and long-timescale memories were expressed in different forms of reward-oriented eye movements. Our data show that the habenula-dopamine pathway contains multiple timescales of memory and provide evidence for their role in motivated behavior.
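The coexistence of short- and long-timescale reward memories can be illustrated with two leaky integrators running on the same reward stream; the decay constants below are arbitrary illustrative choices, not values fitted to the recordings:

```python
class TwoTimescaleMemory:
    """Tracks reward history with a fast and a slow leaky integrator,
    illustrating how a short-timescale and a long-timescale reward
    memory can coexist in one circuit. Rates are assumptions."""
    def __init__(self, fast_rate=0.5, slow_rate=0.05):
        self.fast_rate, self.slow_rate = fast_rate, slow_rate
        self.fast_est = self.slow_est = 0.0

    def observe(self, reward):
        # each estimate moves toward the reward at its own rate
        self.fast_est += self.fast_rate * (reward - self.fast_est)
        self.slow_est += self.slow_rate * (reward - self.slow_est)
```

The fast estimate tracks recent outcomes and reacts sharply to a single omitted reward, while the slow estimate reflects the longer reward history, mirroring the two memory timescales the recordings distinguish.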
|