1
Gershman SJ, Assad JA, Datta SR, Linderman SW, Sabatini BL, Uchida N, Wilbrecht L. Explaining dopamine through prediction errors and beyond. Nat Neurosci 2024; 27:1645-1655. [PMID: 39054370 DOI: 10.1038/s41593-024-01705-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 06/13/2024] [Indexed: 07/27/2024]
Abstract
The most influential account of phasic dopamine holds that it reports reward prediction errors (RPEs). The RPE-based interpretation of dopamine signaling is, in its original form, probably too simple and fails to explain all the properties of phasic dopamine observed in behaving animals. This Perspective helps to resolve some of the conflicting interpretations of dopamine that currently exist in the literature. We focus on the following three empirical challenges to the RPE theory of dopamine: why does dopamine (1) ramp up as animals approach rewards, (2) respond to sensory and motor features and (3) influence action selection? We argue that the prediction error concept, once it has been suitably modified and generalized based on an analysis of each computational problem, answers each challenge. Nonetheless, there are a number of additional empirical findings that appear to demand fundamentally different theoretical explanations beyond encoding RPE. Therefore, looking forward, we discuss the prospects for a unifying theory that respects the diversity of dopamine signaling and function as well as the complex circuitry that both underlies and responds to dopaminergic transmission.
Affiliation(s)
- Samuel J Gershman
  - Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA
  - Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- John A Assad
  - Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Scott W Linderman
  - Department of Statistics and Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Bernardo L Sabatini
  - Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
  - Department of Neurobiology, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Naoshige Uchida
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Linda Wilbrecht
  - Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
2
Floeder JR, Jeong H, Mohebi A, Namboodiri VMK. Mesolimbic dopamine ramps reflect environmental timescales. bioRxiv 2024:2024.03.27.587103. [PMID: 38659749 PMCID: PMC11042231 DOI: 10.1101/2024.03.27.587103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Mesolimbic dopamine activity occasionally exhibits ramping dynamics, reigniting debate on theories of dopamine signaling. This debate is ongoing partly because the experimental conditions under which dopamine ramps emerge remain poorly understood. Here, we show that during Pavlovian and instrumental conditioning, mesolimbic dopamine ramps are only observed when the inter-trial interval is short relative to the trial period. These results constrain theories of dopamine signaling and identify a critical variable determining the emergence of dopamine ramps.
Affiliation(s)
- Joseph R Floeder
  - Neuroscience Graduate Program, University of California, San Francisco, CA, USA
- Huijeong Jeong
  - Department of Neurology, University of California, San Francisco, CA, USA
- Ali Mohebi
  - Department of Neurology, University of California, San Francisco, CA, USA
- Vijay Mohan K Namboodiri
  - Neuroscience Graduate Program, University of California, San Francisco, CA, USA
  - Department of Neurology, University of California, San Francisco, CA, USA
  - Weill Institute for Neurosciences, Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, CA, USA
3
Glykos V, Fujisawa S. Memory-specific encoding activities of the ventral tegmental area dopamine and GABA neurons. eLife 2024; 12:RP89743. [PMID: 38512339 PMCID: PMC10957172 DOI: 10.7554/elife.89743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2024]
Abstract
Although the midbrain dopamine (DA) system plays a crucial role in higher cognitive functions, including updating and maintaining short-term memory, the encoding properties of the somatic spiking activity of ventral tegmental area (VTA) DA neurons for short-term memory computations have not yet been identified. Here, we probed and analyzed the activity of optogenetically identified DA and GABA neurons while mice engaged in short-term memory-dependent behavior in a T-maze task. Single-neuron analysis revealed that significant subpopulations of DA and GABA neurons responded differently between left and right trials in the memory delay. With a series of control behavioral tasks and regression analysis tools, we show that firing rate differences are linked to short-term memory-dependent decisions and cannot be explained by reward-related processes, motivated behavior, or motor-related activities. This evidence provides novel insights into the mnemonic encoding activities of midbrain DA and GABA neurons.
Affiliation(s)
- Vasileios Glykos
  - Laboratory for Systems Neurophysiology, RIKEN Center for Brain Science, Wako, Japan
  - Synapse Biology Unit, Okinawa Institute of Science and Technology, Okinawa, Japan
- Shigeyoshi Fujisawa
  - Laboratory for Systems Neurophysiology, RIKEN Center for Brain Science, Wako, Japan
4
Amo R. Prediction error in dopamine neurons during associative learning. Neurosci Res 2024; 199:12-20. [PMID: 37451506 DOI: 10.1016/j.neures.2023.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 06/18/2023] [Accepted: 07/07/2023] [Indexed: 07/18/2023]
Abstract
Dopamine neurons have long been thought to facilitate learning by broadcasting reward prediction error (RPE), a teaching signal used in machine learning, but more recent work has advanced alternative models of dopamine's computational role. Here, I revisit this critical issue and review new experimental evidence that tightens the link between dopamine activity and RPE. First, I introduce the recent observation of a gradual backward shift of dopamine activity that had eluded researchers for over a decade. I also discuss several other findings, such as dopamine ramping, that were initially interpreted as conflicting with RPE but were later found to be consistent with it. These findings improve our understanding of neural computation in dopamine neurons.
Affiliation(s)
- Ryunosuke Amo
  - Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
5
Kaźmierczak M, Nicola SM. The Arousal-motor Hypothesis of Dopamine Function: Evidence that Dopamine Facilitates Reward Seeking in Part by Maintaining Arousal. Neuroscience 2022; 499:64-103. [PMID: 35853563 PMCID: PMC9479757 DOI: 10.1016/j.neuroscience.2022.07.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/14/2021] [Revised: 06/28/2022] [Accepted: 07/12/2022] [Indexed: 10/17/2022]
Abstract
Dopamine facilitates approach to reward via its actions on dopamine receptors in the nucleus accumbens. For example, blocking either D1 or D2 dopamine receptors in the accumbens reduces the proportion of reward-predictive cues to which rats respond with cued approach. Recent evidence indicates that accumbens dopamine also promotes wakefulness and arousal, but the relationship between dopamine's roles in arousal and reward seeking remains unexplored. Here, we show that the ability of systemic or intra-accumbens injections of the D1 antagonist SCH23390 to reduce cued approach to reward depends on the animal's state of arousal. Handling the animal, a manipulation known to increase arousal, was sufficient to reverse the behavioral effects of the antagonist. In addition, SCH23390 reduced spontaneous locomotion and increased time spent in sleep postures, both consistent with reduced arousal, but also increased time spent immobile in postures inconsistent with sleep. In contrast, the ability of the D2 antagonist haloperidol to reduce cued approach was not reversible by handling. Haloperidol reduced spontaneous locomotion but did not increase sleep postures, instead increasing immobility in non-sleep postures. We place these results in the context of the extensive literature on dopamine's contributions to behavior, and propose the arousal-motor hypothesis. This novel synthesis, which proposes that two main functions of dopamine are to promote arousal and facilitate motor behavior, accounts both for our findings and many previous behavioral observations that have led to disparate and conflicting conclusions.
Affiliation(s)
- Marcin Kaźmierczak
  - Departments of Neuroscience and Psychiatry, Albert Einstein College of Medicine, 1300 Morris Park Ave, Forchheimer 111, Bronx, NY 10461, USA
- Saleem M Nicola
  - Departments of Neuroscience and Psychiatry, Albert Einstein College of Medicine, 1300 Morris Park Ave, Forchheimer 111, Bronx, NY 10461, USA
6
Fung BJ, Sutlief E, Hussain Shuler MG. Dopamine and the interdependency of time perception and reward. Neurosci Biobehav Rev 2021; 125:380-391. [PMID: 33652021 PMCID: PMC9062982 DOI: 10.1016/j.neubiorev.2021.02.030] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/07/2020] [Revised: 02/16/2021] [Accepted: 02/19/2021] [Indexed: 01/14/2023]
Abstract
Time is a fundamental dimension of our perception of the world and is therefore of critical importance to the organization of human behavior. A corpus of work - including recent optogenetic evidence - implicates striatal dopamine as a crucial factor influencing the perception of time. Another stream of literature implicates dopamine in reward and motivation processes. However, these two domains of research have remained largely separated, despite neurobiological overlap and the apothegmatic notion that "time flies when you're having fun". This article constitutes a review of the literature linking time perception and reward, including neurobiological and behavioral studies. Together, these provide compelling support for the idea that time perception and reward processing interact via a common dopaminergic mechanism.
Affiliation(s)
- Bowen J Fung
  - The Behavioural Insights Team, Suite 3, Level 13/9 Hunter St, Sydney NSW 2000, Australia
- Elissa Sutlief
  - The Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University School of Medicine, Woods Basic Science Building Rm914, 725 N. Wolfe Street, Baltimore, MD 21205, USA
- Marshall G Hussain Shuler
  - The Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University School of Medicine, Woods Basic Science Building Rm914, 725 N. Wolfe Street, Baltimore, MD 21205, USA
  - Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, 725 N Wolfe Street, Baltimore, MD 21205, USA
7
Affiliation(s)
- Dhanjai
  - Department of Mathematical and Physical Sciences, Concordia University of Edmonton, 7128 Ada Blvd NW, Edmonton, AB T5B 4E4, Canada
  - Physical Sciences Department, MacEwan University, 10700-104 Avenue, Edmonton, AB T5J 4S2, Canada
- Nancy Yu
  - Physical Sciences Department, MacEwan University, 10700-104 Avenue, Edmonton, AB T5J 4S2, Canada
- Samuel M. Mugo
  - Physical Sciences Department, MacEwan University, 10700-104 Avenue, Edmonton, AB T5J 4S2, Canada
8
Muhammed K, Manohar S, Ben Yehuda M, Chong TTJ, Tofaris G, Lennox G, Bogdanovic M, Hu M, Husain M. Reward sensitivity deficits modulated by dopamine are associated with apathy in Parkinson's disease. Brain 2016; 139:2706-2721. [PMID: 27452600 PMCID: PMC5035817 DOI: 10.1093/brain/aww188] [Citation(s) in RCA: 82] [Impact Index Per Article: 10.3] [Received: 03/11/2016] [Revised: 05/27/2016] [Accepted: 06/16/2016] [Indexed: 12/12/2022]
Abstract
Apathy is a debilitating and under-recognized condition that has a significant impact in many neurodegenerative disorders. In Parkinson's disease, it is now known to contribute to worse outcomes and a reduced quality of life for patients and carers, adding to health costs and extending disease burden. However, despite its clinical importance, there remains limited understanding of mechanisms underlying apathy. Here we investigated if insensitivity to reward might be a contributory factor and examined how this relates to severity of clinical symptoms. To do this we created novel ocular measures that indexed motivation level using pupillary and saccadic response to monetary incentives, allowing reward sensitivity to be evaluated objectively. This approach was tested in 40 patients with Parkinson's disease, 31 elderly age-matched control participants and 20 young healthy volunteers. Thirty patients were examined ON and OFF their dopaminergic medication in two counterbalanced sessions, so that the effect of dopamine on reward sensitivity could be assessed. Pupillary dilation to increasing levels of monetary reward on offer provided quantifiable metrics of motivation in healthy subjects as well as patients. Moreover, pupillary reward sensitivity declined with age. In Parkinson's disease, reduced pupillary modulation by incentives was predictive of apathy severity, and independent of motor impairment and autonomic dysfunction as assessed using overnight heart rate variability measures. Reward sensitivity was further modulated by dopaminergic state, with blunted sensitivity when patients were OFF dopaminergic drugs, both in pupillary response and saccadic peak velocity response to reward. These findings suggest that reward insensitivity may be a contributory mechanism to apathy and provide potential new clinical measures for improved diagnosis and monitoring of apathy.
Affiliation(s)
- Kinan Muhammed
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
- Sanjay Manohar
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
- Michael Ben Yehuda
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
- Trevor T-J Chong
  - Department of Cognitive Science, Macquarie University, Sydney, Australia
- George Tofaris
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Graham Lennox
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Marko Bogdanovic
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Michele Hu
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Masud Husain
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
9
Lloyd K, Dayan P. Tamping Ramping: Algorithmic, Implementational, and Computational Explanations of Phasic Dopamine Signals in the Accumbens. PLoS Comput Biol 2015; 11:e1004622. [PMID: 26699940 PMCID: PMC4689534 DOI: 10.1371/journal.pcbi.1004622] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 08/24/2015] [Accepted: 10/25/2015] [Indexed: 11/26/2022]
Abstract
Substantial evidence suggests that the phasic activity of dopamine neurons represents reinforcement learning’s temporal difference prediction error. However, recent reports of ramp-like increases in dopamine concentration in the striatum when animals are about to act, or are about to reach rewards, appear to pose a challenge to established thinking. This is because the implied activity is persistently predictable by preceding stimuli, and so cannot arise as this sort of prediction error. Here, we explore three possible accounts of such ramping signals: (a) the resolution of uncertainty about the timing of action; (b) the direct influence of dopamine over mechanisms associated with making choices; and (c) a new model of discounted vigour. Collectively, these suggest that dopamine ramps may be explained, with only minor disturbance, by standard theoretical ideas, though urgent questions remain regarding their proximal cause. We suggest experimental approaches to disentangling which of the proposed mechanisms are responsible for dopamine ramps. Dopamine has long been implicated in reward-motivated behaviour. Theory and experiments suggest that activity of dopamine-containing neurons resembles a temporally-sophisticated prediction error used to learn expectations of future reward. This account would appear to be inconsistent with recent observations of ‘ramps’, i.e., gradual increases in extracellular dopamine concentration prior to the execution of actions or the acquisition of rewards. We explore three different possible explanations of such ramping signals as arising: (a) when subjects experience uncertainty about when actions will be executed; (b) when dopamine itself influences the timecourse of choice; and (c) under a new model in which ‘quasi-tonic’ dopamine signals arise through a form of temporal discounting. We thereby show that dopamine ramps can be integrated with current theories, and also suggest experiments to clarify which mechanisms are involved.
Affiliation(s)
- Kevin Lloyd
  - Gatsby Computational Neuroscience Unit, London, United Kingdom
- Peter Dayan
  - Gatsby Computational Neuroscience Unit, London, United Kingdom
10
Blanchard TC, Strait CE, Hayden BY. Ramping ensemble activity in dorsal anterior cingulate neurons during persistent commitment to a decision. J Neurophysiol 2015; 114:2439-49. [PMID: 26334016 DOI: 10.1152/jn.00711.2015] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 07/20/2015] [Accepted: 08/27/2015] [Indexed: 11/22/2022]
Abstract
We frequently need to commit to a choice to achieve our goals; however, the neural processes that keep us motivated in pursuit of delayed goals remain obscure. We examined ensemble responses of neurons in macaque dorsal anterior cingulate cortex (dACC), an area previously implicated in self-control and persistence, in a task that requires commitment to a choice to obtain a reward. After reward receipt, dACC neurons signaled reward amount with characteristic ensemble firing rate patterns; during the delay in anticipation of the reward, ensemble activity smoothly and gradually came to resemble the postreward pattern. On the subset of risky trials, in which a reward was anticipated with 50% certainty, ramping ensemble activity evolved to the pattern associated with the anticipated reward (and not with the anticipated loss) and then, on loss trials, took on an inverted form anticorrelated with the form associated with a win. These findings enrich our knowledge of reward processing in dACC and may have broader implications for our understanding of persistence and self-control.
Affiliation(s)
- Tommy C Blanchard
  - Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
- Caleb E Strait
  - Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
- Benjamin Y Hayden
  - Department of Brain and Cognitive Sciences and Center for Visual Science, University of Rochester, Rochester, New York
11
Littman ML. Reinforcement learning improves behaviour from evaluative feedback. Nature 2015; 521:445-51. [PMID: 26017443 DOI: 10.1038/nature14540] [Citation(s) in RCA: 76] [Impact Index Per Article: 8.4] [Received: 01/11/2015] [Accepted: 04/28/2015] [Indexed: 11/09/2022]
Abstract
Reinforcement learning is a branch of machine learning concerned with using experience gained through interacting with the world and evaluative feedback to improve a system's ability to make behavioural decisions. It has been called the artificial intelligence problem in a microcosm because learning algorithms must act autonomously to perform well and achieve their goals. Partly driven by the increasing availability of rich data, recent years have seen exciting advances in the theory and practice of reinforcement learning, including developments in fundamental technical areas such as generalization, planning, exploration and empirical methodology, leading to increasing applicability to real-life problems.
Affiliation(s)
- Michael L Littman
  - Department of Computer Science, Brown University, Providence, Rhode Island 02912, USA
12
Reduced graphene oxide-carbon dots composite as an enhanced material for electrochemical determination of dopamine. Electrochim Acta 2014. [DOI: 10.1016/j.electacta.2014.02.150] [Citation(s) in RCA: 105] [Impact Index Per Article: 10.5] [Indexed: 11/24/2022]
13
Morita K, Kato A. Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits. Front Neural Circuits 2014; 8:36. [PMID: 24782717 PMCID: PMC3988379 DOI: 10.3389/fncir.2014.00036] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 01/17/2014] [Accepted: 03/24/2014] [Indexed: 11/13/2022]
Abstract
It has been suggested that the midbrain dopamine (DA) neurons, receiving inputs from the cortico-basal ganglia (CBG) circuits and the brainstem, compute reward prediction error (RPE), the difference between reward obtained or expected to be obtained and reward that had been expected to be obtained. These reward expectations are suggested to be stored in the CBG synapses and updated according to RPE through synaptic plasticity, which is induced by released DA. These together constitute the "DA=RPE" hypothesis, which describes the mutual interaction between DA and the CBG circuits and serves as the primary working hypothesis in studying reward learning and value-based decision-making. However, recent work has revealed a new type of DA signal that appears not to represent RPE. Specifically, it has been found in a reward-associated maze task that striatal DA concentration primarily shows a gradual increase toward the goal. We explored whether such ramping DA could be explained by extending the "DA=RPE" hypothesis by taking into account biological properties of the CBG circuits. In particular, we examined effects of possible time-dependent decay of DA-dependent plastic changes of synaptic strengths by incorporating decay of learned values into the RPE-based reinforcement learning model and simulating reward learning tasks. We then found that incorporation of such a decay dramatically changes the model's behavior, causing gradual ramping of RPE. Moreover, we further incorporated magnitude-dependence of the rate of decay, which could potentially be in accord with some past observations, and found that near-sigmoidal ramping of RPE, resembling the observed DA ramping, could then occur. Given that synaptic decay can be useful for flexibly reversing and updating the learned reward associations, especially in case the baseline DA is low and encoding of negative RPE by DA is limited, the observed DA ramping would be indicative of the operation of such flexible reward learning.
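The mechanism described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a linear chain of states with a single reward at the goal, a standard TD(0) update, and a decay of all learned values applied once per trial (a simplification chosen here for brevity). The function name `simulate_decay_td` and every parameter value are hypothetical.

```python
def simulate_decay_td(num_states=10, num_trials=500, alpha=0.5,
                      gamma=1.0, decay=0.9, reward=1.0):
    """Return the per-state TD errors (RPEs) observed on the final trial."""
    values = [0.0] * num_states   # learned value of each state on the track
    deltas = [0.0] * num_states   # RPE at each state on the current trial
    for _ in range(num_trials):
        for i in range(num_states):
            is_last = i == num_states - 1
            r = reward if is_last else 0.0            # reward only at the goal
            next_value = 0.0 if is_last else values[i + 1]
            # Standard TD error: r + gamma * V(next) - V(current)
            deltas[i] = r + gamma * next_value - values[i]
            values[i] += alpha * deltas[i]
        # Assumed forgetting step: every learned value shrinks after each trial.
        values = [decay * v for v in values]
    return deltas


with_decay = simulate_decay_td(decay=0.9)
without_decay = simulate_decay_td(decay=1.0)
```

Under these assumptions, `without_decay` settles near zero at every state (the values are fully learned, so nothing is surprising), whereas `with_decay` stays positive everywhere and grows toward the goal, because decayed values are always slightly too low and the shortfall compounds along the chain.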
Affiliation(s)
- Kenji Morita
  - Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Ayaka Kato
  - Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan
14
Abstract
Centri-voltammetry is a novel method that combines centrifugation with voltammetry. In the present work, centri-voltammetric detection of dopamine (DA) has been carried out for the first time.
Affiliation(s)
- Sinan Cemgil Sultan
  - Mugla Sitki Kocman University, Faculty of Science, Chemistry Department, Kötekli/Muğla, Turkey
- Esma Sezer
  - Istanbul Teknik University, Faculty of Science, Chemistry Department, Maslak/Istanbul, Turkey
- Yudum Tepeli
  - Mugla Sitki Kocman University, Faculty of Science, Chemistry Department, Kötekli/Muğla, Turkey
- Ulku Anik
  - Mugla Sitki Kocman University, Faculty of Science, Chemistry Department, Kötekli/Muğla, Turkey
15
Abstract
Temporal difference learning models of dopamine assert that phasic levels of dopamine encode a reward prediction error. However, this hypothesis has been challenged by recent observations of gradually ramping striatal dopamine levels as a goal is approached. This note describes conditions under which temporal difference learning models predict dopamine ramping. The key idea is representational: a quadratic transformation of proximity to the goal implies approximately linear ramping, as observed experimentally.
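The representational idea can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions, not the note's own model: proximity rises linearly from 0 to 1 over a fixed number of steps, the value estimate is exactly the square of proximity (weight 1), and no reward is delivered before the goal. The function name `td_errors` and the parameter values are hypothetical.

```python
def td_errors(num_steps=20, gamma=0.98, weight=1.0):
    """TD errors along a track when value is quadratic in proximity to the goal."""
    # Proximity to the goal rises linearly from 0 (start) to 1 (goal).
    proximity = [t / num_steps for t in range(num_steps + 1)]
    # Assumed value estimate: a quadratic transformation of proximity.
    value = [weight * x ** 2 for x in proximity]
    # delta_t = r_t + gamma * V(s_{t+1}) - V(s_t), with r_t = 0 before the goal.
    return [gamma * value[t + 1] - value[t] for t in range(num_steps)]


deltas = td_errors()
# Successive increments of the TD error; near-constant increments mean a near-linear ramp.
increments = [b - a for a, b in zip(deltas, deltas[1:])]
```

With proximity t/T, the error works out to ((gamma - 1) t^2 + 2 gamma t + gamma) / T^2, so for gamma close to 1 the quadratic term is small and the TD error rises almost linearly with t.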
Affiliation(s)
- Samuel J Gershman
  - Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, U.S.A.