101
Abstract
A recent study shows that midbrain GABA (inhibitory) neurons code for environmentally predicted rewards. These GABA neurons communicate with dopamine neurons, where the reward prediction is subtracted from delivered reward. Thus, the GABA prediction signal shapes the dopamine reward prediction error signal.
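The subtractive scheme described above can be sketched in a few lines. This is a hedged illustration of the arithmetic only (function names and values are invented, not from the study):

```python
# Sketch of the subtractive prediction-error scheme the abstract describes:
# a GABA-derived reward prediction is subtracted from the delivered reward.
# All names and numbers are illustrative, not taken from the study.

def dopamine_response(delivered_reward: float, gaba_prediction: float) -> float:
    """Reward prediction error: delivered reward minus predicted reward."""
    return delivered_reward - gaba_prediction

# A fully predicted reward yields no error; omission yields a negative error.
assert dopamine_response(1.0, 1.0) == 0.0
assert dopamine_response(0.0, 1.0) == -1.0
assert dopamine_response(1.0, 0.0) == 1.0
```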
102
|
Abstract
Besides their fundamental movement function evidenced by Parkinsonian deficits, the basal ganglia are involved in processing closely linked non-motor, cognitive and reward information. This review describes the reward functions of three brain structures that are major components of the basal ganglia or are closely associated with the basal ganglia, namely midbrain dopamine neurons, pedunculopontine nucleus, and striatum (caudate nucleus, putamen, nucleus accumbens). Rewards are involved in learning (positive reinforcement), approach behavior, economic choices and positive emotions. The response of dopamine neurons to rewards consists of an early detection component and a subsequent reward component that reflects a prediction error in economic utility, but is unrelated to movement. Dopamine activations to non-rewarded or aversive stimuli reflect physical impact, but not punishment. Neurons in pedunculopontine nucleus project their axons to dopamine neurons and process sensory stimuli, movements and rewards and reward-predicting stimuli without coding outright reward prediction errors. Neurons in striatum, besides their pronounced movement relationships, process rewards irrespective of sensory and motor aspects, integrate reward information into movement activity, code the reward value of individual actions, change their reward-related activity during learning, and code own reward in social situations depending on whose action produces the reward. These data demonstrate a variety of well-characterized reward processes in specific basal ganglia nuclei consistent with an important function in non-motor aspects of motivated behavior.
Affiliation(s)
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, CB2 3DY, UK.
103
Lak A, Stauffer WR, Schultz W. Dopamine neurons learn relative chosen value from probabilistic rewards. eLife 2016; 5:e18044. PMID: 27787196; PMCID: PMC5116238; DOI: 10.7554/elife.18044.
Abstract
Economic theories posit reward probability as one of the factors defining reward value. Individuals learn the value of cues that predict probabilistic rewards from experienced reward frequencies. Building on the notion that responses of dopamine neurons increase with reward probability and expected value, we asked how dopamine neurons in monkeys acquire this value signal that may represent an economic decision variable. We found in a Pavlovian learning task that reward probability-dependent value signals arose from experienced reward frequencies. We then assessed neuronal response acquisition during choices among probabilistic rewards. Here, dopamine responses became sensitive to the value of both chosen and unchosen options. Both experiments also revealed novelty responses of dopamine neurons that decreased as learning advanced. These results show that dopamine neurons acquire predictive value signals from the frequency of experienced rewards. This flexible and fast signal reflects a specific decision variable and could update neuronal decision mechanisms.
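How a cue's value can arise from experienced reward frequencies is commonly illustrated with a delta-rule learner whose estimate converges on the cue's reward probability. This is a generic sketch, not the study's model; the learning rate and trial count are assumed:

```python
# Illustrative delta-rule learner (not the study's model): a cue's value,
# updated by prediction errors, asymptotically tracks reward probability.
import random

def learn_cue_value(p_reward: float, trials: int = 5000,
                    alpha: float = 0.02, seed: int = 0) -> float:
    rng = random.Random(seed)
    v = 0.0
    for _ in range(trials):
        r = 1.0 if rng.random() < p_reward else 0.0
        v += alpha * (r - v)   # prediction error drives the update
    return v

# The learned value ends up near the cue's reward probability.
print(round(learn_cue_value(0.75), 2))
```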
Affiliation(s)
- Armin Lak
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
- William R Stauffer
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
104
Báez-Mendoza R, van Coeverden CR, Schultz W. A neuronal reward inequity signal in primate striatum. J Neurophysiol 2016; 115:68-79. PMID: 26378202; PMCID: PMC4760476; DOI: 10.1152/jn.00321.2015.
Abstract
Primates are social animals, and their survival depends on social interactions with others. Especially important for social interactions and welfare is the observation of rewards obtained by other individuals and the comparison with own reward. The fundamental social decision variable for the comparison process is reward inequity, defined by an asymmetric reward distribution among individuals. An important brain structure for coding reward inequity may be the striatum, a component of the basal ganglia involved in goal-directed behavior. Two rhesus monkeys were seated opposite each other and contacted a touch-sensitive table placed between them to obtain specific magnitudes of reward that were equally or unequally distributed among them. Response times in one of the animals demonstrated differential behavioral sensitivity to reward inequity. A group of neurons in the striatum showed distinct signals reflecting disadvantageous and advantageous reward inequity. These neuronal signals occurred irrespective of, or in conjunction with, own reward coding. These data demonstrate that striatal neurons of macaque monkeys sense the differences between other's and own reward. The neuronal activities are likely to contribute crucial reward information to neuronal mechanisms involved in social interactions.
Affiliation(s)
- Raymundo Báez-Mendoza
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
- Charlotte R van Coeverden
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
105
Rigoli F, Rutledge RB, Dayan P, Dolan RJ. The influence of contextual reward statistics on risk preference. Neuroimage 2015; 128:74-84. PMID: 26707890; PMCID: PMC4767216; DOI: 10.1016/j.neuroimage.2015.12.016.
Abstract
Decision theories mandate that organisms should adjust their behaviour in the light of the contextual reward statistics. We tested this notion using a gambling choice task involving distinct contexts with different reward distributions. The best fitting model of subjects' behaviour indicated that the subjective values of options depended on several factors, including a baseline gambling propensity, a gambling preference dependent on reward amount, and a contextual reward adaptation factor. Combining this behavioural model with simultaneous functional magnetic resonance imaging, we probed neural responses in three key regions linked to reward and value, namely ventral tegmental area/substantia nigra (VTA/SN), ventromedial prefrontal cortex (vmPFC) and ventral striatum (VST). We show that activity in the VTA/SN reflected contextual reward statistics to the extent that context affected behaviour; activity in the vmPFC represented a value difference between chosen and unchosen options, while VST responses reflected a non-linear mapping between the actual objective rewards and their subjective value. The findings highlight a multifaceted basis for choice behaviour, with distinct mappings between components of this behaviour and value-sensitive brain regions.
Affiliation(s)
- Francesco Rigoli
- The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London WC1N 3BG, UK.
- Robb B Rutledge
- The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London WC1N 3BG, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK
- Peter Dayan
- Gatsby Computational Neuroscience Unit, UCL, 17 Queen Square, London WC1N 3AR, UK
- Raymond J Dolan
- The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London WC1N 3BG, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK
106
Happel MFK. Dopaminergic impact on local and global cortical circuit processing during learning. Behav Brain Res 2015; 299:32-41. PMID: 26608540; DOI: 10.1016/j.bbr.2015.11.016.
Abstract
We learn to detect, predict and behaviorally respond to important changes in our environment on short and longer time scales. To do so, the brains of humans and higher animals build upon perceptual and semantic salience stored in memory, generated mainly by associative reinforcement learning. Functionally, the brain needs to extract and amplify a small number of behaviorally relevant features of sensory input in order to guide behavior in a particular situation. In this review, I argue that dopamine action, particularly in sensory cortex, orchestrates layer-dependent local and long-range cortical circuits that integrate sensory-associated bottom-up and semantically relevant top-down information, respectively. Available evidence reveals that dopamine thereby controls both the selection of perceptually or semantically salient signals and feedback processing from higher-order areas of the brain. Sensory cortical dopamine thus governs the integration of selected sensory information within a behavioral context. This review proposes that dopamine fulfills this function by temporally distinct actions on particular layer-dependent local and global cortical circuits underlying the integration of sensory and non-sensory cognitive and behavioral variables.
Affiliation(s)
- Max F K Happel
- Leibniz Institute for Neurobiology, D-39118 Magdeburg, Germany; Institute of Biology, Otto-von-Guericke-University, D-39120 Magdeburg, Germany.
107
Schultz W, Carelli RM, Wightman RM. Phasic dopamine signals: from subjective reward value to formal economic utility. Curr Opin Behav Sci 2015; 5:147-154. PMID: 26719853; PMCID: PMC4692271; DOI: 10.1016/j.cobeha.2015.09.006.
Abstract
Although rewards are physical stimuli and objects, their value for survival and reproduction is subjective. The phasic dopamine reward prediction error response, measured neurophysiologically and voltammetrically, signals subjective reward value. The signal incorporates crucial reward aspects such as amount, probability, type, risk, delay and effort. Differences in dopamine release dynamics with temporal delay and effort in rodents may derive from methodological issues and require further study. Recent designs using concepts and behavioral tools from experimental economics allow formal characterization of the subjective value signal as economic utility and thus the establishment of a neuronal value function. With these properties, the dopamine response constitutes a utility prediction error signal.
Affiliation(s)
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Regina M Carelli
- Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3290, United States
- R Mark Wightman
- Department of Chemistry and Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3290, United States
108
Stauffer WR, Lak A, Kobayashi S, Schultz W. Components and characteristics of the dopamine reward utility signal. J Comp Neurol 2015; 524:1699-711. PMID: 26272220; DOI: 10.1002/cne.23880.
Abstract
Rewards are defined by their behavioral functions in learning (positive reinforcement), approach behavior, economic choices, and emotions. Dopamine neurons respond to rewards with two components, similar to higher order sensory and cognitive neurons. The initial, rapid, unselective dopamine detection component reports all salient environmental events irrespective of their reward association. It is highly sensitive to factors related to reward and thus detects a maximal number of potential rewards. It also senses aversive stimuli but reports their physical impact rather than their aversiveness. The second response component processes reward value accurately and starts early enough to prevent confusion with unrewarded stimuli and objects. It codes reward value as a numeric, quantitative utility prediction error, consistent with formal concepts of economic decision theory. Thus, the dopamine reward signal is fast, highly sensitive and appropriate for driving and updating economic decisions.
Affiliation(s)
- William R Stauffer
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Armin Lak
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Shunsuke Kobayashi
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
109
Abstract
Rewards are crucial objects that induce learning, approach behavior, choices, and emotions. Whereas emotions are difficult to investigate in animals, the learning function is mediated by neuronal reward prediction error signals which implement basic constructs of reinforcement learning theory. These signals are found in dopamine neurons, which emit a global reward signal to striatum and frontal cortex, and in specific neurons in striatum, amygdala, and frontal cortex projecting to select neuronal populations. The approach and choice functions involve subjective value, which is objectively assessed by behavioral choices eliciting internal, subjective reward preferences. Utility is the formal mathematical characterization of subjective value and a prime decision variable in economic choice theory. It is coded as utility prediction error by phasic dopamine responses. Utility can incorporate various influences, including risk, delay, effort, and social interaction. Appropriate for formal decision mechanisms, rewards are coded as object value, action value, difference value, and chosen value by specific neurons. Although all reward, reinforcement, and decision variables are theoretical constructs, their neuronal signals constitute measurable physical implementations and as such confirm the validity of these concepts. The neuronal reward signals provide guidance for behavior while constraining the free will to act.
Affiliation(s)
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
110
Balasubramani PP, Chakravarthy VS, Ravindran B, Moustafa AA. A network model of basal ganglia for understanding the roles of dopamine and serotonin in reward-punishment-risk based decision making. Front Comput Neurosci 2015; 9:76. PMID: 26136679; PMCID: PMC4469836; DOI: 10.3389/fncom.2015.00076.
Abstract
There is significant evidence that in addition to reward-punishment based decision making, the Basal Ganglia (BG) contributes to risk-based decision making (Balasubramani et al., 2014). Despite this evidence, little is known about the computational principles and neural correlates of risk computation in this subcortical system. We have previously proposed a reinforcement learning (RL)-based model of the BG that simulates the interactions between dopamine (DA) and serotonin (5HT) in a diverse set of experimental studies including reward, punishment and risk based decision making (Balasubramani et al., 2014). Starting with the classical idea that the activity of mesencephalic DA represents reward prediction error, the model posits that serotoninergic activity in the striatum controls risk-prediction error. Our prior model of the BG was an abstract model that did not incorporate anatomical and cellular-level data. In this work, we expand the earlier model into a detailed network model of the BG and demonstrate the joint contributions of DA-5HT in risk and reward-punishment sensitivity. At the core of the proposed network model is the following insight regarding cellular correlates of value and risk computation. Just as DA D1 receptor (D1R) expressing medium spiny neurons (MSNs) of the striatum were thought to be the neural substrates for value computation, we propose that DA D1R and D2R co-expressing MSNs are capable of computing risk. Though the existence of MSNs that co-express D1R and D2R is reported by various experimental studies, prior computational models did not include them. Ours is the first model that accounts for the computational possibilities of these co-expressing D1R-D2R MSNs, and describes how DA and 5HT mediate activity in these classes of neurons (D1R-, D2R-, D1R-D2R- MSNs).
Starting from the assumption that 5HT modulates all MSNs, our study predicts significant modulatory effects of 5HT on D2R and co-expressing D1R-D2R MSNs, which in turn explain the multifarious functions of 5HT in the BG. The experiments simulated in the present study relate 5HT to risk sensitivity and reward-punishment learning. Furthermore, our model is shown to capture reward-punishment and risk based decision making impairment in Parkinson's Disease (PD). The model predicts that optimizing 5HT levels along with DA medications might be essential for improving the patients' reward-punishment learning deficits.
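The value/risk decomposition at the heart of this model family can be sketched with a delta-rule value estimate alongside a risk estimate driven by squared prediction errors. This is a schematic under assumed parameters and update rules, not the paper's network implementation:

```python
# Schematic value and risk learning (assumed parameters, not the paper's
# network model): a DA-like value prediction error updates value, while a
# risk prediction error, based on the squared value error, updates an
# estimate of outcome variance (the 5HT-modulated quantity in this model family).
import random

def learn_value_and_risk(rewards, alpha_v=0.1, alpha_h=0.1):
    v, h = 0.0, 0.0
    for r in rewards:
        delta = r - v          # value prediction error
        xi = delta**2 - h      # risk prediction error
        v += alpha_v * delta
        h += alpha_h * xi      # h estimates outcome variance
    return v, h

rng = random.Random(1)
risky = [rng.choice([0.0, 1.0]) for _ in range(4000)]   # p = 0.5, variable outcome
safe = [0.5] * 4000                                     # same mean, no variability
v_r, h_r = learn_value_and_risk(risky)
v_s, h_s = learn_value_and_risk(safe)
# Both options end up with similar learned value, but only the risky
# option accumulates a substantial risk estimate.
print(round(h_r, 2), round(h_s, 2))
```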
Affiliation(s)
- Balaraman Ravindran
- Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India
- Ahmed A Moustafa
- School of Social Sciences and Technology, Marcs Institute for Brain and Behavior, University of Western Sydney, Penrith, NSW, Australia; Department of Veterans Affairs, New Jersey Health Care System, East Orange, NJ, USA
111
Balasubramani PP, Chakravarthy VS, Ali M, Ravindran B, Moustafa AA. Identifying the Basal Ganglia network model markers for medication-induced impulsivity in Parkinson's disease patients. PLoS One 2015; 10:e0127542. PMID: 26042675; PMCID: PMC4456385; DOI: 10.1371/journal.pone.0127542.
Abstract
Impulsivity, i.e. irresistibility in the execution of actions, may be prominent in Parkinson's disease (PD) patients who are treated with dopamine precursors or dopamine receptor agonists. In this study, we combine clinical investigations with computational modeling to explore whether impulsivity in PD patients on medication may arise as a result of abnormalities in risk, reward and punishment learning. In order to empirically assess learning outcomes involving risk, reward and punishment, four subject groups were examined: healthy controls, ON medication PD patients with impulse control disorder (PD-ON ICD) or without ICD (PD-ON non-ICD), and OFF medication PD patients (PD-OFF). A neural network model of the Basal Ganglia (BG) that has the capacity to predict the dysfunction of both the dopaminergic (DA) and the serotonergic (5HT) neuromodulator systems was developed and used to facilitate the interpretation of experimental results. In the model, the BG action selection dynamics were mimicked using a utility function based decision making framework, with DA controlling reward prediction and 5HT controlling punishment and risk predictions. The striatal model included three pools of Medium Spiny Neurons (MSNs), with D1 receptor (R) alone, D2R alone and co-expressing D1R-D2R. Empirical studies showed that reward optimality was increased in PD-ON ICD patients while punishment optimality was increased in PD-OFF patients. Empirical studies also revealed that PD-ON ICD subjects had lower reaction times (RT) compared to that of the PD-ON non-ICD patients. Computational modeling suggested that PD-OFF patients have higher punishment sensitivity, while healthy controls showed comparatively higher risk sensitivity. A significant decrease in sensitivity to punishment and risk was crucial for explaining behavioral changes observed in PD-ON ICD patients. 
Our results highlight the power of computational modeling for identifying neuronal circuitry implicated in learning, and its impairment in PD. They show that computational modeling is a valuable tool for understanding and interpreting clinical data, and that it has the potential to predict the onset of behavioral changes during disease progression.
Affiliation(s)
- Manal Ali
- School of Medicine, Ain Shams University, Cairo, Egypt
- Balaraman Ravindran
- Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India
- Ahmed A. Moustafa
- Marcs Institute for Brain and Behaviour & School of Social Sciences and Psychology, University of Western Sydney, Penrith, Australia
112
Stauffer WR, Lak A, Bossaerts P, Schultz W. Economic choices reveal probability distortion in macaque monkeys. J Neurosci 2015; 35:3146-54. PMID: 25698750; PMCID: PMC4331632; DOI: 10.1523/jneurosci.3653-14.2015.
Abstract
Economic choices are largely determined by two principal elements, reward value (utility) and probability. Although nonlinear utility functions have been acknowledged for centuries, nonlinear probability weighting (probability distortion) was only recently recognized as a ubiquitous aspect of real-world choice behavior. Even when outcome probabilities are known and acknowledged, human decision makers often overweight low probability outcomes and underweight high probability outcomes. Whereas recent studies measured utility functions and their corresponding neural correlates in monkeys, it is not known whether monkeys distort probability in a manner similar to humans. Therefore, we investigated economic choices in macaque monkeys for evidence of probability distortion. We trained two monkeys to predict reward from probabilistic gambles with constant outcome values (0.5 ml or nothing). The probability of winning was conveyed using explicit visual cues (sector stimuli). Choices between the gambles revealed that the monkeys used the explicit probability information to make meaningful decisions. Using these cues, we measured probability distortion from choices between the gambles and safe rewards. Parametric modeling of the choices revealed classic probability weighting functions with inverted-S shape. Therefore, the animals overweighted low probability rewards and underweighted high probability rewards. Empirical investigation of the behavior verified that the choices were best explained by a combination of nonlinear value and nonlinear probability distortion. Together, these results suggest that probability distortion may reflect evolutionarily preserved neuronal processing.
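The inverted-S weighting functions the parametric modeling recovered are conventionally written in a one-parameter form such as the Tversky-Kahneman function; the gamma value below is illustrative, not a fit from the study:

```python
# One-parameter inverted-S probability weighting (Tversky-Kahneman form).
# With gamma < 1, low probabilities are overweighted and high probabilities
# underweighted; gamma here is illustrative, not fitted to the monkeys' choices.

def weight(p: float, gamma: float = 0.6) -> float:
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

assert weight(0.05) > 0.05   # low probability overweighted
assert weight(0.95) < 0.95   # high probability underweighted
assert weight(0.5, gamma=1.0) == 0.5   # gamma = 1 leaves probability undistorted
```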
Affiliation(s)
- William R Stauffer
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Armin Lak
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Peter Bossaerts
- Utah Laboratory for Experimental Economics and Finance, University of Utah, Salt Lake City, Utah 84112, and Faculty of Business and Economics, University of Melbourne, Parkville, Victoria 3010, Australia
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
113
Marshall AT, Kirkpatrick K. Relative gains, losses, and reference points in probabilistic choice in rats. PLoS One 2015; 10:e0117697. PMID: 25658448; PMCID: PMC4319772; DOI: 10.1371/journal.pone.0117697.
Abstract
Theoretical reference points have been proposed to differentiate probabilistic gains from probabilistic losses in humans, but such a phenomenon in non-human animals has yet to be thoroughly elucidated. Three experiments evaluated the effect of reward magnitude on probabilistic choice in rats, seeking to determine reference point use by examining the effect of previous outcome magnitude(s) on subsequent choice behavior. Rats were trained to choose between an outcome that always delivered reward (low-uncertainty choice) and one that probabilistically delivered reward (high-uncertainty). The probability of high-uncertainty outcome receipt and the magnitudes of low-uncertainty and high-uncertainty outcomes were manipulated within and between experiments. Both the low- and high-uncertainty outcomes involved variable reward magnitudes, so that either a smaller or larger magnitude was probabilistically delivered, as well as reward omission following high-uncertainty choices. In Experiments 1 and 2, the between groups factor was the magnitude of the high-uncertainty-smaller (H-S) and high-uncertainty-larger (H-L) outcome, respectively. The H-S magnitude manipulation differentiated the groups, while the H-L magnitude manipulation did not. Experiment 3 showed that manipulating the probability of differential losses as well as the expected value of the low-uncertainty choice produced systematic effects on choice behavior. The results suggest that the reference point for probabilistic gains and losses was the expected value of the low-uncertainty choice. Current theories of probabilistic choice behavior have difficulty accounting for the present results, so an integrated theoretical framework is proposed. Overall, the present results have implications for understanding individual differences and corresponding underlying mechanisms of probabilistic choice behavior.
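The reference-point idea the abstract arrives at can be sketched in minimal form: outcomes of the high-uncertainty option are coded as gains or losses relative to the expected value of the low-uncertainty choice. The reward magnitudes below are assumed for illustration, not taken from the experiments:

```python
# Minimal sketch of reference-point coding (assumed numbers): outcomes are
# labeled as gains or losses relative to the expected value of the
# low-uncertainty option, the reference point the results suggest.

def classify_outcome(outcome: float, low_uncertainty_ev: float) -> str:
    """Label an outcome relative to the low-uncertainty reference point."""
    if outcome > low_uncertainty_ev:
        return "gain"
    if outcome < low_uncertainty_ev:
        return "loss"
    return "neutral"

# Hypothetical low-uncertainty choice delivering 2 or 4 units equiprobably: EV = 3.
reference = (2 + 4) / 2
assert classify_outcome(6, reference) == "gain"   # large high-uncertainty win
assert classify_outcome(0, reference) == "loss"   # reward omission
```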
Affiliation(s)
- Andrew T. Marshall
- Department of Psychological Sciences, Kansas State University, Manhattan, Kansas, United States of America
- Kimberly Kirkpatrick
- Department of Psychological Sciences, Kansas State University, Manhattan, Kansas, United States of America
114
Dopamine-associated cached values are not sufficient as the basis for action selection. Proc Natl Acad Sci U S A 2014; 111:18357-62. PMID: 25489094; DOI: 10.1073/pnas.1419770111.
Abstract
Phasic dopamine transmission is posited to act as a critical teaching signal that updates the stored (or "cached") values assigned to reward-predictive stimuli and actions. It is widely hypothesized that these cached values determine the selection among multiple courses of action, a premise that has provided a foundation for contemporary theories of decision making. In the current work we used fast-scan cyclic voltammetry to probe dopamine-associated cached values from cue-evoked dopamine release in the nucleus accumbens of rats performing cost-benefit decision-making paradigms to evaluate critically the relationship between dopamine-associated cached values and preferences. By manipulating the amount of effort required to obtain rewards of different sizes, we were able to bias rats toward preferring an option yielding a high-value reward in some sessions and toward instead preferring an option yielding a low-value reward in others. Therefore, this approach permitted the investigation of dopamine-associated cached values in a context in which reward magnitude and subjective preference were dissociated. We observed greater cue-evoked mesolimbic dopamine release to options yielding the high-value reward even when rats preferred the option yielding the low-value reward. This result identifies a clear mismatch between the ordinal utility of the available options and the rank ordering of their cached values, thereby providing robust evidence that dopamine-associated cached values cannot be the sole determinant of choices in simple economic decision making.
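The dissociation the study exploits can be sketched schematically: a cached value tracking reward magnitude alone can rank options opposite to an effort-discounted net utility. All quantities below are assumed for illustration and are not the study's measurements:

```python
# Schematic dissociation (assumed numbers): cue-evoked dopamine tracks cached
# reward magnitude, while preference tracks a net utility that also
# discounts the effort required to obtain the reward.

def cached_value(reward: float) -> float:
    return reward                      # tracks reward magnitude only

def utility(reward: float, effort_cost: float) -> float:
    return reward - effort_cost        # effort-discounted net value

# A large reward behind high effort vs a small reward behind low effort.
big, small = 3.0, 1.0
assert cached_value(big) > cached_value(small)   # dopamine rank order follows magnitude
assert utility(big, 2.5) < utility(small, 0.2)   # but the preference ordering reverses
```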