1.
Chakroun K, Wiehler A, Wagner B, Mathar D, Ganzer F, van Eimeren T, Sommer T, Peters J. Dopamine regulates decision thresholds in human reinforcement learning in males. Nat Commun 2023; 14:5369. PMID: 37666865; PMCID: PMC10477234; DOI: 10.1038/s41467-023-41130-y.
Abstract
Dopamine fundamentally contributes to reinforcement learning, but recent accounts also suggest a contribution to specific action selection mechanisms and the regulation of response vigour. Here, we examine dopaminergic mechanisms underlying human reinforcement learning and action selection via a combined pharmacological neuroimaging approach in male human volunteers (n = 31, within-subjects; placebo, 150 mg of the dopamine precursor L-dopa, 2 mg of the D2 receptor antagonist haloperidol). We found little credible evidence for previously reported beneficial effects of L-dopa vs. haloperidol on learning from gains, or for altered neural prediction error signals, which may be partly due to differences in experimental design and/or drug dosages. Reinforcement learning drift diffusion models account for learning-related changes in accuracy and response times, and reveal consistent decision threshold reductions under both drugs, in line with the idea that lower dosages of D2 receptor antagonists increase striatal dopamine release via an autoreceptor-mediated feedback mechanism. These results support the idea that dopamine regulates decision thresholds during reinforcement learning, and may help to bridge action selection and response vigour accounts of dopamine.
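The reinforcement learning drift diffusion model (RLDDM) referenced above combines a prediction-error value update with a diffusion decision rule, so that a lower decision threshold yields faster, less cautious choices at the same learned values. A minimal sketch of that idea in Python, with hypothetical parameter values (the learning rate, drift scaling, and thresholds below are illustrative, not the paper's fitted estimates):

```python
import math

def q_update(q, reward, alpha):
    """Rescorla-Wagner update: move the value estimate toward the outcome."""
    return q + alpha * (reward - q)

def mean_decision_time(drift, threshold):
    """Expected first-passage time of a unit-variance diffusion starting at 0
    between absorbing bounds at +/-threshold (a standard DDM result)."""
    if drift == 0:
        return threshold ** 2
    return (threshold / drift) * math.tanh(drift * threshold)

# Learning phase: Q-values of two options evolve with feedback.
q_a, q_b, alpha = 0.0, 0.0, 0.3
for r_a, r_b in [(1, 0), (1, 0), (0, 0), (1, 0)]:
    q_a = q_update(q_a, r_a, alpha)
    q_b = q_update(q_b, r_b, alpha)

# Decision phase: the value difference sets the drift rate; the drug effect
# is modeled purely as a reduced decision threshold.
drift = 2.0 * (q_a - q_b)
rt_placebo = mean_decision_time(drift, threshold=1.5)
rt_drug = mean_decision_time(drift, threshold=1.0)
assert rt_drug < rt_placebo  # lower threshold -> faster responding
```

Lowering only the threshold leaves the learned values untouched, which is how this model family separates a change in decision policy from a change in learning.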
Affiliation(s)
- Karima Chakroun
  - Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Antonius Wiehler
  - Motivation, Brain and Behavior Lab, Paris Brain Institute (ICM), Pitié-Salpêtrière Hospital, Paris, France
- Ben Wagner
  - Chair of Cognitive Computational Neuroscience, Technical University Dresden, Dresden, Germany
- David Mathar
  - Department of Psychology, Biological Psychology, University of Cologne, Cologne, Germany
- Florian Ganzer
  - Integrated Psychiatry Winterthur, Winterthur, Switzerland
- Thilo van Eimeren
  - Multimodal Neuroimaging Group, Department of Nuclear Medicine, University Medical Center Cologne, Cologne, Germany
- Tobias Sommer
  - Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Jan Peters
  - Institute for Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
  - Department of Psychology, Biological Psychology, University of Cologne, Cologne, Germany
2.
Jansen M, Lockwood PL, Cutler J, de Bruijn ERA. l-DOPA and oxytocin influence the neurocomputational mechanisms of self-benefitting and prosocial reinforcement learning. Neuroimage 2023; 270:119983. PMID: 36848972; DOI: 10.1016/j.neuroimage.2023.119983.
Abstract
Humans learn through reinforcement, particularly when outcomes are unexpected. Recent research suggests that similar mechanisms drive how we learn to benefit other people, that is, how we learn to be prosocial. Yet the neurochemical mechanisms underlying such prosocial computations remain poorly understood. Here, we investigated whether pharmacological manipulation of oxytocin and dopamine influences the neurocomputational mechanisms underlying self-benefitting and prosocial reinforcement learning. Using a double-blind placebo-controlled cross-over design, we administered intranasal oxytocin (24 IU), the dopamine precursor l-DOPA (100 mg + 25 mg carbidopa), or placebo over three sessions. Participants performed a probabilistic reinforcement learning task with potential rewards for themselves, another participant, or no one, during functional magnetic resonance imaging. Computational models of reinforcement learning were used to calculate prediction errors (PEs) and learning rates. Participants' behavior was best explained by a model with different learning rates for each recipient, but these were unaffected by either drug. On the neural level, however, both drugs blunted PE signaling in the ventral striatum and led to negative signaling of PEs in the anterior mid-cingulate cortex, dorsolateral prefrontal cortex, inferior parietal gyrus, and precentral gyrus, compared to placebo, regardless of recipient. Oxytocin (versus placebo) administration was additionally associated with opposing tracking of self-benefitting versus prosocial PEs in the dorsal anterior cingulate cortex, insula, and superior temporal gyrus. These findings suggest that both l-DOPA and oxytocin induce a context-independent shift from positive towards negative tracking of PEs during learning. Moreover, oxytocin may have opposing effects on PE signaling when learning to benefit oneself versus another.
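The winning model here keeps a single prediction-error update but fits a separate learning rate for each reward recipient. A small sketch of that structure, with hypothetical learning rates and trials (not the study's estimates):

```python
def rl_with_recipient_alphas(trials, alphas):
    """Prediction-error learning with one learning rate per recipient
    (self / other / no one)."""
    q = {recipient: 0.5 for recipient in alphas}  # one value per condition
    pes = []
    for recipient, reward in trials:
        pe = reward - q[recipient]                # prediction error
        q[recipient] += alphas[recipient] * pe
        pes.append(pe)
    return q, pes

alphas = {"self": 0.40, "other": 0.25, "no_one": 0.10}  # hypothetical values
trials = [("self", 1), ("other", 1), ("no_one", 1), ("self", 0)]
q, pes = rl_with_recipient_alphas(trials, alphas)
```

In the study, the drug manipulations left the fitted learning rates unchanged and instead altered how trial-wise PE values were tracked in the brain.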
Affiliation(s)
- Myrthe Jansen
  - Department of Clinical Psychology, Leiden University, the Netherlands
  - Leiden Institute for Brain and Cognition (LIBC), Leiden, the Netherlands
- Patricia L Lockwood
  - Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK
  - Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, UK
  - Centre for Developmental Science, School of Psychology, University of Birmingham, UK
- Jo Cutler
  - Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK
  - Institute for Mental Health, School of Psychology, University of Birmingham, Birmingham, UK
  - Centre for Developmental Science, School of Psychology, University of Birmingham, UK
- Ellen R A de Bruijn
  - Department of Clinical Psychology, Leiden University, the Netherlands
  - Leiden Institute for Brain and Cognition (LIBC), Leiden, the Netherlands
3.
Soutschek A, Tobler PN. A process model account of the role of dopamine in intertemporal choice. eLife 2023; 12:e83734. PMID: 36884013; PMCID: PMC9995109; DOI: 10.7554/elife.83734.
Abstract
Theoretical accounts disagree on the role of dopamine in intertemporal choice and assume that dopamine either promotes delay of gratification by increasing the preference for larger rewards or that dopamine reduces patience by enhancing the sensitivity to waiting costs. Here, we reconcile these conflicting accounts by providing empirical support for a novel process model according to which dopamine contributes to two dissociable components of the decision process, evidence accumulation and starting bias. We re-analyzed a previously published data set where intertemporal decisions were made either under the D2 antagonist amisulpride or under placebo by fitting a hierarchical drift diffusion model that distinguishes between dopaminergic effects on the speed of evidence accumulation and the starting point of the accumulation process. Blocking dopaminergic neurotransmission not only strengthened the sensitivity to whether a reward is perceived as worth the delay costs during evidence accumulation (drift rate) but also attenuated the impact of waiting costs on the starting point of the evidence accumulation process (bias). In contrast, re-analyzing data from a D1 agonist study provided no evidence for a causal involvement of D1R activation in intertemporal choices. Taken together, our findings support a novel, process-based account of the role of dopamine for cost-benefit decision making, highlight the potential benefits of process-informed analyses, and advance our understanding of dopaminergic contributions to decision making.
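The drift diffusion decomposition used above distinguishes how fast evidence accumulates (drift rate) from where accumulation starts (bias). A Monte Carlo sketch with made-up parameters illustrates how a starting-point shift alone changes choice proportions even at a fixed drift rate:

```python
import math
import random

def ddm_choice(drift, bias, threshold=1.0, dt=0.001, rng=random):
    """One drift-diffusion trial: evidence starts at bias*threshold
    (bias in (-1, 1)) and drifts with unit-variance noise until a bound is hit.
    Returns 1 for the upper bound (e.g. larger-later), 0 for the lower."""
    x = bias * threshold
    while abs(x) < threshold:
        x += drift * dt + math.sqrt(dt) * rng.gauss(0, 1)
    return 1 if x > 0 else 0

rng = random.Random(0)
# Same drift toward the upper bound; a negative starting bias (toward the
# lower, e.g. smaller-sooner, bound) reduces upper-bound choices.
p_unbiased = sum(ddm_choice(0.5, 0.0, rng=rng) for _ in range(500)) / 500
p_biased = sum(ddm_choice(0.5, -0.4, rng=rng) for _ in range(500)) / 500
assert p_biased < p_unbiased
```

In the re-analysis described above, the D2 antagonist affected both components: it strengthened cost sensitivity during accumulation (drift rate) while attenuating the effect of waiting costs on the starting point (bias).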
Affiliation(s)
- Philippe N Tobler
  - Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
  - Neuroscience Center Zurich, University of Zurich and Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
4.
Desch S, Schweinhardt P, Seymour B, Flor H, Becker S. Evidence for dopaminergic involvement in endogenous modulation of pain relief. eLife 2023; 12:e81436. PMID: 36722857; PMCID: PMC9988263; DOI: 10.7554/elife.81436.
Abstract
Relief of ongoing pain is a potent motivator of behavior, directing actions to escape from or reduce potentially harmful stimuli. Whereas endogenous modulation of pain events is well characterized, relatively little is known about the modulation of pain relief and its corresponding neurochemical basis. Here, we studied pain modulation during a probabilistic relief-seeking task (a 'wheel of fortune' gambling task), in which people actively or passively received reduction of a tonic thermal pain stimulus. We found that relief perception was enhanced by active decisions and unpredictability, and was greater in individuals high in trait novelty seeking, consistent with a model in which relief is tuned by its informational content. We then probed the roles of dopaminergic and opioidergic signaling, both of which are implicated in relief processing, by embedding the task in a double-blinded cross-over design with administration of the dopamine precursor levodopa and the opioid receptor antagonist naltrexone. We found that levodopa enhanced each of these information-specific aspects of relief modulation, whereas the opioidergic manipulation had no significant effects. These results show that dopaminergic signaling has a key role in modulating the perception of pain relief to optimize motivation and behavior.
Affiliation(s)
- Simon Desch
  - Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
  - Clinical Psychology, Department of Experimental Psychology, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Petra Schweinhardt
  - Integrative Spinal Research, Department of Chiropractic Medicine, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
- Ben Seymour
  - Wellcome Centre for Integrative Neuroimaging, John Radcliffe Hospital, Oxford, United Kingdom
- Herta Flor
  - Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Susanne Becker
  - Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
  - Clinical Psychology, Department of Experimental Psychology, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  - Integrative Spinal Research, Department of Chiropractic Medicine, Balgrist University Hospital, University of Zurich, Zurich, Switzerland
5.
Soutschek A, Jetter A, Tobler PN. Towards a Unifying Account of Dopamine's Role in Cost-Benefit Decision Making. Biol Psychiatry Glob Open Sci 2022; 3:179-186. PMID: 37124350; PMCID: PMC10140448; DOI: 10.1016/j.bpsgos.2022.02.010.
Abstract
Dopamine is thought to play a crucial role in cost-benefit decision making, but so far there is no consensus on the precise role of dopamine in decision making. Here, we review the literature on dopaminergic manipulations of cost-benefit decision making in humans and evaluate how well different theoretical accounts explain the existing body of evidence. Reduced D2 stimulation tends to increase the willingness to bear delay and risk costs (i.e., wait for later rewards, take riskier options), while increased D1 and D2 receptor stimulation increases willingness to bear effort costs. We argue that the empirical findings can best be explained by combining the strengths of two theoretical accounts: in cost-benefit decision making, dopamine may play a dual role both in promoting the pursuit of psychologically close options (e.g., sooner and safer rewards) and in computing which costs are acceptable for a reward at stake. Moreover, we identify several limiting factors in the study designs of previous investigations that prevented a fuller understanding of dopamine's role in value-based choice. Together, the proposed theoretical framework and the methodological suggestions for future studies may bring us closer to a unifying account of dopamine in healthy and impaired cost-benefit decision making.
6.
Wiehler A, Chakroun K, Peters J. Attenuated Directed Exploration during Reinforcement Learning in Gambling Disorder. J Neurosci 2021; 41:2512-2522. PMID: 33531415; PMCID: PMC7984586; DOI: 10.1523/jneurosci.1607-20.2021.
Abstract
Gambling disorder (GD) is a behavioral addiction associated with impairments in value-based decision-making and behavioral flexibility and might be linked to changes in the dopamine system. Maximizing long-term rewards requires a flexible trade-off between the exploitation of known options and the exploration of novel options for information gain. This exploration-exploitation trade-off is thought to depend on dopamine neurotransmission. We hypothesized that human gamblers would show a reduction in directed (uncertainty-based) exploration, accompanied by changes in brain activity in a fronto-parietal exploration-related network. Twenty-three frequent, non-treatment seeking gamblers and twenty-three healthy matched controls (all male) performed a four-armed bandit task during functional magnetic resonance imaging (fMRI). Computational modeling using hierarchical Bayesian parameter estimation revealed signatures of directed exploration, random exploration, and perseveration in both groups. Gamblers showed a reduction in directed exploration, whereas random exploration and perseveration were similar between groups. Neuroimaging revealed no evidence for group differences in neural representations of basic task variables (expected value, prediction errors). Our hypothesis of reduced frontal pole (FP) recruitment in gamblers was not supported. Exploratory analyses showed that during directed exploration, gamblers showed reduced parietal cortex and substantia-nigra/ventral-tegmental-area activity. Cross-validated classification analyses revealed that connectivity in an exploration-related network was predictive of group status, suggesting that connectivity patterns might be more predictive of problem gambling than univariate effects. 
Findings reveal specific reductions of strategic exploration in gamblers that might be linked to altered processing in a fronto-parietal network and/or changes in dopamine neurotransmission implicated in GD.
Significance Statement: Wiehler et al. (2021) report that gamblers rely less on the strategic exploration of unknown, but potentially better, rewards during reward learning. This is reflected in a related network of brain activity. Parameters of this network can be used to predict the presence of problem gambling behavior in participants.
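Directed and random exploration correspond to two different knobs in a bandit model: an uncertainty bonus added to each option's value versus the softmax temperature. A toy four-armed-bandit chooser under illustrative parameters (a generic sketch, not the fitted model from the study):

```python
import math
import random

def choose(q, visits, trial, beta=2.0, phi=1.0, rng=random):
    """Softmax choice over value plus an uncertainty bonus.
    beta: inverse temperature (random exploration);
    phi: weight on the uncertainty bonus (directed exploration)."""
    bonus = [phi * math.sqrt(math.log(trial + 1) / (n + 1)) for n in visits]
    logits = [beta * (qi + bi) for qi, bi in zip(q, bonus)]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    r = rng.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(weights) - 1

# With equal values, the bonus steers choices toward the least-sampled arm.
rng = random.Random(1)
q, visits = [0.5, 0.5, 0.5, 0.5], [10, 1, 10, 10]
picks = [choose(q, visits, trial=20, rng=rng) for _ in range(2000)]
assert picks.count(1) > max(picks.count(i) for i in (0, 2, 3))
```

Reducing `phi` toward zero removes the pull toward uncertain options while leaving value-driven and random choice intact, which is the pattern reported for the gambling group.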
Affiliation(s)
- A Wiehler
  - Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
  - Université de Paris, Paris F-75006, France
  - Department of Psychiatry, Service Hospitalo-Universitaire, Groupe Hospitalier Universitaire Paris Psychiatrie & Neurosciences, Paris F-75014, France
- K Chakroun
  - Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
- J Peters
  - Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
  - Department of Psychology, Biological Psychology, University of Cologne, Cologne 50923, Germany
7.
Stimulation of the vagus nerve reduces learning in a go/no-go reinforcement learning task. Eur Neuropsychopharmacol 2020; 35:17-29. PMID: 32404279; DOI: 10.1016/j.euroneuro.2020.03.023.
Abstract
When facing decisions to approach rewards or to avoid punishments, we often figuratively go with our gut, and the impact of metabolic states such as hunger on motivation is well documented. However, whether and how vagal feedback signals from the gut influence instrumental actions is unknown. Here, we investigated the effect of non-invasive transcutaneous auricular vagus nerve stimulation (taVNS) vs. sham (randomized cross-over design) on approach and avoidance behavior using an established go/no-go reinforcement learning paradigm in 39 healthy human participants (23 female) after an overnight fast. First, mixed-effects logistic regression analysis of choice accuracy showed that taVNS acutely impaired decision-making, p = .041. Computational reinforcement learning models identified the cause of this as a reduction in the learning rate through taVNS (∆α = -0.092, p_boot = .002), particularly after punishment (∆α_Pun = -0.081, p_boot = .012 vs. ∆α_Rew = -0.031, p_boot = .22). However, taVNS had no effect on go biases, Pavlovian response biases, or response time. Hence, taVNS appeared to influence learning rather than action execution. These results highlight a novel role of vagal afferent input in modulating reinforcement learning by tuning the learning rate according to homeostatic needs.
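The model comparison above attributes the taVNS effect to a lower learning rate, especially after punishment. A minimal sketch with hypothetical learning-rate values (not the fitted estimates) shows how valence-specific learning rates produce weaker tracking of punishing outcomes:

```python
def update(q, outcome, alpha_rew, alpha_pun):
    """Q update with separate learning rates for positive and negative
    prediction errors."""
    pe = outcome - q
    alpha = alpha_rew if pe > 0 else alpha_pun
    return q + alpha * pe

# Same punishment sequence under two (hypothetical) punishment learning rates.
q_sham, q_tavns = 0.0, 0.0
for outcome in [-1, -1, -1, -1]:
    q_sham = update(q_sham, outcome, alpha_rew=0.20, alpha_pun=0.20)
    q_tavns = update(q_tavns, outcome, alpha_rew=0.20, alpha_pun=0.12)
assert q_tavns > q_sham  # reduced alpha_pun -> weaker punishment learning
```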
8.
Effort-based decision making varies by smoking status. Psychopharmacology (Berl) 2020; 237:1081-1090. PMID: 31900525; PMCID: PMC7125005; DOI: 10.1007/s00213-019-05437-3.
Abstract
Rationale: A reduced willingness to perform effort based on the magnitude and probability of potential rewards has been associated with diminished dopamine function and may be relevant to chronic drug use.
Objectives: Here, we investigated the influence of smoking status on effort-based decisions. We hypothesized that smokers would make fewer high-effort selections than ex-smokers and never-smokers.
Methods: Current smokers (n = 25), ex-smokers (≥ 1 year quit, n = 23), and never-smokers (n = 19) completed the Effort Expenditure for Rewards Task, in which participants select between low-effort and high-effort options to receive monetary rewards at varying levels of reward magnitude, probability, and expected value.
Results: Overall, participants selected more high-effort options as potential reward magnitude and expected value increased. Smokers did not make fewer high-effort selections overall, but they were less sensitive to changes in magnitude, probability, and expected value compared to never-smokers. Smokers were also less sensitive to changes in probability and expected value, but not magnitude, compared to ex-smokers. Among smokers and ex-smokers, less nicotine dependence was associated with an increased likelihood of high-effort selections.
Conclusions: These results demonstrate the relevance of smoking status to effort-based decisions and suggest that smokers have diminished sensitivity to nondrug reward value. Among ex-smokers, greater pre-existing sensitivity to reward value may have been conducive to smoking cessation, or sensitivity may have improved after cessation. Future prospective studies can investigate whether effort-related decision making is predictive of smoking initiation or cessation success.
Implications: Willingness to perform effort to achieve a goal and sensitivity to changes in reward value are important aspects of motivation. These results showed that smokers have decreased sensitivity to changes in effort-related reward probability and expected value compared to ex-smokers and never-smokers. Potentially, improved sensitivity to rewards among ex-smokers may be a cause or consequence of smoking cessation. These findings may help explain why some smokers are able to achieve long-term abstinence.
9.
Hassall CD, McDonald CG, Krigolson OE. Ready, set, explore! Event-related potentials reveal the time-course of exploratory decisions. Brain Res 2019; 1719:183-193. PMID: 31152692; DOI: 10.1016/j.brainres.2019.05.039.
Abstract
The decision trade-off between exploiting the known and exploring the unknown has been studied using a variety of approaches and techniques. Surprisingly, electroencephalography (EEG) has been underused in this area of study, even though its high temporal resolution has the potential to reveal the time-course of exploratory decisions. We addressed this issue by recording EEG data while participants tried to win as many points as possible in a two-choice gambling task called a two-armed bandit. After using a computational model to classify responses as either exploitations or explorations, we examined event-related potentials locked to two events preceding decisions to exploit/explore: the arrival of feedback, and the subsequent appearance of the next trial's choice stimuli. In particular, we examined the feedback-locked P300 component, thought to index a phasic release of norepinephrine (a neural interrupt signal), and the reward positivity, thought to index a phasic release of dopamine (a neural prediction error signal). We observed an exploration-dependent enhancement of the P300 only, suggesting a critical role of norepinephrine (but not dopamine) in triggering decisions to explore. Similarly, we examined the N200/P300 components evoked by the appearance of the choice stimuli. In this case, exploration was characterized by an enhancement of the N200, but not P300, a result we attribute to increased response conflict. These results demonstrate the usefulness of combining computational and EEG methodologies, and suggest that exploratory decisions are preceded by two characterizing events: a feedback-locked neural interrupt signal (enhanced P300), and a choice-locked increase in response conflict (enhanced N200).
Affiliation(s)
- Cameron D Hassall
  - Centre for Biomedical Research, University of Victoria, Victoria, British Columbia V8W 2Y2, Canada
- Craig G McDonald
  - Department of Psychology, George Mason University, Fairfax, VA 22030, USA
- Olave E Krigolson
  - Centre for Biomedical Research, University of Victoria, Victoria, British Columbia V8W 2Y2, Canada
10.
L-DOPA reduces model-free control of behavior by attenuating the transfer of value to action. Neuroimage 2018; 186:113-125. PMID: 30381245; DOI: 10.1016/j.neuroimage.2018.10.075.
Abstract
Dopamine is a key neurotransmitter in action control. However, influential theories of dopamine function make conflicting predictions about the effect of boosting dopamine neurotransmission. Here, we tested whether increases in dopamine tone through administration of L-DOPA upregulate reward learning, as predicted by reinforcement learning theories, and whether such increases are specific to deliberative "model-based" control or reflexive "model-free" control. Alternatively, L-DOPA may impair learning, as suggested by "value" or "thrift" theories of dopamine. To this end, we employed a two-stage Markov decision task to investigate the effect of L-DOPA (randomized cross-over) on behavioral control while brain activation was measured using fMRI. L-DOPA led to attenuated model-free control of behavior, as indicated by the reduced impact of reward on choice. Increased model-based control was only observed in participants with high working memory capacity. Furthermore, L-DOPA facilitated exploratory behavior, particularly after a stream of wins in the task. Correspondingly, in the brain, L-DOPA decreased the effect of reward at the outcome stage and when the next decision had to be made. Critically, reward-learning rates and prediction error signals were unaffected by L-DOPA, indicating that differences in behavior and brain response to reward were not driven by differences in learning. Taken together, our results suggest that L-DOPA reduces model-free control of behavior by attenuating the transfer of value to action. These findings provide support for the value and thrift accounts of dopamine and call for a refined integration of valuation and action signals in reinforcement learning models.
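"Transfer of value to action" can be sketched as a weight on cached model-free values in a hybrid choice rule. In this toy illustration (the weights, values, and temperature are hypothetical, not the study's estimates), shrinking the model-free weight reduces the impact of reward history on choice without touching learning itself:

```python
import math

def softmax(values, beta=3.0):
    """Convert values to choice probabilities."""
    m = max(values)
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def hybrid_policy(q_mf, q_mb, w_mf, w_mb=1.0, beta=3.0):
    """Mix model-free and model-based values; w_mf scales the transfer of
    cached (model-free) value to action."""
    mixed = [w_mf * mf + w_mb * mb for mf, mb in zip(q_mf, q_mb)]
    return softmax(mixed, beta)

q_mf = [0.9, 0.1]  # cached reward history favors action 0
q_mb = [0.5, 0.5]  # the model-based system is indifferent here
p_placebo = hybrid_policy(q_mf, q_mb, w_mf=1.0)
p_ldopa = hybrid_policy(q_mf, q_mb, w_mf=0.3)  # attenuated transfer
assert p_ldopa[0] < p_placebo[0]  # reward has less impact on choice
```

Because the Q-values themselves are unchanged, this kind of attenuation is consistent with the report that learning rates and prediction error signals were unaffected.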
11.
Beeler JA, Mourra D. To Do or Not to Do: Dopamine, Affordability and the Economics of Opportunity. Front Integr Neurosci 2018; 12:6. PMID: 29487508; PMCID: PMC5816947; DOI: 10.3389/fnint.2018.00006.
Abstract
Five years ago, we introduced the thrift hypothesis of dopamine (DA), suggesting that the primary role of DA in adaptive behavior is regulating behavioral energy expenditure to match the prevailing economic conditions of the environment. Here we elaborate that hypothesis with several new ideas. First, we introduce the concept of affordability, suggesting that costs must necessarily be evaluated with respect to the availability of resources to the organism, which computes a value not only for the potential reward opportunity but also for the resources expended. Placing both costs and benefits within the context of the larger economy in which the animal is functioning requires consideration of the different timescales against which to compute resource availability, or average reward rate. Appropriate windows of computation for tracking resources require corresponding neural substrates that operate on these different timescales. In discussing temporal patterns of DA signaling, we focus on a neglected form of DA plasticity and adaptation: changes in the physical substrate of the DA system itself, such as up- and down-regulation of receptors or release probability. We argue that changes in the DA substrate itself fundamentally alter its computational function, which we propose mediates adaptations to longer temporal horizons and economic conditions. In developing our hypothesis, we focus on DA D2 receptors (D2R), arguing that D2R implements a form of “cost control” in response to the environmental economy, serving as the “brain’s comptroller”. We propose that the balance between the direct and indirect pathway, regulated by relative expression of D1 and D2 DA receptors, implements affordability.
Finally, as we review data, we discuss limitations in current approaches that impede fully investigating the proposed hypothesis and highlight alternative, more semi-naturalistic strategies more conducive to neuroeconomic investigations on the role of DA in adaptive behavior.
Affiliation(s)
- Jeff A Beeler
  - Department of Psychology, Queens College, City University of New York, New York, NY, United States
  - CUNY Neuroscience Consortium, The Graduate Center, City University of New York, New York, NY, United States
- Devry Mourra
  - Department of Psychology, Queens College, City University of New York, New York, NY, United States
  - CUNY Neuroscience Consortium, The Graduate Center, City University of New York, New York, NY, United States
12.
Abstract
Cognitive control - the ability to override a salient or prepotent action to execute a more deliberate one - is required for flexible, goal-directed behavior, and yet it is subjectively costly: decision-makers avoid allocating control resources, even when doing so affords more valuable outcomes. Dopamine likely offsets effort costs just as it does for physical effort. And yet, dopamine can also promote impulsive action, undermining control. We propose a novel hypothesis that reconciles opposing effects of dopamine on cognitive control: during action selection, striatal dopamine biases benefits relative to costs, but does so preferentially for "proximal" motor and cognitive actions. Considering the nature of instrumental affordances and their dynamics during action selection facilitates a parsimonious interpretation and conserved corticostriatal mechanisms across physical and cognitive domains.
Affiliation(s)
- Andrew Westbrook
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Kapittelweg 29, 6525 EN Nijmegen, The Netherlands
  - Department of Psychiatry, Radboud University Medical Centre, Nijmegen, The Netherlands
  - Department of Cognitive, Linguistics, and Psychological Sciences, Brown University, 190 Thayer Street, Providence, RI 02912, USA
- Michael Frank
  - Department of Cognitive, Linguistics, and Psychological Sciences, Brown University, 190 Thayer Street, Providence, RI 02912, USA
  - Brown Institute for Brain Sciences, Brown University, Providence, RI, USA
13.
Ramakrishnan A, Byun YW, Rand K, Pedersen CE, Lebedev MA, Nicolelis MAL. Cortical neurons multiplex reward-related signals along with sensory and motor information. Proc Natl Acad Sci U S A 2017; 114:E4841-E4850. PMID: 28559307; PMCID: PMC5474796; DOI: 10.1073/pnas.1703668114.
Abstract
Rewards are known to influence neural activity associated with both motor preparation and execution. This influence can be exerted directly upon the primary motor (M1) and somatosensory (S1) cortical areas via projections from reward-sensitive dopaminergic neurons of the midbrain ventral tegmental area. However, the neurophysiological manifestations of reward-related signals in M1 and S1 are not well understood. In particular, it is unclear how the neurons in these cortical areas multiplex their traditional functions related to the control of spatial and temporal characteristics of movements with the representation of rewards. To clarify this issue, we trained rhesus monkeys to perform a center-out task in which arm movement direction, reward timing, and magnitude were manipulated independently. Activity of several hundred cortical neurons was simultaneously recorded using chronically implanted microelectrode arrays. Many neurons (9-27%) in both M1 and S1 exhibited activity related to reward anticipation. Additionally, neurons in these areas responded to a mismatch between the reward amount given to the monkeys and the amount they expected: a lower-than-expected reward caused a transient increase in firing rate in 60-80% of the total neuronal sample, whereas a larger-than-expected reward resulted in a decreased firing rate in 20-35% of the neurons. Moreover, responses of M1 and S1 neurons to reward omission depended on the direction of movements that led to those rewards. These observations suggest that sensorimotor cortical neurons corepresent rewards and movement-related activity, presumably to enable reward-based learning.
Collapse
Affiliation(s)
- Arjun Ramakrishnan
- Department of Neurobiology, Duke University, Durham, NC 27710
- Duke University Center for Neuroengineering, Duke University, Durham, NC 27710
| | - Yoon Woo Byun
- Duke University Center for Neuroengineering, Duke University, Durham, NC 27710
- Department of Biomedical Engineering, Duke University, Durham, NC 27708
| | - Kyle Rand
- Department of Biomedical Engineering, Duke University, Durham, NC 27708
| | - Christian E Pedersen
- Joint Department of Biomedical Engineering, University of North Carolina-Chapel Hill and North Carolina State University, Raleigh, NC 27695
| | - Mikhail A Lebedev
- Department of Neurobiology, Duke University, Durham, NC 27710
- Duke University Center for Neuroengineering, Duke University, Durham, NC 27710
| | - Miguel A L Nicolelis
- Department of Neurobiology, Duke University, Durham, NC 27710
- Duke University Center for Neuroengineering, Duke University, Durham, NC 27710
- Department of Biomedical Engineering, Duke University, Durham, NC 27708
- Department of Psychology and Neuroscience, Duke University, Durham, NC 27708
- Department of Neurology, Duke University, Durham, NC 27710
- Edmund and Lily Safra International Institute of Neurosciences, Natal 59066060, Brazil
| |
Collapse
|
14
|
Fatigue modulates dopamine availability and promotes flexible choice reversals during decision making. Sci Rep 2017; 7:535. [PMID: 28373651 PMCID: PMC5428685 DOI: 10.1038/s41598-017-00561-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2016] [Accepted: 03/03/2017] [Indexed: 01/28/2023] Open
Abstract
During decisions, animals balance goal achievement against effort management. Although physical exercise and fatigue significantly affect the level of effort an animal exerts to obtain a reward, their role in effort-based choice, and the underlying neurochemistry, remain incompletely understood. In particular, it is unclear whether fatigue influences decision (cost-benefit) strategies flexibly, or only post-decision action execution and learning. To answer this question, we trained mice on a T-maze task in which they chose between a high-cost, high-reward arm (HR), which included a barrier, and a low-cost, low-reward arm (LR), with no barrier. The animals were parametrically fatigued immediately before the behavioural tasks by running on a treadmill. We report a sharp choice reversal, from the HR to the LR arm, at 80% of peak workload (PW); the reversal was temporary and specific, as the mice returned to the HR arm when subsequently tested at 60% PW or in a two-barrier task. Such rapid reversals are signatures of flexible choice. We also observed increased subcortical dopamine levels in fatigued mice, a marker of individual bias towards model-based control in humans. Our results indicate that fatigue levels can be incorporated into flexible cost-benefit computations that improve foraging efficiency.
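The choice reversal near peak workload is consistent with a cost-benefit computation in which fatigue inflates the effort cost of the barrier. A minimal sketch under that assumption (the utility form and the parameter values are illustrative, chosen to place the reversal between 60% and 80% PW, and are not fitted to the study):

```python
def utility(reward, effort_cost, fatigue):
    """Net option value: reward minus effort cost scaled by fatigue
    (fatigue = fraction of peak workload). Purely illustrative form."""
    return reward - effort_cost * (1.0 + fatigue)

def choose(fatigue, hr=(4.4, 2.0), lr=(1.0, 0.0)):
    """Pick the high-reward/high-cost (HR) or low-reward/low-cost (LR) arm
    by comparing net utilities. (reward, effort_cost) pairs are hypothetical."""
    u_hr = utility(hr[0], hr[1], fatigue)
    u_lr = utility(lr[0], lr[1], fatigue)
    return "HR" if u_hr > u_lr else "LR"

assert choose(fatigue=0.6) == "HR"  # below the reversal point: barrier arm preferred
assert choose(fatigue=0.8) == "LR"  # near peak workload: reversal to the low-cost arm
```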
Collapse
|
15
|
Grimm O, Kaiser S, Plichta MM, Tobler PN. Altered reward anticipation: Potential explanation for weight gain in schizophrenia? Neurosci Biobehav Rev 2017; 75:91-103. [DOI: 10.1016/j.neubiorev.2017.01.029] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2016] [Revised: 01/19/2017] [Accepted: 01/23/2017] [Indexed: 01/19/2023]
|
16
|
Kroemer NB, Burrasch C, Hellrung L. To work or not to work: Neural representation of cost and benefit of instrumental action. PROGRESS IN BRAIN RESEARCH 2016; 229:125-157. [PMID: 27926436 DOI: 10.1016/bs.pbr.2016.06.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Abstract
By definition, instrumental actions are performed in order to obtain certain goals. Nevertheless, the attainment of goals typically implies obstacles, and response vigor is known to reflect an integration of subjective benefit and cost. Whereas several brain regions have been associated with cost/benefit decision-making, trial-by-trial fluctuations in motivation are not well understood. We review recent evidence supporting the motivational implications of signal fluctuations in the mesocorticolimbic system. As an extension of "set-point" theories of instrumental action, we propose that response vigor is determined by a rapid integration of brain signals that reflect value and cost on a trial-by-trial basis, giving rise to an online estimate of utility. Critically, we posit that fluctuations in key nodes of the network can predict deviations in response vigor, and that variability in instrumental behavior can be accounted for by models devised from optimal control theory, which incorporate the effortful control of noise. Nevertheless, the post hoc analysis of signaling dynamics has caveats that can effectively be addressed in future research with the help of two novel fMRI techniques. First, adaptive fMRI paradigms can be used to establish a time-order relationship, a prerequisite for causality, by using observed signal fluctuations as triggers for stimulus presentation. Second, real-time fMRI neurofeedback can be employed to induce predefined brain states that may facilitate benefit or cost aspects of instrumental actions. Ultimately, understanding temporal dynamics in brain networks subserving response vigor holds promise for targeted interventions that could help readjust the motivational balance of behavior.
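The "online estimate of utility" view can be illustrated with a standard average-reward formulation of vigor (a textbook sketch, not the authors' specific model): responding faster incurs a hyperbolic effort cost, while responding slower pays an opportunity cost proportional to the average reward rate, and minimizing their sum yields an optimal latency.

```python
import math

def optimal_latency(effort_cost, avg_reward_rate):
    """Latency tau minimizing effort_cost / tau + avg_reward_rate * tau.
    Setting the derivative -effort_cost / tau**2 + avg_reward_rate to zero
    gives tau* = sqrt(effort_cost / avg_reward_rate)."""
    return math.sqrt(effort_cost / avg_reward_rate)

# A richer context (higher average reward rate) makes slow responding more
# costly, so optimal behavior becomes more vigorous (shorter latency).
assert optimal_latency(1.0, 4.0) < optimal_latency(1.0, 1.0)
```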
Collapse
Affiliation(s)
- N B Kroemer
- Technische Universität Dresden, Dresden, Germany.
| | - C Burrasch
- Technische Universität Dresden, Dresden, Germany; University of Lübeck, Lübeck, Germany
| | - L Hellrung
- Technische Universität Dresden, Dresden, Germany
| |
Collapse
|
17
|
Fuel not fun: Reinterpreting attenuated brain responses to reward in obesity. Physiol Behav 2016; 162:37-45. [PMID: 27085908 DOI: 10.1016/j.physbeh.2016.04.020] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2016] [Revised: 04/05/2016] [Accepted: 04/12/2016] [Indexed: 12/13/2022]
Abstract
There is a well-established literature linking obesity to altered dopamine signaling and brain response to food-related stimuli. Neuroimaging studies frequently report enhanced responses in dopaminergic regions during food anticipation and decreased responses during reward receipt. This has been interpreted as reflecting anticipatory "reward surfeit", and consummatory "reward deficiency". In particular, attenuated response in the dorsal striatum to primary food rewards is proposed to reflect anhedonia, which leads to overeating in an attempt to compensate for the reward deficit. In this paper, we propose an alternative view. We consider brain response to food-related stimuli in a reinforcement-learning framework, which can be employed to separate the contributions of reward sensitivity and reward-related learning that are typically entangled in the brain response to reward. Consequently, we posit that decreased striatal responses to milkshake receipt reflect reduced reward-related learning rather than reward deficiency or anhedonia because reduced reward sensitivity would translate uniformly into reduced anticipatory and consummatory responses to reward. By re-conceptualizing reward deficiency as a shift in learning about subjective value of rewards, we attempt to reconcile neuroimaging findings with the putative role of dopamine in effort, energy expenditure and exploration and suggest that attenuated brain responses to energy dense foods reflect the "fuel", not the fun entailed by the reward.
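The core argument here, that reduced reward sensitivity would scale anticipatory and consummatory responses uniformly, can be made concrete with a Rescorla-Wagner learner carrying a separate sensitivity parameter (a sketch; the symbols rho and alpha and the linear scaling are illustrative assumptions, not the authors' fitted model):

```python
def simulate(n_trials, reward, alpha, rho):
    """Rescorla-Wagner value learning with reward-sensitivity scaling rho.
    Returns per-trial (anticipatory ~ value V, consummatory ~ error delta)."""
    v, trace = 0.0, []
    for _ in range(n_trials):
        delta = rho * reward - v   # prediction error at reward receipt
        trace.append((v, delta))
        v += alpha * delta         # value update drives anticipation
    return trace

full = simulate(20, 1.0, alpha=0.2, rho=1.0)
half = simulate(20, 1.0, alpha=0.2, rho=0.5)

# Halving reward sensitivity scales BOTH anticipatory and consummatory
# responses by the same factor on every trial, so it cannot produce the
# enhanced-anticipatory / decreased-consummatory pattern described above.
for (v1, d1), (v2, d2) in zip(full, half):
    assert abs(v2 - 0.5 * v1) < 1e-9 and abs(d2 - 0.5 * d1) < 1e-9
```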
Collapse
|
18
|
Predicting childhood effortful control from interactions between early parenting quality and children's dopamine transporter gene haplotypes. Dev Psychopathol 2015; 28:199-212. [PMID: 25924976 DOI: 10.1017/s0954579415000383] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Children's observed effortful control (EC) at 30, 42, and 54 months (n = 145) was predicted from the interaction between mothers' observed parenting with their 30-month-olds and three variants of the solute carrier family 6, member 3 (SLC6A3) dopamine transporter gene (single nucleotide polymorphisms in intron8 and intron13, and a 40 base pair variable number tandem repeat [VNTR] in the 3'-untranslated region [UTR]), as well as haplotypes of these variants. Significant moderating effects were found. Children without the intron8-A/intron13-G, intron8-A/3'-UTR VNTR-10, or intron13-G/3'-UTR VNTR-10 haplotypes (i.e., haplotypes associated with reduced SLC6A3 gene expression and thus lower dopamine functioning) appeared to demonstrate altered levels of EC as a function of maternal parenting quality, whereas children with these haplotypes demonstrated a similar EC level regardless of parenting quality. Children with these haplotypes also showed a trade-off, such that they exhibited higher EC, relative to their counterparts without these haplotypes, when exposed to less supportive maternal parenting. The findings revealed a diathesis-stress pattern and suggested that different SLC6A3 haplotypes, but not single variants, might represent different levels of young children's sensitivity/responsivity to early parenting.
Collapse
|
19
|
Neural Mechanisms for Evaluating Environmental Variability in Caenorhabditis elegans. Neuron 2015; 86:428-41. [PMID: 25864633 DOI: 10.1016/j.neuron.2015.03.026] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2014] [Revised: 01/18/2015] [Accepted: 02/20/2015] [Indexed: 11/21/2022]
Abstract
The ability to evaluate variability in the environment is vital for making optimal behavioral decisions. Here we show that Caenorhabditis elegans evaluates variability in its food environment and modifies its future behavior accordingly. We derive a behavioral model that reveals a critical period over which information about the food environment is acquired and predicts future search behavior. We also identify a pair of high-threshold sensory neurons that encode variability in food concentration and the downstream dopamine-dependent circuit that generates appropriate search behavior upon removal from food. Further, we show that CREB is required in a subset of interneurons and determines the timescale over which the variability is integrated. Interestingly, the variability circuit is a subset of a larger circuit driving search behavior, showing that learning directly modifies the very same neurons driving behavior. Our study reveals how a neural circuit decodes environmental variability to generate contextually appropriate decisions.
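The idea of integrating environmental variability over a tunable timescale can be sketched with a leaky (exponentially weighted) estimator; this is illustrative only, with the window parameter tau standing in for the CREB-dependent integration timescale, and it is not the authors' behavioral model:

```python
def integrate_variability(samples, tau):
    """Leaky online estimate of variability: exponentially weighted running
    mean, then exponentially weighted mean of squared deviations. tau sets
    the timescale over which fluctuations are integrated."""
    mean, var = 0.0, 0.0
    a = 1.0 / tau
    for x in samples:
        mean += a * (x - mean)           # running estimate of food level
        var += a * ((x - mean) ** 2 - var)  # running estimate of its variability
    return var

steady = [1.0] * 200          # constant food concentration
variable = [0.0, 2.0] * 100   # fluctuating food concentration, same average

# A fluctuating environment yields a higher integrated-variability signal,
# which could then bias subsequent search behavior.
assert integrate_variability(variable, tau=20) > integrate_variability(steady, tau=20)
```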
Collapse
|
20
|
Reward-based decision making in pathological gambling: The roles of risk and delay. Neurosci Res 2015; 90:3-14. [DOI: 10.1016/j.neures.2014.09.008] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2014] [Revised: 08/26/2014] [Accepted: 08/26/2014] [Indexed: 01/27/2023]
|