1
Smith TR, Southern R, Kirkpatrick K. Mechanisms of impulsive choice: Experiments to explore and models to map the empirical terrain. Learn Behav 2023; 51:355-391. PMID: 36913144; PMCID: PMC10497727; DOI: 10.3758/s13420-023-00577-1.
Abstract
Impulsive choice is preference for a smaller-sooner (SS) outcome over a larger-later (LL) outcome when LL choices result in greater reinforcement maximization. Delay discounting is a model of impulsive choice that describes the decaying value of a reinforcer over time, with impulsive choice evident when the empirical choice-delay function is steep. Steep discounting is correlated with multiple diseases and disorders. Thus, understanding the processes underlying impulsive choice is a popular topic for investigation. Experimental research has explored the conditions that moderate impulsive choice, and quantitative models of impulsive choice have been developed that elegantly represent the underlying processes. This review spotlights experimental research in impulsive choice covering human and nonhuman animals across the domains of learning, motivation, and cognition. Contemporary models of delay discounting designed to explain the underlying mechanisms of impulsive choice are discussed. These models focus on potential candidate mechanisms, which include perception, delay and/or reinforcer sensitivity, reinforcement maximization, motivation, and cognitive systems. Although the models collectively explain multiple mechanistic phenomena, there are several cognitive processes, such as attention and working memory, that are overlooked. Future research and model development should focus on bridging the gap between quantitative models and empirical phenomena.
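The delay discounting model referenced throughout this review is commonly written V = A / (1 + kD), where A is the reward amount, D its delay, and k the discount rate (larger k means steeper discounting). As a minimal illustrative sketch (the amounts, delays, and k values below are hypothetical), this is how the hyperbolic value function yields impulsive, smaller-sooner choice:

```python
def discounted_value(amount, delay, k):
    """Hyperbolic discounting: V = amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)

def prefers_smaller_sooner(ss_amount, ss_delay, ll_amount, ll_delay, k):
    """Impulsive choice: the smaller-sooner (SS) option outvalues the larger-later (LL) one."""
    return discounted_value(ss_amount, ss_delay, k) > discounted_value(ll_amount, ll_delay, k)

# A shallow discounter (small k) waits for the larger-later reward;
# a steep discounter (large k) takes the smaller-sooner one.
patient = prefers_smaller_sooner(1, 0, 3, 10, k=0.05)   # False: LL is worth 3/1.5 = 2.0 > 1.0
impulsive = prefers_smaller_sooner(1, 0, 3, 10, k=1.0)  # True: LL is worth only 3/11, about 0.27
```

A steep empirical choice-delay function, in these terms, is simply a large fitted k.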
2
Bakst L, McGuire JT. Experience-driven recalibration of learning from surprising events. Cognition 2023; 232:105343. PMID: 36481590; PMCID: PMC9851993; DOI: 10.1016/j.cognition.2022.105343.
Abstract
Different environments favor different patterns of adaptive learning. A surprising event that in one context would accelerate belief updating might, in another context, be downweighted as a meaningless outlier. Here, we investigated whether people would spontaneously regulate the influence of surprise on learning in response to event-by-event experiential feedback. Across two experiments, we examined whether participants performing a perceptual judgment task under spatial uncertainty (n = 29, n = 63) adapted their patterns of predictive gaze according to the informativeness or uninformativeness of surprising events in their current environment. Uninstructed predictive eye movements exhibited a form of metalearning in which surprise came to modulate event-by-event learning rates in opposite directions across contexts. Participants later appropriately readjusted their patterns of adaptive learning when the statistics of the environment underwent an unsignaled reversal. Although significant adjustments occurred in both directions, performance was consistently superior in environments in which surprising events reflected meaningful change, potentially reflecting a bias towards interpreting surprise as informative and/or difficulty ignoring salient outliers. Our results provide evidence for spontaneous, context-appropriate recalibration of the role of surprise in adaptive learning.
Affiliation(s)
- Leah Bakst
- Department of Psychological & Brain Sciences, Boston University, 64 Cummington Mall, Boston, MA 02215, USA; Center for Systems Neuroscience, Boston University, 610 Commonwealth Avenue, Boston, MA 02215, USA
- Joseph T McGuire
- Department of Psychological & Brain Sciences, Boston University, 64 Cummington Mall, Boston, MA 02215, USA; Center for Systems Neuroscience, Boston University, 610 Commonwealth Avenue, Boston, MA 02215, USA
3
Ito KL, Cao L, Reinberg R, Keller B, Monterosso J, Schweighofer N, Liew SL. Validating Habitual and Goal-Directed Decision-Making Performance Online in Healthy Older Adults. Front Aging Neurosci 2021; 13:702810. PMID: 34267650; PMCID: PMC8276057; DOI: 10.3389/fnagi.2021.702810.
Abstract
Everyday decision-making is supported by a dual system of control comprising parallel goal-directed and habitual systems. Over the past decade, the two-stage Markov decision task has become popular for its ability to dissociate goal-directed from habitual decision-making. While a handful of studies have implemented decision-making tasks online, only one study has validated the task by comparing in-person and web-based performance on the two-stage task in children and young adults. To date, no study has validated the dissociation of goal-directed and habitual behaviors in older adults online. Here, we implemented and validated a web-based version of the two-stage Markov task using parameter simulation and recovery, and compared behavioral results from online and in-person participation in both young adults and healthy older adults. We found no differences in estimated free parameters between online and in-person participation on the two-stage task. Further, we replicated previous findings that young adults are more goal-directed than older adults, both in person and online. Overall, this work demonstrates that remote administration of the two-stage Markov decision task is feasible in the older adult demographic, which would allow the study of decision-making with larger and more diverse samples.
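The parameter simulation-and-recovery logic mentioned in this abstract can be illustrated generically. The sketch below is not the authors' two-stage hybrid model; it simulates choices from a one-parameter softmax with a known inverse temperature and then recovers that parameter by maximum likelihood, which is the same validation pattern, applied here to hypothetical option values:

```python
import math
import random

def softmax_p1(v0, v1, beta):
    """Probability of choosing option 1 under a softmax with inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(-beta * (v1 - v0)))

def simulate_choices(v0, v1, beta, n_trials, rng):
    """Generate synthetic choices (1 = option 1) from a known, 'true' beta."""
    p1 = softmax_p1(v0, v1, beta)
    return [1 if rng.random() < p1 else 0 for _ in range(n_trials)]

def recover_beta(choices, v0, v1, grid):
    """Maximum-likelihood estimate of beta by grid search over candidate values."""
    def log_lik(beta):
        p1 = softmax_p1(v0, v1, beta)
        return sum(math.log(p1 if c else 1.0 - p1) for c in choices)
    return max(grid, key=log_lik)

rng = random.Random(0)
true_beta = 2.0
choices = simulate_choices(0.0, 1.0, true_beta, 2000, rng)
beta_hat = recover_beta(choices, 0.0, 1.0, [b / 10 for b in range(1, 51)])
# beta_hat should land close to true_beta; a large gap would flag a poorly identified model
```

Good recovery of known generating parameters is what licenses interpreting the parameters fitted to real participants.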
Affiliation(s)
- Kaori L Ito
- Neural Plasticity and Neurorehabilitation Laboratory, Department of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, United States
- Laura Cao
- Computational Neuro-Rehabilitation Laboratory, Department of Biokinesiology and Physical Therapy, University of Southern California, Los Angeles, CA, United States
- Renee Reinberg
- Neural Plasticity and Neurorehabilitation Laboratory, Department of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, United States
- Brenton Keller
- Department of Gerontology, University of Southern California, Los Angeles, CA, United States
- John Monterosso
- Department of Psychology, University of Southern California, Los Angeles, CA, United States
- Nicolas Schweighofer
- Computational Neuro-Rehabilitation Laboratory, Department of Biokinesiology and Physical Therapy, University of Southern California, Los Angeles, CA, United States
- Sook-Lei Liew
- Neural Plasticity and Neurorehabilitation Laboratory, Department of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, United States
4
Yates JR, Day HA, Evans KE, Igwe HO, Kappesser JL, Miller AL, Murray CP, Torline BT, Ellis AL, Stacy WL. Effects of d-amphetamine and MK-801 on impulsive choice: Modulation by schedule of reinforcement and delay length. Behav Brain Res 2019; 376:112228. PMID: 31520689; DOI: 10.1016/j.bbr.2019.112228.
Abstract
Procedural modifications, such as signaling the delay to reinforcement or altering the order in which delays are presented, can modulate drug effects in delay discounting. Although the schedule of reinforcement can alter the rate at which animals discount a reinforcer, research has not determined whether animals trained on different schedules of reinforcement are differentially affected by pharmacological manipulations. Similarly, research has not determined whether different delays to reinforcement can modulate drug effects in delay discounting. Male Sprague Dawley rats (n = 36) were split into four groups and trained in a delay-discounting procedure. The schedule of reinforcement (fixed ratio [FR] 1 vs. FR 10) and the delays to reinforcement (0, 5, 10, 20, and 50 s vs. 0, 10, 30, 60, and 100 s) were manipulated for each group. Following behavioral training, rats were treated with d-amphetamine (0, 0.25, 0.5, and 1.0 mg/kg) and MK-801 (0, 0.03, and 0.06 mg/kg). Amphetamine decreased impulsive choice under the FR 1 schedule, but only with the short delay sequence; conversely, it decreased impulsive choice under the FR 10 schedule, but only in rats trained on the long delay sequence. MK-801 decreased impulsive choice in rats trained on the FR 1 schedule, regardless of delay sequence, but did not alter choice in rats trained on the FR 10 schedule. These results show that the schedule of reinforcement and delay length can modulate drug effects in delay discounting.
Affiliation(s)
- Justin R Yates
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Haley A Day
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Karson E Evans
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Hephzibah O Igwe
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Joy L Kappesser
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Amber L Miller
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Christopher P Murray
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Brett T Torline
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- Alexis L Ellis
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
- William L Stacy
- Department of Psychological Science, Northern Kentucky University, 1 Nunn Drive, Highland Heights, KY, 41099, USA
5
Howatt BC, Muñoz Torrecillas MJ, Cruz Rambaud S, Takahashi T. A New Analysis on Self-Control in Intertemporal Choice and Mediterranean Dietary Pattern. Front Public Health 2019; 7:165. PMID: 31316959; PMCID: PMC6611428; DOI: 10.3389/fpubh.2019.00165.
Abstract
This paper extends the results and conclusions of Muñoz Torrecillas et al. (1), who investigated the relationship between adherence to healthy dietary habits, specifically the Mediterranean Diet (hereinafter, MD), and impulsivity in intertemporal choice. Impulsivity can be defined as a strong preference for small immediate payoffs over larger delayed payoffs; in the original study, this behavior was captured by the parameter k (the discount rate of the hyperbolic discount function), calculated using an automated scoring mechanism. Adherence to MD was measured by the KIDMED index and grouped into three levels: high, medium, and low. Although the authors observed that individuals in the high-adherence group showed the shallowest discounting and individuals in the low-adherence group the steepest, the data were not analyzed statistically in depth. The purpose of the present paper is therefore to propose a preliminary quantitative model of this relationship and evaluate its significance. Tests revealed a significant interaction between adherence to MD and the magnitude of delayed rewards when predicting discount rates. Specifically, the degree to which impulsivity decreases as adherence to MD increases is strongly influenced by delayed rewards of smaller magnitude. These findings are consistent with the authors' claim that healthy dietary habits may be closely linked with greater self-control when payoffs are small, and thus warrant further examination. The results do not indicate causality, however, so future studies could also investigate the direction of this relationship as a means of developing behavioral interventions.
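The discount rate k in the original study came from an automated scoring of choices. As a hedged illustration only, a least-squares fit of k to indifference points (the subjective values at which immediate and delayed options are judged equivalent) might look like this; the delays, amounts, and the generating k below are hypothetical:

```python
def hyperbolic_value(amount, delay, k):
    """V = amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)

def fit_k(delays, indifference_values, amount, k_grid):
    """Grid-search least-squares fit of the hyperbolic discount rate k."""
    def sse(k):
        return sum((hyperbolic_value(amount, d, k) - v) ** 2
                   for d, v in zip(delays, indifference_values))
    return min(k_grid, key=sse)

# Hypothetical indifference points for a 100-unit delayed reward, generated with k = 0.1
delays = [1, 7, 30, 90, 180]
values = [hyperbolic_value(100, d, 0.1) for d in delays]
k_hat = fit_k(delays, values, 100, [i / 1000 for i in range(1, 1001)])
```

Steeper discounters produce lower indifference values at every delay, and so yield a larger fitted k; comparing fitted k across KIDMED groups is the analysis the paper formalizes.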
Affiliation(s)
- Brian C. Howatt
- Department of Psychological Sciences, Kansas State University, Manhattan, KS, United States
- Taiki Takahashi
- Department of Behavioral Science, Center for Experimental Research in Social Sciences, Hokkaido University, Hokkaido, Japan
6
Hayden BY. Why has evolution not selected for perfect self-control? Philos Trans R Soc Lond B Biol Sci 2019; 374:20180139. PMID: 30966922; PMCID: PMC6335460; DOI: 10.1098/rstb.2018.0139.
Abstract
Self-control refers to the ability to deliberately reject tempting options and instead select ones that produce greater long-term benefits. Although some apparent failures of self-control are, on closer inspection, reward maximizing, at least some self-control failures are clearly disadvantageous and non-strategic. The existence of poor self-control presents an important evolutionary puzzle because there is no obvious reason why good self-control should be more costly than poor self-control. After all, a rock is infinitely patient. I propose that self-control failures result from cases in which well-learned (and thus routinized) decision-making strategies yield suboptimal choices. These mappings persist in the decision-makers' repertoire because they result from learning processes that are adaptive in the broader context, either on the timescale of learning or of evolution. Self-control, then, is a form of cognitive control, and the subjective feeling of effort likely reflects the true costs of cognitive control. Poor self-control, in this view, is ultimately a result of bounded optimality. This article is part of the theme issue 'Risk taking and impulsive behaviour: fundamental discoveries, theoretical perspectives and clinical implications'.
Affiliation(s)
- Benjamin Y. Hayden
- Department of Neuroscience, Center for Magnetic Resonance Research, Center for Neuroengineering, University of Minnesota, Minneapolis, MN 55455, USA
7
Cuevas Rivera D, Ott F, Markovic D, Strobel A, Kiebel SJ. Context-Dependent Risk Aversion: A Model-Based Approach. Front Psychol 2018; 9:2053. PMID: 30416474; PMCID: PMC6212575; DOI: 10.3389/fpsyg.2018.02053.
Abstract
Most research on risk aversion in behavioral science with human subjects has focused on a component of risk aversion that does not adapt itself to context. More recently, studies have explored risk aversion adaptation to changing circumstances in sequential decision-making tasks. It is an open question whether one can identify evidence, at the single subject level, for such risk aversion adaptation. We conducted a behavioral experiment on human subjects, using a sequential decision making task. We developed a model-based approach for estimating the adaptation of risk-taking behavior with single-trial resolution by modeling a subject's goals and internal representation of task contingencies. Using this model-based approach, we estimated the subject-specific adaptation of risk aversion depending on the current task context. We found striking inter-subject variations in the adaptation of risk-taking behavior. We show that these differences can be explained by differences in subjects' internal representations of task contingencies and goals. We discuss that the proposed approach can be adapted to a wide range of experimental paradigms and be used to analyze behavioral measures other than risk aversion.
Affiliation(s)
- Darío Cuevas Rivera
- Chair of Neuroimaging, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Florian Ott
- Chair of Neuroimaging, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Dimitrije Markovic
- Chair of Neuroimaging, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Alexander Strobel
- Chair of Differential and Personality Psychology, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- Stefan J Kiebel
- Chair of Neuroimaging, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
8
Todokoro A, Tanaka SC, Kawakubo Y, Yahata N, Ishii-Takahashi A, Nishimura Y, Kano Y, Ohtake F, Kasai K. Deficient neural activity subserving decision-making during reward waiting time in intertemporal choice in adult attention-deficit hyperactivity disorder. Psychiatry Clin Neurosci 2018; 72:580-590. PMID: 29687930; DOI: 10.1111/pcn.12668.
Abstract
AIM: Impulsivity, which significantly affects social adaptation, is an important target behavioral characteristic in interventions for attention-deficit hyperactivity disorder (ADHD). Typically, people are willing to wait longer to acquire greater rewards. Impulsivity in ADHD may be associated with brain dysfunction in decision-making involving waiting behavior in such situations. We tested the hypothesis that brain circuitry during a period of waiting (i.e., prior to the acquisition of reward) is altered in adults with ADHD.
METHODS: The participants included 14 medication-free adults with ADHD and 16 healthy controls matched for age, sex, IQ, and handedness. The behavioral task had participants choose between a delayed, larger monetary reward and an immediate, smaller monetary reward, where the reward waiting time actually elapsed during functional magnetic resonance imaging measurement. We tested for group differences in the contrast values of blood-oxygen-level-dependent signals associated with the length of waiting time, calculated using the parametric modulation method.
RESULTS: While the two groups did not differ in the time discounting rate, the delay-sensitive contrast values were significantly lower in the caudate and visual cortex in individuals with ADHD. Higher impulsivity scores were significantly associated with lower delay-sensitive contrast values in the caudate and visual cortex.
CONCLUSION: These results suggest that deficient neural activity affects decision-making involving reward waiting time during intertemporal choice tasks, and provide an explanation for the basis of impulsivity in adult ADHD.
Affiliation(s)
- Ayako Todokoro
- Department of Child Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Saori C Tanaka
- ATR Brain Information Communication Research Laboratory Group, Kyoto, Japan
- Yuki Kawakubo
- Department of Child Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Noriaki Yahata
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Molecular Imaging Center, National Institute of Radiological Sciences, Chiba, Japan
- Ayaka Ishii-Takahashi
- Department of Child Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Yukika Nishimura
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Japan Agency for Medical Research and Development, Tokyo, Japan
- Yukiko Kano
- Department of Child Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Fumio Ohtake
- Institute of Social and Economic Research, Osaka University, Osaka, Japan
- Kiyoto Kasai
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; The International Research Center for Neurointelligence at The University of Tokyo Institutes for Advanced Study, Tokyo, Japan
9
Preliminary evidence of altered neural response during intertemporal choice of losses in adult attention-deficit hyperactivity disorder. Sci Rep 2018; 8:6703. PMID: 29712945; PMCID: PMC5928218; DOI: 10.1038/s41598-018-24944-5.
Abstract
Impulsive behaviours are common symptoms of attention-deficit hyperactivity disorder (ADHD). Although previous studies have suggested functional models of impulsive behaviour, a full explanation of impulsivity in ADHD remains elusive. To investigate the mechanisms behind impulsive behaviour in ADHD in detail, we applied an economic intertemporal choice task involving gains and losses to adults with ADHD and healthy controls and measured brain activity with functional magnetic resonance imaging. In the intertemporal choice of future gains, we observed no behavioural or neural difference between the two groups. In the intertemporal choice of future losses, adults with ADHD exhibited higher discount rates than the control participants. Furthermore, comparing brain activity representing sensitivity to future loss across the two groups revealed significantly lower activity in the striatum and higher activity in the amygdala in adults with ADHD than in controls. Our preliminary findings suggest that altered sensitivity to the size of future losses underlies apparently impulsive choice behaviour in adults with ADHD, and shed light on the multifaceted impulsivity underlying ADHD.
10
Seinstra MS, Sellitto M, Kalenscher T. Rate maximization and hyperbolic discounting in human experiential intertemporal decision making. Behav Ecol 2017. DOI: 10.1093/beheco/arx145.
Affiliation(s)
- Maayke Suzanne Seinstra
- Comparative Psychology, Institute of Experimental Psychology, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
- Manuela Sellitto
- Comparative Psychology, Institute of Experimental Psychology, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
- Tobias Kalenscher
- Comparative Psychology, Institute of Experimental Psychology, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
11
Delay Discounting Rates are Temporally Stable in an Equivalent Present Value Procedure Using Theoretical and Area under the Curve Analyses. Psychological Record 2017. DOI: 10.1007/bf03395804.
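The area-under-the-curve measure named in this title is a theory-neutral index of discounting that requires no assumed functional form. A sketch of the usual trapezoidal computation, under the common convention of normalizing delays and subjective values and assuming the full amount at delay 0 (all data here hypothetical):

```python
def discounting_auc(delays, subjective_values, max_delay, amount):
    """Trapezoidal area under the normalized discounting curve.

    Delays are scaled by the maximum delay and subjective values by the
    undiscounted amount, so AUC runs from 0 (steepest possible discounting)
    to 1 (no discounting). Assumes subjective value equals the full amount
    at delay 0.
    """
    xs = [0.0] + [d / max_delay for d in delays]
    ys = [1.0] + [v / amount for v in subjective_values]
    return sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
               for i in range(1, len(xs)))

# No discounting at all gives AUC = 1; a linear fall to zero gives 0.5
flat = discounting_auc([30, 180, 365], [100, 100, 100], 365, 100)
linear = discounting_auc([10], [0], 10, 100)
```

Temporal stability of discounting can then be assessed by correlating AUC (or fitted theoretical parameters) across sessions.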
12
Mitchell SH. Devaluation of Outcomes Due to Their Cost: Extending Discounting Models Beyond Delay. Nebraska Symposium on Motivation 2017. DOI: 10.1007/978-3-319-51721-6_5.
13
Tang H, Luo F, Li SH, Li BM. Behavioral representation of cost and benefit balance in rats. Neurosci Lett 2016; 632:175-80. PMID: 27589889; DOI: 10.1016/j.neulet.2016.08.054.
Abstract
Decision making depends on individual motivation. Previous studies showed that animals with higher levels of motivation are more likely to invest more time to acquire larger rewards rather than acquire smaller rewards with shorter waits. However, little is known about how this motivation mediates the cognitive effort animals devote to such decisions. In the present study, we investigated behavioral responses in a goal-directed action under a differential reward schedule by training rats to perform a "Do more, get more" (DM-GM) task using a nosepoke operandum, in which longer nosepoke durations yielded correspondingly larger rewards. In general, the subjects learned the DM-GM rule and reached a steady behavioral state within 15 days. During the training stage, the rats found the most cost-effective action choice and performed it more frequently than other possible actions. In addition, when the cost-benefit ratio changed, the rats again found a new most cost-effective choice to obtain maximum rewards. Our results demonstrate that there is a "balance point" of cost and benefit in the rat valuation system, and that this balance point not only guides the rats to make appropriate decisions but can also be adjusted in new situations to select a new optimal action plan.
Affiliation(s)
- Hua Tang
- Center for Neuropsychiatric Diseases, Institute of Life Science, Nanchang University, Nanchang 330031, China
- Fei Luo
- Center for Neuropsychiatric Diseases, Institute of Life Science, Nanchang University, Nanchang 330031, China
- Si-Hai Li
- Center for Neuropsychiatric Diseases, Institute of Life Science, Nanchang University, Nanchang 330031, China
- Bao-Ming Li
- Center for Neuropsychiatric Diseases, Institute of Life Science, Nanchang University, Nanchang 330031, China
14
Iigaya K, Story GW, Kurth-Nelson Z, Dolan RJ, Dayan P. The modulation of savouring by prediction error and its effects on choice. eLife 2016; 5. PMID: 27101365; PMCID: PMC4866828; DOI: 10.7554/elife.13747.
Abstract
When people anticipate uncertain future outcomes, they often prefer to know their fate in advance. Inspired by an idea in behavioral economics that the anticipation of rewards is itself attractive, we hypothesized that this preference for advance information arises because reward prediction errors carried by such information can boost the level of anticipation. We designed new empirical behavioral studies to test this proposal, and confirmed that subjects preferred advance reward information more strongly when they had to wait longer for rewards. We formulated our proposal in a reinforcement-learning model and showed that it could account for a wide range of existing neuronal and behavioral data, without appealing to ambiguous notions such as an explicit value of information. We suggest that such boosted anticipation significantly drives risk-seeking behaviors, most pertinently in gambling. DOI:http://dx.doi.org/10.7554/eLife.13747.001
People, pigeons and monkeys often want to know in advance whether they will receive a reward in the future. This behaviour is irrational when individuals pay for costly information that makes no difference to an eventual outcome. One explanation is that individuals seek information because anticipating reward has hedonic value (it produces a feeling of pleasure). Consistent with this, pigeons are more likely to seek information when they have to wait longer for the potential reward. However, existing models cannot account for why this anticipation of rewards leads to irrational information-seeking. In many situations, animals are uncertain about what is going to happen. Providing new information can produce a "prediction error" that indexes a discrepancy between what is expected and what actually happens. Iigaya et al. have now developed a mathematical model of information-seeking in which anticipation is boosted by this prediction error. The model accounts for a wide range of previously unexplained data from monkeys and pigeons. It also successfully explains the behaviour of a group of human volunteers from whom Iigaya et al. elicited informational and actual decisions concerning uncertain and delayed rewards. The longer the participants had to wait for possible rewards, the more avidly they wanted to find out about them. Further research is now needed to investigate the neural underpinnings of anticipation and its boosting by prediction errors. DOI:http://dx.doi.org/10.7554/eLife.13747.002
Affiliation(s)
- Kiyohito Iigaya
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Giles W Story
- The Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Zeb Kurth-Nelson
- The Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Raymond J Dolan
- The Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Peter Dayan
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
16
Lloyd K, Dayan P. Tamping Ramping: Algorithmic, Implementational, and Computational Explanations of Phasic Dopamine Signals in the Accumbens. PLoS Comput Biol 2015; 11:e1004622. PMID: 26699940; PMCID: PMC4689534; DOI: 10.1371/journal.pcbi.1004622.
Abstract
Substantial evidence suggests that the phasic activity of dopamine neurons represents reinforcement learning's temporal difference prediction error. However, recent reports of ramp-like increases in dopamine concentration in the striatum when animals are about to act, or are about to reach rewards, appear to pose a challenge to established thinking. This is because the implied activity is persistently predictable by preceding stimuli, and so cannot arise as this sort of prediction error. Here, we explore three possible accounts of such ramping signals: (a) the resolution of uncertainty about the timing of action; (b) the direct influence of dopamine over mechanisms associated with making choices; and (c) a new model of discounted vigour. Collectively, these suggest that dopamine ramps may be explained, with only minor disturbance, by standard theoretical ideas, though urgent questions remain regarding their proximal cause. We suggest experimental approaches to disentangling which of the proposed mechanisms are responsible for dopamine ramps.
Dopamine has long been implicated in reward-motivated behaviour. Theory and experiments suggest that activity of dopamine-containing neurons resembles a temporally-sophisticated prediction error used to learn expectations of future reward. This account would appear to be inconsistent with recent observations of 'ramps', i.e., gradual increases in extracellular dopamine concentration prior to the execution of actions or the acquisition of rewards. We explore three different possible explanations of such ramping signals as arising: (a) when subjects experience uncertainty about when actions will be executed; (b) when dopamine itself influences the timecourse of choice; and (c) under a new model in which 'quasi-tonic' dopamine signals arise through a form of temporal discounting. We thereby show that dopamine ramps can be integrated with current theories, and also suggest experiments to clarify which mechanisms are involved.
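The temporal difference prediction error at the heart of this account is delta_t = r_t + gamma * V(s_{t+1}) - V(s_t). A minimal sketch with made-up value estimates (not the paper's model) shows why a fully predicted reward elicits no phasic error while an unpredicted one does:

```python
def td_errors(rewards, values, gamma):
    """Per-timestep TD prediction errors: delta_t = r_t + gamma * V(t+1) - V(t).

    `values` are the agent's current state-value estimates; the state after
    the final step is treated as terminal (value 0).
    """
    deltas = []
    for t, r in enumerate(rewards):
        v_next = values[t + 1] if t + 1 < len(values) else 0.0
        deltas.append(r + gamma * v_next - values[t])
    return deltas

# Reward fully predicted by the value function: no prediction error at any step
predicted = td_errors([0.0, 0.0, 1.0], [1.0, 1.0, 1.0], gamma=1.0)
# Reward entirely unpredicted: a single positive error at delivery
surprise = td_errors([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0)
```

A slow ramp of dopamine toward a predictable reward is puzzling on exactly this logic, since well-predicted events should yield near-zero deltas; the paper's three accounts are alternatives to reading the ramp as a prediction error.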
Affiliation(s)
- Kevin Lloyd
- Gatsby Computational Neuroscience Unit, London, United Kingdom
- Peter Dayan
- Gatsby Computational Neuroscience Unit, London, United Kingdom
17
18
McGuire JT, Kable JW. Medial prefrontal cortical activity reflects dynamic re-evaluation during voluntary persistence. Nat Neurosci 2015; 18:760-6. [PMID: 25849988 PMCID: PMC4437670 DOI: 10.1038/nn.3994] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 01/12/2015] [Accepted: 03/10/2015] [Indexed: 12/14/2022]
Abstract
Deciding how long to keep waiting for future rewards is a nontrivial problem, especially when the timing of rewards is uncertain. We carried out an experiment in which human decision makers waited for rewards in two environments in which reward-timing statistics favored either a greater or lesser degree of behavioral persistence. We found that decision makers adaptively calibrated their level of persistence for each environment. Functional neuroimaging revealed signals that evolved differently during physically identical delays in the two environments, consistent with a dynamic and context-sensitive reappraisal of subjective value. This effect was observed in a region of ventromedial prefrontal cortex that is sensitive to subjective value in other contexts, demonstrating continuity between valuation mechanisms involved in discrete choice and in temporally extended decisions analogous to foraging. Our findings support a model in which voluntary persistence emerges from dynamic cost/benefit evaluation rather than from a control process that overrides valuation mechanisms.
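The idea that reward-timing statistics determine how much persistence is worth can be sketched with a toy calculation (the distributions and numbers are ours, not the task's): the expected remaining wait, conditional on the reward not yet having arrived, shrinks over time under a bounded timing distribution but grows under a heavy-tailed one.

```python
def expected_remaining_wait(delays, probs, t):
    """Mean additional wait, given the reward has not arrived by time t."""
    pairs = [(d, p) for d, p in zip(delays, probs) if d > t]
    total = sum(p for _, p in pairs)
    return sum((d - t) * p for d, p in pairs) / total

# Bounded (uniform) timing: waiting brings you steadily closer to the reward.
uniform_delays, uniform_probs = [2, 4, 6, 8, 10], [0.2] * 5
# Crude heavy tail: most rewards come fast, a few come very late.
heavy_delays, heavy_probs = [1, 2, 3, 40, 80], [0.4, 0.3, 0.2, 0.05, 0.05]

u0 = expected_remaining_wait(uniform_delays, uniform_probs, 0)   # 6.0
u5 = expected_remaining_wait(uniform_delays, uniform_probs, 5)   # 3.0
h0 = expected_remaining_wait(heavy_delays, heavy_probs, 0)       # 7.6
h5 = expected_remaining_wait(heavy_delays, heavy_probs, 5)       # 55.0
```

In the first environment persistence pays; in the second, quitting after a short wait is adaptive, which is the kind of context-sensitive calibration the study observed.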
Affiliation(s)
- Joseph T. McGuire
- Department of Psychology, University of Pennsylvania, 3720 Walnut St., Philadelphia, PA 19104, USA
- Joseph W. Kable
- Department of Psychology, University of Pennsylvania, 3720 Walnut St., Philadelphia, PA 19104, USA
19
Guan S, Cheng L, Fan Y, Li X. Myopic decisions under negative emotions correlate with altered time perception. Front Psychol 2015; 6:468. [PMID: 25941508 PMCID: PMC4400848 DOI: 10.3389/fpsyg.2015.00468] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 01/13/2015] [Accepted: 03/31/2015] [Indexed: 11/13/2022] Open
Abstract
Previous studies have obtained inconsistent findings about emotional influence on inter-temporal choice (IC). In the present study, we first examined the effect on IC of temporary emotional priming induced by affective pictures in a trial-to-trial paradigm. The results showed that negative priming resulted in a much higher percentage of trials in which the smaller-but-sooner reward was chosen (SS%) than positive or neutral priming. Next, we attempted to explore the possible mechanisms underlying such emotional effects. When participants performed a time reproduction task, mean reaction times in the negative priming condition were significantly shorter than those in the other two emotional contexts, indicating that negative emotional priming led to overestimation of time. Moreover, such overestimation was negatively correlated with performance in the IC task. In contrast, temporary changes of emotional context did not alter performance in a Go/NoGo task (commission and omission errors). In sum, our findings suggest that myopic decisions under negative emotions are associated with altered time perception but not response inhibition.
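One way to formalize the suggested link (a sketch under our own assumptions: the hyperbolic form, the discount rate, the amounts, and the distortion factor are all illustrative, not the study's estimates) is to discount over perceived rather than objective delay.

```python
def subjective_value(amount, delay, k=0.02, time_scale=1.0):
    """Hyperbolic value: amount / (1 + k * perceived delay)."""
    return amount / (1.0 + k * time_scale * delay)

ss = 60.0                      # smaller-but-sooner reward, available now
ll, ll_delay = 100.0, 30.0     # larger-later reward

neutral = subjective_value(ll, ll_delay)                   # ~62.5: LL preferred
negative = subjective_value(ll, ll_delay, time_scale=2.0)  # ~45.5: SS preferred
```

If negative emotion inflates perceived duration (time_scale > 1), the same objective delay is discounted more steeply and choice flips toward the smaller-but-sooner option, with no change in inhibitory processes required.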
Affiliation(s)
- Shuchen Guan
- Key Laboratory of Brain Functional Genomics, Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Lu Cheng
- Key Laboratory of Brain Functional Genomics, Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Ying Fan
- Key Laboratory of Brain Functional Genomics, Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Xianchun Li
- Key Laboratory of Brain Functional Genomics, Ministry of Education, Shanghai Key Laboratory of Brain Functional Genomics, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
20
Abstract
Humans exhibit a suite of biases when making economic decisions. We review recent research on the origins of human decision making by examining whether similar choice biases are seen in nonhuman primates, our closest phylogenetic relatives. We propose that comparative studies can provide insight into four major questions about the nature of human choice biases that cannot be addressed by studies of our species alone. First, research with other primates can address the evolution of human choice biases and identify shared versus human-unique tendencies in decision making. Second, primate studies can constrain hypotheses about the psychological mechanisms underlying such biases. Third, comparisons of closely related species can identify when distinct mechanisms underlie related biases by examining evolutionary dissociations in choice strategies. Finally, comparative work can provide insight into the biological rationality of economically irrational preferences.
Affiliation(s)
- Laurie R Santos
- Department of Psychology, Yale University, New Haven, Connecticut 06511, USA
21
Voon V. Models of impulsivity with a focus on waiting impulsivity: Translational potential for neuropsychiatric disorders. Curr Addict Rep 2014; 1:281-288. [PMID: 25346881 PMCID: PMC4201744 DOI: 10.1007/s40429-014-0036-5] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 11/28/2022]
Abstract
Waiting impulsivity, also known as premature or anticipatory responding, is well established in preclinical studies through the 5-Choice Serial Reaction Time (5-CSRT) task, and it is important in disorders of addiction: preclinical studies suggest it acts both as a predictor and as a consequence of such disorders. Here we discuss the relationship between the preclinical 5-CSRT task and newly developed translational tasks, and we review the preclinical and clinical literature relevant to premature responding and disorders of addiction. Identifying which processes are critical to premature responding is central to understanding its nature; premature responding may also overlap with motivational processes, proactive response inhibition, tonic inhibitory processes, and delay discounting.
Affiliation(s)
- Valerie Voon
- Department of Psychiatry, University of Cambridge, Cambridge CB2 0QQ, UK; Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, UK; Cambridgeshire and Peterborough NHS Foundation Trust, Cambridge, UK
22
Abstract
Humans typically discount future gains more than losses. This phenomenon is referred to as the "sign effect" in experimental and behavioral economics. Although recent studies have reported associations between the sign effect and important social problems, such as obesity and incurring multiple debts, the biological basis for this phenomenon remains poorly understood. Here, we hypothesized that enhanced loss-related neural processing in magnitude and/or delay representation are causes of the sign effect. We examined participants performing intertemporal choice tasks involving future gains or losses and compared the brain activity of those who exhibited the sign effect and those who did not. When predicting future losses, significant differences were apparent between the two participant groups in terms of striatal activity representing delay length and in insular activity representing sensitivity to magnitude. Furthermore, participants with the sign effect exhibited a greater insular response to the magnitude of loss than to that of gain, and also a greater striatal response to the delay of loss than to that of gain. These findings may provide a new biological perspective for the development of novel treatments and preventive measures for social problems associated with the sign effect.
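The sign effect itself is easy to state numerically (a minimal sketch: the hyperbolic form and the rate values are our illustrative assumptions, not the study's estimates).

```python
def discounted(amount, delay, k):
    """Hyperbolic discounting: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)

k_gain, k_loss = 0.10, 0.03        # steeper discounting for gains than losses
gain_later = discounted(100.0, 12.0, k_gain)     # ~45.5 of the gain retained
loss_later = discounted(-100.0, 12.0, k_loss)    # ~-73.5 of the loss retained

# The delayed loss keeps more of its (negative) weight than the delayed gain
# keeps of its positive weight: the "sign effect".
```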
23
Namboodiri VMK, Mihalas S, Marton TM, Hussain Shuler MG. A general theory of intertemporal decision-making and the perception of time. Front Behav Neurosci 2014; 8:61. [PMID: 24616677 PMCID: PMC3937698 DOI: 10.3389/fnbeh.2014.00061] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 11/20/2013] [Accepted: 02/12/2014] [Indexed: 11/25/2022] Open
Abstract
Animals and humans make decisions based on their expected outcomes. Since relevant outcomes are often delayed, perceiving delays and choosing between earlier vs. later rewards (intertemporal decision-making) is an essential component of animal behavior. The myriad observations made in experiments studying intertemporal decision-making and time perception have not yet been rationalized within a single theory. Here we present a theory—Training-Integrated Maximized Estimation of Reinforcement Rate (TIMERR)—that explains a wide variety of behavioral observations made in intertemporal decision-making and the perception of time. Our theory postulates that animals make intertemporal choices to optimize expected reward rates over a limited temporal window which includes a past integration interval—over which experienced reward rate is estimated—as well as the expected delay to future reward. Using this theory, we derive mathematical expressions for both the subjective value of a delayed reward and the subjective representation of the delay. A unique contribution of our work is in finding that the past integration interval directly determines the steepness of temporal discounting and the non-linearity of time perception. In so doing, our theory provides a single framework to understand both intertemporal decision-making and time perception.
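A sketch of the kind of expression the theory yields (our reconstruction from the abstract's description, so the symbols and the derivation are assumptions): equating the reward rate achieved with the delayed option to the rate achieved with an immediate equivalent gives a hyperbolic-looking value in which the past integration interval T sets the steepness of discounting.

```python
def timerr_value(r, t, T, a=0.0):
    """Immediate-equivalent value of reward r delayed by t.

    Obtained by equating the rate with the delayed option, (a*T + r) / (T + t),
    to the rate with an immediate equivalent x, (a*T + x) / T, and solving:
    x = (r - a*t) / (1 + t/T). With a = 0 this is hyperbolic, r / (1 + t/T),
    so 1/T plays the role of the discount rate k: the past integration
    interval directly sets the steepness of discounting.
    """
    return (r - a * t) / (1.0 + t / T)

# A shorter past integration window means steeper discounting:
v_short_window = timerr_value(r=10.0, t=5.0, T=2.0)     # ~2.86
v_long_window = timerr_value(r=10.0, t=5.0, T=50.0)     # ~9.09
```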
Affiliation(s)
- Tanya M Marton
- Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA
24
Setogawa T, Mizuhiki T, Matsumoto N, Akizawa F, Shidara M. Self-choice enhances value in reward-seeking in primates. Neurosci Res 2014; 80:45-54. [PMID: 24463226 DOI: 10.1016/j.neures.2014.01.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/05/2013] [Revised: 12/30/2013] [Accepted: 01/08/2014] [Indexed: 10/25/2022]
Abstract
When an individual chooses one item from two or more alternatives, they compare the values of the expected outcomes. The outcome value can be determined by the associated reward amount, the probability of reward, and the workload required to earn the reward. Rational choice theory states that choices are made to maximize rewards over time, and that the same outcome values lead to an equal likelihood of choices. However, the theory does not distinguish between conditions with the same reward value, even when acquired under different circumstances, and does not always accurately describe real behavior. We have found that allowing a monkey to choose a reward schedule endows the schedule with extra value when compared to performance in an identical schedule that is chosen by another agent (a computer here). This behavior is not consistent with pure rational choice theory. Theoretical analysis using a modified temporal-difference learning model showed an enhanced schedule state value by self-choice. These results suggest that an increased reward value underlies the improved performances by self-choice during reward-seeking behavior.
Affiliation(s)
- Tsuyoshi Setogawa
- Doctoral Program in Kansei, Behavioral and Brain Science, Graduate School of Comprehensive Human Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan; Japan Society for the Promotion of Science, Japan
- Takashi Mizuhiki
- Doctoral Program in Kansei, Behavioral and Brain Science, Graduate School of Comprehensive Human Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan; Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Narihisa Matsumoto
- Human Technology Research Institute, AIST, 1-1-1 Umezono, Tsukuba, Ibaraki 305-8568, Japan
- Fumika Akizawa
- Doctoral Program in Kansei, Behavioral and Brain Science, Graduate School of Comprehensive Human Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Munetaka Shidara
- Doctoral Program in Kansei, Behavioral and Brain Science, Graduate School of Comprehensive Human Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan; Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
25
Kimura K, Izawa S, Sugaya N, Ogawa N, Yamada KC, Shirotsuki K, Mikami I, Hirata K, Nagano Y, Hasegawa T. The biological effects of acute psychosocial stress on delay discounting. Psychoneuroendocrinology 2013; 38:2300-8. [PMID: 23768971 DOI: 10.1016/j.psyneuen.2013.04.019] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 11/21/2012] [Revised: 03/25/2013] [Accepted: 04/29/2013] [Indexed: 01/01/2023]
Abstract
Organisms prefer to receive rewards sooner rather than later because they discount the subjective value of future rewards, a phenomenon called delay discounting. Recent studies have reported an association between cortisol, which is secreted by the hypothalamic-pituitary-adrenal (HPA) axis, and delay discounting. However, no study has examined whether acutely induced psychosocial stress modulates delay discounting. Thus, the present study examined the effect of acute psychosocial stress and its hormonal and inflammatory correlates on the rate of delay discounting. To this end, we assessed participants' discounting rates using a questionnaire version of an intertemporal choice task before and after an acute psychosocial stress task (the Trier Social Stress Test; TSST). The results demonstrated that the TSST increased rates of delay discounting only in cortisol responders (not in non-responders), indicating a possible influence of the pathway from the HPA axis to the dopaminergic systems under acute stress. Furthermore, correlation analysis indicated a U-shaped relationship between baseline C-reactive protein level and delay discounting rate, suggesting a complex relationship between inflammatory markers and delay discounting.
Affiliation(s)
- Kenta Kimura
- Center for Applied Psychological Science (CAPS), Kwansei Gakuin University, Hyogo, Japan.
26
Postreward delays and systematic biases in measures of animal temporal discounting. Proc Natl Acad Sci U S A 2013; 110:15491-6. [PMID: 24003113 DOI: 10.1073/pnas.1310446110] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Indexed: 11/18/2022] Open
Abstract
Intertemporal choice tasks, which pit smaller/sooner rewards against larger/later ones, are frequently used to study time preferences and, by extension, impulsivity and self-control. When used in animals, many trials are strung together in sequence and an adjusting buffer is added after the smaller/sooner option to hold the total duration of each trial constant. Choices of the smaller/sooner option are not reward maximizing and so are taken to indicate that the animal is discounting future rewards. However, if animals fail to correctly factor in the duration of the postreward buffers, putative discounting behavior may instead reflect constrained reward maximization. Here, we report three results consistent with this discounting-free hypothesis. We find that (i) monkeys are insensitive to the association between the duration of postreward delays and their choices; (ii) they are sensitive to the length of postreward delays, although they greatly underestimate them; and (iii) increasing the salience of the postreward delay biases monkeys toward the larger/later option, reducing measured discounting rates. These results are incompatible with standard discounting-based accounts but are compatible with an alternative heuristic model. Our data suggest that measured intertemporal preferences in animals may not reflect impulsivity, or even mental discounting of future options, and that standard human and animal intertemporal choice tasks measure unrelated mental processes.
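The discounting-free hypothesis can be illustrated with a toy reward-rate calculation (the amounts and delays are ours, not the study's schedules).

```python
def reward_rate(amount, total_time):
    return amount / total_time

ss_amount, ss_delay = 1.0, 2.0     # smaller/sooner option
ll_amount, ll_delay = 3.0, 10.0    # larger/later option
trial_length = 12.0                # post-reward buffer pads every trial to this

# With the buffer factored in, trials are equally long, so LL maximizes reward:
ss_full = reward_rate(ss_amount, trial_length)
ll_full = reward_rate(ll_amount, trial_length)

# Ignoring the post-reward buffer makes SS look better, mimicking steep
# "discounting" without any devaluation of delayed reward:
ss_naive = reward_rate(ss_amount, ss_delay)
ll_naive = reward_rate(ll_amount, ll_delay)
```

An animal that underestimates or ignores the buffer thus chooses like a steep discounter while actually performing constrained rate maximization.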
27
Bixter MT, Luhmann CC. Adaptive intertemporal preferences in foraging-style environments. Front Neurosci 2013; 7:93. [PMID: 23785308 PMCID: PMC3683629 DOI: 10.3389/fnins.2013.00093] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/01/2012] [Accepted: 05/16/2013] [Indexed: 12/05/2022] Open
Abstract
Decision makers often face choices between smaller more immediate rewards and larger more delayed rewards. For example, when foraging for food, animals must choose between actions that have varying costs (e.g., effort, duration, energy expenditure) and varying benefits (e.g., amount of food intake). The combination of these costs and benefits determine what optimal behavior is. In the present study, we employ a foraging-style task to study how humans make reward-based choices in response to the real-time constraints of a dynamic environment. On each trial participants were presented with two rewards that differed in magnitude and in the delay until their receipt. Because the experiment was of a fixed duration, maximizing earnings required decision makers to determine how to trade off the magnitude and the delay associated with the two rewards on each trial. To evaluate the extent to which participants could adapt to the decision environment, specific task characteristics were manipulated, including reward magnitudes (Experiment 1) and the delay between trials (Experiment 2). Each of these manipulations was designed to alter the pattern of choices made by an optimal decision maker. Several findings are of note. First, different choice strategies were observed with the manipulated environmental constraints. Second, despite contextually-appropriate shifts in behavior between conditions in each experiment, choice patterns deviated from theoretical optimality. In particular, the delays associated with the rewards did not exert a consistent influence on choices as required by exponential discounting. Third, decision makers nevertheless performed surprisingly well in all task environments with any deviations from strict optimality not having particularly deleterious effects on earnings. Taken together, these results suggest that human decision makers are capable of exhibiting intertemporal preferences that reflect a variety of environmental constraints.
Affiliation(s)
- Michael T. Bixter
- Department of Psychology, Stony Brook University, Stony Brook, NY, USA
28
Paglieri F. The costs of delay: Waiting versus postponing in intertemporal choice. J Exp Anal Behav 2013; 99:362-77. [DOI: 10.1002/jeab.18] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 08/28/2012] [Accepted: 01/09/2013] [Indexed: 11/07/2022]
Affiliation(s)
- Fabio Paglieri
- Istituto di Scienze e Tecnologie della Cognizione, CNR, Rome, Italy
29
Abstract
Suppose that the purpose of a movement is to place the body in a more rewarding state. In this framework, slower movements may increase accuracy and therefore improve the probability of acquiring reward, but the longer durations of slow movements produce devaluation of reward. Here we hypothesize that the brain decides the vigor of a movement (duration and velocity) based on the expected discounted reward associated with that movement. We begin by showing that durations of saccades of varying amplitude can be accurately predicted by a model in which motor commands maximize expected discounted reward. This result suggests that reward is temporally discounted even in timescales of tens of milliseconds. One interpretation of temporal discounting is that the true objective of the brain is to maximize the rate of reward, which is equivalent to a specific form of hyperbolic discounting. A consequence of this idea is that the vigor of saccades should change as one alters the intertrial intervals between movements. We find experimentally that in healthy humans, as intertrial intervals are varied, saccade peak velocities and durations change on a trial-by-trial basis precisely as predicted by a model in which the objective is to maximize the rate of reward. Our results are inconsistent with theories in which reward is discounted exponentially. We suggest that there exists a single cost, rate of reward, which provides a unifying principle that may govern control of movements in timescales of milliseconds, as well as decision making in timescales of seconds to years.
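The core prediction can be sketched numerically (the speed-accuracy model and all parameters are a toy of ours, not the authors' saccade model): if the objective is rate of reward, a movement of duration t is worth p(t) * R / (T_iti + t), where p(t) is success probability and T_iti the intertrial interval; note that R / (T_iti + t) is hyperbolic in t.

```python
import math

def optimal_duration(t_iti, reward=1.0):
    """Duration maximizing reward rate under a toy speed-accuracy tradeoff."""
    def rate(t):
        p = 1.0 - math.exp(-t)          # slower movements are more accurate
        return p * reward / (t_iti + t)
    grid = [i * 0.01 for i in range(1, 801)]
    return max(grid, key=rate)

# Longer intertrial intervals predict longer (less vigorous) optimal movements:
short_iti = optimal_duration(t_iti=2.0)     # ~1.5
long_iti = optimal_duration(t_iti=10.0)     # ~2.6
```

This reproduces the qualitative trial-by-trial pattern the abstract reports: vigor tracks the intertrial interval, as rate maximization requires.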
30
Talmi D, Pine A. How costs influence decision values for mixed outcomes. Front Neurosci 2012; 6:146. [PMID: 23112758 PMCID: PMC3481112 DOI: 10.3389/fnins.2012.00146] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 07/17/2012] [Accepted: 09/14/2012] [Indexed: 11/30/2022] Open
Abstract
The things that we hold dearest often require a sacrifice, as epitomized in the maxim “no pain, no gain.” But how is the subjective value of outcomes established when they consist of mixtures of costs and benefits? We describe theoretical models for the integration of costs and benefits into a single value, drawing on both the economic and the empirical literatures, with the goal of rendering them accessible to the neuroscience community. We propose two key assays that go beyond goodness of fit for deciding between the dominant additive model and four varieties of interactive models. First, how they model decisions between costs when reward is not on offer; and second, whether they predict changes in reward sensitivity when costs are added to outcomes, and in what direction. We provide a selective review of relevant neurobiological work from a computational perspective, focusing on those studies that illuminate the underlying valuation mechanisms. Cognitive neuroscience has great potential to decide which of the theoretical models is actually employed by our brains, but empirical work has yet to fully embrace this challenge. We hope that future research improves our understanding of how our brain decides whether mixed outcomes are worthwhile.
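The second assay the authors propose, whether adding costs changes reward sensitivity, cleanly separates a simple additive model from a multiplicative interactive variant (both functional forms here are illustrative choices of ours, not the review's specific models).

```python
def additive_value(benefit, cost, w=1.0):
    return benefit - w * cost

def interactive_value(benefit, cost, w=0.1):
    return benefit * (1.0 - w * cost)    # one multiplicative variant

def benefit_slope(value_fn, cost, benefit=10.0, eps=1e-6):
    """Sensitivity of value to benefit (numerical slope) at a given cost."""
    return (value_fn(benefit + eps, cost) - value_fn(benefit, cost)) / eps

# Adding a cost leaves reward sensitivity unchanged in the additive model
# but halves it in this interactive one:
add_free = benefit_slope(additive_value, 0.0)
add_costly = benefit_slope(additive_value, 5.0)
mul_free = benefit_slope(interactive_value, 0.0)
mul_costly = benefit_slope(interactive_value, 5.0)
```

A goodness-of-fit comparison alone could miss this; probing the slope in benefit with and without a cost directly tests the models' diverging predictions.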
31
Rigoux L, Guigon E. A model of reward- and effort-based optimal decision making and motor control. PLoS Comput Biol 2012; 8:e1002716. [PMID: 23055916 PMCID: PMC3464194 DOI: 10.1371/journal.pcbi.1002716] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Received: 01/20/2012] [Accepted: 08/10/2012] [Indexed: 11/19/2022] Open
Abstract
Costs (e.g. energetic expenditure) and benefits (e.g. food) are central determinants of behavior. In ecology and economics, they are combined to form a utility function which is maximized to guide choices. This principle is widely used in neuroscience as a normative model of decision and action, but current versions of this model fail to consider how decisions are actually converted into actions (i.e. the formation of trajectories). Here, we describe an approach where decision making and motor control are optimal, iterative processes derived from the maximization of the discounted, weighted difference between expected rewards and foreseeable motor efforts. The model accounts for decision making in cost/benefit situations, and detailed characteristics of control and goal tracking in realistic motor tasks. As a normative construction, the model is relevant to address the neural bases and pathological aspects of decision making and motor control.
Collapse
Affiliation(s)
- Lionel Rigoux
- UPMC Univ Paris 06, UMR 7222, ISIR, Paris, France
- CNRS, UMR 7222, ISIR, Paris, France
- Emmanuel Guigon
- UPMC Univ Paris 06, UMR 7222, ISIR, Paris, France
- CNRS, UMR 7222, ISIR, Paris, France
32
The temporal derivative of expected utility: a neural mechanism for dynamic decision-making. Neuroimage 2012; 65:223-30. [PMID: 22963852 DOI: 10.1016/j.neuroimage.2012.08.063] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/12/2012] [Revised: 07/28/2012] [Accepted: 08/21/2012] [Indexed: 11/20/2022] Open
Abstract
Real world tasks involving moving targets, such as driving a vehicle, are performed based on continuous decisions thought to depend upon the temporal derivative of the expected utility (∂V/∂t), where the expected utility (V) is the effective value of a future reward. However, the neural mechanisms that underlie dynamic decision-making are not well understood. This study investigates human neural correlates of both V and ∂V/∂t using fMRI and a novel experimental paradigm based on a pursuit-evasion game optimized to isolate components of dynamic decision processes. Our behavioral data show that players of the pursuit-evasion game adopt an exponential discounting function, supporting expected utility theory. The continuous functions of V and ∂V/∂t were derived from the behavioral data and applied as regressors in fMRI analysis, enabling temporal resolution that exceeded the sampling rate of image acquisition (hyper-temporal resolution) by taking advantage of numerous trials that provided rich and independent manipulation of those variables. V and ∂V/∂t were each associated with distinct neural activity. Specifically, ∂V/∂t was associated with anterior and posterior cingulate cortices, superior parietal lobule, and ventral pallidum, whereas V was primarily associated with supplementary motor, pre- and postcentral gyri, cerebellum, and thalamus. The association between ∂V/∂t and brain regions previously related to decision-making is consistent with a primary role of the temporal derivative of expected utility in dynamic decision-making.
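Under the exponential discounting the players adopted, V and ∂V/∂t are distinct but lawfully related regressors; a minimal numerical sketch (parameters are ours, not fitted values from the study):

```python
import math

def expected_utility(t, t_reward=10.0, R=1.0, k=0.3):
    """V(t) = R * exp(-k * (t_reward - t)): utility rises as the reward nears."""
    return R * math.exp(-k * (t_reward - t))

def utility_derivative(t, dt=1e-5, **kw):
    """Central-difference estimate of dV/dt."""
    return (expected_utility(t + dt, **kw) - expected_utility(t - dt, **kw)) / (2 * dt)

v = expected_utility(5.0)
dv = utility_derivative(5.0)
# Under exponential discounting dV/dt = k * V: correlated but separable
# signals, which is what allowed the fMRI analysis to dissociate them.
```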
33
Demoto Y, Okada G, Okamoto Y, Kunisato Y, Aoyama S, Onoda K, Munakata A, Nomura M, Tanaka SC, Schweighofer N, Doya K, Yamawaki S. Neural and personality correlates of individual differences related to the effects of acute tryptophan depletion on future reward evaluation. Neuropsychobiology 2012; 65:55-64. [PMID: 22222380 DOI: 10.1159/000328990] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 08/26/2010] [Accepted: 04/27/2011] [Indexed: 11/19/2022]
Abstract
Background/Aims: In general, humans tend to discount the value of delayed reward. An increase in the rate of discounting leads to an inability to select a delayed reward over a smaller immediate reward (reward-delay impulsivity). Although deficits in the serotonergic system are implicated in this reward-delay impulsivity, there is individual variation in response to serotonin depletion. The aim of the present study was to investigate whether the effects of serotonin depletion on the ability to evaluate future reward are affected by individual personality traits or brain activation.
Methods: Personality traits were assessed using the NEO-Five Factor Inventory and the Temperament and Character Inventory. The central serotonergic levels of 16 healthy volunteers were manipulated by dietary tryptophan depletion. Subjects performed a delayed reward choice task that required the continuous estimation of reward value during functional magnetic resonance imaging scanning.
Results: In response to tryptophan depletion, discounting rates increased in 9 participants but were unchanged or decreased in 7 participants. Participants whose discounting rate was increased by tryptophan depletion had significantly higher neuroticism and lower self-directedness. Furthermore, tryptophan depletion differentially affected the groups in terms of hemodynamic responses to the value of predicted future reward in the right insula.
Conclusions: These results suggest that individuals with high neuroticism and low self-directedness are particularly vulnerable to the effect of low serotonin on future reward evaluation, accompanied by altered brain activation patterns.
Affiliation(s)
- Yoshihiko Demoto
- Division of Frontier Medical Science, Department of Psychiatry and Neurosciences, Graduate School of Biomedical Sciences, Hiroshima University, Higashi-Hiroshima, Japan
34
Abstract
Although delay discounting, the attenuation of the value of future rewards, is a robust finding, the mechanism of discounting is not known. We propose a potential mechanism for delay discounting such that discounting emerges from a search process that is trying to determine what rewards will be available in the future. In this theory, the delay dependence of the discounting of future expected rewards arises from three assumptions. First, that the evaluation of outcomes involves a search process. Second, that the value is assigned to an outcome proportionally to how easy it is to find. Third, that outcomes that are less delayed are typically easier for the search process to find. By relaxing this third assumption (e.g. by assuming that episodically-cued outcomes are easier to find), our model suggests that it is possible to dissociate discounting from delay. Our theory thereby explains the empirical result that discounting is slower to episodically-imagined outcomes, because these outcomes are easier for the search process to find. Additionally, the theory explains why improving cognitive resources such as working memory slows discounting, by improving searches and thereby making rewards easier to find. The three assumptions outlined here are likely to be instantiated during deliberative decision-making, but are unlikely in habitual decision-making. We model two simple implementations of this theory and show that they unify empirical results about the role of cognitive function in delay discounting, and make new neural, behavioral, and pharmacological predictions.
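The three assumptions can be simulated directly (a toy implementation of ours, not the authors' models; all parameters are illustrative): value tracks how often a search finds the outcome within a budget, nearer outcomes are easier to find, and a cue that makes a delayed outcome easier to find raises its value without changing its delay.

```python
import random

def found_fraction(delay, ease_boost=0.0, budget=5, trials=4000, seed=1):
    """Fraction of searches that locate the outcome within the search budget.
    Per-step find probability falls with delay; a cue adds ease_boost."""
    rng = random.Random(seed)
    p_find = min(1.0, 1.0 / (1.0 + 0.5 * delay) + ease_boost)
    hits = sum(
        1 for _ in range(trials)
        if any(rng.random() < p_find for _ in range(budget))
    )
    return hits / trials

v_soon = found_fraction(delay=1)                        # easy to find: high value
v_late = found_fraction(delay=40)                       # hard to find: discounted
v_late_cued = found_fraction(delay=40, ease_boost=0.2)  # cue partly restores value
```

Discounting emerges without any explicit discount function, and an "episodic cue" (here, ease_boost) dissociates value from delay, as the theory predicts; enlarging the search budget plays the role of improved working memory.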
Affiliation(s)
- Zeb Kurth-Nelson
- Wellcome Trust Centre for Neuroimaging, University College London, London, UK
|
35
|
Ahn WY, Rass O, Fridberg DJ, Bishara AJ, Forsyth JK, Breier A, Busemeyer JR, Hetrick WP, Bolbecker AR, O'Donnell BF. Temporal discounting of rewards in patients with bipolar disorder and schizophrenia. J Abnorm Psychol 2011; 120:911-21. [PMID: 21875166 DOI: 10.1037/a0023333] [Citation(s) in RCA: 120] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Patients with bipolar disorder (BD) and schizophrenia (SZ) often show decision-making deficits in everyday circumstances. A failure to appropriately weigh immediate versus future consequences of choices may contribute to these deficits. We used the delay discounting task in individuals with BD or SZ to investigate their temporal decision making. Twenty-two individuals with BD, 21 individuals with SZ, and 30 healthy individuals completed the delay discounting task along with neuropsychological measures of working memory and cognitive function. Both BD and SZ groups discounted delayed rewards more steeply than did the healthy group even after controlling for current substance use, age, gender, and employment. Hierarchical multiple regression analyses showed that discounting rate was associated with both diagnostic group and working memory or intelligence scores. In each group, working memory or intelligence scores negatively correlated with discounting rate. The results suggest that (a) both BD and SZ groups value smaller, immediate rewards more than larger, delayed rewards compared with the healthy group and (b) working memory or intelligence is related to temporal decision making in individuals with BD or SZ as well as in healthy individuals.
Affiliation(s)
- Woo-Young Ahn
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
|
36
|
Onoda K, Okamoto Y, Kunisato Y, Aoyama S, Shishida K, Okada G, Tanaka SC, Schweighofer N, Yamaguchi S, Doya K, Yamawaki S. Inter-individual discount factor differences in reward prediction are topographically associated with caudate activation. Exp Brain Res 2011; 212:593-601. [PMID: 21695536 DOI: 10.1007/s00221-011-2771-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2011] [Accepted: 06/12/2011] [Indexed: 11/25/2022]
Abstract
In general, humans tend to devalue a delayed reward. Such delay discounting is a theoretical and computational concept in which the discount factor influences the time scale of the trade-off between delay of reward and amount of reward. The discount factor relies on the individual's ability to evaluate the future reward. Using functional magnetic resonance imaging, we investigated brain mechanisms for reward valuation at different individual discount factors in a delayed reward choice task. In the task, participants were required to select small/immediate or large/delayed rewards to maximize the total reward over time. The discount factor for each participant was calculated individually from the behavioral data based on an exponential discounting model. The estimated value of a future reward increases as the expected delivery approaches, so the time course of these estimated values was computed based on each individual's discount factor; each was entered into the regression analysis as an explanatory (independent) variable. After the region of interest was narrowed anatomically to the caudate, a peak coordinate was detected in each individual. A correlation analysis revealed that the location of the peak along the dorsal-ventral axis in the right caudate was positively correlated with the discount factor. This implies that individuals who showed a larger discount factor had peak activations in a more dorsal part of the right caudate associated with future reward prediction. This evidence also suggests that a higher ability to delay reward prediction might be related to activation of the more dorsal caudate.
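The exponential model described in this abstract — a future reward's estimated value rising as its delivery approaches — can be sketched in a few lines. The amounts, delay, and discount factor below are illustrative values, not the fitted parameters from the study.

```python
def discounted_value(amount, delay_remaining, gamma):
    """Exponentially discounted value of a reward that is still
    `delay_remaining` time units away: V = amount * gamma**delay_remaining."""
    return amount * gamma ** delay_remaining

def value_time_course(amount, total_delay, gamma, step=1.0):
    """Estimated value at each moment from choice (t = 0) until delivery."""
    t, course = 0.0, []
    while t <= total_delay:
        course.append(discounted_value(amount, total_delay - t, gamma))
        t += step
    return course

# Illustrative parameters: a reward of 100 delivered after 4 time units,
# discount factor gamma = 0.8 per unit time.
course = value_time_course(amount=100.0, total_delay=4.0, gamma=0.8)
# The value rises monotonically toward the full amount at delivery,
# which is the regressor shape entered into the fMRI analysis.
```

A steeper (smaller) gamma compresses this curve toward delivery, which is how individual discount factors produce individually shaped regressors.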
Affiliation(s)
- Keiichi Onoda
- Department of Neurology, Shimane University, Shimane, Japan
|
37
|
Pearson JM, Hayden BY, Platt ML. Explicit information reduces discounting behavior in monkeys. Front Psychol 2010; 1:237. [PMID: 21833291 PMCID: PMC3153841 DOI: 10.3389/fpsyg.2010.00237] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2010] [Accepted: 12/14/2010] [Indexed: 11/13/2022] Open
Abstract
Animals are notoriously impulsive in common laboratory experiments, preferring smaller, sooner rewards to larger, delayed rewards even when this reduces average reward rates. By contrast, the same animals often engage in natural behaviors that require extreme patience, such as food caching, stalking prey, and traveling long distances to high-quality food sites. One possible explanation for this discrepancy is that standard laboratory delay discounting tasks artificially inflate impulsivity by subverting animals' common learning strategies. To test this idea, we examined choices made by rhesus macaques in two variants of a standard delay discounting task. In the conventional variant, post-reward delays were uncued and adjusted to render total trial length constant; in the second, all delays were cued explicitly. We found that measured discounting was significantly reduced in the cued task, with discount parameters well below those reported in studies using the standard uncued design. When monkeys had complete information, their decisions were more consistent with a strategy of reward rate maximization. These results indicate that monkeys, and perhaps other animals, are more patient than is normally assumed, and that laboratory measures of delay discounting may overstate impulsivity.
Affiliation(s)
- John M. Pearson
- Department of Neurobiology, Duke University School of Medicine and Center for Cognitive Neuroscience, Durham, NC, USA
- Benjamin Y. Hayden
- Department of Neurobiology, Duke University School of Medicine and Center for Cognitive Neuroscience, Durham, NC, USA
- Michael L. Platt
- Department of Neurobiology, Duke University School of Medicine and Center for Cognitive Neuroscience, Durham, NC, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC, USA
|
38
|
Abstract
Decision making consists of choosing among available options on the basis of a valuation of their potential costs and benefits. Most theoretical models of decision making in behavioral economics, psychology, and computer science propose that the desirability of outcomes expected from alternative options can be quantified by utility functions. These utility functions allow a decision maker to assign subjective values to each option under consideration by weighting the likely benefits and costs resulting from an action and to select the one with the highest subjective value. Here, we used model-based neuroimaging to test whether the human brain uses separate valuation systems for rewards (erotic stimuli) associated with different types of costs, namely, delay and effort. We show that humans devalue rewards associated with physical effort in a strikingly similar fashion to rewards associated with delays, and that a single computational model derived from economics theory can account for the behavior observed in both delay discounting and effort discounting. However, our neuroimaging data reveal that the human brain uses distinct valuation subsystems for different types of costs, representing, in opposite fashion, delayed rewards and future energetic expenses. The ventral striatum and the ventromedial prefrontal cortex represent the increasing subjective value of delayed rewards, whereas a distinct network, composed of the anterior cingulate cortex and the anterior insula, represents the decreasing value of the effortful option, coding the expected expense of energy. Together, these data demonstrate that the valuation processes underlying different types of costs can be fractionated at the cerebral level.
|
39
|
Nakahara H, Kaveri S. Internal-time temporal difference model for neural value-based decision making. Neural Comput 2010; 22:3062-106. [PMID: 20858126 DOI: 10.1162/neco_a_00049] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
The temporal difference (TD) learning framework is a major paradigm for understanding value-based decision making and related neural activities (e.g., dopamine activity). The representation of time in neural processes modeled by a TD framework, however, is poorly understood. To address this issue, we propose a TD formulation that separates the time of the operator (neural valuation processes), which we refer to as internal time, from the time of the observer (experiment), which we refer to as conventional time. We provide the formulation and theoretical characteristics of this TD model based on internal time, called internal-time TD, and explore the possible consequences of the use of this model in neural value-based decision making. Due to the separation of the two times, internal-time TD computations, such as TD error, are expressed differently, depending on both the time frame and time unit. We examine this operator-observer problem in relation to the time representation used in previous TD models. An internal-time TD value function exhibits the co-appearance of exponential and hyperbolic discounting at different delays in intertemporal choice tasks. We further examine the effects of internal-time noise on TD error, the dynamic construction of internal time, and the modulation of internal time with the internal-time hypothesis of serotonin function. We also relate the internal TD formulation to research on interval timing and subjective time.
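The co-appearance of exponential and hyperbolic discounting can be illustrated with a toy sketch: if exponential discounting is applied in a logarithmically compressed internal time, the observer measures hyperbolic discounting in conventional time. The logarithmic mapping and the parameter values below are illustrative assumptions for the demonstration, not the paper's specific formulation.

```python
import math

def internal_time(t, k):
    # Illustrative assumption: internal time is a logarithmic
    # compression of conventional time t.
    return math.log(1.0 + k * t) / k

def discount_internal(t, k, rate):
    # Exponential discounting applied in internal time.
    return math.exp(-rate * internal_time(t, k))

def hyperbolic(t, k):
    # Standard hyperbolic discount function in conventional time.
    return 1.0 / (1.0 + k * t)

# When the exponential rate in internal time equals the compression
# parameter k, the observer-side curve is exactly hyperbolic:
for t in [0, 1, 5, 20, 100]:
    assert abs(discount_internal(t, k=0.5, rate=0.5) - hyperbolic(t, k=0.5)) < 1e-12
```

The point of the sketch is only that a discount rule that is exponential in one time frame can look hyperbolic in another, which is the operator-observer distinction the model formalizes.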
Affiliation(s)
- Hiroyuki Nakahara
- Laboratory for Integrated Theoretical Neuroscience, RIKEN Brain Science Institute, Wako, Saitama, 351-0198, Japan.
|
40
|
Gentili R, Han CE, Schweighofer N, Papaxanthis C. Motor learning without doing: trial-by-trial improvement in motor performance during mental training. J Neurophysiol 2010; 104:774-83. [PMID: 20538766 DOI: 10.1152/jn.00257.2010] [Citation(s) in RCA: 132] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Although there is converging experimental and clinical evidence suggesting that mental training with motor imagery can improve motor performance, it is unclear how humans can learn movements through mental training despite the lack of sensory feedback from the body and the environment. In the first experiment, we measured the trial-by-trial decrease in durations of executed movements (physical training group) and mentally simulated movements (motor-imagery training group), by means of training on a multiple-target arm-pointing task requiring high accuracy and speed. Movement durations were significantly lower in posttest compared with pretest after both physical and motor-imagery training. Although both the posttraining performance and the rate of learning were smaller in the motor-imagery training group than in the physical training group, the change in movement duration and the asymptotic movement duration after a hypothetical large number of trials were identical. The two control groups (eye-movement training and rest groups) did not show a change in movement duration. In the second experiment, additional kinematic analyses revealed that arm movements were straighter and faster both immediately and 24 h after physical and motor-imagery training. No such improvements were observed in the eye-movement training group. Our results suggest that the brain uses state estimation, provided by internal forward model predictions, to improve motor performance during mental training. Furthermore, our results suggest that mental practice can, at least in young healthy subjects and if given after a short bout of physical practice, be successfully substituted for physical practice to improve motor performance.
|
41
|
Kim BW, Kennedy DN, Lehár J, Lee MJ, Blood AJ, Lee S, Perlis RH, Smoller JW, Morris R, Fava M, Breiter HC. Recurrent, robust and scalable patterns underlie human approach and avoidance. PLoS One 2010; 5:e10613. [PMID: 20532247 PMCID: PMC2879576 DOI: 10.1371/journal.pone.0010613] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2010] [Accepted: 04/08/2010] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Approach and avoidance behavior provide a means for assessing the rewarding or aversive value of stimuli, and can be quantified by a keypress procedure whereby subjects work to increase (approach), decrease (avoid), or do nothing about time of exposure to a rewarding/aversive stimulus. To investigate whether approach/avoidance behavior might be governed by quantitative principles that meet engineering criteria for lawfulness and that encode known features of reward/aversion function, we evaluated whether keypress responses toward pictures with potential motivational value produced any regular patterns, such as a trade-off between approach and avoidance, or recurrent lawful patterns as observed with prospect theory. METHODOLOGY/PRINCIPAL FINDINGS Three sets of experiments employed this task with beautiful face images, a standardized set of affective photographs, and pictures of food during controlled states of hunger and satiety. An iterative modeling approach to data identified multiple law-like patterns, based on variables grounded in the individual. These patterns were consistent across stimulus types, robust to noise, describable by a simple power law, and scalable between individuals and groups. Patterns included: (i) a preference trade-off counterbalancing approach and avoidance, (ii) a value function linking preference intensity to uncertainty about preference, and (iii) a saturation function linking preference intensity to its standard deviation, thereby setting limits to both. CONCLUSIONS/SIGNIFICANCE These law-like patterns were compatible with critical features of prospect theory, the matching law, and alliesthesia. Furthermore, they appeared consistent with both mean-variance and expected utility approaches to the assessment of risk. Ordering of responses across categories of stimuli demonstrated three properties thought to be relevant for preference-based choice, suggesting these patterns might be grouped together as a relative preference theory. Since variables in these patterns have been associated with reward circuitry structure and function, they may provide a method for quantitative phenotyping of normative and pathological function (e.g., psychiatric illness).
Affiliation(s)
- Byoung Woo Kim
- Motivation and Emotion Neuroscience Collaboration (MENC), Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Laboratory of Neuroimaging and Genetics, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- David N. Kennedy
- Center for Morphometric Analysis, Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Joseph Lehár
- Department of Bioinformatics, Boston University, Boston, Massachusetts, United States of America
- Myung Joo Lee
- Motivation and Emotion Neuroscience Collaboration (MENC), Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Laboratory of Neuroimaging and Genetics, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Anne J. Blood
- Motivation and Emotion Neuroscience Collaboration (MENC), Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Laboratory of Neuroimaging and Genetics, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Mood and Motor Control Laboratory, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Sang Lee
- Motivation and Emotion Neuroscience Collaboration (MENC), Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Laboratory of Neuroimaging and Genetics, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Roy H. Perlis
- Depression Clinic and Research Program, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Psychiatric and Neurodevelopmental Genetics Unit of the Center for Human Genetic Research, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Jordan W. Smoller
- Psychiatric and Neurodevelopmental Genetics Unit of the Center for Human Genetic Research, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Robert Morris
- Motivation and Emotion Neuroscience Collaboration (MENC), Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Maurizio Fava
- Depression Clinic and Research Program, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Hans C. Breiter
- Motivation and Emotion Neuroscience Collaboration (MENC), Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Laboratory of Neuroimaging and Genetics, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Mood and Motor Control Laboratory, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
|
42
|
He JM, Huang XT, Ying KL, Luo YM. The Staged Construction of Temporal Discounting. Acta Psychol Sin 2010. [DOI: 10.3724/sp.j.1041.2010.00474] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
43
|
Voon V, Reynolds B, Brezing C, Gallea C, Skaljic M, Ekanayake V, Fernandez H, Potenza MN, Dolan RJ, Hallett M. Impulsive choice and response in dopamine agonist-related impulse control behaviors. Psychopharmacology (Berl) 2010; 207:645-59. [PMID: 19838863 PMCID: PMC3676926 DOI: 10.1007/s00213-009-1697-y] [Citation(s) in RCA: 159] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/09/2009] [Accepted: 10/02/2009] [Indexed: 01/25/2023]
Abstract
RATIONALE Dopaminergic medication-related impulse control disorders (ICDs) such as pathological gambling and compulsive shopping have been reported in Parkinson's disease (PD). HYPOTHESIS We hypothesized that dopamine agonists (DAs) would be associated with greater impulsive choice or greater discounting of delayed rewards in PD patients with ICDs (PDI). METHODS Fourteen PDI patients, 14 PD controls without ICDs, and 16 medication-free matched normal controls were tested on the Experiential Discounting Task (EDT), a feedback-based intertemporal choice task, spatial working memory, and attentional set shifting. The EDT was used to assess choice impulsivity (hyperbolic K value), reaction time (RT), and decision conflict RT (the RT difference between high conflict and low conflict choices). PDI patients and PD controls were tested on and off DA. RESULTS On the EDT, there was a group by medication interaction effect [F(1,26) = 5.62; p = 0.03] with pairwise analyses demonstrating that DA status was associated with increased impulsive choice in PDI patients (p = 0.02) but not in PD controls (p = 0.37). PDI patients also had faster RT compared to PD controls [F(1,26) = 7.51, p = 0.01]. DA status was associated with shorter RT [F(3,24) = 8.39, p = 0.001] and decision conflict RT [F(1,26) = 6.16, p = 0.02] in PDI patients but not in PD controls. There were no correlations between different measures of impulsivity. PDI patients on DA had greater spatial working memory impairments compared to PD controls on DA (t = 2.13, df = 26, p = 0.04). CONCLUSION Greater impulsive choice, faster RT, faster decision conflict RT, and executive dysfunction may contribute to ICDs in PD.
Affiliation(s)
- Valerie Voon
- National Institutes of Health, 10 Center Drive, Bldg 10/Rm 7D37, Bethesda, MD 20892-1428, USA.
|
44
|
Wittmann M, Lovero KL, Lane SD, Paulus MP. Now or later? Striatum and insula activation to immediate versus delayed rewards. J Neurosci Psychol Econ 2010; 3:15-26. [PMID: 20657814 DOI: 10.1037/a0017252] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Neuroimaging studies on delay discounting tasks that use reward delays ranging from minutes to days have implicated the insula and striatum in the processing of inter-temporal decisions. This study aimed at assessing whether these brain regions would also be involved in decision-making when subjects have to wait through delays within the range of seconds. Employing functional magnetic resonance imaging (fMRI) in thirteen healthy volunteers, we repeatedly presented monetary options with delays that differed within the range of multiple seconds. Using a region of interest approach, we found significant activation in the bilateral anterior insula and striatum when subjects chose either the immediate (smaller) or delayed (larger) option. In particular, insular activation was observed after the response and the delay, when the outcome of the immediate or the delayed choice was shown. Significantly greater activation was observed in the ventroanterior striatum while subjects chose the immediate, as opposed to the delayed, options, and also after receiving the outcome of waiting through the longer delay option. The evidence presented here indicates that both the ventral striatum and the insula are involved in the processing of choosing delay options as well as the consequences of choices with delays in the seconds range.
Affiliation(s)
- Marc Wittmann
- Department of Psychiatry, University of California San Diego, Veterans Affairs San Diego Healthcare System, San Diego
|
45
|
Kurth-Nelson Z, Redish AD. Temporal-difference reinforcement learning with distributed representations. PLoS One 2009; 4:e7362. [PMID: 19841749 PMCID: PMC2760757 DOI: 10.1371/journal.pone.0007362] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2009] [Accepted: 09/04/2009] [Indexed: 11/18/2022] Open
Abstract
Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting "microAgents", each of which has a separate discounting factor (gamma). Each microAgent maintains an independent hypothesis about the state of the world, and a separate value-estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (delta) signal within the model matches dopamine signals recorded from animals in standard conditioning reward-paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each microAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments.
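The final claim above — that a population of exponentially discounting microAgents yields hyperbolic discounting overall — can be checked numerically. A minimal sketch, assuming for illustration that the discount factors gamma are spread uniformly on (0, 1); in that special case the ensemble average has a closed form, since the integral of gamma**d over (0, 1) is 1/(d + 1), exactly the hyperbolic shape:

```python
def microagent_value(delay, n_agents=100_000):
    """Average exponentially discounted value of a unit reward across
    microAgents whose discount factors are spread uniformly on (0, 1)."""
    gammas = [(i + 0.5) / n_agents for i in range(n_agents)]  # midpoint grid
    return sum(g ** delay for g in gammas) / n_agents

# The ensemble average tracks the hyperbolic curve 1 / (1 + delay),
# even though every individual microAgent discounts exponentially.
for d in [0, 1, 2, 5, 10]:
    assert abs(microagent_value(d) - 1.0 / (1.0 + d)) < 1e-3
```

Other distributions of gamma give other aggregate curves; the uniform case is just the cleanest demonstration that mixing exponentials produces non-exponential group behavior.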
Affiliation(s)
- Zeb Kurth-Nelson
- Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota, United States of America
|
46
|
Mazur JE, Biondi DR. Delay-amount tradeoffs in choices by pigeons and rats: hyperbolic versus exponential discounting. J Exp Anal Behav 2009; 91:197-211. [PMID: 19794834 PMCID: PMC2648524 DOI: 10.1901/jeab.2009.91-197] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2008] [Accepted: 09/15/2008] [Indexed: 11/22/2022]
Abstract
An adjusting-delay procedure was used to study the choices of pigeons and rats when both delay and amount of reinforcement were varied. In different conditions, the choice alternatives included one versus two reinforcers, one versus three reinforcers, and three versus two reinforcers. The delay to one alternative (the standard alternative) was kept constant in a condition, and the delay to the other (the adjusting alternative) was increased or decreased many times a session so as to estimate an indifference point--a delay at which the two alternatives were chosen about equally often. Indifference functions were constructed by plotting the adjusting delay as a function of the standard delay for each pair of reinforcer amounts. The experiments were designed to test the prediction of a hyperbolic decay equation that the slopes of the indifference functions should increase as the ratio of the two reinforcer amounts increased. Consistent with the hyperbolic equation, the slopes of the indifference functions depended on the ratios of the two reinforcer amounts for both pigeons and rats. These results were not compatible with an exponential decay equation, which predicts slopes of 1 regardless of the reinforcer amounts. Combined with other data, these findings provide further evidence that delay discounting is well described by a hyperbolic equation for both species, but not by an exponential equation. Quantitative differences in the y-intercepts of the indifference functions from the two species suggested that the rate at which reinforcer strength decreases with increasing delay may be four or five times slower for rats than for pigeons.
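The slope prediction being tested here can be derived directly from the two decay equations. Under hyperbolic discounting, V = A / (1 + kD), setting the two options' values equal gives an indifference line whose slope is the amount ratio; under exponential discounting, V = A * gamma**D, the slope is always 1. A minimal sketch with illustrative parameter values (k and gamma below are not the fitted values from the study):

```python
import math

def hyperbolic_indiff(d_std, a_std, a_adj, k=0.2):
    """Adjusting delay at indifference under hyperbolic decay V = A/(1 + k*D):
    a_adj/(1 + k*d_adj) = a_std/(1 + k*d_std), solved for d_adj."""
    return ((a_adj / a_std) * (1.0 + k * d_std) - 1.0) / k

def exponential_indiff(d_std, a_std, a_adj, gamma=0.9):
    """Adjusting delay at indifference under exponential decay V = A*gamma**D."""
    return d_std + math.log(a_adj / a_std) / math.log(gamma)

def slope(f, **kw):
    """Slope of the indifference function between two standard delays."""
    return (f(20.0, **kw) - f(10.0, **kw)) / 10.0

# Hyperbolic slope equals the 3:1 amount ratio; exponential slope stays at 1.
assert abs(slope(hyperbolic_indiff, a_std=1, a_adj=3) - 3.0) < 1e-9
assert abs(slope(exponential_indiff, a_std=1, a_adj=3) - 1.0) < 1e-9
```

This is why finding amount-ratio-dependent slopes in both species favors the hyperbolic equation over the exponential one.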
Affiliation(s)
- James E Mazur
- Department, Southern Connecticut State University, New Haven, CT 06515, USA.
|
47
|
Time and decision making in humans. Cogn Affect Behav Neurosci 2009; 8:509-24. [PMID: 19033245 DOI: 10.3758/cabn.8.4.509] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Decision making requires evaluating alternatives that differ on a number of attributes. During this evaluation process, selection of options depends on the duration of the options, the duration of the expected delay for realizing the options, and the time available to reach a decision. This article reviews the relationship between time and decision making in humans with respect to this evaluation process. Moreover, the role of psychological time, as compared with physical time, is accentuated. Five topics have been selected that illustrate how time and mental representations of time affect decision making. These are (1) the duration of options, (2) temporal decision making, (3) the time between having made a decision and experiencing the consequences of that decision, (4) the temporal perspective of decision makers, and (5) the duration of the decision process. The discussion of each topic is supplemented by suggestions for further research. It is shown that psychological time is often neglected in human decision making but seems to play an important role in the making of choices.
|
48
|
Bissmarck F, Nakahara H, Doya K, Hikosaka O. Combining modalities with different latencies for optimal motor control. J Cogn Neurosci 2009; 20:1966-79. [PMID: 18416676 DOI: 10.1162/jocn.2008.20133] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Feedback signals may be of different modality, latency, and accuracy. To learn and control motor tasks, the feedback available may be redundant, and it would not be necessary to rely on every accessible feedback loop. Which feedback loops should then be utilized? In this article, we propose that the latency is a critical factor to determine which signals will be influential at different learning stages. We use a computational framework to study the role of feedback modules with different latencies in optimal motor control. Instead of explicit gating between modules, the reinforcement learning algorithm learns to rely on the more useful module. We tested our paradigm for two different implementations, which confirmed our hypothesis. In the first, we examined how feedback latency affects the competitiveness of two identical modules. In the second, we examined an example of visuomotor sequence learning, where a plastic, faster somatosensory module interacts with a preacquired, slower visual module. We found that the overall performance depended on the latency of the faster module alone, whereas the relative latency determines the independence of the faster from the slower. In the second implementation, the somatosensory module with shorter latency overtook the slower visual module, and realized better overall performance. The visual module played different roles in early and late learning. First, it worked as a guide for the exploration of the somatosensory module. Then, when learning had converged, it contributed to robustness against system noise and external perturbations. Overall, these results demonstrate that our framework successfully learns to utilize the most useful available feedback for optimal control.
Collapse
Affiliation(s)
- Fredrik Bissmarck
- Computational Neuroscience Labs, ATR International, 2-2-2 Hikaridai Keihanna Science City, Seika, Soraku, Kyoto, Japan.
Collapse
|
49
|
Gregorios-Pippas L, Tobler PN, Schultz W. Short-term temporal discounting of reward value in human ventral striatum. J Neurophysiol 2009; 101:1507-23. [PMID: 19164109 PMCID: PMC2666398 DOI: 10.1152/jn.90730.2008] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Delayed rewards lose their value for economic decisions and constitute weaker reinforcers for learning. Temporal discounting of reward value already occurs within a few seconds in animals, which allows investigations of the underlying neurophysiological mechanisms. However, it is difficult to relate these mechanisms to human discounting behavior, which is usually studied over days and months and may engage different brain processes. Our study aimed to bridge the gap by using very short delays and measuring human functional magnetic resonance responses in one of the key reward centers of the brain, the ventral striatum. We used psychometric methods to assess subjective timing and valuation of monetary rewards with delays of 4.0-13.5 s. We demonstrated hyperbolic and exponential decreases of striatal responses to reward predicting stimuli within this time range, irrespective of changes in reward rate. Lower reward magnitudes induced steeper behavioral and striatal discounting. By contrast, striatal responses following the delivery of reward reflected the uncertainty in subjective timing associated with delayed rewards rather than value discounting. These data suggest that delays of a few seconds affect the neural processing of predicted reward value in the ventral striatum and engage the temporal sensitivity of reward responses. Comparisons with electrophysiological animal data suggest that ventral striatal reward discounting may involve dopaminergic and orbitofrontal inputs.
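The hyperbolic and exponential decay functions fitted to the striatal responses are standard forms in the discounting literature. A minimal sketch, with illustrative rate parameters (the value of `k` is an assumption, not a fitted parameter from the study):

```python
import math

def hyperbolic(value, delay, k=0.1):
    """Hyperbolic discounting: V = A / (1 + k*D)."""
    return value / (1.0 + k * delay)

def exponential(value, delay, k=0.1):
    """Exponential discounting: V = A * exp(-k*D)."""
    return value * math.exp(-k * delay)

# Delays in the seconds range used by the study (4.0-13.5 s).
for d in (4.0, 8.0, 13.5):
    print(d, round(hyperbolic(10.0, d), 3), round(exponential(10.0, d), 3))
# The magnitude effect reported in the abstract (steeper discounting for
# smaller rewards) could be modeled by letting k increase as `value` shrinks.
```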
Collapse
Affiliation(s)
- Lucy Gregorios-Pippas
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
Collapse
|
50
|
Minamimoto T, La Camera G, Richmond BJ. Measuring and modeling the interaction among reward size, delay to reward, and satiation level on motivation in monkeys. J Neurophysiol 2009; 101:437-47. [PMID: 18987119 PMCID: PMC2637024 DOI: 10.1152/jn.90959.2008] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2008] [Accepted: 11/03/2008] [Indexed: 11/22/2022] Open
Abstract
Motivation is usually inferred from the likelihood or the intensity with which behavior is carried out. It is sensitive to external factors (e.g., the identity, amount, and timing of a rewarding outcome) and internal factors (e.g., hunger or thirst). We trained macaque monkeys to perform a nonchoice instrumental task (a sequential red-green color discrimination) while manipulating two external factors: reward size and delay-to-reward. We also inferred the state of one internal factor, level of satiation, by monitoring the accumulated reward. A visual cue indicated the forthcoming reward size and delay-to-reward in each trial. The fraction of trials completed correctly by the monkeys increased linearly with reward size and was hyperbolically discounted by delay-to-reward duration, relations that are similar to those found in free operant and choice tasks. The fraction of correct trials also decreased progressively as a function of the satiation level. Similar (albeit noisier) relations were obtained for reaction times. The combined effect of reward size, delay-to-reward, and satiation level on the proportion of correct trials is well described as a multiplication of the effects of the single factors when each factor is examined alone. These results provide a quantitative account of the interaction of external and internal factors on instrumental behavior, and allow us to extend the concept of subjective value of a rewarding outcome, usually confined to external factors, to account also for slow changes in the internal drive of the subject.
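The multiplicative account described above can be sketched as a small model: a linear reward term, hyperbolic delay discounting, and a satiation term that combine by multiplication. All parameter values and the exponential form of the satiation term are illustrative assumptions, not the paper's fitted model.

```python
import math

def p_correct(reward, delay, satiation, a=0.2, k=0.3, s=1.5):
    """Illustrative proportion of correct trials:
    linear in reward size, hyperbolically discounted by delay,
    and scaled down multiplicatively as satiation accumulates."""
    value = a * reward / (1.0 + k * delay)  # external factors
    drive = math.exp(-s * satiation)        # internal factor (assumed form)
    return min(1.0, value * drive)          # factors combine multiplicatively

# Larger reward, shorter delay, and lower satiation each improve performance.
print(p_correct(4.0, 1.0, 0.0), p_correct(2.0, 5.0, 0.5))
```

The multiplicative structure means each factor rescales, rather than offsets, the others: doubling reward size doubles performance at any delay or satiation level (up to the ceiling), which is the interaction pattern the abstract reports.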
Collapse
Affiliation(s)
- Takafumi Minamimoto
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20892-4415, USA
Collapse
|