1. Amo R, Uchida N, Watabe-Uchida M. Glutamate inputs send prediction error of reward, but not negative value of aversive stimuli, to dopamine neurons. Neuron 2024; 112:1001-1019.e6. PMID: 38278147; PMCID: PMC10957320; DOI: 10.1016/j.neuron.2023.12.019.
Abstract
Midbrain dopamine neurons are thought to signal reward prediction errors (RPEs), but the mechanisms underlying RPE computation, particularly the contributions of different neurotransmitters, remain poorly understood. Here, we used a genetically encoded glutamate sensor to examine the pattern of glutamate inputs to dopamine neurons in mice. We found that glutamate inputs exhibit virtually all of the characteristics of RPE rather than conveying a specific component of RPE computation, such as reward or expectation. Notably, whereas glutamate inputs were transiently inhibited by reward omission, they were excited by aversive stimuli. Opioid analgesics altered dopamine negative responses to aversive stimuli into more positive responses, whereas excitatory responses of glutamate inputs remained unchanged. Our findings uncover previously unknown synaptic mechanisms underlying RPE computations; dopamine responses are shaped by both synergistic and competitive interactions between glutamatergic and GABAergic inputs to dopamine neurons depending on valences, with competitive interactions playing a role in responses to aversive stimuli.
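The abstract above leans on the reward prediction error (RPE) formalism of reinforcement learning. As background, a minimal temporal-difference sketch (textbook TD(0), not the paper's circuit model) reproduces two signatures the abstract mentions: RPE shrinking as reward becomes predicted, and a negative transient on reward omission.

```python
# Temporal-difference reward prediction error (RPE), the quantity midbrain
# dopamine neurons are hypothesized to signal: delta = r + gamma*V(s') - V(s).
# Illustrative textbook TD(0) sketch, not the paper's biophysical model.

def td_update(V, s, s_next, r, alpha=0.1, gamma=0.9):
    """One TD(0) step; returns the RPE used to update V[s]."""
    delta = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * delta
    return delta

# A cue ('cs') reliably followed by reward: the RPE at reward time
# shrinks across trials as the cue's predicted value grows.
V = {}
errors = [td_update(V, 'cs', 'end', r=1.0) for _ in range(50)]
assert errors[0] == 1.0    # unpredicted reward: full RPE
assert errors[-1] < 0.01   # well-predicted reward: RPE near zero

# Omitting the reward after training yields a negative RPE, the
# transient inhibition the abstract describes for glutamate inputs.
omission_rpe = td_update(V, 'cs', 'end', r=0.0)
assert omission_rpe < 0.0
```

The omission case is the one where the paper finds glutamate inputs and dopamine diverge from a pure-RPE account for aversive stimuli.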
Affiliation(s)
- Ryunosuke Amo: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Naoshige Uchida: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Mitsuko Watabe-Uchida: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
2. Amo R, Uchida N, Watabe-Uchida M. Glutamate inputs send prediction error of reward but not negative value of aversive stimuli to dopamine neurons. bioRxiv 2023:2023.11.09.566472. Preprint. PMID: 37986868; PMCID: PMC10659341; DOI: 10.1101/2023.11.09.566472.
(Preprint of entry 1; the abstract and author affiliations duplicate the published version above.)
3. Parker NF, Baidya A, Cox J, Haetzel LM, Zhukovskaya A, Murugan M, Engelhard B, Goldman MS, Witten IB. Choice-selective sequences dominate in cortical relative to thalamic inputs to NAc to support reinforcement learning. Cell Rep 2022; 39:110756. PMID: 35584665; PMCID: PMC9218875; DOI: 10.1016/j.celrep.2022.110756.
Abstract
How are actions linked with subsequent outcomes to guide choices? The nucleus accumbens, which is implicated in this process, receives glutamatergic inputs from the prelimbic cortex and midline regions of the thalamus. However, little is known about whether and how representations differ across these input pathways. By comparing these inputs during a reinforcement learning task in mice, we discovered that prelimbic cortical inputs preferentially represent actions and choices, whereas midline thalamic inputs preferentially represent cues. Choice-selective activity in the prelimbic cortical inputs is organized in sequences that persist beyond the outcome. Through computational modeling, we demonstrate that these sequences can support the neural implementation of reinforcement-learning algorithms, in both a circuit model based on synaptic plasticity and one based on neural dynamics. Finally, we test and confirm a prediction of our circuit models by direct manipulation of nucleus accumbens input neurons.
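The credit-assignment idea in the abstract above can be caricatured in a few lines (a hypothetical toy model, not the authors' published circuit model): choice-selective activity that persists through the action-outcome delay acts as an eligibility trace, so the outcome-time error can update the value of the earlier choice.

```python
import numpy as np

# Toy model (not the authors' code): choice-selective activity spans the
# action-outcome delay, so a plasticity rule gated by the outcome-time RPE
# can credit the earlier choice. Bandit probabilities are made up.

rng = np.random.default_rng(0)
n_choices = 2
value = np.zeros(n_choices)
alpha = 0.2
p_reward = [0.8, 0.2]                  # hypothetical bandit: option 0 is better

for trial in range(500):
    p = np.exp(3 * value) / np.exp(3 * value).sum()   # softmax choice
    choice = rng.choice(n_choices, p=p)
    # Choice-selective "sequence": a trace for the chosen action that is
    # still active (persists beyond the outcome) when reward arrives.
    eligibility = np.zeros(n_choices)
    eligibility[choice] = 1.0
    reward = float(rng.random() < p_reward[choice])
    rpe = reward - value[choice]
    value += alpha * rpe * eligibility  # plasticity gated by eligibility

assert value[0] > value[1]             # the richer option acquires higher value
```

The point of the trace is the same one the modeling section makes: without activity bridging the delay, a synaptic-plasticity implementation has nothing to pair the outcome with.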
Affiliation(s)
- Nathan F Parker: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Avinash Baidya: Center for Neuroscience, University of California, Davis, Davis, CA 95616, USA; Department of Physics and Astronomy, University of California, Davis, Davis, CA 95616, USA
- Julia Cox: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA; Department of Neuroscience, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Laura M Haetzel: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Anna Zhukovskaya: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Malavika Murugan: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Ben Engelhard: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Mark S Goldman: Center for Neuroscience, University of California, Davis, Davis, CA 95616, USA; Department of Neurobiology, Physiology and Behavior, University of California, Davis, Davis, CA 95616, USA; Department of Ophthalmology and Vision Science, University of California, Davis, Davis, CA 95616, USA
- Ilana B Witten: Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA; Department of Psychology, Princeton University, Princeton, NJ 08544, USA
4.

Abstract
Resilience - a key topic in clinical science and practice - still lacks a clear conceptualization that integrates its evolutionary and human-specific features, refrains from exclusive focus on fear physiology, incorporates a developmental approach, and, most importantly, is not based on the negation (i.e., absence of symptoms following trauma). Building on the initial condition of mammals, whose brain matures in the context of the mother's body and caregiving behavior, we argue that systems and processes that participate in tuning the brain to the social ecology and adapting to its hardships mark the construct of resilience. These include the oxytocin system, the affiliative brain, and biobehavioral synchrony, all characterized by great flexibility across phylogenesis and ontogenesis. Three core features of resilience are outlined: plasticity, sociality and meaning. Mechanisms of sociality by which coordinated action supports diversity, endurance and adaptation are described across animal evolution. Humans' biobehavioral synchrony matures from maternal attuned behavior in the postpartum to adult-adult relationships of empathy, perspective-taking and intimacy, and extends from the mother-child relationship to other affiliative bonds throughout life, charting a fundamental trajectory in the development of resilience. Findings from three high-risk cohorts, each tapping a distinct disruption to maternal-infant bonding (prematurity, maternal depression, and early life stress/trauma), and followed from birth to adolescence/young adulthood, demonstrate how components of the neurobiology of affiliation confer resilience and uniquely shape the social brain.
Affiliation(s)
- Ruth Feldman: Interdisciplinary Center, Herzliya, Israel; Yale Child Study Center, Yale University, New Haven, CT, USA
5. Aggarwal M, Akamine Y, Liu AW, Wickens JR. The nucleus accumbens and inhibition in the ventral tegmental area play a causal role in the Kamin blocking effect. Eur J Neurosci 2020; 52:3087-3109. DOI: 10.1111/ejn.14732.
Affiliation(s)
- Mayank Aggarwal: Neurobiology Research Unit, Okinawa Institute of Science and Technology Graduate University, Kunigami, Okinawa, Japan
- Yumiko Akamine: Neurobiology Research Unit, Okinawa Institute of Science and Technology Graduate University, Kunigami, Okinawa, Japan
- Andrew W. Liu: Neurobiology Research Unit, Okinawa Institute of Science and Technology Graduate University, Kunigami, Okinawa, Japan
- Jeffery R. Wickens: Neurobiology Research Unit, Okinawa Institute of Science and Technology Graduate University, Kunigami, Okinawa, Japan
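The Kamin blocking effect studied in entry 5 is the textbook signature of error-driven learning: a pretrained cue already absorbs the prediction, leaving no error to train an added cue. A minimal Rescorla-Wagner simulation (an illustrative sketch, not the study's model) shows the effect.

```python
# Rescorla-Wagner account of Kamin blocking (illustrative sketch):
# each present cue i is updated by dV_i = alpha * (r - sum of V over
# all cues present on the trial).

def rw_trial(V, cues, r, alpha=0.3):
    error = r - sum(V[c] for c in cues)
    for c in cues:
        V[c] += alpha * error

V = {'A': 0.0, 'B': 0.0}
for _ in range(100):                 # phase 1: cue A alone -> reward
    rw_trial(V, ['A'], r=1.0)
for _ in range(100):                 # phase 2: compound AB -> same reward
    rw_trial(V, ['A', 'B'], r=1.0)

# A already predicts the reward, so B acquires almost nothing: blocking.
assert V['A'] > 0.9
assert V['B'] < 0.05
```

The study's causal claim concerns where in the circuit (nucleus accumbens, VTA inhibition) this error suppression is implemented, not the update rule itself.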
6. Pace-Schott EF, Amole MC, Aue T, Balconi M, Bylsma LM, Critchley H, Demaree HA, Friedman BH, Gooding AEK, Gosseries O, Jovanovic T, Kirby LA, Kozlowska K, Laureys S, Lowe L, Magee K, Marin MF, Merner AR, Robinson JL, Smith RC, Spangler DP, Van Overveld M, VanElzakker MB. Physiological feelings. Neurosci Biobehav Rev 2019; 103:267-304. DOI: 10.1016/j.neubiorev.2019.05.002.

7. Donahoe JW. Behavior analysis and neuroscience: Complementary disciplines. J Exp Anal Behav 2017; 107:301-320. DOI: 10.1002/jeab.251.

8. The Neurobiology of Human Attachments. Trends Cogn Sci 2017; 21:80-99. DOI: 10.1016/j.tics.2016.11.007.

9. Tian J, Huang R, Cohen JY, Osakada F, Kobak D, Machens CK, Callaway EM, Uchida N, Watabe-Uchida M. Distributed and Mixed Information in Monosynaptic Inputs to Dopamine Neurons. Neuron 2016; 91:1374-1389. PMID: 27618675; DOI: 10.1016/j.neuron.2016.08.018.
Abstract
Dopamine neurons encode the difference between actual and predicted reward, or reward prediction error (RPE). Although many models have been proposed to account for this computation, it has been difficult to test these models experimentally. Here we established an awake electrophysiological recording system, combined with rabies virus and optogenetic cell-type identification, to characterize the firing patterns of monosynaptic inputs to dopamine neurons while mice performed classical conditioning tasks. We found that each variable required to compute RPE, including actual and predicted reward, was distributed in input neurons in multiple brain areas. Further, many input neurons across brain areas signaled combinations of these variables. These results demonstrate that even simple arithmetic computations such as RPE are not localized in specific brain areas but, rather, distributed across multiple nodes in a brain-wide network. Our systematic method to examine both activity and connectivity revealed unexpected redundancy for a simple computation in the brain.
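The distributed-computation claim above has a simple linear reading: if each input carries a mixture of reward and expectation, the dopamine neuron can still recover RPE = reward - expectation whenever the summed readout loadings equal (+1, -1). A toy sketch with hypothetical random mixtures (not data or weights from the paper):

```python
import numpy as np

# Toy sketch (hypothetical weights, not data from the paper): the RPE
# delta = reward - expectation can be read out from many inputs that each
# carry a random *mixture* of reward and expectation, provided the summed
# readout loadings equal (+1, -1). No single input need be a pure signal.

rng = np.random.default_rng(1)
n_inputs = 20
mix = rng.normal(size=(n_inputs, 2))   # per-input (reward, expectation) loadings

# Min-norm readout w satisfying mix.T @ w == (+1, -1).
w, *_ = np.linalg.lstsq(mix.T, np.array([1.0, -1.0]), rcond=None)

reward, expectation = 1.0, 0.4
inputs = mix @ np.array([reward, expectation])   # mixed per-input signals
dopamine = float(w @ inputs)                     # summed at the DA neuron
assert abs(dopamine - (reward - expectation)) < 1e-9
```

This is only the linear-algebra intuition for why "mixed" inputs are sufficient; the paper's contribution is the recording and connectivity evidence that inputs are in fact mixed and distributed.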
Affiliation(s)
- Ju Tian: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Ryan Huang: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Jeremiah Y Cohen: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, School of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA
- Fumitaka Osakada: Systems Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA; Laboratory of Cellular Pharmacology, Graduate School of Pharmaceutical Sciences, Nagoya University, Nagoya 464-8601, Japan
- Dmitry Kobak: Champalimaud Neuroscience Programme, Champalimaud Centre for the Unknown, Lisbon 1400-038, Portugal
- Christian K Machens: Champalimaud Neuroscience Programme, Champalimaud Centre for the Unknown, Lisbon 1400-038, Portugal
- Edward M Callaway: Systems Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Naoshige Uchida: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Mitsuko Watabe-Uchida: Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
10. Li Y, Lindemann C, Goddard MJ, Hyland BI. Complex Multiplexing of Reward-Cue- and Licking-Movement-Related Activity in Single Midline Thalamus Neurons. J Neurosci 2016; 36:3567-3578. PMID: 27013685; PMCID: PMC6601730; DOI: 10.1523/jneurosci.1107-15.2016.
Abstract
Midline thalamus is implicated in linking visceral and exteroceptive sensory information with behavior. However, whether neuronal activity is modulated with temporal precision by cues and actions in real time is unknown. Using single-neuron recording and a Pavlovian visual-cue/liquid-reward association task in rats, we discovered phasic responses to sensory cues, appropriately timed to modify information processing in output targets, as well as tonic modulations within and between trials that were differentially reward modulated, which may have distinct arousal functions. Many of the cue-responsive neurons also responded to repetitive licks, consistent with sensorimotor integration. Further, some lick-related neurons were activated only by the first rewarded lick and only if that lick were also part of a conditioned response sequence initiated earlier, consistent with binding action decisions to their ensuing outcome. This rich repertoire of responses provides electrophysiological evidence for midline thalamus as a site of complex information integration for reward-mediated behavior.

SIGNIFICANCE STATEMENT: Disparate brain circuits are involved in sensation, movement, and reward information. These must interact in order for the relationships between cues, actions, and outcomes to be learned. We found that responses of single neurons in midline thalamus to sensory cues are increased when associated with reward. This output may amplify similar signals generated in parallel by the dopamine system. In addition, some neurons coded a three-factor decision in which the neuron fired only if there was a movement, if it was the first one after the reward becoming available, and if it was part of a sequence triggered in response to a preceding cue. These data highlight midline thalamus as an important node integrating multiple types of information for linking sensation, actions, and rewards.
Affiliation(s)
- Yuhong Li: Department of Physiology, Otago School of Medical Sciences, Brain Health Research Centre, University of Otago, and the Brain Research New Zealand Centre of Research Excellence, Dunedin 9054, New Zealand
- Christoph Lindemann: Department of Physiology, Otago School of Medical Sciences, Brain Health Research Centre, University of Otago, and the Brain Research New Zealand Centre of Research Excellence, Dunedin 9054, New Zealand
- Matthew J Goddard: Department of Physiology, Otago School of Medical Sciences, Brain Health Research Centre, University of Otago, and the Brain Research New Zealand Centre of Research Excellence, Dunedin 9054, New Zealand
- Brian I Hyland: Department of Physiology, Otago School of Medical Sciences, Brain Health Research Centre, University of Otago, and the Brain Research New Zealand Centre of Research Excellence, Dunedin 9054, New Zealand
11. Keiflin R, Janak PH. Dopamine Prediction Errors in Reward Learning and Addiction: From Theory to Neural Circuitry. Neuron 2015; 88:247-263. PMID: 26494275; DOI: 10.1016/j.neuron.2015.08.037.
Abstract
Midbrain dopamine (DA) neurons are proposed to signal reward prediction error (RPE), a fundamental parameter in associative learning models. This RPE hypothesis provides a compelling theoretical framework for understanding DA function in reward learning and addiction. New studies support a causal role for DA-mediated RPE activity in promoting learning about natural reward; however, this question has not been explicitly tested in the context of drug addiction. In this review, we integrate theoretical models with experimental findings on the activity of DA systems, and on the causal role of specific neuronal projections and cell types, to provide a circuit-based framework for probing DA-RPE function in addiction. By examining error-encoding DA neurons in the neural network in which they are embedded, hypotheses regarding circuit-level adaptations that possibly contribute to pathological error signaling and addiction can be formulated and tested.
Affiliation(s)
- Ronald Keiflin: Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD 21218, USA
- Patricia H Janak: Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD 21218, USA; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
12. Zhang S, Hu S, Chao HH, Li CSR. Resting-State Functional Connectivity of the Locus Coeruleus in Humans: In Comparison with the Ventral Tegmental Area/Substantia Nigra Pars Compacta and the Effects of Age. Cereb Cortex 2015. PMID: 26223261; DOI: 10.1093/cercor/bhv172.
Abstract
The locus coeruleus (LC) provides the primary noradrenergic inputs to the cerebral cortex. Despite numerous animal studies documenting the functions of the LC, research in humans is hampered by the small volume of this midbrain nucleus. Here, we took advantage of a probabilistic template, explored the cerebral functional connectivity of the LC with resting-state fMRI data of 250 healthy adults, and verified the findings by accounting for physiological noise in another data set. In addition, we contrasted connectivities of the LC and the ventral tegmental area/substantia nigra pars compacta. The results highlighted both shared and distinct connectivity of these 2 midbrain structures, as well as an opposite pattern of connectivity to bilateral amygdala, pulvinar, and right anterior insula. Additionally, LC connectivity to the fronto-parietal cortex and the cerebellum increases with age and connectivity to the visual cortex decreases with age. These findings may facilitate studies of the role of the LC in arousal, saliency responses and cognitive motor control and in the behavioral and cognitive manifestations during healthy and disordered aging. Although the first to demonstrate whole-brain LC connectivity, these findings need to be confirmed with high-resolution imaging.
Affiliation(s)
- Herta H Chao: Department of Internal Medicine, Yale University, New Haven, CT 06519, USA; Medical Service, VA Connecticut Health Care System, West Haven, CT 06516, USA
- Chiang-Shan R Li: Department of Psychiatry, Department of Neurobiology, and Interdepartmental Neuroscience Program, Yale University, New Haven, CT 06520, USA; Connecticut Mental Health Center, New Haven, CT 06519, USA
13. Morita K, Kawaguchi Y. Computing reward-prediction error: an integrated account of cortical timing and basal-ganglia pathways for appetitive and aversive learning. Eur J Neurosci 2015; 42:2003-2021. PMID: 26095906; PMCID: PMC5034842; DOI: 10.1111/ejn.12994.
Abstract
There are two prevailing notions regarding the involvement of the corticobasal ganglia system in value‐based learning: (i) the direct and indirect pathways of the basal ganglia are crucial for appetitive and aversive learning, respectively, and (ii) the activity of midbrain dopamine neurons represents reward‐prediction error. Although (ii) constitutes a critical assumption of (i), it remains elusive how (ii) holds given (i), with the basal‐ganglia influence on the dopamine neurons. Here we present a computational neural‐circuit model that potentially resolves this issue. Based on the latest analyses of the heterogeneous corticostriatal neurons and connections, our model posits that the direct and indirect pathways, respectively, represent the values of upcoming and previous actions, and up‐regulate and down‐regulate the dopamine neurons via the basal‐ganglia output nuclei. This explains how the difference between the upcoming and previous values, which constitutes the core of reward‐prediction error, is calculated. Simultaneously, it predicts that blockade of the direct/indirect pathway causes a negative/positive shift of reward‐prediction error and thereby impairs learning from positive/negative error, i.e. appetitive/aversive learning. Through simulation of reward‐reversal learning and punishment‐avoidance learning, we show that our model could indeed account for the experimentally observed features that are suggested to support notion (i) and could also provide predictions on neural activity. We also present a behavioral prediction of our model, through simulation of inter‐temporal choice, on how the balance between the two pathways relates to the subject's time preference. These results indicate that our model, incorporating the heterogeneity of the cortical influence on the basal ganglia, is expected to provide a closed‐circuit mechanistic understanding of appetitive/aversive learning.
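The core computation the model proposes can be rendered as a SARSA-like TD error in which the direct pathway contributes the (discounted) value of the upcoming action and the indirect pathway subtracts the value of the previous action at the dopamine neuron. The sketch below is a simplified toy rendering of that scheme, not the paper's full neural-circuit model.

```python
# Toy rendering (simplified) of the Morita-Kawaguchi scheme: the dopamine
# signal is assembled from reward plus a direct-pathway term (value of the
# upcoming action) minus an indirect-pathway term (value of the previous
# action), i.e. a SARSA-like TD error:
#   delta = r + gamma * Q[a_upcoming] - Q[a_previous]

def dopamine_signal(Q, a_prev, a_next, r, gamma=0.9):
    direct = gamma * Q[a_next]   # direct pathway: up-regulates DA
    indirect = Q[a_prev]         # indirect pathway: down-regulates DA
    return r + direct - indirect

Q = {'lever': 0.5, 'magazine': 1.0}   # hypothetical action values
delta = dopamine_signal(Q, a_prev='lever', a_next='magazine', r=0.0)
assert abs(delta - (0.9 * 1.0 - 0.5)) < 1e-12   # upcoming minus previous value

# The blockade prediction: removing the indirect-pathway term shifts the
# error positively, which would impair learning from negative errors.
Q_blocked = {'lever': 0.0, 'magazine': 1.0}
delta_no_indirect = dopamine_signal(Q_blocked, 'lever', 'magazine', r=0.0)
assert delta_no_indirect > delta
```

This makes concrete the abstract's claim that "the difference between the upcoming and previous values ... constitutes the core of reward-prediction error."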
Affiliation(s)
- Kenji Morita: Physical and Health Education, Graduate School of Education, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Yasuo Kawaguchi: Division of Cerebral Circuitry, National Institute for Physiological Sciences, Okazaki, Japan; Department of Physiological Sciences, SOKENDAI (The Graduate University for Advanced Studies), Okazaki, Japan; Japan Science and Technology Agency, Core Research for Evolutional Science and Technology, Tokyo, Japan
14. Chen C, Takahashi T, Nakagawa S, Inoue T, Kusumi I. Reinforcement learning in depression: A review of computational research. Neurosci Biobehav Rev 2015; 55:247-267. PMID: 25979140; DOI: 10.1016/j.neubiorev.2015.05.005.
Abstract
Despite being considered primarily a mood disorder, major depressive disorder (MDD) is characterized by cognitive and decision making deficits. Recent research has employed computational models of reinforcement learning (RL) to address these deficits. The computational approach has the advantage in making explicit predictions about learning and behavior, specifying the process parameters of RL, differentiating between model-free and model-based RL, and the computational model-based functional magnetic resonance imaging and electroencephalography. With these merits there has been an emerging field of computational psychiatry and here we review specific studies that focused on MDD. Considerable evidence suggests that MDD is associated with impaired brain signals of reward prediction error and expected value ('wanting'), decreased reward sensitivity ('liking') and/or learning (be it model-free or model-based), etc., although the causality remains unclear. These parameters may serve as valuable intermediate phenotypes of MDD, linking general clinical symptoms to underlying molecular dysfunctions. We believe future computational research at clinical, systems, and cellular/molecular/genetic levels will propel us toward a better understanding of the disease.
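The process parameters the review discusses (learning rate; reward sensitivity, or 'liking') enter a standard model-free update roughly as follows. This is an illustrative parameterization, not a specific fitted model from the review.

```python
# Illustrative parameterization (not a specific model from the review):
# in a model-free Q update, rho scales subjective reward ('liking') and
# alpha is the learning rate; blunted rho or alpha are among the candidate
# intermediate phenotypes discussed for MDD.

def q_update(q, reward, alpha=0.3, rho=1.0):
    return q + alpha * (rho * reward - q)

def learn(alpha, rho, rewards):
    q = 0.0
    for r in rewards:
        q = q_update(q, r, alpha, rho)
    return q

rewards = [1.0] * 30
q_typical = learn(alpha=0.3, rho=1.0, rewards=rewards)
q_blunted = learn(alpha=0.3, rho=0.5, rewards=rewards)   # reduced 'liking'
assert q_typical > 0.99              # converges toward rho * reward = 1.0
assert abs(q_blunted - 0.5) < 0.01   # converges toward 0.5 instead
```

Fitting alpha and rho per subject is what lets such models separate "learning less" from "liking less", the distinction the review emphasizes.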
Affiliation(s)
- Chong Chen: Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
- Taiki Takahashi: Department of Behavioral Science/Center for Experimental Research in Social Sciences, Hokkaido University, Sapporo 060-0810, Japan
- Shin Nakagawa: Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
- Takeshi Inoue: Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
- Ichiro Kusumi: Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
15. Kim KU, Huh N, Jang Y, Lee D, Jung MW. Effects of fictive reward on rat's choice behavior. Sci Rep 2015; 5:8040. PMID: 25623929; PMCID: PMC4894400; DOI: 10.1038/srep08040.
Abstract
Choices of humans and non-human primates are influenced by both actually experienced and fictive outcomes. To test whether this is also the case in rodents, we examined rat's choice behavior in a binary choice task in which variable magnitudes of actual and fictive rewards were delivered. We found that the animal's choice was significantly influenced by the magnitudes of both actual and fictive rewards in the previous trial. A model-based analysis revealed, however, that the effect of fictive reward was more transient and influenced mostly the choice in the next trial, whereas the effect of actual reward was more sustained, consistent with incremental learning of action values. Our results suggest that the capacity to modify future choices based on fictive outcomes might be shared by many different animal species, but fictive outcomes are less effective than actual outcomes in the incremental value learning system.
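The dissociation reported above (actual outcomes drive incremental value learning, while fictive outcomes transiently bias only the next choice) can be sketched as a hybrid choice rule. This is a toy model with our own parameterization, not the paper's fitted model.

```python
# Toy model (our parameterization, not the paper's fitted model): actual
# rewards update action values incrementally and persistently, while the
# fictive (forgone) reward contributes only a one-trial bias toward the
# unchosen option.

def next_trial_preference(q, chosen, r_actual, r_fictive,
                          alpha=0.2, kappa=0.3):
    """Return updated values plus a transient bias for the next trial only."""
    q = dict(q)
    q[chosen] += alpha * (r_actual - q[chosen])   # incremental, persistent
    unchosen = 'B' if chosen == 'A' else 'A'
    bias = {unchosen: kappa * r_fictive}          # transient, decays after one trial
    return q, bias

q0 = {'A': 0.5, 'B': 0.5}
q1, bias = next_trial_preference(q0, chosen='A', r_actual=0.0, r_fictive=1.0)
assert q1['A'] < q0['A']   # actual outcome lowers A's learned value
assert q1['B'] == q0['B']  # fictive outcome leaves B's stored value unchanged
assert bias['B'] > 0       # ...but transiently attracts the next choice
```

Separating a persistent value update from a one-trial bias is what lets a model-based analysis distinguish the two effects in choice data, as the authors did.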
Affiliation(s)
- Ko-Un Kim: Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea; Neuroscience Laboratory, Institute for Medical Sciences, Ajou University School of Medicine, Suwon 443-721, Korea; Neuroscience Graduate Program, Ajou University School of Medicine, Suwon 443-721, Korea
- Namjung Huh: Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea; Neuroscience Laboratory, Institute for Medical Sciences, Ajou University School of Medicine, Suwon 443-721, Korea
- Yunsil Jang: Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea; Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea
- Daeyeol Lee: Department of Neurobiology, Yale University School of Medicine, New Haven, CT 06510, USA
- Min Whan Jung: Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea; Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea; Neuroscience Laboratory, Institute for Medical Sciences, Ajou University School of Medicine, Suwon 443-721, Korea; Neuroscience Graduate Program, Ajou University School of Medicine, Suwon 443-721, Korea
16. Szegletes L, Forstner B. Applications of Modern HCIs in Adaptive Mobile Learning. Journal of Advanced Computational Intelligence and Intelligent Informatics 2014. DOI: 10.20965/jaciii.2014.p0311.
Abstract
Our paper shows how the evolution of HCI devices progresses as a mobile learning tool. Mobile devices provide interesting applications for cognitive infocommunication. Our principal objective is to assist in developing educational games on these devices. Working with different educational institutes, we designed a flexible biofeedback-controlled self-rewarding framework. Several promising approaches and methods are proposed outside the box of educational games in this paper. The attention of players is regulated by changing rewards. We show both how educational games can be improved and how adaptive entertainment games may be developed in the near future.
17. Nakanishi S, Hikida T, Yawata S. Distinct dopaminergic control of the direct and indirect pathways in reward-based and avoidance learning behaviors. Neuroscience 2014; 282:49-59. PMID: 24769227; DOI: 10.1016/j.neuroscience.2014.04.026.
Abstract
The nucleus accumbens (NAc) plays a pivotal role in reward and aversive learning and in learning flexibility. Outputs of the NAc are transmitted through two parallel routes, termed the direct and indirect pathways, and are controlled by the neurotransmitter dopamine (DA). To explore how reward-based and avoidance learning are controlled in the NAc of the mouse, we developed the reversible neurotransmission-blocking (RNB) technique, in which transmission of each pathway could be selectively and reversibly blocked by pathway-specific expression of transmission-blocking tetanus toxin, and the asymmetric RNB technique, in which one side of the NAc was blocked by the RNB technique and the other, intact side was pharmacologically manipulated with a transmitter agonist or antagonist. Our studies demonstrated that activation of D1 receptors in the direct pathway and inactivation of D2 receptors in the indirect pathway are key determinants that distinctly control reward-based and avoidance learning, respectively. D2 receptor inactivation is also critical for the flexibility of reward learning. Furthermore, reward and aversive learning are regulated by a set of common downstream receptors and signaling cascades, all of which are involved in the induction of long-term potentiation at cortico-accumbens synapses of the two pathways. In this article, we review our studies that specify the regulatory mechanisms of each pathway in learning behavior and propose a mechanistic model to explain how dynamic DA modulation promotes the selection of actions that achieve reward-seeking outcomes and avoid aversive ones. The biological significance of a network organization consisting of two parallel transmission pathways is also discussed from the standpoint of effective and prompt selection of neural outcomes.
Affiliation(s)
- S Nakanishi
- Department of Systems Biology, Osaka Bioscience Institute, 6-2-4 Furuedai, Suita, Osaka 565-0874, Japan.
- T Hikida
- Medical Innovation Center, Kyoto University Graduate School of Medicine, 53, Shogoin Kawahara-chou, Sakyo-ku, Kyoto 606-8507, Japan
- S Yawata
- Department of Systems Biology, Osaka Bioscience Institute, 6-2-4 Furuedai, Suita, Osaka 565-0874, Japan
18
Aquili L, Liu AW, Shindou M, Shindou T, Wickens JR. Behavioral flexibility is increased by optogenetic inhibition of neurons in the nucleus accumbens shell during specific time segments. Learn Mem 2014; 21:223-31. [PMID: 24639489 PMCID: PMC3966536 DOI: 10.1101/lm.034199.113] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/05/2022]
Abstract
Behavioral flexibility is vital for survival in an environment of changing contingencies. The nucleus accumbens may play an important role in behavioral flexibility, representing learned stimulus–reward associations in neural activity during response selection and learning from results. To investigate the role of nucleus accumbens neural activity in behavioral flexibility, we used light-activated halorhodopsin to inhibit nucleus accumbens shell neurons during specific time segments of a bar-pressing task requiring a win–stay/lose–shift strategy. We found that optogenetic inhibition during action selection, in the time segment preceding a lever press, had no effect on performance. However, inhibition in the time segment during feedback of results, whether rewards or nonrewards, reduced the errors that occurred after a change in contingency. Our results demonstrate critical time segments during which nucleus accumbens shell neurons integrate feedback into subsequent responses. Inhibiting nucleus accumbens shell neurons in these time segments, during reinforced performance or after a change in contingencies, increases lose–shift behavior. We propose that the activity of nucleus accumbens shell neurons in these time segments plays a key role in integrating knowledge of results into subsequent behavior, as well as in modulating lose–shift behavior when contingencies change.
Affiliation(s)
- Luca Aquili
- Okinawa Institute of Science and Technology Graduate University, Neurobiology Research Unit, Onna-son, Japan 904-0495
19
Baudonnat M, Huber A, David V, Walton ME. Heads for learning, tails for memory: reward, reinforcement and a role of dopamine in determining behavioral relevance across multiple timescales. Front Neurosci 2013; 7:175. [PMID: 24130514 PMCID: PMC3795326 DOI: 10.3389/fnins.2013.00175] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2013] [Accepted: 09/09/2013] [Indexed: 11/13/2022] Open
Abstract
Dopamine has long been tightly associated with aspects of reinforcement learning and motivation in simple situations, where there are a limited number of stimuli to guide behavior and a constrained range of outcomes. In naturalistic situations, however, there are many potential cues and foraging strategies that could be adopted, and it is critical that animals determine what might be behaviorally relevant in such complex environments. This requires not only detecting discrepancies with what they have recently experienced, but also identifying similarities with past experiences stored in memory. Here, we review what role dopamine might play in determining how and when to learn about the world, and how to develop choice policies appropriate to the situation faced. We discuss evidence that dopamine is shaped by motivation and memory and in turn shapes reward-based memory formation. In particular, we suggest that hippocampal-striatal-dopamine networks may interact to determine how surprising the world is and to either inhibit or promote actions at times of behavioral uncertainty.
Affiliation(s)
- Mathieu Baudonnat
- Department of Experimental Psychology, University of Oxford, Oxford, UK
20
Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior. J Neurosci 2013; 33:8866-90. [PMID: 23678129 DOI: 10.1523/jneurosci.4614-12.2013] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Humans and animals take actions quickly when they expect that the actions lead to reward, reflecting their motivation. Injection of dopamine receptor antagonists into the striatum has been shown to slow such reward-seeking behavior, suggesting that dopamine is involved in the control of motivational processes. Meanwhile, neurophysiological studies have revealed that phasic response of dopamine neurons appears to represent reward prediction error, indicating that dopamine plays central roles in reinforcement learning. However, previous attempts to elucidate the mechanisms of these dopaminergic controls have not fully explained how the motivational and learning aspects are related and whether they can be understood by the way the activity of dopamine neurons itself is controlled by their upstream circuitries. To address this issue, we constructed a closed-circuit model of the corticobasal ganglia system based on recent findings regarding intracortical and corticostriatal circuit architectures. Simulations show that the model could reproduce the observed distinct motivational effects of D1- and D2-type dopamine receptor antagonists. Simultaneously, our model successfully explains the dopaminergic representation of reward prediction error as observed in behaving animals during learning tasks and could also explain distinct choice biases induced by optogenetic stimulation of the D1 and D2 receptor-expressing striatal neurons. These results indicate that the suggested roles of dopamine in motivational control and reinforcement learning can be understood in a unified manner through a notion that the indirect pathway of the basal ganglia represents the value of states/actions at a previous time point, an empirically driven key assumption of our model.
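The phasic dopamine signature this abstract builds on can be illustrated with a toy tabular TD(0) learner. This is a sketch for intuition only, not the paper's closed-circuit corticobasal ganglia model; the learning rate and reward magnitude are invented for the example.

```python
# Toy TD(0) value learning (illustrative only): the prediction error to a
# fully predicted reward shrinks across trials, mirroring the phasic
# dopamine response described in the abstract. alpha and reward are
# arbitrary values chosen for this sketch.

def run_trials(n_trials, alpha=0.3, reward=1.0):
    v_cue = 0.0               # learned value of the reward-predicting cue
    errors = []
    for _ in range(n_trials):
        delta = reward - v_cue    # prediction error at reward delivery
        v_cue += alpha * delta    # incremental value update
        errors.append(delta)
    return errors

errors = run_trials(20)
# The first error is maximal (reward fully unexpected); later errors
# decay toward zero as the cue comes to predict the reward.
```

With a learning rate of 0.3, the error on trial n is (1 - 0.3)^n, so it falls below 1% of its initial size within 20 trials.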
21
Ferguson SM, Phillips PEM, Roth BL, Wess J, Neumaier JF. Direct-pathway striatal neurons regulate the retention of decision-making strategies. J Neurosci 2013; 33:11668-76. [PMID: 23843534 PMCID: PMC3724555 DOI: 10.1523/jneurosci.4783-12.2013] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2012] [Revised: 06/04/2013] [Accepted: 06/07/2013] [Indexed: 01/06/2023] Open
Abstract
The dorsal striatum has been implicated in reward-based decision making, but the role played by specific striatal circuits in these processes is essentially unknown. Using cell phenotype-specific viral vectors to express engineered G-protein-coupled DREADD (designer receptors exclusively activated by designer drugs) receptors, we enhanced Gi/o- or Gs-protein-mediated signaling selectively in direct-pathway (striatonigral) neurons of the dorsomedial striatum in Long-Evans rats during discrete periods of training of a high versus low reward-discrimination task. Surprisingly, these perturbations had no impact on reward preference, task performance, or improvement of performance during training. However, we found that transiently increasing Gi/o signaling during training significantly impaired the retention of task strategies used to maximize reward obtainment during subsequent preference testing, whereas increasing Gs signaling produced the opposite effect and significantly enhanced the encoding of a high-reward preference in this decision-making task. Thus, the fact that the endurance of this improved performance was significantly altered over time, long after these neurons were manipulated, indicates that it is under bidirectional control of canonical G-protein-mediated signaling in striatonigral neurons during training. These data demonstrate that cAMP-dependent signaling in direct-pathway neurons plays a well-defined role in reward-related behavior; that is, it modulates the plasticity required for the retention of task-specific information that is used to improve performance on future renditions of the task.
Affiliation(s)
- Susan M. Ferguson
- Center for Integrative Brain Research, Seattle Children's Research Institute, Seattle, Washington 98101
- Departments of Psychiatry and Behavioral Sciences and
- Paul E. M. Phillips
- Departments of Psychiatry and Behavioral Sciences and
- Pharmacology, University of Washington, Seattle, Washington 98195
- Bryan L. Roth
- Department of Pharmacology, Division of Chemical Biology and National Institute of Mental Health Psychoactive Drug Screening Program, University of North Carolina Medical School, Chapel Hill, North Carolina 27599, and
- Jürgen Wess
- Laboratory of Bioorganic Chemistry, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892
- John F. Neumaier
- Departments of Psychiatry and Behavioral Sciences and
- Pharmacology, University of Washington, Seattle, Washington 98195
22
O'Doherty JP. Beyond simple reinforcement learning: the computational neurobiology of reward-learning and valuation. Eur J Neurosci 2013; 35:987-90. [PMID: 22487029 DOI: 10.1111/j.1460-9568.2012.08074.x] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Neural computational accounts of reward-learning have been dominated by the hypothesis that dopamine neurons signal a reward-prediction error and thus facilitate reinforcement learning in striatal target neurons. While this framework is consistent with a large body of behavioral and neural evidence, it fails to account for a number of behavioral and neurobiological observations. In this special issue of EJN, we feature a combination of theoretical and experimental papers highlighting some of the explanatory challenges faced by simple reinforcement-learning models and describing some of the ways in which the framework is being extended to address these challenges.
23
Limongi R, Sutherland SC, Zhu J, Young ME, Habib R. Temporal prediction errors modulate cingulate-insular coupling. Neuroimage 2013; 71:147-57. [PMID: 23333417 DOI: 10.1016/j.neuroimage.2012.12.078] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2012] [Revised: 12/04/2012] [Accepted: 12/29/2012] [Indexed: 11/30/2022] Open
Abstract
Prediction error (i.e., the difference between the expected and the actual outcome of an event) mediates adaptive behavior. Activity in the anterior mid-cingulate cortex (aMCC) and in the anterior insula (aINS) is associated with the commission of prediction errors under uncertainty. We propose a dynamic causal model of effective connectivity (i.e., neuronal coupling) between the aMCC, the aINS, and the striatum, in which the task context drives activity in the aINS and temporal prediction errors modulate extrinsic cingulate-insular connections. With functional magnetic resonance imaging, we scanned 15 participants while they performed a temporal prediction task. They observed visual animations and predicted when a stationary ball would begin moving after being contacted by another moving ball. To induce uncertainty-driven prediction errors, we introduced spatial gaps and temporal delays between the balls. Classical and Bayesian fMRI analyses provided evidence that the aMCC-aINS system, along with the striatum, responds not only when humans predict whether a dynamic event will occur but also when it will occur. Our results reveal that the insula is the entry port of a three-region pathway involved in the processing of temporal predictions. Moreover, prediction errors, rather than attentional demands, task difficulty, or task duration, exert an influence on the aMCC-aINS system; prediction errors weaken the effect of the aMCC on the aINS. Finally, our computational model provides a way forward to characterize the physiological parallel of temporal prediction errors elicited in dynamic tasks.
Affiliation(s)
- Roberto Limongi
- Southern Illinois University Carbondale, USA; Venezuelan Institute for Scientific Research, Venezuela.
24
Sullivan BT, Johnson L, Rothkopf CA, Ballard D, Hayhoe M. The role of uncertainty and reward on eye movements in a virtual driving task. J Vis 2012; 12:19. [PMID: 23262151 DOI: 10.1167/12.13.19] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Eye movements during natural tasks are well coordinated with ongoing task demands, and many variables could influence gaze strategies. Sprague and Ballard (2003) proposed a gaze-scheduling model that uses a utility-weighted uncertainty metric to prioritize fixations on task-relevant objects and predicted that human gaze should be influenced by both reward structure and task-relevant uncertainties. To test this conjecture, we tracked the eye movements of participants in a simulated driving task where uncertainty and implicit reward (via task priority) were varied. Participants were instructed to simultaneously perform a Follow Task where they followed a lead car at a specific distance and a Speed Task where they drove at an exact speed. We varied implicit reward by instructing the participants to emphasize one task over the other and varied uncertainty in the Speed Task with the presence or absence of uniform noise added to the car's velocity. Subjects' gaze data were classified for the image content near fixation and segmented into looks. Gaze measures, including look proportion, duration, and interlook interval, showed that drivers more closely monitor the speedometer if it had a high level of uncertainty, but only if it was also associated with high task priority or implicit reward. The interaction observed appears to be an example of a simple mechanism whereby the reduction of visual uncertainty is gated by behavioral relevance. This lends qualitative support to the primary variables controlling gaze allocation proposed in the Sprague and Ballard model.
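The utility-weighted uncertainty rule described in the abstract can be sketched as follows. This is a hedged illustration of the Sprague-Ballard-style scheduling idea, not the authors' implementation; the task names, weights, and uncertainty values are invented for the example.

```python
# Illustrative gaze-scheduling rule: fixate the task whose product of
# reward weight (task priority) and uncertainty is largest. All numbers
# below are made up for this sketch.

def next_fixation(tasks):
    """tasks maps a task name to (reward_weight, uncertainty);
    return the name with the largest utility-weighted uncertainty."""
    return max(tasks, key=lambda name: tasks[name][0] * tasks[name][1])

tasks = {
    "speedometer": (0.8, 0.9),  # emphasized Speed Task with velocity noise
    "lead_car":    (0.5, 0.3),  # Follow Task, low uncertainty
}
target = next_fixation(tasks)
```

The rule reproduces the qualitative interaction the study reports: a noisy speedometer wins the fixation only while its task priority (reward weight) is also high; drop the weight and the Follow Task wins instead.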
Affiliation(s)
- Brian T Sullivan
- Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA.
25
Morita K, Morishima M, Sakai K, Kawaguchi Y. Reinforcement learning: computing the temporal difference of values via distinct corticostriatal pathways. Trends Neurosci 2012; 35:457-67. [PMID: 22658226 DOI: 10.1016/j.tins.2012.04.009] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2011] [Revised: 04/25/2012] [Accepted: 04/25/2012] [Indexed: 11/25/2022]
Abstract
Midbrain dopamine neurons are thought to encode reward prediction error, but how error signals are computed remains elusive. Here, we propose a mechanism based on recent findings regarding corticostriatal circuits. Specifically, we propose that two distinct subpopulations of corticostriatal neurons differentially represent the animal's current and previous states/actions, through unidirectional connectivity from one subpopulation to the other and strong recurrent excitation that exists only within the recipient subpopulation. These corticostriatal subpopulations selectively connect to the direct and indirect pathways of the basal ganglia, such that the temporal difference between the values of current and previous states/actions, the core of the error signal, can be computed. Our hypothesis suggests a unified view of basal ganglia functions and has important clinical implications.
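The quantity at the core of this hypothesis, the temporal difference between the values carried by the two pathways, can be written in a few lines. This is a minimal sketch of the standard TD error, not the paper's circuit model; the discount factor and value numbers are illustrative only.

```python
# Minimal sketch of the temporal-difference error between the values of
# the current and previous states/actions. gamma and the example values
# are arbitrary choices for illustration.

def td_error(v_prev, v_curr, reward, gamma=0.9):
    """delta_t = r_t + gamma * V(s_t) - V(s_{t-1})."""
    return reward + gamma * v_curr - v_prev

# Sign behavior: an unexpected reward yields a positive error, an
# omitted reward a negative one, with state values held fixed.
delta_reward = td_error(v_prev=0.5, v_curr=0.5, reward=1.0)
delta_omit = td_error(v_prev=0.5, v_curr=0.5, reward=0.0)
```

In the hypothesis above, the direct and indirect pathways would supply the `v_curr` and `v_prev` terms respectively, so the subtraction itself falls out of the circuit's connectivity.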
Affiliation(s)
- Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.