51
|
Funamizu A, Kuhn B, Doya K. Neural substrate of dynamic Bayesian inference in the cerebral cortex. Nat Neurosci 2016; 19:1682-1689. [PMID: 27643432 DOI: 10.1038/nn.4390] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 03/18/2016] [Accepted: 08/08/2016] [Indexed: 12/11/2022]
Abstract
Dynamic Bayesian inference allows a system to infer the environmental state under conditions of limited sensory observation. Using a goal-reaching task, we found that posterior parietal cortex (PPC) and adjacent posteromedial cortex (PM) implemented the two fundamental features of dynamic Bayesian inference: prediction of hidden states using an internal state transition model and updating the prediction with new sensory evidence. We optically imaged the activity of neurons in mouse PPC and PM layers 2, 3 and 5 in an acoustic virtual-reality system. As mice approached a reward site, anticipatory licking increased even when sound cues were intermittently presented; this was disturbed by PPC silencing. Probabilistic population decoding revealed that neurons in PPC and PM represented goal distances during sound omission (prediction), particularly in PPC layers 3 and 5, and prediction improved with the observation of cue sounds (updating). Our results illustrate how cerebral cortex realizes mental simulation using an action-dependent dynamic model.
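The two steps named in this abstract, prediction with a state transition model and updating with new sensory evidence, can be illustrated with a generic discrete Bayes filter. The sketch below is not the authors' decoding method; the three distance bins, the transition matrix `TRANSITION` and the cue likelihood values are hypothetical and chosen only for illustration.

```python
def predict(belief, transition):
    """Propagate the belief one step through the state transition model.

    transition[i][j] is P(next state = j | current state = i).
    """
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(belief, likelihood):
    """Combine a predicted belief with the likelihood of a new observation."""
    unnormalised = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("observation has zero probability under the belief")
    return [u / total for u in unnormalised]


def bayes_filter_step(belief, transition, likelihood=None):
    """One time step; likelihood=None models an omitted cue (prediction only)."""
    predicted = predict(belief, transition)
    return predicted if likelihood is None else update(predicted, likelihood)


# Hypothetical example: three distance bins (far, middle, near the goal).
TRANSITION = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
belief = [1.0, 0.0, 0.0]                                          # start far away
belief = bayes_filter_step(belief, TRANSITION)                    # cue omitted
belief = bayes_filter_step(belief, TRANSITION, [0.1, 0.3, 0.6])   # cue heard
```

When the cue is omitted the belief still moves toward the goal through the transition model alone; when a cue is available, the likelihood reweights the predicted belief.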
Affiliation(s)
- Akihiro Funamizu
  - Neural Computation Unit, Okinawa Institute of Science and Technology Graduate University, Tancha, Onna-son, Kunigami, Okinawa, Japan
  - Optical Neuroimaging Unit, Okinawa Institute of Science and Technology Graduate University, Tancha, Onna-son, Kunigami, Okinawa, Japan
- Bernd Kuhn
  - Optical Neuroimaging Unit, Okinawa Institute of Science and Technology Graduate University, Tancha, Onna-son, Kunigami, Okinawa, Japan
- Kenji Doya
  - Neural Computation Unit, Okinawa Institute of Science and Technology Graduate University, Tancha, Onna-son, Kunigami, Okinawa, Japan
|
52
|
Striatal Activity and Reward Relativity: Neural Signals Encoding Dynamic Outcome Valuation. eNeuro 2016; 3:eN-NWR-0022-16. [PMID: 27822506 PMCID: PMC5089537 DOI: 10.1523/eneuro.0022-16.2016] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 01/30/2016] [Revised: 10/06/2016] [Accepted: 10/07/2016] [Indexed: 11/21/2022] Open
Abstract
The striatum is a key brain region involved in reward processing. Striatal activity has been linked to encoding reward magnitude and integrating diverse reward outcome information. Recent work has supported the involvement of striatum in the valuation of outcomes. The present work extends this idea by examining striatal activity during dynamic shifts in value that include different levels and directions of magnitude disparity. A novel task was used to produce diverse relative reward effects on a chain of instrumental action. Rats (Rattus norvegicus) were trained to respond to cues associated with specific outcomes varying by food pellet magnitude. Animals were exposed to single-outcome sessions followed by mixed-outcome sessions, and neural activity was compared among identical outcome trials from the different behavioral contexts. Results recording striatal activity show that neural responses to different task elements reflect incentive contrast as well as other relative effects that involve generalization between outcomes or possible influences of outcome variety. The activity that was most prevalent was linked to food consumption and post-food consumption periods. Relative encoding was sensitive to magnitude disparity. A within-session analysis showed strong contrast effects that were dependent upon the outcome received in the immediately preceding trial. Significantly higher numbers of responses were found in ventral striatum linked to relative outcome effects. Our results support the idea that relative value can incorporate diverse relationships, including comparisons from specific individual outcomes to general behavioral contexts. The striatum contains these diverse relative processes, possibly enabling both a higher information yield concerning value shifts and a greater behavioral flexibility.
|
53
|
Striosome-dendron bouquets highlight a unique striatonigral circuit targeting dopamine-containing neurons. Proc Natl Acad Sci U S A 2016; 113:11318-11323. [PMID: 27647894 DOI: 10.1073/pnas.1613337113] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Indexed: 01/31/2023] Open
Abstract
The dopamine systems of the brain powerfully influence movement and motivation. We demonstrate that striatonigral fibers originating in striosomes form highly unusual bouquet-like arborizations that target bundles of ventrally extending dopamine-containing dendrites and clusters of their parent nigral cell bodies. Retrograde tracing showed that these clustered cell bodies in turn project to the striatum as part of the classic nigrostriatal pathway. Thus, these striosome-dendron formations, here termed "striosome-dendron bouquets," likely represent subsystems with the nigro-striato-nigral loop that are affected in human disorders including Parkinson's disease. Within the bouquets, expansion microscopy resolved many individual striosomal fibers tightly intertwined with the dopamine-containing dendrites and also with afferents labeled by glutamatergic, GABAergic, and cholinergic markers and markers for astrocytic cells and fibers and connexin 43 puncta. We suggest that the striosome-dendron bouquets form specialized integrative units within the dopamine-containing nigral system. Given evidence that striosomes receive input from cortical regions related to the control of mood and motivation and that they link functionally to reinforcement and decision-making, the striosome-dendron bouquets could be critical to dopamine-related function in health and disease.
|
54
|
Lintz MJ, Felsen G. Basal ganglia output reflects internally-specified movements. eLife 2016; 5. [PMID: 27377356 PMCID: PMC4970866 DOI: 10.7554/elife.13833] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/17/2015] [Accepted: 07/04/2016] [Indexed: 01/27/2023] Open
Abstract
How movements are selected is a fundamental question in systems neuroscience. While many studies have elucidated the sensorimotor transformations underlying stimulus-guided movements, less is known about how internal goals - critical drivers of goal-directed behavior - guide movements. The basal ganglia are known to bias movement selection according to value, one form of internal goal. Here, we examine whether other internal goals, in addition to value, also influence movements via the basal ganglia. We designed a novel task for mice that dissociated equally rewarded internally-specified and stimulus-guided movements, allowing us to test how each engaged the basal ganglia. We found that activity in the substantia nigra pars reticulata, a basal ganglia output, predictably differed preceding internally-specified and stimulus-guided movements. Incorporating these results into a simple model suggests that internally-specified movements may be facilitated relative to stimulus-guided movements by basal ganglia processing.
Affiliation(s)
- Mario J Lintz
  - Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, United States
  - Neuroscience Program, University of Colorado School of Medicine, Aurora, United States
  - Medical Scientist Training Program, University of Colorado School of Medicine, Aurora, United States
- Gidon Felsen
  - Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, United States
  - Neuroscience Program, University of Colorado School of Medicine, Aurora, United States
  - Medical Scientist Training Program, University of Colorado School of Medicine, Aurora, United States
|
55
|
Neuronal activity in dorsomedial and dorsolateral striatum under the requirement for temporal credit assignment. Sci Rep 2016; 6:27056. [PMID: 27245401 PMCID: PMC4887996 DOI: 10.1038/srep27056] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/23/2016] [Accepted: 05/13/2016] [Indexed: 11/17/2022] Open
Abstract
To investigate neural processes underlying temporal credit assignment in the striatum, we recorded neuronal activity in the dorsomedial and dorsolateral striatum (DMS and DLS, respectively) of rats performing a dynamic foraging task in which a choice has to be remembered until its outcome is revealed for correct credit assignment. Choice signals appeared sequentially, initially in the DMS and then in the DLS, and they were combined with action value and reward signals in the DLS when choice outcome was revealed. Unlike in conventional dynamic foraging tasks, neural signals for chosen value were elevated in neither brain structure. These results suggest that dynamics of striatal neural signals related to evaluating choice outcome might differ drastically depending on the requirement for temporal credit assignment. In a behavioral context requiring temporal credit assignment, the DLS, but not the DMS, might be in charge of updating the value of chosen action by integrating choice, action value, and reward signals together.
|
56
|
Ricker JM, Hatch JD, Powers DD, Cromwell HC. Fractionating choice: A study on reward discrimination, preference, and relative valuation in the rat (Rattus norvegicus). J Comp Psychol 2016; 130:174-86. [PMID: 27078079 DOI: 10.1037/com0000034] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 01/30/2023]
Abstract
Choice behavior combines discrimination between distinctive outcomes, preference for specific outcomes and relative valuation of comparable outcomes. Previous work has focused on 1 component (i.e., preference) disregarding other influential processes that might provide a more complete understanding. Animal models of choice have been explored primarily utilizing extensive training, limited freedom for multiple decisions and sparse behavioral measures constrained to a single phase of motivated action. The present study used a paradigm that combines different elements of previous methods with the goal to distinguish among components of choice and explore how well components match predictions based on risk-sensitive foraging strategies. In order to analyze discrimination and relative valuation, it was necessary to have an option that shifted and an option that remained constant. Shifting outcomes among weeks included a change in single-option outcome (0 to 1 to 2 pellets) or a change in mixed-option outcome (0 or 5 to 0 or 3 to 0 or 1 pellets). Constant outcomes among weeks were also mixed-option (0 or 3 pellets) or single-option (1 pellet). Shifting single-option outcomes among weeks led to better discrimination, more robust preference and significant incentive contrast effects for the alternative outcome. Shifting multioptions altered choice components and led to dissociations among discrimination, preference, and reduced contrast effects. During extinction, all components were impacted with the greatest deficits during the shifting mixed-option outcome sessions. Results suggest choice behavior can be optimized for 1 component but suboptimal for others depending upon the complexity of alterations in outcome value between options.
Affiliation(s)
- Joshua M Ricker
  - J.P. Scott Center for Neuroscience, Mind and Behavior, Bowling Green State University
- Justin D Hatch
  - J.P. Scott Center for Neuroscience, Mind and Behavior, Bowling Green State University
- Daniel D Powers
  - J.P. Scott Center for Neuroscience, Mind and Behavior, Bowling Green State University
- Howard Casey Cromwell
  - J.P. Scott Center for Neuroscience, Mind and Behavior, Bowling Green State University
|
57
|
Wang KS, Smith DV, Delgado MR. Using fMRI to study reward processing in humans: past, present, and future. J Neurophysiol 2016; 115:1664-78. [PMID: 26740530 DOI: 10.1152/jn.00333.2015] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 03/31/2015] [Accepted: 01/04/2016] [Indexed: 01/10/2023] Open
Abstract
Functional magnetic resonance imaging (fMRI) is a noninvasive tool used to probe cognitive and affective processes. Although fMRI provides indirect measures of neural activity, the advent of fMRI has allowed for 1) the corroboration of significant animal findings in the human brain, and 2) the expansion of models to include more common human attributes that inform behavior. In this review, we briefly consider the neural basis of the blood oxygenation level dependent signal to set up a discussion of how fMRI studies have applied it in examining cognitive models in humans and the promise of using fMRI to advance such models. Specifically, we illustrate the contribution that fMRI has made to the study of reward processing, focusing on the role of the striatum in encoding reward-related learning signals that drive anticipatory and consummatory behaviors. For instance, we discuss how fMRI can be used to link neural signals (e.g., striatal responses to rewards) to individual differences in behavior and traits. While this functional segregation approach has been constructive to our understanding of reward-related functions, many fMRI studies have also benefitted from a functional integration approach that takes into account how interconnected regions (e.g., corticostriatal circuits) contribute to reward processing. We contend that future work using fMRI will profit from using a multimodal approach, such as combining fMRI with noninvasive brain stimulation tools (e.g., transcranial electrical stimulation), that can identify causal mechanisms underlying reward processing. Consequently, advancements in implementing fMRI will promise new translational opportunities to inform our understanding of psychopathologies.
Affiliation(s)
- Kainan S Wang
  - Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, New Jersey
- David V Smith
  - Department of Psychology, Rutgers University, Newark, New Jersey
- Mauricio R Delgado
  - Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, New Jersey
  - Department of Psychology, Rutgers University, Newark, New Jersey
|
58
|
Memory Systems of the Basal Ganglia. 2016. [DOI: 10.1016/b978-0-12-802206-1.00035-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
|
59
|
Ito M, Doya K. Parallel Representation of Value-Based and Finite State-Based Strategies in the Ventral and Dorsal Striatum. PLoS Comput Biol 2015; 11:e1004540. [PMID: 26529522 PMCID: PMC4631489 DOI: 10.1371/journal.pcbi.1004540] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 04/16/2015] [Accepted: 09/08/2015] [Indexed: 12/05/2022] Open
Abstract
Previous theoretical studies of animal and human behavioral learning have focused on the dichotomy of the value-based strategy using action value functions to predict rewards and the model-based strategy using internal models to predict environmental states. However, animals and humans often take simple procedural behaviors, such as the “win-stay, lose-switch” strategy without explicit prediction of rewards or states. Here we consider another strategy, the finite state-based strategy, in which a subject selects an action depending on its discrete internal state and updates the state depending on the action chosen and the reward outcome. By analyzing choice behavior of rats in a free-choice task, we found that the finite state-based strategy fitted their behavioral choices more accurately than value-based and model-based strategies did. When fitted models were run autonomously with the same task, only the finite state-based strategy could reproduce the key feature of choice sequences. Analyses of neural activity recorded from the dorsolateral striatum (DLS), the dorsomedial striatum (DMS), and the ventral striatum (VS) identified significant fractions of neurons in all three subareas for which activities were correlated with individual states of the finite state-based strategy. The signal of internal states at the time of choice was found in DMS, and for clusters of states was found in VS. In addition, action values and state values of the value-based strategy were encoded in DMS and VS, respectively. These results suggest that both the value-based strategy and the finite state-based strategy are implemented in the striatum.
Author Summary
The neural mechanism of decision-making, a cognitive process to select one action among multiple possibilities, is a fundamental issue in neuroscience. Previous studies have revealed the roles of the cerebral cortex and the basal ganglia in decision-making, by assuming that subjects take a value-based reinforcement learning strategy, in which the expected reward for each action candidate is updated. However, animals and humans often use simple procedural strategies, such as “win-stay, lose-switch.” In this study, we consider a finite state-based strategy, in which a subject acts depending on its discrete internal state and updates the state based on reward feedback. We found that the finite state-based strategy could reproduce the choice behavior of rats in a binary choice task with higher accuracy than the value-based strategy. Interestingly, neuronal activity in the striatum, a crucial brain region for reward-based learning, encoded information regarding both strategies. These results suggest that both the value-based strategy and the finite state-based strategy are implemented in the striatum.
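As an illustration of what a finite state-based strategy looks like in its simplest form, the “win-stay, lose-switch” rule mentioned above can be written as a two-state machine in which the internal state is the action the agent intends to take next. This is a minimal sketch, not the model fitted in the paper; the action labels `"L"` and `"R"` and the function names are hypothetical.

```python
# Hypothetical binary choice task with a left ("L") and a right ("R") action.
OTHER = {"L": "R", "R": "L"}


def select_action(state):
    """The action depends only on the discrete internal state."""
    return state


def update_state(state, action, rewarded):
    """Update the state from the chosen action and the reward outcome.

    Win-stay, lose-switch ignores the previous state; a richer finite
    state-based strategy could use it as well.
    """
    return action if rewarded else OTHER[action]


def run(rewards, initial_state="L"):
    """Run the strategy over a sequence of reward outcomes, one per trial."""
    state = initial_state
    actions = []
    for rewarded in rewards:
        action = select_action(state)
        actions.append(action)
        state = update_state(state, action, rewarded)
    return actions


actions = run([True, False, False, True])
```

No reward prediction or environmental model is involved: the choice sequence is fully determined by the current state and the last outcome.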
Affiliation(s)
- Makoto Ito
  - Okinawa Institute of Science and Technology Graduate University, Onna-son, Okinawa, Japan
- Kenji Doya
  - Okinawa Institute of Science and Technology Graduate University, Onna-son, Okinawa, Japan
|
60
|
Pezzulo G, Rigoli F, Friston K. Active Inference, homeostatic regulation and adaptive behavioural control. Prog Neurobiol 2015; 134:17-35. [PMID: 26365173 PMCID: PMC4779150 DOI: 10.1016/j.pneurobio.2015.09.001] [Citation(s) in RCA: 289] [Impact Index Per Article: 32.1] [Received: 03/12/2014] [Revised: 07/20/2015] [Accepted: 09/08/2015] [Indexed: 11/30/2022]
Abstract
We review a theory of homeostatic regulation and adaptive behavioural control within the Active Inference framework. Our aim is to connect two research streams that are usually considered independently; namely, Active Inference and associative learning theories of animal behaviour. The former uses a probabilistic (Bayesian) formulation of perception and action, while the latter calls on multiple (Pavlovian, habitual, goal-directed) processes for homeostatic and behavioural control. We offer a synthesis of these classical processes and cast them as successive hierarchical contextualisations of sensorimotor constructs, using the generative models that underpin Active Inference. This dissolves any apparent mechanistic distinction between the optimization processes that mediate classical control or learning. Furthermore, we generalize the scope of Active Inference by emphasizing interoceptive inference and homeostatic regulation. The ensuing homeostatic (or allostatic) perspective provides an intuitive explanation for how priors act as drives or goals to enslave action, and emphasises the embodied nature of inference.
Affiliation(s)
- Giovanni Pezzulo
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
|
61
|
Malhotra S, Cross RW, Zhang A, van der Meer MAA. Ventral striatal gamma oscillations are highly variable from trial to trial, and are dominated by behavioural state, and only weakly influenced by outcome value. Eur J Neurosci 2015; 42:2818-32. [DOI: 10.1111/ejn.13069] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/12/2014] [Revised: 09/03/2015] [Accepted: 09/07/2015] [Indexed: 01/10/2023]
Affiliation(s)
- Sushant Malhotra
  - Department of Biology and Centre for Theoretical Neuroscience; University of Waterloo; Ontario Canada
  - Systems Design Engineering; University of Waterloo; Ontario Canada
- Rob W. Cross
  - Department of Biology and Centre for Theoretical Neuroscience; University of Waterloo; Ontario Canada
- Anqi Zhang
  - Program in Neuroscience; McGill University; Montreal Quebec Canada
- Matthijs A. A. van der Meer
  - Department of Biology and Centre for Theoretical Neuroscience; University of Waterloo; Ontario Canada
  - Department of Psychological and Brain Sciences; Dartmouth College; Hanover NH 03755 USA
|
62
|
Balleine BW, Dezfouli A, Ito M, Doya K. Hierarchical control of goal-directed action in the cortical–basal ganglia network. Curr Opin Behav Sci 2015. [DOI: 10.1016/j.cobeha.2015.06.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 01/23/2023]
|