For: Morita K, Morishima M, Sakai K, Kawaguchi Y. Reinforcement learning: computing the temporal difference of values via distinct corticostriatal pathways. Trends Neurosci 2012;35:457-67. [PMID: 22658226 DOI: 10.1016/j.tins.2012.04.009] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 11/11/2011] [Revised: 04/25/2012] [Accepted: 04/25/2012] [Indexed: 11/25/2022]

Cited by Other Article(s)

Blackwell KT, Doya K. Enhancing reinforcement learning models by including direct and indirect pathways improves performance on striatal dependent tasks. PLoS Comput Biol 2023;19:e1011385. [PMID: 37594982 PMCID: PMC10479916 DOI: 10.1371/journal.pcbi.1011385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 09/05/2023] [Accepted: 07/25/2023] [Indexed: 08/20/2023] Open

Abstract

A major advance in understanding learning behavior stems from experiments showing that reward learning requires dopamine inputs to striatal neurons and arises from synaptic plasticity of cortico-striatal synapses. Numerous reinforcement learning models mimic this dopamine-dependent synaptic plasticity by using the reward prediction error, which resembles dopamine neuron firing, to learn the best action in response to a set of cues. Though these models can explain many facets of behavior, reproducing some types of goal-directed behavior, such as renewal and reversal, requires additional model components. Here we present a reinforcement learning model, TD2Q, which better corresponds to the basal ganglia with two Q matrices, one representing direct pathway neurons (G) and another representing indirect pathway neurons (N). Unlike previous two-Q architectures, a novel and critical aspect of TD2Q is to update the G and N matrices utilizing the temporal difference reward prediction error. A best action is selected for N and G using a softmax with a reward-dependent adaptive exploration parameter, and then differences are resolved using a second selection step applied to the two action probabilities. The model is tested on a range of multi-step tasks including extinction, renewal, discrimination; switching reward probability learning; and sequence learning. Simulations show that TD2Q produces behaviors similar to rodents in choice and sequence learning tasks, and that use of the temporal difference reward prediction error is required to learn multi-step tasks. Blocking the update rule on the N matrix blocks discrimination learning, as observed experimentally. Performance in the sequence learning task is dramatically improved with two matrices. These results suggest that including additional aspects of basal ganglia physiology can improve the performance of reinforcement learning models, better reproduce animal behaviors, and provide insight as to the role of direct- and indirect-pathway striatal neurons.
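
The two-matrix idea described in this abstract can be illustrated with a minimal Python sketch. This is not the published TD2Q code. The function names, the learning rate `alpha`, the discount factor `gamma`, the use of G alone to compute the temporal difference error, and the opposite-sign update of N are all assumptions made here for illustration; the abstract states only that both matrices are updated with the temporal difference reward prediction error and that actions are selected with a softmax.

```python
import math


def softmax(values, beta):
    """Softmax probabilities over a list of values; beta is the inverse temperature."""
    scaled = [beta * v for v in values]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def td_error(reward, next_best_value, current_value, gamma=0.9):
    """Temporal difference reward prediction error: r + gamma * V(s') - Q(s, a)."""
    return reward + gamma * next_best_value - current_value


def update_two_q(G, N, state, action, next_state, reward, alpha=0.1, gamma=0.9):
    """Update both matrices in place from one transition and return the TD error.

    Assumption (not from the abstract): the error is computed from G only,
    G moves with the error and N moves against it.
    """
    delta = td_error(reward, max(G[next_state]), G[state][action], gamma)
    G[state][action] += alpha * delta  # direct-pathway matrix
    N[state][action] -= alpha * delta  # indirect-pathway matrix
    return delta


if __name__ == "__main__":
    # Two states, two actions, all values start at zero.
    G = [[0.0, 0.0], [0.0, 0.0]]
    N = [[0.0, 0.0], [0.0, 0.0]]
    delta = update_two_q(G, N, state=0, action=1, next_state=1, reward=1.0)
    print("TD error:", delta)
    print("Action probabilities from G:", softmax(G[0], beta=2.0))
```

The second selection step that resolves disagreements between the G-based and N-based choices is omitted, because the abstract does not specify it in enough detail to sketch without guessing.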

Gao Y. A computational model of learning flexible navigation in a maze by layout-conforming replay of place cells. Front Comput Neurosci 2023;17:1053097. [PMID: 36846726 PMCID: PMC9947252 DOI: 10.3389/fncom.2023.1053097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2022] [Accepted: 01/16/2023] [Indexed: 02/11/2023] Open

Shimomura K, Kato A, Morita K. Rigid reduced successor representation as a potential mechanism for addiction. Eur J Neurosci 2021;53:3768-3790. [PMID: 33840120 PMCID: PMC8252639 DOI: 10.1111/ejn.15227] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/29/2020] [Revised: 03/30/2021] [Accepted: 04/07/2021] [Indexed: 12/14/2022]

Gilbertson T, Steele D. Tonic dopamine, uncertainty and basal ganglia action selection. Neuroscience 2021;466:109-124. [PMID: 34015370 DOI: 10.1016/j.neuroscience.2021.05.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/18/2020] [Revised: 05/04/2021] [Accepted: 05/08/2021] [Indexed: 11/29/2022]

Inglis JB, Valentin VV, Ashby FG. Modulation of Dopamine for Adaptive Learning: A Neurocomputational Model. Comput Brain Behav 2021;4:34-52. [PMID: 34151186 PMCID: PMC8210637 DOI: 10.1007/s42113-020-00083-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]

Roh M, Lee H, Seo H, Lim CS, Park P, Choi JE, Kwak JH, Lee J, Kaang BK, McHugh TJ, Lee K. Perseverative stereotypic behavior of Epac2 KO mice in a reward-based decision making task. Neurosci Res 2020;161:8-17. [PMID: 33007326 DOI: 10.1016/j.neures.2020.08.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/26/2020] [Revised: 08/14/2020] [Accepted: 08/21/2020] [Indexed: 11/18/2022]

Erdeniz B, Done J. Towards Automaticity in Reinforcement Learning: A Model-Based Functional Magnetic Resonance Imaging Study. Noro Psikiyatr Ars 2020;57:98-107. [PMID: 32550774 DOI: 10.29399/npa.24772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2019] [Accepted: 11/11/2019] [Indexed: 11/07/2022]

Abstract

Introduction

Previous studies showed that, over the course of learning, many neurons in the medial prefrontal cortex adapt their firing rate towards the options with the highest predicted reward value, but that during later learning trials the brain switches to a more automatic processing mode governed by the basal ganglia. Based on this evidence, we hypothesized that during early learning trials the predicted values of chosen options would be coded by a goal-directed system in the medial frontal cortex, whereas during late trials they would be coded by the habitual learning system in the dorsal striatum.

Methods

In this study, using a 3 Tesla functional magnetic resonance imaging (fMRI) scanner, blood oxygen level dependent (BOLD) signal data were collected while participants (N=12) performed a reinforcement learning task. The task consisted of instrumental conditioning trials in which, on each trial, a participant chose one of two available options in order to win or avoid losing money. Depending on the experimental condition, participants received a monetary reward (gain money), a monetary penalty (lose money), or a neutral outcome.

Results

Using model-based analysis for event-related fMRI designs, region of interest (ROI) analysis was performed on the nucleus accumbens, medial frontal cortex, caudate nucleus, putamen, and the internal and external segments of the globus pallidus. To compare brain activity for early (goal-directed) versus late (habitual, automatic) learning trials, separate ROI analyses were performed for each anatomical sub-region. For the reward condition, we found significant activity in the medial frontal cortex (p<0.05) only for early learning trials, whereas activity shifted to the bilateral putamen (p<0.05) during later trials. For the loss condition, however, no significant activity was found for early trials, while the internal segment of the globus pallidus showed significant activity (p<0.05) for later trials.

Conclusion

We found that during reinforcement learning activation in the brain shifted from the medial frontal regions to dorsal regions of the striatum. These findings suggest that there are two separable (early goal directed and late habitual) learning systems in the brain.

Balasubramani PP, Chakravarthy VS. Bipolar oscillations between positive and negative mood states in a computational model of Basal Ganglia. Cogn Neurodyn 2019;14:181-202. [PMID: 32226561 DOI: 10.1007/s11571-019-09564-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/30/2019] [Revised: 10/28/2019] [Accepted: 11/15/2019] [Indexed: 12/14/2022] Open

Abstract

Bipolar disorder is characterized by mood swings, that is, oscillations between manic and depressive states. The swings (oscillations) mark the length of an episode in a patient's mood cycle (period), and can vary from hours to years. The proposed modeling study uses a decision-making framework to investigate the role of the basal ganglia network in generating bipolar oscillations. In this model, the basal ganglia system performs a two-arm bandit task in which one of the arms (action responses) leads to a positive outcome, while the other leads to a negative outcome. We explore the dynamics of key reward and risk related parameters in the system while the model agent receives various outcomes. Particularly, we study the system using a model that represents the fast dynamics of decision making, and a module to capture the slow dynamics that describe the variation of some meta-parameters of fast dynamics over long time scales. The model is cast at three levels of abstraction: (1) a two-dimensional dynamical system model, that is a simple two variable model capable of showing bistability for rewarding and punitive outcomes; (2) a phenomenological basal ganglia model, to extend the implications from the reduced model to a cortico-basal ganglia setup; (3) a detailed network model of basal ganglia, that incorporates detailed cellular level models for a more realistic understanding. In healthy conditions, the model chooses the positive action and avoids the negative one, whereas under bipolar conditions, the model exhibits slow oscillations in its choice of positive or negative outcomes, reminiscent of bipolar oscillations. Phase-plane analyses on the simple reduced dynamical system with two variables reveal the essential parameters that generate pathological 'bipolar-like' oscillations. Phenomenological and network models of the basal ganglia extend that logic, and interpret bipolar oscillations in terms of the activity of dopaminergic and serotonergic projections on the cortico-basal ganglia network dynamics. The network's dysfunction, specifically in terms of reward and risk sensitivity, is shown to be responsible for the pathological bipolar oscillations. The study proposes a computational model that explores the effects of impaired serotonergic neuromodulation on the dynamics of the cortico-basal ganglia network, and relates this impairment to abstract mood states (manic and depressive episodes) and oscillations of bipolar disorder.

Saiki A, Sakai Y, Fukabori R, Soma S, Yoshida J, Kawabata M, Yawo H, Kobayashi K, Kimura M, Isomura Y. In Vivo Spiking Dynamics of Intra- and Extratelencephalic Projection Neurons in Rat Motor Cortex. Cereb Cortex 2019;28:1024-1038. [PMID: 28137723 DOI: 10.1093/cercor/bhx012] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 09/20/2016] [Accepted: 01/11/2017] [Indexed: 12/15/2022] Open

Kawaguchi Y, Otsuka T, Morishima M, Ushimaru M, Kubota Y. Control of excitatory hierarchical circuits by parvalbumin-FS basket cells in layer 5 of the frontal cortex: insights for cortical oscillations. J Neurophysiol 2019;121:2222-2236. [PMID: 30995139 PMCID: PMC6620693 DOI: 10.1152/jn.00778.2018] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/22/2022] Open

Abstract

The cortex contains multiple neuron types with specific connectivity and functions. Recent progress has provided a better understanding of the interactions of these neuron types as well as their output organization particularly for the frontal cortex, with implications for the circuit mechanisms underlying cortical oscillations that have cognitive functions. Layer 5 pyramidal cells (PCs) in the frontal cortex comprise two major subtypes: crossed-corticostriatal (CCS) and corticopontine (CPn) cells. Functionally, CCS and CPn cells exhibit similar phase-dependent firing during gamma waves but participate in two distinct subnetworks that are linked unidirectionally from CCS to CPn cells. GABAergic parvalbumin-expressing fast-spiking (PV-FS) cells, necessary for gamma oscillation, innervate PCs, with stronger and global inhibition to somata and weaker and localized inhibitions to dendritic shafts/spines. While PV-FS cells form reciprocal connections with both CCS and CPn cells, the excitation from CPn to PV-FS cells exhibits short-term synaptic dynamics conducive for oscillation induction. The electrical coupling between PV-FS cells facilitates spike synchronization among PV-FS cells receiving common excitatory inputs from local PCs and inhibits other PV-FS cells via electrically communicated spike afterhyperpolarizations. These connectivity characteristics can promote synchronous firing in the local networks of CPn cells and firing of some CCS cells by anode-break excitation. Thus subsets of L5 CCS and CPn cells within different levels of connection hierarchy exhibit coordinated activity via their common connections with PV-FS cells, and the resulting PC output drives diverse neuronal targets in cortical layer 1 and the striatum with specific temporal precision, expanding the computational power of the cortical network.

Pallidal circuits for aversive motivation and learning. Curr Opin Behav Sci 2019. [DOI: 10.1016/j.cobeha.2018.09.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/03/2023]

Deperrois N, Moiseeva V, Gutkin B. Minimal Circuit Model of Reward Prediction Error Computations and Effects of Nicotinic Modulations. Front Neural Circuits 2019;12:116. [PMID: 30687021 PMCID: PMC6336136 DOI: 10.3389/fncir.2018.00116] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/14/2018] [Accepted: 12/14/2018] [Indexed: 11/29/2022] Open

Morita K, Kawaguchi Y. A Dual Role Hypothesis of the Cortico-Basal-Ganglia Pathways: Opponency and Temporal Difference Through Dopamine and Adenosine. Front Neural Circuits 2019;12:111. [PMID: 30687019 PMCID: PMC6338031 DOI: 10.3389/fncir.2018.00111] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/31/2018] [Accepted: 11/29/2018] [Indexed: 01/07/2023] Open

Abstract

The hypothesis that the basal-ganglia direct and indirect pathways represent goodness (or benefit) and badness (or cost) of options, respectively, explains a wide range of phenomena. However, this hypothesis, named Opponent Actor Learning (OpAL), still has limitations. Structurally, the OpAL model does not incorporate differentiation of the two types of cortical inputs to the basal-ganglia pathways received from intratelencephalic (IT) and pyramidal-tract (PT) neurons. Functionally, the OpAL model does not describe the temporal-difference (TD)-type reward-prediction-error (RPE), nor does it explain how RPE is calculated in the circuitry connecting to the DA neurons. In fact, there is a different hypothesis on the basal-ganglia pathways and DA, named the Cortico-Striatal-Temporal-Difference (CS-TD) model. The CS-TD model differentiates the IT and PT inputs, describes the TD-type RPE, and explains how TD-RPE is calculated. However, a critical difficulty in this model lies in its assumption that DA induces the same direction of plasticity in both direct and indirect pathways, which apparently contradicts the experimentally observed opposite effects of DA on these pathways. Here, we propose a new hypothesis that integrates the OpAL and CS-TD models. Specifically, we propose that the IT-basal-ganglia pathways represent goodness/badness of current options while the PT-indirect pathway represents the overall value of the previously chosen option, and both of these have influence on the DA neurons, through the basal-ganglia output, so that a variant of TD-RPE is calculated. A key assumption is that opposite directions of plasticity are induced upon phasic activation of DA neurons in the IT-indirect pathway and PT-indirect pathway because of different profiles of IT and PT inputs. Specifically, at PT→indirect-pathway-medium-spiny-neuron (iMSN) synapses, sustained glutamatergic inputs generate rich adenosine, which allosterically prevents DA-D2 receptor signaling and instead favors adenosine-A2A receptor signaling. Then, phasic DA-induced phasic adenosine, which reflects TD-RPE, causes long-term synaptic potentiation. In contrast, at IT→iMSN synapses where adenosine is scarce, phasic DA causes long-term synaptic depression via D2 receptor signaling. This new Opponency and Temporal-Difference (OTD) model provides unique predictions, part of which is potentially in line with recently reported activity patterns of neurons in the globus pallidus externus on the indirect pathway.

Kawaguchi Y. Pyramidal Cell Subtypes and Their Synaptic Connections in Layer 5 of Rat Frontal Cortex. Cereb Cortex 2018;27:5755-5771. [PMID: 29028949 DOI: 10.1093/cercor/bhx252] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 06/29/2017] [Accepted: 09/06/2017] [Indexed: 12/31/2022] Open

Morishima M, Kobayashi K, Kato S, Kobayashi K, Kawaguchi Y. Segregated Excitatory-Inhibitory Recurrent Subnetworks in Layer 5 of the Rat Frontal Cortex. Cereb Cortex 2018;27:5846-5857. [PMID: 29045559 PMCID: PMC5905586 DOI: 10.1093/cercor/bhx276] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/23/2022] Open

A Neural Circuit Mechanism for the Involvements of Dopamine in Effort-Related Choices: Decay of Learned Values, Secondary Effects of Depletion, and Calculation of Temporal Difference Error. eNeuro 2018;5:eN-NWR-0021-18. [PMID: 29468191 PMCID: PMC5820541 DOI: 10.1523/eneuro.0021-18.2018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/04/2018] [Accepted: 01/11/2018] [Indexed: 12/17/2022] Open

The “highs and lows” of the human brain on dopaminergics: Evidence from neuropharmacology. Neurosci Biobehav Rev 2017. [DOI: 10.1016/j.neubiorev.2017.06.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/27/2022]

Negwer M, Schubert D. Talking Convergence: Growing Evidence Links FOXP2 and Retinoic Acid in Shaping Speech-Related Motor Circuitry. Front Neurosci 2017;11:19. [PMID: 28179876 PMCID: PMC5263127 DOI: 10.3389/fnins.2017.00019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/27/2016] [Accepted: 01/10/2017] [Indexed: 01/30/2023] Open

Hart G, Balleine BW. Consolidation of Goal-Directed Action Depends on MAPK/ERK Signaling in Rodent Prelimbic Cortex. J Neurosci 2016;36:11974-11986. [PMID: 27881782 PMCID: PMC6604924 DOI: 10.1523/jneurosci.1772-16.2016] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/02/2016] [Revised: 09/19/2016] [Accepted: 10/03/2016] [Indexed: 02/02/2023] Open

Abstract

The prelimbic prefrontal cortex (PL) has consistently been found to be necessary for the acquisition of goal-directed actions in rodents. Nevertheless, the specific cellular processes underlying this learning remain unknown. We assessed changes in learning-related expression of mitogen-activated protein kinase/extracellular signal-related kinase (MAPK/ERK1/2) phosphorylation (pERK) in layers 2-3 and 5-6 of the anterior and posterior PL and in the population of neurons projecting to posterior dorsomedial striatum (pDMS), also implicated in goal-directed learning. Rats were given either a single session of training to press a lever for a pellet reward or yoked reward deliveries without instrumental training and assessed 5 or 60 min after training for pERK expression. Relative to yoked training, instrumental training produced an increase in pERK expression in all regions of the PL both at 5 and 60 min, and this was accompanied by an increase in nuclear pERK expression in the posterior PL in rats given instrumental training. pDMS-projecting neurons showed a transient increase in pERK expression in posterior layer 5-6 projection neurons after 5 min, and a delayed increase in anterior layer 2-3 neurons after 60 min, suggesting that ERK expression in the PL is necessary for the consolidation of goal-directed learning. Consistent with this claim, we found that rats trained on two lever press actions for distinct outcomes and then infused with the MEK inhibitor PD98059 into the PL immediately after training failed to acquire specific action-outcome associations, suggesting that persistent pERK signaling in the PL is necessary for goal-directed learning.

SIGNIFICANCE STATEMENT

The prelimbic cortex is implicated in goal-directed learning in rodents; however, it is unclear whether it is involved in the consolidation of this learning, and what cellular processes are involved. We used pERK as a marker of activity-related synaptic plasticity to assess learning-induced changes in distinct layers and neuronal populations of the prelimbic prefrontal cortex (PL). Training produced long-lasting upregulation of pERK throughout the PL and specifically within neurons that project to the pDMS, another region critical for goal-directed learning. Next, we demonstrated that pERK signaling in the PL was necessary for the consolidation of goal-directed learning. Together, these results indicate that instrumental training induces ERK signaling in distinct layers and populations in the PL and this signaling underlies the consolidation of goal-directed learning.

Kato A, Morita K. Forgetting in Reinforcement Learning Links Sustained Dopamine Signals to Motivation. PLoS Comput Biol 2016;12:e1005145. [PMID: 27736881 PMCID: PMC5063413 DOI: 10.1371/journal.pcbi.1005145] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 05/11/2016] [Accepted: 09/14/2016] [Indexed: 12/12/2022] Open

Abstract

It has been suggested that dopamine (DA) represents reward-prediction-error (RPE) defined in reinforcement learning and therefore DA responds to unpredicted but not predicted reward. However, recent studies have found DA response sustained towards predictable reward in tasks involving self-paced behavior, and suggested that this response represents a motivational signal. We have previously shown that RPE can sustain if there is decay/forgetting of learned-values, which can be implemented as decay of synaptic strengths storing learned-values. This account, however, did not explain the suggested link between tonic/sustained DA and motivation. In the present work, we explored the motivational effects of the value-decay in self-paced approach behavior, modeled as a series of ‘Go’ or ‘No-Go’ selections towards a goal. Through simulations, we found that the value-decay can enhance motivation, specifically, facilitate fast goal-reaching, albeit counterintuitively. Mathematical analyses revealed that underlying potential mechanisms are twofold: (1) decay-induced sustained RPE creates a gradient of ‘Go’ values towards a goal, and (2) value-contrasts between ‘Go’ and ‘No-Go’ are generated because while chosen values are continually updated, unchosen values simply decay. Our model provides potential explanations for the key experimental findings that suggest DA's roles in motivation: (i) slowdown of behavior by post-training blockade of DA signaling, (ii) observations that DA blockade severely impairs effortful actions to obtain rewards while largely sparing seeking of easily obtainable rewards, and (iii) relationships between the reward amount, the level of motivation reflected in the speed of behavior, and the average level of DA. These results indicate that reinforcement learning with value-decay, or forgetting, provides a parsimonious mechanistic account for the DA's roles in value-learning and motivation. 
Our results also suggest that when biological systems for value-learning are active even though learning has apparently converged, the systems might be in a state of dynamic equilibrium, where learning and forgetting are balanced.

Dopamine (DA) has been suggested to have two reward-related roles: (1) representing reward-prediction-error (RPE), and (2) providing motivational drive. Role(1) is based on the physiological results that DA responds to unpredicted but not predicted reward, whereas role(2) is supported by the pharmacological results that blockade of DA signaling causes motivational impairments such as slowdown of self-paced behavior. So far, these two roles are considered to be played by two different temporal patterns of DA signals: role(1) by phasic signals and role(2) by tonic/sustained signals. However, recent studies have found sustained DA signals with features indicative of both roles (1) and (2), complicating this picture. Meanwhile, whereas synaptic/circuit mechanisms for role(1), i.e., how RPE is calculated in the upstream of DA neurons and how RPE-dependent update of learned-values occurs through DA-dependent synaptic plasticity, have now become clarified, mechanisms for role(2) remain unclear. In this work, we modeled self-paced behavior by a series of ‘Go’ or ‘No-Go’ selections in the framework of reinforcement-learning assuming DA's role(1), and demonstrated that incorporation of decay/forgetting of learned-values, which is presumably implemented as decay of synaptic strengths storing learned-values, provides a potential unified mechanistic account for the DA's two roles, together with its various temporal patterns.
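
The claim that value decay keeps the reward prediction error from vanishing can be checked with a small Python sketch. It is an illustration under stated assumptions, not the authors' model: the function name `step_with_decay`, the parameter values, the multiplicative form of the decay, and the choice to apply decay after the update are all hypothetical. Only the ideas that the chosen value is updated by the RPE while every stored value decays come from the abstract.

```python
def step_with_decay(values, chosen, reward, next_value,
                    alpha=0.5, gamma=1.0, decay=0.9):
    """One learning step with forgetting; returns (new_values, rpe).

    The chosen option is updated by the temporal difference RPE, then every
    stored value (chosen or not) is multiplied by `decay` (an assumed form).
    """
    rpe = reward + gamma * next_value - values[chosen]
    updated = dict(values)
    updated[chosen] += alpha * rpe
    for option in updated:
        updated[option] *= decay
    return updated, rpe


if __name__ == "__main__":
    # 'go' is always chosen and always rewarded; 'nogo' is never chosen.
    values = {"go": 0.0, "nogo": 1.0}
    rpe = 0.0
    for _ in range(200):
        values, rpe = step_with_decay(values, "go", reward=1.0, next_value=0.0)
    print("values after learning:", values)
    print("RPE after learning:", rpe)
```

With these illustrative parameters the value of the chosen option settles where learning and decay balance, so the RPE stays positive for a fully predictable reward, while the unchosen value simply decays toward zero. With `decay=1.0` (no forgetting) the same loop drives the RPE to zero.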

Niv Y, Langdon A. Reinforcement learning with Marr. Curr Opin Behav Sci 2016;11:67-73. [PMID: 27408906 PMCID: PMC4939081 DOI: 10.1016/j.cobeha.2016.04.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/16/2022]

Tian J, Huang R, Cohen JY, Osakada F, Kobak D, Machens CK, Callaway EM, Uchida N, Watabe-Uchida M. Distributed and Mixed Information in Monosynaptic Inputs to Dopamine Neurons. Neuron 2016;91:1374-1389. [PMID: 27618675 DOI: 10.1016/j.neuron.2016.08.018] [Citation(s) in RCA: 139] [Impact Index Per Article: 17.4] [Received: 04/26/2016] [Revised: 06/28/2016] [Accepted: 07/25/2016] [Indexed: 01/29/2023]

Corticostriatal circuit mechanisms of value-based action selection: Implementation of reinforcement learning algorithms and beyond. Behav Brain Res 2016;311:110-121. [DOI: 10.1016/j.bbr.2016.05.017] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/24/2015] [Revised: 05/02/2016] [Accepted: 05/06/2016] [Indexed: 01/20/2023]

Berthet P, Lindahl M, Tully PJ, Hellgren-Kotaleski J, Lansner A. Functional Relevance of Different Basal Ganglia Pathways Investigated in a Spiking Model with Reward Dependent Plasticity. Front Neural Circuits 2016;10:53. [PMID: 27493625 PMCID: PMC4954853 DOI: 10.3389/fncir.2016.00053] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/03/2015] [Accepted: 07/06/2016] [Indexed: 11/13/2022] Open

The functional logic of corticostriatal connections. Brain Struct Funct 2016;222:669-706. [PMID: 27412682 PMCID: PMC5334428 DOI: 10.1007/s00429-016-1250-9] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 11/19/2015] [Accepted: 06/06/2016] [Indexed: 01/09/2023]

Abstract

Unidirectional connections from the cortex to the matrix of the corpus striatum initiate the cortico-basal ganglia (BG)-thalamocortical loop, thought to be important in momentary action selection and in longer-term fine tuning of behavioural repertoire; a discrete set of striatal compartments, striosomes, has the complementary role of registering or anticipating reward that shapes corticostriatal plasticity. Re-entrant signals traversing the cortico-BG loop impact predominantly frontal cortices, conveyed through topographically ordered output channels; by contrast, striatal input signals originate from a far broader span of cortex, and are far more divergent in their termination. The term 'disclosed loop' is introduced to describe this organisation: a closed circuit that is open to outside influence at the initial stage of cortical input. The closed circuit component of corticostriatal afferents is newly dubbed 'operative', as it is proposed to establish the bid for action selection on the part of an incipient cortical action plan; the broader set of converging corticostriatal afferents is described as contextual. A corollary of this proposal is that every unit of the striatal volume, including the long, C-shaped tail of the caudate nucleus, should receive a mandatory component of operative input, and hence include at least one area of BG-recipient cortex amongst the sources of its corticostriatal afferents. Individual operative afferents contact twin classes of GABAergic striatal projection neuron (SPN), distinguished by their neurochemical character, and onward circuitry. This is the basis of the classic direct and indirect pathway model of the cortico-BG loop. Each pathway utilises a serial chain of inhibition, with two such links, or three, providing positive and negative feedback, respectively. Operative co-activation of direct and indirect SPNs is, therefore, pictured to simultaneously promote action, and to restrain it. 
The balance of this rival activity is determined by the contextual inputs, which summarise the external and internal sensory environment, and the state of ongoing behavioural priorities. Notably, the distributed sources of contextual convergence upon a striatal locus mirror the transcortical network harnessed by the origin of the operative input to that locus, thereby capturing a similar set of contingencies relevant to determining action. The disclosed loop formulation of corticostriatal and subsequent BG loop circuitry, as advanced here, refines the operating rationale of the classic model and allows the integration of more recent anatomical and physiological data, some of which can appear at variance with the classic model. Equally, it provides a lucid functional context for continuing cellular studies of SPN biophysics and mechanisms of synaptic plasticity.

Kubota Y, Karube F, Nomura M, Kawaguchi Y. The Diversity of Cortical Inhibitory Synapses. Front Neural Circuits 2016;10:27. [PMID: 27199670 PMCID: PMC4842771 DOI: 10.3389/fncir.2016.00027] [Citation(s) in RCA: 92] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2016] [Accepted: 03/29/2016] [Indexed: 12/03/2022] Open

Neural circuitry involved in quitting after repeated failures: role of the cingulate and temporal parietal junction. Sci Rep 2016;6:24713. [PMID: 27097529 PMCID: PMC4838821 DOI: 10.1038/srep24713] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2015] [Accepted: 03/31/2016] [Indexed: 11/26/2022] Open

Benarroch EE. Intrinsic circuits of the striatum. Neurology 2016;86:1531-42. [DOI: 10.1212/wnl.0000000000002599] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/14/2023] Open

Keiflin R, Janak PH. Dopamine Prediction Errors in Reward Learning and Addiction: From Theory to Neural Circuitry. Neuron 2016;88:247-63. [PMID: 26494275 DOI: 10.1016/j.neuron.2015.08.037] [Citation(s) in RCA: 201] [Impact Index Per Article: 25.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]

Gerfen C, Bolam J. The Neuroanatomical Organization of the Basal Ganglia. HANDBOOK OF BEHAVIORAL NEUROSCIENCE 2016. [DOI: 10.1016/b978-0-12-802206-1.00001-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Temporal Structure of Neuronal Activity among Cortical Neuron Subtypes during Slow Oscillations in Anesthetized Rats. J Neurosci 2015;35:11988-2001. [PMID: 26311779 DOI: 10.1523/jneurosci.5074-14.2015] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open

Abstract

Slow-wave oscillations, the predominant brain rhythm during sleep, are composed of Up/Down cycles. Depolarizing Up-states involve activity in layer 5 (L5) of the neocortex, but it is unknown how diverse subtypes of neurons within L5 participate in generating and maintaining Up-states. Here we compare the in vivo firing patterns of corticopontine (CPn) pyramidal cells, crossed-corticostriatal (CCS) pyramidal cells, and fast-spiking (FS) GABAergic neurons in the rat frontal cortex, with those of thalamocortical neurons during Up/Down cycles in the anesthetized condition. During the transition from Down- to Up-states, increased activity in these neurons was highly temporally structured, with spiking occurring first in thalamocortical neurons, followed by cortical FS cells, CCS cells, and, finally, CPn cells. Activity in some FS, CCS, and CPn neurons occurred in phase with Up-nested gamma rhythms, with FS neurons showing phase delay relative to pyramidal neurons. These results suggest that thalamic and cortical pyramidal neurons are activated in a specific temporal sequence during Up/Down cycles, but cortical pyramidal cells are activated at a similar gamma phase. In addition to Up-state firing specificity, CCS and CPn cells exhibited differences in activity during cortical desynchronization, further indicating projection- and state-dependent information processing within L5.

SIGNIFICANCE STATEMENT

Patterned activity in neocortical electroencephalograms, including slow waves and gamma oscillations, is thought to reflect the organized activity of neocortical neurons that comprises many specialized neuron subtypes. We found that the timing of action potentials during slow waves in individual cortical neurons was correlated with their laminar positions and axonal targets. Within gamma cycles nested in the slow-wave depolarization, cortical pyramidal cells fired earlier than did interneurons. At the start of slow-wave depolarizations, activity in thalamic neurons receiving inhibition from the basal ganglia occurred earlier than activity in cortical neurons. Together, these findings reveal a temporally ordered pattern of output from diverse neuron subtypes in the frontal cortex and related thalamic nuclei during neocortical oscillations.

Hassan A, Benarroch EE. Heterogeneity of the midbrain dopamine system: Implications for Parkinson disease. Neurology 2015;85:1795-805. [PMID: 26475693 DOI: 10.1212/wnl.0000000000002137] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023] Open

Morita K, Kawaguchi Y. Computing reward-prediction error: an integrated account of cortical timing and basal-ganglia pathways for appetitive and aversive learning. Eur J Neurosci 2015;42:2003-21. [PMID: 26095906 PMCID: PMC5034842 DOI: 10.1111/ejn.12994] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2015] [Revised: 06/11/2015] [Accepted: 06/17/2015] [Indexed: 12/12/2022]

Abstract

There are two prevailing notions regarding the involvement of the corticobasal ganglia system in value‐based learning: (i) the direct and indirect pathways of the basal ganglia are crucial for appetitive and aversive learning, respectively, and (ii) the activity of midbrain dopamine neurons represents reward‐prediction error. Although (ii) constitutes a critical assumption of (i), it remains elusive how (ii) holds given (i), with the basal‐ganglia influence on the dopamine neurons. Here we present a computational neural‐circuit model that potentially resolves this issue. Based on the latest analyses of the heterogeneous corticostriatal neurons and connections, our model posits that the direct and indirect pathways, respectively, represent the values of upcoming and previous actions, and up‐regulate and down‐regulate the dopamine neurons via the basal‐ganglia output nuclei. This explains how the difference between the upcoming and previous values, which constitutes the core of reward‐prediction error, is calculated. Simultaneously, it predicts that blockade of the direct/indirect pathway causes a negative/positive shift of reward‐prediction error and thereby impairs learning from positive/negative error, i.e. appetitive/aversive learning. Through simulation of reward‐reversal learning and punishment‐avoidance learning, we show that our model could indeed account for the experimentally observed features that are suggested to support notion (i) and could also provide predictions on neural activity. We also present a behavioral prediction of our model, through simulation of inter‐temporal choice, on how the balance between the two pathways relates to the subject's time preference. These results indicate that our model, incorporating the heterogeneity of the cortical influence on the basal ganglia, is expected to provide a closed‐circuit mechanistic understanding of appetitive/aversive learning.
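
As a reading aid, the "difference between the upcoming and previous values" can be written in standard temporal-difference notation. The symbols below are assumptions (the abstract gives none): $r_t$ is the reward, $\gamma$ a discount factor, and $V$ the learned value. Following the abstract, the upcoming-value term would be the one conveyed by the direct pathway and the previous-value term the one conveyed by the indirect pathway.

```latex
\delta_t \;=\; r_t \;+\; \underbrace{\gamma\, V(s_{t+1})}_{\text{upcoming value (direct pathway)}} \;-\; \underbrace{V(s_t)}_{\text{previous value (indirect pathway)}}
```

Under this reading, removing the first value term shifts $\delta_t$ in the negative direction and removing the second shifts it in the positive direction, which matches the blockade prediction stated in the abstract.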

Baladron J, Hamker FH. A spiking neural network based on the basal ganglia functional anatomy. Neural Netw 2015;67:1-13. [DOI: 10.1016/j.neunet.2015.03.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2014] [Revised: 01/29/2015] [Accepted: 03/03/2015] [Indexed: 10/23/2022]

Balasubramani PP, Chakravarthy VS, Ravindran B, Moustafa AA. A network model of basal ganglia for understanding the roles of dopamine and serotonin in reward-punishment-risk based decision making. Front Comput Neurosci 2015;9:76. [PMID: 26136679 PMCID: PMC4469836 DOI: 10.3389/fncom.2015.00076] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2014] [Accepted: 06/02/2015] [Indexed: 01/10/2023] Open

Abstract

There is significant evidence that in addition to reward-punishment based decision making, the Basal Ganglia (BG) contributes to risk-based decision making (Balasubramani et al., 2014). Despite this evidence, little is known about the computational principles and neural correlates of risk computation in this subcortical system. We have previously proposed a reinforcement learning (RL)-based model of the BG that simulates the interactions between dopamine (DA) and serotonin (5HT) in a diverse set of experimental studies including reward, punishment and risk based decision making (Balasubramani et al., 2014). Starting with the classical idea that the activity of mesencephalic DA represents reward prediction error, the model posits that serotoninergic activity in the striatum controls risk-prediction error. Our prior model of the BG was an abstract model that did not incorporate anatomical and cellular-level data. In this work, we expand the earlier model into a detailed network model of the BG and demonstrate the joint contributions of DA-5HT in risk and reward-punishment sensitivity. At the core of the proposed network model is the following insight regarding cellular correlates of value and risk computation. Just as DA D1 receptor (D1R) expressing medium spiny neurons (MSNs) of the striatum were thought to be the neural substrates for value computation, we propose that DA D1R and D2R co-expressing MSNs are capable of computing risk. Though the existence of MSNs that co-express D1R and D2R is reported by various experimental studies, prior computational models did not include them. Ours is the first model that accounts for the computational possibilities of these co-expressing D1R-D2R MSNs, and describes how DA and 5HT mediate activity in these classes of neurons (D1R-, D2R-, D1R-D2R- MSNs).
Starting from the assumption that 5HT modulates all MSNs, our study predicts significant modulatory effects of 5HT on D2R and co-expressing D1R-D2R MSNs, which in turn explain the multifarious functions of 5HT in the BG. The experiments simulated in the present study relate 5HT to risk sensitivity and reward-punishment learning. Furthermore, our model is shown to capture reward-punishment and risk based decision making impairment in Parkinson's Disease (PD). The model predicts that optimizing 5HT levels along with DA medications might be essential for improving the patients' reward-punishment learning deficits.

Fujiyama F, Takahashi S, Karube F. Morphological elucidation of basal ganglia circuits contributing reward prediction. Front Neurosci 2015;9:6. [PMID: 25698913 PMCID: PMC4318281 DOI: 10.3389/fnins.2015.00006] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2014] [Accepted: 01/08/2015] [Indexed: 11/26/2022] Open

Vo K, Rutledge RB, Chatterjee A, Kable JW. Dorsal striatum is necessary for stimulus-value but not action-value learning in humans. Brain 2014;137:3129-35. [PMID: 25273995 DOI: 10.1093/brain/awu277] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Cohen MX. A neural microcircuit for cognitive conflict detection and signaling. Trends Neurosci 2014;37:480-90. [DOI: 10.1016/j.tins.2014.06.004] [Citation(s) in RCA: 248] [Impact Index Per Article: 24.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2014] [Revised: 05/17/2014] [Accepted: 06/05/2014] [Indexed: 11/25/2022]

Robinson JD, Howard CD, Pastuzyn ED, Byers DL, Keefe KA, Garris PA. Methamphetamine-induced neurotoxicity disrupts pharmacologically evoked dopamine transients in the dorsomedial and dorsolateral striatum. Neurotox Res 2014;26:152-67. [PMID: 24562969 PMCID: PMC4071119 DOI: 10.1007/s12640-014-9459-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2013] [Revised: 01/10/2014] [Accepted: 02/04/2014] [Indexed: 11/30/2022]

Abstract

Phasic dopamine (DA) signaling, during which burst firing by DA neurons generates short-lived elevations in extracellular DA in terminal fields called DA transients, is implicated in reinforcement learning. Disrupted phasic DA signaling is proposed to link DA depletions and cognitive-behavioral impairment in methamphetamine (METH)-induced neurotoxicity. Here, we further investigated this disruption by assessing effects of METH pretreatment on DA transients elicited by a drug cocktail of raclopride, a D2 DA receptor antagonist, and nomifensine, an inhibitor of the dopamine transporter (DAT). One advantage of this approach is that pharmacological activation provides a large, high-quality data set of transients elicited by endogenous burst firing of DA neurons for analysis of regional differences and neurotoxicity. These pharmacologically evoked DA transients were measured in the dorsomedial (DM) and dorsolateral (DL) striatum of urethane-anesthetized rats by fast-scan cyclic voltammetry. Electrically evoked DA levels were also recorded to quantify DA release and uptake, and DAT binding was determined by means of autoradiography to index DA denervation. Pharmacologically evoked DA transients in intact animals exhibited a greater amplitude and frequency and shorter duration in the DM compared to the DL striatum, despite similar pre- and post-drug assessments of DA release and uptake in both sub-regions as determined from the electrically evoked DA signals. METH pretreatment reduced transient activity. The most prominent effect of METH pretreatment on transients across striatal sub-region was decreased amplitude, which mirrored decreased DAT binding and was accompanied by decreased DA release. 
Overall, these results identify marked intrastriatal differences in the activity of DA transients that appear independent of presynaptic mechanisms for DA release and uptake and further support disrupted phasic DA signaling mediated by decreased DA release in rats with METH-induced neurotoxicity.

Microstimulation of the human substantia nigra alters reinforcement learning. J Neurosci 2014;34:6887-95. [PMID: 24828643 DOI: 10.1523/jneurosci.5445-13.2014] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open

Balasubramani PP, Chakravarthy VS, Ravindran B, Moustafa AA. An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning. Front Comput Neurosci 2014;8:47. [PMID: 24795614 PMCID: PMC3997037 DOI: 10.3389/fncom.2014.00047] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2013] [Accepted: 03/30/2014] [Indexed: 11/29/2022] Open

Morita K, Kato A. Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits. Front Neural Circuits 2014;8:36. [PMID: 24782717 PMCID: PMC3988379 DOI: 10.3389/fncir.2014.00036] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2014] [Accepted: 03/24/2014] [Indexed: 11/13/2022] Open

Abstract

It has been suggested that the midbrain dopamine (DA) neurons, receiving inputs from the cortico-basal ganglia (CBG) circuits and the brainstem, compute reward prediction error (RPE), the difference between reward obtained or expected to be obtained and reward that had been expected to be obtained. These reward expectations are suggested to be stored in the CBG synapses and updated according to RPE through synaptic plasticity, which is induced by released DA. These together constitute the "DA=RPE" hypothesis, which describes the mutual interaction between DA and the CBG circuits and serves as the primary working hypothesis in studying reward learning and value-based decision-making. However, recent work has revealed a new type of DA signal that appears not to represent RPE. Specifically, it has been found in a reward-associated maze task that striatal DA concentration primarily shows a gradual increase toward the goal. We explored whether such ramping DA could be explained by extending the "DA=RPE" hypothesis by taking into account biological properties of the CBG circuits. In particular, we examined effects of possible time-dependent decay of DA-dependent plastic changes of synaptic strengths by incorporating decay of learned values into the RPE-based reinforcement learning model and simulating reward learning tasks. We then found that incorporation of such a decay dramatically changes the model's behavior, causing gradual ramping of RPE. Moreover, we further incorporated magnitude-dependence of the rate of decay, which could potentially be in accord with some past observations, and found that near-sigmoidal ramping of RPE, resembling the observed DA ramping, could then occur. Given that synaptic decay can be useful for flexibly reversing and updating the learned reward associations, especially in case the baseline DA is low and encoding of negative RPE by DA is limited, the observed DA ramping would be indicative of the operation of such flexible reward learning.
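
The effect described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the linear track of states, the single reward at the goal, the constant per-step decay factor, and every name and parameter value below are assumptions chosen for illustration, and the magnitude-dependent decay mentioned in the abstract is left out.

```python
def simulate(n_states=7, n_trials=500, alpha=0.5, gamma=1.0, decay=0.0, reward=1.0):
    """Return the TD error (RPE) observed at each state on the final trial.

    Hypothetical setup: the agent walks once per trial through states
    0 .. n_states-1 and receives `reward` at the last (goal) state.
    """
    values = [0.0] * n_states
    rpe = [0.0] * n_states
    for _ in range(n_trials):
        for s in range(n_states):
            at_goal = s == n_states - 1
            r = reward if at_goal else 0.0
            v_next = 0.0 if at_goal else values[s + 1]
            # Standard TD error: reward plus discounted next value minus current value.
            delta = r + gamma * v_next - values[s]
            values[s] += alpha * delta
            rpe[s] = delta
            # Assumed decay rule: every learned value shrinks slightly at each time step.
            values = [v * (1.0 - decay) for v in values]
    return rpe


print("no decay:  ", [round(d, 3) for d in simulate(decay=0.0)])
print("with decay:", [round(d, 3) for d in simulate(decay=0.01)])
```

Under these assumed settings, the RPE without decay shrinks toward zero at every state once learning has converged, whereas with decay it stays positive and grows from the start state toward the goal, which is the qualitative ramping the abstract refers to.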

Morita K. Differential cortical activation of the striatal direct and indirect pathway cells: reconciling the anatomical and optogenetic results by using a computational method. J Neurophysiol 2014;112:120-46. [PMID: 24598515 DOI: 10.1152/jn.00625.2013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open

Abstract

The corticostriatal system is considered to be crucially involved in learning and action selection. Anatomical studies have shown that two types of corticostriatal neurons, intratelencephalic (IT) and pyramidal tract (PT) cells, preferentially project to dopamine D1 or D2 receptor-expressing striatal projection neurons, respectively. In contrast, an optogenetic study has shown that stimulation of IT axons evokes comparable responses in D1 and D2 cells and that stimulation of PT axons evokes larger responses in D1 cells. Since the optogenetic study applied brief stimulation only, however, the overall impacts of repetitive inputs remain unclear. Moreover, the apparent contradiction between the anatomical and optogenetic results remains to be resolved. I addressed these issues by using a computational approach. Specifically, I constructed a model of striatal response to cortical inputs, with parameters regarding short-term synaptic plasticity and anatomical connection strength for each connection type. Under the constraint of the optogenetic results, I then explored the parameters that best explain the previously reported paired-pulse ratio of response in D1 and D2 cells to cortical and intrastriatal stimulations, which presumably recruit different compositions of IT and PT fibers. The results indicate that 1) IT→D1 and PT→D2 connections are anatomically stronger than IT→D2 and PT→D1 connections, respectively, consistent with the previous findings, and that 2) IT→D1 and PT→D2 synapses entail short-term facilitation, whereas IT→D2 and PT→D1 synapses would basically show depression, and thereby 3) repetitive IT or PT inputs have larger overall impacts on D1 or D2 cells, respectively, supporting a recently proposed hypothesis on the roles of corticostriatal circuits in reinforcement learning.

GENSAT BAC cre-recombinase driver lines to study the functional organization of cerebral cortical and basal ganglia circuits. Neuron 2014;80:1368-83. [PMID: 24360541 DOI: 10.1016/j.neuron.2013.10.016] [Citation(s) in RCA: 418] [Impact Index Per Article: 41.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/25/2013] [Indexed: 12/12/2022]

Granato A, De Giorgio A. Alterations of neocortical pyramidal neurons: turning points in the genesis of mental retardation. Front Pediatr 2014;2:86. [PMID: 25157343 PMCID: PMC4127660 DOI: 10.3389/fped.2014.00086] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/02/2014] [Accepted: 07/25/2014] [Indexed: 11/20/2022] Open

Seger CA. The visual corticostriatal loop through the tail of the caudate: circuitry and function. Front Syst Neurosci 2013;7:104. [PMID: 24367300 PMCID: PMC3853932 DOI: 10.3389/fnsys.2013.00104] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2013] [Accepted: 11/18/2013] [Indexed: 12/17/2022] Open

Oswald MJ, Tantirigama MLS, Sonntag I, Hughes SM, Empson RM. Diversity of layer 5 projection neurons in the mouse motor cortex. Front Cell Neurosci 2013;7:174. [PMID: 24137110 PMCID: PMC3797544 DOI: 10.3389/fncel.2013.00174] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2013] [Accepted: 09/18/2013] [Indexed: 12/18/2022] Open

Ueta Y, Hirai Y, Otsuka T, Kawaguchi Y. Direction- and distance-dependent interareal connectivity of pyramidal cell subpopulations in the rat frontal cortex. Front Neural Circuits 2013;7:164. [PMID: 24137111 PMCID: PMC3797542 DOI: 10.3389/fncir.2013.00164] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2013] [Accepted: 09/23/2013] [Indexed: 11/16/2022] Open

Abstract

The frontal cortex plays an important role in the initiation and execution of movements via widespread projections to various cortical and subcortical areas. Layer 2/3 (L2/3) pyramidal cells in the frontal cortex send axons mainly to other ipsilateral/contralateral cortical areas. Subpopulations of layer 5 (L5) pyramidal cells that selectively project to the pontine nuclei or to the contralateral cortex [commissural (COM) cells] also target diverse and sometimes overlapping ipsilateral cortical areas. However, little is known about target area-dependent participation in ipsilateral corticocortical (iCC) connections by subclasses of L2/3 and L5 projection neurons. To better understand the functional hierarchy between cortical areas, we compared iCC connectivity between the secondary motor cortex (M2) and adjacent areas, such as the orbitofrontal and primary motor cortices, and distant non-frontal areas, such as the perirhinal and posterior parietal cortices. We particularly assessed the laminar distribution of iCC cells and fibers, and identified the subtypes of pyramidal cells participating in those projections. For connections between M2 and frontal areas, L2/3 and L5 cells in both areas contributed to reciprocal projections, which can be viewed as “bottom-up” or “top-down” on the basis of their differential targeting of cortical lamina. In connections between M2 and non-frontal areas, neurons participating in bottom-up and top-down projections were segregated into the different layers: bottom-up projections arose primarily from L2/3 cells, while top-down projections were dominated by L5 COM cells. These findings suggest that selective participation in iCC connections by pyramidal cell subtypes leads to directional connectivity between M2 and other cortical areas. Based on these findings, we propose a provisional unified framework of interareal hierarchy within the frontal cortex, and discuss the interaction of local circuits with long-range interareal connections.

Cheyne D, Ferrari P. MEG studies of motor cortex gamma oscillations: evidence for a gamma "fingerprint" in the brain? Front Hum Neurosci 2013;7:575. [PMID: 24062675 PMCID: PMC3774986 DOI: 10.3389/fnhum.2013.00575] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2013] [Accepted: 08/27/2013] [Indexed: 02/02/2023] Open

Striedter GF. Bird brains and tool use: beyond instrumental conditioning. BRAIN, BEHAVIOR AND EVOLUTION 2013;82:55-67. [PMID: 23979456 DOI: 10.1159/000352003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]