1
Piquet R, Faugère A, Parkes SL. A hippocampo-cortical pathway detects changes in the validity of an action as a predictor of reward. Curr Biol 2024; 34:24-35.e4. [PMID: 38101404 DOI: 10.1016/j.cub.2023.11.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2023] [Revised: 11/06/2023] [Accepted: 11/16/2023] [Indexed: 12/17/2023]
Abstract
Much research has been dedicated to understanding the psychological and neural bases of goal-directed action, yet the relationship between context and goal-directed action is not well understood. Here, we used excitotoxic lesions, chemogenetics, and circuit-specific manipulations to demonstrate the role of the ventral hippocampus (vHPC) in contextual learning that supports sensitivity to action-outcome contingencies, a hallmark of goal-directed action. We found that chemogenetic inhibition of the ventral, but not dorsal, hippocampus attenuated sensitivity to instrumental contingency degradation. We then tested the hypothesis that this deficit was due to an inability to discern the relative validity of the action compared with the context as a predictor of reward. Using latent inhibition and Pavlovian context conditioning, we confirm that degradation of action-outcome contingencies relies on intact context-outcome learning and show that this learning is dependent on vHPC. Finally, we show that chemogenetic inhibition of vHPC terminals in the medial prefrontal cortex also impairs both instrumental contingency degradation and context-outcome learning. These results implicate a hippocampo-cortical pathway in adapting to changes in instrumental contingencies and indicate that the psychological basis of this deficit is an inability to learn the predictive value of the context. Our findings contribute to a broader understanding of the neural bases of goal-directed action and its contextual regulation.
Affiliation(s)
- Robin Piquet
- Univ. Bordeaux, CNRS, INCIA, UMR 5287, 33000 Bordeaux, France
- Shauna L Parkes
- Univ. Bordeaux, CNRS, INCIA, UMR 5287, 33000 Bordeaux, France
2
Blackwell KT, Doya K. Enhancing reinforcement learning models by including direct and indirect pathways improves performance on striatal dependent tasks. PLoS Comput Biol 2023; 19:e1011385. [PMID: 37594982 PMCID: PMC10479916 DOI: 10.1371/journal.pcbi.1011385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 09/05/2023] [Accepted: 07/25/2023] [Indexed: 08/20/2023] Open
Abstract
A major advance in understanding learning behavior stems from experiments showing that reward learning requires dopamine inputs to striatal neurons and arises from synaptic plasticity of cortico-striatal synapses. Numerous reinforcement learning models mimic this dopamine-dependent synaptic plasticity by using the reward prediction error, which resembles dopamine neuron firing, to learn the best action in response to a set of cues. Though these models can explain many facets of behavior, reproducing some types of goal-directed behavior, such as renewal and reversal, requires additional model components. Here we present a reinforcement learning model, TD2Q, which better corresponds to the basal ganglia with two Q matrices, one representing direct pathway neurons (G) and another representing indirect pathway neurons (N). Unlike previous two-Q architectures, a novel and critical aspect of TD2Q is to update the G and N matrices utilizing the temporal difference reward prediction error. A best action is selected for N and G using a softmax with a reward-dependent adaptive exploration parameter, and then differences are resolved using a second selection step applied to the two action probabilities. The model is tested on a range of multi-step tasks including extinction, renewal, discrimination; switching reward probability learning; and sequence learning. Simulations show that TD2Q produces behaviors similar to rodents in choice and sequence learning tasks, and that use of the temporal difference reward prediction error is required to learn multi-step tasks. Blocking the update rule on the N matrix blocks discrimination learning, as observed experimentally. Performance in the sequence learning task is dramatically improved with two matrices. These results suggest that including additional aspects of basal ganglia physiology can improve the performance of reinforcement learning models, better reproduce animal behaviors, and provide insight as to the role of direct- and indirect-pathway striatal neurons.
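Illustrative note (not part of the cited abstract): the two-matrix idea described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' TD2Q implementation. The class name `TwoQAgent`, the opposite-sign update of N, the fixed inverse temperature `beta` (the abstract describes a reward-dependent adaptive parameter, which is omitted here), and the form of the second selection step are all assumptions made for illustration.

```python
import math
import random


def softmax(values, beta):
    """Softmax probabilities with inverse temperature beta."""
    scaled = [beta * v for v in values]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


class TwoQAgent:
    """Hypothetical two-matrix TD learner: G (direct) and N (indirect)."""

    def __init__(self, n_states, n_actions, alpha=0.2, gamma=0.9,
                 beta=2.0, seed=0):
        self.G = [[0.0] * n_actions for _ in range(n_states)]
        self.N = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.beta = alpha, gamma, beta
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def choose(self, state):
        # One softmax per matrix. Assumption: a low N value favours an action.
        p_g = softmax(self.G[state], self.beta)
        p_n = softmax([-n for n in self.N[state]], self.beta)
        actions = range(self.n_actions)
        a_g = self.rng.choices(actions, weights=p_g)[0]
        a_n = self.rng.choices(actions, weights=p_n)[0]
        if a_g == a_n:
            return a_g
        # Assumed second selection step: choose between the two candidates
        # in proportion to the probability each matrix assigned to its pick.
        return self.rng.choices([a_g, a_n], weights=[p_g[a_g], p_n[a_n]])[0]

    def update(self, state, action, reward, next_state, terminal=False):
        # Temporal difference reward prediction error, computed from G.
        future = 0.0 if terminal else self.gamma * max(self.G[next_state])
        delta = reward + future - self.G[state][action]
        self.G[state][action] += self.alpha * delta
        # Assumption: N moves in the opposite direction to G.
        self.N[state][action] -= self.alpha * delta
        return delta


if __name__ == "__main__":
    # One-state, two-action task in which only action 1 is rewarded.
    agent = TwoQAgent(n_states=1, n_actions=2)
    for _ in range(200):
        a = agent.choose(0)
        agent.update(0, a, 1.0 if a == 1 else 0.0, 0, terminal=True)
    print(agent.G[0], agent.N[0])
```

In this sketch both matrices are driven by the same prediction error, so a rewarded action gains value in G and loses it in N; how the published model weights, decays, or gates these updates is not specified in the abstract.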
Affiliation(s)
- Kim T Blackwell
- Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, Virginia, United States of America
- Kenji Doya
- Neural Computation Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
3
Wikenheiser AM, Schoenbaum G. Over the river, through the woods: cognitive maps in the hippocampus and orbitofrontal cortex. Nat Rev Neurosci 2016; 17:513-23. [PMID: 27256552 PMCID: PMC5541258 DOI: 10.1038/nrn.2016.56] [Citation(s) in RCA: 200] [Impact Index Per Article: 25.0] [Indexed: 12/13/2022]
Abstract
The hippocampus and the orbitofrontal cortex (OFC) both have important roles in cognitive processes such as learning, memory and decision making. Nevertheless, research on the OFC and hippocampus has proceeded largely independently, and little consideration has been given to the importance of interactions between these structures. Here, evidence is reviewed that the hippocampus and OFC encode parallel, but interactive, cognitive 'maps' that capture complex relationships between cues, actions, outcomes and other features of the environment. A better understanding of the interactions between the OFC and hippocampus is important for understanding the neural bases of flexible, goal-directed decision making.
Affiliation(s)
- Andrew M Wikenheiser
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, Maryland 21224, USA
- Geoffrey Schoenbaum
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, Maryland 21224, USA; the Department of Anatomy and Neurobiology, University of Maryland, Baltimore, Maryland 21201, USA; and the Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland 21205, USA
4
MacLaren DAA, Wilson DIG, Winn P. Selective lesions of the cholinergic neurons within the posterior pedunculopontine do not alter operant learning or nicotine sensitization. Brain Struct Funct 2015; 221:1481-97. [PMID: 25586659 DOI: 10.1007/s00429-014-0985-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/24/2014] [Accepted: 11/30/2014] [Indexed: 02/02/2023]
Abstract
Cholinergic neurons within the pedunculopontine tegmental nucleus have been implicated in a range of functions, including behavioral state control, attention, and modulation of midbrain and basal ganglia systems. Previous experiments with excitotoxic lesions have found persistent learning impairment and altered response to nicotine following lesion of the posterior component of the PPTg (pPPTg). These effects have been attributed to disrupted input to midbrain dopamine systems, particularly the ventral tegmental area. The pPPTg contains a dense collection of cholinergic neurons and also large numbers of glutamatergic and GABAergic neurons. Because these interdigitated populations of neurons are all susceptible to excitotoxins, the effects of such lesions cannot be attributed to one neuronal population. We wished to assess whether the learning impairments and altered responses to nicotine in excitotoxic PPTg-lesioned rats were due to loss of cholinergic neurons within the pPPTg. Selective depletion of cholinergic pPPTg neurons is achievable with the fusion toxin Dtx-UII, which targets UII receptors expressed only by cholinergic neurons in this region. Rats bearing bilateral lesions of cholinergic pPPTg neurons (>90% ChAT+ neuronal loss) displayed no deficits in the learning or performance of fixed and variable ratio schedules of reinforcement for pellet reward. Separate rats with the same lesions had a normal locomotor response to nicotine and furthermore sensitized to repeated administration of nicotine at the same rate as sham controls. Previously seen changes in these behaviors following excitotoxic pPPTg lesions cannot be attributed solely to loss of cholinergic neurons. These findings indicate that non-cholinergic neurons within the pPPTg are responsible for the learning deficits and altered responses to nicotine seen after excitotoxic lesions. The functions of cholinergic neurons may be related to behavioral state control and attention rather than learning.
Affiliation(s)
- Duncan A A MacLaren
- Strathclyde Institute of Pharmacy and Biomedical Sciences, 161 Cathedral Street, Glasgow, G4 0RE, UK; School of Psychology and Neuroscience, University of St Andrews, St Andrews, Fife, KY16 9JP, UK
- David I G Wilson
- School of Psychology and Neuroscience, University of St Andrews, St Andrews, Fife, KY16 9JP, UK
- Philip Winn
- Strathclyde Institute of Pharmacy and Biomedical Sciences, 161 Cathedral Street, Glasgow, G4 0RE, UK
5
Reichelt AC, Morris MJ, Westbrook RF. Cafeteria diet impairs expression of sensory-specific satiety and stimulus-outcome learning. Front Psychol 2014; 5:852. [PMID: 25221530 PMCID: PMC4146395 DOI: 10.3389/fpsyg.2014.00852] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 06/01/2014] [Accepted: 07/17/2014] [Indexed: 11/13/2022] Open
Abstract
A range of animal and human data demonstrates that excessive consumption of palatable food leads to neuroadaptive responses in brain circuits underlying reward. Unrestrained consumption of palatable food has been shown to increase the reinforcing value of food and weaken inhibitory control; however, whether it impacts upon the sensory representations of palatable solutions has not been formally tested. These experiments sought to determine whether exposure to a cafeteria diet consisting of palatable high fat foods impacts upon the ability of rats to learn about food-associated cues and the sensory properties of ingested foods. We found that rats fed a cafeteria diet for 2 weeks were impaired in the control of Pavlovian responding in accordance with the incentive value of palatable outcomes associated with auditory cues following devaluation by sensory-specific satiety. Sensory-specific satiety is one mechanism by which a diet containing different foods increases ingestion relative to one lacking variety. Hence, choosing to consume greater quantities of a range of foods may contribute to the current prevalence of obesity. We observed that rats fed a cafeteria diet for 2 weeks showed impaired sensory-specific satiety following consumption of a high calorie solution. The deficit in expression of sensory-specific satiety was also present 1 week following the withdrawal of cafeteria foods. Thus, exposure to obesogenic diets may impact upon neurocircuitry involved in motivated control of behavior.
Affiliation(s)
- Amy C Reichelt
- School of Medical Sciences, The University of New South Wales Sydney, NSW, Australia; School of Psychology, The University of New South Wales Sydney, NSW, Australia
- Margaret J Morris
- School of Medical Sciences, The University of New South Wales Sydney, NSW, Australia
- R F Westbrook
- School of Psychology, The University of New South Wales Sydney, NSW, Australia
6
Pittenger ST, Bevins RA. Interoceptive conditioning with a nicotine stimulus is susceptible to reinforcer devaluation. Behav Neurosci 2013; 127:465-73. [PMID: 23731077 DOI: 10.1037/a0032691] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
Pavlovian conditioning processes contribute to the etiology of nicotine dependence. Conditioning involving interoceptive stimuli is increasingly recognized as playing a role in many diseases and psychopathologies, including drug addiction. Previous animal research on diminishing the influence of interoceptive conditioning has been limited to antagonism and nonreinforced exposures to the drug stimulus. The goal of the present research was to determine whether interoceptive conditioning with a nicotine stimulus could be diminished through an unconditioned stimulus (US) devaluation procedure. In two separate experiments, male Sprague-Dawley rats received nicotine injections (0.4 mg base/kg) followed by intermittent sucrose (26%) access in a conditioning chamber. On intermixed saline sessions, sucrose was withheld. Conditioning was demonstrated by a reliable increase in head entries in the dipper receptacle on nicotine versus saline sessions. After conditioning, rats in a devaluation condition were given access to sucrose in their home cages immediately followed by a lithium chloride (LiCl) injection on 3 consecutive days. On subsequent test days, nicotine-evoked conditioned responding was significantly attenuated. Within-subject (Experiment 1) and between-subjects (Experiment 2) controls revealed that the diminished responding was not attributable to mere exposure to the sucrose US in the devaluation phase. Experiment 2 included a LiCl-alone control group. Repeated illness induced by LiCl did not reduce later nicotine-evoked responding. These findings suggest that there is a direct association between the interoceptive stimulus effects of nicotine and the appetitive sucrose US (i.e., stimulus-stimulus) rather than a stimulus-response association.
Affiliation(s)
- Steven T Pittenger
- Department of Psychology, University of Nebraska-Lincoln, Lincoln, NE 68588-0308, USA
7
Shi Z, Sun X, Liu X, Chen S, Chang Q, Chen L, Song G, Li H. Evaluation of an Aβ1–40-induced cognitive deficit in rat using a reward-directed instrumental learning task. Behav Brain Res 2012; 234:323-33. [DOI: 10.1016/j.bbr.2012.07.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 04/30/2012] [Revised: 07/03/2012] [Accepted: 07/06/2012] [Indexed: 01/04/2023]
8
Son JH, Latimer C, Keefe KA. Impaired formation of stimulus-response, but not action-outcome, associations in rats with methamphetamine-induced neurotoxicity. Neuropsychopharmacology 2011; 36:2441-51. [PMID: 21775980 PMCID: PMC3194071 DOI: 10.1038/npp.2011.131] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]
Abstract
Methamphetamine (METH) induces neurotoxic changes, including partial striatal dopamine depletions, which are thought to contribute to cognitive dysfunction in rodents and humans. The dorsal striatum is implicated in action-outcome (A-O) and stimulus-response (S-R) associations underlying instrumental learning. Thus, the present study examined the long-term consequences of METH-induced neurotoxicity on A-O and S-R associations underlying appetitive instrumental behavior. Rats were pretreated with saline or a neurotoxic regimen of METH (4 × 7.5-10 mg/kg). Rats trained on random ratio (RR) or random interval (RI) schedules of reinforcement were then subjected to outcome devaluation or contingency degradation, followed by an extinction test. All rats then were killed, and brains removed for determination of striatal dopamine loss. The results show that: (1) METH pretreatment induced a partial 45-50% decrease in striatal dopamine tissue content in dorsomedial and dorsolateral striatum; (2) METH-induced neurotoxicity did not alter acquisition of instrumental behavior on either RR or RI schedules; (3) outcome devaluation and contingency degradation similarly decreased responding in saline- and METH-pretreated rats trained on the RR schedule, suggesting intact A-O associations guiding behavior; (4) outcome devaluation after training on the RI schedule decreased extinction responding only in METH-pretreated rats, suggesting impaired S-R associations. Overall, these data suggest that METH-induced neurotoxicity, possibly due to impairment of the function of dorsolateral striatal circuitry, may decrease cognitive flexibility by impairing the ability to automatize behavioral patterns.
Affiliation(s)
- Jong-Hyun Son
- Department of Pharmacology and Toxicology, College of Pharmacy, University of Utah, Salt Lake City, USA
- Christine Latimer
- Department of Neuroscience, Westminster College, Salt Lake City, UT, USA
- Kristen A Keefe
- Department of Pharmacology and Toxicology, College of Pharmacy, University of Utah, 30 S 2000 E Rm 102, Salt Lake City, UT 84112, USA. Tel: +1 801 585 1253; Fax: +1 801 585 5111