1
Blackwell KT, Doya K. Enhancing reinforcement learning models by including direct and indirect pathways improves performance on striatal dependent tasks. PLoS Comput Biol 2023; 19:e1011385. [PMID: 37594982 PMCID: PMC10479916 DOI: 10.1371/journal.pcbi.1011385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 09/05/2023] [Accepted: 07/25/2023] [Indexed: 08/20/2023]
Abstract
A major advance in understanding learning behavior stems from experiments showing that reward learning requires dopamine inputs to striatal neurons and arises from synaptic plasticity of cortico-striatal synapses. Numerous reinforcement learning models mimic this dopamine-dependent synaptic plasticity by using the reward prediction error, which resembles dopamine neuron firing, to learn the best action in response to a set of cues. Though these models can explain many facets of behavior, reproducing some types of goal-directed behavior, such as renewal and reversal, requires additional model components. Here we present a reinforcement learning model, TD2Q, which better corresponds to the basal ganglia with two Q matrices, one representing direct pathway neurons (G) and another representing indirect pathway neurons (N). Unlike previous two-Q architectures, a novel and critical aspect of TD2Q is to update the G and N matrices utilizing the temporal difference reward prediction error. A best action is selected for N and G using a softmax with a reward-dependent adaptive exploration parameter, and then differences are resolved using a second selection step applied to the two action probabilities. The model is tested on a range of multi-step tasks including extinction, renewal, discrimination; switching reward probability learning; and sequence learning. Simulations show that TD2Q produces behaviors similar to rodents in choice and sequence learning tasks, and that use of the temporal difference reward prediction error is required to learn multi-step tasks. Blocking the update rule on the N matrix blocks discrimination learning, as observed experimentally. Performance in the sequence learning task is dramatically improved with two matrices. These results suggest that including additional aspects of basal ganglia physiology can improve the performance of reinforcement learning models, better reproduce animal behaviors, and provide insight as to the role of direct- and indirect-pathway striatal neurons.
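The two-matrix scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the learning rates, the fixed exploration parameter `beta` (the abstract describes a reward-dependent adaptive one), the opposite-sign update of N, and the exact form of the second selection step are all assumptions chosen only to make the idea concrete.

```python
import math
import random

def softmax(values, beta):
    """Numerically stable softmax over a list of action values."""
    scaled = [beta * v for v in values]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def update(G, N, state, action, reward, next_state,
           alpha_g=0.2, alpha_n=0.1, gamma=0.9):
    """Update both matrices with one temporal difference prediction error.

    Assumption: G (direct pathway) moves with the error and
    N (indirect pathway) moves against it.
    """
    delta = reward + gamma * max(G[next_state]) - G[state][action]
    G[state][action] += alpha_g * delta
    N[state][action] -= alpha_n * delta
    return delta

def select_action(G, N, state, beta=2.0, rng=random):
    """Pick one candidate per matrix, then resolve any disagreement."""
    p_g = softmax(G[state], beta)
    # Assumption: a low N value means the action is weakly suppressed.
    p_n = softmax([-v for v in N[state]], beta)
    actions = list(range(len(p_g)))
    a_g = rng.choices(actions, weights=p_g)[0]
    a_n = rng.choices(actions, weights=p_n)[0]
    if a_g == a_n:
        return a_g
    # Hypothetical second step: weight each candidate by its own probability.
    return rng.choices([a_g, a_n], weights=[p_g[a_g], p_n[a_n]])[0]
```

With both matrices initialized to zero, a rewarded transition raises the G entry and lowers the N entry for the chosen action, so both selection steps come to favor that action in that state.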
Affiliation(s)
- Kim T Blackwell
- Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, Virginia, United States of America
- Kenji Doya
- Neural Computation Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
2
Abstract
Learning to stop responding is an important process that allows behavior to adapt to a changing and variable environment. This article reviews recent research in this laboratory and others that has studied how animals learn to stop responding in operant extinction, punishment, and feature-negative learning. Extinction and punishment are shown to be similar in two fundamental ways. First, the response-suppressing effects of both are highly context-specific. Second, the response-suppressing effects of both can be remarkably response-specific: Inhibition of one response transfers little to other responses. Learning to inhibit the response so specifically may result from the correction of "response error," the difference between the level of responding and what the current reinforcer supports. In contrast, the inhibition of responding that develops in feature-negative learning, where the response is reinforced during one discriminative stimulus (A) but not in a compound of A and stimulus B, is less response-specific: The inhibition of responding by stimulus B transfers and inhibits a second response, especially if the second response has itself been inhibited before. The results thus indicate both response-specific and response-general forms of behavioral inhibition. One possibility is that response-specific inhibition is learned when the circumstances encourage the organism to pay attention to the response (to what it is actually doing) as behavioral suppression is learned.
3
Bouton ME, Maren S, McNally GP. Behavioral and neurobiological mechanisms of Pavlovian and instrumental extinction learning. Physiol Rev 2021; 101:611-681. [PMID: 32970967 PMCID: PMC8428921 DOI: 10.1152/physrev.00016.2020] [Citation(s) in RCA: 152] [Impact Index Per Article: 50.7] [Indexed: 12/19/2022]
Abstract
This article reviews the behavioral neuroscience of extinction, the phenomenon in which a behavior that has been acquired through Pavlovian or instrumental (operant) learning decreases in strength when the outcome that reinforced it is removed. Behavioral research indicates that neither Pavlovian nor operant extinction depends substantially on erasure of the original learning but instead depends on new inhibitory learning that is primarily expressed in the context in which it is learned, as exemplified by the renewal effect. Although the nature of the inhibition may differ in Pavlovian and operant extinction, in either case the decline in responding may depend on both generalization decrement and the correction of prediction error. At the neural level, Pavlovian extinction requires a tripartite neural circuit involving the amygdala, prefrontal cortex, and hippocampus. Synaptic plasticity in the amygdala is essential for extinction learning, and prefrontal cortical inhibition of amygdala neurons encoding fear memories is involved in extinction retrieval. Hippocampal-prefrontal circuits mediate fear relapse phenomena, including renewal. Instrumental extinction involves distinct ensembles in corticostriatal, striatopallidal, and striatohypothalamic circuits as well as their thalamic returns for inhibitory (extinction) and excitatory (renewal and other relapse phenomena) control over operant responding. The field has made significant progress in recent decades, although a fully integrated biobehavioral understanding still awaits.
Affiliation(s)
- Mark E Bouton
- Department of Psychological Science, University of Vermont, Burlington, Vermont
- Stephen Maren
- Department of Psychological and Brain Sciences and Institute for Neuroscience, Texas A&M University, College Station, Texas
- Gavan P McNally
- School of Psychology, University of New South Wales, Sydney, Australia
4
Activation of alpha7 nicotinic and NMDA receptors is necessary for performance in a working memory task. Psychopharmacology (Berl) 2020; 237:1723-1735. [PMID: 32162104 PMCID: PMC7313359 DOI: 10.1007/s00213-020-05495-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/08/2019] [Accepted: 02/19/2020] [Indexed: 10/24/2022]
Abstract
RATIONALE: Working memory deficits are present in schizophrenia (SZ) but remain insufficiently resolved by medications. Similar cognitive dysfunctions can be produced acutely in animals by elevating brain levels of kynurenic acid (KYNA). KYNA's effects may reflect interference with the function of both the α7 nicotinic acetylcholine receptor (α7nAChR) and the glycineB site of the NMDA receptor.
OBJECTIVES: The aim of the present study was to examine, using pharmacological tools, the respective roles of these two receptor sites on performance in a delayed non-match-to-position working memory (WM) task (DNMTP).
METHODS: DNMTP consisted of 120 trials/session (5, 10, and 15 s delays). Rats received two doses (25 or 100 mg/kg, i.p.) of L-kynurenine (KYN; bioprecursor of KYNA) or L-4-chlorokynurenine (4-Cl-KYN; bioprecursor of the selective glycineB site antagonist 7-Cl-kynurenic acid). Attenuation of KYN- or 4-Cl-KYN-induced deficits was assessed by co-administration of galantamine (GAL, 3 mg/kg) or PAM-2 (1 mg/kg), two positive modulators of α7nAChR function. Reversal of 4-Cl-KYN-induced deficits was examined using D-cycloserine (DCS; 30 mg/kg), a partial agonist at the glycineB site.
RESULTS: Both KYN and 4-Cl-KYN administration produced dose-related deficits in DNMTP accuracy that were more severe at the longer delays. In KYN-treated rats, these deficits were reversed to control levels by GAL or PAM-2 but not by DCS. In contrast, DCS eliminated performance deficits in 4-Cl-KYN-treated animals.
CONCLUSIONS: These experiments reveal that both α7nAChR and NMDAR activity are necessary for normal WM accuracy. They provide substantive new support for the therapeutic potential of positive modulators at these two receptor sites in SZ and other major brain diseases.
5
Fitzpatrick CJ, Geary T, Creeden JF, Morrow JD. Sign-tracking behavior is difficult to extinguish and resistant to multiple cognitive enhancers. Neurobiol Learn Mem 2019; 163:107045. [PMID: 31319166 DOI: 10.1016/j.nlm.2019.107045] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/07/2019] [Revised: 07/08/2019] [Accepted: 07/14/2019] [Indexed: 12/23/2022]
Abstract
The attribution of incentive-motivational value to drug-related cues underlies relapse and craving in drug addiction. One method of addiction treatment, cue-exposure therapy, utilizes repeated presentations of drug-related cues in the absence of drug (i.e., extinction learning); however, its efficacy has been limited due to an incomplete understanding of extinction and relapse processes after cues have been imbued with incentive-motivational value. To investigate this, we used a Pavlovian conditioned approach procedure to screen for rats that attribute incentive-motivational value to reward-related cues (sign-trackers; STs) or those that do not (goal-trackers; GTs). In Experiment 1, rats underwent Pavlovian extinction followed by reinstatement and spontaneous recovery tests. For comparison, a separate group of rats underwent PCA training followed by operant conditioning, extinction, and tests of reinstatement and spontaneous recovery. In Experiment 2, three cognitive enhancers (sodium butyrate, D-cycloserine, and fibroblast growth factor 2) were administered following extinction training to facilitate extinction learning. STs but not GTs displayed enduring resistance to Pavlovian, but not operant, extinction and were more susceptible to spontaneous recovery. In addition, none of the cognitive enhancers tested affected extinction learning. These results expand our understanding of extinction learning by demonstrating that there is individual variation in extinction and relapse processes and highlight potential difficulties in applying extinction-based therapies to drug addiction treatment in the clinic.
Affiliation(s)
- Trevor Geary
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Justin F Creeden
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Jonathan D Morrow
- Neuroscience Graduate Program, University of Michigan, Ann Arbor, MI, USA; Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
6
Phosphoproteomic Analysis Reveals a Novel Mechanism of CaMKIIα Regulation Inversely Induced by Cocaine Memory Extinction versus Reconsolidation. J Neurosci 2016; 36:7613-27. [PMID: 27445140 DOI: 10.1523/jneurosci.1108-16.2016] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 04/03/2016] [Accepted: 06/15/2016] [Indexed: 11/21/2022]
Abstract
Successful addiction treatment depends on maintaining long-term abstinence, making relapse prevention an essential therapeutic goal. However, exposure to environmental cues associated with drug use often thwarts abstinence efforts by triggering drug using memories that drive craving and relapse. We sought to develop a dual approach for weakening cocaine memories through phosphoproteomic identification of targets regulated in opposite directions by memory extinction compared with reconsolidation in male Sprague-Dawley rats that had been trained to self-administer cocaine paired with an audiovisual cue. We discovered a novel, inversely regulated, memory-dependent phosphorylation event on calcium-calmodulin-dependent kinase II α (CaMKIIα) at serine (S)331. Correspondingly, extinction-associated S331 phosphorylation inhibited CaMKIIα activity. Intra-basolateral amygdala inhibition of CaMKII promoted memory extinction and disrupted reconsolidation, leading to a reduction in subsequent cue-induced reinstatement. CaMKII inhibition had no effect if the memory was neither retrieved nor extinguished. Therefore, inhibition of CaMKII represents a novel mechanism for memory-based addiction treatment that leverages both extinction enhancement and reconsolidation disruption to reduce relapse-like behavior.
SIGNIFICANCE STATEMENT: Preventing relapse to drug use is an important goal for the successful treatment of addictive disorders. Relapse-prevention therapies attempt to interfere with drug-associated memories, but are often hindered by unintentional memory strengthening. In this study, we identify phosphorylation events that are bidirectionally regulated by the reconsolidation versus extinction of a cocaine-associated memory, including a novel site on CaMKIIα. Additionally, using a rodent model of addiction, we show that CaMKII inhibition in the amygdala can reduce relapse-like behavior. Together, our data support the existence of mechanisms that can be used to enhance current strategies for addiction treatment.
7
Bouton ME, Trask S, Carranza-Jasso R. Learning to inhibit the response during instrumental (operant) extinction. J Exp Psychol Anim Learn Cogn 2016; 42:246-58. [PMID: 27379715 PMCID: PMC4943680 DOI: 10.1037/xan0000102] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/08/2022]
Abstract
Five experiments tested implications of the idea that instrumental (operant) extinction involves learning to inhibit the learned response. All experiments used a discriminated operant procedure in which rats were reinforced for lever pressing or chain pulling in the presence of a discriminative stimulus (S), but not in its absence. In Experiment 1, extinction of the response (R) in the presence of S weakened responding in S, but equivalent nonreinforced exposure to S (without the opportunity to make R) did not. Experiment 2 replicated that result and found that extinction of R had no effect on a different R that had also been reinforced in the stimulus. In Experiments 3 and 4, rats first learned to perform several different stimulus and response combinations (S1R1, S2R1, S3R2, and S4R2). Extinction of a response in one stimulus (i.e., S1R1) transferred and weakened the same response, but not a different response, when it was tested in another stimulus (i.e., S2R1 but not S3R2). In Experiment 5, extinction still transferred between S1 and S2 when the stimuli set the occasion for R's association with different types of food pellets. The results confirm the importance of response inhibition in instrumental extinction: Nonreinforcement of the response in S causes the most effective suppression of responding, and response suppression is specific to the response but transfers and influences performance of the same response when it is occasioned by other stimuli. Theoretical and practical implications are discussed.
Affiliation(s)
- Mark E Bouton
- Department of Psychological Science, University of Vermont
- Sydney Trask
- Department of Psychological Science, University of Vermont
8
Brom M, Laan E, Everaerd W, Spinhoven P, Trimbos B, Both S. d-Cycloserine reduces context specificity of sexual extinction learning. Neurobiol Learn Mem 2015; 125:202-10. [PMID: 26456134 DOI: 10.1016/j.nlm.2015.09.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/21/2015] [Revised: 09/07/2015] [Accepted: 09/28/2015] [Indexed: 11/19/2022]
Abstract
BACKGROUND: d-Cycloserine (DCS) enhances extinction processes in animals. Although classical conditioning is hypothesized to play a pivotal role in the aetiology of appetitive motivation problems, no research has been conducted on the effect of DCS on the reduction of context specificity of extinction in human appetitive learning, while facilitation hereof is relevant in the context of treatment of problematic reward-seeking behaviors.
METHODS: Female participants were presented with two conditioned stimuli (CSs) that either predicted (CS+) or did not predict (CS-) a potential sexual reward (unconditioned stimulus (US); genital vibrostimulation). Conditioning took place in context A and extinction in context B. Subjects received DCS (125 mg) or placebo directly after the experiment on day 1 in a randomized, double-blind, between-subject fashion (Placebo n=31; DCS n=31). Subsequent testing for CS-evoked conditioned responses (CRs) in both the conditioning (A) and the extinction context (B) took place 24 h later on day 2. Drug effects on consolidation were then assessed by comparing the recall of sexual extinction memories between the DCS and the placebo groups.
RESULTS: Post-learning administration of DCS facilitates sexual extinction memory consolidation and affects extinction's fundamental context specificity, evidenced by reduced conditioned genital and subjective sexual responses, relative to placebo, for presentations of the reward predicting cue 24 h later outside the extinction context.
CONCLUSIONS: DCS makes appetitive extinction memories context-independent and prevents the return of conditioned response. NMDA receptor glycine site agonists may be potential pharmacotherapies for the prevention of relapse of appetitive motivation disorders with a learned component.
Affiliation(s)
- Mirte Brom
- Institute of Psychology, Clinical Psychology Unit, Leiden University, Wassenaarseweg 52, 2333 AK, The Netherlands; Leiden University Medical Centre, Department of Psychosomatic Gynaecology & Sexology, VRSP, Rijnsburgerweg 10, Zone PG4-Z, P.O. Box 9600, 2300 RC Leiden, The Netherlands.
- Ellen Laan
- Department of Sexology and Psychosomatic Obstetrics and Gynaecology, Academic Medical Centre, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands.
- Walter Everaerd
- Department Clinical Psychology, University of Amsterdam, Weesperplein 4, 1018 XA Amsterdam, The Netherlands.
- Philip Spinhoven
- Institute of Psychology, Clinical Psychology Unit, Leiden University, Wassenaarseweg 52, 2333 AK, The Netherlands; Department of Psychiatry, Leiden University Medical Centre, P.O. Box 9600, 2300 RC Leiden, The Netherlands.
- Baptist Trimbos
- Leiden University Medical Centre, Department of Gynaecology, P.O. Box 9600, 2300 RC Leiden, The Netherlands.
- Stephanie Both
- Leiden University Medical Centre, Department of Psychosomatic Gynaecology & Sexology, VRSP, Rijnsburgerweg 10, Zone PG4-Z, P.O. Box 9600, 2300 RC Leiden, The Netherlands.
9
Pizzimenti CL, Lattal KM. Epigenetics and memory: causes, consequences and treatments for post-traumatic stress disorder and addiction. Genes Brain Behav 2015; 14:73-84. [PMID: 25560936 DOI: 10.1111/gbb.12187] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 10/01/2014] [Revised: 10/24/2014] [Accepted: 11/10/2014] [Indexed: 01/06/2023]
Abstract
Understanding the interaction between fear and reward at the circuit and molecular levels has implications for basic scientific approaches to memory and for understanding the etiology of psychiatric disorders. Both stress and exposure to drugs of abuse induce epigenetic changes that result in persistent behavioral changes, some of which may contribute to the formation of a drug addiction or a stress-related psychiatric disorder. Converging evidence suggests that similar behavioral, neurobiological and molecular mechanisms control the extinction of learned fear and drug-seeking responses. This may, in part, account for the fact that individuals with post-traumatic stress disorder have a significantly elevated risk of developing a substance use disorder and have high rates of relapse to drugs of abuse, even after long periods of abstinence. At the behavioral level, a major challenge in treatments is that extinguished behavior is often not persistent, returning with changes in context, the passage of time or exposure to mild stressors. A common goal of treatments is therefore to weaken the ability of stressors to induce relapse. With the discovery of epigenetic mechanisms that create persistent molecular signals, recent work on extinction has focused on how modulating these epigenetic targets can create lasting extinction of fear or drug-seeking behavior. Here, we review recent evidence pointing to common behavioral, systems and epigenetic mechanisms in the regulation of fear and drug seeking. We suggest that targeting these mechanisms in combination with behavioral therapy may promote treatment and weaken stress-induced relapse.
Affiliation(s)
- C L Pizzimenti
- Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, USA
10
Boutelle KN, Bouton ME. Implications of learning theory for developing programs to decrease overeating. Appetite 2015; 93:62-74. [PMID: 25998235 PMCID: PMC4654402 DOI: 10.1016/j.appet.2015.05.013] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 01/12/2015] [Revised: 04/30/2015] [Accepted: 05/12/2015] [Indexed: 01/09/2023]
Abstract
Childhood obesity is associated with medical and psychological comorbidities, and interventions targeting overeating could be pragmatic and have a significant impact on weight. Calorically dense foods are easily available, variable, and tasty, which allows for effective opportunities to learn to associate behaviors and cues in the environment with food through fundamental conditioning processes, resulting in measurable psychological and physiological food cue reactivity in vulnerable children. Basic research suggests that initial learning is difficult to erase, and that it is vulnerable to a number of phenomena that will allow the original learning to re-emerge after it is suppressed or replaced. These processes may help explain why it may be difficult to change food cue reactivity and overeating over the long term. Extinction theory may be used to develop effective cue-exposure treatments to decrease food cue reactivity through inhibitory learning, although these processes are complex and require an integral understanding of the theory and individual differences. Additionally, learning theory can be used to develop other interventions that may prove to be useful. Through an integration of learning theory, basic and translational research, it may be possible to develop interventions that can decrease the urges to overeat, and improve the weight status of children.
Affiliation(s)
- Kerri N Boutelle
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
- Mark E Bouton
- Department of Psychological Science, University of Vermont, Burlington, VT, USA
11
Abstract
Unhealthy behavior is responsible for much human disease, and a common goal of contemporary preventive medicine is therefore to encourage behavior change. However, while behavior change often seems easy in the short run, it can be difficult to sustain. This article provides a selective review of research from the basic learning and behavior laboratory that provides some insight into why. The research suggests that methods used to create behavior change (including extinction, counterconditioning, punishment, reinforcement of alternative behavior, and abstinence reinforcement) tend to inhibit, rather than erase, the original behavior. Importantly, the inhibition, and thus behavior change more generally, is often specific to the "context" in which it is learned. In support of this view, the article discusses a number of lapse and relapse phenomena that occur after behavior has been changed (renewal, spontaneous recovery, reinstatement, rapid reacquisition, and resurgence). The findings suggest that changing a behavior can be an inherently unstable and unsteady process; frequent lapses should be expected. In the long run, behavior-change therapies might benefit from paying attention to the context in which behavior change occurs.
12
Todd TP, Vurbic D, Bouton ME. Mechanisms of renewal after the extinction of discriminated operant behavior. J Exp Psychol Anim Learn Cogn 2014; 40:355-68. [PMID: 25545982 DOI: 10.1037/xan0000021] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Indexed: 01/11/2023]
Abstract
Three experiments demonstrated, and examined the mechanisms that underlie, the renewal of extinguished discriminated operant behavior. In Experiment 1, rats were trained to perform 1 response (lever press or chain pull) in the presence of one discriminative stimulus (S; light or tone) in Context A, and to perform the other response in the presence of the other S in Context B. Next, each of the original S/response combinations was extinguished in the alternate context. When the S/response combinations were tested back in the context in which they had been trained, responding in the presence of S returned (an ABA renewal effect was observed). This renewal could not be due to differential context-reinforcer associations, suggesting instead that the extinction context inhibits either the response and/or the effectiveness of the S. Consistent with the latter mechanism, in Experiment 2, ABA renewal was still observed when both the extinction and renewal contexts inhibited the same response. However, in Experiment 3, previous extinction of the response in the renewing context (occasioned by a different S) reduced AAB renewal more than did extinction of the different response. Taken together, the results suggest at least 2 mechanisms of renewal after instrumental extinction. First, extinction performance is at least partly controlled by a direct inhibitory association that is formed between the context and the response. Second, in the discriminated operant procedure, extinction performance can sometimes be partly controlled by a reduction in the effectiveness of the S in the extinction context. Renewal of discriminated operant behavior can be produced by a release from either of these forms of inhibition.
13
Todd TP, Vurbic D, Bouton ME. Behavioral and neurobiological mechanisms of extinction in Pavlovian and instrumental learning. Neurobiol Learn Mem 2013; 108:52-64. [PMID: 23999219 DOI: 10.1016/j.nlm.2013.08.012] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 06/07/2013] [Revised: 08/06/2013] [Accepted: 08/23/2013] [Indexed: 11/30/2022]
Abstract
This article reviews research on the behavioral and neural mechanisms of extinction as it is represented in both Pavlovian and instrumental learning. In Pavlovian extinction, repeated presentation of a signal without its reinforcer weakens behavior evoked by the signal; in instrumental extinction, repeated occurrence of a voluntary action without its reinforcer weakens the strength of the action. In either case, contemporary research at both the behavioral and neural levels of analysis has been guided by a set of extinction principles that were first generated by research conducted at the behavioral level. The review discusses these principles and illustrates how they have informed the study of both Pavlovian and instrumental extinction. It shows that behavioral and neurobiological research efforts have been tightly linked and that their results are readily integrated. Pavlovian and instrumental extinction are also controlled by compatible behavioral and neural processes. Since many behavioral effects observed in extinction can be multiply determined, we suggest that the current close connection between behavioral-level and neural-level analyses will need to continue.
Affiliation(s)
- Travis P Todd
- Department of Psychology, University of Vermont, 2 Colchester Ave., Burlington, VT 05405-0134, United States
- Drina Vurbic
- Department of Psychology, University of Vermont, 2 Colchester Ave., Burlington, VT 05405-0134, United States
- Mark E Bouton
- Department of Psychology, University of Vermont, 2 Colchester Ave., Burlington, VT 05405-0134, United States
14
de Bruin N, van Drimmelen M, Kops M, van Elk J, Wetering MMVD, Schwienbacher I. Effects of risperidone, clozapine and the 5-HT6 antagonist GSK-742457 on PCP-induced deficits in reversal learning in the two-lever operant task in male Sprague Dawley rats. Behav Brain Res 2013; 244:15-28. [DOI: 10.1016/j.bbr.2013.01.035] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 11/17/2012] [Revised: 01/21/2013] [Accepted: 01/26/2013] [Indexed: 12/31/2022]
15
Leslie JC, Norwood K. Facilitation of extinction and re-extinction of operant behavior in mice by chlordiazepoxide and d-cycloserine. Neurobiol Learn Mem 2013; 102:1-6. [DOI: 10.1016/j.nlm.2013.02.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/23/2012] [Revised: 02/01/2013] [Accepted: 02/05/2013] [Indexed: 10/27/2022]
16
Reconsolidation and extinction of an appetitive pavlovian memory. Neurobiol Learn Mem 2013; 104:25-31. [PMID: 23639449 DOI: 10.1016/j.nlm.2013.04.009] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 01/21/2013] [Revised: 04/19/2013] [Accepted: 04/19/2013] [Indexed: 01/20/2023]
Abstract
When memories are retrieved, they can enter a labile state during which the memory may be modified and subsequently restabilized through the process of reconsolidation. However, this does not occur in all situations, and certain "boundary conditions" determine whether a memory will undergo reconsolidation. Naïve male Lister Hooded rats were trained for 5 days to press a lever in order to retrieve a food reward associated with a pavlovian light stimulus. Three days post-training, animals were injected with either MK-801 (0.1 mg/kg, i.p.) or saline vehicle, 30 min before they were placed back into the training context for a retrieval session. Lever pressing was reinforced only by the light stimulus and was restricted to either 10, 30 or 50 presentations of the light conditioned stimulus. After 48 h, animals were again returned to the boxes and light-reinforced lever-pressing activity was recorded. MK-801-treated animals in the 10CS group significantly reduced lever pressing at test, compared to saline controls. In contrast, MK-801-treated rats in the 50CS group demonstrated a significant increase. There was no effect of MK-801 in the 30CS group. Additionally, there were no effects of MK-801 in an analogous, pure instrumental, setting when the cue lights were omitted. The opposing effects of MK-801 under different parametric conditions likely reflect impairments of appetitive pavlovian memory reconsolidation and extinction, respectively. These results demonstrate a competition between reconsolidation and extinction. However, there are also conditions under which MK-801 fails to impair either process.
|
17
|
Torregrossa MM, Taylor JR. Learning to forget: manipulating extinction and reconsolidation processes to treat addiction. Psychopharmacology (Berl) 2013; 226:659-72. [PMID: 22638814 PMCID: PMC3466391 DOI: 10.1007/s00213-012-2750-9] [Citation(s) in RCA: 118] [Impact Index Per Article: 10.7] [Received: 01/28/2012] [Accepted: 05/13/2012] [Indexed: 11/29/2022]
Abstract
Finding effective long-lasting treatments for drug addiction has been an elusive goal. Consequently, researchers are beginning to investigate novel treatment strategies including manipulations of drug-associated memories. When environmental stimuli (cues) become associated with drug use, they become powerful motivators of continued drug use and relapse after abstinence. Reducing the strength of these cue-drug memories could decrease the number of factors that induce craving and relapse to aid in the treatment of addiction. Enhancing the consolidation of extinction learning and/or disrupting cue-drug memory reconsolidation are two strategies that have been proposed to reduce the strength of cues in motivating drug-seeking and drug-taking behavior. Here, we review the latest basic and clinical research elucidating the mechanisms underlying consolidation of extinction and reconsolidation of cue-drug memories in the hopes of developing pharmacological tools that exploit these signaling systems to treat addiction.
Affiliation(s)
- Jane R. Taylor
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT; Department of Psychology, Yale University, New Haven, CT
|
18
|
d-Cycloserine administered directly to infralimbic medial prefrontal cortex enhances extinction memory in sucrose-seeking animals. Neuroscience 2013; 230:24-30. [DOI: 10.1016/j.neuroscience.2012.11.004] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 09/25/2012] [Revised: 11/01/2012] [Accepted: 11/03/2012] [Indexed: 01/13/2023]
|
19
|
Portero-Tresserra M, Martí-Nicolovius M, Guillazo-Blanch G, Boadas-Vaello P, Vale-Martínez A. D-cycloserine in the basolateral amygdala prevents extinction and enhances reconsolidation of odor-reward associative learning in rats. Neurobiol Learn Mem 2012. [PMID: 23200640 DOI: 10.1016/j.nlm.2012.11.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 01/03/2023]
Abstract
It is well established that D-cycloserine (DCS), a partial agonist of the NMDA receptor glycine site, enhances learning and memory processes. Although the effects of DCS have been especially elucidated in the extinction and reconsolidation of aversive behavioral paradigms or drug-related behaviors, they have not been clearly determined in appetitive tasks using natural reinforcers. The current study examined the effects of pre-retrieval intra-basolateral amygdala (BLA) infusions of DCS on the extinction and reconsolidation of an appetitive odor discrimination task (ODT). Rats were trained to discriminate between three odors, one of which was associated with a palatable food reward, and, 20 min prior to extinction learning (experiment 1) or reactivation (experiment 2), they received bilateral intra-BLA infusions of DCS or vehicle. In experiment 1, DCS infusion reduced the rate of extinction learning, weakened extinction retention in a post-extinction test and enhanced reacquisition of the ODT. In experiment 2, DCS improved subsequent memory expression in the reconsolidation test performed one day after the reactivation session. Such results indicate the involvement of BLA NMDA receptors in odor-food reward associative memory and suggest that DCS may potentiate the persistence or strength of the original memory trace.
Affiliation(s)
- Marta Portero-Tresserra
- Departament de Psicobiologia i Metodologia de les Ciències de la Salut, Institut de Neurociències, Universitat Autònoma de Barcelona, Barcelona, Spain
|
20
|
Bouton ME, Winterbauer NE, Todd TP. Relapse processes after the extinction of instrumental learning: renewal, resurgence, and reacquisition. Behav Processes 2012; 90:130-41. [PMID: 22450305 PMCID: PMC3355659 DOI: 10.1016/j.beproc.2012.03.004] [Citation(s) in RCA: 146] [Impact Index Per Article: 12.2] [Received: 09/15/2011] [Revised: 03/05/2012] [Accepted: 03/07/2012] [Indexed: 12/23/2022]
Abstract
It is widely recognized that extinction (the procedure in which a Pavlovian conditioned stimulus or an instrumental action is repeatedly presented without its reinforcer) weakens behavior without erasing the original learning. Most of the experiments that support this claim have focused on several "relapse" effects that occur after Pavlovian extinction, which collectively suggest that the original learning is saved through extinction. However, although such effects do occur after instrumental extinction, they have not been explored there in as much detail. This article reviews recent research in our laboratory that has investigated three relapse effects that occur after the extinction of instrumental (operant) learning. In renewal, responding returns after extinction when the behavior is tested in a different context; in resurgence, responding recovers when a second response that has been reinforced during extinction of the first is itself put on extinction; and in rapid reacquisition, extinguished responding returns rapidly when the response is reinforced again. The results provide new insights into extinction and relapse, and are consistent with principles that have been developed to explain extinction and relapse as they occur after Pavlovian conditioning. Extinction of instrumental learning, like Pavlovian learning, involves new learning that is relatively dependent on the context for expression.
Affiliation(s)
- Mark E Bouton
- Department of Psychology, University of Vermont, 354 Dewey Hall, 2 Colchester Ave., Burlington, VT 05405-0134, USA.
|