1
Song MR, Lee SW. Rethinking dopamine-guided action sequence learning. Eur J Neurosci 2024; 60:3447-3465. [PMID: 38798086 DOI: 10.1111/ejn.16426] [Received: 08/17/2023] [Revised: 04/21/2024] [Accepted: 05/08/2024] [Indexed: 05/29/2024]
Abstract
As opposed to those requiring a single action for reward acquisition, tasks necessitating action sequences demand that animals learn action elements and their sequential order and sustain the behaviour until the sequence is completed. With repeated learning, animals not only exhibit precise execution of these sequences but also demonstrate enhanced smoothness and efficiency. Previous research has demonstrated that midbrain dopamine and its major projection target, the striatum, play crucial roles in these processes. Recent studies have shown that dopamine from the substantia nigra pars compacta (SNc) and the ventral tegmental area (VTA) serve distinct functions in action sequence learning. The distinct contributions of dopamine also depend on the striatal subregions, namely the ventral, dorsomedial and dorsolateral striatum. Here, we have reviewed recent findings on the role of striatal dopamine in action sequence learning, with a focus on recent rodent studies.
Affiliation(s)
- Minryung R Song
  - Department of Brain and Cognitive Sciences, KAIST, Daejeon, South Korea
- Sang Wan Lee
  - Department of Brain and Cognitive Sciences, KAIST, Daejeon, South Korea
  - Kim Jaechul Graduate School of AI, KAIST, Daejeon, South Korea
  - KI for Health Science and Technology, KAIST, Daejeon, South Korea
  - Center for Neuroscience-inspired AI, KAIST, Daejeon, South Korea
2
Crego ACG, Amaya KA, Palmer JA, Smith KS. A role for the dorsolateral striatum in prospective action control. iScience 2024; 27:110044. [PMID: 38883824 PMCID: PMC11176669 DOI: 10.1016/j.isci.2024.110044] [Received: 03/14/2023] [Revised: 03/20/2024] [Accepted: 05/17/2024] [Indexed: 06/18/2024] [Open access]
Abstract
The dorsolateral striatum (DLS) is important for performing actions persistently, even when it becomes suboptimal, reflecting a function that is reflexive and habitual. However, there are also ways in which persistent behaviors can result from a more prospective, planning mode of behavior. To help tease apart these possibilities for DLS function, we trained animals to perform a lever press for reward and then inhibited the DLS in key test phases: as the task shifted from a 1-press to a 3-press rule (upshift), as the task was maintained, as the task shifted back to the one-press rule (downshift), and when rewards came independent of pressing. During DLS inhibition, animals always favored their initially learned strategy to press just once, particularly so during the free-reward period. DLS inhibition surprisingly changed performance speed bidirectionally depending on the task shifts. DLS inhibition thus encouraged habitual behavior, suggesting it could normally help adapt to changing conditions.
Affiliation(s)
- Adam C G Crego
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Kenneth A Amaya
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Jensen A Palmer
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Kyle S Smith
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
3
Ferguson LA, Matamales M, Nolan C, Balleine BW, Bertran-Gonzalez J. Adaptation of sequential action benefits from timing variability related to lateral basal ganglia circuitry. iScience 2024; 27:109274. [PMID: 38496293 PMCID: PMC10943431 DOI: 10.1016/j.isci.2024.109274] [Received: 07/31/2023] [Revised: 10/11/2023] [Accepted: 02/15/2024] [Indexed: 03/19/2024] [Open access]
Abstract
Streamlined action sequences must remain flexible should stable contingencies in the environment change. By combining analyses of behavioral structure with a circuit-specific manipulation in mice, we report on a relationship between action timing variability and successful adaptation that relates to post-synaptic targets of primary motor cortical (M1) projections to dorsolateral striatum (DLS). In a two-lever instrumental task, mice formed successful action sequences by, first, establishing action scaffolds and, second, smoothly extending action duration to adapt to increased task requirements. Interruption of DLS neurons in M1 projection territories altered this process, evoking higher-rate actions that were more stereotyped in their timing, reducing opportunities for success. Based on evidence from neuronal tracing experiments, we propose that DLS neurons in M1 projection territories supply action timing variability to facilitate adaptation, a function that may involve additional downstream subcortical processing relating to collateralization of descending motor pathways to multiple basal ganglia centers.
Affiliation(s)
- Lachlan A. Ferguson
  - Decision Neuroscience Laboratory, School of Psychology, University of New South Wales, Sydney, NSW, Australia
- Miriam Matamales
  - Decision Neuroscience Laboratory, School of Psychology, University of New South Wales, Sydney, NSW, Australia
- Christopher Nolan
  - Decision Neuroscience Laboratory, School of Psychology, University of New South Wales, Sydney, NSW, Australia
- Bernard W. Balleine
  - Decision Neuroscience Laboratory, School of Psychology, University of New South Wales, Sydney, NSW, Australia
- Jesus Bertran-Gonzalez
  - Decision Neuroscience Laboratory, School of Psychology, University of New South Wales, Sydney, NSW, Australia
4
Sears RM, Andrade EC, Samels SB, Laughlin LC, Moloney DM, Wilson DA, Alwood MR, Moscarello JM, Cain CK. Devaluation of response-produced safety signals reveals circuits for goal-directed versus habitual avoidance in dorsal striatum. bioRxiv 2024:2024.02.07.579321. [PMID: 38370659 PMCID: PMC10871355 DOI: 10.1101/2024.02.07.579321] [Indexed: 02/20/2024]
Abstract
Active avoidance responses (ARs) are instrumental behaviors that prevent harm. Adaptive ARs may contribute to active coping, whereas maladaptive avoidance habits are implicated in anxiety and obsessive-compulsive disorders. The AR learning mechanism has remained elusive, as successful avoidance trials produce no obvious reinforcer. We used a novel outcome-devaluation procedure in rats to show that ARs are positively reinforced by response-produced feedback (FB) cues that develop into safety signals during training. Males were sensitive to FB-devaluation after moderate training, but not overtraining, consistent with a transition from goal-directed to habitual avoidance. Using chemogenetics and FB-devaluation, we also show that goal-directed vs. habitual ARs depend on dorsomedial vs. dorsolateral striatum, suggesting a significant overlap between the mechanisms of avoidance and rewarded instrumental behavior. Females were insensitive to FB-devaluation due to a remarkable context-dependence of counterconditioning. However, degrading the AR-FB contingency suggests that both sexes rely on safety signals to perform goal-directed ARs.
Affiliation(s)
- Robert M Sears
  - Department of Child & Adolescent Psychiatry, NYU Grossman School of Medicine, 1 Park Avenue, 8th Floor, New York, NY 10016
  - Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962
  - These authors contributed equally
- Erika C Andrade
  - Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962
  - These authors contributed equally
- Shanna B Samels
  - Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962
- Lindsay C Laughlin
  - Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962
- Danielle M Moloney
  - Department of Child & Adolescent Psychiatry, NYU Grossman School of Medicine, 1 Park Avenue, 8th Floor, New York, NY 10016
- Donald A Wilson
  - Department of Child & Adolescent Psychiatry, NYU Grossman School of Medicine, 1 Park Avenue, 8th Floor, New York, NY 10016
  - Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962
- Matthew R Alwood
  - Department of Psychological & Brain Sciences, Texas A&M Institute for Neuroscience, Texas A&M University, 301 Old Main, TAMU MS 3474, College Station, TX 77843-3474
- Justin M Moscarello
  - Department of Psychological & Brain Sciences, Texas A&M Institute for Neuroscience, Texas A&M University, 301 Old Main, TAMU MS 3474, College Station, TX 77843-3474
- Christopher K Cain
  - Department of Child & Adolescent Psychiatry, NYU Grossman School of Medicine, 1 Park Avenue, 8th Floor, New York, NY 10016
  - Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962
5
Thrailkill EA, Daniels CW. The temporal structure of goal-directed and habitual operant behavior. J Exp Anal Behav 2024; 121:38-51. [PMID: 38131488 PMCID: PMC10872308 DOI: 10.1002/jeab.896] [Received: 07/15/2023] [Accepted: 12/01/2023] [Indexed: 12/23/2023]
Abstract
Operant behavior can reflect the influence of goal-directed and habitual processes. These can be distinguished by changes to response rate following devaluation of the reinforcing outcome. Whether a response is goal directed or habitual depends on whether devaluation affects response rate. Response rate can be decomposed into frequencies of bouts and pauses by analyzing the distribution of interresponse times. This study sought to characterize goal-directed and habitual behaviors in terms of bout-initiation rate, within-bout response rate, bout length, and bout duration. Data were taken from three published studies that compared sensitivity to devaluation following brief and extended training with variable-interval schedules. Analyses focused on goal-directed and habitual responding, a comparison of a habitual response to a similarly trained response that had been converted back to goal-directed status after a surprising event, and a demonstration of contextual control of habit and goal direction in the same subjects. Across experiments and despite responses being clearly distinguished as goal directed and habitual by total response rate, analyses of bout-initiation rate, within-bout rate, bout length, and bout duration did not reveal a pattern that distinguished goal-directed from habitual responding.
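The decomposition of response rate into bouts and pauses described in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis: it swaps a distributional analysis of interresponse times (IRTs) for a fixed IRT cutoff, and the function name `bout_metrics`, the `cutoff` parameter, and the definition of each rate are assumptions made for illustration only.

```python
from typing import Dict, List


def bout_metrics(response_times: List[float], cutoff: float = 1.0) -> Dict[str, float]:
    """Split a sorted list of response timestamps (seconds) into bouts.

    Assumption: an IRT longer than `cutoff` ends the current bout. This is a
    simple stand-in for fitting the IRT distribution.
    """
    if len(response_times) < 2:
        raise ValueError("need at least two responses")

    bouts = [[response_times[0]]]
    for prev, cur in zip(response_times, response_times[1:]):
        if cur - prev > cutoff:
            bouts.append([cur])    # long pause: a new bout starts here
        else:
            bouts[-1].append(cur)  # short IRT: still inside the same bout

    session = response_times[-1] - response_times[0]
    within_time = sum(b[-1] - b[0] for b in bouts)   # total time spent inside bouts
    within_irts = sum(len(b) - 1 for b in bouts)     # number of within-bout IRTs
    nan = float("nan")
    return {
        "n_bouts": len(bouts),
        # bouts started per second of session (hypothetical definition)
        "bout_initiation_rate": len(bouts) / session if session > 0 else nan,
        # mean number of responses per bout
        "mean_bout_length": len(response_times) / len(bouts),
        # mean time from first to last response of a bout
        "mean_bout_duration": within_time / len(bouts),
        # responses per second while inside a bout
        "within_bout_rate": within_irts / within_time if within_time > 0 else nan,
    }


# Made-up timestamps: two short bursts and one isolated response.
metrics = bout_metrics([0.0, 0.2, 0.4, 5.0, 5.3, 12.0], cutoff=1.0)
```

With these made-up timestamps the cutoff yields three bouts. A change in total response rate could come from any of the four measures, which is why the abstract examines them separately.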
6
Handel SN, Smith RJ. Making and breaking habits: Revisiting the definitions and behavioral factors that influence habits in animals. J Exp Anal Behav 2024; 121:8-26. [PMID: 38010353 PMCID: PMC10842199 DOI: 10.1002/jeab.889] [Received: 06/22/2023] [Accepted: 10/26/2023] [Indexed: 11/29/2023]
Abstract
Habits have garnered significant interest in studies of associative learning and maladaptive behavior. However, habit research has faced scrutiny and challenges related to the definitions and methods. Differences in the conceptualizations of habits between animal and human studies create difficulties for translational research. Here, we review the definitions and commonly used methods for studying habits in animals and humans and discuss potential alternative ways to assess habits, such as automaticity. To better understand habits, we then focus on the behavioral factors that have been shown to make or break habits in animals, as well as potential mechanisms underlying the influence of these factors. We discuss the evidence that habitual and goal-directed systems learn in parallel and that they seem to interact in competitive and cooperative manners. Finally, we draw parallels between habitual responding and compulsive drug seeking in animals to delineate the similarities and differences in these behaviors.
Affiliation(s)
- Sophia N Handel
  - Department of Psychological and Brain Sciences, Texas A&M University, College Station, Texas, USA
- Rachel J Smith
  - Department of Psychological and Brain Sciences, Texas A&M University, College Station, Texas, USA
  - Institute for Neuroscience, Texas A&M University, College Station, Texas, USA
7
Turner KM, Balleine BW. Stimulus control of habits: Evidence for both stimulus specificity and devaluation insensitivity in a dual-response task. J Exp Anal Behav 2024; 121:52-61. [PMID: 38100179 PMCID: PMC10953355 DOI: 10.1002/jeab.898] [Received: 06/14/2023] [Accepted: 12/01/2023] [Indexed: 12/21/2023]
Abstract
Goal-directed and habitual actions are clearly defined by their associative relations. Whereas goal-directed control can be confirmed via tests of outcome devaluation and contingency-degradation sensitivity, a comparable criterion for positively detecting habits has not been established. To confirm habitual responding, a test of control by the stimulus-response association is required while also ruling out goal-directed control. Here we describe an approach to developing such a test in rats using two discriminative stimuli that set the occasion for two different responses that then earn the same outcome. Performance was insensitive to outcome devaluation and showed stimulus-response specificity, indicative of stimulus-controlled behavior. The reliance of stimulus-response associations was further supported by a lack of sensitivity during the single extinction test session used here. These results demonstrate that two concurrently trained responses can come under habitual control when they share a common outcome. By reducing the ability of one stimulus to signal its corresponding response-outcome association, we found evidence for goal-directed control that can be dissociated from habits. Overall, these experiments provide evidence that tests assessing specific stimulus-response associations can be used to investigate habits.
Affiliation(s)
- K. M. Turner
  - School of Psychology, University of New South Wales, Sydney, Australia
- B. W. Balleine
  - School of Psychology, University of New South Wales, Sydney, Australia
8
Fraser KM, Chen BJ, Janak PH. Nucleus accumbens and dorsal medial striatal dopamine and neural activity are essential for action sequence performance. Eur J Neurosci 2024; 59:220-237. [PMID: 38093522 PMCID: PMC10841748 DOI: 10.1111/ejn.16210] [Received: 04/17/2023] [Revised: 11/09/2023] [Accepted: 11/15/2023] [Indexed: 01/23/2024]
Abstract
Separable striatal circuits have unique functions in Pavlovian and instrumental behaviors, but how these roles relate to performance of sequences of actions with and without associated cues is less clear. Here, we tested whether dopamine transmission and neural activity more generally in three striatal subdomains are necessary for performance of an action chain leading to reward delivery. Male and female Long-Evans rats were trained to press a series of three spatially distinct levers to receive reward. We assessed the contribution of neural activity or dopamine transmission within each striatal subdomain when progression through the action sequence was explicitly cued and in the absence of cues. Behavior in both task variations was substantially impacted following microinfusion of the dopamine antagonist, flupenthixol, into nucleus accumbens core (NAc) or dorsomedial striatum (DMS), with impairments in sequence timing and numbers of rewards earned after NAc flupenthixol. In contrast, after pharmacological inactivation to suppress overall activity, there was minimal impact on total rewards earned. Instead, inactivation of both NAc and DMS impaired sequence timing and led to sequence errors in the uncued, but not cued task. There was no impact of dopamine antagonism or reversible inactivation of dorsolateral striatum on either cued or uncued action sequence completion. These results highlight an essential contribution of NAc and DMS dopamine systems in motivational and performance aspects of chains of actions, whether cued or internally generated, as well as the impact of intact NAc and DMS function for correct sequence performance.
Affiliation(s)
- Kurt M. Fraser
  - Department of Psychological & Brain Sciences, Krieger School of Arts & Sciences, Johns Hopkins University, Baltimore, MD 21218
- Bridget J. Chen
  - Department of Psychological & Brain Sciences, Krieger School of Arts & Sciences, Johns Hopkins University, Baltimore, MD 21218
- Patricia H. Janak
  - Department of Psychological & Brain Sciences, Krieger School of Arts & Sciences, Johns Hopkins University, Baltimore, MD 21218
  - Solomon H. Snyder Department of Neuroscience, School of Medicine, Johns Hopkins University, Baltimore, MD 21205
  - Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD 21218
9
Chevée M, Kim CJ, Crow N, Follman EG, Leonard MZ, Calipari ES. Food Restriction Level and Reinforcement Schedule Differentially Influence Behavior during Acquisition and Devaluation Procedures in Mice. eNeuro 2023; 10:ENEURO.0063-23.2023. [PMID: 37696663 PMCID: PMC10537440 DOI: 10.1523/eneuro.0063-23.2023] [Received: 02/23/2023] [Revised: 08/31/2023] [Accepted: 09/01/2023] [Indexed: 09/13/2023] [Open access]
Abstract
Behavioral strategies are often classified based on whether reinforcer value controls reinforcement. Value-sensitive behaviors, in which animals update their actions when reinforcer value is changed, are classified as goal-directed; conversely, value-insensitive actions, where behavior remains consistent when the reinforcer is removed or devalued, are considered habitual. Basic reinforcement schedules can help to bias behavior toward either process: random ratio (RR) schedules are thought to promote the formation of goal-directed behaviors while random intervals (RIs) promote habitual control. However, how the schedule-specific features of these tasks interact with other factors that influence learning to control behavior has not been well characterized. Using male and female mice, we asked how distinct food restriction levels, a strategy often used to increase task engagement, interact with RR and RI schedules to control performance during task acquisition and devaluation procedures. We determined that food restriction level has a stronger effect on the behavior of mice following RR schedules compared with RI schedules, and that it promotes a decrease in response rate during devaluation procedures that is best explained by the effects of extinction rather than devaluation. Surprisingly, food restriction accelerated the decrease in response rates observed following devaluation across sequential extinction sessions, but not within a single session. Our results support the idea that the relationships between schedules and behavioral control strategies are not clear-cut and suggest that an animal's engagement in a task must be accounted for, together with the structure of reinforcement schedules, to appropriately interpret the cognitive underpinnings of behavior.
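The two schedule types named in this abstract can be sketched as follows. This uses the conventional textbook definitions of random ratio (RR) and random interval (RI) schedules, not the study's actual parameters. The function names, the exponential arming time in the RI case, and the example values are all assumptions for illustration.

```python
import random
from typing import List


def random_ratio(n_responses: int, ratio: int, rng: random.Random) -> int:
    """RR schedule: each response is reinforced independently with probability 1/ratio."""
    return sum(rng.random() < 1.0 / ratio for _ in range(n_responses))


def random_interval(response_times: List[float], mean_interval: float,
                    rng: random.Random) -> int:
    """RI schedule: a reinforcer is armed after a random wait (assumed exponential
    with the given mean); the first response after arming collects it and
    restarts the timer. `response_times` must be sorted, in seconds."""
    earned = 0
    armed_at = rng.expovariate(1.0 / mean_interval)
    for t in response_times:
        if t >= armed_at:
            earned += 1
            armed_at = t + rng.expovariate(1.0 / mean_interval)
    return earned


# Hypothetical 10-minute session: one response per second vs. two per second.
rng = random.Random(0)
slow = [float(s) for s in range(600)]
fast = [s / 2.0 for s in range(1200)]
rr_slow, rr_fast = random_ratio(len(slow), 20, rng), random_ratio(len(fast), 20, rng)
ri_slow, ri_fast = random_interval(slow, 30.0, rng), random_interval(fast, 30.0, rng)
```

Under these definitions, expected RR earnings scale with the number of responses. Expected RI earnings are capped by elapsed time, so responding faster gains little. That difference in the response-reward relationship is the usual rationale for linking RR to goal-directed control and RI to habitual control.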
Affiliation(s)
- Maxime Chevée
  - Department of Pharmacology, Vanderbilt University, Nashville, TN 37232
- Courtney J Kim
  - Department of Pharmacology, Vanderbilt University, Nashville, TN 37232
- Nevin Crow
  - Department of Pharmacology, Vanderbilt University, Nashville, TN 37232
- Emma G Follman
  - Department of Pharmacology, Vanderbilt University, Nashville, TN 37232
- Michael Z Leonard
  - Department of Pharmacology, Vanderbilt University, Nashville, TN 37232
- Erin S Calipari
  - Department of Pharmacology, Vanderbilt University, Nashville, TN 37232
  - Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN 37232
  - Vanderbilt Center for Addiction Research, Vanderbilt University, Nashville, TN 37232
  - Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232
  - Department of Psychiatry and Behavioral Sciences, Vanderbilt University, Nashville, TN 37232
10
María-Ríos CE, Fitzpatrick CJ, Czesak FN, Morrow JD. Effects of predictive and incentive value manipulation on sign- and goal-tracking behavior. Neurobiol Learn Mem 2023; 203:107796. [PMID: 37385521 PMCID: PMC10599606 DOI: 10.1016/j.nlm.2023.107796] [Received: 01/04/2023] [Revised: 05/01/2023] [Accepted: 06/26/2023] [Indexed: 07/01/2023]
Abstract
When a neutral stimulus is repeatedly paired with an appetitive reward, two different types of conditioned approach responses may develop: a sign-tracking response directed toward the neutral cue, or a goal-tracking response directed toward the location of impending reward delivery. Sign-tracking responses have been postulated to result from attribution of incentive value to conditioned cues, while goal-tracking reflects the assignment of only predictive value to the cue. We therefore hypothesized that sign-tracking rats would be more sensitive to manipulations of incentive value, while goal-tracking rats would be more responsive to changes in the predictive value of the cue. We tested sign- and goal-tracking before and after devaluation of a food reward using lithium chloride, and tested whether either response could be learned under negative contingency conditions that precluded any serendipitous reinforcement of the behavior that might support instrumental learning. We also tested the effects of blocking the predictive value of a cue using simultaneous presentation of a pre-conditioned cue. We found that sign-tracking was sensitive to outcome devaluation, while goal-tracking was not. We also confirmed that both responses are Pavlovian because they can be learned under negative contingency conditions. Goal-tracking was almost completely blocked by a pre-conditioned cue, while sign-tracking was much less sensitive to such interference. These results indicate that sign- and goal-tracking may follow different rules of reinforcement learning and suggest a need to revise current models of associative learning to account for these differences.
Affiliation(s)
- Cristina E María-Ríos
  - Neuroscience Graduate Program, University of Michigan, 204 Washtenaw Ave., Ann Arbor, MI 48109, USA
- Francesca N Czesak
  - Neuroscience Graduate Program, University of Michigan, 204 Washtenaw Ave., Ann Arbor, MI 48109, USA
- Jonathan D Morrow
  - Neuroscience Graduate Program, University of Michigan, 204 Washtenaw Ave., Ann Arbor, MI 48109, USA
  - Department of Psychiatry, University of Michigan, 1500 E. Medical Center Dr., Ann Arbor, MI 48109, USA
11
Moore S, Wang Z, Zhu Z, Sun R, Lee A, Charles A, Kuchibhotla KV. Revealing abrupt transitions from goal-directed to habitual behavior. bioRxiv 2023:2023.07.05.547783. [PMID: 37461576 PMCID: PMC10349993 DOI: 10.1101/2023.07.05.547783] [Indexed: 07/23/2023]
Abstract
A fundamental tenet of animal behavior is that decision-making involves multiple 'controllers.' Initially, behavior is goal-directed, driven by desired outcomes, shifting later to habitual control, where cues trigger actions independent of motivational state. Clark Hull's question from 1943 still resonates today: "Is this transition abrupt, or is it gradual and progressive?" Despite a century-long belief in gradual transitions, this question remains unanswered as current methods cannot disambiguate goal-directed versus habitual control in real-time. Here, we introduce a novel 'volitional engagement' approach, motivating animals by palatability rather than biological need. Offering less palatable water in the home cage reduced motivation to 'work' for plain water in an auditory discrimination task when compared to water-restricted animals. Using quantitative behavior and computational modeling, we found that palatability-driven animals learned to discriminate as quickly as water-restricted animals but exhibited state-like fluctuations when responding to the reward-predicting cue, reflecting goal-directed behavior. These fluctuations spontaneously and abruptly ceased after thousands of trials, with animals now always responding to the reward-predicting cue. In line with habitual control, post-transition behavior displayed motor automaticity, decreased error sensitivity (assessed via pupillary responses), and insensitivity to outcome devaluation. Bilateral lesions of the habit-related dorsolateral striatum blocked transitions to habitual behavior. Thus, 'volitional engagement' reveals spontaneous and abrupt transitions from goal-directed to habitual behavior, suggesting the involvement of a higher-level process that arbitrates between the two.
Affiliation(s)
- Sharlen Moore
  - Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD, USA
- Zyan Wang
  - Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD, USA
- Ziyi Zhu
  - Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD, USA
  - The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
  - Johns Hopkins Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Ruolan Sun
  - Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Angel Lee
  - Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD, USA
- Adam Charles
  - Johns Hopkins Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Kishore V. Kuchibhotla
  - Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD, USA
  - The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
  - Johns Hopkins Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, USA
12
Yamada K, Toda K. Habit formation viewed as structural change in the behavioral network. Commun Biol 2023; 6:303. [PMID: 37016036 PMCID: PMC10073220 DOI: 10.1038/s42003-023-04500-2] [Received: 01/06/2022] [Accepted: 01/18/2023] [Indexed: 04/06/2023] [Open access]
Abstract
Habit formation is a process in which an action becomes involuntary. While goal-directed behavior is driven by its consequences, habits are elicited by a situation rather than its consequences. Existing theories have proposed that actions are controlled by corresponding two distinct systems. Although canonical theories based on such distinctions are starting to be challenged, there are a few theoretical frameworks that implement goal-directed behavior and habits within a single system. Here, we propose a novel theoretical framework by hypothesizing that behavior is a network composed of several responses. With this framework, we have shown that the transition of goal-directed actions to habits is caused by a change in a single network structure. Furthermore, we confirmed that the proposed network model behaves in a manner consistent with the existing experimental results reported in animal behavioral studies. Our results revealed that habit could be formed under the control of a single system rather than two distinct systems. By capturing the behavior as a single network change, this framework provides a new perspective on studying the structure of the behavior for experimental and theoretical research.
Affiliation(s)
- Kota Yamada
  - Department of Psychology, Keio University, Tokyo, Japan
  - Japan Society for Promotion of Science, Tokyo, Japan
- Koji Toda
  - Department of Psychology, Keio University, Tokyo, Japan
13
Chevée M, Kim CJ, Crow N, Follman EG, Calipari ES. Sensitivity to outcome devaluation in operant tasks is better predicted by food restriction level than reinforcement training schedule in mice. bioRxiv 2023:2023.02.23.529699. [PMID: 36865193 PMCID: PMC9980049 DOI: 10.1101/2023.02.23.529699] [Indexed: 02/25/2023]
Abstract
Behavioral strategies are often classified based on whether reinforcement is controlled by the value of the reinforcer. Value-sensitive behaviors, in which animals update their actions when reinforcer value is changed, are classified as goal-directed; conversely, value-insensitive actions, where behavior remains consistent when the reinforcer is removed or devalued, are considered habitual. Understanding the features of operant training that bias behavioral control toward either strategy is essential to understanding the cognitive and neuronal processes on which they rely. Using basic reinforcement principles, behavior can be biased toward relying on either process: random ratio (RR) schedules are thought to promote the formation of goal-directed behaviors while random intervals (RI) promote habitual control. However, how the schedule-specific features of these task structures relate to external factors to influence behavior is not well understood. Using male and female mice on distinct food restriction levels, we trained each group on RR schedules with responses-per-reinforcer rates matched to their RI counterparts to control for differences in reinforcement rate. We determined that food restriction level has a stronger effect on the behavior of mice following RR schedules than mice following RI schedules and that food restriction better predicted sensitivity to outcome devaluation than training schedule. Our results support the idea that the relationships between RR or RI schedules and goal-directed or habitual behaviors, respectively, are more nuanced than previously appreciated and suggest that an animal's engagement in a task must be accounted for, together with the structure of reinforcement schedules, to appropriately interpret the cognitive underpinnings of behavior.
Affiliation(s)
- Maxime Chevée
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
- Courtney J. Kim
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
- Nevin Crow
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
- Emma G. Follman
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
- Erin S. Calipari
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
  - Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
  - Vanderbilt Center for Addiction Research, Vanderbilt University, Nashville, TN, USA
  - Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA
  - Department of Psychiatry and Behavioral Sciences, Vanderbilt University, Nashville, TN, USA
|
14
|
Frölich S, Esmeyer M, Endrass T, Smolka MN, Kiebel SJ. Interaction between habits as action sequences and goal-directed behavior under time pressure. Front Neurosci 2023; 16:996957. [PMID: 36711151 PMCID: PMC9880255 DOI: 10.3389/fnins.2022.996957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 12/14/2022] [Indexed: 01/15/2023] Open
Abstract
Human behavior consists in large part of action sequences that are often repeated in mostly the same way. Through extensive repetition, sequential responses become automatic or habitual, but our environment often confronts us with events to which we have to react flexibly and in a goal-directed manner. To assess how implicitly learned action sequences interfere with goal-directed control, we developed a novel behavioral paradigm in which we combined action sequence learning through repetition with a goal-directed task component. So-called dual-target trials require the goal-directed selection of the response with the highest reward probability in a fast succession of trials with short response deadlines. Importantly, the response primed by the learned action sequence is sometimes different from that required by the goal-directed task. As expected, we found that participants learned the action sequence through repetition, as evidenced by reduced reaction times (RT) and error rates (ER), while still acting in a goal-directed manner in dual-target trials. Specifically, we found that the learned action sequence biased choices in the goal-directed task toward the sequential response, and this effect was more pronounced the better individuals had learned the sequence. Our novel task may help shed light on the acquisition of automatic behavioral patterns and habits through extensive repetition, and it allows the assessment of positive features of habitual behavior (e.g., increased response speed and reduced error rates) and, importantly, of the interaction of habitual and goal-directed behaviors under time pressure.
Affiliation(s)
- Sascha Frölich
  - Department of Psychology, Technische Universität Dresden, Dresden, Germany
- Marlon Esmeyer
  - Department of Psychology, Technische Universität Dresden, Dresden, Germany
- Tanja Endrass
  - Department of Psychology, Technische Universität Dresden, Dresden, Germany
- Michael N. Smolka
  - Department of Psychiatry, Technische Universität Dresden, Dresden, Germany
- Stefan J. Kiebel
  - Department of Psychology, Technische Universität Dresden, Dresden, Germany
  - Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, Dresden, Germany
|
15
|
Crego ACG, Amaya KA, Palmer JA, Smith KS. Task history dictates how the dorsolateral striatum controls action strategy and vigor. bioRxiv 2023:2023.01.11.523640. [PMID: 36711550 PMCID: PMC9882068 DOI: 10.1101/2023.01.11.523640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
Abstract
The dorsolateral striatum (DLS) is linked to the learning and honing of action routines. However, the DLS is also important for performing behaviors that have been successful in the past. The learning function can be thought of as prospective, helping to plan ongoing actions to be efficient and often optimal. The performance function is more retrospective, helping the animal continue to behave in a way that had worked previously. How the DLS manages this all is curious. What happens when a learned behavior becomes sub-optimal due to environment changes? In this case, the prospective function of the DLS would cause animals to (adaptively) learn and plan more optimal actions. In contrast, the retrospective function would cause animals to (maladaptively) favor the old behavior. Here we find that, during a change in learned task rules, DLS inhibition causes animals to adjust less rapidly to the new task (and to behave less vigorously) in a 'maladaptive' way. Yet, when the task is changed back to the initially learned rules, DLS inhibition instead causes a rapid and vigorous adjustment of behavior in an 'adaptive' way. These results show that inhibiting the DLS biases behavior towards initially acquired strategies, implying a more retrospective outlook in action selection when the DLS is offline. Thus, an active DLS could encourage planning and learning action routines more prospectively. Moreover, the DLS control over behavior can appear to be either advantageous/flexible or disadvantageous/inflexible depending on task context, and its control over vigor can change depending on task context.
Significance Statement: Basal ganglia networks aid behavioral learning (a prospective planning function) but also favor the use of old behaviors (a retrospective performance function), making it unclear what happens when learned behaviors become suboptimal. Here we inhibit the dorsolateral striatum (DLS) as animals encounter a change in task rules, and again when they shift back to those learned task rules. DLS inhibition reduces adjustment to new task rules (and reduces behavioral vigor), but it increases adjustment back to the initially learned task rules later (and increases vigor). Thus, in both cases, DLS inhibition favored the use of the initially learned behavioral strategy, which could appear either maladaptive or adaptive. We suggest that the DLS might promote a prospective orientation of action control.
|
16
|
Lack of action monitoring as a prerequisite for habitual and chunked behavior: Behavioral and neural correlates. iScience 2022; 26:105818. [PMID: 36636348 PMCID: PMC9830217 DOI: 10.1016/j.isci.2022.105818] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/14/2022] [Revised: 11/01/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022] Open
Abstract
We previously reported the rapid development of habitual behavior in a discrete-trials instrumental task in which lever insertion and retraction act as reward-predictive cues delineating sequence execution. Here we asked whether lever cues or performance variables reflective of skill and automaticity might account for habitual behavior in male rats. Behavior in the discrete-trials habit-promoting task was compared with two task variants lacking the sequence-delineating cues of lever extension and retraction. We find that behavior is under goal-directed control in the absence of sequence-delineating cues but not in their presence, and that skilled performance does not predict goal-directed vs. habitual behavior. Neural activity recordings revealed an engagement of dorsolateral striatum and a disengagement of dorsomedial striatum during the sequence execution of the habit-promoting task, specifically. Together, these results indicate that sequence-delineating cues promote habit and differential engagement of striatal subregions during instrumental responding, a pattern that may reflect cue-elicited behavioral chunking.
|
17
|
Animal models of action control and cognitive dysfunction in Parkinson's disease. Prog Brain Res 2022; 269:227-255. [PMID: 35248196 DOI: 10.1016/bs.pbr.2022.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Parkinson's disease (PD) has historically been considered a motor disorder induced by a loss of dopaminergic neurons in the substantia nigra pars compacta. More recently, it has been recognized to have significant non-motor symptoms, most prominently cognitive symptoms associated with a dysexecutive syndrome. It is common in the literature to see motor and cognitive symptoms treated separately and, indeed, there has been a general call for specialized treatment of the latter, particularly in the more severe cases of PD with mild cognitive impairment and dementia. Animal studies have similarly been developed to model the motor or non-motor symptoms. Nevertheless, considerable research has established that segregating consideration of cognition from the precursors to motor movement, particularly movement associated with goal-directed action, is difficult if not impossible. Indeed, on some contemporary views cognition is embodied in action control, something that is particularly prevalent in theory and evidence relating to the integration of goal-directed and habitual control processes. The current paper addresses these issues within the literature detailing animal models of cognitive dysfunction in PD and their neural and neurochemical bases. Generally, studies using animal models of PD provide some of the clearest evidence for the integration of these action control processes at multiple levels of analysis and imply that consideration of this integrative process may have significant benefits for developing new approaches to the treatment of PD.
|
18
|
van Elzelingen W, Warnaar P, Matos J, Bastet W, Jonkman R, Smulders D, Goedhoop J, Denys D, Arbab T, Willuhn I. Striatal dopamine signals are region specific and temporally stable across action-sequence habit formation. Curr Biol 2022; 32:1163-1174.e6. [PMID: 35134325 PMCID: PMC8926842 DOI: 10.1016/j.cub.2021.12.027] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 07/31/2021] [Revised: 11/03/2021] [Accepted: 12/09/2021] [Indexed: 12/24/2022]
Abstract
Habits are automatic, inflexible behaviors that develop slowly with repeated performance. Striatal dopamine signaling instantiates this habit-formation process, presumably region specifically and via ventral-to-dorsal and medial-to-lateral signal shifts. Here, we quantify dopamine release in regions implicated in these presumed shifts (ventromedial striatum [VMS], dorsomedial striatum [DMS], and dorsolateral striatum [DLS]) in rats performing an action-sequence task and characterize habit development throughout a 10-week training. Surprisingly, all regions exhibited stable dopamine dynamics throughout habit development. VMS and DLS signals did not differ between habitual and non-habitual animals, but DMS dopamine release increased during action-sequence initiation and decreased during action-sequence completion in habitual rats, whereas non-habitual rats showed opposite effects. Consistently, optogenetic stimulation of DMS dopamine release accelerated habit formation. Thus, we demonstrate that dopamine signals do not shift regionally during habit formation and that dopamine in DMS, but not VMS or DLS, determines habit bias, attributing “habit functions” to a region previously associated exclusively with non-habitual behavior.
- Validation of a novel test that monitors habit development individually across time
- Dopamine release during habit development is stable across relevant striatal regions
- Only dopamine release in dorsomedial striatum correlates with habit development
- Optogenetic stimulation of dorsomedial striatal dopamine accelerates habit formation
|
19
|
Salimi-Badr A, Ebadzadeh MM. A Novel Self-Organizing Fuzzy Neural Network to Learn and Mimic Habitual Sequential Tasks. IEEE Trans Cybern 2022; 52:323-332. [PMID: 32356769 DOI: 10.1109/tcyb.2020.2984646] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
In this article, a new self-organizing fuzzy neural network (FNN) model is presented which is able to simultaneously and accurately learn and reproduce different sequences. Multiple sequence learning is important in performing habitual and skillful tasks, such as writing, signing signatures, and playing piano. Generally, it is indispensable for pattern generation applications. Since multiple sequences have similar parts, local information such as some previous samples is not sufficient to efficiently reproduce them. Instead, it is necessary to consider global and discriminative information, maybe in the very initial samples of each sequence, to first recognize them, and then predict their next sample based on the current local information. Therefore, the structure of the proposed network consists of two parts: 1) sequence identifier, which computes a novel sequence identity value based on initial samples of a sequence, and detects the sequence identity based on proper fuzzy rules and 2) sequence locator, which locates the input sample in the sequence. Therefore, by integrating outputs of these two parts in fuzzy rules, the network is able to produce the proper output based on the current state of each sequence. To learn the proposed structure, a gradual learning procedure is proposed. First, learning is performed by adding new fuzzy rules, based on coverage measure, using available correct data. Next, the initialized parameters are fine-tuned, by the gradient descent algorithm, based on fed back approximated network output as the next input. The proposed method has a dynamic structure able to learn new sequences online. Finally, to investigate the effectiveness of the presented approach, it is used to simultaneously learn and reproduce multiple sequences in different applications, including sequences with similar parts, different patterns, and writing different letters. 
The performance of the proposed method is evaluated and compared with other existing methods, including the adaptive network-based fuzzy inference system, GDFNN, CFNN, and long short-term memory (LSTM). According to these experiments, the proposed method outperforms traditional FNNs and LSTM in learning multiple sequences.
|
20
|
Garr E, Padovan-Hernandez Y, Janak PH, Delamater AR. Maintained goal-directed control with overtraining on ratio schedules. Learn Mem 2021; 28:435-439. [PMID: 34782401 PMCID: PMC8600976 DOI: 10.1101/lm.053472.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/10/2021] [Accepted: 09/16/2021] [Indexed: 11/25/2022]
Abstract
It is thought that goal-directed control of actions weakens or becomes masked by habits over time. We tested the opposing hypothesis that goal-directed control becomes stronger over time, and that this growth is modulated by the overall action-outcome contiguity. Despite group differences in action-outcome contiguity early in training, rats trained under random and fixed ratio schedules showed equivalent goal-directed control of lever pressing that appeared to grow over time. We confirmed that goal-directed control was maintained after extended training under another type of ratio schedule, continuous reinforcement, using specific satiety and taste aversion devaluation methods. These results add to the growing literature showing that extensive training does not reliably weaken goal-directed control and that it may strengthen it, or at least maintain it.
Affiliation(s)
- Eric Garr
  - Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Yasmin Padovan-Hernandez
  - Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland 21205, USA
- Patricia H Janak
  - Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, Maryland 21218, USA
  - Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland 21205, USA
- Andrew R Delamater
  - Department of Psychology, Brooklyn College, City University of New York, New York 11210, USA
  - Department of Psychology, Graduate Center, City University of New York, New York 10016, USA
|
21
|
Serotonin 2C Antagonism in the Lateral Orbitofrontal Cortex Ameliorates Cue-Enhanced Risk Preference and Restores Sensitivity to Reinforcer Devaluation in Male Rats. eNeuro 2021; 8:ENEURO.0341-21.2021. [PMID: 34815296 PMCID: PMC8670605 DOI: 10.1523/eneuro.0341-21.2021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/19/2021] [Revised: 10/27/2021] [Accepted: 11/16/2021] [Indexed: 11/21/2022] Open
Abstract
Previous research has indicated that reward-paired cues can enhance disadvantageous risky choice in both humans and rodents. Systemic administration of a serotonin 2C receptor antagonist can attenuate this cue-induced risk preference in rats. However, the neurocognitive mechanisms mediating this effect are currently unknown. We therefore assessed whether the serotonin 2C receptor antagonist RS 102221 is able to attenuate cue-enhanced risk preference via its actions in the lateral orbitofrontal cortex (lOFC) or prelimbic (PrL) area of the medial prefrontal cortex (mPFC). A total of 32 male Long–Evans rats were trained on the cued version of the rat gambling task (rGT), a rodent analog of the human Iowa gambling task, and bilateral guide cannulae were implanted into the lOFC or PrL. Intra-lOFC infusions of the 5-HT2C antagonist RS 102221 reduced risky choice in animals that showed a preference for the risky options of the rGT at baseline. This effect was not observed in optimal decision-makers, nor those that received infusions targeting the PrL. Given prior data showing that 5-HT2C antagonists also improve reversal learning through the same neural locus, we hypothesized that reward-concurrent cues may amplify risky decision-making through cognitive inflexibility. We therefore devalued the sugar pellet rewards used in the cued rGT (crGT) through satiation and observed that decision-making patterns did not shift unless animals also received intra-lOFC RS 102221. Collectively, these data suggest that the lOFC is one critical site through which reward-concurrent cues promote risky choice patterns that are insensitive to reinforcer devaluation, and that 5-HT2C antagonism may optimize choice by facilitating exploration.
|
22
|
Abstract
We present a new mathematical formulation of associative learning focused on non-human animals, which we call A-learning. Building on current animal learning theory and machine learning, A-learning is composed of two learning equations, one for stimulus-response values and one for stimulus values (conditioned reinforcement). A third equation implements decision-making by mapping stimulus-response values to response probabilities. We show that A-learning can reproduce the main features of: instrumental acquisition, including the effects of signaled and unsignaled non-contingent reinforcement; Pavlovian acquisition, including higher-order conditioning, omission training, autoshaping, and differences in form between conditioned and unconditioned responses; acquisition of avoidance responses; acquisition and extinction of instrumental chains and Pavlovian higher-order conditioning; Pavlovian-to-instrumental transfer; Pavlovian and instrumental outcome revaluation effects, including insight into why these effects vary greatly with training procedures and with the proximity of a response to the reinforcer. We discuss the differences between current theory and A-learning, such as its lack of stimulus-stimulus and response-stimulus associations, and compare A-learning with other temporal-difference models from machine learning, such as Q-learning, SARSA, and the actor-critic model. We conclude that A-learning may offer a more convenient view of associative learning than current mathematical models, and point out areas that need further development.
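The three-equation structure described in this abstract (one update for stimulus-response values, one for stimulus values, and a decision rule) can be sketched in a few lines. The abstract does not state the equations, so everything below is an assumption made for illustration: the temporal-difference form of both updates, the softmax decision rule, and the names `a_learning_step`, `response_probabilities`, `alpha_v`, `alpha_w`, `beta` and `u_next` are hypothetical and should not be read as the authors' exact formulation.

```python
import math


def response_probabilities(v, stimulus, responses, beta=1.0):
    """Map stimulus-response values to response probabilities.

    Assumption: a softmax rule with inverse temperature beta.
    """
    weights = [math.exp(beta * v.get((stimulus, r), 0.0)) for r in responses]
    total = sum(weights)
    return [x / total for x in weights]


def a_learning_step(v, w, s, b, s_next, u_next, alpha_v=0.1, alpha_w=0.1):
    """Update values after response b to stimulus s leads to stimulus s_next.

    v: dict mapping (stimulus, response) to a stimulus-response value
    w: dict mapping stimulus to a stimulus value (conditioned reinforcement)
    u_next: primary value of s_next, e.g. 1.0 for food (hypothetical scale)
    """
    # Assumed shared target: primary value plus learned value of the next stimulus.
    target = u_next + w.get(s_next, 0.0)
    old_v = v.get((s, b), 0.0)
    old_w = w.get(s, 0.0)
    v[(s, b)] = old_v + alpha_v * (target - old_v)
    w[s] = old_w + alpha_w * (target - old_w)
    return v, w


if __name__ == "__main__":
    v, w = {}, {}
    # Hypothetical training: pressing a lever is followed by food.
    for _ in range(200):
        a_learning_step(v, w, "lever", "press", "food", 1.0)
    print(response_probabilities(v, "lever", ["press", "groom"], beta=2.0))
```

Because the stimulus value `w` feeds into the target of earlier steps, a stimulus that reliably precedes reward can itself reinforce the response that produces it, which is the role the abstract assigns to conditioned reinforcement in instrumental chains.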
|
23
|
Balleine BW. The Meaning of Behavior: Discriminating Reflex and Volition in the Brain. Neuron 2020; 104:47-62. [PMID: 31600515 DOI: 10.1016/j.neuron.2019.09.024] [Citation(s) in RCA: 79] [Impact Index Per Article: 19.8] [Received: 07/03/2019] [Revised: 08/20/2019] [Accepted: 09/16/2019] [Indexed: 12/11/2022]
Abstract
The ability to establish behaviorally what psychological capacity an animal is deploying (to discern accurately what an animal is doing) is key to functional analyses of the brain. Our current understanding of these capacities suggests, however, that this task is complex; there is evidence that multiple capacities are engaged simultaneously and contribute independently to the control of behavior. As such, establishing the contribution of a cell, circuit, or neural system to any one function requires careful dissection of that role from its influence on other functions and, therefore, the careful selection and design of behavioral tasks fit for that purpose. Here I describe recent research that has sought to utilize behavioral tools to investigate the neural bases of instrumental conditioning, particularly the circuits and systems supporting the capacity for goal-directed action, as opposed to conditioned reflexes and habits, and how these sources of action control interact to generate adaptive behavior.
|
24
|
Garr E, Delamater AR. Chemogenetic inhibition in the dorsal striatum reveals regional specificity of direct and indirect pathway control of action sequencing. Neurobiol Learn Mem 2020; 169:107169. [PMID: 31972244 DOI: 10.1016/j.nlm.2020.107169] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/03/2019] [Revised: 01/07/2020] [Accepted: 01/18/2020] [Indexed: 11/17/2022]
Abstract
Animals engage in intricate action sequences that are constructed during instrumental learning. There is broad consensus that the basal ganglia play a crucial role in the formation and fluid performance of action sequences. To investigate the role of the basal ganglia direct and indirect pathways in action sequencing, we virally expressed Cre-dependent Gi-DREADDs in either the dorsomedial (DMS) or dorsolateral (DLS) striatum during and/or after action sequence learning in D1 and D2 Cre rats. Action sequence performance in D1 Cre rats was slowed down early in training when DREADDs were activated in the DMS, but sped up when activated in the DLS. Acquisition of the reinforced sequence was hindered when DREADDs were activated in the DLS of D2 Cre rats. Outcome devaluation tests conducted after training revealed that the goal-directed control of action sequence rates was immune to chemogenetic inhibition: rats suppressed the rate of sequence performance when rewards were devalued. Sequence initiation latencies were generally sensitive to outcome devaluation, except in the case where DREADD activation was removed in D2 Cre rats that previously experienced DREADD activation in the DMS during training. Sequence completion latencies were generally not sensitive to outcome devaluation, except in the case where D1 Cre rats experienced DREADD activation in the DMS during training and test. Collectively, these results suggest that the indirect pathway originating from the DLS is part of a circuit involved in the effective reinforcement of action sequences, while the direct and indirect pathways originating from the DMS contribute to the goal-directed control of sequence completion and initiation, respectively.
Affiliation(s)
- Eric Garr
  - Graduate Center, City University of New York, United States; Brooklyn College, City University of New York, United States
- Andrew R Delamater
  - Graduate Center, City University of New York, United States; Brooklyn College, City University of New York, United States
|
25
|
Banca P, McNamee D, Piercy T, Luo Q, Robbins TW. A Mobile Phone App for the Generation and Characterization of Motor Habits. Front Psychol 2020; 10:2850. [PMID: 31969845 PMCID: PMC6960169 DOI: 10.3389/fpsyg.2019.02850] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/02/2019] [Accepted: 12/02/2019] [Indexed: 11/13/2022] Open
Abstract
Habits are a powerful route to efficiency; the ability to constantly shift between goal-directed and habitual strategies, as well as integrate them into behavioral output, is key to optimal performance in everyday life. When such ability is impaired, it may lead to loss of control and to compulsive behavior. Habits have successfully been induced and investigated in rats using methods such as overtraining stimulus-response associations and outcome devaluation, respectively. However, such methods have ineffectively measured habits in humans because (1) human habits usually involve more complex sequences of actions than in rats and (2) of the pragmatic impediments posed by the extensive time (weeks or even months) it may take for routine habits to develop. We present here a novel behavioral paradigm, a mobile-phone app methodology, for inducing and measuring habits in humans during their everyday schedule and environment. It assumes that practice is key to achieving automaticity and proficiency and that the use of a hierarchical sequence of actions is the best strategy for capturing the cognitive mechanisms involved in habit formation (including "chunking") and consolidation. The task is a gamified self-instructed and self-paced app on a mobile phone that enables subjects to learn and practice two sequences of finger movements, composed of chords and single presses. It involves a step-wise learning procedure in which subjects begin responding to a visual and auditory cued sequence by generating responses on the screen using four fingers. Such cues progressively disappear throughout 1 month of training, enabling the subject ultimately to master the motor skill involved. We present preliminary data for the acquisition of motor sequence learning in 29 healthy individuals, each trained over a month period. We demonstrate an asymptotic improvement in performance, as well as its automatic nature. 
We also report how people integrate the task into their daily routine, the development of motor precision throughout training, and the effect of intermittent reinforcement and reward extinction in habit preservation. The findings help to validate this "real world" app for measuring human habits.
Affiliation(s)
- Paula Banca
  - Department of Psychology, Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, United Kingdom
- Daniel McNamee
  - Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
  - Max Planck UCL Centre for Computational Psychiatry, University College London, London, United Kingdom
- Thomas Piercy
  - Department of Psychiatry, Addenbrooke’s Hospital, University of Cambridge, Cambridge, United Kingdom
- Qiang Luo
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
  - Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Fudan University, Shanghai, China
  - State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institute of Brain Science and Human Phenome Institute, Fudan University, Shanghai, China
- Trevor W. Robbins
  - Department of Psychology, Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, United Kingdom
|
26
|
Schreiner DC, Renteria R, Gremel CM. Fractionating the all-or-nothing definition of goal-directed and habitual decision-making. J Neurosci Res 2019; 98:998-1006. [PMID: 31642551 DOI: 10.1002/jnr.24545] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 08/27/2019] [Revised: 09/30/2019] [Accepted: 10/09/2019] [Indexed: 12/16/2022]
Abstract
Goal-directed and habitual decision-making are fundamental processes that support ongoing adaptive behavior. There is a growing interest in examining their disruption in psychiatric disease, often with a focus on a disease shifting control from one process to the other, usually a shift from goal-directed to habitual control. However, several different experimental procedures can be used to probe whether decision-making is under goal-directed or habitual control, including outcome devaluation and contingency degradation. These different experimental procedures may recruit diverse behavioral and neural processes. Thus, there are potentially many opportunities for these disease phenotypes to manifest as alterations to both goal-directed and habitual control. In this review, we highlight examples of behavioral and neural circuit divergence and similarity, and suggest that interpretation based on behavioral processes recruited during testing may leave more room for goal-directed and habitual decision-making to coexist. Furthermore, this may improve our understanding of precisely which neural mechanisms underlie aspects of goal-directed and habitual behavior, as well as how disease affects behavior and these circuits.
Affiliation(s)
- Drew C Schreiner
  - Department of Psychology, University of California, San Diego, La Jolla, CA, USA
- Rafael Renteria
  - Department of Psychology, University of California, San Diego, La Jolla, CA, USA
- Christina M Gremel
  - Department of Psychology, University of California, San Diego, La Jolla, CA, USA
  - Neurosciences Graduate Program, University of California, San Diego, La Jolla, CA, USA
|
27
|
Garr E, Bushra B, Tu N, Delamater AR. Goal-directed control on interval schedules does not depend on the action-outcome correlation. J Exp Psychol Anim Learn Cogn 2019; 46:47-64. [PMID: 31621353 DOI: 10.1037/xan0000229] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 02/06/2023]
Abstract
When an organism's action is based on an anticipation of its consequences, that action is said to be goal-directed. It has long been thought that goal-directed control is made possible by experiencing a strong correlation between response rates and reward rates (Dickinson, 1985). To test this idea, we designed a set of experiments to determine whether the response rate-reward rate correlation is a reliable predictor of goal-directed control on interval schedules. In Experiment 1, rats were trained on random interval (RI) schedules in which the response rate-reward rate correlation was manipulated across groups. In tests of reward devaluation, rats behaved in a goal-directed manner regardless of the experienced correlation. In Experiment 2, rats once again experienced either a strong or weak correlation, but on RI schedules with lower overall reward densities. This time, behavior appeared habitual regardless of the experienced correlation. Experiment 3 confirmed that the density of the RI schedule influences goal-directed control, and also revealed that extensive training on these schedules resulted in goal-directed action. Finally, in Experiment 4 goal-directed responding was greater and emerged sooner on fixed than random interval schedules, but, again, was manifest after extensive training on the RI schedule. Taken together, our data suggest that goal-directed and habitual control are not determined by the correlation between response rates and reward rates. We discuss the importance of temporal uncertainty, action-outcome contiguity, and reinforcement probability in goal-directed control. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
Affiliation(s)
- Eric Garr
  - Department of Psychology, Graduate Center
|
28
|
Garr E. Contributions of the basal ganglia to action sequence learning and performance. Neurosci Biobehav Rev 2019; 107:279-295. [PMID: 31541637 DOI: 10.1016/j.neubiorev.2019.09.017] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/02/2019] [Revised: 07/22/2019] [Accepted: 09/11/2019] [Indexed: 12/12/2022]
Abstract
Animals engage in intricately woven and choreographed action sequences that are constructed from trial-and-error learning. The mechanisms by which the brain links together individual actions which are later recalled as fluid chains of behavior are not fully understood, but there is broad consensus that the basal ganglia play a crucial role in this process. This paper presents a comprehensive review of the role of the basal ganglia in action sequencing, with a focus on whether the computational framework of reinforcement learning can capture key behavioral features of sequencing and the neural mechanisms that underlie them. While a simple neurocomputational model of reinforcement learning can capture key features of action sequence learning, this model is not sufficient to capture goal-directed control of sequences or their hierarchical representation. The hierarchical structure of action sequences, in particular, poses a challenge for building better models of action sequencing, and it is in this regard that further investigations into basal ganglia information processing may be informative.
Affiliation(s)
- Eric Garr
  - Graduate Center, City University of New York, 365 5th Avenue, New York, NY 10016, United States
|