1
Ghane M, Trambaiolli L, Bertocci MA, Martinez-Rivera FJ, Chase HW, Brady T, Skeba A, Graur S, Bonar L, Iyengar S, Quirk GJ, Rasmussen SA, Haber SN, Phillips ML. Specific Patterns of Endogenous Functional Connectivity Are Associated With Harm Avoidance in Obsessive-Compulsive Disorder. Biol Psychiatry 2024; 96:137-146. [PMID: 38336216 DOI: 10.1016/j.biopsych.2023.12.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 11/11/2023] [Accepted: 12/06/2023] [Indexed: 02/12/2024]
Abstract
BACKGROUND: Individuals with obsessive-compulsive disorder (OCD) show persistent avoidance behaviors, often in the absence of actual threat. Quality-of-life costs and heterogeneity support the need for novel brain-behavior intervention targets. Informed by mechanistic and anatomical studies of persistent avoidance in rodents and nonhuman primates, our goal was to test whether connections within a hypothesized persistent avoidance-related network predicted OCD-related harm avoidance (HA), a trait measure of persistent avoidance. We hypothesized that 1) HA, not an OCD diagnosis, would be associated with altered endogenous connectivity in at least one connection in the network; 2) HA-specific findings would be robust to comorbid symptoms; and 3) reliable findings would replicate in a holdout testing subsample.
METHODS: Using resting-state functional connectivity magnetic resonance imaging, cross-validated elastic net for feature selection, and Poisson generalized linear models, we tested which connections significantly predicted HA in our training subsample (n = 73; 71.8% female; healthy control group n = 36, OCD group n = 37); robustness to comorbidities; and replicability in a testing subsample (n = 30; 56.7% female; healthy control group n = 15, OCD group n = 15).
RESULTS: Stronger inverse connectivity between the right dorsal anterior cingulate cortex and right basolateral amygdala and stronger positive connectivity between the right ventral anterior insula and left ventral striatum were associated with greater HA across groups. Network connections did not discriminate OCD diagnostic status or predict HA-correlated traits, suggesting sensitivity to trait HA. The dorsal anterior cingulate cortex-basolateral amygdala relationship was robust to controlling for comorbidities and medication in individuals with OCD and was also predictive of HA in our testing subsample.
CONCLUSIONS: Stronger inverse dorsal anterior cingulate cortex-basolateral amygdala connectivity was robustly and reliably associated with HA across groups and in OCD. Results support the relevance of a cross-species persistent avoidance-related network to OCD, with implications for precision-based approaches and treatment.
Affiliation(s)
- Merage Ghane
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Lucas Trambaiolli
  - Department of Psychiatry, McLean Hospital, Harvard Medical School, Boston, Massachusetts
- Michele A Bertocci
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Henry W Chase
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Tyler Brady
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Alex Skeba
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Simona Graur
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Lisa Bonar
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Satish Iyengar
  - Department of Statistics, University of Pittsburgh, Pittsburgh, Pennsylvania
- Gregory J Quirk
  - School of Medicine, University of Puerto Rico, San Juan, Puerto Rico
- Steven A Rasmussen
  - Department of Psychiatry and Human Behavior, Warren Alpert Medical School of Brown University, Providence, Rhode Island
- Suzanne N Haber
  - Department of Psychiatry, McLean Hospital, Harvard Medical School, Boston, Massachusetts; School of Medicine and Dentistry, University of Rochester Medical Center, Rochester, New York
- Mary L Phillips
  - Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
2
Stoll FM, Rudebeck PH. Preferences reveal dissociable encoding across prefrontal-limbic circuits. Neuron 2024; 112:2241-2256.e8. [PMID: 38640933 PMCID: PMC11223984 DOI: 10.1016/j.neuron.2024.03.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 12/04/2023] [Accepted: 03/19/2024] [Indexed: 04/21/2024]
Abstract
Individual preferences for the flavor of different foods and fluids exert a strong influence on behavior. Most current theories posit that preferences are integrated with other state variables in the orbitofrontal cortex (OFC), which is thought to derive the relative subjective value of available options to guide choice behavior. Here, we report that instead of a single integrated valuation system in the OFC, another complementary one is centered in the ventrolateral prefrontal cortex (vlPFC) in macaques. Specifically, we found that the OFC and vlPFC preferentially represent outcome flavor and outcome probability, respectively, and that preferences are separately integrated into value representations in these areas. In addition, the vlPFC, but not the OFC, represented the probability of receiving the available outcome flavors separately, with the difference between these representations reflecting the degree of preference for each flavor. Thus, both the vlPFC and OFC exhibit dissociable but complementary representations of subjective value, both of which are necessary for decision-making.
Affiliation(s)
- Frederic M Stoll
  - Nash Family Department of Neuroscience, Lipschultz Center for Cognitive Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Peter H Rudebeck
  - Nash Family Department of Neuroscience, Lipschultz Center for Cognitive Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
3
Burk DC, Taswell C, Tang H, Averbeck BB. Computational Mechanisms Underlying Motivation to Earn Symbolic Reinforcers. J Neurosci 2024; 44:e1873232024. [PMID: 38670805 PMCID: PMC11170943 DOI: 10.1523/jneurosci.1873-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 02/27/2024] [Accepted: 04/11/2024] [Indexed: 04/28/2024] Open
Abstract
Reinforcement learning is a theoretical framework that describes how agents learn to select options that maximize rewards and minimize punishments over time. We often make choices, however, to obtain symbolic reinforcers (e.g., money, points) that are later exchanged for primary reinforcers (e.g., food, drink). Although symbolic reinforcers are ubiquitous in our daily lives and widely used in laboratory tasks because they can be motivating, the mechanisms by which they become motivating are less understood. In the present study, we examined how monkeys learn to make choices that maximize fluid rewards through reinforcement with tokens. The question addressed here is how the value of a state, which is a function of multiple task features (e.g., the current number of accumulated tokens, choice options, task epoch, trials since the last delivery of primary reinforcer), affects motivation. We constructed a Markov decision process model that computes the value of task states given task features, which we then correlated with the motivational state of the animal. Fixation times, choice reaction times, and abort frequency were all significantly related to values of task states during the tokens task (n = 5 monkeys, three males and two females). Furthermore, the model makes predictions for how neural responses could change on a moment-by-moment basis relative to changes in the state value. Together, this task and model allow us to capture learning and behavior related to symbolic reinforcement.
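As an illustration of what "computing the value of task states" in a Markov decision process can look like, the sketch below runs value iteration on a toy token task. It is not the authors' model: the state (number of tokens held), the two options and their win probabilities, the cash-out rule, the discount factor `gamma`, and the function name `token_state_values` are all assumptions chosen for illustration.

```python
def token_state_values(options=None, max_tokens=3, gamma=0.9, tol=1e-10):
    """Value iteration on a hypothetical token-accumulation task.

    State: number of tokens currently held (0 .. max_tokens - 1).
    Action: pick one option; each option wins a token with its own probability.
    Winning the final token cashes out max_tokens units of reward and resets
    the token count to zero. All of these rules are illustrative assumptions.
    """
    if options is None:
        options = {"good": 0.8, "bad": 0.2}  # assumed win probabilities
    values = [0.0] * max_tokens
    while True:
        updated = []
        for tokens in range(max_tokens):
            if tokens + 1 == max_tokens:
                # Final token: receive the primary reward, restart from 0 tokens.
                win = max_tokens + gamma * values[0]
            else:
                win = gamma * values[tokens + 1]
            stay = gamma * values[tokens]  # no token gained on this trial
            # Bellman optimality backup: take the best available option.
            best = max(p * win + (1.0 - p) * stay for p in options.values())
            updated.append(best)
        change = max(abs(new - old) for new, old in zip(updated, values))
        values = updated
        if change < tol:
            return values


state_values = token_state_values()
for tokens, value in enumerate(state_values):
    print(f"tokens held = {tokens}: state value = {value:.3f}")
```

In this toy setting the state value rises as the agent gets closer to cash-out, which is the kind of quantity one could then relate to behavioral measures such as reaction times.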
Affiliation(s)
- Diana C Burk
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892-4415
- Craig Taswell
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892-4415
- Hua Tang
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892-4415
- Bruno B Averbeck
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892-4415
4
Tang H, Bartolo-Orozco R, Averbeck BB. Ventral frontostriatal circuitry mediates the computation of reinforcement from symbolic gains and losses. bioRxiv 2024:2024.04.03.587097. [PMID: 38617219 PMCID: PMC11014508 DOI: 10.1101/2024.04.03.587097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Reinforcement learning (RL), particularly in primates, is often driven by symbolic outcomes. However, it is usually studied with primary reinforcers. To examine the neural mechanisms underlying learning from symbolic outcomes, we trained monkeys on a task in which they learned to choose options that led to gains of tokens and avoid choosing options that led to losses of tokens. We then recorded simultaneously from the orbitofrontal cortex (OFC), ventral striatum (VS), amygdala (AMY), and the mediodorsal thalamus (MDt). We found that the OFC played a dominant role in coding token outcomes and token prediction errors. The other areas contributed complementary functions with the VS coding appetitive outcomes and the AMY coding the salience of outcomes. The MDt coded actions and relayed information about tokens between the OFC and VS. Thus, OFC leads the process of symbolic reinforcement learning in the ventral frontostriatal circuitry.
5
Bernklau TW, Righetti B, Mehrke LS, Jacob SN. Striatal dopamine signals reflect perceived cue-action-outcome associations in mice. Nat Neurosci 2024; 27:747-757. [PMID: 38291283 PMCID: PMC11001585 DOI: 10.1038/s41593-023-01567-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Accepted: 12/21/2023] [Indexed: 02/01/2024]
Abstract
Striatal dopamine drives associative learning by acting as a teaching signal. Much work has focused on simple learning paradigms, including Pavlovian and instrumental learning. However, higher cognition requires that animals generate internal concepts of their environment, where sensory stimuli, actions and outcomes become flexibly associated. Here, we performed fiber photometry dopamine measurements across the striatum of male mice as they learned cue-action-outcome associations based on implicit and changing task rules. Reinforcement learning models of the behavioral and dopamine data showed that rule changes lead to adjustments of learned cue-action-outcome associations. After rule changes, mice discarded learned associations and reset outcome expectations. Cue- and outcome-triggered dopamine signals became uncoupled and dependent on the adopted behavioral strategy. As mice learned the new association, coupling between cue- and outcome-triggered dopamine signals and task performance re-emerged. Our results suggest that dopaminergic reward prediction errors reflect an agent's perceived locus of control.
Affiliation(s)
- Tobias W Bernklau
  - Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
  - Graduate School of Systemic Neurosciences, Ludwig-Maximilians-University Munich, Munich, Germany
- Beatrice Righetti
  - Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Leonie S Mehrke
  - Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Simon N Jacob
  - Translational Neurotechnology Laboratory, Department of Neurosurgery, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
6
Giarrocco F, Costa VD, Basile BM, Pujara MS, Murray EA, Averbeck BB. Motor System-Dependent Effects of Amygdala and Ventral Striatum Lesions on Explore-Exploit Behaviors. J Neurosci 2024; 44:e1206232023. [PMID: 38296647 PMCID: PMC10860650 DOI: 10.1523/jneurosci.1206-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 11/17/2023] [Accepted: 11/21/2023] [Indexed: 02/02/2024] Open
Abstract
Deciding whether to forego immediate rewards or explore new opportunities is a key component of flexible behavior and is critical for the survival of the species. Although previous studies have shown that different cortical and subcortical areas, including the amygdala and ventral striatum (VS), are implicated in representing the immediate (exploitative) and future (explorative) value of choices, the effect of the motor system used to make choices has not been examined. Here, we tested male rhesus macaques with amygdala or VS lesions on two versions of a three-arm bandit task where choices were registered with either a saccade or an arm movement. In both tasks we presented the monkeys with explore-exploit tradeoffs by periodically replacing familiar options with novel options that had unknown reward probabilities. We found that monkeys explored more with saccades but showed better learning with arm movements. VS lesions caused the monkeys to be more explorative with arm movements and less explorative with saccades, although this may have been due to an overall decrease in performance. VS lesions affected the monkeys' ability to learn novel stimulus-reward associations in both tasks, while after amygdala lesions this effect was stronger when choices were made with saccades. Further, on average, VS and amygdala lesions reduced the monkeys' ability to choose better options only when choices were made with a saccade. These results show that learning reward value associations to manage explore-exploit behaviors is motor system dependent and they further define the contributions of amygdala and VS to reinforcement learning.
Affiliation(s)
- Franco Giarrocco
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415
- Vincent D Costa
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415
  - Division of Neuroscience, Oregon National Primate Research Center, Beaverton, OR 97006
- Benjamin M Basile
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415
  - Department of Psychology, Dickinson College, Carlisle, PA 17013
- Maia S Pujara
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415
- Elisabeth A Murray
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415
- Bruno B Averbeck
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415
7
Aguirre CG, Woo JH, Romero-Sosa JL, Rivera ZM, Tejada AN, Munier JJ, Perez J, Goldfarb M, Das K, Gomez M, Ye T, Pannu J, Evans K, O'Neill PR, Spigelman I, Soltani A, Izquierdo A. Dissociable Contributions of Basolateral Amygdala and Ventrolateral Orbitofrontal Cortex to Flexible Learning Under Uncertainty. J Neurosci 2024; 44:e0622232023. [PMID: 37968116 PMCID: PMC10860573 DOI: 10.1523/jneurosci.0622-23.2023] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/02/2023] [Revised: 10/16/2023] [Accepted: 10/17/2023] [Indexed: 11/17/2023] Open
Abstract
Reversal learning measures the ability to form flexible associations between choice outcomes and the stimuli and actions that precede them. This type of learning is thought to rely on several cortical and subcortical areas, including the highly interconnected orbitofrontal cortex (OFC) and basolateral amygdala (BLA), and is often impaired in various neuropsychiatric and substance use disorders. However, the unique contributions of these regions to stimulus- and action-based reversal learning have not been systematically compared using a chemogenetic approach, particularly before and after the first reversal, which introduces new uncertainty. Here, we examined the roles of ventrolateral OFC (vlOFC) and BLA during reversal learning. Male and female rats were prepared with inhibitory designer receptors exclusively activated by designer drugs targeting projection neurons in these regions and tested on a series of deterministic and probabilistic reversals during which they learned about stimulus identity or side (left or right) associated with different reward probabilities. Using a counterbalanced within-subject design, we inhibited these regions prior to reversal sessions. We assessed initial and pre-/post-reversal changes in performance to measure learning and adjustments to reversals, respectively. We found that inhibition of vlOFC, but not BLA, eliminated adjustments to stimulus-based reversals. Inhibition of BLA, but not vlOFC, selectively impaired action-based probabilistic reversal learning, leaving deterministic reversal learning intact. vlOFC exhibited a sex-dependent role in early adjustment to action-based reversals, but not in overall learning. These results reveal dissociable roles for BLA and vlOFC in flexible learning and highlight a more crucial role for BLA in learning meaningful changes in the reward environment.
Affiliation(s)
- C G Aguirre
  - Department of Psychology, University of California, Los Angeles, California 90095
- J H Woo
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- J L Romero-Sosa
  - Department of Psychology, University of California, Los Angeles, California 90095
- Z M Rivera
  - Department of Psychology, University of California, Los Angeles, California 90095
- A N Tejada
  - Department of Psychology, University of California, Los Angeles, California 90095
- J J Munier
  - Section of Biosystems and Function, School of Dentistry, University of California, Los Angeles, California 90095
- J Perez
  - Department of Psychology, University of California, Los Angeles, California 90095
- M Goldfarb
  - Department of Psychology, University of California, Los Angeles, California 90095
- K Das
  - Department of Psychology, University of California, Los Angeles, California 90095
- M Gomez
  - Department of Psychology, University of California, Los Angeles, California 90095
- T Ye
  - Department of Psychology, University of California, Los Angeles, California 90095
- J Pannu
  - Section of Biosystems and Function, School of Dentistry, University of California, Los Angeles, California 90095
- K Evans
  - Department of Psychology, University of California, Los Angeles, California 90095
- P R O'Neill
  - Shirley and Stefan Hatos Center for Neuropharmacology, Department of Psychiatry and Biobehavioral Sciences, University of California Los Angeles, Los Angeles, California 90095
- I Spigelman
  - Section of Biosystems and Function, School of Dentistry, University of California, Los Angeles, California 90095
- A Soltani
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- A Izquierdo
  - Department of Psychology, University of California, Los Angeles, California 90095
8
Lowet AS, Zheng Q, Meng M, Matias S, Drugowitsch J, Uchida N. An opponent striatal circuit for distributional reinforcement learning. bioRxiv 2024:2024.01.02.573966. [PMID: 38260354 PMCID: PMC10802299 DOI: 10.1101/2024.01.02.573966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Machine learning research has achieved large performance gains on a wide range of tasks by expanding the learning target from mean rewards to entire probability distributions of rewards, an approach known as distributional reinforcement learning (RL) [1]. The mesolimbic dopamine system is thought to underlie RL in the mammalian brain by updating a representation of mean value in the striatum [2,3], but little is known about whether, where, and how neurons in this circuit encode information about higher-order moments of reward distributions [4]. To fill this gap, we used high-density probes (Neuropixels) to acutely record striatal activity from well-trained, water-restricted mice performing a classical conditioning task in which reward mean, reward variance, and stimulus identity were independently manipulated. In contrast to traditional RL accounts, we found robust evidence for abstract encoding of variance in the striatum. Remarkably, chronic ablation of dopamine inputs disorganized these distributional representations in the striatum without interfering with mean value coding. Two-photon calcium imaging and optogenetics revealed that the two major classes of striatal medium spiny neurons (D1 and D2 MSNs) contributed to this code by preferentially encoding the right and left tails of the reward distribution, respectively. We synthesize these findings into a new model of the striatum and mesolimbic dopamine that harnesses the opponency between D1 and D2 MSNs [5-15] to reap the computational benefits of distributional RL.
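One common way to make the move from mean values to reward distributions concrete is to let different value predictors weight positive and negative prediction errors differently, so that some settle near the upper tail of the distribution and others near the lower tail. The sketch below shows that generic idea only; it is not the model proposed in this preprint, and the reward sequence, the learning rates, and the function name `learn_value` are assumptions chosen for illustration.

```python
def learn_value(rewards, alpha_pos, alpha_neg, initial=0.0):
    """Track a value estimate with separate learning rates for positive and
    negative prediction errors (an asymmetric delta rule)."""
    value = initial
    for reward in rewards:
        error = reward - value
        # Optimistic units weight positive errors more; pessimistic units
        # weight negative errors more.
        rate = alpha_pos if error > 0 else alpha_neg
        value += rate * error
    return value


# Assumed toy reward stream: outcomes of 0 and 10 in equal proportion.
rewards = [0.0, 10.0] * 2000

pessimistic = learn_value(rewards, alpha_pos=0.02, alpha_neg=0.08)
balanced = learn_value(rewards, alpha_pos=0.05, alpha_neg=0.05)
optimistic = learn_value(rewards, alpha_pos=0.08, alpha_neg=0.02)

print(f"pessimistic: {pessimistic:.2f}")
print(f"balanced:    {balanced:.2f}")
print(f"optimistic:  {optimistic:.2f}")
```

With equal learning rates the estimate hovers near the mean of the rewards, while the asymmetric predictors settle toward the lower or upper end of the distribution. Taken together, such a population carries information about the spread of rewards and not only their average.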
Affiliation(s)
- Adam S Lowet
  - Center for Brain Science, Harvard University, Cambridge, MA, USA
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
  - Program in Neuroscience, Harvard University, Boston, MA, USA
- Qiao Zheng
  - Center for Brain Science, Harvard University, Cambridge, MA, USA
  - Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Melissa Meng
  - Center for Brain Science, Harvard University, Cambridge, MA, USA
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Sara Matias
  - Center for Brain Science, Harvard University, Cambridge, MA, USA
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Jan Drugowitsch
  - Center for Brain Science, Harvard University, Cambridge, MA, USA
  - Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Naoshige Uchida
  - Center for Brain Science, Harvard University, Cambridge, MA, USA
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
9
Basile BM, Costa VD, Schafroth JL, Karaskiewicz CL, Lucas DR, Murray EA. The amygdala is not necessary for the familiarity aspect of recognition memory. Nat Commun 2023; 14:8109. [PMID: 38062014 PMCID: PMC10703781 DOI: 10.1038/s41467-023-43906-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
Dual-process accounts of item recognition posit two memory processes: slow but detailed recollection, and quick but vague familiarity. It has been proposed, based on prior rodent work, that the amygdala is critical for the familiarity aspect of item recognition. Here, we evaluated this proposal in male rhesus monkeys (Macaca mulatta) with selective bilateral excitotoxic amygdala damage. We used four established visual memory tests designed to assess different aspects of familiarity, all administered on touchscreen computers. Specifically, we assessed monkeys' tendencies to make low-latency false alarms, to make false alarms to recently seen lures, to produce curvilinear ROC curves, and to discriminate stimuli based on repetition across days. Three of the four tests showed no familiarity impairment and the fourth was explained by a deficit in reward processing. Consistent with this, amygdala damage did produce an anticipated deficit in reward processing in a three-arm-bandit gambling task, verifying the effectiveness of the lesions. Together, these results contradict prior rodent work and suggest that the amygdala is not critical for the familiarity aspect of item recognition.
Affiliation(s)
- Benjamin M Basile
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
  - Department of Psychology, Dickinson College, Carlisle, PA, USA
- Vincent D Costa
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
  - Division of Neuroscience, Oregon National Primate Research Center, Portland, OR, USA
- Jamie L Schafroth
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
  - School of Anthropology, University of Arizona, Tucson, AZ, USA
- Chloe L Karaskiewicz
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
  - Department of Psychology, UC Davis, Davis, CA, USA
- Daniel R Lucas
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
- Elisabeth A Murray
  - Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
10
Hattori R, Hedrick NG, Jain A, Chen S, You H, Hattori M, Choi JH, Lim BK, Yasuda R, Komiyama T. Meta-reinforcement learning via orbitofrontal cortex. Nat Neurosci 2023; 26:2182-2191. [PMID: 37957318 PMCID: PMC10689244 DOI: 10.1038/s41593-023-01485-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Accepted: 10/06/2023] [Indexed: 11/15/2023]
Abstract
The meta-reinforcement learning (meta-RL) framework, which involves RL over multiple timescales, has been successful in training deep RL models that generalize to new environments. It has been hypothesized that the prefrontal cortex may mediate meta-RL in the brain, but the evidence is scarce. Here we show that the orbitofrontal cortex (OFC) mediates meta-RL. We trained mice and deep RL models on a probabilistic reversal learning task across sessions during which they improved their trial-by-trial RL policy through meta-learning. Ca2+/calmodulin-dependent protein kinase II-dependent synaptic plasticity in OFC was necessary for this meta-learning but not for the within-session trial-by-trial RL in experts. After meta-learning, OFC activity robustly encoded value signals, and OFC inactivation impaired the RL behaviors. Longitudinal tracking of OFC activity revealed that meta-learning gradually shapes population value coding to guide the ongoing behavioral policy. Our results indicate that two distinct RL algorithms with distinct neural mechanisms and timescales coexist in OFC to support adaptive decision-making.
Affiliation(s)
- Ryoma Hattori
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
  - Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
  - Department of Neuroscience, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, University of Florida, Jupiter, FL, USA
- Nathan G Hedrick
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
  - Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
- Anant Jain
  - Max Planck Florida Institute for Neuroscience, Jupiter, FL, USA
- Shuqi Chen
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
  - Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
- Hanjia You
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
  - Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
- Mariko Hattori
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
  - Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
- Jun-Hyeok Choi
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Byung Kook Lim
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Ryohei Yasuda
  - Max Planck Florida Institute for Neuroscience, Jupiter, FL, USA
- Takaki Komiyama
  - Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
  - Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
11
Oyama K, Majima K, Nagai Y, Hori Y, Hirabayashi T, Eldridge MAG, Mimura K, Miyakawa N, Fujimoto A, Hori Y, Iwaoki H, Inoue KI, Saunders RC, Takada M, Yahata N, Higuchi M, Richmond BJ, Minamimoto T. Distinct roles of monkey OFC-subcortical pathways in adaptive behavior. bioRxiv 2023:2023.11.17.567492. [PMID: 38076986 PMCID: PMC10705585 DOI: 10.1101/2023.11.17.567492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/22/2023]
Abstract
To be the most successful, primates must adapt to changing environments and optimize their behavior by making the most beneficial choices. At the core of adaptive behavior is the orbitofrontal cortex (OFC) of the brain, which updates choice value through direct experience or knowledge-based inference. Here, we identify distinct neural circuitry underlying these two separate abilities. We designed two behavioral tasks in which macaque monkeys updated the values of certain items, either by directly experiencing changes in stimulus-reward associations, or by inferring the value of unexperienced items based on the task's rules. Chemogenetic silencing of bilateral OFC combined with mathematical model-fitting analysis revealed that monkey OFC is involved in updating item value based on both experience and inference. In vivo imaging of chemogenetic receptors by positron emission tomography allowed us to map projections from the OFC to the rostromedial caudate nucleus (rmCD) and the medial part of the mediodorsal thalamus (MDm). Chemogenetic silencing of the OFC-rmCD pathway impaired experience-based value updating, while silencing the OFC-MDm pathway impaired inference-based value updating. Our results thus demonstrate a dissociable contribution of distinct OFC projections to different behavioral strategies, and provide new insights into the neural basis of value-based adaptive decision-making in primates.
Affiliation(s)
- Kei Oyama
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- PRESTO, Japan Science and Technology Agency, Kawaguchi, Japan
| | - Kei Majima
- Institute for Quantum Life Science, National Institutes for Quantum Science and Technology, Chiba, Japan
- PRESTO, Japan Science and Technology Agency, Kawaguchi, Japan
| | - Yuji Nagai
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Yukiko Hori
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Toshiyuki Hirabayashi
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Mark A G Eldridge
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, USA
| | - Koki Mimura
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Research Center for Medical and Health Data Science, The Institute of Statistical Mathematics, Tachikawa, Japan
| | - Naohisa Miyakawa
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Atsushi Fujimoto
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Yuki Hori
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Haruhiko Iwaoki
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Ken-Ichi Inoue
- Systems Neuroscience Section, Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
| | - Richard C Saunders
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, USA
| | - Masahiko Takada
- Systems Neuroscience Section, Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
| | - Noriaki Yahata
- Institute for Quantum Life Science, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Makoto Higuchi
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
| | - Barry J Richmond
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, USA
| | - Takafumi Minamimoto
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
|
12
|
Aguirre CG, Woo JH, Romero-Sosa JL, Rivera ZM, Tejada AN, Munier JJ, Perez J, Goldfarb M, Das K, Gomez M, Ye T, Pannu J, Evans K, O'Neill PR, Spigelman I, Soltani A, Izquierdo A. Dissociable contributions of basolateral amygdala and ventrolateral orbitofrontal cortex to flexible learning under uncertainty. bioRxiv 2023:2023.04.03.535471. [PMID: 37066321 PMCID: PMC10104064 DOI: 10.1101/2023.04.03.535471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Reversal learning measures the ability to form flexible associations between choice outcomes with stimuli and actions that precede them. This type of learning is thought to rely on several cortical and subcortical areas, including highly interconnected orbitofrontal cortex (OFC) and basolateral amygdala (BLA), and is often impaired in various neuropsychiatric and substance use disorders. However, unique contributions of these regions to stimulus- and action-based reversal learning have not been systematically compared using a chemogenetic approach and particularly before and after the first reversal that introduces new uncertainty. Here, we examined the roles of ventrolateral OFC (vlOFC) and BLA during reversal learning. Male and female rats were prepared with inhibitory DREADDs targeting projection neurons in these regions and tested on a series of deterministic and probabilistic reversals during which they learned about stimulus identity or side (left or right) associated with different reward probabilities. Using a counterbalanced within-subject design, we inhibited these regions prior to reversal sessions. We assessed initial and pre-post reversal changes in performance to measure learning and adjustments to reversals, respectively. We found that inhibition of vlOFC, but not BLA, eliminated adjustments to stimulus-based reversals. Inhibition of BLA, but not vlOFC, selectively impaired action-based probabilistic reversal learning, leaving deterministic reversal learning intact. vlOFC exhibited a sex-dependent role in early adjustment to action-based reversals, but not in overall learning. These results reveal dissociable roles for BLA and vlOFC in flexible learning and highlight a more crucial role for BLA in learning meaningful changes in the reward environment.
|
13
|
Burk DC, Taswell C, Tang H, Averbeck BB. Computational mechanisms underlying motivation to earn symbolic reinforcers. bioRxiv 2023:2023.10.11.561900. [PMID: 37873311 PMCID: PMC10592730 DOI: 10.1101/2023.10.11.561900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Reinforcement learning (RL) is a theoretical framework that describes how agents learn to select options that maximize rewards and minimize punishments over time. We often make choices, however, to obtain symbolic reinforcers (e.g. money, points) that can later be exchanged for primary reinforcers (e.g. food, drink). Although symbolic reinforcers are motivating, little is understood about the neural or computational mechanisms underlying the motivation to earn them. In the present study, we examined how monkeys learn to make choices that maximize fluid rewards through reinforcement with tokens. The question addressed here is how the value of a state, which is a function of multiple task features (e.g. current number of accumulated tokens, choice options, task epoch, trials since last delivery of primary reinforcer, etc.), drives value and affects motivation. We constructed a Markov decision process model that computes the value of task states given task features to capture the motivational state of the animal. Fixation times, choice reaction times, and abort frequency were all significantly related to values of task states during the tokens task (n=5 monkeys). Furthermore, the model makes predictions for how neural responses could change on a moment-by-moment basis relative to changes in state value. Together, this task and model allow us to capture learning and behavior related to symbolic reinforcement.
Affiliation(s)
- Diana C. Burk
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda MD, 20892-4415
| | - Craig Taswell
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda MD, 20892-4415
| | - Hua Tang
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda MD, 20892-4415
| | - Bruno B. Averbeck
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda MD, 20892-4415
|
14
|
Deng Y, Song D, Ni J, Qing H, Quan Z. Reward prediction error in learning-related behaviors. Front Neurosci 2023; 17:1171612. [PMID: 37662112 PMCID: PMC10471312 DOI: 10.3389/fnins.2023.1171612] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/22/2023] [Accepted: 07/31/2023] [Indexed: 09/05/2023]
Abstract
Learning is a complex process, during which our opinions and decisions are easily changed due to unexpected information. But the neural mechanism underlying revision and correction during the learning process remains unclear. For decades, prediction error has been regarded as the core of changes to perception in learning, even driving the learning progress. In this article, we reviewed the concept of reward prediction error, and the encoding mechanism of dopaminergic neurons and the related neural circuits. We also discussed the relationship between reward prediction error and learning-related behaviors, including reversal learning. We then demonstrated the evidence of reward prediction error signals in several neurological diseases, including Parkinson's disease and addiction. These observations may help to better understand the regulatory mechanism of reward prediction error in learning-related behaviors.
Affiliation(s)
- Yujun Deng
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, Beijing, China
| | - Da Song
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, Beijing, China
| | - Junjun Ni
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, Beijing, China
| | - Hong Qing
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, Beijing, China
- Department of Biology, Shenzhen MSU-BIT University, Shenzhen, China
| | - Zhenzhen Quan
- Key Laboratory of Molecular Medicine and Biotherapy, School of Life Science, Beijing Institute of Technology, Beijing, China
|
15
|
Taswell CA, Janssen M, Murray EA, Averbeck BB. The motivational role of the ventral striatum and amygdala in learning from gains and losses. Behav Neurosci 2023; 137:268-280. [PMID: 37141014 PMCID: PMC10363235 DOI: 10.1037/bne0000558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2023]
Abstract
The ventral striatum (VS) and amygdala are two structures often implicated as essential structures for learning. The literature addressing the contribution of these areas to learning, however, is not entirely consistent. We propose that these inconsistencies are due to learning environments and the effect they have on motivation. To differentiate aspects of learning from environmental factors that affect motivation, we ran a series of experiments with varying task factors. We compared monkeys (Macaca mulatta) with VS lesions, amygdala lesions, and unoperated controls on reinforcement learning (RL) tasks that involve learning from both gains and losses as well as from deterministic and stochastic schedules of reinforcement. We found that for all three groups, performance varied by experiment. All three groups modulated their behavior in the same directions, to varying degrees, across the three experiments. This behavioral modulation is why we find deficits in some experiments, but not others. The amount of effort animals exhibited differed depending on the learning environment. Our results suggest that the VS is important for the amount of effort animals will give in rich deterministic and relatively leaner stochastic learning environments. We also showed that monkeys with amygdala lesions can learn stimulus-based RL in stochastic environments and environments with loss and conditioned reinforcers. These results show that learning environments shape motivation and that the VS is essential for distinct aspects of motivated behavior.
Affiliation(s)
- Craig A Taswell
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health
| | - Miriam Janssen
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health
| | - Elisabeth A Murray
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health
| | - Bruno B Averbeck
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health
|
16
|
Woo JH, Aguirre CG, Bari BA, Tsutsui KI, Grabenhorst F, Cohen JY, Schultz W, Izquierdo A, Soltani A. Mechanisms of adjustments to different types of uncertainty in the reward environment across mice and monkeys. Cogn Affect Behav Neurosci 2023; 23:600-619. [PMID: 36823249 PMCID: PMC10444905 DOI: 10.3758/s13415-022-01059-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/22/2022] [Indexed: 02/25/2023]
Abstract
Despite being unpredictable and uncertain, reward environments often exhibit certain regularities, and animals navigating these environments try to detect and utilize such regularities to adapt their behavior. However, successful learning requires that animals also adjust to uncertainty associated with those regularities. Here, we analyzed choice data from two comparable dynamic foraging tasks in mice and monkeys to investigate mechanisms underlying adjustments to different types of uncertainty. In these tasks, animals selected between two choice options that delivered reward probabilistically, while baseline reward probabilities changed after a variable number (block) of trials without any cues to the animals. To measure adjustments in behavior, we applied multiple metrics based on information theory that quantify consistency in behavior, and fit choice data using reinforcement learning models. We found that in both species, learning and choice were affected by uncertainty about reward outcomes (in terms of determining the better option) and by expectation about when the environment may change. However, these effects were mediated through different mechanisms. First, more uncertainty about the better option resulted in slower learning and forgetting in mice, whereas it had no significant effect in monkeys. Second, expectation of block switches accompanied slower learning, faster forgetting, and increased stochasticity in choice in mice, whereas it only reduced learning rates in monkeys. Overall, while demonstrating the usefulness of metrics based on information theory in examining adaptive behavior, our study provides evidence for multiple types of adjustments in learning and choice behavior according to uncertainty in the reward environment.
Affiliation(s)
- Jae Hyung Woo
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Claudia G Aguirre
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Bilal A Bari
- Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
| | - Ken-Ichiro Tsutsui
- Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, UK
- Laboratory of Systems Neuroscience, Tohoku University Graduate School of Life Sciences, Sendai, Japan
| | - Fabian Grabenhorst
- Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, UK
- Department of Experimental Psychology, University of Oxford, Oxford, UK
| | - Jeremiah Y Cohen
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Allen Institute for Neural Dynamics, Seattle, WA, USA
| | - Wolfram Schultz
- Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, UK
| | - Alicia Izquierdo
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- The Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA
| | - Alireza Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA.
|
17
|
Stoll FM, Rudebeck PH. Preferences reveal separable valuation systems in prefrontal-limbic circuits. bioRxiv 2023:2023.05.10.540239. [PMID: 37214895 PMCID: PMC10197711 DOI: 10.1101/2023.05.10.540239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Individual preferences for the flavor of different foods and fluids exert a strong influence on behavior. Most current theories posit that preferences are integrated with other state variables in orbitofrontal cortex (OFC), which is thought to derive the relative subjective value of available options to drive choice behavior. Here we report that instead of a single integrated valuation system in OFC, another separate one is centered in ventrolateral prefrontal cortex (vlPFC) in macaque monkeys. Specifically, we found that OFC and vlPFC preferentially represent outcome flavor and outcome probability, respectively, and that preferences are separately integrated into these two aspects of subjective valuation. In addition, vlPFC, but not OFC, represented the outcome probability for the two options separately, with the difference between these representations reflecting the degree of preference. Thus, there are at least two separable valuation systems that work in concert to guide choices and that both are biased by preferences.
Affiliation(s)
- Frederic M Stoll
- Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
| | - Peter H Rudebeck
- Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY, 10029, USA
|
18
|
Wen T, Geddert RM, Madlon-Kay S, Egner T. Transfer of Learned Cognitive Flexibility to Novel Stimuli and Task Sets. Psychol Sci 2023; 34:435-454. [PMID: 36693129 PMCID: PMC10236430 DOI: 10.1177/09567976221141854] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/05/2022] [Accepted: 11/03/2022] [Indexed: 01/25/2023]
Abstract
Adaptive behavior requires learning about the structure of one's environment to derive optimal action policies, and previous studies have documented transfer of such structural knowledge to bias choices in new environments. Here, we asked whether people could also acquire and transfer more abstract knowledge across different task environments, specifically expectations about cognitive control demands. Over three experiments, participants (Amazon Mechanical Turk workers; N = ~80 adults per group) performed a probabilistic card-sorting task in environments of either a low or high volatility of task rule changes (requiring low or high cognitive flexibility, respectively) before transitioning to a medium-volatility environment. Using reinforcement-learning modeling, we consistently found that previous exposure to high task rule volatilities led to faster adaptation to rule changes in the subsequent transfer phase. These transfers of expectations about cognitive flexibility demands were both task independent (Experiment 2) and stimulus independent (Experiment 3), thus demonstrating the formation and generalization of environmental structure knowledge to guide cognitive control.
Affiliation(s)
- Tanya Wen
- Center for Cognitive Neuroscience, Duke University
| | - Seth Madlon-Kay
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine
| | - Tobias Egner
- Center for Cognitive Neuroscience, Duke University
- Department of Psychology and Neuroscience, Duke University
|
19
|
Wang X, Liao J, Nan Y, Hu J, Wu Y. Can testosterone modulate prosocial learning in healthy males? A double-blind, placebo-controlled, testosterone administration study. Biol Psychol 2023; 178:108524. [PMID: 36801356 DOI: 10.1016/j.biopsycho.2023.108524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 02/09/2023] [Accepted: 02/15/2023] [Indexed: 02/18/2023]
Abstract
Testosterone is associated with both aggressive and prosocial behavior, which depend on the social context and the trade-off between self- and other-interest. However, little is known about the effects of testosterone on prosocial behavior in a context without such trade-offs. The present study aimed to investigate the effects of exogenous testosterone on prosocial behavior by using a prosocial learning task. Healthy male participants (n = 120) received a single dose of testosterone gel in a double-blind, placebo-controlled, between-participants experiment. Participants performed a prosocial learning task in which they were asked to learn to gain rewards for three different recipients, i.e., self, other and computer, by choosing symbols associated with potential rewards. The results showed that testosterone administration increased the learning rates across all the recipient conditions (d_other = 1.57; d_self = 0.50; d_computer = 0.99). More importantly, participants in the testosterone group had a higher prosocial learning rate than those in the placebo group (d = 1.57). These findings suggest that testosterone generally enhances reward sensitivity and prosocial learning. The present study corroborates the social status hypothesis, according to which testosterone promotes status-seeking prosocial behavior when it is appropriate to the social context.
Affiliation(s)
- Xin Wang
- Department of Applied Social Sciences, Hong Kong Polytechnic University, Hung Hom, Hong Kong; School of Psychology, Shenzhen University, Shenzhen, China
| | - Jiajun Liao
- School of Psychology, South China Normal University, Guangzhou, China
| | - Yu Nan
- School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
| | - Jie Hu
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Switzerland
| | - Yin Wu
- Department of Applied Social Sciences, Hong Kong Polytechnic University, Hung Hom, Hong Kong; Research Institute for Sports Science and Technology, Hong Kong Polytechnic University, Hung Hom, Hong Kong.
|
20
|
Ludowicy P, Czernochowski D, Arnaez-Telleria J, Gurunandan K, Lachmann T, Paz-Alonso PM. Functional underpinnings of feedback-enhanced test-potentiated encoding. Cereb Cortex 2022; 33:6184-6197. [PMID: 36585773 DOI: 10.1093/cercor/bhac494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Revised: 11/14/2022] [Accepted: 11/17/2022] [Indexed: 01/01/2023]
Abstract
The testing effect describes the finding that retrieval practice enhances memory performance compared to restudy practice. Prior evidence demonstrates that this effect can be boosted by providing feedback after retrieval attempts (i.e. test-potentiated encoding [TPE]). The present fMRI study investigated the neural processes during successful memory retrieval underlying this beneficial effect of correct answer feedback compared with restudy and whether additional performance feedback leads to further benefits. Twenty-seven participants learned cue-target pairs by (i) restudying, (ii) standard TPE including a restudy opportunity, or (iii) TPE including a restudy opportunity immediately after a positive or negative performance feedback. One day later, a cued retrieval recognition test was performed inside the MRI scanner. Behavioral results confirmed the testing effect and that adding explicit performance feedback enhanced memory relative to restudy and standard TPE. Stronger functional engagement while retrieving items previously restudied was found in lateral prefrontal cortex and superior parietal lobe. By contrast, lateral temporo-parietal areas were more strongly recruited while retrieving items previously tested. Performance feedback increased the hippocampal activation and resulted in stronger functional coupling between hippocampus, supramarginal gyrus, and ventral striatum with lateral temporo-parietal cortex. Our results unveil the main functional dynamics and connectivity nodes underlying memory benefits from additional performance feedback.
Affiliation(s)
- Petra Ludowicy
- Center for Cognitive Science, University of Kaiserslautern, Kaiserslautern 67663, Germany
| | - Daniela Czernochowski
- Center for Cognitive Science, University of Kaiserslautern, Kaiserslautern 67663, Germany
| | - Jaione Arnaez-Telleria
- BCBL-Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain
| | - Kshipra Gurunandan
- BCBL-Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain
| | - Thomas Lachmann
- Center for Cognitive Science, University of Kaiserslautern, Kaiserslautern 67663, Germany; Facultad de Lenguas y Educación, Centro de Investigación Nebrija en Cognición (CINC), Universidad Nebrija, Madrid 28015, Spain
| | - Pedro M Paz-Alonso
- BCBL-Basque Center on Cognition, Brain and Language, Donostia-San Sebastian 20009, Spain; Ikerbasque, Basque Foundation for Science, Bilbao 48013, Spain
|
21
|
Wang X, Li Z, Sun R, Li X, Guo R, Cui X, Liu B, Li W, Yang Y, Huang X, Qu H, Liu C, Wang Z, Lü Y, Yue C. Zunyimycin C enhances immunity and improves cognitive impairment and its mechanism. Front Cell Infect Microbiol 2022; 12:1081243. [PMID: 36579344 PMCID: PMC9791046 DOI: 10.3389/fcimb.2022.1081243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 11/28/2022] [Indexed: 12/14/2022]
Abstract
This study aimed to explore the efficacy of zunyimycin C in the immunological enhancement of hypoimmune mice and improvement of cognitive impairment in a mouse model of Alzheimer's disease (AD). Zunyimycin C was administered intranasally to interfere with AD mouse models or by gavage to hypoimmune animals. Results of the Morris water maze (MWM) showed that zunyimycin C may improve the learning and memory abilities of the AD mouse model. The results of differential expression analysis of mRNA levels of inflammatory factors and pathways in brain tissues of the AD mouse model suggested that differential expression was more obvious under Zun-Int L. Western blot revealed that the relative expression of glial fibrillary acidic protein in the brain tissue of the AD mouse model in the Zun-Pre group was significantly higher than that in the other groups, and the difference was statistically significant. The relative expression of interleukin (IL)-6 protein in the brain tissue of mice in the low-dose intervention group was significantly lower than that in the other groups, and the difference was statistically significant. As for hypoimmune animals, short chain fatty acids (SCFAs) assay and intestinal flora assay results showed that zunyimycin C may change intestinal flora diversity and SCFA biosynthesis. The prophylactic administration of zunyimycin C could not inhibit acute neuroinflammation in AD mice. Zunyimycin C may participate in the immune response by activating the Ras-Raf-MEK-ERK signaling pathway to stimulate microglia to produce more inflammatory factors. Zunyimycin C may inhibit autophagy by activating the PI3K-AKT-mTOR signaling pathway, promote cell survival, mediate neuroprotective effects of reactive microglia and reactive astrocytes, and reduce IL-1β in brain tissue and IL-6 secretion, thereby attenuating neuroinflammation in AD mice and achieving the effect of improving learning and memory impairment. Zunyimycin C may play a role in immunological enhancement by changing intestinal flora diversity and SCFAs.
Affiliation(s)
- Xuemei Wang
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Zexin Li
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Rui Sun
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Xueli Li
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China; Shaanxi Key Laboratory of Chemical Reaction Engineering, College of Chemistry and Chemical Engineering, Yan’an University, Yan’an, Shaanxi, China
| | - Ruirui Guo
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Xiangyi Cui
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Bingxin Liu
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Wujuan Li
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Yi Yang
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Xiaoyu Huang
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Hanlin Qu
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Chen Liu
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
| | - Zhuoling Wang
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China
- Yuhong Lü
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China. Correspondence: Changwu Yue; Yuhong Lü
- Changwu Yue
- Yan’an Key Laboratory of Microbial Drug Innovation and Transformation, School of Basic Medicine, Yan’an University, Yan’an, Shaanxi, China; Shaanxi Institute of Basic Sciences (Chemistry and Biology), Northwestern University, Xi’an, Shaanxi, China. Correspondence: Changwu Yue; Yuhong Lü
|
22
|
Kaskan PM, Nicholas MA, Dean AM, Murray EA. Attention to Stimuli of Learned versus Innate Biological Value Relies on Separate Neural Systems. J Neurosci 2022; 42:9242-9252. [PMID: 36319119 PMCID: PMC9761678 DOI: 10.1523/jneurosci.0925-22.2022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/13/2022] [Revised: 09/25/2022] [Accepted: 10/20/2022] [Indexed: 01/07/2023]
Abstract
The neural bases of attention, a set of neural processes that promote behavioral selection, are a subject of intense investigation. In humans, rewarded cues influence attention, even when those cues are irrelevant to the current task. Because the amygdala plays a role in reward processing, and the activity of amygdala neurons has been linked to spatial attention, we reasoned that the amygdala may be essential for attending to rewarded images. To test this possibility, we used an attentional capture task, which provides a quantitative measure of attentional bias. Specifically, we compared reaction times (RTs) of adult male rhesus monkeys with bilateral amygdala lesions and unoperated controls as they made a saccade away from a high- or low-value rewarded image to a peripheral target. We predicted that: (1) RTs would be longer for high- compared with low-value images, revealing attentional capture by rewarded stimuli; and (2) relative to controls, monkeys with amygdala lesions would exhibit shorter RTs for high-value images. For comparison, we assessed the same groups of monkeys for attentional capture by images of predators and conspecifics, categories thought to have innate biological value. In performing the attentional capture task, all monkeys were slowed more by high-value relative to low-value rewarded images. Contrary to our prediction, amygdala lesions failed to disrupt this effect. When presented with images of predators and conspecifics, however, monkeys with amygdala lesions showed significantly diminished attentional capture relative to controls. Thus, separate neural pathways are responsible for allocating attention to stimuli with learned versus innate value.
SIGNIFICANCE STATEMENT: Valuable objects attract attention. The amygdala is known to contribute to reward processing and the encoding of object reward value. We therefore examined whether the amygdala is necessary for allocating attention to rewarded objects. For comparison, we assessed the amygdala's contribution to attending to objects with innate biological value: predators and conspecifics. We found that the macaque amygdala is necessary for directing attention to images with innate biological value, but not for directing attention to recently learned reward-predictive images. These findings indicate that the amygdala makes selective contributions to attending to valuable objects. The data are relevant to mental health disorders, such as social anxiety disorders and small animal phobias, that arise from biased attention to select categories of objects.
Affiliation(s)
- Peter M Kaskan
- Leo M. Davidoff Department of Neurological Surgery, Albert Einstein College of Medicine, Bronx, New York 10461
- Mark A Nicholas
- Section on Neurobiology of Learning and Memory, Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892
- Aaron M Dean
- Section on Neurobiology of Learning and Memory, Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892
- Elisabeth A Murray
- Section on Neurobiology of Learning and Memory, Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland 20892
|
23
|
Monosov IE, Ogasawara T, Haber SN, Heimel JA, Ahmadlou M. The zona incerta in control of novelty seeking and investigation across species. Curr Opin Neurobiol 2022; 77:102650. [PMID: 36399897 DOI: 10.1016/j.conb.2022.102650] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/03/2022] [Revised: 10/02/2022] [Accepted: 10/06/2022] [Indexed: 11/17/2022]
Abstract
Many organisms rely on a capacity to rapidly replicate, disperse, and evolve when faced with uncertainty and novelty. But mammals do not evolve and replicate quickly. They rely on a sophisticated nervous system to generate predictions and select responses when confronted with these challenges. An important component of their behavioral repertoire is the adaptive context-dependent seeking or avoiding of perceptually novel objects, even when their values have not yet been learned. Here, we outline recent cross-species breakthroughs that shed light on how the zona incerta (ZI), a relatively evolutionarily conserved brain area, supports novelty-seeking and novelty-related investigations. We then conjecture how the architecture of the ZI's anatomical connectivity - the wide-ranging top-down cortical inputs to the ZI, and its specifically strong outputs to both the brainstem action controllers and to brain areas involved in action value learning - place the ZI in a unique role at the intersection of cognitive control and learning.
Affiliation(s)
- Ilya E Monosov
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- Takaya Ogasawara
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Suzanne N Haber
- Department of Pharmacology and Physiology, University of Rochester School of Medicine & Dentistry, Rochester, NY, 14642, USA; Department of Psychiatry, McLean Hospital, Harvard Medical School, Belmont, MA, 02478, USA
- J Alexander Heimel
- Circuits Structure and Function Group, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA, Amsterdam, the Netherlands
- Mehran Ahmadlou
- Circuits Structure and Function Group, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA, Amsterdam, the Netherlands; Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, 25 Howland St., W1T4JG London, UK
|
24
|
Prospective and retrospective values integrated in frontal cortex drive predictive choice. Proc Natl Acad Sci U S A 2022; 119:e2206067119. [PMID: 36417435 PMCID: PMC9889848 DOI: 10.1073/pnas.2206067119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
To make a deliberate action in a volatile environment, the brain must frequently reassess the value of each action (action-value). A choice can initially be made from trial-and-error experience, but once the dynamics of the environment are learned, the choice can be made from knowledge of the environment. Action-values constructed from experience (retrospective value) and from knowledge (prospective value) have been identified in various regions of the brain. However, how and which neural circuit integrates these values and executes the chosen action remains unknown. Combining reinforcement learning and two-photon calcium imaging, we found that the preparatory activity of neurons in a part of the frontal cortex, the anterior-lateral motor (ALM) area, initially encodes retrospective value but, after extensive training, jointly encodes the retrospective and prospective values. Optogenetic inhibition of ALM preparatory activity specifically abolished the expert mice's predictive choice behavior and returned them to a novice-like state. Thus, the integrated action-value encoded in the preparatory activity of ALM plays an important role in biasing action toward knowledge-dependent, predictive choice behavior.
|
25
|
Tesli N, Bell C, Hjell G, Fischer-Vieler T, Maximov II, Richard G, Tesli M, Melle I, Andreassen OA, Agartz I, Westlye LT, Friestad C, Haukvik UK, Rokicki J. The age of violence: Mapping brain age in psychosis and psychopathy. Neuroimage Clin 2022; 36:103181. [PMID: 36088844 PMCID: PMC9474919 DOI: 10.1016/j.nicl.2022.103181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Revised: 07/31/2022] [Accepted: 08/30/2022] [Indexed: 12/14/2022]
Abstract
Young chronological age is one of the strongest predictors for antisocial behaviour in the general population and for violent offending in individuals with psychotic disorders. An individual's age can be predicted with high accuracy using neuroimaging and machine learning. The deviation between predicted and chronological age, i.e., brain age gap (BAG), has been suggested to reflect brain health, likely relating partly to neurodevelopmental and aging-related processes and specific disease mechanisms. Higher BAG has been demonstrated in patients with psychotic disorders. However, little is known about the brain-age in violent offenders with psychosis and the possible associations with psychopathy traits. We estimated brain-age in 782 male individuals using T1-weighted MRI scans. Three machine learning models (random forest, extreme gradient boosting with and without hyperparameter tuning) were first trained and tested on healthy controls (HC, n = 586). The obtained BAGs were compared between HC and age-matched violent offenders with psychosis (PSY-V, n = 38), violent offenders without psychosis (NPV, n = 20) and non-violent psychosis patients (PSY-NV, n = 138). We ran additional comparisons between BAG of PSY-V and PSY-NV and associations with Positive and Negative Syndrome Scale (PANSS) total score as a measure of psychosis symptoms. Psychopathy traits in the violence groups were assessed with Psychopathy Checklist-revised (PCL-R) and investigated for associations with BAG. We found significantly higher BAG in PSY-V compared with HC (4.9 years, Cohen's d = 0.87) and in PSY-NV compared with HC (2.7 years, d = 0.41). Total PCL-R scores were negatively associated with BAG in the violence groups (d = 1.17, p < 0.05). Additionally, there was a positive association between psychosis symptoms and BAG in the psychosis groups (d = 1.12, p < 0.05). While the significant BAG differences related to psychosis and not violence suggest larger BAG for psychosis, the negative associations between BAG and psychopathy suggest a complex interplay with psychopathy traits. This proof-of-concept application of brain age prediction in severe mental disorders with a history of violence and psychopathy traits should be tested and replicated in larger samples.
Affiliation(s)
- Natalia Tesli
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Christina Bell
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Department of Psychiatry, Oslo University Hospital, Oslo, Norway
- Gabriela Hjell
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Department of Psychiatry, Østfold Hospital Trust, Graalum, Norway
- Thomas Fischer-Vieler
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Division of Mental Health and Addiction, Vestre Viken Hospital Trust, Drammen, Norway
- Ivan I Maximov
- Department of Health and Functioning, Western Norway University of Applied Sciences, Bergen, Norway
- Genevieve Richard
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Martin Tesli
- Department of Mental Disorders, Norwegian Institute of Public Health, Oslo, Norway; Centre of Research and Education in Forensic Psychiatry, Oslo University Hospital, Oslo, Norway
- Ingrid Melle
- Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway; Department of Adult Psychiatry, Institute of Clinical Medicine, University of Oslo, Norway
- Ole A Andreassen
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Ingrid Agartz
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Department of Psychiatric Research, Diakonhjemmet Hospital, Oslo, Norway; Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet & Stockholm Health Care Services, Stockholm County Council, Stockholm, Sweden
- Lars T Westlye
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Department of Psychology, University of Oslo, Oslo, Norway
- Christine Friestad
- Centre of Research and Education in Forensic Psychiatry, Oslo University Hospital, Oslo, Norway; University College of Norwegian Correctional Service, Oslo, Norway
- Unn K Haukvik
- Norwegian Centre for Mental Disorders Research (NORMENT), Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway; Department of Psychiatry, Oslo University Hospital, Oslo, Norway; Centre of Research and Education in Forensic Psychiatry, Oslo University Hospital, Oslo, Norway
- Jaroslav Rokicki
- Norwegian Centre for Mental Disorders Research (NORMENT), Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway; Centre of Research and Education in Forensic Psychiatry, Oslo University Hospital, Oslo, Norway.
|
26
|
Janssen M, LeWarne C, Burk D, Averbeck BB. Hierarchical Reinforcement Learning, Sequential Behavior, and the Dorsal Frontostriatal System. J Cogn Neurosci 2022; 34:1307-1325. [PMID: 35579977 PMCID: PMC9274316 DOI: 10.1162/jocn_a_01869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
To effectively behave within ever-changing environments, biological agents must learn and act at varying hierarchical levels such that a complex task may be broken down into more tractable subtasks. Hierarchical reinforcement learning (HRL) is a computational framework that provides an understanding of this process by combining sequential actions into one temporally extended unit called an option. However, there are still open questions within the HRL framework, including how options are formed and how HRL mechanisms might be realized within the brain. In this review, we propose that the existing human motor sequence literature can aid in understanding both of these questions. We give specific emphasis to visuomotor sequence learning tasks such as the discrete sequence production task and the M × N (M steps × N sets) task to understand how hierarchical learning and behavior manifest across sequential action tasks as well as how the dorsal cortical-subcortical circuitry could support this kind of behavior. This review highlights how motor chunks within a motor sequence can function as HRL options. Furthermore, we aim to merge findings from motor sequence literature with reinforcement learning perspectives to inform experimental design in each respective subfield.
Affiliation(s)
- Diana Burk
- National Institute of Mental Health, Bethesda, MD
|
27
|
Pruning recurrent neural networks replicates adolescent changes in working memory and reinforcement learning. Proc Natl Acad Sci U S A 2022; 119:e2121331119. [PMID: 35622896 DOI: 10.1073/pnas.2121331119] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 12/27/2022]
Abstract
Significance: Adolescence is a period during which there are important changes in behavior and the structure of the brain. In this manuscript, we use theoretical modeling to show how improvements in working memory and reinforcement learning that occur during adolescence can be explained by the reduction in synaptic connectivity in prefrontal cortex that occurs during a similar period. We train recurrent neural networks to solve working memory and reinforcement learning tasks and show that when we prune connectivity in these networks, they perform the tasks better. The improvement in task performance, however, can come at the cost of flexibility as the pruned networks are not able to learn some new tasks as well.
|
28
|
Zhang K, Bromberg-Martin ES, Sogukpinar F, Kocher K, Monosov IE. Surprise and recency in novelty detection in the primate brain. Curr Biol 2022; 32:2160-2173.e6. [DOI: 10.1016/j.cub.2022.03.064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2022] [Revised: 02/28/2022] [Accepted: 03/24/2022] [Indexed: 11/16/2022]
|
29
|
Groman SM, Thompson SL, Lee D, Taylor JR. Reinforcement learning detuned in addiction: integrative and translational approaches. Trends Neurosci 2022; 45:96-105. [PMID: 34920884 PMCID: PMC8770604 DOI: 10.1016/j.tins.2021.11.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/05/2021] [Revised: 11/04/2021] [Accepted: 11/19/2021] [Indexed: 02/03/2023]
Abstract
Suboptimal decision-making strategies have been proposed to contribute to the pathophysiology of addiction. Decision-making, however, arises from a collection of computational components that can independently influence behavior. Disruptions in these different components can lead to decision-making deficits that appear similar behaviorally, but differ at the computational, and likely the neurobiological, level. Here, we discuss recent studies that have used computational approaches to investigate the decision-making processes underlying addiction. Studies in animal models have found that value updating following positive, but not negative, outcomes is predictive of drug use, whereas value updating following negative, but not positive, outcomes is disrupted following drug self-administration. We contextualize these findings with studies on the circuit and biological mechanisms of decision-making to develop a framework for revealing the biobehavioral mechanisms of addiction.
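As an illustration of what outcome-specific value updating can look like computationally, the following minimal sketch uses a delta-rule learner with separate learning rates after rewarded and unrewarded trials. It is not the model from the studies reviewed; the function name `update_value` and the parameters `alpha_pos` and `alpha_neg` are hypothetical and chosen only for this example.

```python
def update_value(value, reward, alpha_pos, alpha_neg):
    """Return the updated action value after one binary outcome.

    alpha_pos and alpha_neg are hypothetical parameter names: learning rates
    applied after rewarded (reward == 1) and unrewarded (reward == 0) trials.
    """
    # Pick the learning rate according to the sign of the outcome.
    alpha = alpha_pos if reward > 0 else alpha_neg
    # Standard delta rule: move the value toward the observed outcome.
    return value + alpha * (reward - value)


# Same starting value, different outcomes, different step sizes.
after_win = update_value(0.5, 1, alpha_pos=0.4, alpha_neg=0.1)
after_loss = update_value(0.5, 0, alpha_pos=0.4, alpha_neg=0.1)
```

Two learners can then make similar choices overall while differing in `alpha_pos` or `alpha_neg`, which is the kind of dissociation between computational components that the abstract describes.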
Affiliation(s)
- Stephanie M. Groman
- Department of Neuroscience, University of Minnesota; Department of Psychiatry, Yale University. Correspondence to be directed to: Stephanie Groman, 321 Church Street SE, 4-125 Jackson Hall, Minneapolis, MN 55455
- Daeyeol Lee
- The Zanvyl Krieger Mind/Brain Institute, The Solomon H Snyder Department of Neuroscience, Department of Psychological and Brain Sciences, Kavli Neuroscience Discovery Institute, Johns Hopkins University
- Jane R. Taylor
- Department of Psychiatry, Yale University; Department of Neuroscience, Yale University; Department of Psychology, Yale University
|
30
|
Murray EA, Fellows LK. Prefrontal cortex interactions with the amygdala in primates. Neuropsychopharmacology 2022; 47:163-179. [PMID: 34446829 PMCID: PMC8616954 DOI: 10.1038/s41386-021-01128-w] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 05/29/2021] [Revised: 07/21/2021] [Accepted: 07/22/2021] [Indexed: 02/07/2023]
Abstract
This review addresses functional interactions between the primate prefrontal cortex (PFC) and the amygdala, with emphasis on their contributions to behavior and cognition. The interplay between these two telencephalic structures contributes to adaptive behavior and to the evolutionary success of all primate species. In our species, dysfunction in this circuitry creates vulnerabilities to psychopathologies. Here, we describe amygdala-PFC contributions to behaviors that have direct relevance to Darwinian fitness: learned approach and avoidance, foraging, predator defense, and social signaling, which have in common the need for flexibility and sensitivity to specific and rapidly changing contexts. Examples include the prediction of positive outcomes, such as food availability, food desirability, and various social rewards, or of negative outcomes, such as threats of harm from predators or conspecifics. To promote fitness optimally, these stimulus-outcome associations need to be rapidly updated when an associative contingency changes or when the value of a predicted outcome changes. We review evidence from nonhuman primates implicating the PFC, the amygdala, and their functional interactions in these processes, with links to experimental work and clinical findings in humans where possible.
Affiliation(s)
- Lesley K Fellows
- Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montreal, QC, Canada
|
31
|
Birnie MT, Levis SC, Mahler SV, Baram TZ. Developmental Trajectories of Anhedonia in Preclinical Models. Curr Top Behav Neurosci 2022; 58:23-41. [PMID: 35156184 DOI: 10.1007/7854_2021_299] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/25/2022]
Abstract
This chapter discusses how the complex concept of anhedonia can be operationalized and studied in preclinical models. It provides information about the development of anhedonia in the context of early-life adversity, and the power of preclinical models to tease out the diverse molecular, epigenetic, and network mechanisms that are responsible for anhedonia-like behaviors. Specifically, we first discuss the term anhedonia, reviewing the conceptual components underlying reward-related behaviors and distinguish anhedonia pertaining to deficits in motivational versus consummatory behaviors. We then describe the repertoire of experimental approaches employed to study anhedonia-like behaviors in preclinical models, and the progressive refinement over the past decade of both experimental instruments (e.g., chemogenetics, optogenetics) and conceptual constructs (salience, valence, conflict). We follow with an overview of the state of current knowledge of brain circuits, nodes, and projections that execute distinct aspects of hedonic-like behaviors, as well as neurotransmitters, modulators, and receptors involved in the generation of anhedonia-like behaviors. Finally, we discuss the special case of anhedonia that arises following early-life adversity as an eloquent example enabling the study of causality, mechanisms, and sex dependence of anhedonia. Together, this chapter highlights the power, potential, and limitations of using preclinical models to advance our understanding of the origin and mechanisms of anhedonia and to discover potential targets for its prevention and mitigation.
Affiliation(s)
- Matthew T Birnie
- Departments of Anatomy/Neurobiology and Pediatrics, University of California-Irvine, Irvine, CA, USA
- Sophia C Levis
- Departments of Anatomy/Neurobiology and Neurobiology/Behavior, University of California-Irvine, Irvine, CA, USA
- Stephen V Mahler
- Department of Neurobiology and Behavior, University of California-Irvine, Irvine, CA, USA
- Tallie Z Baram
- Departments of Anatomy/Neurobiology and Pediatrics, University of California-Irvine, Irvine, CA, USA.
|
32
|
Averbeck B, O'Doherty JP. Reinforcement-learning in fronto-striatal circuits. Neuropsychopharmacology 2022; 47:147-162. [PMID: 34354249 PMCID: PMC8616931 DOI: 10.1038/s41386-021-01108-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 03/04/2021] [Revised: 07/06/2021] [Accepted: 07/09/2021] [Indexed: 01/03/2023]
Abstract
We review the current state of knowledge on the computational and neural mechanisms of reinforcement-learning with a particular focus on fronto-striatal circuits. We divide the literature in this area into five broad research themes: the target of the learning (whether it be learning about the value of stimuli or about the value of actions); the nature and complexity of the algorithm used to drive the learning and inference process; how learned values get converted into choices and associated actions; the nature of state representations, and of other cognitive machinery that support the implementation of various reinforcement-learning operations. An emerging fifth area focuses on how the brain allocates or arbitrates control over different reinforcement-learning sub-systems or "experts". We will outline what is known about the role of the prefrontal cortex and striatum in implementing each of these functions. We then conclude by arguing that it will be necessary to build bridges from algorithmic level descriptions of computational reinforcement-learning to implementational level models to better understand how reinforcement-learning emerges from multiple distributed neural networks in the brain.
Affiliation(s)
- John P O'Doherty
- Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.
|
33
|
Kangas BD, Der-Avakian A, Pizzagalli DA. Probabilistic Reinforcement Learning and Anhedonia. Curr Top Behav Neurosci 2022; 58:355-377. [PMID: 35435644 DOI: 10.1007/7854_2022_349] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Despite the prominence of anhedonic symptoms associated with diverse neuropsychiatric conditions, there are currently no approved therapeutics designed to attenuate the loss of responsivity to previously rewarding stimuli. However, the search for improved treatment options for anhedonia has been reinvigorated by a recent reconceptualization of the very construct of anhedonia, including within the Research Domain Criteria (RDoC) initiative. This chapter will focus on the RDoC Positive Valence Systems construct of reward learning generally and sub-construct of probabilistic reinforcement learning specifically. The general framework emphasizes objective measurement of a subject's responsivity to reward via reinforcement learning under asymmetrical probabilistic contingencies as a means to quantify reward learning. Indeed, blunted reward responsiveness and reward learning are central features of anhedonia and have been repeatedly described in major depression. Moreover, these probabilistic reinforcement techniques can also reveal neurobiological mechanisms to aid development of innovative treatment approaches. In this chapter, we describe how investigating reward learning can improve our understanding of anhedonia via the four RDoC-recommended tasks that have been used to probe sensitivity to probabilistic reinforcement contingencies and how such task performance is disrupted in various neuropsychiatric conditions. We also illustrate how reverse translational approaches of probabilistic reinforcement assays in laboratory animals can inform understanding of pharmacological and physiological mechanisms. Next, we briefly summarize the neurobiology of probabilistic reinforcement learning, with a focus on the prefrontal cortex, anterior cingulate cortex, striatum, and amygdala. Finally, we discuss treatment implications and future directions in this burgeoning area.
Affiliation(s)
- Brian D Kangas
- Harvard Medical School, McLean Hospital, Belmont, MA, USA.
|
34
|
Taylor BK, Frenzel MR, Eastman JA, Embury CM, Agcaoglu O, Wang YP, Stephen JM, Calhoun VD, Wilson TW. Individual differences in amygdala volumes predict changes in functional connectivity between subcortical and cognitive control networks throughout adolescence. Neuroimage 2021; 247:118852. [PMID: 34954025 PMCID: PMC8822500 DOI: 10.1016/j.neuroimage.2021.118852] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/04/2021] [Revised: 12/01/2021] [Accepted: 12/21/2021] [Indexed: 11/23/2022]
Abstract
Adolescence is a critical period of structural and functional neural maturation among regions serving the cognitive control of emotion. Evidence suggests that this process is guided by developmental changes in amygdala and striatum structure and shifts in functional connectivity between subcortical (SC) and cognitive control (CC) networks. Herein, we investigate the extent to which such developmental shifts in structure and function reciprocally predict one another over time. 179 youth (9–15 years-old) completed annual MRI scans for three years. Amygdala and striatum volumes and connectivity within and between SC and CC resting state networks were measured for each year. We tested for reciprocal predictability of within-person and between-person changes in structure and function using random-intercept cross-lagged panel models. Within-person shifts in amygdala volumes in a given year significantly and specifically predicted deviations in SC-CC connectivity in the following year, such that an increase in volume was associated with decreased SC-CC connectivity the following year. Deviations in connectivity did not predict changes in amygdala volumes over time. Conversely, broader group-level shifts in SC-CC connectivity were predictive of subsequent deviations in striatal volumes. We did not see any cross-predictability among amygdala or striatum volumes and within-network connectivity measures. Within-person shifts in amygdala structure year-to-year robustly predicted weaker SC-CC connectivity in subsequent years, whereas broader increases in SC-CC connectivity predicted smaller striatal volumes over time. These specific structure function relationships may contribute to the development of emotional control across adolescence.
Affiliation(s)
- Brittany K Taylor
  - Institute for Human Neuroscience, Boys Town National Research Hospital, Boys Town, NE, USA; Department of Pharmacology and Neuroscience, Creighton University, Omaha, NE, USA
- Michaela R Frenzel
  - Institute for Human Neuroscience, Boys Town National Research Hospital, Boys Town, NE, USA
- Jacob A Eastman
  - Institute for Human Neuroscience, Boys Town National Research Hospital, Boys Town, NE, USA
- Christine M Embury
  - Institute for Human Neuroscience, Boys Town National Research Hospital, Boys Town, NE, USA; Department of Psychology, University of Nebraska at Omaha, Omaha, NE, USA
- Oktay Agcaoglu
  - Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, and Emory University, Atlanta, GA, USA
- Yu-Ping Wang
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA, USA
- Vince D Calhoun
  - Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, and Emory University, Atlanta, GA, USA; Mind Research Network, Albuquerque, NM, USA
- Tony W Wilson
  - Institute for Human Neuroscience, Boys Town National Research Hospital, Boys Town, NE, USA; Department of Pharmacology and Neuroscience, Creighton University, Omaha, NE, USA
|
35
|
Piray P, Daw ND. A model for learning based on the joint estimation of stochasticity and volatility. Nat Commun 2021; 12:6587. [PMID: 34782597 PMCID: PMC8592992 DOI: 10.1038/s41467-021-26731-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 01/23/2021] [Accepted: 10/08/2021] [Indexed: 02/08/2023] Open
Abstract
Previous research has stressed the importance of uncertainty for controlling the speed of learning, and how such control depends on the learner inferring the noise properties of the environment, especially volatility: the speed of change. However, learning rates are jointly determined by the comparison between volatility and a second factor, moment-to-moment stochasticity. Yet much previous research has focused on simplified cases corresponding to estimation of either factor alone. Here, we introduce a learning model, in which both factors are learned simultaneously from experience, and use the model to simulate human and animal data across many seemingly disparate neuroscientific and behavioral phenomena. By considering the full problem of joint estimation, we highlight a set of previously unappreciated issues, arising from the mutual interdependence of inference about volatility and stochasticity. This interdependence complicates and enriches the interpretation of previous results, such as pathological learning in individuals with anxiety and following amygdala damage.
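The opposing influence of volatility and stochasticity on the learning rate can be illustrated with a standard Kalman filter for a random-walk state. This is only a sketch under stated assumptions: both noise variances are fixed and known here, whereas the abstract describes a model that learns them jointly, so this is not the authors' model. The function names `kalman_learning_rate` and `kalman_step` are hypothetical and the numeric values are arbitrary.

```python
def kalman_learning_rate(posterior_var, volatility, stochasticity):
    """Kalman gain for a random-walk state observed with noise.

    volatility    -- variance of the trial-to-trial change in the hidden state
    stochasticity -- variance of the moment-to-moment observation noise
    Both are assumed known and fixed in this sketch.
    """
    prior_var = posterior_var + volatility  # uncertainty grows with volatility
    return prior_var / (prior_var + stochasticity)


def kalman_step(mean, var, observation, volatility, stochasticity):
    """One update of the estimate; returns (new_mean, new_var, learning_rate)."""
    lr = kalman_learning_rate(var, volatility, stochasticity)
    new_mean = mean + lr * (observation - mean)  # error-driven update
    new_var = (1.0 - lr) * (var + volatility)    # posterior uncertainty
    return new_mean, new_var, lr


# Same starting uncertainty, different noise assumptions (arbitrary values).
lr_volatile = kalman_learning_rate(1.0, volatility=4.0, stochasticity=1.0)
lr_noisy = kalman_learning_rate(1.0, volatility=1.0, stochasticity=4.0)
print(lr_volatile, lr_noisy)
```

In this simplified setting, raising volatility pushes the learning rate up and raising stochasticity pushes it down, which is why a learner that must infer both from the same data faces the interdependence the abstract highlights.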
Affiliation(s)
- Payam Piray
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA
- Nathaniel D Daw
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA
|
36
|
Trepka E, Spitmaan M, Bari BA, Costa VD, Cohen JY, Soltani A. Entropy-based metrics for predicting choice behavior based on local response to reward. Nat Commun 2021; 12:6567. [PMID: 34772943 PMCID: PMC8590026 DOI: 10.1038/s41467-021-26784-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/24/2021] [Accepted: 10/18/2021] [Indexed: 11/16/2022] Open
Abstract
For decades, behavioral scientists have used the matching law to quantify how animals distribute their choices between multiple options in response to reinforcement they receive. More recently, many reinforcement learning (RL) models have been developed to explain choice by integrating reward feedback over time. Despite reasonable success of RL models in capturing choice on a trial-by-trial basis, these models cannot capture variability in matching behavior. To address this, we developed metrics based on information theory and applied them to choice data from dynamic learning tasks in mice and monkeys. We found that a single entropy-based metric can explain 50% and 41% of variance in matching in mice and monkeys, respectively. We then used limitations of existing RL models in capturing entropy-based metrics to construct more accurate models of choice. Together, our entropy-based metrics provide a model-free tool to predict adaptive choice behavior and reveal underlying neural mechanisms.
Affiliation(s)
- Ethan Trepka
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Mehran Spitmaan
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Bilal A Bari
  - The Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Brain Science Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Vincent D Costa
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
- Jeremiah Y Cohen
  - The Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Brain Science Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Alireza Soltani
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
|
37
|
Riels K, Ramos Campagnoli R, Thigpen N, Keil A. Oscillatory brain activity links experience to expectancy during associative learning. Psychophysiology 2021; 59:e13946. [PMID: 34622471 PMCID: PMC10150413 DOI: 10.1111/psyp.13946] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/03/2021] [Revised: 08/20/2021] [Accepted: 08/30/2021] [Indexed: 01/23/2023]
Abstract
Associating a novel situation with a specific outcome involves a cascade of cognitive processes, including selecting relevant stimuli, forming predictions regarding expected outcomes, and updating memorized predictions based on experience. The present manuscript uses computational modeling and machine learning to test the hypothesis that alpha-band (8-12 Hz) oscillations are involved in the updating of expectations based on experience. Participants learned that a visual cue predicted an aversive loud noise with a probability of 50%. The Rescorla-Wagner model of associative learning explained trial-wise changes in self-reported noise expectancy as well as alpha power changes. Experience in the past trial and self-reported expectancy for the subsequent trial were accurately decoded based on the topographical distribution of alpha power at specific latencies. Decodable information during initial association formation and contingency report recurred when viewing the conditioned cue. Findings support the idea that alpha oscillations have multiple, temporally specific, roles in the formation of associations between cues and outcomes.
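For readers unfamiliar with the Rescorla-Wagner model named above, its textbook form updates an expectancy V after each trial by a fraction of the prediction error, V ← V + α(outcome − V). The sketch below simulates that rule for a cue reinforced on 50% of trials. The function name `rescorla_wagner`, the learning rate of 0.2, the trial count, and the random seed are all illustrative assumptions, not values or code from the study.

```python
import random


def rescorla_wagner(outcomes, alpha=0.2, v0=0.0):
    """Return (expectancy held before each trial, final expectancy).

    alpha is a hypothetical learning rate; outcomes are coded 1.0 (noise
    delivered) or 0.0 (noise omitted).
    """
    v = v0
    expectancies = []
    for outcome in outcomes:
        expectancies.append(v)      # expectancy reported before the outcome
        v += alpha * (outcome - v)  # prediction-error update
    return expectancies, v


# Simulate a cue followed by the aversive noise on about half of the trials.
rng = random.Random(0)
outcomes = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(200)]
expectancies, final_v = rescorla_wagner(outcomes)
print(final_v)
```

Because each update moves the expectancy toward the most recent outcome, the simulated expectancy fluctuates from trial to trial around the reinforcement rate rather than settling on it, which is the kind of trial-wise change the abstract relates to self-reported expectancy.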
Affiliation(s)
- Kierstin Riels
  - Department of Psychology, University of Florida, Gainesville, Florida, USA
- Rafaela Ramos Campagnoli
  - Department of Neurobiology, Institute of Biology, Universidade Federal Fluminense, Niterói, Brazil
- Nina Thigpen
  - Department of Psychology, University of Florida, Gainesville, Florida, USA
- Andreas Keil
  - Department of Psychology, University of Florida, Gainesville, Florida, USA
|
38
|
Lockwood PL, Klein-Flügge MC. Computational modelling of social cognition and behaviour-a reinforcement learning primer. Soc Cogn Affect Neurosci 2021; 16:761-771. [PMID: 32232358 PMCID: PMC8343561 DOI: 10.1093/scan/nsaa040] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 08/22/2019] [Revised: 02/07/2020] [Accepted: 03/18/2020] [Indexed: 02/06/2023] Open
Abstract
Social neuroscience aims to describe the neural systems that underpin social cognition and behaviour. Over the past decade, researchers have begun to combine computational models with neuroimaging to link social computations to the brain. Inspired by approaches from reinforcement learning theory, which describes how decisions are driven by the unexpectedness of outcomes, accounts of the neural basis of prosocial learning, observational learning, mentalizing and impression formation have been developed. Here we provide an introduction for researchers who wish to use these models in their studies. We consider both theoretical and practical issues related to their implementation, with a focus on specific examples from the field.
Affiliation(s)
- Patricia L Lockwood
  - Department of Experimental Psychology, University of Oxford, Oxford OX1 3PH, United Kingdom
  - Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX1 3PH, United Kingdom
- Miriam C Klein-Flügge
  - Department of Experimental Psychology, University of Oxford, Oxford OX1 3PH, United Kingdom
  - Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX1 3PH, United Kingdom
|
39
|
Putnam PT, Chang SWC. Toward a holistic view of value and social processing in the amygdala: Insights from primate behavioral neurophysiology. Behav Brain Res 2021; 411:113356. [PMID: 33989727 PMCID: PMC8238892 DOI: 10.1016/j.bbr.2021.113356] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/01/2021] [Revised: 05/05/2021] [Accepted: 05/09/2021] [Indexed: 11/22/2022]
Abstract
Located medially within the temporal lobes, the amygdala is a formation of heterogeneous nuclei that has emerged as a target for investigations into the neural bases of both primitive and complex behaviors. Although modern neuroscience has eschewed the practice of assigning broad functions to distinct brain regions, the amygdala has classically been associated with regulating negative emotional processes (such as fear or aggression), primarily through research performed in rodent models. Contemporary studies, particularly those in non-human primate models, have provided evidence for a role of the amygdala in other aspects of cognition such as valuation of stimuli or shaping social behaviors. Consequently, many modern perspectives now also emphasize the amygdala's role in processing positive affect and social behaviors. Importantly, several recent experiments have examined the intersection of two seemingly autonomous domains: how both valence/value and social stimuli are simultaneously represented in the amygdala. Results from these studies suggest that there is an overlap between valence/value processing and the processing of social behaviors at the level of single neurons. These findings have prompted researchers investigating the neurophysiological mechanisms underlying social interactions to question what contributions reward-related processes in the amygdala make in shaping social behaviors. In this review, we will examine evidence, primarily from primate neurophysiology, suggesting that value-related processes in the amygdala interact with the processing of social stimuli, and explore holistic hypotheses about how these amygdalar interactions might be instantiated.
Affiliation(s)
- Philip T Putnam
  - Department of Psychology, Yale University, New Haven, CT, 06520, United States
- Steve W C Chang
  - Department of Psychology, Yale University, New Haven, CT, 06520, United States; Department of Neuroscience, Yale University School of Medicine, New Haven, CT, 06510, United States; Kavli Institute for Neuroscience, Yale University School of Medicine, New Haven, CT, 06511, United States
|
40
|
Abstract
Although rodent research provides important insights into neural correlates of human psychology, new cortical areas, connections, and cognitive abilities emerged during primate evolution, including human evolution. Comparison of human brains with those of nonhuman primates reveals two aspects of human brain evolution particularly relevant to emotional disorders: expansion of homotypical association areas and expansion of the hippocampus. Two uniquely human cognitive capacities link these phylogenetic developments with emotion: a subjective sense of participating in and reexperiencing remembered events and a limitless capacity to imagine details of future events. These abilities provided evolving humans with selective advantages, but they also created proclivities for emotional problems. The first capacity evokes the "reliving" of past events in the "here-and-now," accompanied by emotional responses that occurred during memory encoding. It contributes to risk for stress-related syndromes, such as posttraumatic stress disorder. The second capacity, an ability to imagine future events without temporal limitations, facilitates flexible, goal-related behavior by drawing on and creating a uniquely rich array of mental representations. It promotes goal achievement and reduces errors, but the mental construction of future events also contributes to developmental aspects of anxiety and mood disorders. With maturation of homotypical association areas, the concrete concerns of childhood expand to encompass the abstract apprehensions of adolescence and adulthood. These cognitive capacities and their dysfunction are amenable to a research agenda that melds experimental therapeutic interventions, cognitive neuropsychology, and developmental psychology in both humans and nonhuman primates.
Affiliation(s)
- Daniel S. Pine
  - Section on Development and Affective Neuroscience, Emotion and Development Branch, National Institute of Mental Health, Bethesda, MD 20892
- Steven P. Wise
  - Olschefskie Institute for the Neurobiology of Knowledge, Bethesda, MD 20814
- Elisabeth A. Murray
  - Section on the Neurobiology of Learning and Memory, Laboratory of Neuropsychology, National Institute of Mental Health, Bethesda, MD 20892
|
41
|
Han MJ, Park CU, Kang S, Kim B, Nikolaidis A, Milham MP, Hong SJ, Kim SG, Baeg E. Mapping functional gradients of the striatal circuit using simultaneous microelectric stimulation and ultrahigh-field fMRI in non-human primates. Neuroimage 2021; 236:118077. [PMID: 33878384 DOI: 10.1016/j.neuroimage.2021.118077] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/15/2020] [Revised: 03/26/2021] [Accepted: 04/07/2021] [Indexed: 02/07/2023] Open
Abstract
Advances in functional magnetic resonance imaging (fMRI) have significantly enhanced our understanding of the striatal system of both humans and non-human primates (NHP) over the last few decades. However, its circuit-level functional anatomy remains poorly understood, partly because in-vivo fMRI cannot directly perturb a brain system and map its causal input-output relationship. Also, routine 3T fMRI has an insufficient spatial resolution. We performed electrical microstimulation (EM) of the striatum in lightly-anesthetized NHPs while simultaneously mapping whole-brain activation, using contrast-enhanced fMRI at ultra-high-field 7T. By stimulating multiple positions along the striatum's main (dorsal-to-ventral) axis, we revealed its complex functional circuit concerning mutually connected subsystems in both cortical and subcortical areas. Indeed, within the striatum, there were distinct brain activation patterns across different stimulation sites. Specifically, dorsal stimulation revealed a medial-to-lateral elongated shape of activation in upper caudate and putamen areas, whereas ventral stimulation evoked areas confined to the medial and lower caudate. Such dorsoventral gradients also appeared in neocortical and thalamic activations, indicating consistent embedding profiles of the striatal system across the whole brain. These findings reflect different forms of within-circuit and inter-regional neuronal connectivity between the dorsal and ventromedial striatum. These patterns both shared and contrasted with previous anatomical tract-tracing and in-vivo resting-state fMRI studies. Our approach of combining microstimulation and whole-brain fMRI mapping in NHPs provides a unique opportunity to integrate our understanding of a targeted brain area's meso- and macro-scale functional systems.
Affiliation(s)
- Min-Jun Han
  - Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea; Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Republic of Korea
- Chan-Ung Park
  - Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Republic of Korea
- Sangyun Kang
  - Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea
- Byounghoon Kim
  - Neuroscience, University of Wisconsin - Madison, Madison, WI, United States
- Aki Nikolaidis
  - Center for the Developing Brain, Child Mind Institute, New York, NY, United States
- Michael P Milham
  - Center for the Developing Brain, Child Mind Institute, New York, NY, United States; Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute, New York, NY, United States
- Seok Jun Hong
  - Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea; Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Republic of Korea; Center for the Developing Brain, Child Mind Institute, New York, NY, United States
- Seong-Gi Kim
  - Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea; Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Republic of Korea
- Eunha Baeg
  - Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea; Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Republic of Korea
|
42
|
Tavares TP, Mitchell DGV, Coleman KKL, Finger E. Neural correlates of reversal learning in frontotemporal dementia. Cortex 2021; 143:92-108. [PMID: 34399309 DOI: 10.1016/j.cortex.2021.06.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/26/2021] [Revised: 05/06/2021] [Accepted: 06/17/2021] [Indexed: 01/07/2023]
Abstract
OBJECTIVE Frontotemporal Dementia (FTD) is a neurodegenerative disorder that results in disinhibition and difficulty with flexible responding when provided feedback. Inflexible responding is observed early in the course of the illness and contributes to the financial and social morbidities of FTD. Reversal learning is an established cognitive paradigm that indexes flexible responding in the face of feedback signaling a change in reinforcement contingencies, with components of reversal learning associated with specific neurotransmitter systems. The objective of the study was to evaluate the neural mechanisms underlying impaired flexible behavioural responding in FTD using a reversal learning paradigm combined with fMRI. METHODS Twenty-two patients meeting the diagnostic criteria for FTD and twenty-one healthy controls completed the study. Participants completed an fMRI-adapted reversal learning task that indexes behavioural flexibility when provided positive and negative feedback. RESULTS Patients with FTD demonstrated poorer behavioural flexibility relative to controls and abnormal BOLD responses within the left ventrolateral prefrontal cortex to incorrect responses made during the learning phase, and during correct responses when reward contingencies were reversed. As well, patients showed decreased activity within the left dorsal lateral prefrontal cortex to incorrect responses compared to controls. CONCLUSIONS These findings suggest that reversal learning impairments in patients with FTD, in particular those with frontal predominant atrophy, may be related to impaired flexible motor responding when selecting among several choices and deficient attention to relevant stimuli during instances of conflict (i.e., receiving negative feedback). These results and the associated neurotransmitter systems mediating these regions may provide targets for future pharmacological or behavioural interventions mediating these cognitive deficits.
Affiliation(s)
- Tamara P Tavares
  - Graduate Program in Neuroscience and Brain and Mind Institute, Schulich School of Medicine and Dentistry, Western University, Canada
- Derek G V Mitchell
  - Graduate Program in Neuroscience and Brain and Mind Institute, Schulich School of Medicine and Dentistry, Western University, Canada; Department of Psychiatry and Department of Psychology, Western University, Canada
- Elizabeth Finger
  - Graduate Program in Neuroscience and Brain and Mind Institute, Schulich School of Medicine and Dentistry, Western University, Canada; Parkwood Institute, Lawson Health Research Institute, Canada; Department of Clinical Neurological Sciences, Western University, Canada
|
43
|
Grabenhorst F, Schultz W. Functions of primate amygdala neurons in economic decisions and social decision simulation. Behav Brain Res 2021; 409:113318. [PMID: 33901436 PMCID: PMC8164162 DOI: 10.1016/j.bbr.2021.113318] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/22/2020] [Revised: 04/21/2021] [Accepted: 04/21/2021] [Indexed: 01/15/2023]
Abstract
Long implicated in aversive processing, the amygdala is now recognized as a key component of the brain systems that process rewards. Beyond reward valuation, recent findings from single-neuron recordings in monkeys indicate that primate amygdala neurons also play an important role in decision-making. The reward value signals encoded by amygdala neurons constitute suitable inputs to economic decision processes by being sensitive to reward contingency, relative reward quantity and temporal reward structure. During reward-based decisions, individual amygdala neurons encode both the value inputs and corresponding choice outputs of economic decision processes. The presence of such value-to-choice transitions in single amygdala neurons, together with other well-defined signatures of decision computation, indicates that a decision mechanism may be implemented locally within the primate amygdala. During social observation, specific amygdala neurons spontaneously encode these decision signatures to predict the choices of social partners, suggesting neural simulation of the partner's decision-making. The activity of these 'simulation neurons' could arise naturally from convergence between value neurons and social, self-other discriminating neurons. These findings identify single-neuron building blocks and computational architectures for decision-making and social behavior in the primate amygdala. An emerging understanding of the decision function of primate amygdala neurons can help identify potential vulnerabilities for amygdala dysfunction in human conditions afflicting social cognition and mental health.
Affiliation(s)
- Fabian Grabenhorst
  - Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, CB2 3DY, UK
- Wolfram Schultz
  - Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, CB2 3DY, UK
|
44
|
Sias AC, Morse AK, Wang S, Greenfield VY, Goodpaster CM, Wrenn TM, Wikenheiser AM, Holley SM, Cepeda C, Levine MS, Wassum KM. A bidirectional corticoamygdala circuit for the encoding and retrieval of detailed reward memories. eLife 2021; 10:e68617. [PMID: 34142660 PMCID: PMC8266390 DOI: 10.7554/elife.68617] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/21/2021] [Accepted: 06/16/2021] [Indexed: 12/18/2022] Open
Abstract
Adaptive reward-related decision making often requires accurate and detailed representation of potential available rewards. Environmental reward-predictive stimuli can facilitate these representations, allowing one to infer which specific rewards might be available and choose accordingly. This process relies on encoded relationships between the cues and the sensory-specific details of the rewards they predict. Here, we interrogated the function of the basolateral amygdala (BLA) and its interaction with the lateral orbitofrontal cortex (lOFC) in the ability to learn such stimulus-outcome associations and use these memories to guide decision making. Using optical recording and inhibition approaches, Pavlovian cue-reward conditioning, and the outcome-selective Pavlovian-to-instrumental transfer (PIT) test in male rats, we found that the BLA is robustly activated at the time of stimulus-outcome learning and that this activity is necessary for sensory-specific stimulus-outcome memories to be encoded, so they can subsequently influence reward choices. Direct input from the lOFC was found to support the BLA in this function. Based on prior work, activity in BLA projections back to the lOFC was known to support the use of stimulus-outcome memories to influence decision making. By multiplexing optogenetic and chemogenetic inhibition we performed a serial circuit disconnection and found that the lOFC→BLA and BLA→lOFC pathways form a functional circuit regulating the encoding (lOFC→BLA) and subsequent use (BLA→lOFC) of the stimulus-dependent, sensory-specific reward memories that are critical for adaptive, appetitive decision making.
Affiliation(s)
- Ana C Sias
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
- Ashleigh K Morse
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
- Sherry Wang
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
- Venuz Y Greenfield
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
- Caitlin M Goodpaster
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
- Tyler M Wrenn
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
- Andrew M Wikenheiser
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
  - Brain Research Institute, University of California, Los Angeles, Los Angeles, United States
  - Integrative Center for Learning and Memory, University of California, Los Angeles, Los Angeles, United States
- Sandra M Holley
  - Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Carlos Cepeda
  - Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Michael S Levine
  - Brain Research Institute, University of California, Los Angeles, Los Angeles, United States
  - Intellectual and Developmental Disabilities Research Center, Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Kate M Wassum
  - Department of Psychology, University of California, Los Angeles, Los Angeles, United States
  - Brain Research Institute, University of California, Los Angeles, Los Angeles, United States
  - Integrative Center for Learning and Memory, University of California, Los Angeles, Los Angeles, United States
  - Integrative Center for Addictive Disorders, University of California, Los Angeles, Los Angeles, United States
|
45
|
Gunther KE, Pérez-Edgar K. Dopaminergic associations between behavioral inhibition, executive functioning, and anxiety in development. Developmental Review 2021. [DOI: 10.1016/j.dr.2021.100966] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
|
46
|
Groman SM, Lee D, Taylor JR. Unlocking the reinforcement-learning circuits of the orbitofrontal cortex. Behav Neurosci 2021; 135:120-128. [PMID: 34060870 DOI: 10.1037/bne0000414] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
Neuroimaging studies have consistently identified the orbitofrontal cortex (OFC) as being affected in individuals with neuropsychiatric disorders. OFC dysfunction has been proposed to be a key mechanism by which decision-making impairments emerge in diverse clinical populations, and recent studies employing computational approaches have revealed that distinct reinforcement-learning mechanisms of decision-making differ among diagnoses. In this perspective, we propose that these computational differences may be linked to select OFC circuits and present our recent work that has used a neurocomputational approach to understand the biobehavioral mechanisms of addiction pathology in rodent models. We describe how combining translationally analogous behavioral paradigms with reinforcement-learning algorithms and sophisticated neuroscience techniques in animals can provide critical insights into OFC pathology in biobehavioral disorders. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
|
47
|
|
48
|
Divergent Strategies for Learning in Males and Females. Curr Biol 2021; 31:39-50.e4. [PMID: 33125868 DOI: 10.1016/j.cub.2020.09.075] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 03/10/2020] [Revised: 07/08/2020] [Accepted: 09/24/2020] [Indexed: 02/08/2023]
Abstract
A frequent assumption in value-based decision-making tasks is that agents make decisions based on the feature dimension that reward probabilities vary on. However, in complex, multidimensional environments, stimuli can vary on multiple dimensions at once, meaning that the feature deserving the most credit for outcomes is not always obvious. As a result, individuals may vary in the strategies used to sample stimuli across dimensions, and these strategies may have an unrecognized influence on decision-making. Sex is a proxy for multiple genetic and endocrine influences on behavior, including how environments are sampled. In this study, we examined the strategies adopted by female and male mice as they learned the value of stimuli that varied in both image and location in a visually cued two-armed bandit, allowing two possible dimensions to learn about. Female mice acquired the correct image-value associations more quickly than male mice, preferring a fundamentally different strategy. Female mice were more likely to constrain their decision-space early in learning by preferentially sampling one location over which images varied. Conversely, male mice were more likely to be inconsistent, changing their choice frequently and responding to the immediate experience of stochastic rewards. Individual strategies were related to sex-biased changes in neuronal activation in early learning. Together, we find that in mice, sex is associated with divergent strategies for sampling and learning about the world, revealing substantial unrecognized variability in the approaches implemented during value-based decision making.
|
49
|
Kalinichenko LS, Abdel-Hafiz L, Wang AL, Mühle C, Rösel N, Schumacher F, Kleuser B, Smaga I, Frankowska M, Filip M, Schaller G, Richter-Schmidinger T, Lenz B, Gulbins E, Kornhuber J, Oliveira AWC, Barros M, Huston JP, Müller CP. Neutral Sphingomyelinase is an Affective Valence-Dependent Regulator of Learning and Memory. Cereb Cortex 2021; 31:1316-1333. [PMID: 33043975 DOI: 10.1093/cercor/bhaa298] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/12/2020] [Revised: 09/11/2020] [Accepted: 09/11/2020] [Indexed: 12/16/2022]
Abstract
Sphingolipids and enzymes of the sphingolipid rheostat determine synaptic appearance and signaling in the brain, but sphingolipid contribution to normal behavioral plasticity is little understood. Here we asked how the sphingolipid rheostat contributes to learning and memory of various dimensions. We investigated the role of these lipids in the mechanisms of two different types of memory, such as appetitively and aversively motivated memory, which are considered to be mediated by different neural mechanisms. We found an association between superior performance in short- and long-term appetitively motivated learning and regionally enhanced neutral sphingomyelinase (NSM) activity. An opposite interaction was observed in an aversively motivated task. A valence-dissociating role of NSM in learning was confirmed in mice with genetically reduced NSM activity. This role may be mediated by the NSM control of N-methyl-d-aspartate receptor subunit expression. In a translational approach, we confirmed a positive association of serum NSM activity with long-term appetitively motivated memory in nonhuman primates and in healthy humans. Altogether, these data suggest a new sphingolipid mechanism of de-novo learning and memory, which is based on NSM activity.
Affiliation(s)
- Liubov S Kalinichenko: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany
- Laila Abdel-Hafiz: Center for Behavioral Neuroscience, Institute of Experimental Psychology, University of Düsseldorf, Düsseldorf 40225, Germany
- An-Li Wang: Center for Behavioral Neuroscience, Institute of Experimental Psychology, University of Düsseldorf, Düsseldorf 40225, Germany
- Christiane Mühle: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany
- Nadine Rösel: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany
- Fabian Schumacher: Department of Toxicology, Faculty of Mathematics and Natural Science, Institute of Nutritional Science, University of Potsdam, Potsdam 14558, Germany; Department of Molecular Biology, University of Duisburg-Essen, Essen 45147, Germany
- Burkhard Kleuser: Department of Toxicology, Faculty of Mathematics and Natural Science, Institute of Nutritional Science, University of Potsdam, Potsdam 14558, Germany
- Irena Smaga: Department of Drug Addiction Pharmacology, Polish Academy of Sciences, Maj Institute of Pharmacology, Kraków 31-343, Poland
- Malgorzata Frankowska: Department of Drug Addiction Pharmacology, Polish Academy of Sciences, Maj Institute of Pharmacology, Kraków 31-343, Poland
- Malgorzata Filip: Department of Drug Addiction Pharmacology, Polish Academy of Sciences, Maj Institute of Pharmacology, Kraków 31-343, Poland
- Gerd Schaller: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany
- Tanja Richter-Schmidinger: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany
- Bernd Lenz: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany; Department of Addictive Behavior and Addiction Medicine, Central Institute of Mental Health (CIMH), Medical Faculty Mannheim, Heidelberg University, Mannheim 68159, Germany
- Erich Gulbins: Department of Molecular Biology, University of Duisburg-Essen, Essen 45147, Germany; Department of Surgery, College of Medicine, University of Cincinnati, Cincinnati, OH 45267-0558, USA
- Johannes Kornhuber: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany
- André W C Oliveira: Department of Pharmacy, School of Health Sciences, University of Brasilia, Brasilia, DF 70910-900, Brazil
- Marilia Barros: Department of Pharmacy, School of Health Sciences, University of Brasilia, Brasilia, DF 70910-900, Brazil; Primate Center, Institute of Biology, University of Brasilia, Brasilia 70910-900, Brazil
- Joseph P Huston: Center for Behavioral Neuroscience, Institute of Experimental Psychology, University of Düsseldorf, Düsseldorf 40225, Germany
- Christian P Müller: Department of Psychiatry and Psychotherapy, University Clinic, Friedrich-Alexander-University of Erlangen-Nuremberg, Erlangen 91054, Germany
|
50
|
Human Belief State-Based Exploration and Exploitation in an Information-Selective Symmetric Reversal Bandit Task. COMPUTATIONAL BRAIN & BEHAVIOR 2021; 4:442-462. [PMID: 34368622 PMCID: PMC8327602 DOI: 10.1007/s42113-021-00112-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 05/24/2021] [Indexed: 02/07/2023]
Abstract
Humans often face sequential decision-making problems, in which information about the environmental reward structure is detached from rewards for a subset of actions. In the current exploratory study, we introduce an information-selective symmetric reversal bandit task to model such situations and obtained choice data on this task from 24 participants. To arbitrate between different decision-making strategies that participants may use on this task, we developed a set of probabilistic agent-based behavioral models, including exploitative and explorative Bayesian agents, as well as heuristic control agents. Upon validating the model and parameter recovery properties of our model set and summarizing the participants' choice data in a descriptive way, we used a maximum likelihood approach to evaluate the participants' choice data from the perspective of our model set. In brief, we provide quantitative evidence that participants employ a belief state-based hybrid explorative-exploitative strategy on the information-selective symmetric reversal bandit task, lending further support to the finding that humans are guided by their subjective uncertainty when solving exploration-exploitation dilemmas. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s42113-021-00112-3.
|