1
|
Chase J, Xia L, Tai LH, Lin WC, Collins AGE, Wilbrecht L. Adolescent and adult mice use both incremental reinforcement learning and short term memory when learning concurrent stimulus-action associations. PLoS Comput Biol 2024; 20:e1012667. [PMID: 39715285 DOI: 10.1371/journal.pcbi.1012667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Accepted: 11/22/2024] [Indexed: 12/25/2024] Open
Abstract
Computational modeling has revealed that human research participants use both rapid working memory (WM) and incremental reinforcement learning (RL) (RL+WM) to solve a simple instrumental learning task, relying on WM when the number of stimuli is small and supplementing with RL when the number of stimuli exceeds WM capacity. Inspired by this work, we examined which learning systems and strategies are used by adolescent and adult mice when they first acquire a conditional associative learning task. In a version of the human RL+WM task translated for rodents, mice were required to associate odor stimuli (from a set of 2 or 4 odors) with a left or right port to receive reward. Using logistic regression and computational models to analyze the first 200 trials per odor, we determined that mice used both incremental RL and stimulus-insensitive, one-back strategies to solve the task. While these one-back strategies may be a simple form of short-term or working memory, they did not approximate the boost to learning performance that has been observed in human participants using WM in a comparable task. Adolescent and adult mice also showed comparable performance, with no change in learning rate or softmax beta parameters with adolescent development and task experience. However, reliance on a one-back perseverative, win-stay strategy increased with development in males in both odor set sizes, but was not dependent on gonadal hormones. Our findings advance a simple conditional associative learning task and new models to enable the isolation and quantification of reinforcement learning alongside other strategies mice use while learning to associate stimuli with rewards within a single behavioral session. These data and methods can inform and aid comparative study of reinforcement learning across species.
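As a rough illustration of the quantities named in this abstract (learning rate, softmax beta, and a one-back win-stay strategy), the sketch below combines a delta-rule value update with a softmax choice rule and a stimulus-insensitive win-stay bias. It is a minimal sketch and not the authors' fitted model: the function names, the `win_stay_weight` parameter, and the way the bias enters the choice rule are all assumptions made for illustration.

```python
import math


def update_value(q_value, reward, alpha):
    """Delta-rule update: move the estimate toward the reward by a fraction alpha."""
    return q_value + alpha * (reward - q_value)


def prob_choose_left(q_left, q_right, beta, win_stay_weight=0.0,
                     prev_choice=None, prev_rewarded=False):
    """Probability of a left choice for the current odor (illustrative only).

    beta scales the Q-value difference (softmax inverse temperature).
    win_stay_weight (hypothetical parameter) biases the choice toward the
    side chosen on the previous trial if that trial was rewarded, regardless
    of which odor was presented on that trial.
    """
    bias = 0.0
    if prev_rewarded and prev_choice == "left":
        bias = win_stay_weight
    elif prev_rewarded and prev_choice == "right":
        bias = -win_stay_weight
    logit = beta * (q_left - q_right) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

With equal Q-values and no bias the choice probability is 0.5; a positive `win_stay_weight` shifts it toward whichever side was rewarded on the previous trial, which is one simple way to separate incremental learning from a one-back strategy when fitting behavior.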
Affiliation(s)
- Juliana Chase
- Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America
- Department of Neuroscience, University of California, Berkeley, Berkeley, California, United States of America
| | - Liyu Xia
- Department of Mathematics, University of California, Berkeley, Berkeley, California, United States of America
| | - Lung-Hao Tai
- Department of Neuroscience, University of California, Berkeley, Berkeley, California, United States of America
| | - Wan Chen Lin
- Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America
- Department of Neuroscience, University of California, Berkeley, Berkeley, California, United States of America
| | - Anne G E Collins
- Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
| | - Linda Wilbrecht
- Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America
- Department of Neuroscience, University of California, Berkeley, Berkeley, California, United States of America
| |
|
2
|
Schaaf JV, Johansson A, Visser I, Huizenga HM. What's in a name: The role of verbalization in reinforcement learning. Psychon Bull Rev 2024; 31:2746-2757. [PMID: 38769270 PMCID: PMC11680654 DOI: 10.3758/s13423-024-02506-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/25/2024] [Indexed: 05/22/2024]
Abstract
Abstract (e.g., characters or fractals) and concrete stimuli (e.g., pictures of everyday objects) are used interchangeably in the reinforcement-learning literature. Yet, it is unclear whether the same learning processes underlie learning from these different stimulus types. In two preregistered experiments (N = 50 each), we assessed whether abstract and concrete stimuli yield different reinforcement-learning performance and whether this difference can be explained by verbalization. We argued that concrete stimuli are easier to verbalize than abstract ones, and that people therefore can appeal to the phonological loop, a subcomponent of the working-memory system responsible for storing and rehearsing verbal information, while learning. To test whether this verbalization aids reinforcement-learning performance, we administered a reinforcement-learning task in which participants learned either abstract or concrete stimuli while verbalization was hindered or not. In the first experiment, results showed a more pronounced detrimental effect of hindered verbalization for concrete than abstract stimuli on response times, but not on accuracy. In the second experiment, in which we reduced the response window, results showed the differential effect of hindered verbalization between stimulus types on accuracy, not on response times. These results imply that verbalization aids learning for concrete, but not abstract, stimuli and therefore that different processes underlie learning from these types of stimuli. This emphasizes the importance of carefully considering stimulus types. We discuss these findings in light of generalizability and validity of reinforcement-learning research.
Affiliation(s)
- Jessica V Schaaf
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands.
- Cognitive Neuroscience Department, Radboud University Medical Centre, Nijmegen, The Netherlands.
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
| | - Annie Johansson
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
| | - Ingmar Visser
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Yield, Research Institute for Child Development and Education, Amsterdam, The Netherlands
- ABC, Amsterdam Brain and Cognition Centre, Amsterdam, The Netherlands
| | - Hilde M Huizenga
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Yield, Research Institute for Child Development and Education, Amsterdam, The Netherlands
- ABC, Amsterdam Brain and Cognition Centre, Amsterdam, The Netherlands
| |
|
3
|
Shibata K, Klar V, Fallon SJ, Husain M, Manohar SG. Working memory as a representational template for reinforcement learning. Sci Rep 2024; 14:27660. [PMID: 39532969 PMCID: PMC11557606 DOI: 10.1038/s41598-024-79119-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024] Open
Abstract
Working memory (WM) and reinforcement learning (RL) both influence decision-making, but how they interact to affect behaviour remains unclear. We assessed whether RL is influenced by the format of visual stimuli held in WM, either feature-based or unified, object-based representations. In a pre-registered paradigm, participants learned stimulus-action combinations that provided reward through 80% probabilistic feedback. In parallel, participants retained the RL stimulus in WM and were asked to recall this stimulus after each RL choice. Crucially, the format of representation probed in WM was manipulated, with blocks encouraging either separate features or bound objects to be remembered. Incentivising a feature-based WM representation facilitated feature-based learning, shown by an improved choice strategy. This reveals a role of WM in providing sustained internal representations that are harnessed by RL, providing a framework by which these two cognitive processes cooperate.
Affiliation(s)
- Kengo Shibata
- Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Level 6, West Wing, Oxford, OX3 9DU, UK.
| | - Verena Klar
- Department of Experimental Psychology, University of Oxford, Oxford, OX2 6GG, UK
| | - Sean J Fallon
- School of Psychology, University of Plymouth, Plymouth, PL4 8AA, UK
| | - Masud Husain
- Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Level 6, West Wing, Oxford, OX3 9DU, UK
- Department of Experimental Psychology, University of Oxford, Oxford, OX2 6GG, UK
| | - Sanjay G Manohar
- Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Level 6, West Wing, Oxford, OX3 9DU, UK
- Department of Experimental Psychology, University of Oxford, Oxford, OX2 6GG, UK
| |
|
4
|
Li JJ, Collins AGE. An algorithmic account for how humans efficiently learn, transfer, and compose hierarchically structured decision policies. Cognition 2024; 254:105967. [PMID: 39368350 DOI: 10.1016/j.cognition.2024.105967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 09/17/2024] [Accepted: 09/23/2024] [Indexed: 10/07/2024]
Abstract
Learning structures that effectively abstract decision policies is key to the flexibility of human intelligence. Previous work has shown that humans use hierarchically structured policies to efficiently navigate complex and dynamic environments. However, the computational processes that support the learning and construction of such policies remain insufficiently understood. To address this question, we tested 1026 human participants, who made over 1 million choices combined, in a decision-making task where they could learn, transfer, and recompose multiple sets of hierarchical policies. We propose a novel algorithmic account for the learning processes underlying observed human behavior. We show that humans rely on compressed policies over states in early learning, which gradually unfold into hierarchical representations via meta-learning and Bayesian inference. Our modeling evidence suggests that these hierarchical policies are structured in a temporally backward, rather than forward, fashion. Taken together, these algorithmic architectures characterize how the interplay between reinforcement learning, policy compression, meta-learning, and working memory supports structured decision-making and compositionality in a resource-rational way.
Affiliation(s)
- Jing-Jing Li
- Helen Wills Neuroscience Institute, University of California, Berkeley, United States of America.
| | - Anne G E Collins
- Helen Wills Neuroscience Institute, University of California, Berkeley, United States of America; Department of Psychology, University of California, Berkeley, United States of America.
| |
|
5
|
Ahmadi NK, Ozgur SF, Kiziltan E. Evaluating the Effects of Different Cognitive Tasks on Autonomic Nervous System Responses: Implementation of a High-Precision, Low-Cost Complementary Method. Brain Behav 2024; 14:e70089. [PMID: 39378296 PMCID: PMC11460642 DOI: 10.1002/brb3.70089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 09/06/2024] [Accepted: 09/14/2024] [Indexed: 10/10/2024] Open
Abstract
INTRODUCTION We developed a low-cost, user-friendly complementary research tool to evaluate autonomic nervous system (ANS) activity at varying levels of cognitive workload. This was achieved using visual stimuli as cognitive tasks, administered through a specially designed computer-based test battery. METHODS To assess sympathetic stress responses, skin conductance response (SCR) was measured, and electrocardiograms (ECG) were recorded to evaluate heart rate variability (HRV), an indicator of cardiac vagal tone. Twenty-five healthy adults participated in the study. SCR and ECG recordings were made during both tonic and phasic phases using a computer-based system designed for visual stimuli. Participants performed a button-pressing task upon seeing the target stimulus, and the relationship between reaction time (RT) and cognitive load was evaluated. RESULTS Analysis of the data showed higher skin conductance levels (SCLs) during tasks compared to baseline, indicating successful elicitation of sympathetic responses. RTs differed significantly between simple and cognitive tasks, increasing with mental load. Additionally, significant changes in vagally mediated HRV parameters during tasks compared to baseline highlighted the impact of cognitive load on the parasympathetic branch of the ANS, thereby influencing the brain-heart connection. CONCLUSION Our findings indicate that the developed research tool can successfully induce cognitive load, significantly affecting SCL, RTs, and HRV. This validates the tool's effectiveness in evaluating ANS responses to cognitive tasks.
Affiliation(s)
- Nazli Karimi Ahmadi
- Department of Physiology, Faculty of Medicine, Hacettepe University, Ankara, Turkey
| | - Sezgi Firat Ozgur
- Department of Physiology, Faculty of Medicine, Hacettepe University, Ankara, Turkey
| | - Erhan Kiziltan
- Department of Biophysics, Faculty of Medicine, Baskent University, Ankara, Turkey
| |
|
6
|
Miller JA, Constantinidis C. Timescales of learning in prefrontal cortex. Nat Rev Neurosci 2024; 25:597-610. [PMID: 38937654 DOI: 10.1038/s41583-024-00836-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/03/2024] [Indexed: 06/29/2024]
Abstract
The lateral prefrontal cortex (PFC) in humans and other primates is critical for immediate, goal-directed behaviour and working memory, which are classically considered distinct from the cognitive and neural circuits that support long-term learning and memory. Over the past few years, a reconsideration of this textbook perspective has emerged, in that different timescales of memory-guided behaviour are in constant interaction during the pursuit of immediate goals. Here, we will first detail how neural activity related to the shortest timescales of goal-directed behaviour (which requires maintenance of current states and goals in working memory) is sculpted by long-term knowledge and learning - that is, how the past informs present behaviour. Then, we will outline how learning across different timescales (from seconds to years) drives plasticity in the primate lateral PFC, from single neuron firing rates to mesoscale neuroimaging activity patterns. Finally, we will review how, over days and months of learning, dense local and long-range connectivity patterns in PFC facilitate longer-lasting changes in population activity by changing synaptic weights and recruiting additional neural resources to inform future behaviour. Our Review sheds light on how the machinery of plasticity in PFC circuits facilitates the integration of learned experiences across time to best guide adaptive behaviour.
Affiliation(s)
- Jacob A Miller
- Wu Tsai Institute, Yale University, New Haven, CT, USA
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
| | - Christos Constantinidis
- Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA.
- Neuroscience Program, Vanderbilt University, Nashville, TN, USA.
- Department of Ophthalmology and Visual Sciences, Vanderbilt University Medical Center, Nashville, TN, USA.
| |
|
7
|
Velázquez-Vargas CA, Taylor JA. Learning to Move and Plan like the Knight: Sequential Decision Making with a Novel Motor Mapping. bioRxiv 2024:2024.08.29.610359. [PMID: 39257833 PMCID: PMC11383687 DOI: 10.1101/2024.08.29.610359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
Many skills that humans acquire throughout their lives, such as playing video games or sports, require substantial motor learning and multi-step planning. While both processes are typically studied separately, they are likely to interact during the acquisition of complex motor skills. In this work, we studied this interaction by assessing human performance in a sequential decision-making task that requires the learning of a non-trivial motor mapping. Participants were tasked with moving a cursor from start to target locations in a grid world, using a standard keyboard. Notably, the specific keys were arbitrarily mapped to a movement rule resembling the Knight chess piece. In Experiment 1, we showed that learning this mapping in the absence of planning led to significant improvements in the task when sequential decisions were presented at a later stage. Computational modeling analysis revealed that such improvements resulted from an increased learning rate about the state transitions of the motor mapping, which also resulted in more flexible planning from trial to trial (less perseveration or habitual responses). In Experiment 2, we showed that incorporating mapping learning into the planning process allows us to capture (1) differential task improvements for distinct planning horizons and (2) overall lower performance for longer horizons. Additionally, model analysis suggested that participants may limit their search to three steps ahead. We hypothesize that this limitation in planning horizon arises from capacity constraints in working memory, and may be the reason complex skills are often broken down into individual subroutines or components during learning.
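To make the idea of a limited planning horizon concrete, the sketch below runs a breadth-first search over Knight-style moves on a square grid and gives up once paths reach a maximum depth. This is only an illustration under stated assumptions: the grid representation, the function name `plan_within_horizon`, and the use of breadth-first search are hypothetical choices, not the planning model described in the abstract.

```python
from collections import deque

# The eight (row, column) displacements of a chess Knight.
KNIGHT_MOVES = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def plan_within_horizon(start, target, grid_size, max_depth=3):
    """Return a shortest list of Knight moves from start to target.

    Returns None when the target cannot be reached within max_depth moves,
    mimicking a planner whose search is limited to a fixed horizon.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        (row, col), path = frontier.popleft()
        if (row, col) == target:
            return path
        if len(path) == max_depth:
            continue  # horizon reached: do not expand this state further
        for d_row, d_col in KNIGHT_MOVES:
            nxt = (row + d_row, col + d_col)
            inside = 0 <= nxt[0] < grid_size and 0 <= nxt[1] < grid_size
            if inside and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [(d_row, d_col)]))
    return None
```

For example, `plan_within_horizon((0, 0), (2, 4), 8)` finds a two-move path, whereas a target in the opposite corner of an 8 x 8 grid lies beyond a three-step horizon and yields `None`.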
|
8
|
Nicholas J, Amlang C, Lin CYR, Montaser-Kouhsari L, Desai N, Pan MK, Kuo SH, Shohamy D. The Role of the Cerebellum in Learning to Predict Reward: Evidence from Cerebellar Ataxia. Cerebellum 2024; 23:1355-1368. [PMID: 38066397 PMCID: PMC11161554 DOI: 10.1007/s12311-023-01633-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 11/02/2023] [Indexed: 01/25/2024]
Abstract
Recent findings in animals have challenged the traditional view of the cerebellum solely as the site of motor control, suggesting that the cerebellum may also be important for learning to predict reward from trial-and-error feedback. Yet, evidence for the role of the cerebellum in reward learning in humans is lacking. Moreover, open questions remain about which specific aspects of reward learning the cerebellum may contribute to. Here we address this gap through an investigation of multiple forms of reward learning in individuals with cerebellum dysfunction, represented by cerebellar ataxia cases. Nineteen participants with cerebellar ataxia and 57 age- and sex-matched healthy controls completed two separate tasks that required learning about reward contingencies from trial-and-error. To probe the selectivity of reward learning processes, the tasks differed in their underlying structure: while one task measured incremental reward learning ability alone, the other allowed participants to use an alternative learning strategy based on episodic memory alongside incremental reward learning. We found that individuals with cerebellar ataxia were profoundly impaired at reward learning from trial-and-error feedback on both tasks, but retained the ability to learn to predict reward based on episodic memory. These findings provide evidence from humans for a specific and necessary role for the cerebellum in incremental learning of reward associations based on reinforcement. More broadly, the findings suggest that alongside its role in motor learning, the cerebellum likely operates in concert with the basal ganglia to support reinforcement learning from reward.
Affiliation(s)
- Jonathan Nicholas
- Department of Psychology, Columbia University, New York, NY, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, Quad 3D, 3227 Broadway, New York, NY, 10027, USA
| | - Christian Amlang
- Department of Neurology, Columbia University Medical Center, 650 W. 168th St, Rm 305, New York, NY, 10032, USA
- Initiative for Columbia Ataxia and Tremor, Columbia University Medical Center, New York, NY, USA
| | - Chi-Ying R Lin
- Department of Neurology, Baylor College of Medicine, Houston, TX, USA
| | | | - Natasha Desai
- Department of Neurology, Columbia University Medical Center, 650 W. 168th St, Rm 305, New York, NY, 10032, USA
- Initiative for Columbia Ataxia and Tremor, Columbia University Medical Center, New York, NY, USA
| | - Ming-Kai Pan
- Department of Medical Research, National Taiwan University Hospital, 100, Taipei, Taiwan
- Department and Graduate Institute of Pharmacology, National Taiwan University College of Medicine, 100, Taipei, Taiwan
- Cerebellar Research Center, National Taiwan University Hospital, Yun-Lin Branch, Yun-Lin, Taiwan
| | - Sheng-Han Kuo
- Department of Neurology, Columbia University Medical Center, 650 W. 168th St, Rm 305, New York, NY, 10032, USA.
- Initiative for Columbia Ataxia and Tremor, Columbia University Medical Center, New York, NY, USA.
| | - Daphna Shohamy
- Department of Psychology, Columbia University, New York, NY, USA.
- Zuckerman Mind Brain Behavior Institute, Columbia University, Quad 3D, 3227 Broadway, New York, NY, 10027, USA.
- Kavli Institute for Brain Science, Columbia University, New York, NY, USA.
| |
|
9
|
Velázquez-Vargas CA, Taylor JA. Working memory constraints for visuomotor retrieval strategies. J Neurophysiol 2024; 132:347-361. [PMID: 38919148 PMCID: PMC11427054 DOI: 10.1152/jn.00122.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 06/05/2024] [Accepted: 06/21/2024] [Indexed: 06/27/2024] Open
Abstract
Recent work has shown the fundamental role that cognitive strategies play in visuomotor adaptation. Although algorithmic strategies, such as mental rotation, are flexible and generalizable, they are computationally demanding. To avoid this computational cost, people can instead rely on memory retrieval of previously successful visuomotor solutions. However, such a strategy is likely subject to stimulus-response associations and likely to rely heavily on working memory. In a series of five experiments, we sought to estimate the constraints in terms of capacity and precision of working memory retrieval for visuomotor adaptation. This was accomplished by leveraging different variations of visuomotor item-recognition and visuomotor rotation tasks where we associated unique rotations with specific targets in the workspace and manipulated the set size (i.e., number of rotation-target associations). Notably, from experiment 1 to 4, we found key signatures of working memory retrieval and not mental rotation. In particular, participants were less accurate and slower for larger set sizes and less recent items. Using a Bayesian latent-mixture model, we found that such decrease in performance was the result of increasing guessing behavior and less precise memories. In addition, we estimated that participants' working memory capacity was limited to two to five items, after which guessing increasingly dominated performance. Finally, in experiment 5, we showed how the constraints observed across experiments 1 to 4 can be overcome when relying on long-term memory retrieval. Our results point to the opportunity of studying other sources of memories where visuomotor solutions can be stored (e.g., episodic memories) to achieve successful adaptation.
NEW & NOTEWORTHY We show that humans can adapt to feedback perturbations in different variations of the visuomotor rotation task by retrieving the successful solutions from working memory. In addition, using a Bayesian latent-mixture model, we reveal that guessing and low-precision memories are both responsible for the decrease in participants' performance as the number of solutions to memorize increases. These constraints can be overcome by relying on long-term memory retrieval resulting from extended practice with the visuomotor solutions.
Affiliation(s)
| | - Jordan A Taylor
- Department of Psychology, Princeton University, Princeton, New Jersey, United States
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, United States
| |
|
10
|
Zid M, Laurie VJ, Levine-Champagne A, Shourkeshti A, Harrell D, Herman AB, Ebitz RB. Humans forage for reward in reinforcement learning tasks. bioRxiv 2024:2024.07.08.602539. [PMID: 39026817 PMCID: PMC11257465 DOI: 10.1101/2024.07.08.602539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
How do we make good decisions in uncertain environments? In psychology and neuroscience, the classic answer is that we calculate the value of each option and then compare the values to choose the most rewarding, modulo some exploratory noise. An ethologist, conversely, would argue that we commit to one option until its value drops below a threshold, at which point we start exploring other options. In order to determine which view better describes human decision-making, we developed a novel, foraging-inspired sequential decision-making model and used it to ask whether humans compare to threshold ("Forage") or compare alternatives ("Reinforcement-Learn" [RL]). We found that the foraging model was a better fit for participant behavior, better predicted the participants' tendency to repeat choices, and predicted the existence of held-out participants with a pattern of choice that was almost impossible under RL. Together, these results suggest that humans use foraging computations, rather than RL, even in classic reinforcement learning tasks.
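The contrast drawn in this abstract between comparing alternatives and comparing to a threshold can be sketched as two decision rules. The code below is a deliberately simplified illustration, not the authors' model: the function names, the fixed `threshold`, and the uniform random switch are assumptions, and the exploratory noise mentioned in the abstract is left out.

```python
import random


def compare_alternatives(values):
    """RL-style rule: pick the option with the highest estimated value."""
    return max(range(len(values)), key=lambda i: values[i])


def compare_to_threshold(values, current, threshold, rng):
    """Foraging-style rule (illustrative).

    Stay with the current option while its value is at or above the
    threshold; otherwise switch to a randomly chosen other option.
    """
    if values[current] >= threshold:
        return current
    others = [i for i in range(len(values)) if i != current]
    return rng.choice(others)
```

With `values = [0.6, 0.9]` and the agent currently on option 0, the first rule switches to option 1 immediately, while the second keeps choosing option 0 for any threshold at or below 0.6. That difference in the tendency to repeat choices is the kind of behavioral signature the two accounts disagree on.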
Affiliation(s)
- Meriam Zid
- Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
| | - Veldon-James Laurie
- Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
| | | | - Akram Shourkeshti
- Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
| | - Dameon Harrell
- Department of Psychiatry, University of Minnesota, Minneapolis, MN, 55455, USA
| | - Alexander B. Herman
- Department of Psychiatry, University of Minnesota, Minneapolis, MN, 55455, USA
| | - R. Becket Ebitz
- Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
| |
|
11
|
Hore A, Bandyopadhyay S, Chakrabarti S. Persistent spiking activity in neuromorphic circuits incorporating post-inhibitory rebound excitation. J Neural Eng 2024; 21:036048. [PMID: 38861961 DOI: 10.1088/1741-2552/ad56c8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 06/11/2024] [Indexed: 06/13/2024]
Abstract
Objective. This study introduces a novel approach for integrating the post-inhibitory rebound excitation (PIRE) phenomenon into a neuronal circuit. Excitatory and inhibitory synapses are designed to establish a connection between two hardware neurons, effectively forming a network. The model demonstrates the occurrence of PIRE under strong inhibitory input. Emphasizing the significance of incorporating PIRE in neuromorphic circuits, the study showcases generation of persistent activity within cyclic and recurrent spiking neuronal networks. Approach. The neuronal and synaptic circuits are designed and simulated in Cadence Virtuoso using TSMC 180 nm technology. The operating mechanism of the PIRE phenomenon integrated into a hardware neuron is discussed. The proposed circuit encompasses several parameters for effectively controlling multiple electrophysiological features of a neuron. Main results. The neuronal circuit has been tuned to match the response of a biological neuron. The efficiency of this circuit is evaluated by computing the average power dissipation and energy consumption per spike through simulation. The sustained firing of neural spikes is observed till 1.7 s using the two neuronal networks. Significance. Persistent activity has significant implications for various cognitive functions such as working memory, decision-making, and attention. Therefore, hardware implementation of these functions will require our PIRE-integrated model. Energy-efficient neuromorphic systems are useful in many artificial intelligence applications, including human-machine interaction, IoT devices, autonomous systems, and brain-computer interfaces.
|
12
|
Bays PM, Schneegans S, Ma WJ, Brady TF. Representation and computation in visual working memory. Nat Hum Behav 2024; 8:1016-1034. [PMID: 38849647 DOI: 10.1038/s41562-024-01871-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 03/22/2024] [Indexed: 06/09/2024]
Abstract
The ability to sustain internal representations of the sensory environment beyond immediate perception is a fundamental requirement of cognitive processing. In recent years, debates regarding the capacity and fidelity of the working memory (WM) system have advanced our understanding of the nature of these representations. In particular, there is growing recognition that WM representations are not merely imperfect copies of a perceived object or event. New experimental tools have revealed that observers possess richer information about the uncertainty in their memories and take advantage of environmental regularities to use limited memory resources optimally. Meanwhile, computational models of visuospatial WM formulated at different levels of implementation have converged on common principles relating capacity to variability and uncertainty. Here we review recent research on human WM from a computational perspective, including the neural mechanisms that support it.
Affiliation(s)
- Paul M Bays
- Department of Psychology, University of Cambridge, Cambridge, UK
| | | | - Wei Ji Ma
- Center for Neural Science and Department of Psychology, New York University, New York, NY, USA
| | - Timothy F Brady
- Department of Psychology, University of California, San Diego, La Jolla, CA, USA.
| |
|
13
|
Ghaderi S, Amani Rad J, Hemami M, Khosrowabadi R. Dysfunctional feedback processing in male methamphetamine abusers: Evidence from neurophysiological and computational approaches. Neuropsychologia 2024; 197:108847. [PMID: 38460774 DOI: 10.1016/j.neuropsychologia.2024.108847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 01/24/2024] [Accepted: 02/28/2024] [Indexed: 03/11/2024]
Abstract
Methamphetamine use disorder (MUD) as a major public health risk is associated with dysfunctional neural feedback processing. Although dysfunctional feedback processing in people who are substance dependent has been explored in several behavioral, computational, and electrocortical studies, this mechanism in MUDs is not yet well understood. Furthermore, the current understanding of latent components of their behavior such as learning speed and exploration-exploitation dilemma is still limited. In addition, the association between the latent cognitive components and the related neural mechanisms also needs to be explored. Therefore, in this study, the neurocognitive mechanisms underlying feedback processing in individuals with such impairment and in age/gender-matched healthy controls are evaluated within a probabilistic learning task with rewards and punishments. Mathematical modeling results based on the Q-learning paradigm suggested that MUDs show less sensitivity in distinguishing optimal options. Additionally, it may be worth noting that MUDs exhibited a slight decrease in their ability to learn from negative feedback compared to healthy controls. Also through the lens of underlying neural mechanisms, MUDs showed lower theta power at the medial-frontal areas while responding to negative feedback. However, other EEG measures of reinforcement learning including feedback-related negativity, parietal-P300, and activity flow from the medial frontal to lateral prefrontal regions, remained intact in MUDs. On the other hand, the elimination of the linkage between value sensitivity and medial-frontal theta activity in MUDs was observed. The observed dysfunction could be due to the adverse effects of methamphetamine on the cortico-striatal dopamine circuit, which is reflected in the anterior cingulate cortex activity as the most likely region responsible for efficient behavior adjustment. These findings could help us to pave the way toward tailored therapeutic approaches.
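For readers unfamiliar with the latent quantities mentioned in this abstract, the sketch below shows one common way a Q-learning model can separate learning from positive and negative feedback and express sensitivity to value differences. It is a hedged illustration only: the use of two learning rates (`alpha_gain`, `alpha_loss`) and a softmax inverse temperature `beta` as the "value sensitivity" term are assumptions, and the abstract does not specify the exact model that was fitted.

```python
import math


def q_update(q_value, outcome, alpha_gain, alpha_loss):
    """Update a Q-value with separate learning rates by feedback sign.

    alpha_gain applies when the outcome is at or above expectation,
    alpha_loss when it is below (both names are hypothetical).
    """
    prediction_error = outcome - q_value
    alpha = alpha_gain if prediction_error >= 0 else alpha_loss
    return q_value + alpha * prediction_error


def choice_probabilities(q_values, beta):
    """Softmax over Q-values; a lower beta means less value-sensitive choices."""
    highest = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (q - highest)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]
```

In this parameterization, a smaller `alpha_loss` corresponds to slower learning from negative feedback, and a smaller `beta` flattens the choice probabilities so that the better option is selected less reliably.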
Affiliation(s)
- Sadegh Ghaderi: Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran
- Jamal Amani Rad: Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran
- Mohammad Hemami: Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran
- Reza Khosrowabadi: Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran
|
14
|
Hassanzadeh Z, Bahrami F, Dortaj F. Exploring the dynamic interplay between learning and working memory within various cognitive contexts. Front Behav Neurosci 2024; 18:1304378. [PMID: 38420348 PMCID: PMC10899440 DOI: 10.3389/fnbeh.2024.1304378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 01/23/2024] [Indexed: 03/02/2024] Open
Abstract
Introduction: The intertwined relationship between reinforcement learning and working memory in the brain is a complex subject, widely studied across various domains in neuroscience. Research efforts have focused on identifying the specific brain areas responsible for these functions, understanding their contributions in accomplishing the related tasks, and exploring their adaptability under conditions such as cognitive impairment or aging. Methods: Numerous models have been introduced to describe reinforcement learning and working memory, either as two separate subsystems or in terms of their combination and relationship in executing cognitive tasks. This study adopts the RLWM model as a computational framework to analyze the behavioral parameters of subjects whose cognitive abilities vary with age or cognitive status. A related RLWM task is employed to assess a group of subjects across different age groups and cognitive abilities, as measured by the Montreal Cognitive Assessment (MoCA). Results: Analysis reveals a decline in overall performance accuracy and speed across age groups (young vs. middle-aged). Significant differences are observed in model parameters such as learning rate, WM decay, and decision noise. Furthermore, within the middle-aged group, distinctions emerge between subjects categorized as normal vs. mild cognitive impairment (MCI) based on MoCA scores, notably in speed, performance accuracy, and decision noise.
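The abstract names the RLWM model and three of its parameters (learning rate, WM decay, decision noise) without giving equations. As a hedged illustration, the sketch below follows one commonly used formulation in which a capacity-limited WM policy and an RL policy are mixed. The function names, the parameters `rho`, `capacity`, and `phi`, and the exact functional forms are assumptions, not details taken from this study.

```python
def wm_weight(rho, capacity, set_size):
    """Weight given to WM: scaled down once set size exceeds WM capacity."""
    return rho * min(1.0, capacity / set_size)

def mix_policies(p_wm, p_rl, w):
    """Mixture policy: w * WM policy + (1 - w) * RL policy."""
    return [w * a + (1.0 - w) * b for a, b in zip(p_wm, p_rl)]

def decay_wm(wm_values, phi, n_actions):
    """WM decay: stored values drift toward the uniform prior 1 / n_actions."""
    prior = 1.0 / n_actions
    return [v + phi * (prior - v) for v in wm_values]

# With capacity 3, WM dominates for 2 stimuli but contributes less for 6.
print(wm_weight(rho=0.8, capacity=3, set_size=2))
print(wm_weight(rho=0.8, capacity=3, set_size=6))
print(mix_policies([1.0, 0.0], [0.5, 0.5], w=0.4))
print(decay_wm([1.0, 0.0], phi=0.5, n_actions=2))
```

Under these assumptions, a larger `phi` (faster WM decay) or a smaller mixture weight shifts behavior toward the slower RL policy.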
Affiliation(s)
- Zakieh Hassanzadeh: Faculty of Psychology and Educational Sciences, Allameh Tabataba’i University, Tehran, Iran
- Fariba Bahrami: School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Fariborz Dortaj: Faculty of Psychology and Educational Sciences, Allameh Tabataba’i University, Tehran, Iran
|
15
|
Kirschner H, Nassar MR, Fischer AG, Frodl T, Meyer-Lotz G, Froböse S, Seidenbecher S, Klein TA, Ullsperger M. Transdiagnostic inflexible learning dynamics explain deficits in depression and schizophrenia. Brain 2024; 147:201-214. [PMID: 38058203 PMCID: PMC10766268 DOI: 10.1093/brain/awad362] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/13/2023] [Revised: 09/25/2023] [Accepted: 10/10/2023] [Indexed: 12/08/2023] Open
Abstract
Deficits in reward learning are core symptoms across many mental disorders. Recent work suggests that such learning impairments arise from a diminished ability to use reward history to guide behaviour, but the neuro-computational mechanisms through which these impairments emerge remain unclear. Moreover, limited work has taken a transdiagnostic approach to investigate whether the psychological and neural mechanisms that give rise to learning deficits are shared across forms of psychopathology. To provide insight into this issue, we explored probabilistic reward learning in patients diagnosed with major depressive disorder (n = 33) or schizophrenia (n = 24) and 33 matched healthy controls by combining computational modelling and single-trial EEG regression. In our task, participants had to integrate the reward history of a stimulus to decide whether it is worthwhile to gamble on it. Adaptive learning in this task is achieved through dynamic learning rates that are maximal on the first encounters with a given stimulus and decay with increasing stimulus repetitions. Hence, over the course of learning, choice preferences would ideally stabilize and be less susceptible to misleading information. We show evidence of reduced learning dynamics, whereby both patient groups demonstrated hypersensitive learning (i.e. less decaying learning rates), rendering their choices more susceptible to misleading feedback. Moreover, there was a schizophrenia-specific approach bias and a depression-specific heightened sensitivity to disconfirmational feedback (factual losses and counterfactual wins). The inflexible learning in both patient groups was accompanied by altered neural processing, including no tracking of expected values in either patient group. Taken together, our results thus provide evidence that reduced trial-by-trial learning dynamics reflect a convergent deficit across depression and schizophrenia. Moreover, we identified disorder-distinct learning deficits.
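The abstract describes learning rates that are maximal on first encounters with a stimulus and decay with repetition. A minimal sketch of that idea follows, assuming a power-law decay `alpha0 / n ** decay`; the functional form and the names `alpha0` and `decay` are illustrative assumptions, since the abstract does not specify the model.

```python
def decaying_learning_rate(alpha0, decay, n_seen):
    """Learning rate on the n-th encounter with a stimulus (n_seen starts at 1)."""
    return alpha0 / (n_seen ** decay)

def learn_value(outcomes, alpha0=1.0, decay=1.0, v0=0.0):
    """Update a value estimate over repeated outcomes of one stimulus."""
    v = v0
    for n, outcome in enumerate(outcomes, start=1):
        alpha = decaying_learning_rate(alpha0, decay, n)
        v += alpha * (outcome - v)  # delta-rule update
    return v

# Three wins followed by one misleading loss.
history = [1.0, 1.0, 1.0, 0.0]
print(learn_value(history, decay=1.0))  # decaying rate: estimate stays high
print(learn_value(history, decay=0.0))  # non-decaying rate: the last outcome dominates
```

With `decay=1.0` and `alpha0=1.0` the estimate is the running mean of outcomes, so a single misleading outcome has limited impact. With `decay=0.0` the rate never decays and the estimate follows the most recent feedback, which is the kind of hypersensitivity the abstract attributes to less decaying learning rates.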
Affiliation(s)
- Hans Kirschner: Institute of Psychology, Otto-von-Guericke University, D-39106 Magdeburg, Germany
- Matthew R Nassar: Robert J. and Nancy D. Carney Institute for Brain Science, Brown University, Providence, RI 02912-1821, USA; Department of Neuroscience, Brown University, Providence, RI 02912-1821, USA
- Adrian G Fischer: Department of Education and Psychology, Freie Universität Berlin, D-14195 Berlin, Germany
- Thomas Frodl: Department of Psychiatry and Psychotherapy, Otto-von-Guericke University, D-39106 Magdeburg, Germany; Department of Psychiatry, Psychotherapy and Psychosomatics, RWTH Aachen University, Aachen 52074, Germany; German Center for Mental Health (DZPG), D-39106 Magdeburg, Germany; Center for Intervention and Research on adaptive and maladaptive brain Circuits underlying mental health (C-I-R-C), Jena-Magdeburg-Halle, D-39106 Magdeburg, Germany
- Gabriela Meyer-Lotz: Department of Psychiatry and Psychotherapy, Otto-von-Guericke University, D-39106 Magdeburg, Germany
- Sören Froböse: Department of Psychiatry and Psychotherapy, Otto-von-Guericke University, D-39106 Magdeburg, Germany
- Stephanie Seidenbecher: Department of Psychiatry and Psychotherapy, Otto-von-Guericke University, D-39106 Magdeburg, Germany
- Tilmann A Klein: Institute of Psychology, Otto-von-Guericke University, D-39106 Magdeburg, Germany; Center for Behavioral Brain Sciences, D-39106 Magdeburg, Germany
- Markus Ullsperger: Institute of Psychology, Otto-von-Guericke University, D-39106 Magdeburg, Germany; German Center for Mental Health (DZPG), D-39106 Magdeburg, Germany; Center for Intervention and Research on adaptive and maladaptive brain Circuits underlying mental health (C-I-R-C), Jena-Magdeburg-Halle, D-39106 Magdeburg, Germany; Center for Behavioral Brain Sciences, D-39106 Magdeburg, Germany
|
16
|
Liuzzi L, Pine DS, Fox NA, Averbeck BB. Changes in Behavior and Neural Dynamics across Adolescent Development. J Neurosci 2023; 43:8723-8732. [PMID: 37848282 PMCID: PMC10727120 DOI: 10.1523/jneurosci.0462-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 08/28/2023] [Accepted: 09/19/2023] [Indexed: 10/19/2023] Open
Abstract
Adolescence is an important developmental period, during which substantial changes occur in brain function and behavior. Several aspects of executive function, including response inhibition, improve during this period. Correspondingly, structural imaging studies have documented consistent decreases in cortical and subcortical gray matter volume, and postmortem histologic studies have found substantial (∼40%) decreases in excitatory synapses in prefrontal cortex. Recent computational modeling work suggests that changes in synaptic density underlie improvements in task performance. These models also predict changes in neural dynamics related to the depth of attractor basins, where deeper basins can underlie better task performance. In this study, we analyzed task-related neural dynamics in a large cohort of longitudinally followed subjects (male and female) spanning early to late adolescence. We found that age correlated positively with behavioral performance in the Eriksen Flanker task. Older subjects were also characterized by deeper attractor basins around task-related evoked EEG potentials during specific cognitive operations. Thus, consistent with computational models examining the effects of excitatory synaptic pruning, older adolescents showed stronger attractor dynamics during task performance.
Significance Statement: There are well-documented changes in brain and behavior during adolescent development. However, there are few mechanistic theories that link changes in the brain to changes in behavior. Here, we tested a hypothesis, put forward on the basis of computational modeling, that pruning of excitatory synapses in cortex during adolescence changes neural dynamics. We found, consistent with the hypothesis, that variability around event-related potentials shows faster decay dynamics in older adolescent subjects. The faster decay dynamics are consistent with the hypothesis that synaptic pruning during adolescent development leads to stronger attractor basins in task-related neural activity.
Affiliation(s)
- Lucrezia Liuzzi: Emotion and Development Branch, National Institute of Mental Health, Bethesda, MD 20892
- Daniel S Pine: Emotion and Development Branch, National Institute of Mental Health, Bethesda, MD 20892
- Nathan A Fox: Department of Human Development and Quantitative Methodology, University of Maryland, College Park, MD 20742
- Bruno B Averbeck: Laboratory of Neuropsychology, National Institute of Mental Health, Bethesda, MD 20892
|
17
|
Sato Y, Sakai Y, Hirata S. State-transition-free reinforcement learning in chimpanzees (Pan troglodytes). Learn Behav 2023; 51:413-427. [PMID: 37369920 DOI: 10.3758/s13420-023-00591-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/07/2023] [Indexed: 06/29/2023]
Abstract
The outcome of an action often occurs after a delay. One solution for learning appropriate actions from delayed outcomes is to rely on a chain of state transitions. Another solution, which does not rest on state transitions, is to use an eligibility trace (ET) that directly bridges a current outcome and multiple past actions via transient memories. Previous studies revealed that humans (Homo sapiens) learned appropriate actions in a behavioral task in which solutions based on the ET were effective but transition-based solutions were ineffective. This suggests that ET may be used in human learning systems. However, no studies have examined nonhuman animals with an equivalent behavioral task. We designed a task for nonhuman animals following a previous human study. In each trial, participants chose one of two stimuli that were randomly selected from three stimulus types: a stimulus associated with a food reward delivered immediately, a stimulus associated with a reward delivered after a few trials, and a stimulus associated with no reward. The presented stimuli did not vary according to the participants' choices. To maximize the total reward, participants had to learn the value of the stimulus associated with a delayed reward. Five chimpanzees (Pan troglodytes) performed the task using a touchscreen. Two chimpanzees were able to learn successfully, indicating that learning mechanisms that do not depend on state transitions were involved in the learning processes. The current study extends previous ET research by proposing a behavioral task and providing empirical data from chimpanzees.
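To make the eligibility trace (ET) idea concrete, the sketch below keeps a transient memory of past choices and lets the current outcome update every stimulus in proportion to its trace, with no state transitions involved. It is an illustrative assumption rather than the model used in the study: the replacing-trace rule, the parameter names `alpha` and `lam`, and the stimulus labels are all hypothetical.

```python
def run_eligibility_trace(trials, alpha=0.5, lam=0.5):
    """Learn stimulus values from (chosen_stimulus, reward) pairs using traces."""
    values = {}
    traces = {}
    for chosen, reward in trials:
        values.setdefault(chosen, 0.0)
        for s in traces:
            traces[s] *= lam          # older choices fade from memory
        traces[chosen] = 1.0          # replacing trace for the current choice
        delta = reward - values[chosen]
        for s, e in traces.items():
            values[s] += alpha * delta * e  # credit shared with recent choices
    return values

# The reward earned by choosing "delayed" arrives one trial later,
# while the animal is choosing the otherwise unrewarded "none" stimulus.
trials = [("delayed", 0.0), ("none", 1.0)]
print(run_eligibility_trace(trials, lam=0.5))  # "delayed" receives partial credit
print(run_eligibility_trace(trials, lam=0.0))  # no trace: "delayed" gets no credit
```

In this toy example the stimulus chosen at the time of reward delivery also receives credit, which reflects the credit-assignment ambiguity that makes learning from delayed outcomes difficult.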
Grants
- 16H06283 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- 18H05524 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- 19J22889 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- 26245069 Ministry of Education, Culture, Sports, Science, Japan Society for the Promotion of Science
- U04 Program for Leading Graduate Schools
Affiliation(s)
- Yutaro Sato: Wildlife Research Center, Kyoto University, Kyoto, Japan; University Administration Office, Headquarters for Management Strategy, Niigata University, Niigata, Japan
- Yutaka Sakai: Brain Science Institute, Tamagawa University, Tokyo, Japan
- Satoshi Hirata: Wildlife Research Center, Kyoto University, Kyoto, Japan
|
18
|
Park H, Doh H, Lee E, Park H, Ahn WY. The neurocognitive role of working memory load when Pavlovian motivational control affects instrumental learning. PLoS Comput Biol 2023; 19:e1011692. [PMID: 38064498 PMCID: PMC10732416 DOI: 10.1371/journal.pcbi.1011692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 12/20/2023] [Accepted: 11/15/2023] [Indexed: 12/21/2023] Open
Abstract
Research suggests that a fast, capacity-limited working memory (WM) system and a slow, incremental reinforcement learning (RL) system jointly contribute to instrumental learning. Thus, situations that strain WM resources alter instrumental learning: under WM loads, learning becomes slow and incremental, the reliance on computationally efficient learning increases, and action selection becomes more random. It is also suggested that Pavlovian learning influences people's behavior during instrumental learning by providing hard-wired instinctive responses including approach to reward predictors and avoidance of punishment predictors. However, it remains unknown how constraints on WM resources affect instrumental learning under Pavlovian influence. Thus, we conducted a functional magnetic resonance imaging (fMRI) study (N = 49) in which participants completed an instrumental learning task with Pavlovian-instrumental conflict (the orthogonalized go/no-go task) both with and without extra WM load. Behavioral and computational modeling analyses revealed that WM load reduced the learning rate and increased random choice, without affecting Pavlovian bias. Model-based fMRI analysis revealed that WM load strengthened RPE signaling in the striatum. Moreover, under WM load, the striatum showed weakened connectivity with the ventromedial and dorsolateral prefrontal cortex when computing reward expectations. These results suggest that the limitation of cognitive resources by WM load promotes slow and incremental learning through the weakened cooperation between WM and RL; such limitation also makes action selection more random, but it does not directly affect the balance between instrumental and Pavlovian systems.
Affiliation(s)
- Heesun Park: Department of Psychology, Seoul National University, Seoul, Korea
- Hoyoung Doh: Department of Psychology, Seoul National University, Seoul, Korea
- Eunhwi Lee: Department of Psychology, Seoul National University, Seoul, Korea
- Harhim Park: Department of Psychology, Seoul National University, Seoul, Korea
- Woo-Young Ahn: Department of Psychology, Seoul National University, Seoul, Korea; Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Korea
|
19
|
Yoo AH, Keglovits H, Collins AGE. Lowered inter-stimulus discriminability hurts incremental contributions to learning. Cogn Affect Behav Neurosci 2023; 23:1346-1364. [PMID: 37656373 PMCID: PMC10545593 DOI: 10.3758/s13415-023-01104-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/13/2023] [Indexed: 09/02/2023]
Abstract
How does the similarity between stimuli affect our ability to learn appropriate response associations for them? In typical laboratory experiments learning is investigated under somewhat ideal circumstances, where stimuli are easily discriminable. This is not representative of most real-life learning, where overlapping "stimuli" can result in different "rewards" and may be learned simultaneously (e.g., you may learn over repeated interactions that a specific dog is friendly, but that a very similar looking one isn't). With two experiments, we test how humans learn in three stimulus conditions: one "best case" condition in which stimuli have idealized and highly discriminable visual and semantic representations, and two in which stimuli have overlapping representations, making them less discriminable. We find that, unsurprisingly, decreasing stimuli discriminability decreases performance. We develop computational models to test different hypotheses about how reinforcement learning (RL) and working memory (WM) processes are affected by different stimulus conditions. Our results replicate earlier studies demonstrating the importance of both processes to capture behavior. However, our results extend previous studies by demonstrating that RL, and not WM, is affected by stimulus distinctness: people learn slower and have higher across-stimulus value confusion at decision when stimuli are more similar to each other. These results illustrate strong effects of stimulus type on learning and demonstrate the importance of considering parallel contributions of different cognitive processes when studying behavior.
Affiliation(s)
- Aspen H Yoo: Department of Psychology, University of California, Berkeley, USA; Helen Wills Neuroscience Institute, University of California, Berkeley, USA
- Haley Keglovits: Department of Cognitive, Linguistic and Psychological Sciences, Brown University, Providence, USA
- Anne G E Collins: Department of Psychology, University of California, Berkeley, USA; Helen Wills Neuroscience Institute, University of California, Berkeley, USA
|
20
|
Tichelaar JG, Sayalı C, Helmich RC, Cools R. Impulse control disorder in Parkinson's disease is associated with abnormal frontal value signalling. Brain 2023; 146:3676-3689. [PMID: 37192341 PMCID: PMC10473575 DOI: 10.1093/brain/awad162] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/06/2022] [Revised: 04/18/2023] [Accepted: 04/26/2023] [Indexed: 05/18/2023] Open
Abstract
Dopaminergic medication is well established to boost reward- versus punishment-based learning in Parkinson's disease. However, there is tremendous variability in dopaminergic medication effects across different individuals, with some patients exhibiting much greater cognitive sensitivity to medication than others. We aimed to unravel the mechanisms underlying this individual variability in a large heterogeneous sample of early-stage patients with Parkinson's disease as a function of comorbid neuropsychiatric symptomatology, in particular impulse control disorders and depression. One hundred and ninety-nine patients with Parkinson's disease (138 ON medication and 61 OFF medication) and 59 healthy controls were scanned with functional MRI while they performed an established probabilistic instrumental learning task. Reinforcement learning model-based analyses revealed medication group differences in learning from gains versus losses, but only in patients with impulse control disorders. Furthermore, expected-value related brain signalling in the ventromedial prefrontal cortex was increased in patients with impulse control disorders ON medication compared with those OFF medication, while striatal reward prediction error signalling remained unaltered. These data substantiate the hypothesis that dopamine's effects on reinforcement learning in Parkinson's disease vary with individual differences in comorbid impulse control disorder and suggest they reflect deficient computation of value in medial frontal cortex, rather than deficient reward prediction error signalling in striatum. See Michael Browning (https://doi.org/10.1093/brain/awad248) for a scientific commentary on this article.
Affiliation(s)
- Jorryt G Tichelaar: Radboud University Medical Centre, Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, 6525EN Nijmegen, The Netherlands; Radboud University Medical Center, Department of Neurology, Centre of Expertise for Parkinson and Movement Disorders, 6525GA Nijmegen, The Netherlands
- Ceyda Sayalı: The Johns Hopkins University School of Medicine, Center for Psychedelic and Consciousness Research, Baltimore, MD 21224, USA
- Rick C Helmich: Radboud University Medical Centre, Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, 6525EN Nijmegen, The Netherlands; Radboud University Medical Center, Department of Neurology, Centre of Expertise for Parkinson and Movement Disorders, 6525GA Nijmegen, The Netherlands
- Roshan Cools: Radboud University Medical Centre, Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, 6525EN Nijmegen, The Netherlands; Radboud University Medical Center, Department of Psychiatry, 6525GA Nijmegen, The Netherlands
|
21
|
Sun Y, Kang P, Huang L, Wang H, Ku Y. Reward advantage over punishment for incentivizing visual working memory. Psychophysiology 2023; 60:e14300. [PMID: 36966450 DOI: 10.1111/psyp.14300] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/04/2022] [Revised: 01/26/2023] [Accepted: 03/01/2023] [Indexed: 03/27/2023]
Abstract
The prospects of gaining reward and avoiding punishment widely influence human behavior. Despite numerous attempts to investigate the influence of motivational signals on working memory (WM), whether the valence and the magnitude of motivational signals interactively influence WM performance remains unclear. To investigate this, the present study used a free-recall working memory task with EEG recording to compare the effects of incentive valence (reward or punishment) and incentive magnitude on visual WM. Behavioral results revealed that the presence of incentive signals improved WM precision compared with the no-incentive condition, and that, compared with punishing cues, rewarding cues led to greater facilitation in WM precision, as well as in subsequent confidence ratings. Moreover, event-related potential (ERP) results suggested that, compared with punishment, reward led to an earlier latency of the late positive component (LPC), a larger amplitude of the contingent negative variation (CNV) during the expectation period, and a larger P300 amplitude during the sample and delay periods. Furthermore, the reward advantage over punishment in behavioral and neural results was correlated, such that individuals with a larger CNV difference between reward and punishment conditions also reported a greater distinction in confidence ratings between the two conditions. In sum, our results demonstrate that, and how, rewarding cues produce more beneficial effects than punishing cues when incentivizing visual WM.
Affiliation(s)
- Yurong Sun: Guangdong Provincial Key Laboratory of Brain Function and Disease, Center for Brain and Mental Well-Being, Department of Psychology, Sun Yat-sen University, Guangzhou, China; School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Pyungwon Kang: Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
- Leyu Huang: School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Huimin Wang: School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Yixuan Ku: Guangdong Provincial Key Laboratory of Brain Function and Disease, Center for Brain and Mental Well-Being, Department of Psychology, Sun Yat-sen University, Guangzhou, China; Peng Cheng Laboratory, Shenzhen, China
|
22
|
Schumacher L, Bürkner PC, Voss A, Köthe U, Radev ST. Neural superstatistics for Bayesian estimation of dynamic cognitive models. Sci Rep 2023; 13:13778. [PMID: 37612320 PMCID: PMC10447473 DOI: 10.1038/s41598-023-40278-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/15/2023] [Accepted: 08/08/2023] [Indexed: 08/25/2023] Open
Abstract
Mathematical models of cognition are often memoryless and ignore potential fluctuations of their parameters. However, human cognition is inherently dynamic. Thus, we propose to augment mechanistic cognitive models with a temporal dimension and estimate the resulting dynamics from a superstatistics perspective. Such a model entails a hierarchy between a low-level observation model and a high-level transition model. The observation model describes the local behavior of a system, and the transition model specifies how the parameters of the observation model evolve over time. To overcome the estimation challenges resulting from the complexity of superstatistical models, we develop and validate a simulation-based deep learning method for Bayesian inference, which can recover both time-varying and time-invariant parameters. We first benchmark our method against two existing frameworks capable of estimating time-varying parameters. We then apply our method to fit a dynamic version of the diffusion decision model to long time series of human response times data. Our results show that the deep learning approach is very efficient in capturing the temporal dynamics of the model. Furthermore, we show that the erroneous assumption of static or homogeneous parameters will hide important temporal information.
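The two-level hierarchy described here (a transition model that moves the parameters over time, and an observation model that generates data given the current parameters) can be sketched as a small simulation. This is only an illustration under stated assumptions: a Gaussian random walk stands in for the transition model and a Gaussian observation stands in for the diffusion decision model used in the paper; all function and parameter names are hypothetical.

```python
import random

def simulate_random_walk_parameter(n_steps, start, step_sd, rng):
    """High-level transition model: the parameter drifts as a Gaussian random walk."""
    theta = start
    trajectory = []
    for _ in range(n_steps):
        trajectory.append(theta)
        theta += rng.gauss(0.0, step_sd)
    return trajectory

def simulate_observations(trajectory, noise_sd, rng):
    """Low-level observation model: one noisy observation per time step."""
    return [rng.gauss(theta, noise_sd) for theta in trajectory]

rng = random.Random(0)  # fixed seed so the simulation is reproducible
theta_t = simulate_random_walk_parameter(n_steps=200, start=1.0, step_sd=0.05, rng=rng)
data = simulate_observations(theta_t, noise_sd=0.3, rng=rng)
print(len(theta_t), len(data))
```

Setting `step_sd=0.0` recovers a static-parameter model, which is the assumption the abstract warns can hide temporal information.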
Affiliation(s)
- Lukas Schumacher: Institute of Psychology, Heidelberg University, Heidelberg, Germany
- Andreas Voss: Institute of Psychology, Heidelberg University, Heidelberg, Germany
- Ullrich Köthe: Computer Vision and Learning Lab, Heidelberg University, Heidelberg, Germany
- Stefan T Radev: Cluster of Excellence STRUCTURES, Heidelberg University, Heidelberg, Germany
|
23
|
Rac-Lubashevsky R, Cremer A, Collins AGE, Frank MJ, Schwabe L. Neural Index of Reinforcement Learning Predicts Improved Stimulus-Response Retention under High Working Memory Load. J Neurosci 2023; 43:3131-3143. [PMID: 36931706 PMCID: PMC10146488 DOI: 10.1523/jneurosci.1274-22.2023] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/29/2022] [Revised: 01/19/2023] [Accepted: 02/20/2023] [Indexed: 03/19/2023] Open
Abstract
Human learning and decision-making are supported by multiple systems operating in parallel. Recent studies isolating the contributions of reinforcement learning (RL) and working memory (WM) have revealed a trade-off between the two. An interactive WM/RL computational model predicts that although high WM load slows behavioral acquisition, it also induces larger prediction errors in the RL system that enhance robustness and retention of learned behaviors. Here, we tested this account by parametrically manipulating WM load during RL in conjunction with EEG in both male and female participants and administered two surprise memory tests. We further leveraged single-trial decoding of EEG signatures of RL and WM to determine whether their interaction predicted robust retention. Consistent with the model, behavioral learning was slower for associations acquired under higher load but showed parametrically improved future retention. This paradoxical result was mirrored by EEG indices of RL, which were strengthened under higher WM loads and predictive of more robust future behavioral retention of learned stimulus-response contingencies. We further tested whether stress alters the ability to shift between the two systems strategically to maximize immediate learning versus retention of information and found that induced stress had only a limited effect on this trade-off. The present results offer a deeper understanding of the cooperative interaction between WM and RL and show that relying on WM can benefit the rapid acquisition of choice behavior during learning but impairs retention.
Significance Statement: Successful learning is achieved by the joint contribution of the dopaminergic RL system and WM. The cooperative WM/RL model was productive in improving our understanding of the interplay between the two systems during learning, demonstrating that reliance on RL computations is modulated by WM load. However, the role of WM/RL systems in the retention of learned stimulus-response associations remained unestablished. Our results show that increased neural signatures of learning, indicative of greater RL computation, under high WM load also predicted better stimulus-response retention. This result supports a trade-off between the two systems, where degraded WM increases RL processing, which improves retention. Notably, we show that this cooperative interplay remains largely unaffected by acute stress.
Affiliation(s)
- Rachel Rac-Lubashevsky: Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, Rhode Island 02912; Carney Institute for Brain Science, Brown University, Providence, Rhode Island 02912
- Anna Cremer: Department of Cognitive Psychology, Universität Hamburg, 20146 Hamburg, Germany
- Anne G E Collins: Department of Psychology, University of California, Berkeley, Berkeley, California 94720-1650; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California 94720
- Michael J Frank: Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, Rhode Island 02912; Carney Institute for Brain Science, Brown University, Providence, Rhode Island 02912
- Lars Schwabe: Department of Cognitive Psychology, Universität Hamburg, 20146 Hamburg, Germany
|
24
|
Zeng J, Meng J, Wang C, Leng W, Zhong X, Gong A, Bo S, Jiang C. High vagally mediated resting-state heart rate variability is associated with superior working memory function. Front Neurosci 2023; 17:1119405. [PMID: 36891458 PMCID: PMC9986304 DOI: 10.3389/fnins.2023.1119405] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/09/2022] [Accepted: 01/30/2023] [Indexed: 02/22/2023] Open
Abstract
Background: Heart rate variability (HRV), a cardiac vagal tone indicator, has been proven to predict performance on some cognitive tasks that rely on the prefrontal cortex. However, the relationship between vagal tone and working memory remains understudied. This study explores the link between vagal tone and working memory function by combining behavioral tasks with functional near-infrared spectroscopy (fNIRS). Methods: A total of 42 undergraduate students underwent 5-min resting-state HRV recording to obtain the root mean square of successive differences (rMSSD), and were then divided into high and low vagal tone groups according to the median rMSSD. The two groups underwent the n-back test, and fNIRS was used to measure neural activity during the test. ANOVA and the independent-samples t-test were performed to compare group mean differences, and the Pearson correlation coefficient was used for correlation analysis. Results: The high vagal tone group had a shorter reaction time, higher accuracy, lower inverse efficiency score, and lower oxy-Hb concentration in the bilateral prefrontal cortex during the working memory tasks. Furthermore, there were associations between behavioral performance, oxy-Hb concentration, and resting-state rMSSD. Conclusion: Our findings suggest that high vagally mediated resting-state HRV is associated with working memory performance. High vagal tone may reflect more efficient use of neural resources, which supports better working memory function.
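The measures named in the Methods and Results can be written down directly. The sketch below computes rMSSD from a series of RR intervals using its standard definition, performs a median split into high and low groups, and computes an inverse efficiency score as mean reaction time divided by the proportion correct. The function names, the handling of ties at the median, and the example numbers are assumptions for illustration, not details from the study.

```python
import math
import statistics

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def median_split(scores):
    """Label each score 'high' or 'low' relative to the group median.

    Assumption: scores equal to the median are placed in the 'low' group.
    """
    cutoff = statistics.median(scores)
    return ["high" if s > cutoff else "low" for s in scores]

def inverse_efficiency(mean_rt_ms, proportion_correct):
    """Inverse efficiency score: mean reaction time divided by accuracy."""
    return mean_rt_ms / proportion_correct

# Hypothetical example values, not data from the study.
print(rmssd([800, 810, 790, 800]))
print(median_split([10.0, 20.0, 30.0, 40.0]))
print(inverse_efficiency(600.0, 0.8))
```

A lower inverse efficiency score indicates faster responding at a given accuracy level, which is how the Results use it.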
Affiliation(s)
- Jia Zeng
  - The Center of Neuroscience and Sports, Capital University of Physical Education and Sports, Beijing, China
- Jiao Meng
  - The Center of Neuroscience and Sports, Capital University of Physical Education and Sports, Beijing, China
- Chen Wang
  - The Center of Neuroscience and Sports, Capital University of Physical Education and Sports, Beijing, China
- Wenwu Leng
  - The Center of Neuroscience and Sports, Capital University of Physical Education and Sports, Beijing, China
- Xiaoke Zhong
  - The Center of Neuroscience and Sports, Capital University of Physical Education and Sports, Beijing, China
- Anmin Gong
  - School of Information Engineering, Engineering University of People's Armed Police, Xi'an, China
- Shumin Bo
  - School of Kinesiology and Health, Capital University of Physical Education and Sports, Beijing, China
- Changhao Jiang
  - The Center of Neuroscience and Sports, Capital University of Physical Education and Sports, Beijing, China
  - School of Kinesiology and Health, Capital University of Physical Education and Sports, Beijing, China
25
Abstract
In reinforcement learning (RL) experiments, participants learn to make rewarding choices in response to different stimuli; RL models use outcomes to estimate stimulus-response values that change incrementally. RL models consider any response type indiscriminately, ranging from more concretely defined motor choices (pressing a key with the index finger), to more general choices that can be executed in a number of ways (selecting dinner at the restaurant). However, does the learning process vary as a function of the choice type? In Experiment 1, we show that it does: Participants were slower and less accurate in learning correct choices of a general format compared with learning more concrete motor actions. Using computational modeling, we show that two mechanisms contribute to this. First, there was evidence of irrelevant credit assignment: The values of motor actions interfered with the values of other choice dimensions, resulting in more incorrect choices when the correct response was not defined by a single motor action; second, information integration for relevant general choices was slower. In Experiment 2, we replicated and further extended the findings from Experiment 1 by showing that slowed learning was attributable to weaker working memory use, rather than slowed RL. In both experiments, we ruled out the explanation that the difference in performance between two condition types was driven by difficulty/different levels of complexity. We conclude that defining a more abstract choice space used by multiple learning systems for credit assignment recruits executive resources, limiting how much such processes then contribute to fast learning.
Affiliation(s)
- Amy Zou
  - University of California, Berkeley
- Anne G E Collins
  - University of California, Berkeley
  - Helen Wills Neuroscience Institute, Berkeley, CA
26
Ben-Artzi I, Luria R, Shahar N. Working memory capacity estimates moderate value learning for outcome-irrelevant features. Sci Rep 2022; 12:19677. [PMID: 36385131 PMCID: PMC9669000 DOI: 10.1038/s41598-022-21832-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Accepted: 10/04/2022] [Indexed: 11/17/2022]
Abstract
To establish accurate action-outcome associations in the environment, individuals must refrain from assigning value to outcome-irrelevant features. However, studies have largely ignored the role of attentional control processes on action value updating. In the current study, we examined the extent to which working memory, a system that can filter and block the processing of irrelevant information in one's mind, also filters outcome-irrelevant information during value-based learning. For this aim, 174 individuals completed a well-established working memory capacity measurement and a reinforcement learning task designed to estimate outcome-irrelevant learning. We replicated previous studies showing a group-level tendency to assign value to tasks' response keys, despite clear instructions and practice suggesting they are irrelevant to the prediction of monetary outcomes. Importantly, individuals with higher working memory capacity were less likely to assign value to the outcome-irrelevant response keys, thus suggesting a significant moderation effect of working memory capacity on outcome-irrelevant learning. We discuss the role of working memory processing on value-based learning through the lens of a cognitive control failure.
Affiliation(s)
- Ido Ben-Artzi
  - School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Roy Luria
  - School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
  - Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Nitzan Shahar
  - School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
  - Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
27
Nicholas J, Daw ND, Shohamy D. Uncertainty alters the balance between incremental learning and episodic memory. eLife 2022; 11:81679. [PMID: 36458809 PMCID: PMC9810331 DOI: 10.7554/elife.81679] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/07/2022] [Accepted: 12/01/2022] [Indexed: 12/04/2022]
Abstract
A key question in decision-making is how humans arbitrate between competing learning and memory systems to maximize reward. We address this question by probing the balance between the effects, on choice, of incremental trial-and-error learning versus episodic memories of individual events. Although a rich literature has studied incremental learning in isolation, the role of episodic memory in decision-making has only recently drawn focus, and little research disentangles their separate contributions. We hypothesized that the brain arbitrates rationally between these two systems, relying on each in circumstances to which it is most suited, as indicated by uncertainty. We tested this hypothesis by directly contrasting contributions of episodic and incremental influence to decisions, while manipulating the relative uncertainty of incremental learning using a well-established manipulation of reward volatility. Across two large, independent samples of young adults, participants traded these influences off rationally, depending more on episodic information when incremental summaries were more uncertain. These results support the proposal that the brain optimizes the balance between different forms of learning and memory according to their relative uncertainties and elucidate the circumstances under which episodic memory informs decisions.
Affiliation(s)
- Jonathan Nicholas
  - Department of Psychology, Columbia University, New York, United States
  - Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, United States
- Nathaniel D Daw
  - Department of Psychology, Princeton University, Princeton, United States
  - Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Daphna Shohamy
  - Department of Psychology, Columbia University, New York, United States
  - Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, United States
  - The Kavli Institute for Brain Science, Columbia University, New York, United States