1. Zheng Y, Mei S. Neural dissociation between reward and salience prediction errors through the lens of optimistic bias. Hum Brain Mapp 2023; 44:4545-4560. PMID: 37334979; PMCID: PMC10365237; DOI: 10.1002/hbm.26398.
Abstract
The question of how the brain represents reward prediction errors is central to reinforcement learning and adaptive, goal-directed behavior. Previous studies have revealed prediction error representations in multiple electrophysiological signatures, but it remains elusive whether these electrophysiological correlates of prediction errors are sensitive to valence (in a signed form) or to salience (in an unsigned form). One possible reason concerns the loose correspondence between objective probability and subjective prediction resulting from the optimistic bias, that is, the tendency to overestimate the likelihood of encountering positive future events. In the present electroencephalography (EEG) study, we approached this question by directly measuring participants' idiosyncratic, trial-to-trial prediction errors elicited by subjective and objective probabilities across two experiments. We adopted monetary gain and loss feedback in Experiment 1 and positive and negative feedback as communicated by the same zero-value feedback in Experiment 2. We provide electrophysiological evidence in the time and time-frequency domains supporting both reward and salience prediction error signals. Moreover, we show that these electrophysiological signatures were highly flexible and sensitive to an optimistic bias and various forms of salience. Our findings shed new light on multiple representations of prediction error in the human brain, which differ in format and functional role.
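The reward-versus-salience distinction this abstract turns on reduces to signed versus unsigned prediction errors. A minimal sketch of the two quantities (function and variable names are ours, not the authors'):

```python
def prediction_errors(outcome, expectation):
    """Return (signed, unsigned) prediction errors.

    The signed (reward) PE keeps valence: positive when the outcome is
    better than expected, negative when worse. The unsigned (salience)
    PE discards valence and tracks only the magnitude of surprise.
    """
    signed_pe = outcome - expectation
    return signed_pe, abs(signed_pe)

# A worse-than-expected loss and a better-than-expected gain have
# opposite signed PEs but identical salience:
print(prediction_errors(-1.0, 0.5))  # (-1.5, 1.5)
print(prediction_errors(2.0, 0.5))   # (1.5, 1.5)
```

An EEG signature tracking the first value is valence-sensitive; one tracking the second is salience-sensitive, which is the dissociation the study tests.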
Affiliation(s)
- Ya Zheng
- Department of Psychology, Guangzhou University, Guangzhou, China
- Shuting Mei
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China
2. Liu F, Jiang Y, Li S. The dissociating of reward feedback on familiarity and recollection processing: evidence from event-related potential. Neuroreport 2022; 33:429-436. PMID: 35623088; DOI: 10.1097/wnr.0000000000001801.
Abstract
OBJECTIVES Although previous studies have explored the effect of reward feedback on recognition memory, electrophysiological evidence for reward-enhanced memory and its underlying processing mechanisms remains unclear. METHODS This study adopted reward-learning and recognition-memory tasks. In the reward-learning task, participants learned the reward values of two color images (each color image had either reward or nonreward feedback), and their recognition memory was then tested with reward and nonreward feedback items. RESULTS Recognition memory performance was better for rewarded items than for nonrewarded items. During the reward-learning period, nonreward feedback elicited larger feedback-related negativity (FRN) and P300 amplitudes than reward feedback, indicating that participants mainly engaged in prediction-error processing in the early stage, followed by comparison and context updating of the learned items. During the recognition-memory period, reward items elicited a larger FN400 amplitude and a smaller LPC amplitude than nonreward items. This suggests that retrieval of reward items leaves deeper memory traces and identifies items faster, relying mainly on familiarity processing. Conversely, nonreward items, as general or inhibitory items, require more detail and cognitive resources, that is, they rely on recollection processing. CONCLUSIONS These findings indicate that participants showed different processing patterns for reward and nonreward items during recognition retrieval.
Affiliation(s)
- Fangfang Liu
- School of Psychology, Northeast Normal University, Changchun, Jilin, China
3. Chen XJ, van den Berg B, Kwak Y. Reward and expectancy effects on neural signals of motor preparation and execution. Cortex 2022; 150:29-46. DOI: 10.1016/j.cortex.2022.01.018.
4. Reward prediction errors drive declarative learning irrespective of agency. Psychon Bull Rev 2021; 28:2045-2056. PMID: 34131890; DOI: 10.3758/s13423-021-01952-7.
Abstract
Recent years have witnessed a steady increase in the number of studies investigating the role of reward prediction errors (RPEs) in declarative learning. Specifically, in several experimental paradigms, RPEs drive declarative learning, with larger and more positive RPEs enhancing declarative learning. However, it is unknown whether this RPE must derive from the participant's own response, or whether instead, any RPE is sufficient to obtain the learning effect. To test this, we generated RPEs in the same experimental paradigm where we combined an agency and a nonagency condition. We observed no interaction between RPE and agency, suggesting that any RPE (irrespective of its source) can drive declarative learning. This result holds implications for declarative learning theory.
5. Rouhani N, Niv Y. Signed and unsigned reward prediction errors dynamically enhance learning and memory. eLife 2021; 10:e61077. PMID: 33661094; PMCID: PMC8041467; DOI: 10.7554/elife.61077.
Abstract
Memory helps guide behavior, but which experiences from the past are prioritized? Classic models of learning posit that both unpredictable and, paradoxically, predictable outcomes recruit more attention and learning. Here, we test reinforcement learning and subsequent memory for such events, treating signed and unsigned reward prediction errors (RPEs), experienced at the reward-predictive cue or at the reward outcome, as drivers of these two seemingly contradictory signals. By fitting reinforcement learning models to behavior, we find that both RPEs contribute to learning by modulating a dynamically changing learning rate. We further characterize the effects of these RPE signals on memory and show that both signed and unsigned RPEs enhance memory, in line with midbrain dopamine and locus coeruleus modulation of hippocampal plasticity, thereby reconciling separate findings in the literature.
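The mechanism described here, unsigned RPEs modulating a dynamically changing learning rate while signed RPEs drive the value update, is in the spirit of Pearce-Hall associability. A hedged sketch of that idea (our own simplification with made-up parameter values, not the authors' fitted model):

```python
def update_value(value, reward, alpha, eta=0.3):
    """One trial of RPE-driven learning with a dynamic learning rate.

    The unsigned RPE nudges the per-trial learning rate (alpha) up
    after surprising outcomes and down after predictable ones
    (Pearce-Hall style); the signed RPE then drives the value update.
    Toy simplification, not the paper's exact model.
    """
    signed_pe = reward - value
    alpha = (1 - eta) * alpha + eta * min(abs(signed_pe), 1.0)
    value = value + alpha * signed_pe
    return value, alpha

value, alpha = 0.0, 0.5
for r in [1.0, 1.0, 0.0, 1.0]:  # a surprising omission on trial 3
    value, alpha = update_value(value, r, alpha)
# After the surprising omission, alpha rises, so the final reward
# produces a larger value update than it would under a fixed rate.
```

The point of the sketch is only that surprise (unsigned RPE) controls how strongly valence (signed RPE) is learned from, which is the dissociation the paper links to memory.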
Affiliation(s)
- Nina Rouhani
- Chen Neuroscience Institute, California Institute of Technology, Pasadena, United States
- Yael Niv
- Department of Psychology, Princeton University, Princeton, United States
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
6. Signed Reward Prediction Errors in the Ventral Striatum Drive Episodic Memory. J Neurosci 2020; 41:1716-1726. PMID: 33334870; DOI: 10.1523/jneurosci.1785-20.2020.
Abstract
Recent behavioral evidence implicates reward prediction errors (RPEs) as a key factor in the acquisition of episodic memory. Yet, important neural predictions related to the role of RPEs in episodic memory acquisition remain to be tested. Humans (both sexes) performed a novel variable-choice task where we experimentally manipulated RPEs and found support for key neural predictions with fMRI. Our results show that in line with previous behavioral observations, episodic memory accuracy increases with the magnitude of signed (i.e., better/worse-than-expected) RPEs (SRPEs). Neurally, we observe that SRPEs are encoded in the ventral striatum (VS). Crucially, we demonstrate through mediation analysis that activation in the VS mediates the experimental manipulation of SRPEs on episodic memory accuracy. In particular, SRPE-based responses in the VS (during learning) predict the strength of subsequent episodic memory (during recollection). Furthermore, functional connectivity between task-relevant processing areas (i.e., face-selective areas) and hippocampus and ventral striatum increased as a function of RPE value (during learning), suggesting a central role of these areas in episodic memory formation. Our results consolidate reinforcement learning theory and striatal RPEs as key factors subtending the formation of episodic memory.

SIGNIFICANCE STATEMENT Recent behavioral research has shown that reward prediction errors (RPEs), a key concept of reinforcement learning theory, are crucial to the formation of episodic memories. In this study, we reveal the neural underpinnings of this process. Using fMRI, we show that signed RPEs (SRPEs) are encoded in the ventral striatum (VS), and crucially, that SRPE VS activity is responsible for the subsequent recollection accuracy of one-shot learned episodic memory associations.
7. Learning to Synchronize: Midfrontal Theta Dynamics during Rule Switching. J Neurosci 2020; 41:1516-1528. PMID: 33310756; DOI: 10.1523/jneurosci.1874-20.2020.
Abstract
In recent years, several hierarchical extensions of well-known learning algorithms have been proposed. For example, when stimulus-action mappings vary across time or context, the brain may learn two or more stimulus-action mappings in separate modules, and additionally (at a hierarchically higher level) learn to appropriately switch between those modules. However, how the brain mechanistically coordinates neural communication to implement such hierarchical learning remains unknown. Therefore, the current study tests a recent computational model that proposed how midfrontal theta oscillations implement such hierarchical learning via the principle of binding by synchrony (Sync model). More specifically, the Sync model uses bursts at theta frequency to flexibly bind appropriate task modules by synchrony. The 64-channel EEG signal was recorded while 27 human subjects (female: 21, male: 6) performed a probabilistic reversal learning task. In line with the Sync model, postfeedback theta power showed a linear relationship with negative prediction errors, but not with positive prediction errors. This relationship was especially pronounced for subjects with better behavioral fit (measured via Akaike information criterion) of the Sync model. Also consistent with Sync model simulations, theta phase-coupling between midfrontal electrodes and temporoparietal electrodes was stronger after negative feedback. Our data suggest that the brain uses theta power and synchronization for flexibly switching between task rule modules, as is useful, for example, when multiple stimulus-action mappings must be retained and used.

SIGNIFICANCE STATEMENT Everyday life requires flexibility in switching between several rules. A key question in understanding this ability is how the brain mechanistically coordinates such switches. The current study tests a recent computational framework (Sync model) that proposed how midfrontal theta oscillations coordinate activity in hierarchically lower task-related areas. In line with predictions of this Sync model, midfrontal theta power was stronger when rule switches were most likely (strong negative prediction error), especially in subjects who obtained a better model fit. Additionally, theta phase connectivity between midfrontal and task-related areas was increased after negative feedback. Thus, the data provided support for the hypothesis that the brain uses theta power and synchronization for flexibly switching between rules.
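The key asymmetry reported here (theta power linear in negative, but not positive, prediction errors) amounts to fitting two separate regressions on a split of the trials. A sketch on simulated data (the variable names, effect size, and noise level are ours, chosen only to illustrate the analysis):

```python
import random

random.seed(0)
# Hypothetical per-trial data: signed prediction errors (PEs) and
# post-feedback midfrontal theta power (arbitrary units), simulated
# under the Sync-model prediction that theta scales with the
# magnitude of negative PEs only.
pe = [random.uniform(-1, 1) for _ in range(400)]
theta = [2.0 * max(-p, 0.0) + random.gauss(0, 0.1) for p in pe]

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Regress theta on PE magnitude separately for negative and positive PEs.
neg = [(p, t) for p, t in zip(pe, theta) if p < 0]
pos = [(p, t) for p, t in zip(pe, theta) if p >= 0]
slope_neg = slope([-p for p, _ in neg], [t for _, t in neg])
slope_pos = slope([p for p, _ in pos], [t for _, t in pos])
# slope_neg recovers the simulated effect; slope_pos stays near zero,
# reproducing the negative/positive asymmetry the abstract reports.
```

This is the shape of the analysis only; the actual study additionally stratified the effect by model fit (AIC) across subjects.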
8. Ergo K, De Loof E, Debra G, Pastötter B, Verguts T. Failure to modulate reward prediction errors in declarative learning with theta (6 Hz) frequency transcranial alternating current stimulation. PLoS One 2020; 15:e0237829. PMID: 33270685; PMCID: PMC7714179; DOI: 10.1371/journal.pone.0237829.
Abstract
Recent evidence suggests that reward prediction errors (RPEs) play an important role in declarative learning, but their neurophysiological mechanism remains unclear. Here, we tested the hypothesis that RPEs modulate declarative learning via theta-frequency oscillations, which have been related to memory encoding in prior work. For that purpose, we examined the interaction between RPE and transcranial alternating current stimulation (tACS) in declarative learning. Using a between-subject (real versus sham stimulation group), single-blind stimulation design, 76 participants learned 60 Dutch-Swahili word pairs while theta-frequency (6 Hz) tACS was administered over the medial frontal cortex (MFC), an area previous studies have implicated in memory encoding. We replicated our previous finding of signed RPEs (SRPEs) boosting declarative learning, with larger and more positive RPEs enhancing memory performance. However, tACS failed to modulate the SRPE effect in declarative learning and did not affect memory performance. Bayesian statistics supported the absence of an effect. Our study confirms a role for RPEs in declarative learning, but also calls for standardized procedures in transcranial electrical stimulation.
Affiliation(s)
- Kate Ergo
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Esther De Loof
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Gillian Debra
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Tom Verguts
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
9. Rouhani N, Norman KA, Niv Y, Bornstein AM. Reward prediction errors create event boundaries in memory. Cognition 2020; 203:104269. PMID: 32563083; DOI: 10.1016/j.cognition.2020.104269.
Abstract
We remember when things change. Particularly salient are experiences where there is a change in rewards, eliciting reward prediction errors (RPEs). How do RPEs influence our memory of those experiences? One idea is that this signal directly enhances the encoding of memory. Another, not mutually exclusive, idea is that the RPE signals a deeper change in the environment, leading to the mnemonic separation of subsequent experiences from what came before, thereby creating a new latent context and a more separate memory trace. We tested this in four experiments where participants learned to predict rewards associated with a series of trial-unique images. High-magnitude RPEs indicated a change in the underlying distribution of rewards. To test whether these large RPEs created a new latent context, we first assessed recognition priming for sequential pairs that included a high-RPE event or not (Exp. 1: n = 27 & Exp. 2: n = 83). We found evidence of recognition priming for the high-RPE event, indicating that the high-RPE event is bound to its predecessor in memory. Given that high-RPE events are themselves preferentially remembered (Rouhani, Norman, & Niv, 2018), we next tested whether there was an event boundary across a high-RPE event (i.e., excluding the high-RPE event itself; Exp. 3: n = 85). Here, sequential pairs across a high RPE no longer showed recognition priming whereas pairs within the same latent reward state did, providing initial evidence for an RPE-modulated event boundary. We then investigated whether RPE event boundaries disrupt temporal memory by asking participants to order and estimate the distance between two events that had either included a high-RPE event between them or not (Exp. 4). We found (n = 49) and replicated (n = 77) worse sequence memory for events across a high RPE. In line with our recognition priming results, we did not find sequence memory to be impaired between the high-RPE event and its predecessor, but instead found worse sequence memory for pairs across a high-RPE event. Moreover, greater distance between events at encoding led to better sequence memory for events across a low-RPE event, but not a high-RPE event, suggesting separate mechanisms for the temporal ordering of events within versus across a latent reward context. Altogether, these findings demonstrate that high-RPE events are more strongly encoded, retain intact links with their predecessors, and act as event boundaries that interrupt the sequential integration of events. We captured these effects in a variant of the Context Maintenance and Retrieval model (CMR; Polyn, Norman, & Kahana, 2009), modified to incorporate RPEs into the encoding process.
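The event-boundary idea in this abstract can be pictured as a slowly drifting context vector whose drift rate jumps when the unsigned RPE is large, so experiences on either side of a high-RPE event end up bound to dissimilar contexts. A toy sketch of that mechanism (our own simplification with invented parameters, not the authors' modified CMR model):

```python
import math

def drift_context(context, item, rpe, beta=0.3, boundary=1.0):
    """Drift a context vector toward the current item's representation.

    A high unsigned RPE (|rpe| > boundary) triggers a much larger
    drift, i.e. an event boundary that separates what follows from
    what came before. Toy sketch, not the paper's fitted CMR variant.
    """
    beta_t = min(1.0, beta + (abs(rpe) > boundary) * 0.6)
    new = [(1 - beta_t) * c + beta_t * i for c, i in zip(context, item)]
    norm = math.sqrt(sum(x * x for x in new))
    return [x / norm for x in new]

ctx = [1.0, 0.0]   # current context
item = [0.0, 1.0]  # orthogonal representation of the next image
low = drift_context(ctx, item, rpe=0.2)   # small surprise: gentle drift
high = drift_context(ctx, item, rpe=2.0)  # big surprise: boundary
# The high-RPE step moves context much further from its previous
# state, so later items share less context with earlier ones,
# degrading cross-boundary sequence memory.
```

In CMR-style models, items are later retrieved via their encoded context, so the larger post-boundary drift is what produces worse order and distance judgments across the boundary.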
Affiliation(s)
- Nina Rouhani
- Princeton Neuroscience Institute, Princeton University, United States of America; Department of Psychology, Princeton University, United States of America.
- Kenneth A Norman
- Princeton Neuroscience Institute, Princeton University, United States of America; Department of Psychology, Princeton University, United States of America
- Yael Niv
- Princeton Neuroscience Institute, Princeton University, United States of America; Department of Psychology, Princeton University, United States of America
- Aaron M Bornstein
- Department of Cognitive Sciences and Center for the Neurobiology of Learning and Memory, University of California, Irvine, United States of America
10. Ergo K, De Loof E, Verguts T. Reward Prediction Error and Declarative Memory. Trends Cogn Sci 2020; 24:388-397. PMID: 32298624; DOI: 10.1016/j.tics.2020.02.009.
Abstract
Learning based on reward prediction error (RPE) was originally proposed in the context of nondeclarative memory. We postulate that RPE may support declarative memory as well. Indeed, recent years have witnessed a number of independent empirical studies reporting effects of RPE on declarative memory. We provide a brief overview of these studies, identify emerging patterns, and discuss open issues such as the role of signed versus unsigned RPEs in declarative learning.
Affiliation(s)
- Kate Ergo
- Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, B-9000 Ghent, Belgium
- Esther De Loof
- Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, B-9000 Ghent, Belgium
- Tom Verguts
- Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, B-9000 Ghent, Belgium.
11. Mason A, Lorimer A, Farrell S. Expected Value of Reward Predicts Episodic Memory for Incidentally Learnt Reward-Item Associations. Collabra: Psychology 2019. DOI: 10.1525/collabra.217.
Abstract
In this paper, we draw connections between reward processing and cognition by behaviourally testing the implications of neurobiological theories of reward processing on memory. Single-cell neurophysiology in non-human primates and imaging work in humans suggest that the dopaminergic reward system responds to different components of reward: expected value; outcome or prediction error; and uncertainty of reward (Schultz et al., 2008). The literature on both incidental and motivated learning has focused on understanding how expected value and outcome (linked to increased activity in the reward system) lead to consolidation-related memory enhancements. In the current study, we additionally investigate the impact of reward uncertainty on human memory. The contribution of reward uncertainty (the spread of the reward probability distribution, irrespective of its magnitude) has not been previously examined. To examine the effects of uncertainty on memory, a word-learning task was introduced, along with a surprise delayed recognition memory test. Using Bayesian model selection, we found evidence only for expected value as a predictor of memory performance. Our findings suggest that reward uncertainty does not enhance memory for individual items. This supports emerging evidence that an effect of uncertainty on memory is only observed in high-risk compared with low-risk environments.
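The distinction the authors draw between expected value and reward uncertainty is just the mean versus the spread (variance) of the reward distribution. A minimal sketch (function name is ours):

```python
def reward_stats(outcomes, probs):
    """Expected value and uncertainty (variance) of a reward lottery.

    Uncertainty here is the spread of the reward distribution
    irrespective of its mean, matching the abstract's definition.
    """
    ev = sum(p * r for p, r in zip(probs, outcomes))
    var = sum(p * (r - ev) ** 2 for p, r in zip(probs, outcomes))
    return ev, var

# Two lotteries with the same expected value but different uncertainty,
# the comparison the study needs to dissociate the two predictors:
print(reward_stats([0, 100], [0.5, 0.5]))  # (50.0, 2500.0)
print(reward_stats([50, 50], [0.5, 0.5]))  # (50.0, 0.0)
```

Items paired with the first lottery test whether spread alone, with the mean held constant, boosts later recognition; the study's Bayesian model selection favored the mean (expected value) only.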