1. Chase HW. A novel technique for delineating the effect of variation in the learning rate on the neural correlates of reward prediction errors in model-based fMRI. Front Psychol 2023; 14:1211528. PMID: 38187436; PMCID: PMC10768009; DOI: 10.3389/fpsyg.2023.1211528.
Abstract
Introduction: Computational models play an increasingly important role in describing variation in neural activation in human neuroimaging experiments, including the evaluation of individual differences in psychiatric neuroimaging. In particular, reinforcement learning (RL) techniques have been widely adopted to examine neural responses to reward prediction errors and to stimulus or action values, and how these vary as a function of clinical status. However, there is no consensus on how precisely the free parameters of these models, particularly the learning rate, must be estimated. In the present study, I introduce a novel technique that can be used within a general linear model (GLM) to model the effect of mis-estimation of the learning rate on reward prediction error (RPE)-related neural responses.
Methods: Simulations employed a simple RL algorithm to generate hypothetical neural activations of the kind expected in functional magnetic resonance imaging (fMRI) studies of RL. Similar RL models were incorporated within a GLM-based analysis method that included derivative regressors, and individual differences in the resulting GLM-derived beta parameters were evaluated with respect to the free parameters of the RL model or submitted to other validation analyses.
Results: Initial simulations demonstrated that the conventional approach to fitting RL models to RPE responses is more likely to reflect individual differences in a reinforcement efficacy construct (lambda) than in the learning rate (alpha). The proposed method, which adds a derivative regressor to the GLM, provides a second regressor that reflects the learning rate. Validation analyses included a comparison with another, comparable method, which yielded highly similar results, and a demonstration of the method's sensitivity in the presence of fMRI-like noise.
Conclusion: Overall, the findings underscore the importance of the lambda parameter for interpreting individual differences in RPE-coupled neural activity, and validate a novel neural metric of the modulation of such activity by individual differences in the learning rate. The method is expected to find application in understanding aberrant reinforcement learning across psychiatric patient groups, including major depression and substance use disorder.
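The derivative-regressor idea can be sketched with a simple Rescorla-Wagner learner: simulate the RPE series for a given learning rate alpha, then differentiate that series with respect to alpha via the chain rule to obtain a second regressor. This is a minimal illustration of the general approach, not the paper's implementation; the function and variable names are mine.

```python
import numpy as np

def rpe_and_alpha_derivative(rewards, alpha=0.3, v0=0.0):
    """Rescorla-Wagner RPEs plus the partial derivative of each RPE
    with respect to the learning rate alpha (chain rule through the
    value update). The derivative series is a candidate second GLM
    regressor indexing mis-estimation of alpha."""
    v, dv = v0, 0.0                  # value estimate and d(value)/d(alpha)
    rpes, drpes = [], []
    for r in rewards:
        delta = r - v                # reward prediction error
        ddelta = -dv                 # d(delta)/d(alpha); r does not depend on alpha
        rpes.append(delta)
        drpes.append(ddelta)
        dv = dv + delta + alpha * ddelta   # differentiate v <- v + alpha*delta
        v = v + alpha * delta
    return np.array(rpes), np.array(drpes)

rewards = [1.0, 1.0, 0.0, 1.0, 0.0, 1.0]
rpes, drpes = rpe_and_alpha_derivative(rewards, alpha=0.3)
```

Convolving both series with a hemodynamic response function would yield the two parametric regressors for the GLM.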
Affiliation(s)
- Henry W. Chase
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
2. Sebri V, Triberti S, Granic GD, Pravettoni G. Reward-dependent dynamics and changes in risk taking in the Balloon Analogue Risk Task. J Cogn Psychol 2023. DOI: 10.1080/20445911.2023.2181065.
Affiliation(s)
- Valeria Sebri
- Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Applied Research Division for Cognitive and Psychological Science, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Stefano Triberti
- Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Applied Research Division for Cognitive and Psychological Science, IEO, European Institute of Oncology IRCCS, Milan, Italy
- Georg D. Granic
- Department of Applied Economics, Erasmus University Rotterdam, Rotterdam, the Netherlands
- Department of Marketing, University of Antwerp, Antwerp, Belgium
- Gabriella Pravettoni
- Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Applied Research Division for Cognitive and Psychological Science, IEO, European Institute of Oncology IRCCS, Milan, Italy
3. A neural and behavioral trade-off between value and uncertainty underlies exploratory decisions in normative anxiety. Mol Psychiatry 2022; 27:1573-1587. PMID: 34725456; DOI: 10.1038/s41380-021-01363-z.
Abstract
Exploration reduces uncertainty about the environment and improves the quality of future decisions, but at the cost of provisionally uncertain and suboptimal outcomes. Although anxiety promotes intolerance of uncertainty, it remains unclear whether, and by which mechanisms, anxiety relates to exploratory decision-making. Using a dynamic three-armed bandit task, we find that higher trait anxiety is associated with increased exploration, which in turn harms overall performance. We identify two distinct behavioral sources: decisions made by anxious individuals are guided more toward reduction of uncertainty, and less by immediate value gains. These findings hold in both loss and gain domains and demonstrate that an affective trait relates to exploration, producing an inverse-U-shaped relationship between anxiety and overall performance. Additional imaging data (fMRI) suggest that normative anxiety correlates negatively with the representation of expected value in the dorsal anterior cingulate cortex and, in contrast, positively with the representation of uncertainty in the anterior insula. We conclude that a trade-off between value gains and uncertainty reduction underlies maladaptive decision-making in individuals with higher normal-range anxiety.
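The behavioral trade-off described here can be illustrated with a toy choice rule in which an uncertainty bonus competes with expected value; `w_uncertainty` below is a stand-in for an anxiety-linked exploration weight, not the authors' fitted model.

```python
import numpy as np

def choose_arm(values, uncertainties, w_uncertainty):
    """Pick the bandit arm maximizing expected value plus a weighted
    uncertainty bonus; a larger weight shifts choice away from
    immediate value gains toward uncertainty reduction."""
    scores = np.asarray(values) + w_uncertainty * np.asarray(uncertainties)
    return int(np.argmax(scores))

# three arms: arm 0 has the best value, arm 2 the highest uncertainty
values = [0.8, 0.5, 0.4]
uncertainties = [0.1, 0.2, 0.9]

low_weight_choice = choose_arm(values, uncertainties, w_uncertainty=0.1)   # exploit
high_weight_choice = choose_arm(values, uncertainties, w_uncertainty=1.0)  # explore
```

With a small weight the best-valued arm wins; with a large weight the most uncertain arm is chosen, at a cost to expected earnings.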
4. Teşileanu T, Golkar S, Nasiri S, Sengupta AM, Chklovskii DB. Neural Circuits for Dynamics-Based Segmentation of Time Series. Neural Comput 2022; 34:891-938. PMID: 35026035; DOI: 10.1162/neco_a_01476.
Abstract
The brain must extract behaviorally relevant latent variables from the signals streamed by the sensory organs. Such latent variables are often encoded in the dynamics that generated the signal rather than in the specific realization of the waveform. Therefore, one problem faced by the brain is to segment time series based on underlying dynamics. We present two algorithms for performing this segmentation task that are biologically plausible, which we define as acting in a streaming setting and all learning rules being local. One algorithm is model based and can be derived from an optimization problem involving a mixture of autoregressive processes. This algorithm relies on feedback in the form of a prediction error and can also be used for forecasting future samples. In some brain regions, such as the retina, the feedback connections necessary to use the prediction error for learning are absent. For this case, we propose a second, model-free algorithm that uses a running estimate of the autocorrelation structure of the signal to perform the segmentation. We show that both algorithms do well when tasked with segmenting signals drawn from autoregressive models with piecewise-constant parameters. In particular, the segmentation accuracy is similar to that obtained from oracle-like methods in which the ground-truth parameters of the autoregressive models are known. We also test our methods on data sets generated by alternating snippets of voice recordings. We provide implementations of our algorithms at https://github.com/ttesileanu/bio-time-series.
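The oracle-like benchmark mentioned above, segmenting by assigning each sample to whichever autoregressive model predicts it best, reduces to a few lines for AR(1) processes with known coefficients. This is a batch sketch for intuition, not the paper's streaming, biologically plausible learning rules (see the linked repository for those).

```python
import numpy as np

def segment_by_ar_error(x, ar_coeffs):
    """Label each sample with the index of the AR(1) model whose
    one-step prediction has the smallest squared error."""
    x = np.asarray(x, dtype=float)
    labels = []
    for t in range(1, len(x)):
        errors = [(x[t] - a * x[t - 1]) ** 2 for a in ar_coeffs]
        labels.append(int(np.argmin(errors)))
    return labels

# piecewise-constant dynamics: AR coefficient 0.9, then -0.9
rng = np.random.default_rng(0)
x = [1.0]
for t in range(1, 200):
    a = 0.9 if t < 100 else -0.9
    x.append(a * x[-1] + 0.1 * rng.standard_normal())

labels = segment_by_ar_error(x, ar_coeffs=[0.9, -0.9])
```

Most samples in each half are assigned to the correct generating model; errors cluster near zero crossings, where the two models predict similarly.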
Affiliation(s)
- Tiberiu Teşileanu
- Center for Computational Neuroscience, Flatiron Institute, New York, NY 10010, U.S.A.
- Siavash Golkar
- Center for Computational Neuroscience, Flatiron Institute, New York, NY 10010, U.S.A.
- Samaneh Nasiri
- Department of Neurology, Harvard Medical School, Boston, MA 02115, U.S.A.
- Anirvan M Sengupta
- Center for Computational Neuroscience, Flatiron Institute, New York, NY 10010, and Department of Physics and Astronomy, Rutgers University, Piscataway, NJ 08854, U.S.A.
- Dmitri B Chklovskii
- Center for Computational Neuroscience, Flatiron Institute, New York, NY 10010, and Neuroscience Institute, NYU Langone Medical Center, New York, NY, U.S.A.
5. Reward and fictive prediction error signals in ventral striatum: asymmetry between factual and counterfactual processing. Brain Struct Funct 2021; 226:1553-1569. PMID: 33839955; DOI: 10.1007/s00429-021-02270-3.
Abstract
Reward prediction error, the difference between the expected and obtained reward, is known to act as a reinforcement learning neural signal. In the current study, we propose a model-fitting approach that combines behavioral and neural data to fit computational models of reinforcement learning. Briefly, we penalized subject-specific fitted parameters that moved too far away from the group median, except when that deviation led to an improvement in the model's fit to neural responses. Using a probabilistic monetary learning task and fMRI, we compared our approach with standard model-fitting methods. Q-learning outperformed actor-critic at both the behavioral and neural level, although the inclusion of neuroimaging data into model fitting improved the fit of actor-critic models. We observed both action-value and state-value prediction error signals in the striatum, whereas standard model-fitting approaches failed to capture state-value signals. Finally, the left ventral striatum correlated with reward prediction error and the right ventral striatum with fictive prediction error, suggesting a functional hemispheric asymmetry in prediction-error-driven learning.
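The median-penalty idea can be written as a per-subject objective: behavioral negative log-likelihood plus a penalty on deviation from the group median. The sketch below is schematic and omits the exception the authors describe, in which deviation is tolerated when it improves the fit to neural responses; the names and the penalty form (L1, weight `lam`) are illustrative assumptions.

```python
import numpy as np

def penalized_loss(neg_log_liks, params, lam=1.0):
    """Per-subject loss: behavioral negative log-likelihood plus an L1
    penalty on each subject's parameter deviating from the group median."""
    params = np.asarray(params, dtype=float)
    penalty = lam * np.abs(params - np.median(params))
    return np.asarray(neg_log_liks, dtype=float) + penalty

# three subjects: subject 1's learning rate strays far from the median
losses = penalized_loss(neg_log_liks=[10.0, 12.0, 9.0],
                        params=[0.30, 0.90, 0.35])
```

The outlying subject is charged the largest penalty, shrinking individual estimates toward the group unless the data argue otherwise.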
6. Thompson K, Nahmias E, Fani N, Kvaran T, Turner J, Tone E. The Prisoner's Dilemma paradigm provides a neurobiological framework for the social decision cascade. PLoS One 2021; 16:e0248006. PMID: 33735226; PMCID: PMC7971531; DOI: 10.1371/journal.pone.0248006.
Abstract
To function during social interactions, we must be able to consider and coordinate our actions with other people's perspectives. This process unfolds from decision-making, to anticipation of that decision's consequences, to feedback about those consequences, in what can be described as a "cascade" of three phases. The iterated Prisoner's Dilemma (iPD) task, an economic-exchange game used to illustrate how people achieve stable cooperation over repeated interactions, provides a framework for examining this "social decision cascade". In the present study, we examined neural activity associated with the three phases of the cascade, which can be isolated during iPD game rounds. While undergoing functional magnetic resonance imaging (fMRI), 31 adult participants a) decided whether to cooperate with a co-player for a monetary reward, b) anticipated the co-player's decision, and c) learned the co-player's decision. Across all three phases, participants recruited the temporoparietal junction (TPJ) and the dorsomedial prefrontal cortex (dmPFC), regions implicated in numerous facets of social reasoning such as perspective-taking and the judgement of intentions. A common distributed neural network underlay both decision-making and feedback appraisal, although the two phases differed in the magnitude of recruitment. Furthermore, there was limited evidence that anticipation following a decision to defect evoked a neural signature distinct from that following a decision to cooperate. This study is the first to delineate the neural substrates of the entire social decision cascade in the context of the iPD game.
Affiliation(s)
- Khalil Thompson
- Department of Psychology, Georgia State University, Atlanta, Georgia, United States of America
- Eddy Nahmias
- Department of Psychology, Georgia State University, Atlanta, Georgia, United States of America
- Negar Fani
- Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, Georgia, United States of America
- Trevor Kvaran
- Department of Psychology, Georgia State University, Atlanta, Georgia, United States of America
- Jessica Turner
- Department of Psychology, Georgia State University, Atlanta, Georgia, United States of America
- Erin Tone
- Department of Psychology, Georgia State University, Atlanta, Georgia, United States of America
7. Wang L, Yang G, Zheng Y, Li Z, Qi Y, Li Q, Liu X. Enhanced neural responses in specific phases of reward processing in individuals with Internet gaming disorder. J Behav Addict 2021; 10:99-111. PMID: 33570505; PMCID: PMC8969865; DOI: 10.1556/2006.2021.00003.
Abstract
Background and aims: Internet gaming disorder (IGD) has become a global health problem. The self-regulation model holds that a shift toward the reward system, whether due to overwhelming reward-seeking or impaired control, can lead to self-regulation failures such as addiction. The present study focused on reward processing in IGD, aiming to provide insights into its etiology. Reward processing includes three phases: reward anticipation, outcome monitoring, and choice evaluation. It is not yet clear in which of these phases individuals with IGD differ from healthy controls (HC).
Methods: To address this issue, 27 individuals with IGD and 26 HC completed a roulette task during a functional MRI scan.
Results: Compared with HC, individuals with IGD preferred to take risks in pursuit of high rewards behaviorally and showed exaggerated activity in the striatum (nucleus accumbens and caudate) during reward anticipation and outcome monitoring, but not during choice evaluation.
Discussion: These results suggest that oversensitivity of the reward system to potential and positive rewards in college students with IGD drives them to approach risky options more frequently, even though they can assess the risk values of options and the correctness of decisions as properly as HC do.
Conclusions: These findings provide partial support for the application of the self-regulation model to the IGD population. Moreover, the study enriches this model from the perspective of the three phases of reward processing and provides specific targets for future research on effective treatment of IGD.
Affiliation(s)
- Lingxiao Wang
- Department of Psychology, Beijing Key Laboratory of Learning and Cognition, Capital Normal University, Beijing, China
- Center for Cognition and Brain Disorders, The Affiliated Hospital of Hangzhou Normal University, Hangzhou, Zhejiang Province, China
- Institutes of Psychological Sciences, Hangzhou Normal University, Hangzhou, China
- Zhejiang Key Laboratory for Research in Assessment of Cognitive Impairments, Hangzhou, Zhejiang Province, China
- Guochun Yang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Ya Zheng
- Department of Psychology, Dalian Medical University, Dalian, China
- Zhenghan Li
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Yue Qi
- Department of Psychology, Renmin University of China, Beijing, China
- Laboratory of the Department of Psychology, Renmin University of China, Beijing, China
- Qi Li
- Department of Psychology, Beijing Key Laboratory of Learning and Cognition, Capital Normal University, Beijing, China
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Xun Liu
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
8. Smillie LD. What is reinforcement sensitivity? Neuroscience paradigms for approach-avoidance process theories of personality. Eur J Pers 2020. DOI: 10.1002/per.674.
Abstract
Reinforcement sensitivity is a concept proposed by Gray (1973) to describe the biological antecedents of personality, and has become the common mechanism among a family of personality theories concerning approach and avoidance processes. These theories suggest that 2–3 biobehavioural systems mediate the effects of reward and punishment on emotion and motivation, and that individual differences in the functioning of these systems manifest as personality. Identifying paradigms for operationalising reinforcement sensitivity is therefore critical for testing and developing these theories, and evaluating their footprint in personality space. In this paper I suggest that, while traditional self‐report paradigms in personality psychology may be less‐than‐ideal for this purpose, neuroscience paradigms may offer operations of reinforcement sensitivity at multiple levels of approach and avoidance processes. After brief reflection on the use of such methods in animal models—which first spawned the concept of reinforcement sensitivity—recent developments in four domains of neuroscience are reviewed. These are psychogenomics, psychopharmacology, neuroimaging and category‐learning. By exploring these paradigms as potential operations of reinforcement sensitivity we may enrich our understanding of the putative biobehavioural bases of personality. Copyright © 2008 John Wiley & Sons, Ltd.
Affiliation(s)
- Luke D. Smillie
- Department of Psychology, Goldsmiths, University of London, London, UK
9. Smillie LD. The conceptualisation, measurement and scope of reinforcement sensitivity in the context of a neuroscience of personality. Eur J Pers 2020. DOI: 10.1002/per.687.
Abstract
Reinforcement sensitivity theory (RST) is complex, and there are subtle differences between RST and other approach‐avoidance process theories of personality. However, most such theories posit a common biobehavioural mechanism underlying personality which we must therefore strive to understand: differential sensitivity to reinforcing stimuli. Reinforcement sensitivity is widely assessed using questionnaires, but should we treat such measures as (a) a proxy for reinforcement sensitivity itself (i.e. the underlying causes of personality) or (b) trait constructs potentially manifesting out of reinforcement sensitivity (i.e. the ‘surface’ of personality)? Might neuroscience paradigms, such as those I have reviewed in my target paper, provide an advantage over questionnaires in allowing us to move closer to (a), thereby improving both the measurement and our understanding of reinforcement sensitivity? Assuming we can achieve this, how useful is reinforcement sensitivity—and biological perspectives more generally—for explaining personality? These are the major questions raised in the discussion of my target paper, and among the most pertinent issues in this field today. Copyright © 2008 John Wiley & Sons, Ltd.
Affiliation(s)
- Luke D. Smillie
- Department of Psychology, Goldsmiths, University of London, UK
10. Sevel L, Stennett B, Schneider V, Bush N, Nixon SJ, Robinson M, Boissoneault J. Acute Alcohol Intake Produces Widespread Decreases in Cortical Resting Signal Variability in Healthy Social Drinkers. Alcohol Clin Exp Res 2020; 44:1410-1419. PMID: 32472620; PMCID: PMC7572592; DOI: 10.1111/acer.14381.
Abstract
Background: Acute alcohol intoxication has wide-ranging neurobehavioral effects on psychomotor, attentional, inhibitory, and memory-related cognitive processes. These effects are mirrored in disruption of neural metabolism, functional activation, and functional network coherence. Metrics of intraregional neural dynamics such as regional signal variability (RSV) and brain entropy (BEN) may capture unique aspects of neural functional capacity in healthy and clinical populations; however, alcohol's influence on these metrics is unclear. The present study aimed to elucidate the influence of acute alcohol intoxication on RSV and to clarify these effects with subsequent BEN analyses.
Methods: Twenty-six healthy adults between 25 and 45 years of age (65.4% women) participated in two counterbalanced sessions. In one, participants consumed a beverage containing alcohol sufficient to produce a breath alcohol concentration of 0.08 g/dl; in the other, they consumed a placebo beverage. Approximately 35 minutes after beverage consumption, participants completed a 9-minute resting-state fMRI scan. Whole-brain, voxel-wise standard deviation was used to assess RSV, which was compared between sessions. Within clusters displaying alterations in RSV, sample entropy was calculated to assess BEN.
Results: Compared to placebo, alcohol intake resulted in widespread reductions in RSV in the bilateral middle frontal, right inferior frontal, right superior frontal, bilateral posterior cingulate, bilateral middle temporal, and right supramarginal gyri, and in the bilateral inferior parietal lobule. Within these clusters, significant reductions in BEN were found in the bilateral middle frontal and right superior frontal gyri. No effects were noted in subcortical or cerebellar areas.
Conclusions: Findings indicate that alcohol intake produces diffuse reductions in RSV among structures associated with attentional processes. Within these structures, signal complexity was also reduced in a subset of frontal regions. Neurobehavioral effects of acute alcohol consumption may be partially driven by disruption of intraregional neural dynamics in regions involved in higher-order cognitive and attentional processes.
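In schematic form, RSV is just the standard deviation of a regional time series, and BEN can be indexed by sample entropy. The single-series sketch below illustrates both metrics under these assumptions; it is not a validated neuroimaging pipeline.

```python
import numpy as np

def regional_signal_variability(ts):
    """RSV: the standard deviation of a region's resting time series."""
    return float(np.std(ts))

def sample_entropy(ts, m=2, r_frac=0.2):
    """Sample entropy as -log(A/B): B counts template pairs of length m
    within tolerance r (Chebyshev distance), A does the same for m+1.
    Lower values indicate a more regular (less complex) signal."""
    x = np.asarray(ts, dtype=float)
    r = r_frac * np.std(x)

    def count_pairs(length):
        tpl = np.array([x[i:i + length] for i in range(len(x) - length + 1)])
        n = len(tpl)
        return sum(
            np.max(np.abs(tpl[i] - tpl[j])) <= r
            for i in range(n) for j in range(i + 1, n)
        )

    b, a = count_pairs(m), count_pairs(m + 1)
    return float(-np.log(a / b)) if a > 0 and b > 0 else float("inf")

rng = np.random.default_rng(0)
t = np.arange(200)
regular = np.sin(2 * np.pi * t / 20)    # smooth oscillation
noisy = rng.standard_normal(200)        # white noise
```

A regular oscillation yields a lower sample entropy than white noise, mirroring the direction of the reduced signal complexity reported here.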
Affiliation(s)
- Landrew Sevel
- Osher Center for Integrative Medicine at Vanderbilt, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Physical Medicine and Rehabilitation, Vanderbilt University Medical Center, Nashville, TN, USA
- Bethany Stennett
- Center for Pain Research and Behavioral Health, University of Florida, Gainesville, FL, USA
- Department of Clinical and Health Psychology, University of Florida, Gainesville, FL, USA
- Victor Schneider
- Center for Pain Research and Behavioral Health, University of Florida, Gainesville, FL, USA
- Department of Clinical and Health Psychology, University of Florida, Gainesville, FL, USA
- Nicholas Bush
- Center for Pain Research and Behavioral Health, University of Florida, Gainesville, FL, USA
- Department of Clinical and Health Psychology, University of Florida, Gainesville, FL, USA
- Sara Jo Nixon
- Department of Psychiatry, University of Florida, Gainesville, FL, USA
- Michael Robinson
- Center for Pain Research and Behavioral Health, University of Florida, Gainesville, FL, USA
- Department of Clinical and Health Psychology, University of Florida, Gainesville, FL, USA
- Jeff Boissoneault
- Center for Pain Research and Behavioral Health, University of Florida, Gainesville, FL, USA
- Department of Clinical and Health Psychology, University of Florida, Gainesville, FL, USA
11. Klasen M, Mathiak KA, Zvyagintsev M, Sarkheil P, Weber R, Mathiak K. Selective reward responses to violent success events during video games. Brain Struct Funct 2020; 225:57-69. DOI: 10.1007/s00429-019-01986-7.
12. Beltzer ML, Adams S, Beling PA, Teachman BA. Social anxiety and dynamic social reinforcement learning in a volatile environment. Clin Psychol Sci 2019; 7:1372-1388. PMID: 32864197; DOI: 10.1177/2167702619858425.
Abstract
Adaptive social behavior requires learning probabilities of social reward and punishment, and updating these probabilities when they change. Given prior research on aberrant reinforcement learning in affective disorders, this study examines how social anxiety affects probabilistic social reinforcement learning and dynamic updating of learned probabilities in a volatile environment. N=222 online participants completed questionnaires and a computerized ball-catching game with changing probabilities of reward and punishment. Dynamic learning rates were estimated to assess the relative importance ascribed to new information in response to volatility. Mixed-effects regression was used to analyze throw patterns as a function of social anxiety symptoms. Higher social anxiety predicted fewer throws to the previously punishing avatar and different learning rates after certain role changes, suggesting that social anxiety may be characterized by difficulty updating learned social probabilities. Socially anxious individuals may miss the chance to learn that a once-punishing situation no longer poses a threat.
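Dynamic learning rates of the kind estimated here are often modeled with associability-style updates in which surprise transiently raises the learning rate. The following generic Pearce-Hall style sketch (an illustration, not the study's fitted model) shows how the rate can rebound after an abrupt reversal in a volatile environment.

```python
import numpy as np

def pearce_hall_learning(outcomes, eta=0.3, alpha0=0.5, v0=0.5):
    """Track an outcome probability with an associability-gated learning
    rate: alpha decays while outcomes are predictable and rises in
    proportion to the absolute prediction error after surprises."""
    v, alpha = v0, alpha0
    vs, alphas = [], []
    for o in outcomes:
        delta = o - v
        v = v + alpha * delta                         # prediction update
        alpha = (1 - eta) * alpha + eta * abs(delta)  # associability update
        vs.append(v)
        alphas.append(alpha)
    return np.array(vs), np.array(alphas)

# a stable block of rewards, then an abrupt reversal (volatility)
outcomes = [1] * 20 + [0] * 5
vs, alphas = pearce_hall_learning(outcomes)
```

During the stable block the learning rate decays; at the reversal the prediction error spikes and the rate rebounds, so new information is weighted most heavily exactly when the environment changes.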
13. Tyler P, White SF, Thompson RW, Blair R. Applying a Cognitive Neuroscience Perspective to Disruptive Behavior Disorders: Implications for Schools. Dev Neuropsychol 2019; 44:17-42. PMID: 29432037; PMCID: PMC6283690; DOI: 10.1080/87565641.2017.1334782.
Abstract
A cognitive neuroscience perspective seeks to understand behavior, in this case disruptive behavior disorders (DBD), in terms of dysfunction in cognitive processes underpinned by neural processes. While this type of approach has clear implications for clinical mental health practice, it also has implications for school-based assessment and intervention with children and adolescents who have disruptive behavior and aggression. This review articulates a cognitive neuroscience account of DBD by discussing the neurocognitive dysfunction related to emotional empathy, threat sensitivity, reinforcement-based decision-making, and response inhibition. The potential implications for current and future classroom-based assessments and interventions for students with these deficits are discussed.
Affiliation(s)
- Patrick Tyler
- Center for Neurobehavioral Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
- Boys Town National Research Institute, Boys Town, Nebraska, USA
- Stuart F. White
- Center for Neurobehavioral Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
- R.J.R. Blair
- Center for Neurobehavioral Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
14. Bowers ME, Buzzell GA, Bernat EM, Fox NA, Barker TV. Time-frequency approaches to investigating changes in feedback processing during childhood and adolescence. Psychophysiology 2018; 55:e13208. PMID: 30112814; DOI: 10.1111/psyp.13208.
Abstract
Processing feedback from the environment is an essential function during development, allowing behavior to adapt in advantageous ways. One measure of feedback processing, the feedback negativity (FN), is an ERP observed following the presentation of feedback. Findings detailing developmental changes in the FN have been mixed, possibly due to limitations of traditional ERP measurement methods. Recent work shows that both theta and delta frequency activity contribute to the FN; using time-frequency methods to measure changes in power and phase in these bands may provide a more accurate picture of how feedback processing develops in childhood and adolescence. We employ time-frequency power and intertrial phase synchrony measures, in addition to conventional time-domain ERP methods, to examine the development of feedback processing in the theta (4-7 Hz) and delta (0.1-3 Hz) bands throughout adolescence. A sample of 54 female participants (8-17 years old) completed a gambling task while EEG was recorded. As expected, time-domain ERP amplitudes showed no association with age. In contrast, significant effects were observed for the time-frequency measures, with theta power decreasing and delta power increasing with age. For intertrial phase synchrony, delta synchrony increased with age, while age-related changes in theta synchrony differed for gains and losses. Collectively, these findings highlight the importance of considering time-frequency dynamics when exploring how feedback processing develops through late childhood and adolescence. In particular, delta-band activity and theta synchrony appear central to understanding age-related changes in the neural response to feedback.
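The two time-frequency measures can be illustrated at a single probe frequency: per-trial spectral power, and intertrial phase synchrony as the length of the average unit phasor across trials. The sketch below uses a plain Fourier coefficient rather than the wavelet decomposition typical of such analyses; names and parameters are illustrative.

```python
import numpy as np

def power_and_itps(trials, freq, srate):
    """Mean spectral power and intertrial phase synchrony (ITPS) at one
    frequency, from each trial's complex Fourier coefficient. ITPS
    discards amplitude and measures phase consistency across trials."""
    trials = np.asarray(trials, dtype=float)
    n_trials, n_samples = trials.shape
    t = np.arange(n_samples) / srate
    basis = np.exp(-2j * np.pi * freq * t)
    coeffs = trials @ basis / n_samples              # one coefficient per trial
    power = np.mean(np.abs(coeffs) ** 2)
    itps = np.abs(np.mean(coeffs / np.abs(coeffs)))  # phase-only average
    return power, itps

srate, f = 250.0, 5.0                                # theta-band probe at 5 Hz
t = np.arange(250) / srate
# phase-locked trials: identical 5 Hz phase on every trial -> ITPS near 1
locked = np.array([np.sin(2 * np.pi * f * t) for _ in range(30)])
rng = np.random.default_rng(1)
# random-phase trials: same power, scattered phases -> ITPS near 0
jittered = np.array([np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
                     for _ in range(30)])

p_locked, itps_locked = power_and_itps(locked, f, srate)
p_jit, itps_jit = power_and_itps(jittered, f, srate)
```

The two trial sets have essentially identical power but very different ITPS, which is why power and phase synchrony can dissociate developmentally.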
Affiliation(s)
- M E Bowers
- Neuroscience & Cognitive Science Program, University of Maryland, College Park, Maryland, USA
- G A Buzzell
- Department of Human Development & Quantitative Methodology, University of Maryland, College Park, Maryland, USA
- E M Bernat
- Neuroscience & Cognitive Science Program, University of Maryland, College Park, Maryland, USA; Department of Psychology, University of Maryland, College Park, Maryland, USA
- N A Fox
- Neuroscience & Cognitive Science Program, University of Maryland, College Park, Maryland, USA; Department of Human Development & Quantitative Methodology, University of Maryland, College Park, Maryland, USA
- T V Barker
- Prevention Science Institute, University of Oregon, Eugene, Oregon, USA
15
Deverett B, Koay SA, Oostland M, Wang SSH. Cerebellar involvement in an evidence-accumulation decision-making task. eLife 2018; 7:36781. [PMID: 30102151] [PMCID: PMC6105309] [DOI: 10.7554/elife.36781]
Abstract
To make successful evidence-based decisions, the brain must rapidly and accurately transform sensory inputs into specific goal-directed behaviors. Most experimental work on this subject has focused on forebrain mechanisms. Using a novel evidence-accumulation task for mice, we performed recording and perturbation studies of crus I of the lateral posterior cerebellum, which communicates bidirectionally with numerous forebrain regions. Cerebellar inactivation led to a reduction in the fraction of correct trials. Using two-photon fluorescence imaging of calcium, we found that Purkinje cell somatic activity contained choice/evidence-related information. Decision errors were represented by dendritic calcium spikes, which in other contexts are known to drive cerebellar plasticity. We propose that cerebellar circuitry may contribute to computations that support accurate performance in this perceptual decision-making task.
Affiliation(s)
- Ben Deverett
- Department of Molecular Biology, Princeton University, Princeton, United States; Princeton Neuroscience Institute, Princeton University, Princeton, United States; Rutgers Robert Wood Johnson Medical School, Piscataway, United States
- Sue Ann Koay
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Marlies Oostland
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Samuel S-H Wang
- Department of Molecular Biology, Princeton University, Princeton, United States; Princeton Neuroscience Institute, Princeton University, Princeton, United States
16
D'Astolfo L, Rief W. Learning about Expectation Violation from Prediction Error Paradigms - A Meta-Analysis on Brain Processes Following a Prediction Error. Front Psychol 2017; 8:1253. [PMID: 28804467] [PMCID: PMC5532445] [DOI: 10.3389/fpsyg.2017.01253]
Abstract
Modifying patients' expectations by exposing them to expectation violation situations (thus maximizing the difference between the expected and the actual situational outcome) is proposed to be a crucial mechanism of therapeutic success across a variety of mental disorders. However, clinical observations suggest that patients often maintain their expectations despite experiences contradicting them. It remains unclear which information processing mechanisms lead to modification or persistence of patients' expectations. Insight into this processing could be provided by neuroimaging studies investigating the prediction error (PE, i.e., neuronal reactions to non-expected stimuli). Two methods are often used to investigate the PE: (1) paradigms in which participants passively observe PEs ("passive" paradigms) and (2) paradigms which encourage a behavioral adaptation following a PE ("active" paradigms). These paradigms are similar to the methods used to induce expectation violations in clinical settings: (1) the confrontation with an expectation violation situation and (2) an enhanced confrontation in which the patient actively challenges their expectation. We used this similarity to gain insight into the different neuronal processing of the two PE paradigms. We performed a meta-analysis contrasting neuronal activity of PE paradigms encouraging a behavioral adaptation following a PE with paradigms enforcing passiveness following a PE. We found more neuronal activity in the striatum, the insula, and the fusiform gyrus in studies encouraging behavioral adaptation following a PE. Given the involvement of the striatum and the insula in reward assessment and avoidance learning, we propose that the deliberate execution of action alternatives following a PE is associated with the integration of new information into previously existing expectations, thereby leading to expectation change. While further research is needed to assess participants' expectations directly, this study provides new insights into the information processing mechanisms following an expectation violation.
Affiliation(s)
- Lisa D'Astolfo
- Department of Clinical Psychology and Psychotherapy, Philipps University of Marburg, Marburg, Germany
- Winfried Rief
- Department of Clinical Psychology and Psychotherapy, Philipps University of Marburg, Marburg, Germany
17
Beyond negative valence: 2-week administration of a serotonergic antidepressant enhances both reward and effort learning signals. PLoS Biol 2017; 15:e2000756. [PMID: 28207733] [PMCID: PMC5331946] [DOI: 10.1371/journal.pbio.2000756]
Abstract
To make good decisions, humans need to learn about and integrate different sources of appetitive and aversive information. While serotonin has been linked to value-based decision-making, its role in learning is less clear, with acute manipulations often producing inconsistent results. Here, we show that when the effects of a selective serotonin reuptake inhibitor (SSRI, citalopram) are studied over longer timescales, learning is robustly improved. We measured brain activity with functional magnetic resonance imaging (fMRI) in volunteers as they performed a concurrent appetitive (money) and aversive (effort) learning task. We found that 2 weeks of citalopram enhanced reward and effort learning signals in a widespread network of brain regions, including ventromedial prefrontal and anterior cingulate cortex. At a behavioral level, this was accompanied by more robust reward learning. This suggests that serotonin can modulate the ability to learn via a mechanism that is independent of stimulus valence. Such effects may partly underlie SSRIs' impact in treating psychological illnesses. Our results highlight both a specific function in learning for serotonin and the importance of studying its role across longer timescales. Drugs acting on the neurotransmitter serotonin in the brain are commonly prescribed to treat depression, but we still lack a complete understanding of their effects on the brain and behavior. We do, however, know that patients who suffer from depression learn about the links between their choices and pleasant and unpleasant outcomes in a different manner than healthy controls. Neural markers of learning are also weakened in depressed people. Here, we looked at the effects of a short-term course (2 weeks) of a serotonergic antidepressant on brain and behavior in healthy volunteers while they learnt to predict what consequences their choices had in a simple computer task. We found that the antidepressant increased how strongly brain areas concerned with predictions of pleasant and unpleasant consequences became active during learning of the task. At the same time, participants who had taken the antidepressant also performed better on the task. Our results suggest that serotonergic drugs might exert their beneficial clinical effects by changing how the brain learns.
18
Nassar MR, Frank MJ. Taming the beast: extracting generalizable knowledge from computational models of cognition. Curr Opin Behav Sci 2016; 11:49-54. [PMID: 27574699] [DOI: 10.1016/j.cobeha.2016.04.003]
Abstract
Generalizing knowledge from experimental data requires constructing theories capable of explaining observations and extending beyond them. Computational modeling offers formal quantitative methods for generating and testing theories of cognition and neural processing. These techniques can be used to extract general principles from specific experimental measurements, but they introduce dangers inherent to theory: model-based analyses are conditioned on a set of fixed assumptions that impact the interpretation of experimental data. When these conditions are not met, model-based results can be misleading or biased. Recent work in computational modeling has highlighted the implications of this problem and developed new methods for minimizing its negative impact. Here we discuss the issues that arise when data are interpreted through models, and strategies for avoiding the misinterpretation of data through model fitting.
Affiliation(s)
- Matthew R Nassar
- Department of Cognitive, Linguistic and Psychological Sciences, Brown Institute for Brain Science, Brown University, Providence, RI 02912-1821, USA
- Michael J Frank
- Department of Cognitive, Linguistic and Psychological Sciences, Brown Institute for Brain Science, Brown University, Providence, RI 02912-1821, USA
19
Sepulveda P, Sitaram R, Rana M, Montalba C, Tejos C, Ruiz S. How feedback, motor imagery, and reward influence brain self-regulation using real-time fMRI. Hum Brain Mapp 2016; 37:3153-71. [PMID: 27272616] [DOI: 10.1002/hbm.23228]
Abstract
The learning process involved in achieving brain self-regulation is presumed to be related to several factors, such as type of feedback, reward, mental imagery, and duration of training. Explicitly instructing participants to use mental imagery, and offering monetary reward, are common practices in real-time fMRI (rtfMRI) neurofeedback (NF), under the assumption that they will enhance and accelerate the learning process. However, it is still not clear what the optimal strategy is for improving volitional control. We investigated the differential effects of feedback, explicit instructions, and monetary reward while training healthy individuals to up-regulate the blood-oxygen-level dependent (BOLD) signal in the supplementary motor area (SMA). Four groups were trained in a two-day rtfMRI-NF protocol: GF with NF only, GF,I with NF + explicit instructions (motor imagery), GF,R with NF + monetary reward, and GF,I,R with NF + explicit instructions (motor imagery) + monetary reward. Our results showed that GF significantly increased their BOLD self-regulation from day 1 to day 2, and that GF,R showed the highest BOLD signal amplitude in the SMA during training. The two groups that were instructed to use motor imagery did not show a significant learning effect over the 2 days. The additional factors, namely motor imagery and reward, tended to increase intersubject variability in the SMA over the course of training. Whole-brain univariate and functional connectivity analyses showed common as well as distinct patterns in the four groups, representing the varied influences of feedback, reward, and instructions on the brain.
Affiliation(s)
- Pradyumna Sepulveda
- Biomedical Imaging Center, Pontificia Universidad Católica de Chile, Santiago, Chile; Department of Electrical Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile; Laboratory of Brain-Machine Interfaces and Neuromodulation, Pontificia Universidad Católica de Chile, Santiago, Chile
- Ranganatha Sitaram
- Laboratory of Brain-Machine Interfaces and Neuromodulation, Pontificia Universidad Católica de Chile, Santiago, Chile; Institute for Biological and Medical Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile; Department of Psychiatry, Faculty of Medicine, Interdisciplinary Center for Neuroscience, Pontificia Universidad Católica de Chile, Santiago, Chile; Institute of Medical Psychology and Behavioral Neurobiology, University of Tübingen, Tübingen, Germany
- Mohit Rana
- Laboratory of Brain-Machine Interfaces and Neuromodulation, Pontificia Universidad Católica de Chile, Santiago, Chile; Department of Psychiatry, Faculty of Medicine, Interdisciplinary Center for Neuroscience, Pontificia Universidad Católica de Chile, Santiago, Chile
- Cristian Montalba
- Biomedical Imaging Center, Pontificia Universidad Católica de Chile, Santiago, Chile
- Cristian Tejos
- Biomedical Imaging Center, Pontificia Universidad Católica de Chile, Santiago, Chile; Department of Electrical Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
- Sergio Ruiz
- Laboratory of Brain-Machine Interfaces and Neuromodulation, Pontificia Universidad Católica de Chile, Santiago, Chile; Department of Psychiatry, Faculty of Medicine, Interdisciplinary Center for Neuroscience, Pontificia Universidad Católica de Chile, Santiago, Chile; Institute of Medical Psychology and Behavioral Neurobiology, University of Tübingen, Tübingen, Germany
20
Reinforcement learning models and their neural correlates: An activation likelihood estimation meta-analysis. Cogn Affect Behav Neurosci 2015; 15:435-59. [PMID: 25665667] [DOI: 10.3758/s13415-015-0338-7]
Abstract
Reinforcement learning describes motivated behavior in terms of two abstract signals. The representation of discrepancies between expected and actual rewards/punishments, the prediction error, is thought to update the expected value of actions and predictive stimuli. Electrophysiological and lesion studies have suggested that mesostriatal prediction error signals control behavior through synaptic modification of cortico-striato-thalamic networks. Signals in the ventromedial prefrontal and orbitofrontal cortex are implicated in representing expected value. To obtain unbiased maps of these representations in the human brain, we performed a meta-analysis of functional magnetic resonance imaging studies that had employed algorithmic reinforcement learning models across a variety of experimental paradigms. We found that the ventral striatum (medial and lateral) and midbrain/thalamus represented reward prediction errors, consistent with animal studies. Prediction error signals were also seen in the frontal operculum/insula, particularly for social rewards. In Pavlovian studies, striatal prediction error signals extended into the amygdala, whereas instrumental tasks engaged the caudate. Prediction error maps were sensitive to the model-fitting procedure (fixed or individually estimated) and to the extent of spatial smoothing. A correlate of expected value was found in a posterior region of the ventromedial prefrontal cortex, caudal and medial to the orbitofrontal regions identified in animal studies. These findings highlight a reproducible motif of reinforcement learning in the cortico-striatal loops and identify methodological dimensions that may influence the reproducibility of activation patterns across studies.
21
Effects of intrinsic motivation on feedback processing during learning. Neuroimage 2015; 119:175-86. [PMID: 26112370] [DOI: 10.1016/j.neuroimage.2015.06.046]
Abstract
Learning commonly requires feedback about the consequences of one's actions, which can drive learners to modify their behavior. Motivation may determine how sensitive an individual might be to such feedback, particularly in educational contexts where some students value academic achievement more than others. Thus, motivation for a task might influence the value placed on performance feedback and how effectively it is used to improve learning. To investigate the interplay between intrinsic motivation and feedback processing, we used functional magnetic resonance imaging (fMRI) during feedback-based learning before and after a novel manipulation based on motivational interviewing, a technique for enhancing treatment motivation in mental health settings. Because of its role in the reinforcement learning system, the striatum is situated to play a significant role in the modulation of learning based on motivation. Consistent with this idea, motivation levels during the task were associated with sensitivity to positive versus negative feedback in the striatum. Additionally, heightened motivation following a brief motivational interview was associated with increases in feedback sensitivity in the left medial temporal lobe. Our results suggest that motivation modulates neural responses to performance-related feedback, and furthermore that changes in motivation facilitate processing in areas that support learning and memory.
22
Chen C, Takahashi T, Nakagawa S, Inoue T, Kusumi I. Reinforcement learning in depression: A review of computational research. Neurosci Biobehav Rev 2015; 55:247-67. [PMID: 25979140] [DOI: 10.1016/j.neubiorev.2015.05.005]
Abstract
Despite being considered primarily a mood disorder, major depressive disorder (MDD) is characterized by cognitive and decision-making deficits. Recent research has employed computational models of reinforcement learning (RL) to address these deficits. The computational approach has the advantage of making explicit predictions about learning and behavior, specifying the process parameters of RL, differentiating between model-free and model-based RL, and enabling computational model-based analyses of functional magnetic resonance imaging and electroencephalography data. These merits have fostered the emerging field of computational psychiatry, and here we review specific studies that focused on MDD. Considerable evidence suggests that MDD is associated with impaired brain signals of reward prediction error and expected value ('wanting'), decreased reward sensitivity ('liking'), and/or impaired learning (be it model-free or model-based), although the causality remains unclear. These parameters may serve as valuable intermediate phenotypes of MDD, linking general clinical symptoms to underlying molecular dysfunctions. We believe future computational research at the clinical, systems, and cellular/molecular/genetic levels will propel us toward a better understanding of the disease.
Affiliation(s)
- Chong Chen
- Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
- Taiki Takahashi
- Department of Behavioral Science/Center for Experimental Research in Social Sciences, Hokkaido University, Sapporo 060-0810, Japan
- Shin Nakagawa
- Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
- Takeshi Inoue
- Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
- Ichiro Kusumi
- Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan
23
Abstract
Interindividual differences in the effects of reward on performance are prevalent and poorly understood, with some individuals being more dependent than others on the rewarding outcomes of their actions. The origin of this variability in reward dependence is unknown. Here, we tested the relationship between reward dependence and brain structure in healthy humans. Subjects trained on a visuomotor skill-acquisition task and received performance feedback in the presence or absence of reward. Reward dependence was defined as the statistical trial-by-trial relation between reward and subsequent performance. We report a significant relationship between reward dependence and the lateral prefrontal cortex, where regional gray-matter volume predicted reward dependence but not feedback alone. Multivoxel pattern analysis confirmed the anatomical specificity of this relationship. These results identified a likely anatomical marker for the prospective influence of reward on performance, which may be of relevance in neurorehabilitative settings.
24
Friedel E, Schlagenhauf F, Beck A, Dolan RJ, Huys QJ, Rapp MA, Heinz A. The effects of life stress and neural learning signals on fluid intelligence. Eur Arch Psychiatry Clin Neurosci 2015; 265:35-43. [PMID: 25142177] [PMCID: PMC4311068] [DOI: 10.1007/s00406-014-0519-3]
Abstract
Fluid intelligence (fluid IQ), defined as the capacity for rapid problem solving and behavioral adaptation, is known to be modulated by learning and experience. Both stressful life events (SLES) and neural correlates of learning [specifically, a key mediator of adaptive learning in the brain, namely the ventral striatal representation of prediction errors (PE)] have been shown to be associated with individual differences in fluid IQ. Here, we examine the interaction between adaptive learning signals (using a well-characterized probabilistic reversal learning task in combination with fMRI) and SLES on fluid IQ measures. We find that the correlation between ventral striatal BOLD PE and fluid IQ, which we have previously reported, is quantitatively modulated by the amount of reported SLES. Thus, after experiencing adversity, basic neuronal learning signatures appear to align more closely with a general measure of flexible learning (fluid IQ), a finding complementing studies on the effects of acute stress on learning. The results suggest that an understanding of the neurobiological correlates of trait variables like fluid IQ needs to take socioemotional influences such as chronic stress into account.
Affiliation(s)
- Eva Friedel
- Department of Psychiatry and Psychotherapy, Charité – Universitätsmedizin Berlin, Campus Charité Mitte, Charitéplatz 1, 10117 Berlin, Germany
- Florian Schlagenhauf
- Department of Psychiatry and Psychotherapy, Charité – Universitätsmedizin Berlin, Campus Charité Mitte, Charitéplatz 1, 10117 Berlin, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Anne Beck
- Department of Psychiatry and Psychotherapy, Charité – Universitätsmedizin Berlin, Campus Charité Mitte, Charitéplatz 1, 10117 Berlin, Germany
- Raymond J. Dolan
- Gatsby Computational Neuroscience Unit, University College London, London, UK
- Quentin J.M. Huys
- Gatsby Computational Neuroscience Unit, University College London, London, UK; Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland; Department of Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Psychiatry, Zurich, Switzerland
- Michael A. Rapp
- Department of Psychiatry and Psychotherapy, Charité – Universitätsmedizin Berlin, Campus Charité Mitte, Charitéplatz 1, 10117 Berlin, Germany; Social and Preventive Medicine, University of Potsdam, Potsdam, Germany
- Andreas Heinz
- Department of Psychiatry and Psychotherapy, Charité – Universitätsmedizin Berlin, Campus Charité Mitte, Charitéplatz 1, 10117 Berlin, Germany; Cluster of Excellence NeuroCure, Charité – Universitätsmedizin Berlin, Berlin, Germany
25
Deserno L, Beck A, Huys QJM, Lorenz RC, Buchert R, Buchholz HG, Plotkin M, Kumakara Y, Cumming P, Heinze HJ, Grace AA, Rapp MA, Schlagenhauf F, Heinz A. Chronic alcohol intake abolishes the relationship between dopamine synthesis capacity and learning signals in the ventral striatum. Eur J Neurosci 2014; 41:477-86. [PMID: 25546072] [DOI: 10.1111/ejn.12802]
Abstract
Drugs of abuse elicit dopamine release in the ventral striatum, possibly biasing dopamine-driven reinforcement learning towards drug-related reward at the expense of non-drug-related reward. Indeed, in alcohol-dependent patients, reactivity in dopaminergic target areas is shifted from non-drug-related stimuli towards drug-related stimuli. Such 'hijacked' dopamine signals may impair flexible learning from non-drug-related rewards, and thus promote craving for the drug of abuse. Here, we used functional magnetic resonance imaging to measure ventral striatal activation by reward prediction errors (RPEs) during a probabilistic reversal learning task in recently detoxified alcohol-dependent patients and healthy controls (N = 27). All participants also underwent 6-[18F]fluoro-DOPA positron emission tomography to assess ventral striatal dopamine synthesis capacity. Neither ventral striatal activation by RPEs nor striatal dopamine synthesis capacity differed between groups. However, ventral striatal coding of RPEs correlated inversely with craving in patients. Furthermore, we found a negative correlation between ventral striatal coding of RPEs and dopamine synthesis capacity in healthy controls, but not in alcohol-dependent patients. Moderator analyses showed that the magnitude of the association between dopamine synthesis capacity and RPE coding depended on the amount of chronic, habitual alcohol intake. Despite the relatively small sample size, a power analysis supports the reported results. Using a multimodal imaging approach, this study suggests that dopaminergic modulation of neural learning signals is disrupted in alcohol dependence in proportion to long-term alcohol intake of patients. Alcohol intake may perpetuate itself by interfering with dopaminergic modulation of neural learning signals in the ventral striatum, thus increasing craving for habitual drug intake.
Affiliation(s)
- Lorenz Deserno
- Department of Psychiatry and Psychotherapy, Charité-Universitätsmedizin Berlin, Campus Mitte, Charitéplatz 1, 10117, Berlin, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany
26
Chen C. Intelligence moderates reinforcement learning: a mini-review of the neural evidence. J Neurophysiol 2014; 113:3459-61. [PMID: 25185818] [DOI: 10.1152/jn.00600.2014]
Abstract
Our understanding of the neural basis of reinforcement learning and intelligence, two key factors contributing to human strivings, has progressed significantly recently. However, the overlap of these two lines of research, namely, how intelligence affects neural responses during reinforcement learning, remains uninvestigated. A mini-review of three existing studies suggests that higher IQ (especially fluid IQ) may enhance the neural signal of positive prediction error in dorsolateral prefrontal cortex, dorsal anterior cingulate cortex, and striatum, several brain substrates of reinforcement learning or intelligence.
Affiliation(s)
- Chong Chen
- Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo, Japan
27
Daniel R, Pollmann S. A universal role of the ventral striatum in reward-based learning: evidence from human studies. Neurobiol Learn Mem 2014; 114:90-100. [PMID: 24825620] [DOI: 10.1016/j.nlm.2014.05.002]
Abstract
Reinforcement learning enables organisms to adjust their behavior in order to maximize rewards. Electrophysiological recordings of dopaminergic midbrain neurons have shown that they code the difference between actual and predicted rewards, i.e., the reward prediction error, in many species. This error signal is conveyed to both the striatum and cortical areas and is thought to play a central role in learning to optimize behavior. However, in human daily life rewards are diverse and often only indirect feedback is available. Here we explore the range of rewards that are processed by the dopaminergic system in human participants, and examine whether it is also involved in learning in the absence of explicit rewards. While results from electrophysiological recordings in humans are sparse, evidence linking dopaminergic activity to the metabolic signal recorded from the midbrain and striatum with functional magnetic resonance imaging (fMRI) is available. Results from fMRI studies suggest that the human ventral striatum (VS) receives valuation information for a diverse set of rewarding stimuli. These range from simple primary reinforcers such as juice, through abstract social rewards, to internally generated signals of perceived correctness, suggesting that the VS is involved in learning from trial and error irrespective of the specific nature of the provided rewards. In addition, we summarize evidence that the VS can also be implicated when learning from observing others, and in tasks that go beyond simple stimulus-action-outcome learning, indicating that the reward system is also recruited in more complex learning tasks.
Affiliation(s)
- Reka Daniel
- Department of Experimental Psychology, Otto-von-Guericke-Universität Magdeburg, D-39016 Magdeburg, Germany; Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08540, USA
- Stefan Pollmann
- Department of Experimental Psychology, Otto-von-Guericke-Universität Magdeburg, D-39016 Magdeburg, Germany; Center for Behavioral Brain Sciences, D-39016 Magdeburg, Germany
28
Chuang LY, Huang CJ, Hung TM. The differences in frontal midline theta power between successful and unsuccessful basketball free throws of elite basketball players. Int J Psychophysiol 2013; 90:321-8. [PMID: 24126125] [DOI: 10.1016/j.ijpsycho.2013.10.002]
Abstract
During the preparatory period of a motor skill, attention is considered one of the most vital factors for athletic performance. Electroencephalographic (EEG) indices, such as occipital α, have been employed to explore the psychological state during the preparatory period in elite athletes. The main purpose of this study was to investigate the differences in frontal midline theta (Fm θ) power during the aiming period between successful and unsuccessful basketball free throws. Fifteen skilled male basketball players were recruited and asked to perform free throws. EEG data were collected 2 s prior to the initiation of the free throw and segmented into four 0.5-s epochs. The lower theta (θ1, 4-6 Hz) and upper theta (θ2, 6-8 Hz) power values were contrasted between the successful and unsuccessful throws. Two 2×4×6 (performance×time×electrode) ANOVAs with repeated measures were conducted separately for θ1 and θ2 power. The results indicate that θ1 power at the Fz site and θ2 power at the Fz and F4 sites fluctuated significantly during the preparatory period for an unsuccessful throw when compared with a successful throw. Additionally, higher Fm θ2 power was observed at the beginning of the aiming period of a successful throw. This study suggests that stable arousal and a relatively constant amount of attention to the task prior to motor execution may facilitate athletic performance.
Affiliation(s)
- Lan-Ya Chuang, Department of Physical Education, National Taiwan Normal University, No.162, Sec.1, Heping E. Rd., Da an Dist., Taipei City 106, Taiwan, Republic of China (R.O.C.)

29. Skatova A, Chan PA, Daw ND. Extraversion differentiates between model-based and model-free strategies in a reinforcement learning task. Front Hum Neurosci 2013; 7:525. [PMID: 24027514; PMCID: PMC3760140; DOI: 10.3389/fnhum.2013.00525]
Abstract
Prominent computational models describe a neural mechanism for learning from reward prediction errors, and it has been suggested that variations in this mechanism are reflected in personality factors such as trait extraversion. However, although trait extraversion has been linked to improved reward learning, it is not yet known whether this relationship is selective for the particular computational strategy associated with error-driven learning, known as model-free reinforcement learning, vs. another strategy, model-based learning, which the brain is also known to employ. In the present study we test this relationship by examining whether humans' scores on an extraversion scale predict individual differences in the balance between model-based and model-free learning strategies in a sequentially structured decision task designed to distinguish between them. In previous studies with this task, participants have shown a combination of both types of learning, but with substantial individual variation in the balance between them. In the current study, extraversion predicted worse behavior across both sorts of learning. However, the hypothesis that extraverts would be selectively better at model-free reinforcement learning held up among a subset of the more engaged participants, and overall, higher task engagement was associated with a more selective pattern by which extraversion predicted better model-free learning. The findings indicate a relationship between a broad personality orientation and detailed computational learning mechanisms. Results like those in the present study suggest an intriguing and rich relationship between core neuro-computational mechanisms and broader life orientations and outcomes.
Affiliation(s)
- Anya Skatova, School of Psychology, University of Nottingham, Nottingham, UK; Horizon Digital Economy Research, University of Nottingham, Nottingham, UK; Department of Psychology and Center for Neural Science, New York University, New York, NY, USA

30. Clithero JA, Rangel A. Informatic parcellation of the network involved in the computation of subjective value. Soc Cogn Affect Neurosci 2013; 9:1289-302. [PMID: 23887811; DOI: 10.1093/scan/nst106]
Abstract
Understanding how the brain computes value is a basic question in neuroscience. Although individual studies have driven progress on this question, meta-analyses provide an opportunity to test hypotheses that require large collections of data. We carry out a meta-analysis of a large set of functional magnetic resonance imaging studies of value computation to address several key questions. First, what is the full set of brain areas that reliably correlate with stimulus values when they need to be computed? Second, is this set of areas organized into dissociable functional networks? Third, is a distinct network of regions involved in the computation of stimulus values at decision and outcome? Finally, are different brain areas involved in the computation of stimulus values for different reward modalities? Our results demonstrate the centrality of ventromedial prefrontal cortex (VMPFC), ventral striatum and posterior cingulate cortex (PCC) in the computation of value across tasks, reward modalities and stages of the decision-making process. We also find evidence of distinct subnetworks of co-activation within VMPFC, one involving central VMPFC and dorsal PCC and another involving more anterior VMPFC, left angular gyrus and ventral PCC. Finally, we identify a posterior-to-anterior gradient of value representations corresponding to concrete-to-abstract rewards.
Affiliation(s)
- John A Clithero, Division of the Humanities and Social Sciences and Computation and Neural Systems, California Institute of Technology, Pasadena, CA 91125, USA
- Antonio Rangel, Division of the Humanities and Social Sciences and Computation and Neural Systems, California Institute of Technology, Pasadena, CA 91125, USA

31. Lei X, Chen C, Xue F, He Q, Chen C, Liu Q, Moyzis RK, Xue G, Cao Z, Li J, Li H, Zhu B, Liu Y, Hsu ASC, Li J, Dong Q. Fiber connectivity between the striatum and cortical and subcortical regions is associated with temperaments in Chinese males. Neuroimage 2013; 89:226-34. [PMID: 23618602; DOI: 10.1016/j.neuroimage.2013.04.043]
Abstract
The seven-factor biopsychosocial model of personality distinguishes four biologically based temperaments and three psychosocially based characters. Previous studies have suggested that the four temperaments-novelty seeking (NS), reward dependence (RD), harm avoidance (HA), and persistence (P)-have their respective neurobiological correlates, especially in the striatum-connected subcortical and cortical networks. However, few studies have investigated their neurobiological basis in the form of fiber connectivity between brain regions. This study correlated temperaments with fiber connectivity between the striatum and subcortical and cortical hub regions in a sample of 50 Chinese adult males. Generally consistent with our hypotheses, results showed that: (1) NS was positively correlated with fiber connectivity from the medial and lateral orbitofrontal cortex (mOFC, lOFC) and amygdala to the striatum; (2) RD was positively correlated with fiber connectivity from the mOFC, posterior cingulate cortex/retrosplenial cortex (PCC), hippocampus, and amygdala to the striatum; (3) HA was positively linked to fiber connectivity from the dorsolateral prefrontal cortex (dlPFC) and PCC to the striatum; and (4) P was positively linked to fiber connectivity from the mOFC to the striatum. These results extended the research on the neurobiological basis of temperaments by identifying their anatomical fiber connectivity correlates within the subcortical-cortical neural networks.
Affiliation(s)
- Xuemei Lei, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; Department of Psychology and Social Behavior, University of California, Irvine, CA, USA
- Chuansheng Chen, Department of Psychology and Social Behavior, University of California, Irvine, CA, USA
- Feng Xue, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Qinghua He, Institute of Genomics and Bioinformatics, University of California, Irvine, CA, USA
- Chunhui Chen, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Qi Liu, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Robert K Moyzis, Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA, USA; Institute of Genomics and Bioinformatics, University of California, Irvine, CA, USA
- Gui Xue, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; Department of Psychology, University of Southern California, Los Angeles, CA 90089, USA
- Zhongyu Cao, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Jin Li, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- He Li, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Bi Zhu, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Yuyun Liu, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Anna Shan Chun Hsu, Department of Psychology and Social Behavior, University of California, Irvine, CA, USA
- Jun Li, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Qi Dong, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China

32.
Abstract
The present study investigates two aspects of decision making that have yet to be explored within a dynamic environment: (1) comparing the accuracy of cue-outcome knowledge under conditions in which knowledge acquisition is either through Prediction or Choice, and (2) examining the effects of reward on both Prediction and Choice. Participants either learnt about the cue-outcome relations in the environment by choosing cue values in order to maintain an outcome to criterion (Choice-based decision making), or learnt to predict the outcome from seeing changes to the cue values (Prediction-based decision making). During training participants received outcome feedback and one of four types of reward manipulation: Positive Reward, Negative Reward, Both Positive + Negative Reward, or No Reward. After training, both groups of learners were tested on prediction- and choice-based tasks. In the main, the findings revealed that cue-outcome knowledge was more accurate when knowledge acquisition was Choice-based rather than Prediction-based. During learning, Negative Reward adversely affected Choice-based decision making, while Positive Reward adversely affected Prediction-based decision making. During the test phase, only performance on tests of choice was adversely affected by having received Positive Reward or Negative Reward during training. This article proposes that the adverse effects of reward may reflect the additional demands of processing rewards, which compete for the cognitive resources required to perform the main goal of the task. This in turn implies that, rather than facilitate decision making, the presentation of rewards can interfere with Choice-based and Prediction-based decisions.
Affiliation(s)
- Magda Osman, Biological and Experimental Psychology Centre, School of Biological and Chemical Sciences, Queen Mary College, University of London, London, UK

33. Nakao T, Ohira H, Northoff G. Distinction between Externally vs. Internally Guided Decision-Making: Operational Differences, Meta-Analytical Comparisons and Their Theoretical Implications. Front Neurosci 2012; 6:31. [PMID: 22403525; PMCID: PMC3293150; DOI: 10.3389/fnins.2012.00031]
Abstract
Most experimental studies of decision-making have specifically examined situations in which a single less-predictable correct answer exists (externally guided decision-making under uncertainty). Along with such externally guided decision-making, there are instances of decision-making in which no correct answer based on external circumstances is available for the subject (internally guided decision-making). Such decisions are usually made in the context of moral decision-making as well as in preference judgment, where the answer depends on the subject's own, i.e., internal, preferences rather than on external, i.e., circumstantial, criteria. The neuronal and psychological mechanisms that allow guidance of decisions based on more internally oriented criteria in the absence of external ones remain unclear. This study was undertaken to compare decision-making of these two kinds empirically and theoretically. First, we reviewed studies of decision-making to clarify experimental-operational differences between externally guided and internally guided decision-making. Second, using multi-level kernel density analysis, a whole-brain-based quantitative meta-analysis of neuroimaging studies was performed. Our meta-analysis revealed that the neural network used predominantly for internally guided decision-making differs from that for externally guided decision-making under uncertainty. This result suggests that studying only externally guided decision-making under uncertainty is insufficient to account for decision-making processes in the brain. Finally, based on the review and results of the meta-analysis, we discuss the differences and relations between decision-making of these two types in terms of their operational, neuronal, and theoretical characteristics.
Affiliation(s)
- Takashi Nakao, Mind, Brain Imaging and Neuroethics, Institute of Mental Health Research, Royal Ottawa Health Care Group, University of Ottawa, Ottawa, ON, Canada

34. Schlagenhauf F, Rapp MA, Huys QJM, Beck A, Wüstenberg T, Deserno L, Buchholz HG, Kalbitzer J, Buchert R, Bauer M, Kienast T, Cumming P, Plotkin M, Kumakura Y, Grace AA, Dolan RJ, Heinz A. Ventral striatal prediction error signaling is associated with dopamine synthesis capacity and fluid intelligence. Hum Brain Mapp 2012; 34:1490-9. [PMID: 22344813; DOI: 10.1002/hbm.22000]
Abstract
Fluid intelligence represents the capacity for flexible problem solving and rapid behavioral adaptation. Rewards drive flexible behavioral adaptation, in part via a teaching signal expressed as reward prediction errors in the ventral striatum, which has been associated with phasic dopamine release in animal studies. We examined a sample of 28 healthy male adults using multimodal imaging and biological parametric mapping with (1) functional magnetic resonance imaging during a reversal learning task and (2), in a subsample of 17 subjects, positron emission tomography using 6-[18F]fluoro-L-DOPA to assess dopamine synthesis capacity. Fluid intelligence was measured using a battery of nine standard neuropsychological tests. Ventral striatal BOLD correlates of reward prediction errors were positively correlated with fluid intelligence and, in the right ventral striatum, also inversely correlated with dopamine synthesis capacity (FDOPA K_in^app). When exploring aspects of fluid intelligence, we observed that prediction error signaling correlates with complex attention and reasoning. These findings indicate that individual differences in the capacity for flexible problem solving relate to ventral striatal activation during reward-related learning, which in turn proved to be inversely associated with ventral striatal dopamine synthesis capacity.
Affiliation(s)
- Florian Schlagenhauf, Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité-Universitätsmedizin Berlin, Germany

35. van de Vijver I, Ridderinkhof KR, Cohen MX. Frontal Oscillatory Dynamics Predict Feedback Learning and Action Adjustment. J Cogn Neurosci 2011; 23:4106-21. [DOI: 10.1162/jocn_a_00110]
Abstract
Frontal oscillatory dynamics in the theta (4–8 Hz) and beta (20–30 Hz) frequency bands have been implicated in cognitive control processes. Here we investigated the changes in coordinated activity within and between frontal brain areas during feedback-based response learning. In a time estimation task, participants learned to press a button after specific, randomly selected time intervals (300–2000 msec) using the feedback after each button press (correct, too fast, too slow). Consistent with previous findings, theta-band activity over medial frontal scalp sites (presumably reflecting medial frontal cortex activity) was stronger after negative feedback, whereas beta-band activity was stronger after positive feedback. Theta-band power predicted learning only after negative feedback, and beta-band power predicted learning after positive and negative feedback. Furthermore, negative feedback increased theta-band intersite phase synchrony (a millisecond resolution measure of functional connectivity) among right lateral prefrontal, medial frontal, and sensorimotor sites. These results demonstrate the importance of frontal theta- and beta-band oscillations and intersite communication in the realization of reinforcement learning.
36. Reward system and temporal pole contributions to affective evaluation during a first person shooter video game. BMC Neurosci 2011; 12:66. [PMID: 21749711; PMCID: PMC3146896; DOI: 10.1186/1471-2202-12-66]
Abstract
Background Violent content in video games evokes many concerns but there is little research concerning its rewarding aspects. It has been demonstrated that playing a video game leads to striatal dopamine release. It is unclear, however, which aspects of the game cause this reward system activation and whether violent content contributes to it. We combined functional Magnetic Resonance Imaging (fMRI) with individual affect measures to address the neuronal correlates of violence in a video game. Results Thirteen male German volunteers played a first-person shooter game (Tactical Ops: Assault on Terror) during fMRI measurement. We defined success as eliminating opponents, and failure as being eliminated themselves. Affect was measured directly before and after game play using the Positive and Negative Affect Schedule (PANAS). Failure and success events evoked increased activity in visual cortex, but only failure decreased activity in orbitofrontal cortex and caudate nucleus. A negative correlation between negative affect and responses to failure was evident in the right temporal pole (rTP). Conclusions The deactivation of the caudate nucleus during failure is in accordance with its role in reward-prediction error: it occurred whenever subjects missed an expected reward (being eliminated rather than eliminating the opponent). We found no indication that violence events were directly rewarding for the players. We used subjective evaluations of affect change due to gameplay to probe the reward system. Subjects reporting greater negative affect after playing the game had less rTP activity associated with failure. The rTP may therefore be involved in evaluating the failure events in a social context, to regulate the players' mood.
37. Mars RB, Shea NJ, Kolling N, Rushworth MFS. Model-based analyses: Promises, pitfalls, and example applications to the study of cognitive control. Q J Exp Psychol (Hove) 2011; 65:252-67. [PMID: 20437297; PMCID: PMC3335278; DOI: 10.1080/17470211003668272]
Abstract
We discuss a recent approach to investigating cognitive control, which has the potential to deal with some of the challenges inherent in this endeavour. In a model-based approach, the researcher defines a formal, computational model that performs the task at hand and whose performance matches that of a research participant. The internal variables in such a model might then be taken as proxies for latent variables computed in the brain. We discuss the potential advantages of such an approach for the study of the neural underpinnings of cognitive control and its pitfalls, and we make explicit the assumptions underlying the interpretation of data obtained using this approach.
Affiliation(s)
- Rogier B Mars, Department of Experimental Psychology, University of Oxford, Oxford, UK

38. Smillie LD, Cooper AJ, Tharp IJ, Pelling EL. Individual differences in cognitive control: The role of psychoticism and working memory in set-shifting. Br J Psychol 2010; 100:629-43. [DOI: 10.1348/000712608x382094]
39. Liu X, Hairston J, Schrier M, Fan J. Common and distinct networks underlying reward valence and processing stages: a meta-analysis of functional neuroimaging studies. Neurosci Biobehav Rev 2010; 35:1219-36. [PMID: 21185861; DOI: 10.1016/j.neubiorev.2010.12.012]
Abstract
To better understand the reward circuitry in the human brain, we conducted activation likelihood estimation (ALE) and parametric voxel-based meta-analyses (PVM) on 142 neuroimaging studies that examined brain activation in reward-related tasks in healthy adults. We observed several core brain areas that participated in reward-related decision making, including the nucleus accumbens (NAcc), caudate, putamen, thalamus, orbitofrontal cortex (OFC), bilateral anterior insula, anterior cingulate cortex (ACC) and posterior cingulate cortex (PCC), as well as cognitive control regions in the inferior parietal lobule and prefrontal cortex (PFC). The NAcc was commonly activated by both positive and negative rewards across various stages of reward processing (e.g., anticipation, outcome, and evaluation). In addition, the medial OFC and PCC preferentially responded to positive rewards, whereas the ACC, bilateral anterior insula, and lateral PFC selectively responded to negative rewards. Reward anticipation activated the ACC, bilateral anterior insula, and brain stem, whereas reward outcome more significantly activated the NAcc, medial OFC, and amygdala. Neurobiological theories of reward-related decision making should therefore take distributed and interrelated representations of reward valuation and valence assessment into account.
Affiliation(s)
- Xun Liu, Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China

40. Smillie LD, Cooper AJ, Pickering AD. Individual differences in reward-prediction-error: extraversion and feedback-related negativity. Soc Cogn Affect Neurosci 2010; 6:646-52. [PMID: 20855297; DOI: 10.1093/scan/nsq078]
Abstract
Medial frontal scalp-recorded negativity occurring ∼200-300 ms post-stimulus [known as feedback-related negativity (FRN)] is attenuated following unpredicted reward and potentiated following unpredicted non-reward. This encourages the view that FRN may partly reflect dopaminergic 'reward-prediction-error' signalling. We examined the influence of a putatively dopamine-based personality trait, extraversion (N = 30), and a dopamine-related gene polymorphism, DRD2/ANKK1 (N = 24), on FRN during an associative reward-learning paradigm. FRN was most negative following unpredicted non-reward and least negative following unpredicted reward. A difference wave contrasting these conditions was significantly more pronounced for extraverted participants than for introverts, with a similar but non-significant trend for participants carrying at least one copy of the A1 allele of the DRD2/ANKK1 gene compared with those without the allele. Extraversion was also significantly higher in A1 allele carriers. These results have broad relevance to neuroscience and personality research concerning reward processing and dopamine function.
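The difference-wave contrast described in this abstract is a pointwise subtraction of condition-averaged ERPs; a minimal sketch of that computation (array shapes, trial values, and function name are illustrative assumptions, not the paper's pipeline):

```python
import numpy as np

def frn_difference_wave(nonreward_trials, reward_trials):
    """Average the single-trial ERPs within each condition, then subtract.
    More-negative values in the result indicate a larger FRN to
    unpredicted non-reward relative to unpredicted reward."""
    erp_nonreward = np.mean(nonreward_trials, axis=0)  # (n_samples,)
    erp_reward = np.mean(reward_trials, axis=0)        # (n_samples,)
    return erp_nonreward - erp_reward

# Toy example: two trials per condition, two time samples each.
dw = frn_difference_wave(np.array([[-2.0, -2.0], [-4.0, -4.0]]),
                         np.array([[1.0, 1.0], [1.0, 1.0]]))
```

The per-subject amplitude of such a difference wave is what would then be correlated with a trait measure like extraversion.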
Affiliation(s)
- Luke D Smillie, Department of Psychology, Goldsmiths, University of London, London, SE14 6NW, UK

41. Ichikawa N, Siegle GJ, Dombrovski A, Ohira H. Subjective and model-estimated reward prediction: association with the feedback-related negativity (FRN) and reward prediction error in a reinforcement learning task. Int J Psychophysiol 2010; 78:273-83. [PMID: 20858518; DOI: 10.1016/j.ijpsycho.2010.09.001]
Abstract
In this study, we examined whether the feedback-related negativity (FRN) is associated with both subjective and objective (model-estimated) reward prediction errors (RPE) per trial in a reinforcement learning task in healthy adults (n=25). The level of RPE was assessed by (1) subjective ratings per trial and (2) a computational model of reinforcement learning. Model-estimated RPE was highly correlated with subjective RPE (r=.82), and the grand-averaged ERP waves for trials with high vs. low model-estimated RPE differed significantly only in the time window of the FRN component (p<.05). Regardless of the time course of learning, FRN was associated with both subjective and model-estimated RPEs within subjects (r=.47, p<.001; r=.40, p<.05) and between subjects (r=.33, p<.05; r=.41, p<.005), but only in the Learnable condition, where the internal reward prediction varied enough with the behavior-reward contingency.
42. Campbell-Meiklejohn DK, Bach DR, Roepstorff A, Dolan RJ, Frith CD. How the opinion of others affects our valuation of objects. Curr Biol 2010; 20:1165-70. [PMID: 20619815; PMCID: PMC2908235; DOI: 10.1016/j.cub.2010.04.055]
Abstract
The opinions of others can easily affect how much we value things. We investigated what happens in our brain when we agree with others about the value of an object and whether or not there is evidence, at the neural level, for social conformity through which we change object valuation. Using functional magnetic resonance imaging we independently modeled (1) learning reviewer opinions about a piece of music, (2) reward value while receiving a token for that music, and (3) their interaction in 28 healthy adults. We show that agreement with two “expert” reviewers on music choice produces activity in a region of ventral striatum that also responds when receiving a valued object. It is known that the magnitude of activity in the ventral striatum reflects the value of reward-predicting stimuli [1–8]. We show that social influence on the value of an object is associated with the magnitude of the ventral striatum response to receiving it. This finding provides clear evidence that social influence mediates very basic value signals in known reinforcement learning circuitry [9–12]. Influence at such a low level could contribute to rapid learning and the swift spread of values throughout a population.
43. Cavanagh JF, Frank MJ, Klein TJ, Allen JJB. Frontal theta links prediction errors to behavioral adaptation in reinforcement learning. Neuroimage 2009; 49:3198-209. [PMID: 19969093; DOI: 10.1016/j.neuroimage.2009.11.080]
Abstract
Investigations into action monitoring have consistently detailed a frontocentral voltage deflection in the event-related potential (ERP) following the presentation of negatively valenced feedback, sometimes termed the feedback-related negativity (FRN). The FRN has been proposed to reflect a neural response to prediction errors during reinforcement learning, yet the single-trial relationship between neural activity and the quanta of expectation violation remains untested. Although ERP methods are not well suited to single-trial analyses, the FRN has been associated with theta band oscillatory perturbations in the medial prefrontal cortex. Mediofrontal theta oscillations have been previously associated with expectation violation and behavioral adaptation and are well suited to single-trial analysis. Here, we recorded EEG activity during a probabilistic reinforcement learning task and fit the performance data to an abstract computational model (Q-learning) for calculation of single-trial reward prediction errors. Single-trial theta oscillatory activities following feedback were investigated within the context of expectation (prediction error) and adaptation (subsequent reaction time change). Results indicate that interactive medial and lateral frontal theta activities reflect the degree of negative and positive reward prediction error in the service of behavioral adaptation. These different brain areas use prediction error calculations for different behavioral adaptations, with medial frontal theta reflecting the utilization of prediction errors for reaction time slowing (specifically following errors), but lateral frontal theta reflecting prediction errors leading to working memory-related reaction time speeding for the correct choice.
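The single-trial reward prediction errors used in this study come from fitting a Q-learning model to behavior; a minimal sketch of how trial-by-trial RPEs fall out of the delta-rule update (the learning rate and reward sequence here are illustrative, not the paper's fitted values):

```python
def q_learning_rpes(rewards, alpha=0.3):
    """Compute single-trial reward prediction errors (RPEs) from a
    delta-rule Q-learning update: Q <- Q + alpha * (r - Q)."""
    q = 0.0          # initial value estimate
    rpes = []
    for r in rewards:
        rpe = r - q          # prediction error on this trial
        rpes.append(rpe)
        q += alpha * rpe     # incremental value update
    return rpes

# With repeated identical reward, RPEs shrink as the estimate converges.
errors = q_learning_rpes([1.0, 1.0, 1.0, 1.0])
```

Each per-trial `rpe` is the quantity that would be regressed against single-trial theta power in an analysis like the one described above.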
Affiliation(s)
- James F Cavanagh, Department of Psychology, University of Arizona, Tucson, AZ, USA

44. Right frontal cortex generates reward-related theta-band oscillatory activity. Neuroimage 2009; 48:415-22. [DOI: 10.1016/j.neuroimage.2009.06.076]
45. Kahnt T, Park SQ, Cohen MX, Beck A, Heinz A, Wrase J. Dorsal striatal-midbrain connectivity in humans predicts how reinforcements are used to guide decisions. J Cogn Neurosci 2009; 21:1332-45. [PMID: 18752410; DOI: 10.1162/jocn.2009.21092]
Abstract
It has been suggested that the target areas of dopaminergic midbrain neurons, the dorsal (DS) and ventral striatum (VS), are differentially involved in reinforcement learning, as actor and critic, respectively. Whereas the critic learns to predict rewards, the actor maintains action values to guide future decisions. The different midbrain connections to the DS and the VS seem to play a critical role in this functional distinction. Here, subjects performed a dynamic, reward-based decision-making task during fMRI acquisition. A computational model of reinforcement learning was used to estimate the different effects of positive and negative reinforcements on future decisions for each subject individually. We found that activity in both the DS and the VS correlated with reward prediction errors. Using functional connectivity, we show that the DS and the VS are differentially connected to different midbrain regions (possibly corresponding to the substantia nigra [SN] and the ventral tegmental area [VTA], respectively). However, only functional connectivity between the DS and the putative SN predicted the impact of different reinforcement types on future behavior. These results suggest that connections between the putative SN and the DS are critical for modulating action values in the DS according to both positive and negative reinforcements to guide future decision making.
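A common way to let positive and negative reinforcements have "different effects on future decisions", as estimated in this study, is to fit separate learning rates for positive and negative prediction errors. The sketch below is a hypothetical illustration of that idea, not the authors' exact model or parameter values:

```python
def update_value(q, reward, alpha_pos=0.4, alpha_neg=0.2):
    """Delta-rule value update with separate learning rates for positive
    and negative prediction errors (illustrative parameter values).
    Returns the updated value and the prediction error."""
    rpe = reward - q
    alpha = alpha_pos if rpe > 0 else alpha_neg
    return q + alpha * rpe, rpe
```

The fitted ratio of the two rates then quantifies how strongly a subject's future choices are driven by rewards versus punishments.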
Affiliation(s)
- Thorsten Kahnt
- Department of Psychiatry and Psychotherapy, Charité-Universitätsmedizin Berlin (Charité Campus Mitte), Berlin, Germany.
46
Peterson DA, Elliott C, Song DD, Makeig S, Sejnowski TJ, Poizner H. Probabilistic reversal learning is impaired in Parkinson's disease. Neuroscience 2009; 163:1092-101. [PMID: 19628022 DOI: 10.1016/j.neuroscience.2009.07.033] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2009] [Revised: 07/14/2009] [Accepted: 07/16/2009] [Indexed: 11/17/2022]
Abstract
In many everyday settings, the relationship between our choices and their potentially rewarding outcomes is probabilistic and dynamic. In addition, the difficulty of the choices can vary widely. Although a large body of theoretical and empirical evidence suggests that dopamine mediates rewarded learning, the influence of dopamine in probabilistic and dynamic rewarded learning remains unclear. We adapted a probabilistic rewarded learning task originally used to study firing rates of dopamine cells in primate substantia nigra pars compacta [Morris G, Nevet A, Arkadir D, Vaadia E, Bergman H (2006) Midbrain dopamine neurons encode decisions for future action. Nat Neurosci 9:1057-1063] for use as a reversal learning task with humans. We sought to investigate how the dopamine depletion in Parkinson's disease (PD) affects probabilistic reward learning and adaptation to a reversal in reward contingencies. Over the course of 256 trials subjects learned to choose the more favorable from among pairs of images with small or large differences in reward probabilities. During a subsequent otherwise identical reversal phase, the reward probability contingencies for the stimuli were reversed. Seventeen PD patients of mild to moderate severity were studied off of their dopaminergic medications and compared to 15 age-matched controls. Compared to controls, PD patients had distinct pre- and post-reversal deficiencies depending upon the difficulty of the choices they had to learn. The patients also exhibited compromised adaptability to the reversal. A computational model of the subjects' trial-by-trial choices demonstrated that the adaptability was sensitive to the gain with which patients weighted pre-reversal feedback. Collectively, the results implicate the nigral dopaminergic system in learning to make choices in environments with probabilistic and dynamic reward contingencies.
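Trial-by-trial choice models of this kind typically pass learned values through a softmax rule, in which a gain (inverse-temperature) parameter controls how strongly feedback-derived value differences are weighted in choice; this is a minimal sketch of that mechanism, and the paper's actual parameterization may differ:

```python
import numpy as np

def softmax_choice_probs(values, gain=3.0):
    """Choice probabilities from learned values. Higher gain weights
    value differences more deterministically; gain = 0 yields chance."""
    v = np.asarray(values, dtype=float)
    z = gain * (v - np.max(v))   # shift by the max for numerical stability
    p = np.exp(z)
    return p / p.sum()
```

A reduced fitted gain, as reported for the patients, makes choices less sensitive to the pre-reversal value estimates and so slows adaptation when contingencies reverse.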
Affiliation(s)
- D A Peterson
- Institute for Neural Computation, University of California-San Diego, CA, USA
47
Golestani N, Zatorre RJ. Individual differences in the acquisition of second language phonology. BRAIN AND LANGUAGE 2009; 109:55-67. [PMID: 18295875 DOI: 10.1016/j.bandl.2008.01.005] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/21/2007] [Revised: 01/08/2008] [Accepted: 01/14/2008] [Indexed: 05/24/2023]
Abstract
Perceptual training was employed to characterize individual differences in non-native speech sound learning. Fifty-nine adult English speakers were trained to distinguish the Hindi dental-retroflex contrast, as well as a tonal pitch contrast. Training resulted in overall group improvement in the ability to identify and to discriminate the phonetic and the tonal contrasts, but there were considerable individual differences in performance. A category boundary effect during the post-training discrimination of the Hindi but not of the tonal contrast suggests different learning mechanisms for these two stimulus types. Specifically, our results suggest that successful learning of the speech sounds involves the formation of a long-term memory category representation for the new speech sound.
Affiliation(s)
- Narly Golestani
- Cognitive Neuroscience Unit/Montreal Neurological Institute, McGill University, Montreal, Que., Canada.
48
Klucharev V, Hytönen K, Rijpkema M, Smidts A, Fernández G. Reinforcement Learning Signal Predicts Social Conformity. Neuron 2009; 61:140-51. [DOI: 10.1016/j.neuron.2008.11.027] [Citation(s) in RCA: 312] [Impact Index Per Article: 20.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2008] [Revised: 08/28/2008] [Accepted: 11/12/2008] [Indexed: 01/27/2023]
49
Cohen MX, Frank MJ. Neurocomputational models of basal ganglia function in learning, memory and choice. Behav Brain Res 2008; 199:141-56. [PMID: 18950662 DOI: 10.1016/j.bbr.2008.09.029] [Citation(s) in RCA: 138] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2008] [Revised: 09/24/2008] [Accepted: 09/24/2008] [Indexed: 11/24/2022]
Abstract
The basal ganglia (BG) are critical for the coordination of several motor, cognitive, and emotional functions and become dysfunctional in several pathological states ranging from Parkinson's disease to schizophrenia. Here we review principles developed within a neurocomputational framework of BG and related circuitry which provide insights into their functional roles in behavior. We focus on two classes of models: those that incorporate aspects of biological realism and are constrained by functional principles, and more abstract mathematical models focusing on the higher-level computational goals of the BG. While the former are arguably more "realistic", the latter have a complementary advantage in being able to describe functional principles of how the system works in a relatively simple set of equations, but are less suited to making specific hypotheses about the roles of specific nuclei and neurophysiological processes. We review the basic architecture and assumptions of these models and their relevance to our understanding of the neurobiological and cognitive functions of the BG, and provide an update on the potential roles of biological details not explicitly incorporated in existing models. Empirical studies ranging from those in transgenic mice to dopaminergic manipulation, deep brain stimulation, and genetics in humans largely support model predictions and provide the basis for further refinement. Finally, we discuss possible future directions and possible ways to integrate different types of models.
Affiliation(s)
- Michael X Cohen
- Department of Psychology, Program in Neuroscience, University of Arizona, 1503 E University Blvd, Tucson, AZ 85721, United States
50
Neurocomputational mechanisms of reinforcement-guided learning in humans: a review. COGNITIVE AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2008; 8:113-25. [PMID: 18589502 DOI: 10.3758/cabn.8.2.113] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Adapting decision making according to dynamic and probabilistic changes in action-reward contingencies is critical for survival in a competitive and resource-limited world. Much research has focused on elucidating the neural systems and computations that underlie how the brain identifies whether the consequences of actions are relatively good or bad. In contrast, less empirical research has focused on the mechanisms by which reinforcements might be used to guide decision making. Here, I review recent studies in which an attempt to bridge this gap has been made by characterizing how humans use reward information to guide and optimize decision making. Regions that have been implicated in reinforcement processing, including the striatum, orbitofrontal cortex, and anterior cingulate, also seem to mediate how reinforcements are used to adjust subsequent decision making. This research provides insights into why the brain devotes resources to evaluating reinforcements and suggests a direction for future research, from studying the mechanisms of reinforcement processing to studying the mechanisms of reinforcement learning.