1
|
Colas JT, O’Doherty JP, Grafton ST. Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts. PLoS Comput Biol 2024; 20:e1011950. [PMID: 38552190 PMCID: PMC10980507 DOI: 10.1371/journal.pcbi.1011950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 02/26/2024] [Indexed: 04/01/2024] Open
Abstract
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as those of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions.
In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
| | - John P. O’Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
| | - Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
|
2
|
Fahey MP, Yee DM, Leng X, Tarlow M, Shenhav A. Motivational context determines the impact of aversive outcomes on mental effort allocation. bioRxiv 2023:2023.10.27.564461. [PMID: 37961466 PMCID: PMC10634922 DOI: 10.1101/2023.10.27.564461] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
It is well known that people will exert effort on a task if sufficiently motivated, but how they distribute these efforts across different strategies (e.g., efficiency vs. caution) remains uncertain. Past work has shown that people invest effort differently for potential positive outcomes (rewards) versus potential negative outcomes (penalties). However, this research failed to account for differences in the context in which negative outcomes motivate someone, either as punishment or as reinforcement. It is therefore unclear whether effort profiles differ as a function of outcome valence, motivational context, or both. Using computational modeling and our novel Multi-Incentive Control Task, we show that the influence of aversive outcomes on one's effort profile is entirely determined by their motivational context. Participants (N = 91) favored increased caution in response to larger penalties for incorrect responses, and favored increased efficiency in response to larger reinforcement for correct responses, whether positively or negatively incentivized.
Affiliation(s)
- Mahalia Prater Fahey
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
| | - Debbie M Yee
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
| | - Xiamin Leng
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
| | - Maisy Tarlow
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
| | - Amitai Shenhav
- Cognitive, Linguistic, and Psychological Sciences, Brown University Carney Institute for Brain Science, Brown University
|
3
|
Fang S, Law SF, Ji X, Liu Q, Zhang P, Zhong R, Li H, Wang X, Yao S, Wang X. Potential neuropsychological mechanism involved in the transition from suicide ideation to action - a resting-state fMRI study implicating the insula. Eur Psychiatry 2023; 66:e69. [PMID: 37694389 PMCID: PMC10594382 DOI: 10.1192/j.eurpsy.2023.2444] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/20/2023] [Revised: 08/06/2023] [Accepted: 08/08/2023] [Indexed: 09/12/2023] Open
Abstract
BACKGROUND Understanding the neural mechanism underlying the transition from suicidal ideation to action is crucial, but the mechanism remains unclear. To explore it, we combined resting-state functional connectivity (rsFC) and computational modeling to investigate differences between those who attempted suicide (SA) and those who hold only high levels of suicidal ideation (HSI). METHODS A total of 120 MDD patients were categorized into an SA group (n=47) and an HSI group (n=73). All participants completed a resting-state functional MRI scan, with three subregions of the insula and the dorsal anterior cingulate cortex (dACC) chosen as the regions of interest (ROIs) in seed-to-voxel analyses. Additionally, 86 participants completed the balloon analogue risk task (BART), and a five-parameter Bayesian model of the BART was estimated. RESULTS In the SA group, the FC between the ventral anterior insula (vAI) and the superior/middle frontal gyrus (vAI-SFG, vAI-MFG), as well as the FC between the posterior insula (pI) and the MFG (pI-MFG), were lower than those in the HSI group. The correlation analysis showed a negative correlation between the FC of vAI-SFG and psychological pain avoidance in the SA group, but a positive correlation in the HSI group. Furthermore, the FC of vAI-MFG displayed a negative correlation with loss aversion in the SA group, while a positive correlation was found with psychological pain avoidance in the HSI group. CONCLUSION In the current study, two distinct neural mechanisms involving the insula were identified in the progression from suicidal ideation to action. Dysfunction in vAI FCs may gradually stabilize as individuals experience heightened psychological pain, and a shift from positive to negative correlation patterns of vAI-MFC may indicate a transition from state to trait impairment. Additionally, the dysfunction in pI FC may lead to a lowered threshold for suicide by blunting the perception of physical harm.
Affiliation(s)
- Shulin Fang
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Samuel F. Law
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Xinlei Ji
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Qinyu Liu
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Panwen Zhang
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- Shanghai Songjiang Jiuting Middle School, Shanghai, China
| | - Runqing Zhong
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Huanhuan Li
- Department of Psychology, Renmin University of China, Beijing, China
| | - Xiaosheng Wang
- Department of Human Anatomy and Neurobiology, Xiangya School of Medicine, Central South University, Hunan, China
| | - Shuqiao Yao
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Xiang Wang
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
|
4
|
Kim H, Hur JK, Kwon M, Kim S, Zoh Y, Ahn WY. Causal role of the dorsolateral prefrontal cortex in modulating the balance between Pavlovian and instrumental systems in the punishment domain. PLoS One 2023; 18:e0286632. [PMID: 37267307 PMCID: PMC10237433 DOI: 10.1371/journal.pone.0286632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 05/19/2023] [Indexed: 06/04/2023] Open
Abstract
Previous literature suggests that a balance between Pavlovian and instrumental decision-making systems is critical for optimal decision-making. Pavlovian bias (i.e., approaching reward-predictive stimuli and avoiding punishment-predictive stimuli) often contrasts with the instrumental response. Although recent neuroimaging studies have identified brain regions that may be related to Pavlovian bias, including the dorsolateral prefrontal cortex (dlPFC), it is unclear whether a causal relationship exists. Therefore, we investigated whether upregulation of the dlPFC using transcranial direct current stimulation (tDCS) would reduce Pavlovian bias. In this double-blind study, participants were assigned to the anodal or the sham group; they received stimulation over the right dlPFC for 3 successive days. On the last day, participants performed a reinforcement learning task known as the orthogonalized go/no-go task; this was used to assess each participant's degree of Pavlovian bias in reward and punishment domains. We used computational modeling and hierarchical Bayesian analysis to estimate model parameters reflecting latent cognitive processes, including Pavlovian bias, go bias, and choice randomness. Several computational models were compared; the model with separate Pavlovian bias parameters for reward and punishment domains demonstrated the best model fit. When using a behavioral index of Pavlovian bias, the anodal group showed significantly lower Pavlovian bias in the punishment domain, but not in the reward domain, compared with the sham group. In addition, computational modeling showed that the Pavlovian bias parameter in the punishment domain was lower in the anodal group than in the sham group, which is consistent with the behavioral findings. The anodal group also showed a lower go bias and choice randomness, compared with the sham group.
These findings suggest that anodal tDCS may lead to behavioral suppression or change in Pavlovian bias in the punishment domain, which will help to improve comprehension of the causal neural mechanism.
Affiliation(s)
- Hyeonjin Kim
- Department of Psychology, Seoul National University, Seoul, Korea
| | - Jihyun K. Hur
- Department of Psychology, Yale University, New Haven, Connecticut, United States of America
| | - Mina Kwon
- Department of Psychology, Seoul National University, Seoul, Korea
| | - Soyeon Kim
- Department of Psychology, Seoul National University, Seoul, Korea
| | - Yoonseo Zoh
- Department of Psychology, Princeton University, Princeton, New Jersey, United States of America
| | - Woo-Young Ahn
- Department of Psychology, Seoul National University, Seoul, Korea
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Korea
|
5
|
Yamamori Y, Robinson OJ. Computational perspectives on human fear and anxiety. Neurosci Biobehav Rev 2023; 144:104959. [PMID: 36375584 PMCID: PMC10564627 DOI: 10.1016/j.neubiorev.2022.104959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2022] [Revised: 10/25/2022] [Accepted: 11/09/2022] [Indexed: 11/12/2022]
Abstract
Fear and anxiety are adaptive emotions that serve important defensive functions, yet in excess, they can be debilitating and lead to poor mental health. Computational modelling of behaviour provides a mechanistic framework for understanding the cognitive and neurobiological bases of fear and anxiety, and has seen increasing interest in the field. In this brief review, we discuss recent developments in the computational modelling of human fear and anxiety. Firstly, we describe various reinforcement learning strategies that humans employ when learning to predict or avoid threat, and how these relate to symptoms of fear and anxiety. Secondly, we discuss initial efforts to explore, through a computational lens, approach-avoidance conflict paradigms that are popular in animal research to measure fear- and anxiety-relevant behaviours. Finally, we discuss negative biases in decision-making in the face of uncertainty in anxiety.
Affiliation(s)
- Yumeya Yamamori
- Institute of Cognitive Neuroscience, University College London, UK.
| | - Oliver J Robinson
- Institute of Cognitive Neuroscience, University College London, UK; Clinical, Educational and Health Psychology, University College London, UK
|
6
|
Myers CE, Interian A, Moustafa AA. A practical introduction to using the drift diffusion model of decision-making in cognitive psychology, neuroscience, and health sciences. Front Psychol 2022; 13:1039172. [PMID: 36571016 PMCID: PMC9784241 DOI: 10.3389/fpsyg.2022.1039172] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/07/2022] [Accepted: 10/27/2022] [Indexed: 12/14/2022] Open
Abstract
Recent years have seen a rapid increase in the number of studies using evidence-accumulation models (such as the drift diffusion model, DDM) in the fields of psychology and neuroscience. These models go beyond observed behavior to extract descriptions of latent cognitive processes that have been linked to different brain substrates. Accordingly, it is important for psychology and neuroscience researchers to be able to understand published findings based on these models. However, many articles using (and explaining) these models assume that the reader already has a fairly deep understanding of (and interest in) the computational and mathematical underpinnings, which may limit many readers' ability to understand the results and appreciate the implications. The goal of this article is therefore to provide a practical introduction to the DDM and its application to behavioral data, without requiring a deep background in mathematics or computational modeling. The article discusses the basic ideas underpinning the DDM, and explains the way that DDM results are normally presented and evaluated. It also provides a step-by-step example of how the DDM is implemented and used on an example dataset, and discusses methods for model validation and for presenting (and evaluating) model results. Supplementary material provides R code for all examples, along with the sample dataset described in the text, to allow interested readers to replicate the examples themselves. The article is primarily targeted at psychologists, neuroscientists, and health professionals with a background in experimental cognitive psychology and/or cognitive neuroscience, who are interested in understanding how DDMs are used in the literature, as well as some who may go on to apply these approaches in their own work.
Affiliation(s)
- Catherine E. Myers
- Research and Development Service, VA New Jersey Health Care System, East Orange, NJ, United States
- Department of Pharmacology, Physiology and Neuroscience, New Jersey Medical School, Rutgers University, Newark, NJ, United States
| | - Alejandro Interian
- Mental Health and Behavioral Sciences, VA New Jersey Health Care System, Lyons, NJ, United States
- Department of Psychiatry, Robert Wood Johnson Medical School, Rutgers University, Piscataway, NJ, United States
| | - Ahmed A. Moustafa
- Department of Human Anatomy and Physiology, The Faculty of Health Sciences, University of Johannesburg, Johannesburg, South Africa
- School of Psychology, Faculty of Society and Design, Bond University, Robina, QLD, Australia
|
7
|
Liu Q, Zhong R, Ji X, Law S, Xiao F, Wei Y, Fang S, Kong X, Zhang X, Yao S, Wang X. Decision-making biases in suicide attempters with major depressive disorder: A computational modeling study using the balloon analog risk task (BART). Depress Anxiety 2022; 39:845-857. [PMID: 36329675 DOI: 10.1002/da.23291] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/07/2022] [Revised: 09/30/2022] [Accepted: 10/22/2022] [Indexed: 11/06/2022] Open
Abstract
BACKGROUND In the last decade, suicidality has been increasingly theorized as a distinct phenomenon from major depressive disorder (MDD), with unique psychological and neural mechanisms, rather than being mostly a severe symptom of MDD. Although decision-making biases have been widely reported in suicide attempters with MDD, little is known regarding what components of these biases can be distinguished from depressiveness itself. METHODS Ninety-three patients with current MDD (40 with suicide attempts [SA group] and 53 without suicide attempts [NS group]) and 65 healthy controls (HCs) completed psychometric assessments and the balloon analog risk task (BART). To analyze and compare decision-making components among the three groups, we applied a five-parameter Bayesian computational model. RESULTS Psychological assessments showed that the SA group had greater suicidal ideation and psychological pain avoidance than the NS group. Computational modeling showed that both MDD groups had higher risk preference and lower ability to learn and adapt from within-task observations than HCs, without differences between the SA and NS patient groups. The SA group also had higher loss aversion than the NS and HC groups, which had similar loss aversion. CONCLUSIONS Our BART and computational modeling findings suggest that psychological pain avoidance and loss aversion may be important suicide risk factors that are distinguishable from depressive illness itself.
Affiliation(s)
- Qinyu Liu
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Runqing Zhong
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Xinlei Ji
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Samuel Law
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Fan Xiao
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Yiming Wei
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
| | - Shulin Fang
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Xinyuan Kong
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Xiaocui Zhang
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Shuqiao Yao
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
| | - Xiang Wang
- Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- China National Clinical Research Center on Mental Disorders (Xiangya), Changsha, Hunan, China
|
8
|
Weber I, Zorowitz S, Niv Y, Bennett D. The effects of induced positive and negative affect on Pavlovian-instrumental interactions. Cogn Emot 2022; 36:1343-1360. [PMID: 35929878 PMCID: PMC9852069 DOI: 10.1080/02699931.2022.2109600] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/11/2022] [Revised: 07/19/2022] [Accepted: 07/26/2022] [Indexed: 01/22/2023]
Abstract
Across species, animals have an intrinsic drive to approach appetitive stimuli and to withdraw from aversive stimuli. In affective science, influential theories of emotion link positive affect with strengthened behavioural approach and negative affect with avoidance. Based on these theories, we predicted that individuals' positive and negative affect levels should particularly influence their behaviour when innate Pavlovian approach/avoidance tendencies conflict with learned instrumental behaviours. Here, across two experiments - exploratory Experiment 1 (N = 91) and a preregistered confirmatory Experiment 2 (N = 335) - we assessed how induced positive and negative affect influenced Pavlovian-instrumental interactions in a reward/punishment Go/No-Go task. Contrary to our hypotheses, we found no evidence for a main effect of positive/negative affect on either approach/avoidance behaviour or Pavlovian-instrumental interactions. However, we did find evidence that the effects of induced affect on behaviour were moderated by individual differences in self-reported behavioural inhibition and gender. Exploratory computational modelling analyses explained these demographic moderating effects as arising from positive correlations between demographic factors and individual differences in the strength of Pavlovian-instrumental interactions. These findings serve to sharpen our understanding of the effects of positive and negative affect on instrumental behaviour.
Affiliation(s)
- Isla Weber
- Princeton Neuroscience Institute, Princeton University, Princeton, USA
| | - Sam Zorowitz
- Princeton Neuroscience Institute, Princeton University, Princeton, USA
| | - Yael Niv
- Princeton Neuroscience Institute, Princeton University, Princeton, USA
- Department of Psychology, Princeton University, Princeton, USA
| | - Daniel Bennett
- School of Psychological Sciences, Monash University, Clayton, Australia
|
9
|
Colas JT, Dundon NM, Gerraty RT, Saragosa‐Harris NM, Szymula KP, Tanwisuth K, Tyszka JM, van Geen C, Ju H, Toga AW, Gold JI, Bassett DS, Hartley CA, Shohamy D, Grafton ST, O'Doherty JP. Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T. Hum Brain Mapp 2022; 43:4750-4790. [PMID: 35860954 PMCID: PMC9491297 DOI: 10.1002/hbm.25988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2022] [Revised: 05/20/2022] [Accepted: 06/10/2022] [Indexed: 11/12/2022] Open
Abstract
The model-free algorithms of "reinforcement learning" (RL) have gained clout across disciplines, but so too have model-based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This "generalized reinforcement learning" (GRL) model, a frugal extension of RL, parsimoniously retains the single reward-prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal-learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high-resolution high-field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value-based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations.
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
| | - Neil M. Dundon
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Department of Child and Adolescent Psychiatry, Psychotherapy, and Psychosomatics, University of Freiburg, Freiburg im Breisgau, Germany
| | - Raphael T. Gerraty
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Center for Science and Society, Columbia University, New York, New York, USA
| | - Natalie M. Saragosa‐Harris
- Department of Psychology, New York University, New York, New York, USA
- Department of Psychology, University of California, Los Angeles, California, USA
| | - Karol P. Szymula
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Koranis Tanwisuth
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Department of Psychology, University of California, Berkeley, California, USA
| | - J. Michael Tyszka
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
| | - Camilla van Geen
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Harang Ju
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Arthur W. Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
| | - Joshua I. Gold
- Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Dani S. Bassett
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Santa Fe Institute, Santa Fe, New Mexico, USA
| | - Catherine A. Hartley
- Department of Psychology, New York University, New York, New York, USA
- Center for Neural Science, New York University, New York, New York, USA
| | - Daphna Shohamy
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Kavli Institute for Brain Science, Columbia University, New York, New York, USA
| | - Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
| | - John P. O'Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
|
10
|
Karvelis P, Diaconescu AO. A Computational Model of Hopelessness and Active-Escape Bias in Suicidality. Comput Psychiatr 2022; 6:34-59. [PMID: 38774778 PMCID: PMC11104346 DOI: 10.5334/cpsy.80] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/07/2021] [Accepted: 02/15/2022] [Indexed: 12/27/2022]
Abstract
Currently, psychiatric practice lacks reliable predictive tools and a sufficiently detailed mechanistic understanding of suicidal thoughts and behaviors (STB) to provide timely and personalized interventions. Developing computational models of STB that integrate across behavioral, cognitive and neural levels of analysis could help better understand STB vulnerabilities and guide personalized interventions. To that end, we present a computational model based on the active inference framework. With this model, we show that several STB risk markers (hopelessness, Pavlovian bias and active-escape bias) are interrelated via the drive to maximize one's model evidence. We propose four ways in which these effects can arise: (1) increased learning from aversive outcomes, (2) reduced belief decay in response to unexpected outcomes, (3) increased stress sensitivity and (4) reduced sense of stressor controllability. These proposals stem from considering the neurocircuits implicated in STB: how the locus coeruleus-norepinephrine (LC-NE) system together with the amygdala (Amy), the dorsal prefrontal cortex (dPFC) and the anterior cingulate cortex (ACC) mediates learning in response to acute stress and volatility, as well as how the dorsal raphe nucleus-serotonin (DRN-5-HT) system together with the ventromedial prefrontal cortex (vmPFC) mediates stress reactivity based on perceived stressor controllability. We validate the model by simulating performance in an Avoid/Escape Go/No-Go task replicating recent behavioral findings. This serves as a proof of concept and provides a computational hypothesis space that can be tested empirically and be used to distinguish planful versus impulsive STB subtypes. We discuss the relevance of the proposed model for treatment response prediction, including pharmacotherapy and psychotherapy, as well as sex differences as they relate to stress reactivity and suicide risk.
Affiliation(s)
- Povilas Karvelis
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, Ontario, Canada
- Andreea O. Diaconescu
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health (CAMH), Toronto, Ontario, Canada
  - University of Toronto, Department of Psychiatry, Toronto, Ontario, Canada
  - Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
  - Department of Psychology, University of Toronto, Toronto, ON, Canada
|
11
|
Yee DM, Leng X, Shenhav A, Braver TS. Aversive motivation and cognitive control. Neurosci Biobehav Rev 2022; 133:104493. [PMID: 34910931 PMCID: PMC8792354 DOI: 10.1016/j.neubiorev.2021.12.016] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/15/2021] [Revised: 11/12/2021] [Accepted: 12/09/2021] [Indexed: 02/03/2023]
Abstract
Aversive motivation plays a prominent role in driving individuals to exert cognitive control. However, the complexity of behavioral responses attributed to aversive incentives creates significant challenges for developing a clear understanding of the neural mechanisms of this motivation-control interaction. We review the animal learning, systems neuroscience, and computational literatures to highlight the importance of experimental paradigms that incorporate both motivational context manipulations and mixed motivational components (e.g., bundling of appetitive and aversive incentives). Specifically, we postulate that to understand aversive incentive effects on cognitive control allocation, a critical contextual factor is whether such incentives are associated with negative reinforcement or punishment. We further illustrate how the inclusion of mixed motivational components in experimental paradigms enables increased precision in the measurement of aversive influences on cognitive control. A sharpened experimental and theoretical focus regarding the manipulation and assessment of distinct motivational dimensions promises to advance understanding of the neural, monoaminergic, and computational mechanisms that underlie the interaction of motivation and cognitive control.
Affiliation(s)
- Debbie M Yee
  - Cognitive, Linguistic, and Psychological Sciences, Brown University, USA; Carney Institute for Brain Science, Brown University, USA; Department of Psychological and Brain Sciences, Washington University in Saint Louis, USA
- Xiamin Leng
  - Cognitive, Linguistic, and Psychological Sciences, Brown University, USA; Carney Institute for Brain Science, Brown University, USA
- Amitai Shenhav
  - Cognitive, Linguistic, and Psychological Sciences, Brown University, USA; Carney Institute for Brain Science, Brown University, USA
- Todd S Braver
  - Department of Psychological and Brain Sciences, Washington University in Saint Louis, USA
|
12
|
Paulus MP, Thompson WK. Computational approaches and machine learning for individual-level treatment predictions. Psychopharmacology (Berl) 2021; 238:1231-1239. [PMID: 31134293 PMCID: PMC6879811 DOI: 10.1007/s00213-019-05282-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/15/2019] [Accepted: 05/17/2019] [Indexed: 12/24/2022]
Abstract
RATIONALE: The impact of neuroscience-based approaches for psychiatry on pragmatic clinical decision-making has been limited. Although neuroscience has provided insights into basic mechanisms of neural function, these insights have not improved the ability to generate better assessments, prognoses, diagnoses, or treatment of psychiatric conditions.
OBJECTIVES: To integrate the emerging findings in machine learning and computational psychiatry to address the question: what measures that are not derived from the patient's self-assessment or the assessment by a trained professional can be used to make more precise predictions about the individual's current state, the individual's future disease trajectory, or the probability to respond to a particular intervention?
RESULTS: Currently, the ability to use individual differences to predict differential outcomes is very modest, possibly related to the fact that the effect sizes of interventions are small. There is emerging evidence of genetic and neuroimaging-based heterogeneity of psychiatric disorders, which contributes to imprecise predictions. Although the use of machine learning tools to generate clinically actionable predictions is still in its infancy, these approaches may identify subgroups enabling more precise predictions. In addition, computational psychiatry might provide explanatory disease models based on faulty updating of internal values or beliefs.
CONCLUSIONS: There is a need for larger studies, clinical trials using machine learning, or computational psychiatry model parameters predictions as actionable outcomes, comparing alternative explanatory computational models, and using translational approaches that apply similar paradigms and models in humans and animals.
Affiliation(s)
- Martin P Paulus
  - Laureate Institute for Brain Research, 6655 S Yale Ave, Tulsa, OK, 74136-3326, USA
- Wesley K Thompson
  - Family Medicine and Public Health, University of California San Diego, San Diego, CA, USA
|
13
|
Binti Affandi AH, Pike AC, Robinson OJ. Threat of shock promotes passive avoidance, but not active avoidance. Eur J Neurosci 2021; 55:2571-2580. [PMID: 33714211 DOI: 10.1111/ejn.15184] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/30/2020] [Revised: 01/29/2021] [Accepted: 03/09/2021] [Indexed: 11/28/2022]
Abstract
Anxiety and stress are adaptive responses to threat that promote harm avoidance. In particular, prior work has shown that anxiety induced in humans using threat of unpredictable shock promotes behavioral inhibition in the face of harm. This is consistent with the idea that anxiety promotes passive avoidance-that is, withholding approach actions that could lead to harm. However, harm can also be avoided through active avoidance, where a (withdrawal) action is taken to avoid harm. Here, we provide the first direct within-study comparison of the effects of threat of shock on active and passive avoidance. We operationalize passive avoidance as withholding a button press response in the face of negative outcomes, and active avoidance as lifting/releasing a button press in the face of negative outcomes. We explore the impact of threat of unpredictable shock on the learning of these behavioral responses (alongside matched responses to rewards) within a single cognitive task. We predicted that threat of shock would promote both active and passive avoidance, and that this would be driven by increased reliance on Pavlovian bias, as parameterized within reinforcement-learning models. Consistent with our predictions, we provide evidence that threat of shock promotes passive avoidance as conceptualized by our task. However, inconsistent with predictions, we found no evidence that threat of shock promoted active avoidance, nor evidence of elevated Pavlovian bias in any condition. One hypothetical framework with which to understand these findings is that anxiety promotes passive over active harm avoidance strategies in order to conserve energy while avoiding harm.
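The abstract above refers to Pavlovian bias "as parameterized within reinforcement-learning models" without giving equations. As a rough illustration only, the following Python sketch shows one parameterization that is common in the go/no-go literature, where the weight of the "go" action is shifted by a general go bias and by a Pavlovian term proportional to the learned state value. The function and parameter names (`p_go`, `go_bias`, `pavlovian_weight`, `delta_update`) are hypothetical and are not taken from the study.

```python
import math


def p_go(q_go, q_nogo, state_value, go_bias=0.0, pavlovian_weight=0.0):
    """Probability of a 'go' response under an assumed logistic choice rule.

    Assumption: the go action weight is the instrumental value plus a
    constant go bias plus a Pavlovian term that scales with state value.
    """
    w_go = q_go + go_bias + pavlovian_weight * state_value
    w_nogo = q_nogo
    return 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))


def delta_update(value, outcome, learning_rate):
    """Standard delta-rule update of a value estimate toward an outcome."""
    return value + learning_rate * (outcome - value)


if __name__ == "__main__":
    # With equal instrumental values, a positive Pavlovian weight pushes
    # toward 'go' in rewarding states and toward 'no-go' in aversive states.
    print(p_go(0.0, 0.0, state_value=1.0, pavlovian_weight=0.5))
    print(p_go(0.0, 0.0, state_value=-1.0, pavlovian_weight=0.5))
```

Under this assumed form, "elevated Pavlovian bias" would correspond to a larger fitted `pavlovian_weight`, which is the kind of parameter difference the study reports not finding.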
Affiliation(s)
- Aida Helana Binti Affandi
  - Anxiety Lab, Neuroscience and Mental Health Group, Institute of Cognitive Neuroscience, University College London, London, UK
- Alexandra C Pike
  - Anxiety Lab, Neuroscience and Mental Health Group, Institute of Cognitive Neuroscience, University College London, London, UK
- Oliver Joe Robinson
  - Anxiety Lab, Neuroscience and Mental Health Group, Institute of Cognitive Neuroscience, University College London, London, UK
|
14
|
Miletić S, Boag RJ, Trutti AC, Stevenson N, Forstmann BU, Heathcote A. A new model of decision processing in instrumental learning tasks. eLife 2021; 10:e63055. [PMID: 33501916 PMCID: PMC7880686 DOI: 10.7554/elife.63055] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/15/2020] [Accepted: 01/26/2021] [Indexed: 01/12/2023] Open
Abstract
Learning and decision-making are interactive processes, yet cognitive modeling of error-driven learning and decision-making have largely evolved separately. Recently, evidence accumulation models (EAMs) of decision-making and reinforcement learning (RL) models of error-driven learning have been combined into joint RL-EAMs that can in principle address these interactions. However, we show that the most commonly used combination, based on the diffusion decision model (DDM) for binary choice, consistently fails to capture crucial aspects of response times observed during reinforcement learning. We propose a new RL-EAM based on an advantage racing diffusion (ARD) framework for choices among two or more options that not only addresses this problem but captures stimulus difficulty, speed-accuracy trade-off, and stimulus-response-mapping reversal effects. The RL-ARD avoids fundamental limitations imposed by the DDM on addressing effects of absolute values of choices, as well as extensions beyond binary choice, and provides a computationally tractable basis for wider applications.
Affiliation(s)
- Steven Miletić
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Russell J Boag
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Anne C Trutti
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
  - Leiden University, Department of Psychology, Leiden, Netherlands
- Niek Stevenson
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Birte U Forstmann
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
- Andrew Heathcote
  - University of Amsterdam, Department of Psychology, Amsterdam, Netherlands
  - University of Newcastle, School of Psychology, Newcastle, Australia
|
15
|
Huys QJM, Browning M, Paulus MP, Frank MJ. Advances in the computational understanding of mental illness. Neuropsychopharmacology 2021; 46:3-19. [PMID: 32620005 PMCID: PMC7688938 DOI: 10.1038/s41386-020-0746-4] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 05/04/2020] [Revised: 06/11/2020] [Accepted: 06/15/2020] [Indexed: 12/11/2022]
Abstract
Computational psychiatry is a rapidly growing field attempting to translate advances in computational neuroscience and machine learning into improved outcomes for patients suffering from mental illness. It encompasses both data-driven and theory-driven efforts. Here, recent advances in theory-driven work are reviewed. We argue that the brain is a computational organ. As such, an understanding of the illnesses arising from it will require a computational framework. The review divides work up into three theoretical approaches that have deep mathematical connections: dynamical systems, Bayesian inference and reinforcement learning. We discuss both general and specific challenges for the field, and suggest ways forward.
Affiliation(s)
- Quentin J M Huys
  - Division of Psychiatry and Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
  - Camden and Islington NHS Trust, London, UK
- Michael Browning
  - Computational Psychiatry Lab, Department of Psychiatry, University of Oxford, Oxford, UK
  - Oxford Health NHS Trust, Oxford, UK
- Martin P Paulus
  - Laureate Institute For Brain Research (LIBR), Tulsa, OK, USA
- Michael J Frank
  - Cognitive, Linguistic & Psychological Sciences, Neuroscience Graduate Program, Brown University, Providence, RI, USA
  - Carney Center for Computational Brain Science, Carney Institute for Brain Science Psychiatry and Human Behavior, Brown University, Providence, RI, USA
|
16
|
Millner AJ, Robinaugh DJ, Nock MK. Advancing the Understanding of Suicide: The Need for Formal Theory and Rigorous Descriptive Research. Trends Cogn Sci 2020; 24:704-716. [PMID: 32680678 PMCID: PMC7429350 DOI: 10.1016/j.tics.2020.06.007] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 04/06/2020] [Revised: 06/10/2020] [Accepted: 06/17/2020] [Indexed: 01/05/2023]
Abstract
Suicide is a leading cause of death worldwide and perhaps the most puzzling and devastating of all human behaviors. Suicide research has primarily been guided by verbal theories containing vague constructs and poorly specified relationships. We propose two fundamental changes required to move toward a mechanistic understanding of suicide. First, we must formalize theories of suicide, expressing them as mathematical or computational models. Second, we must conduct rigorous descriptive research, prioritizing direct observation and precise measurement of suicidal thoughts and behaviors and of the factors posited to cause them. Together, theory formalization and rigorous descriptive research will facilitate abductive theory construction and strong theory testing, thereby improving the understanding and prevention of suicide and related behaviors.
Affiliation(s)
- Alexander J Millner
  - Harvard University, Cambridge, MA, USA; Franciscan Children's, Brighton, MA, USA
- Donald J Robinaugh
  - Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Matthew K Nock
  - Harvard University, Cambridge, MA, USA; Franciscan Children's, Brighton, MA, USA; Massachusetts General Hospital, Boston, MA, USA
|
17
|
Shinn M, Lam NH, Murray JD. A flexible framework for simulating and fitting generalized drift-diffusion models. eLife 2020; 9:e56938. [PMID: 32749218 PMCID: PMC7462609 DOI: 10.7554/elife.56938] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 03/16/2020] [Accepted: 08/03/2020] [Indexed: 01/10/2023] Open
Abstract
The drift-diffusion model (DDM) is an important decision-making model in cognitive neuroscience. However, innovations in model form have been limited by methodological challenges. Here, we introduce the generalized drift-diffusion model (GDDM) framework for building and fitting DDM extensions, and provide a software package which implements the framework. The GDDM framework augments traditional DDM parameters through arbitrary user-defined functions. Models are solved numerically by directly solving the Fokker-Planck equation using efficient numerical methods, yielding a 100-fold or greater speedup over standard methodology. This speed allows GDDMs to be fit to data using maximum likelihood on the full response time (RT) distribution. We demonstrate fitting of GDDMs within our framework to both animal and human datasets from perceptual decision-making tasks, with better accuracy and fewer parameters than several DDMs implemented using the latest methodology, to test hypothesized decision-making mechanisms. Overall, our framework will allow for decision-making model innovation and novel experimental designs.
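For readers unfamiliar with the basic DDM that the framework above generalizes, the following Python sketch simulates single trials by stepping a noisy accumulator until it hits one of two bounds. It is a minimal illustration only: it uses trial-by-trial Euler-Maruyama simulation, not the Fokker-Planck solution described in the abstract, and it does not use the authors' software package. The function name and all parameter names are assumptions made for this example.

```python
import random


def simulate_ddm_trial(drift, bound, noise=1.0, dt=0.001, max_time=5.0, rng=random):
    """Simulate one trial of a basic drift-diffusion model.

    The accumulator starts at 0 and evolves with constant drift plus
    Gaussian noise until it crosses +bound (choice 1) or -bound (choice 0).
    Returns (choice, decision_time); choice is None if no bound is reached
    before max_time.
    """
    x = 0.0
    t = 0.0
    sqrt_dt = dt ** 0.5
    while t < max_time:
        # Euler-Maruyama step: deterministic drift plus scaled noise.
        x += drift * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
        if x >= bound:
            return 1, t
        if x <= -bound:
            return 0, t
    return None, max_time


if __name__ == "__main__":
    rng = random.Random(1)
    trials = [simulate_ddm_trial(drift=1.0, bound=1.0, rng=rng) for _ in range(100)]
    upper = sum(1 for choice, _ in trials if choice == 1)
    print(f"upper-bound choices: {upper}/100")
```

Simulation of this kind needs many trials to approximate the response-time distribution, which is one reason a direct numerical solution of the distribution, as the abstract describes, can be much faster for likelihood-based fitting.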
Affiliation(s)
- Maxwell Shinn
  - Department of Psychiatry, Yale University, New Haven, United States; Interdepartmental Neuroscience Program, Yale University, New Haven, United States
- Norman H Lam
  - Department of Physics, Yale University, New Haven, United States
- John D Murray
  - Department of Psychiatry, Yale University, New Haven, United States; Interdepartmental Neuroscience Program, Yale University, New Haven, United States; Department of Physics, Yale University, New Haven, United States
|
18
|
Peterburs J, Frieling A, Bellebaum C. Asymmetric coupling of action and outcome valence in active and observational feedback learning. Psychological Research 2020; 85:1553-1566. [PMID: 32322967 PMCID: PMC8211594 DOI: 10.1007/s00426-020-01340-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/03/2019] [Accepted: 04/07/2020] [Indexed: 11/23/2022]
Abstract
Learning to execute a response to obtain a reward or to inhibit a response to avoid punishment is much easier than learning the reverse, which has been referred to as “Pavlovian” biases. Despite a growing body of research into similarities and differences between active and observational learning, it is as yet unclear if Pavlovian learning biases are specific for active task performance, i.e., learning from feedback provided for one’s own actions, or if they persist also when learning by observing another person’s actions and subsequent outcomes. The present study, therefore, investigated the influence of action and outcome valence in active and observational feedback learning. Healthy adult volunteers completed a go/nogo task that decoupled outcome valence (win/loss) and action (execution/inhibition) either actively or by observing a virtual co-player’s responses and subsequent feedback. Moreover, in a more naturalistic follow-up experiment, pairs of subjects were tested with the same task, with one subject as active learner and the other as observational learner. The results revealed Pavlovian learning biases both in active and in observational learning, with learning of go responses facilitated in the context of reward obtainment, and learning of nogo responses facilitated in the context of loss avoidance. Although the neural correlates of active and observational feedback learning have been shown to differ to some extent, these findings suggest similar mechanisms to underlie both types of learning with respect to the influence of Pavlovian biases. Moreover, performance levels and result patterns were similar in those observational learners who had observed a virtual co-player and those who had completed the task together with an active learner, suggesting that inclusion of a virtual co-player in a computerized task provides an effective manipulation of agency.
Affiliation(s)
- Jutta Peterburs
  - Department of Biological Psychology, Institute of Experimental Psychology, Heinrich-Heine-University Düsseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
- Alena Frieling
  - Department of Biological Psychology, Institute of Experimental Psychology, Heinrich-Heine-University Düsseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
- Christian Bellebaum
  - Department of Biological Psychology, Institute of Experimental Psychology, Heinrich-Heine-University Düsseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
|
19
|
Paulus MP. Driven by Pain, Not Gain: Computational Approaches to Aversion-Related Decision Making in Psychiatry. Biol Psychiatry 2020; 87:359-367. [PMID: 31653478 PMCID: PMC7012695 DOI: 10.1016/j.biopsych.2019.08.025] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/25/2019] [Revised: 08/02/2019] [Accepted: 08/28/2019] [Indexed: 12/21/2022]
Abstract
Although it is well known that "losses loom larger than gains," computational approaches to aversion-related decision making (ARDM) for psychiatric disorders is an underdeveloped area. Computational models of ARDM have been implemented primarily as state-dependent reinforcement learning models with bias parameters to quantify Pavlovian associations, and differential learning rates to quantify instrumental updating have been shown to depend on context, involve complex cost calculations, and include the consideration of counterfactual outcomes. Little is known about how individual differences influence these models relevant to anxiety-related conditions or addiction-related dysfunction. It is argued that model parameters reflecting 1) Pavlovian biases in the context of reinforcement learning or 2) hyperprecise prior beliefs in the context of active inference play an important role in the emergence of dysfunctional avoidance behaviors. The neural implementation of ARDM includes brain areas that are important for valuation (ventromedial prefrontal cortex) and positive reinforcement-related prediction errors (ventral striatum), but also aversive processing (insular cortex and cerebellum). Computational models of ARDM will help to establish a quantitative explanatory account of the development of anxiety disorders and addiction, but such models also face several challenges, including limited evidence for stability of individual differences, relatively low reliability of tasks, and disorder heterogeneity. Thus, it will be necessary to develop robust, reliable, and model-based experimental probes; recruit larger sample sizes; and use single case experimental designs for better pragmatic and explanatory biological models of psychiatric disorders.
Affiliation(s)
- Martin P Paulus
  - Laureate Institute for Brain Research, Tulsa, Oklahoma; Department of Psychiatry, University of California, San Diego, La Jolla, California
|
20
|
Miletić S, Boag RJ, Forstmann BU. Mutual benefits: Combining reinforcement learning with sequential sampling models. Neuropsychologia 2019; 136:107261. [PMID: 31733237 DOI: 10.1016/j.neuropsychologia.2019.107261] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/20/2019] [Revised: 10/21/2019] [Accepted: 11/10/2019] [Indexed: 12/21/2022]
Abstract
Reinforcement learning models of error-driven learning and sequential-sampling models of decision making have provided significant insight into the neural basis of a variety of cognitive processes. Until recently, model-based cognitive neuroscience research using both frameworks has evolved separately and independently. Recent efforts have illustrated the complementary nature of both modelling traditions and showed how they can be integrated into a unified theoretical framework, explaining trial-by-trial dependencies in choice behavior as well as response time distributions. Here, we review a theoretical background of integrating the two classes of models, and review recent empirical efforts towards this goal. We furthermore argue that the integration of both modelling traditions provides mutual benefits for both fields, and highlight promises of this approach for cognitive modelling and model-based cognitive neuroscience.
Affiliation(s)
- Steven Miletić
  - University of Amsterdam, Department of Psychology, Amsterdam, the Netherlands
- Russell J Boag
  - University of Amsterdam, Department of Psychology, Amsterdam, the Netherlands
- Birte U Forstmann
  - University of Amsterdam, Department of Psychology, Amsterdam, the Netherlands
|
21
|
Seymour B. Pain: A Precision Signal for Reinforcement Learning and Control. Neuron 2019; 101:1029-1041. [PMID: 30897355 DOI: 10.1016/j.neuron.2019.01.055] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 11/08/2018] [Revised: 01/18/2019] [Accepted: 01/27/2019] [Indexed: 12/18/2022]
Abstract
Since noxious stimulation usually leads to the perception of pain, pain has traditionally been considered sensory nociception. But its variability and sensitivity to a broad array of cognitive and motivational factors have meant it is commonly viewed as inherently imprecise and intangibly subjective. However, the core function of pain is motivational: to direct both short- and long-term behavior away from harm. Here, we illustrate that a reinforcement learning model of pain offers a mechanistic understanding of how the brain supports this, illustrating the underlying computational architecture of the pain system. Importantly, it explains why pain is tuned by multiple factors and necessarily supported by a distributed network of brain regions, recasting pain as a precise and objectifiable control signal.
Affiliation(s)
- Ben Seymour
  - Center for Information and Neural Networks, National Institute of Information and Communications Technology, 1-4 Yamadaoka, Suita, Osaka 565-0871, Japan; Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK
|
22
|
Shahar N, Hauser TU, Moutoussis M, Moran R, Keramati M, Dolan RJ. Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling. PLoS Comput Biol 2019; 15:e1006803. [PMID: 30759077 PMCID: PMC6391008 DOI: 10.1371/journal.pcbi.1006803] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Received: 06/15/2018] [Revised: 02/26/2019] [Accepted: 01/17/2019] [Indexed: 01/10/2023] Open
Abstract
A well-established notion in cognitive neuroscience proposes that multiple brain systems contribute to choice behaviour. These include: (1) a model-free system that uses values cached from the outcome history of alternative actions, and (2) a model-based system that considers action outcomes and the transition structure of the environment. The widespread use of this distinction, across a range of applications, renders it important to index their distinct influences with high reliability. Here we consider the two-stage task, widely considered as a gold standard measure for the contribution of model-based and model-free systems to human choice. We tested the internal/temporal stability of measures from this task, including those estimated via an established computational model, as well as an extended model using drift-diffusion. Drift-diffusion modeling suggested that both choice in the first stage, and RTs in the second stage, are directly affected by a model-based/free trade-off parameter. Both parameter recovery and the stability of model-based estimates were poor but improved substantially when both choice and RT were used (compared to choice only), and when more trials (than conventionally used in research practice) were included in our analysis. The findings have implications for interpretation of past and future studies based on the use of the two-stage task, as well as for characterising the contribution of model-based processes to choice behaviour.
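The model-based/model-free trade-off parameter mentioned above is often implemented as a weight that mixes two sets of first-stage action values. The Python sketch below illustrates that general idea under stated assumptions: the weighting form, the function names, and the example numbers are illustrative and are not taken from the study's model, which additionally links the trade-off to response times through drift-diffusion.

```python
def model_based_values(transition_probs, stage2_values):
    """First-stage values from the transition structure (model-based).

    transition_probs[a][s] is the assumed probability that first-stage
    action a leads to second-stage state s; stage2_values[s] lists the
    action values available in state s.
    """
    best_in_state = [max(values) for values in stage2_values]
    return [
        sum(p * best for p, best in zip(row, best_in_state))
        for row in transition_probs
    ]


def hybrid_values(q_model_free, q_model_based, w):
    """Mix cached (model-free) and planned (model-based) values.

    w = 1 gives purely model-based values; w = 0 gives purely model-free.
    """
    return [w * mb + (1.0 - w) * mf for mf, mb in zip(q_model_free, q_model_based)]


if __name__ == "__main__":
    transitions = [[0.7, 0.3], [0.3, 0.7]]  # common vs rare transitions
    stage2 = [[0.2, 0.8], [0.5, 0.1]]       # hypothetical second-stage values
    q_mb = model_based_values(transitions, stage2)
    q_mf = [0.4, 0.6]                       # hypothetical cached values
    print(hybrid_values(q_mf, q_mb, w=0.5))
```

Because a single weight `w` summarizes the balance between the two systems, its estimate depends heavily on how informative the data are, which is consistent with the abstract's point that reliability improves with more trials and with response times included.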
Affiliation(s)
- Nitzan Shahar
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Tobias U. Hauser
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Michael Moutoussis
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Rani Moran
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Mehdi Keramati
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Raymond J. Dolan
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
|
23
|
Dorsal striatal dopamine D1 receptor availability predicts an instrumental bias in action learning. Proc Natl Acad Sci U S A 2018; 116:261-270. [PMID: 30563856 PMCID: PMC6320523 DOI: 10.1073/pnas.1816704116] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 01/01/2023] Open
Abstract
The brain’s dopaminergic pathways are crucially important for adaptive behavior. They are thought to enable us to approach rewards and stay away from punishments. During learning, dopaminergic reward prediction errors are thought to reinforce previously rewarded actions, so they become easier to repeat. This dopaminergic activity could lead to a systematic bias by which rewarded actions are more readily learned than rewarded inactions. We present two findings. First, dopamine receptors in cortex, dorsal striatum, and nucleus accumbens provide distinct sources of variance in the human brain. Second, the boost in an individual’s learning rate from previously rewarded actions is dependent on the dopamine receptor density in dorsal striatum, a central structure in the dopaminergic circuit. Learning to act to obtain reward and inhibit to avoid punishment is easier compared with learning the opposite contingencies. This coupling of action and valence is often thought of as a Pavlovian bias, although recent research has shown it may also emerge through instrumental mechanisms. We measured this learning bias with a rewarded go/no-go task in 60 adults of different ages. Using computational modeling, we characterized the bias as being instrumental. To assess the role of endogenous dopamine (DA) in the expression of this bias, we quantified DA D1 receptor availability using positron emission tomography (PET) with the radioligand [11C]SCH23390. Using principal-component analysis on the binding potentials in a number of cortical and striatal regions of interest, we demonstrated that cortical, dorsal striatal, and ventral striatal areas provide independent sources of variance in DA D1 receptor availability. Interindividual variation in the dorsal striatal component was related to the strength of the instrumental bias during learning. 
These data suggest at least three anatomical sources of variance in DA D1 receptor availability separable using PET in humans, and we provide evidence that human dorsal striatal DA D1 receptors are involved in the modulation of instrumental learning biases.
|
24
|
Abstract
In order to discover the most rewarding actions, agents must collect information about their environment, potentially foregoing reward. The optimal solution to this "explore-exploit" dilemma is often computationally challenging, but principled algorithmic approximations exist. These approximations utilize uncertainty about action values in different ways. Some random exploration algorithms scale the level of choice stochasticity with the level of uncertainty. Other directed exploration algorithms add a "bonus" to action values with high uncertainty. Random exploration algorithms are sensitive to total uncertainty across actions, whereas directed exploration algorithms are sensitive to relative uncertainty. This paper reports a multi-armed bandit experiment in which total and relative uncertainty were orthogonally manipulated. We found that humans employ both exploration strategies, and that these strategies are independently controlled by different uncertainty computations.
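The contrast between the two exploration strategies can be made concrete for a two-armed bandit. The Python sketch below is a hedged illustration under assumptions, not the model reported in the paper: it assumes Gaussian beliefs about each arm (mean `mu`, standard deviation `sigma`) and a probit choice rule, and the function names and the `bonus` parameter are hypothetical. Directed exploration adds a term proportional to relative uncertainty, whereas random exploration divides the value difference by total uncertainty so that choices become more stochastic as uncertainty grows.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_first_directed(mu, sigma, bonus=1.0):
    """P(choose arm 0) with an uncertainty bonus (directed exploration).

    Sensitive to relative uncertainty: sigma[0] - sigma[1].
    """
    value_diff = mu[0] - mu[1]
    relative_uncertainty = sigma[0] - sigma[1]
    return normal_cdf(value_diff + bonus * relative_uncertainty)


def p_first_random(mu, sigma):
    """P(choose arm 0) with uncertainty-scaled noise (random exploration).

    Sensitive to total uncertainty: sqrt(sigma[0]**2 + sigma[1]**2).
    Assumes at least one sigma is positive.
    """
    value_diff = mu[0] - mu[1]
    total_uncertainty = math.sqrt(sigma[0] ** 2 + sigma[1] ** 2)
    return normal_cdf(value_diff / total_uncertainty)


if __name__ == "__main__":
    # Equal means, arm 0 more uncertain: only the directed rule favors arm 0.
    print(p_first_directed([1.0, 1.0], [2.0, 1.0]))
    print(p_first_random([1.0, 1.0], [2.0, 1.0]))
```

Because the two rules respond to different uncertainty quantities, manipulating relative and total uncertainty independently, as the experiment does, lets their contributions be separated.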
Affiliation(s)
- Samuel J Gershman
  - Department of Psychology and Center for Brain Science, Harvard University
|