1
Anlló H, Bavard S, Benmarrakchi F, Bonagura D, Cerrotti F, Cicue M, Gueguen M, Guzmán EJ, Kadieva D, Kobayashi M, Lukumon G, Sartorio M, Yang J, Zinchenko O, Bahrami B, Silva Concha J, Hertz U, Konova AB, Li J, O'Madagain C, Navajas J, Reyes G, Sarabi-Jamab A, Shestakova A, Sukumaran B, Watanabe K, Palminteri S. Comparing experience- and description-based economic preferences across 11 countries. Nat Hum Behav 2024; 8:1554-1567. [PMID: 38877287 DOI: 10.1038/s41562-024-01894-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 04/19/2024] [Indexed: 06/16/2024]
Abstract
Recent evidence indicates that reward value encoding in humans is highly context dependent, leading to suboptimal decisions in some cases, but whether this computational constraint on valuation is a shared feature of human cognition remains unknown. Here we studied the behaviour of n = 561 individuals from 11 countries of markedly different socioeconomic and cultural makeup. Our findings show that context sensitivity was present in all 11 countries. Suboptimal decisions generated by context manipulation were not explained by risk aversion, as estimated through a separate description-based choice task (that is, lotteries) consisting of matched decision offers. Conversely, risk aversion significantly differed across countries. Overall, our findings suggest that context-dependent reward value encoding is a feature of human cognition that remains consistently present across different countries, as opposed to description-based decision-making, which is more permeable to cultural factors.
Affiliation(s)
- Hernán Anlló
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, Paris, France
  - Faculty of Science and Engineering, Waseda University, Tokyo, Japan
  - Intercultural Cognitive Network, Paris, France
- Sophie Bavard
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, Paris, France
  - Intercultural Cognitive Network, Paris, France
  - General Psychology Lab, Hamburg University, Hamburg, Germany
- FatimaEzzahra Benmarrakchi
  - Intercultural Cognitive Network, Paris, France
  - School of Collective Intelligence, Université Mohammed VI Polytechnique, Rabat, Morocco
- Darla Bonagura
  - Intercultural Cognitive Network, Paris, France
  - Department of Psychiatry, University Behavioral Health Care and Brain Health Institute, Rutgers University-New Brunswick, Piscataway, NJ, USA
- Fabien Cerrotti
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, Paris, France
  - Intercultural Cognitive Network, Paris, France
- Mirona Cicue
  - Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Maelle Gueguen
  - Intercultural Cognitive Network, Paris, France
  - Department of Psychiatry, University Behavioral Health Care and Brain Health Institute, Rutgers University-New Brunswick, Piscataway, NJ, USA
- Eugenio José Guzmán
  - Facultad de Psicología, Universidad del Desarrollo, Santiago de Chile, Chile
- Dzerassa Kadieva
  - International Laboratory for Social Neurobiology, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia
- Maiko Kobayashi
  - Faculty of Science and Engineering, Waseda University, Tokyo, Japan
- Gafari Lukumon
  - School of Collective Intelligence, Université Mohammed VI Polytechnique, Rabat, Morocco
- Marco Sartorio
  - Laboratorio de Neurociencia, Universidad Torcuato Di Tella, Buenos Aires, Argentina
- Jiong Yang
  - School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Oksana Zinchenko
  - Intercultural Cognitive Network, Paris, France
  - Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia
- Bahador Bahrami
  - Intercultural Cognitive Network, Paris, France
  - Department of Psychology, Ludwig Maximilian University, Munich, Germany
- Jaime Silva Concha
  - Intercultural Cognitive Network, Paris, France
  - Facultad de Psicología, Universidad del Desarrollo, Santiago de Chile, Chile
- Uri Hertz
  - Intercultural Cognitive Network, Paris, France
  - Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Anna B Konova
  - Intercultural Cognitive Network, Paris, France
  - Department of Psychiatry, University Behavioral Health Care and Brain Health Institute, Rutgers University-New Brunswick, Piscataway, NJ, USA
- Jian Li
  - Intercultural Cognitive Network, Paris, France
  - School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
  - IDG/McGovern Institute for Brain Research, Peking University, Beijing, China
- Cathal O'Madagain
  - Intercultural Cognitive Network, Paris, France
  - School of Collective Intelligence, Université Mohammed VI Polytechnique, Rabat, Morocco
- Joaquin Navajas
  - Intercultural Cognitive Network, Paris, France
  - Laboratorio de Neurociencia, Universidad Torcuato Di Tella, Buenos Aires, Argentina
  - Escuela de Negocios, Universidad Torcuato Di Tella, Buenos Aires, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Gabriel Reyes
  - Intercultural Cognitive Network, Paris, France
  - Facultad de Psicología, Universidad del Desarrollo, Santiago de Chile, Chile
- Atiye Sarabi-Jamab
  - Intercultural Cognitive Network, Paris, France
  - School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
- Anna Shestakova
  - Intercultural Cognitive Network, Paris, France
  - Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia
- Bhasi Sukumaran
  - Intercultural Cognitive Network, Paris, France
  - Department of Clinical Psychology, SRM Medical College Hospital and Research Centre, Chennai, India
- Katsumi Watanabe
  - Faculty of Science and Engineering, Waseda University, Tokyo, Japan
  - Intercultural Cognitive Network, Paris, France
- Stefano Palminteri
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, Paris, France
  - Intercultural Cognitive Network, Paris, France
  - Departement d'études cognitives, Ecole normale supérieure, PSL Research University, Paris, France
2
Colas JT, O’Doherty JP, Grafton ST. Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts. PLoS Comput Biol 2024; 20:e1011950. [PMID: 38552190 PMCID: PMC10980507 DOI: 10.1371/journal.pcbi.1011950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 02/26/2024] [Indexed: 04/01/2024] Open
Abstract
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as those of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions.
In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
Affiliation(s)
- Jaron T. Colas
  - Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
  - Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
  - Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- John P. O’Doherty
  - Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
  - Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- Scott T. Grafton
  - Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
3
Blanco-Pozo M, Akam T, Walton ME. Dopamine-independent effect of rewards on choices through hidden-state inference. Nat Neurosci 2024; 27:286-297. [PMID: 38216649 PMCID: PMC10849965 DOI: 10.1038/s41593-023-01542-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/03/2021] [Accepted: 12/01/2023] [Indexed: 01/14/2024]
Abstract
Dopamine is implicated in adaptive behavior through reward prediction error (RPE) signals that update value estimates. There is also accumulating evidence that animals in structured environments can use inference processes to facilitate behavioral flexibility. However, it is unclear how these two accounts of reward-guided decision-making should be integrated. Using a two-step task for mice, we show that dopamine reports RPEs using value information inferred from task structure knowledge, alongside information about reward rate and movement. Nonetheless, although rewards strongly influenced choices and dopamine activity, neither activating nor inhibiting dopamine neurons at trial outcome affected future choice. These data were recapitulated by a neural network model where cortex learned to track hidden task states by predicting observations, while basal ganglia learned values and actions via RPEs. This shows that the influence of rewards on choices can stem from dopamine-independent information they convey about the world's state, not the dopaminergic RPEs they produce.
Affiliation(s)
- Marta Blanco-Pozo
  - Department of Experimental Psychology, Oxford University, Oxford, UK
  - Wellcome Centre for Integrative Neuroimaging, Oxford University, Oxford, UK
- Thomas Akam
  - Department of Experimental Psychology, Oxford University, Oxford, UK
  - Wellcome Centre for Integrative Neuroimaging, Oxford University, Oxford, UK
- Mark E Walton
  - Department of Experimental Psychology, Oxford University, Oxford, UK
  - Wellcome Centre for Integrative Neuroimaging, Oxford University, Oxford, UK
4
Pereira AR, Alemi M, Cerqueira-Nunes M, Monteiro C, Galhardo V, Cardoso-Cruz H. Dynamics of Lateral Habenula-Ventral Tegmental Area Microcircuit on Pain-Related Cognitive Dysfunctions. Neurol Int 2023; 15:1303-1319. [PMID: 37987455 PMCID: PMC10660716 DOI: 10.3390/neurolint15040082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/20/2023] [Accepted: 10/25/2023] [Indexed: 11/22/2023] Open
Abstract
Chronic pain is a health problem that affects the ability to work and perform other activities, and it generally worsens over time. Understanding the complex pain interaction with brain circuits could help predict which patients are at risk of developing central dysfunctions. Increasing evidence from preclinical and clinical studies suggests that aberrant activity of the lateral habenula (LHb) is associated with depressive symptoms characterized by excessive negative focus, leading to high-level cognitive dysfunctions. The primary output region of the LHb is the ventral tegmental area (VTA), through a bidirectional connection. Recently, there has been growing interest in the complex interactions between the LHb and VTA, particularly regarding their crucial roles in behavior regulation and their potential involvement in the pathological impact of chronic pain on cognitive functions. In this review, we briefly discuss the structural and functional roles of the LHb-VTA microcircuit and their impact on cognition and mood disorders in order to support future studies addressing brain plasticity during chronic pain conditions.
Affiliation(s)
- Ana Raquel Pereira
  - Instituto de Investigação e Inovação em Saúde - Pain Neurobiology Group, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Departamento de Biomedicina - Unidade de Biologia Experimental, Faculdade de Medicina, Universidade do Porto, Rua Doutor Plácido da Costa, 4200-450 Porto, Portugal
- Mobina Alemi
  - Instituto de Investigação e Inovação em Saúde - Pain Neurobiology Group, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Departamento de Biomedicina - Unidade de Biologia Experimental, Faculdade de Medicina, Universidade do Porto, Rua Doutor Plácido da Costa, 4200-450 Porto, Portugal
- Mariana Cerqueira-Nunes
  - Instituto de Investigação e Inovação em Saúde - Pain Neurobiology Group, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Departamento de Biomedicina - Unidade de Biologia Experimental, Faculdade de Medicina, Universidade do Porto, Rua Doutor Plácido da Costa, 4200-450 Porto, Portugal
  - Programa Doutoral em Neurociências, Faculdade de Medicina, Universidade do Porto, Rua Doutor Plácido da Costa, 4200-450 Porto, Portugal
- Clara Monteiro
  - Instituto de Investigação e Inovação em Saúde - Pain Neurobiology Group, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Departamento de Biomedicina - Unidade de Biologia Experimental, Faculdade de Medicina, Universidade do Porto, Rua Doutor Plácido da Costa, 4200-450 Porto, Portugal
- Vasco Galhardo
  - Instituto de Investigação e Inovação em Saúde - Pain Neurobiology Group, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Departamento de Biomedicina - Unidade de Biologia Experimental, Faculdade de Medicina, Universidade do Porto, Rua Doutor Plácido da Costa, 4200-450 Porto, Portugal
- Helder Cardoso-Cruz
  - Instituto de Investigação e Inovação em Saúde - Pain Neurobiology Group, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Instituto de Biologia Molecular e Celular, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
  - Departamento de Biomedicina - Unidade de Biologia Experimental, Faculdade de Medicina, Universidade do Porto, Rua Doutor Plácido da Costa, 4200-450 Porto, Portugal
5
Nelli S, Braun L, Dumbalska T, Saxe A, Summerfield C. Neural knowledge assembly in humans and neural networks. Neuron 2023; 111:1504-1516.e9. [PMID: 36898375 PMCID: PMC10618408 DOI: 10.1016/j.neuron.2023.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Revised: 12/21/2022] [Accepted: 02/09/2023] [Indexed: 03/11/2023]
Abstract
Human understanding of the world can change rapidly when new information comes to light, such as when a plot twist occurs in a work of fiction. This flexible "knowledge assembly" requires few-shot reorganization of neural codes for relations among objects and events. However, existing computational theories are largely silent about how this could occur. Here, participants learned a transitive ordering among novel objects within two distinct contexts before exposure to new knowledge that revealed how they were linked. Blood-oxygen-level-dependent (BOLD) signals in dorsal frontoparietal cortical areas revealed that objects were rapidly and dramatically rearranged on the neural manifold after minimal exposure to linking information. We then adapt online stochastic gradient descent to permit similar rapid knowledge assembly in a neural network model.
Affiliation(s)
- Stephanie Nelli
  - Department of Cognitive Science, Occidental College, Los Angeles, CA 90041, USA
  - Department of Experimental Psychology, University of Oxford, Oxford OX2 6GC, UK
- Lukas Braun
  - Department of Experimental Psychology, University of Oxford, Oxford OX2 6GC, UK
- Andrew Saxe
  - Department of Experimental Psychology, University of Oxford, Oxford OX2 6GC, UK
  - Gatsby Unit & Sainsbury Wellcome Centre, University College London, London W1T 4JG, UK
  - CIFAR Azrieli Global Scholars Program, CIFAR, Toronto, ON M5G 1M1, Canada
6
Garvert MM, Saanum T, Schulz E, Schuck NW, Doeller CF. Hippocampal spatio-predictive cognitive maps adaptively guide reward generalization. Nat Neurosci 2023; 26:615-626. [PMID: 37012381 PMCID: PMC10076220 DOI: 10.1038/s41593-023-01283-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/18/2021] [Accepted: 02/15/2023] [Indexed: 04/05/2023]
Abstract
The brain forms cognitive maps of relational knowledge, an organizing principle thought to underlie our ability to generalize and make inferences. However, how can a relevant map be selected in situations where a stimulus is embedded in multiple relational structures? Here, we find that both spatial and predictive cognitive maps influence generalization in a choice task, where spatial location determines reward magnitude. Mirroring behavior, the hippocampus not only builds a map of spatial relationships but also encodes the experienced transition structure. As the task progresses, participants' choices become more influenced by spatial relationships, reflected in a strengthening of the spatial map and a weakening of the predictive map. This change is driven by orbitofrontal cortex, which represents the degree to which an outcome is consistent with the spatial rather than the predictive map and updates hippocampal representations accordingly. Taken together, this demonstrates how hippocampal cognitive maps are used and updated flexibly for inference.
Affiliation(s)
- Mona M Garvert
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, Berlin, Germany
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Tankred Saanum
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany
- Eric Schulz
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany
- Nicolas W Schuck
  - Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, Berlin, Germany
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany
  - Institute of Psychology, Universität Hamburg, Hamburg, Germany
- Christian F Doeller
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer's Disease NTNU, Trondheim, Norway
  - Wilhelm Wundt Institute of Psychology, Leipzig University, Leipzig, Germany
7
Anlló H, Bavard S, Benmarrakchi F, Bonagura D, Cerrotti F, Cicue M, Gueguen M, Guzmán EJ, Kadieva D, Kobayashi M, Lukumon G, Sartorio M, Yang J, Zinchenko O, Bahrami B, Concha JS, Hertz U, Konova AB, Li J, O’Madagain C, Navajas J, Reyes G, Sarabi-Jamab A, Shestakova A, Sukumaran B, Watanabe K, Palminteri S. Outcome context-dependence is not WEIRD: Comparing reinforcement- and description-based economic preferences worldwide. Research Square 2023:rs.3.rs-2621222. [PMID: 36909645 PMCID: PMC10002789 DOI: 10.21203/rs.3.rs-2621222/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Abstract
Recent evidence indicates that reward value encoding in humans is highly context-dependent, leading to suboptimal decisions in some cases. But whether this computational constraint on valuation is a shared feature of human cognition remains unknown. To address this question, we studied the behavior of individuals from across 11 countries of markedly different socioeconomic and cultural makeup using an experimental approach that reliably captures context effects in reinforcement learning. Our findings show that all samples presented evidence of similar sensitivity to context. Crucially, suboptimal decisions generated by context manipulation were not explained by risk aversion, as estimated through a separate description-based choice task (i.e., lotteries) consisting of matched decision offers. Conversely, risk aversion significantly differed across countries. Overall, our findings suggest that context-dependent reward value encoding is a hardcoded feature of human cognition, while description-based decision-making is significantly sensitive to cultural factors.
Affiliation(s)
- Hernán Anlló
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, ENS-PSL, Paris, France
  - Faculty of Science and Engineering, Waseda University, Tokyo, Japan
  - Intercultural Cognitive Network
- Sophie Bavard
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, ENS-PSL, Paris, France
  - Intercultural Cognitive Network
  - General Psychology Lab, Hamburg University, Hamburg, Germany
- FatimaZzahra Benmarrakchi
  - Intercultural Cognitive Network
  - School of Collective Intelligence, Universite Mohammed VI Polytechnique, Rabat, Morocco
- Darla Bonagura
  - Intercultural Cognitive Network
  - Department of Psychiatry, University Behavioural Health Care, & the Brain Health Institute, Rutgers University-New Brunswick, Piscataway, USA
- Fabien Cerrotti
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, ENS-PSL, Paris, France
  - Intercultural Cognitive Network
- Mirona Cicue
  - Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Maelle Gueguen
  - Intercultural Cognitive Network
  - Department of Psychiatry, University Behavioural Health Care, & the Brain Health Institute, Rutgers University-New Brunswick, Piscataway, USA
- Eugenio José Guzmán
  - Facultad de Psicología, Universidad del Desarrollo, Santiago de Chile, Chile
- Dzerassa Kadieva
  - International Laboratory for Social Neurobiology, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia
- Maiko Kobayashi
  - Faculty of Science and Engineering, Waseda University, Tokyo, Japan
- Gafari Lukumon
  - School of Collective Intelligence, Universite Mohammed VI Polytechnique, Rabat, Morocco
- Marco Sartorio
  - Laboratorio de Neurociencia, Universidad Torcuato Di Tella, Buenos Aires, Argentina
- Jiong Yang
  - School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behaviour and Mental Health, Peking University, Beijing, China
- Oksana Zinchenko
  - Intercultural Cognitive Network
  - Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia
- Bahador Bahrami
  - IDG/McGovern Institute for Brain Research, Peking University, Beijing, China
  - Department of Psychology, Ludwig Maximilian University, Munich, Germany
- Jaime Silva Concha
  - Intercultural Cognitive Network
  - Facultad de Psicología, Universidad del Desarrollo, Santiago de Chile, Chile
- Uri Hertz
  - Intercultural Cognitive Network
  - Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Anna B. Konova
  - Intercultural Cognitive Network
  - Department of Psychiatry, University Behavioural Health Care, & the Brain Health Institute, Rutgers University-New Brunswick, Piscataway, USA
- Jian Li
  - Intercultural Cognitive Network
  - School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behaviour and Mental Health, Peking University, Beijing, China
  - IDG/McGovern Institute for Brain Research, Peking University, Beijing, China
- Cathal O’Madagain
  - Intercultural Cognitive Network
  - School of Collective Intelligence, Universite Mohammed VI Polytechnique, Rabat, Morocco
- Joaquin Navajas
  - Intercultural Cognitive Network
  - Laboratorio de Neurociencia, Universidad Torcuato Di Tella, Buenos Aires, Argentina
  - Escuela de Negocios, Universidad Torcuato Di Tella, Buenos Aires, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
- Gabriel Reyes
  - Intercultural Cognitive Network
  - Facultad de Psicología, Universidad del Desarrollo, Santiago de Chile, Chile
- Atiye Sarabi-Jamab
  - Intercultural Cognitive Network
  - School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Anna Shestakova
  - Intercultural Cognitive Network
  - Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia
- Bhasi Sukumaran
  - Intercultural Cognitive Network
  - Department of Clinical Psychology, SRM Medical College Hospital & Research Centre, Chennai, India
- Katsumi Watanabe
  - Faculty of Science and Engineering, Waseda University, Tokyo, Japan
  - Intercultural Cognitive Network
- Stefano Palminteri
  - Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, ENS-PSL, Paris, France
  - Intercultural Cognitive Network
8
Wimmer GE, Liu Y, McNamee DC, Dolan RJ. Distinct replay signatures for prospective decision-making and memory preservation. Proc Natl Acad Sci U S A 2023; 120:e2205211120. [PMID: 36719914 PMCID: PMC9963918 DOI: 10.1073/pnas.2205211120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Accepted: 12/05/2022] [Indexed: 02/01/2023] Open
Abstract
Theories of neural replay propose that it supports a range of functions, most prominently planning and memory consolidation. Here, we test the hypothesis that distinct signatures of replay in the same task are related to model-based decision-making ("planning") and memory preservation. We designed a reward learning task wherein participants utilized structure knowledge for model-based evaluation, while at the same time had to maintain knowledge of two independent and randomly alternating task environments. Using magnetoencephalography and multivariate analysis, we first identified temporally compressed sequential reactivation, or replay, both prior to choice and following reward feedback. Before choice, prospective replay strength was enhanced for the current task-relevant environment when a model-based planning strategy was beneficial. Following reward receipt, and consistent with a memory preservation role, replay for the alternative distal task environment was enhanced as a function of decreasing recency of experience with that environment. Critically, these planning and memory preservation relationships were selective to pre-choice and post-feedback periods, respectively. Our results provide support for key theoretical proposals regarding the functional role of replay and demonstrate that the relative strength of planning and memory-related signals are modulated by ongoing computational and task demands.
Affiliation(s)
- G. Elliott Wimmer
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London WC1B 5EH, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK
- Yunzhe Liu
  - State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
  - Chinese Institute for Brain Research, Beijing 100875, China
- Daniel C. McNamee
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London WC1B 5EH, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK
  - Neuroscience Programme, Champalimaud Research, Lisbon 1400-038, Portugal
- Raymond J. Dolan
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London WC1B 5EH, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK
  - State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
9
Horwath EA, Rouhani N, DuBrow S, Murty VP. Value restructures the organization of free recall. Cognition 2023; 231:105315. [PMID: 36399901 PMCID: PMC9839530 DOI: 10.1016/j.cognition.2022.105315] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/03/2022] [Revised: 10/12/2022] [Accepted: 10/22/2022] [Indexed: 11/17/2022]
Abstract
A large body of research illustrates the prioritization of goal-relevant information in memory; however, it is unclear how reward-related memories are organized. Using a rewarded free recall paradigm, we investigated how reward motivation structures the organization of memory around temporal and higher-order contexts. To better understand these processes, we simulated our findings using a reward-modulated variant of the Context Maintenance and Retrieval Model (CMR; Polyn et al., 2009). In the first study, we found that reward did not influence temporal clustering, but instead shifted the organization of memory based on reward category. Further, we showed that a reward-modulated learning rate and source features of CMR most accurately depict reward's enhancement on memory and clustering by value. In a second study, we showed that reward-memory effects can exist in both extended periods of sustained motivation and frequent changes in motivation, by showing equivalent reward effects using mixed- and pure-list motivation manipulations. However, we showed that a reward-modulated learning rate in isolation can support reward's enhancement of memory in pure-list contexts. Overall, we conclude that reward-related memories are adaptively organized by higher-order value information, and contextual binding to value contexts may only be necessary when rewards are intermittent versus sustained.
Affiliation(s)
- Elizabeth A Horwath
- Department of Psychology, Temple University, Philadelphia, PA, United States of America
- Nina Rouhani
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, United States of America
- Sarah DuBrow
- Department of Psychology, University of Oregon, Eugene, OR, United States of America
- Vishnu P Murty
- Department of Psychology, Temple University, Philadelphia, PA, United States of America
|
10
|
Polti I, Nau M, Kaplan R, van Wassenhove V, Doeller CF. Rapid encoding of task regularities in the human hippocampus guides sensorimotor timing. eLife 2022; 11:e79027. [PMID: 36317500 PMCID: PMC9625083 DOI: 10.7554/elife.79027] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/28/2022] [Accepted: 10/02/2022] [Indexed: 11/17/2022] Open
Abstract
The brain encodes the statistical regularities of the environment in a task-specific yet flexible and generalizable format. Here, we seek to understand this process by bridging two parallel lines of research, one centered on sensorimotor timing, and the other on cognitive mapping in the hippocampal system. By combining functional magnetic resonance imaging (fMRI) with a fast-paced time-to-contact (TTC) estimation task, we found that the hippocampus signaled behavioral feedback received in each trial as well as performance improvements across trials along with reward-processing regions. Critically, it signaled performance improvements independent from the tested intervals, and its activity accounted for the trial-wise regression-to-the-mean biases in TTC estimation. This is in line with the idea that the hippocampus supports the rapid encoding of temporal context even on short time scales in a behavior-dependent manner. Our results emphasize the central role of the hippocampus in statistical learning and position it at the core of a brain-wide network updating sensorimotor representations in real time for flexible behavior.
Affiliation(s)
- Ignacio Polti
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Matthias Nau
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Raphael Kaplan
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Department of Basic Psychology, Clinical Psychology, and Psychobiology, Universitat Jaume I, Castellón de la Plana, Spain
- Virginie van Wassenhove
- CEA DRF/Joliot, NeuroSpin; INSERM, Cognitive Neuroimaging Unit; CNRS, Université Paris-Saclay, Gif-Sur-Yvette, France
- Christian F Doeller
- Kavli Institute for Systems Neuroscience, Centre for Neural Computation, The Egil and Pauline Braathen and Fred Kavli Centre for Cortical Microcircuits, Jebsen Centre for Alzheimer’s Disease, Norwegian University of Science and Technology, Trondheim, Norway
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Wilhelm Wundt Institute of Psychology, Leipzig University, Leipzig, Germany
|
11
|
Colas JT, Dundon NM, Gerraty RT, Saragosa‐Harris NM, Szymula KP, Tanwisuth K, Tyszka JM, van Geen C, Ju H, Toga AW, Gold JI, Bassett DS, Hartley CA, Shohamy D, Grafton ST, O'Doherty JP. Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T. Hum Brain Mapp 2022; 43:4750-4790. [PMID: 35860954 PMCID: PMC9491297 DOI: 10.1002/hbm.25988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2022] [Revised: 05/20/2022] [Accepted: 06/10/2022] [Indexed: 11/12/2022] Open
Abstract
The model-free algorithms of "reinforcement learning" (RL) have gained clout across disciplines, but so too have model-based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This "generalized reinforcement learning" (GRL) model, a frugal extension of RL, parsimoniously retains the single reward-prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal-learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high-resolution high-field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value-based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations.
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
- Neil M. Dundon
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Department of Child and Adolescent Psychiatry, Psychotherapy, and Psychosomatics, University of Freiburg, Freiburg im Breisgau, Germany
- Raphael T. Gerraty
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Center for Science and Society, Columbia University, New York, New York, USA
- Natalie M. Saragosa‐Harris
- Department of Psychology, New York University, New York, New York, USA
- Department of Psychology, University of California, Los Angeles, California, USA
- Karol P. Szymula
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Koranis Tanwisuth
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Department of Psychology, University of California, Berkeley, California, USA
- J. Michael Tyszka
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Camilla van Geen
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Harang Ju
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Arthur W. Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
- Joshua I. Gold
- Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Dani S. Bassett
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Santa Fe Institute, Santa Fe, New Mexico, USA
- Catherine A. Hartley
- Department of Psychology, New York University, New York, New York, USA
- Center for Neural Science, New York University, New York, New York, USA
- Daphna Shohamy
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Kavli Institute for Brain Science, Columbia University, New York, New York, USA
- Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- John P. O'Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
|
12
|
Samborska V, Butler JL, Walton ME, Behrens TEJ, Akam T. Complementary task representations in hippocampus and prefrontal cortex for generalizing the structure of problems. Nat Neurosci 2022; 25:1314-1326. [PMID: 36171429 PMCID: PMC9534768 DOI: 10.1038/s41593-022-01149-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 03/05/2021] [Accepted: 07/19/2022] [Indexed: 11/16/2022]
Abstract
Humans and other animals effortlessly generalize prior knowledge to solve novel problems, by abstracting common structure and mapping it onto new sensorimotor specifics. To investigate how the brain achieves this, in this study, we trained mice on a series of reversal learning problems that shared the same structure but had different physical implementations. Performance improved across problems, indicating transfer of knowledge. Neurons in medial prefrontal cortex (mPFC) maintained similar representations across problems despite their different sensorimotor correlates, whereas hippocampal (dCA1) representations were more strongly influenced by the specifics of each problem. This was true for both representations of the events that comprised each trial and those that integrated choices and outcomes over multiple trials to guide an animal’s decisions. These data suggest that prefrontal cortex and hippocampus play complementary roles in generalization of knowledge: PFC abstracts the common structure among related problems, and hippocampus maps this structure onto the specifics of the current situation. Samborska et al. trained mice on a set of problems with the same structure but different physical layouts to study generalization. Neurons in prefrontal cortex generalized across problems, whereas those in hippocampus were more problem specific.
Affiliation(s)
- Veronika Samborska
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- James L Butler
- Department of Clinical and Movement Neurosciences, University College London, London, UK; Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK
- Mark E Walton
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK; Department of Experimental Psychology, University of Oxford, Oxford, UK
- Timothy E J Behrens
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK; Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK; Wellcome Centre for Human Neuroimaging, University College London, London, UK
- Thomas Akam
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK; Department of Experimental Psychology, University of Oxford, Oxford, UK
|
13
|
Efficient coding of cognitive variables underlies dopamine response and choice behavior. Nat Neurosci 2022; 25:738-748. [PMID: 35668173 DOI: 10.1038/s41593-022-01085-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/10/2020] [Accepted: 04/26/2022] [Indexed: 11/26/2022]
Abstract
Reward expectations based on internal knowledge of the external environment are a core component of adaptive behavior. However, internal knowledge may be inaccurate or incomplete due to errors in sensory measurements. Some features of the environment may also be encoded inaccurately to minimize representational costs associated with their processing. In this study, we investigated how reward expectations are affected by features of internal representations by studying behavior and dopaminergic activity while mice make time-based decisions. We show that several possible representations allow a reinforcement learning agent to model animals' overall performance during the task. However, only a small subset of highly compressed representations simultaneously reproduced the co-variability in animals' choice behavior and dopaminergic activity. Strikingly, these representations predict an unusual distribution of response times that closely match animals' behavior. These results inform how constraints of representational efficiency may be expressed in encoding representations of dynamic cognitive variables used for reward-based computations.
|
14
|
Abir Y, Marvin CB, van Geen C, Leshkowitz M, Hassin RR, Shohamy D. An energizing role for motivation in information-seeking during the early phase of the COVID-19 pandemic. Nat Commun 2022; 13:2310. [PMID: 35484153 PMCID: PMC9050882 DOI: 10.1038/s41467-022-30011-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/11/2021] [Accepted: 04/07/2022] [Indexed: 11/18/2022] Open
Abstract
The COVID-19 pandemic has highlighted the importance of understanding and managing information seeking behavior. Information-seeking in humans is often viewed as irrational rather than utility maximizing. Here, we hypothesized that this apparent disconnect between utility and information-seeking is due to a latent third variable, motivation. We quantified information-seeking, learning, and COVID-19-related concern (which we used as a proxy for motivation regarding COVID-19 and the changes in circumstance it caused) in a US-based sample (n = 5376) during spring 2020. We found that self-reported levels of COVID-19 concern were associated with directed seeking of COVID-19-related content and better memory for such information. Interestingly, this specific motivational state was also associated with a general enhancement of information-seeking for content unrelated to COVID-19. These effects were associated with commensurate changes to utility expectations and were dissociable from the influence of non-specific anxiety. Thus, motivation both directs and energizes epistemic behavior, linking together utility and curiosity. Information-seeking behavior in humans is often viewed as irrational rather than utility maximizing. Here the authors describe data obtained in Spring 2020 showing that participants’ concern about COVID-19 was related not only to their drive to seek information about the virus, but also to their curiosity about other more general topics.
Affiliation(s)
- Yaniv Abir
- Department of Psychology, Columbia University, New York, NY, USA
- Camilla van Geen
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Maya Leshkowitz
- Department of Cognitive Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Ran R Hassin
- Department of Psychology and The Federmann Center for the Study of Rationality, The Hebrew University of Jerusalem, Jerusalem, Israel
- Daphna Shohamy
- Department of Psychology, Columbia University, New York, NY, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Kavli Institute for Brain Science, Columbia University, New York, NY, USA
|
15
|
Rmus M, Ritz H, Hunter LE, Bornstein AM, Shenhav A. Humans can navigate complex graph structures acquired during latent learning. Cognition 2022; 225:105103. [PMID: 35364400 PMCID: PMC9201735 DOI: 10.1016/j.cognition.2022.105103] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/08/2021] [Revised: 03/09/2022] [Accepted: 03/20/2022] [Indexed: 11/03/2022]
Abstract
Humans appear to represent many forms of knowledge in associative networks whose nodes are multiply connected, including sensory, spatial, and semantic. Recent work has shown that explicitly augmenting artificial agents with such graph-structured representations endows them with more human-like capabilities of compositionality and transfer learning. An open question is how humans acquire these representations. Previously, it has been shown that humans can learn to navigate graph-structured conceptual spaces on the basis of direct experience with trajectories that intentionally draw the network contours (Schapiro, Kustner, & Turk-Browne, 2012; Schapiro, Turk-Browne, Botvinick, & Norman, 2016), or through direct experience with rewards that covary with the underlying associative distance (Wu, Schulz, Speekenbrink, Nelson, & Meder, 2018). Here, we provide initial evidence that this capability is more general, extending to learning to reason about shortest-path distances across a graph structure acquired across disjoint experiences with randomized edges of the graph - a form of latent learning. In other words, we show that humans can infer graph structures, assembling them from disordered experiences. We further show that the degree to which individuals learn to reason correctly and with reference to the structure of the graph corresponds to their propensity, in a separate task, to use model-based reinforcement learning to achieve rewards. This connection suggests that the correct acquisition of graph-structured relationships is a central ability underlying forward planning and reasoning, and may be a core computation across the many domains in which graph-based reasoning is advantageous.
Affiliation(s)
- Milena Rmus
- Department of Psychology, University of California, Berkeley, USA
- Harrison Ritz
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, USA
- Aaron M Bornstein
- Department of Cognitive Sciences, University of California, Irvine, USA; Center for the Neurobiology of Learning and Memory, University of California, Irvine, USA
- Amitai Shenhav
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, USA; Carney Institute for Brain Science, Brown University, USA
|
16
|
Reward learning and working memory: Effects of massed versus spaced training and post-learning delay period. Mem Cognit 2021; 50:312-324. [PMID: 34519968 PMCID: PMC8821056 DOI: 10.3758/s13421-021-01233-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 08/18/2021] [Indexed: 11/29/2022]
Abstract
Neuroscience research has illuminated the mechanisms supporting learning from reward feedback, demonstrating a critical role for the striatum and midbrain dopamine system. However, in humans, short-term working memory that is dependent on frontal and parietal cortices can also play an important role, particularly in commonly used paradigms in which learning is relatively condensed in time. Given the growing use of reward-based learning tasks in translational studies in computational psychiatry, it is important to understand the extent of the influence of working memory and also how core gradual learning mechanisms can be better isolated. In our experiments, we manipulated the spacing between repetitions along with a post-learning delay preceding a test phase. We found that learning was slower for stimuli repeated after a long delay (spaced-trained) compared to those repeated immediately (massed-trained), likely reflecting the remaining contribution of feedback learning mechanisms when working memory is not available. For massed learning, brief interruptions led to drops in subsequent performance, and individual differences in working memory capacity positively correlated with overall performance. Interestingly, when tested after a delay period but not immediately, relative preferences decayed in the massed condition and increased in the spaced condition. Our results provide additional support for a large role of working memory in reward-based learning in temporally condensed designs. We suggest that spacing training within or between sessions is a promising approach to better isolate and understand mechanisms supporting gradual reward-based learning, with particular importance for understanding potential learning dysfunctions in addiction and psychiatric disorders.
|
17
|
Abstract
Efficacy of treatment is heavily dependent on experience and expectations. Moreover, humans can generalize from one experience to a perceptually similar but novel situation. We investigated whether and how this applies to pain relief, using ecologically valid tonic pain stimuli treated by surreptitiously lowering the applied temperature. Using different face cues, participants experienced better treatment from one physician than another. Participants were then tested on 6 additional face cues perceptually lying between both faces. Our data from 2 independent samples (N = 18 and N = 39) show a treatment experience effect, i.e., for physically identical treatments, the initially superior physician was reported to deliver stronger pain relief. More importantly, the other faces on the perceptual continuum showed a graded effect of pain relief, indicating placebo generalization. Introducing a paradigm feasible to induce placebo pain relief, we show that the generic learning principle of generalization can explain carryover effects between learned and novel treatment situations.
|
18
|
Ma SS, Li CSR, Zhang S, Worhunsky PD, Zhou N, Zhang JT, Liu L, Yao YW, Fang XY. Altered functional network activities for behavioral adjustments and Bayesian learning in young men with Internet gaming disorder. J Behav Addict 2021; 10:112-122. [PMID: 33704083 PMCID: PMC8969861 DOI: 10.1556/2006.2021.00010] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/27/2020] [Revised: 11/02/2020] [Accepted: 02/06/2021] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND AND AIMS: Deficits in cognitive control represent a core feature of addiction. Internet Gaming Disorder (IGD) offers an ideal model to study the mechanisms underlying cognitive control deficits in addiction, eliminating the confounding effects of substance use. Studies have reported behavioral and neural deficits in reactive control in IGD, but it remains unclear whether individuals with IGD are compromised in proactive control or in behavioral adjustment by learning from changing contexts. METHODS: Here, fMRI data of 21 male young adults with IGD and 21 matched healthy controls (HC) were collected during a stop-signal task. We employed group independent component analysis to investigate group differences in temporally coherent, large-scale functional network activities during post-error slowing, a typical type of behavioral adjustment. We also employed a Bayesian belief model to quantify the trial-by-trial learning of the likelihood of a stop signal, P(Stop), a broader process underlying behavioral adjustment, and identified the alterations in functional network responses to P(Stop). RESULTS: The results showed diminished engagement of the fronto-parietal network during post-error slowing, and weaker activity in the ventral attention and anterior default mode networks in response to P(Stop) in IGD relative to HC. DISCUSSION AND CONCLUSIONS: These results add to the literature by suggesting deficits in updating and anticipating conflicts as well as in behavioral adjustment according to contextual information in individuals with IGD.
Affiliation(s)
- Shan-Shan Ma
- Institute of Developmental Psychology, Beijing Normal University, Beijing, China
- Chiang-Shan R. Li
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA; Department of Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Sheng Zhang
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Patrick D. Worhunsky
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Nan Zhou
- Faculty of Education, Beijing Normal University, Beijing, China
- Jin-Tao Zhang
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Lab of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China; Corresponding authors: J.-T. Zhang, X.-Y. Fang
- Lu Liu
- Institute of Developmental Psychology, Beijing Normal University, Beijing, China; German Institute of Human Nutrition Potsdam-Rehbruecke, 14558 Nuthetal, Germany
- Yuan-Wei Yao
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; Einstein Center for Neurosciences Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Xiao-Yi Fang
- Institute of Developmental Psychology, Beijing Normal University, Beijing, China; Corresponding authors: J.-T. Zhang, X.-Y. Fang
|
19
|
Baram AB, Muller TH, Nili H, Garvert MM, Behrens TEJ. Entorhinal and ventromedial prefrontal cortices abstract and generalize the structure of reinforcement learning problems. Neuron 2021; 109:713-723.e7. [PMID: 33357385 PMCID: PMC7889496 DOI: 10.1016/j.neuron.2020.11.024] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 08/03/2020] [Revised: 10/09/2020] [Accepted: 11/19/2020] [Indexed: 11/25/2022]
Abstract
Knowledge of the structure of a problem, such as relationships between stimuli, enables rapid learning and flexible inference. Humans and other animals can abstract this structural knowledge and generalize it to solve new problems. For example, in spatial reasoning, shortest-path inferences are immediate in new environments. Spatial structural transfer is mediated by cells in entorhinal and (in humans) medial prefrontal cortices, which maintain their co-activation structure across different environments and behavioral states. Here, using fMRI, we show that entorhinal and ventromedial prefrontal cortex (vmPFC) representations perform a much broader role in generalizing the structure of problems. We introduce a task-remapping paradigm, where subjects solve multiple reinforcement learning (RL) problems differing in structural or sensory properties. We show that, as with space, entorhinal representations are preserved across different RL problems only if task structure is preserved. In vmPFC and ventral striatum, representations of prediction error also depend on task structure.
Affiliation(s)
- Alon Boaz Baram
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK
- Timothy Howard Muller
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK
- Hamed Nili
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK
- Mona Maria Garvert
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK; Max-Planck-Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103, Leipzig, Germany
- Timothy Edward John Behrens
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK; Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3AR, UK
|
20
|
Miendlarzewska EA, Aberg KC, Bavelier D, Schwartz S. Prior Reward Conditioning Dampens Hippocampal and Striatal Responses during an Associative Memory Task. J Cogn Neurosci 2020; 33:402-421. [PMID: 33326326 DOI: 10.1162/jocn_a_01660] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022]
Abstract
Offering reward during encoding typically leads to better memory [Adcock, R. A., Thangavel, A., Whitfield-Gabrieli, S., Knutson, B., & Gabrieli, J. D. E. Reward-motivated learning: Mesolimbic activation precedes memory formation. Neuron, 50, 507-517, 2006]. Whether such memory benefit persists when tested in a different task context remains, however, largely understudied [Wimmer, G. E., & Buechel, C. Reactivation of reward-related patterns from single past episodes supports memory-based decision making. Journal of Neuroscience, 36, 2868-2880, 2016]. Here, we ask whether reward at encoding leads to a generalized advantage across learning episodes, a question of high importance for any everyday life applications, from education to patient rehabilitation. Although we confirmed that offering monetary reward increased responses in the ventral striatum and pleasantness judgments for pictures used as stimuli, this immediate beneficial effect of reward did not carry over to a subsequent and different picture-location association memory task during which no reward was delivered. If anything, a trend for impaired memory accuracy was observed for the initially high-rewarded pictures as compared to low-rewarded ones. In line with this trend in behavioral performance, fMRI activity in reward (i.e., ventral striatum) and in memory (i.e., hippocampus) circuits was reduced during the encoding of new associations using previously highly rewarded pictures (compared to low-reward pictures). These neural effects extended to new pictures from the same, previously highly rewarded semantic category. Twenty-four hours later, delayed recall of associations involving originally highly rewarded items was accompanied by decreased functional connectivity between the hippocampus and two brain regions implicated in value-based learning, the ventral striatum and the ventromedial PFC. We conclude that acquired reward value elicits a downward value-adjustment signal in the human reward circuit when reactivated in a novel nonrewarded context, with a parallel disengagement of memory-reward (hippocampal-striatal) networks, likely to undermine new associative learning. Although reward is known to promote learning, here we show how it may subsequently hinder hippocampal and striatal responses during new associative memory formation.
Affiliation(s)
- Ewa A Miendlarzewska
- University of Geneva; Campus Biotech, Geneva, Switzerland; Montpellier Business School
|
21
|
van de Vijver I, Ligneul R. Relevance of working memory for reinforcement learning in older adults varies with timescale of learning. Neuropsychology, Development, and Cognition. Section B, Aging, Neuropsychology and Cognition 2020; 27:654-676. [PMID: 31544587 DOI: 10.1080/13825585.2019.1664389] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/09/2018] [Accepted: 09/02/2019] [Indexed: 06/10/2023]
Abstract
In young adults, individual differences in working memory (WM) contribute to reinforcement learning (RL). Age-related RL changes, however, are mostly attributed to decreased reward prediction-error (RPE) signaling. Here, we investigated the contribution of WM to RL in young (18-35) and older (≥65) adults. Because WM supports maintenance across a limited timescale, we only expected a relation between RL and WM with short delays between stimulus repetitions. Our results demonstrated better learning with short than long delays. A week later, however, long-delay associations were remembered better. Computational modeling corroborated that during learning, WM was more engaged by young adults in the short-delay condition than in any other age-condition combination. Crucially, both model-derived and neuropsychological assessments of WM predicted short-delay learning in older adults, who further benefitted from using self-conceived learning strategies. Thus, depending on the timescale of learning, age-related RL changes may not only reflect decreased RPE signaling but also WM decline.
Affiliation(s)
- Irene van de Vijver
- Behavioural Science Institute, Radboud University, Nijmegen, The Netherlands
- Department of Clinical Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
| | - Romain Ligneul
- Champalimaud Neuroscience Program, Champalimaud Foundation, Lisboa, Portugal
|
22
|
Collins AGE, Cockburn J. Beyond dichotomies in reinforcement learning. Nat Rev Neurosci 2020; 21:576-586. [PMID: 32873936 DOI: 10.1038/s41583-020-0355-6] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Accepted: 07/20/2020] [Indexed: 11/09/2022]
Abstract
Reinforcement learning (RL) is a framework of particular importance to psychology, neuroscience and machine learning. Interactions between these fields, as promoted through the common hub of RL, have facilitated paradigm shifts that relate multiple levels of analysis in a singular framework (for example, relating dopamine function to a computationally defined RL signal). Recently, more sophisticated RL algorithms have been proposed to better account for human learning, and in particular its oft-documented reliance on two separable systems: a model-based (MB) system and a model-free (MF) system. However, along with many benefits, this dichotomous lens can distort questions, and may contribute to an unnecessarily narrow perspective on learning and decision-making. Here, we outline some of the consequences that come from overconfidently mapping algorithms, such as MB versus MF RL, with putative cognitive processes. We argue that the field is well positioned to move beyond simplistic dichotomies, and we propose a means of refocusing research questions towards the rich and complex components that comprise learning and decision-making.
Affiliation(s)
- Anne G E Collins
- Department of Psychology and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA.
| | - Jeffrey Cockburn
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
|
23
|
Functional connectivity between memory and reward centers across task and rest track memory sensitivity to reward. Cognitive Affective & Behavioral Neuroscience 2020; 19:503-522. [PMID: 30805850 DOI: 10.3758/s13415-019-00700-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/08/2022]
Abstract
External motivation, such as a promise of future monetary reward for remembering an event, can affect which events are remembered. Reward-based memory modulation is thought to result from encoding and post-encoding interactions between dopaminergic midbrain, signaling reward, and hippocampus and parahippocampal cortex, supporting episodic memory. We asked whether hippocampal and parahippocampal interactions with other reward-related regions are related to reward modulation of memory and whether such relationships are stable over time. Individuals' memory sensitivity to reward was measured using a monetary incentive encoding task in which a cue indicated potential monetary reward (penny, dime, or dollar) for remembering an upcoming object pair. Functional connectivity between memory and reward regions was measured before, during, and following the task. Reward-related regions of interest were generated using a meta-analysis of existing studies on reward and included ventral striatum, medial and orbital prefrontal cortices and anterior cingulate cortex, in addition to midbrain. The results showed that connectivity between memory and reward regions tracked individual differences in reward modulation of memory, irrespective of when connectivity was measured. Connectivity patterns of anterior cingulate, orbitofrontal cortex, and ventral striatum covaried together and tracked behavior most strongly. These findings implicate a broader set of reward regions in reward modulation of memory than considered previously and provide new evidence that stable connectivity patterns between memory and reward centers relate to individual differences in how reward impacts memory.
|
24
|
Hippocampal contributions to value-based learning: Converging evidence from fMRI and amnesia. Cognitive Affective & Behavioral Neuroscience 2020; 19:523-536. [PMID: 30767129 DOI: 10.3758/s13415-018-00687-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 12/13/2022]
Abstract
Recent evidence suggests that the human hippocampus, known primarily for its involvement in episodic memory, plays a role in a host of motivationally relevant behaviors, including some forms of value-based decision-making. However, less is known about the role of the hippocampus in value-based learning. Such learning is typically associated with a striatal system, yet a small number of studies, both in human and nonhuman species, suggest hippocampal engagement. It is not clear, however, whether this engagement is necessary for such learning. In the present study, we used both functional MRI (fMRI) and lesion-based neuropsychological methods to clarify hippocampal contributions to value-based learning. In Experiment 1, healthy participants were scanned while learning value-based contingencies (whether players in a "game" win money) in the context of a probabilistic learning task. Here, we observed recruitment of the hippocampus, in addition to the expected ventral striatal (nucleus accumbens) activation that typically accompanies such learning. In Experiment 2, we administered this task to amnesic patients with medial temporal lobe damage and to healthy controls. Amnesic patients, including those with damage circumscribed to the hippocampus, failed to acquire value-based contingencies, thus confirming that hippocampal engagement is necessary for task performance. Control experiments established that this impairment was not due to perceptual demands or memory load. Future research is needed to clarify the mechanisms by which the hippocampus contributes to value-based learning, but these findings point to a broader role for the hippocampus in goal-directed behaviors than previously appreciated.
|
25
|
Huang Y, Yaple ZA, Yu R. Goal-oriented and habitual decisions: Neural signatures of model-based and model-free learning. Neuroimage 2020; 215:116834. [PMID: 32283275 DOI: 10.1016/j.neuroimage.2020.116834] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/26/2019] [Revised: 03/03/2020] [Accepted: 04/08/2020] [Indexed: 11/26/2022] Open
Abstract
Human decision-making is mainly driven by two fundamental learning processes: a slow, deliberative, goal-directed model-based process that maps out the potential outcomes of all options and a rapid habitual model-free process that enables reflexive repetition of previously successful choices. Although many model-informed neuroimaging studies have examined the neural correlates of model-based and model-free learning, the concordant activity among these two processes remains unclear. We used quantitative meta-analyses of functional magnetic resonance imaging experiments to identify the concordant activity pertaining to model-based and model-free learning over a range of reward-related paradigms. We found that: 1) both processes yielded concordant ventral striatum activity, 2) model-based learning activated the medial prefrontal cortex and orbital frontal cortex, and 3) model-free learning specifically activated the left globus pallidus and right caudate head. Our findings suggest that model-free and model-based decision making engage overlapping yet distinct neural regions. These stereotaxic maps improve our understanding of how deliberative goal-directed and reflexive habitual learning are implemented in the brain.
Affiliation(s)
- Yi Huang
- NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore
| | - Zachary A Yaple
- Department of Psychology, National University of Singapore, Singapore
| | - Rongjun Yu
- NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore; Department of Psychology, National University of Singapore, Singapore.
|
26
|
Schulz E, Franklin NT, Gershman SJ. Finding structure in multi-armed bandits. Cogn Psychol 2020; 119:101261. [PMID: 32059133 DOI: 10.1016/j.cogpsych.2019.101261] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/29/2018] [Revised: 11/10/2019] [Accepted: 12/02/2019] [Indexed: 12/24/2022]
Abstract
How do humans search for rewards? This question is commonly studied using multi-armed bandit tasks, which require participants to trade off exploration and exploitation. Standard multi-armed bandits assume that each option has an independent reward distribution. However, learning about options independently is unrealistic, since in the real world options often share an underlying structure. We study a class of structured bandit tasks, which we use to probe how generalization guides exploration. In a structured multi-armed bandit, options have a correlation structure dictated by a latent function. We focus on bandits in which rewards are linear functions of an option's spatial position. Across 5 experiments, we find evidence that participants utilize functional structure to guide their exploration, and also exhibit a learning-to-learn effect across rounds, becoming progressively faster at identifying the latent function. Our experiments rule out several heuristic explanations and show that the same findings obtain with non-linear functions. Comparing several models of learning and decision making, we find that the best model of human behavior in our tasks combines three computational mechanisms: (1) function learning, (2) clustering of reward distributions across rounds, and (3) uncertainty-guided exploration. Our results suggest that human reinforcement learning can utilize latent structure in sophisticated ways to improve efficiency.
|
27
|
|
28
|
Duncan K, Semmler A, Shohamy D. Modulating the Use of Multiple Memory Systems in Value-based Decisions with Contextual Novelty. J Cogn Neurosci 2019; 31:1455-1467. [PMID: 31322467 DOI: 10.1162/jocn_a_01447] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/04/2022]
Abstract
With multiple learning and memory systems at its disposal, the human brain can represent the past in many ways, from extracting regularities across similar experiences (incremental learning) to storing rich, idiosyncratic details of individual events (episodic memory). The unique information carried by these neurologically distinct forms of memory can bias our behavior in different directions, raising crucial questions about how these memory systems interact to guide choice and the factors that cause one to dominate. Here, we devised a new approach to estimate how decisions are independently influenced by episodic memories and incremental learning. Furthermore, we identified a biologically motivated factor that biases the use of different memory types: the detection of novelty versus familiarity. Consistent with computational models of cholinergic memory modulation, we find that choices are more influenced by episodic memories following the recognition of an unrelated familiar image but more influenced by incrementally learned values after the detection of a novel image. Together this work provides a new behavioral tool enabling the disambiguation of key memory behaviors thought to be supported by distinct neural systems while also identifying a theoretically important and broadly applicable manipulation to bias the arbitration between these two sources of memories.
|
29
|
Schulz E, Bhui R, Love BC, Brier B, Todd MT, Gershman SJ. Structured, uncertainty-driven exploration in real-world consumer choice. Proc Natl Acad Sci U S A 2019; 116:13903-13908. [PMID: 31235598 PMCID: PMC6628813 DOI: 10.1073/pnas.1821028116] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 01/13/2023] Open
Abstract
Making good decisions requires people to appropriately explore their available options and generalize what they have learned. While computational models can explain exploratory behavior in constrained laboratory tasks, it is unclear to what extent these models generalize to real-world choice problems. We investigate the factors guiding exploratory behavior in a dataset consisting of 195,333 customers placing 1,613,967 orders from a large online food delivery service. We find important hallmarks of adaptive exploration and generalization, which we analyze using computational models. In particular, customers seem to engage in uncertainty-directed exploration and use feature-based generalization to guide their exploration. Our results provide evidence that people use sophisticated strategies to explore complex, real-world environments.
Affiliation(s)
- Eric Schulz
- Department of Psychology, Harvard University, Cambridge, MA 02138
| | - Rahul Bhui
- Department of Psychology, Harvard University, Cambridge, MA 02138
| | - Bradley C Love
- Department of Experimental Psychology, University College London, London WC1H 0AP, United Kingdom
- The Alan Turing Institute, London NW1 2DB, United Kingdom
| | - Bastien Brier
- Data Science Team, Deliveroo, London EC4R 3TE, United Kingdom
| | - Michael T Todd
- Data Science Team, Deliveroo, London EC4R 3TE, United Kingdom
|
30
|
Benoit RG, Paulus PC, Schacter DL. Forming attitudes via neural activity supporting affective episodic simulations. Nat Commun 2019; 10:2215. [PMID: 31101806 PMCID: PMC6525197 DOI: 10.1038/s41467-019-09961-w] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 06/26/2018] [Accepted: 03/29/2019] [Indexed: 01/21/2023] Open
Abstract
Humans have the adaptive capacity for imagining hypothetical episodes. Such episodic simulation is based on a neural network that includes the ventromedial prefrontal cortex (vmPFC). This network draws on existing knowledge (e.g., of familiar people and places) to construct imaginary events (e.g., meeting with the person at that place). Here, we test the hypothesis that a simulation changes attitudes towards its constituent elements. In two experiments, we demonstrate how imagining meeting liked versus disliked people (unconditioned stimuli, UCS) at initially neutral places (conditioned stimuli, CS) changes the value of these places. We further provide evidence that the vmPFC codes for representations of those elements (i.e., of individual people and places). Critically, attitude changes induced by the liked UCS are based on a transfer of positive affective value between the representations (i.e., from the UCS to the CS). Thereby, we reveal how mere imaginings shape attitudes towards elements (i.e., places) from our real-life environment.
Affiliation(s)
- Roland G Benoit
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, 04103, Germany.
| | - Philipp C Paulus
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, 04103, Germany; International Max Planck Research School NeuroCom, Leipzig, 04103, Germany
| | - Daniel L Schacter
- Department of Psychology, Harvard University, Cambridge, MA, 02138, USA
|
31
|
Burnside R, Fischer AG, Ullsperger M. The feedback-related negativity indexes prediction error in active but not observational learning. Psychophysiology 2019; 56:e13389. [PMID: 31054155 DOI: 10.1111/psyp.13389] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/17/2018] [Revised: 03/15/2019] [Accepted: 04/11/2019] [Indexed: 02/06/2023]
Abstract
Reinforcement learning (RL) theory states that learning is driven by prediction errors (PEs), the discrepancy between the predicted and actual outcome of an action. When participants learn from their own actions, PEs correlate with the feedback-related negativity (FRN), but it is not clear if the FRN reflects a PE in observational learning. We use a model-based regression analysis of single-trial event-related potentials to determine if the FRN in observational learning is PE driven. Twenty participants (16 female) learned the stimulus-outcome contingencies for a probabilistic three-armed bandit task. They played in pairs, with the acting and observing player switching every one to three trials. An RL algorithm was fit to participants' choices in the task to extract individual PE estimates for every trial of the experiment. In the acting condition, model-estimated PEs covaried positively with neural signal at electrode FCz, 200-350 ms after outcome presentation, which is a typical time frame for the FRN. There was no PE effect in the observation condition in the same time frame. From 300 ms the outcome correlated negatively with the frontal P300 component at FCz and parietal P300 at Pz. At Pz the effect was greater in the acting than the observing condition. The frontal and parietal P300 components have been linked to attentional reorienting and stimulus value updating, respectively. These findings indicate that observed outcomes undergo processing that is distinguishable from directly experienced outcomes in the time windows of the FRN and P3b but that attention dedicated to the two outcome types is comparable.
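As an illustration of how trial-wise PE estimates can be derived from a sequence of choices and outcomes, the following is a minimal sketch of a delta-rule (Rescorla-Wagner style) learner for a three-armed bandit. It is not the model fitted in the study: the function name, the fixed learning rate `alpha`, and the initial value `q_init` are assumptions made for this example, and in practice the learning rate would be estimated from each participant's choices.

```python
from typing import List, Sequence, Tuple


def trialwise_prediction_errors(
    choices: Sequence[int],
    outcomes: Sequence[float],
    n_arms: int = 3,
    alpha: float = 0.3,   # hypothetical learning rate, normally fitted per participant
    q_init: float = 0.0,  # hypothetical initial value for every arm
) -> Tuple[List[float], List[float]]:
    """Return the prediction error on every trial and the final arm values."""
    q = [q_init] * n_arms
    prediction_errors: List[float] = []
    for arm, outcome in zip(choices, outcomes):
        pe = outcome - q[arm]   # PE: actual outcome minus predicted outcome
        q[arm] += alpha * pe    # only the chosen arm's value is updated
        prediction_errors.append(pe)
    return prediction_errors, q


if __name__ == "__main__":
    pes, values = trialwise_prediction_errors([0, 0, 1], [1.0, 1.0, 0.0], alpha=0.5)
    print(pes, values)
```

The list of PEs returned here is the kind of per-trial regressor that a single-trial analysis could relate to a neural signal.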
Affiliation(s)
- Rebecca Burnside
- Department of Neuropsychology, Institute of Psychology, Otto-von-Guericke University, Magdeburg, Germany
| | - Adrian G Fischer
- Department of Neuropsychology, Institute of Psychology, Otto-von-Guericke University, Magdeburg, Germany; Biological Psychology and Cognitive Neuroscience, Freie Universität Berlin, Berlin, Germany; Center for Behavioral Brain Sciences (CBBS), Otto-von-Guericke University, Magdeburg, Germany
| | - Markus Ullsperger
- Department of Neuropsychology, Institute of Psychology, Otto-von-Guericke University, Magdeburg, Germany; Center for Behavioral Brain Sciences (CBBS), Otto-von-Guericke University, Magdeburg, Germany
|
32
|
Hippocampal pattern separation supports reinforcement learning. Nat Commun 2019; 10:1073. [PMID: 30842581 PMCID: PMC6403348 DOI: 10.1038/s41467-019-08998-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 04/06/2018] [Accepted: 02/13/2019] [Indexed: 11/08/2022] Open
Abstract
Animals rely on learned associations to make decisions. Associations can be based on relationships between object features (e.g., the three leaflets of poison ivy leaves) and outcomes (e.g., rash). More often, outcomes are linked to multidimensional states (e.g., poison ivy is green in summer but red in spring). Feature-based reinforcement learning fails when the values of individual features depend on the other features present. One solution is to assign value to multi-featural conjunctive representations. Here, we test if the hippocampus forms separable conjunctive representations that enable the learning of response contingencies for stimuli of the form: AB+, B-, AC-, C+. Pattern analyses on functional MRI data show the hippocampus forms conjunctive representations that are dissociable from feature components and that these representations, along with those of cortex, influence striatal prediction errors. Our results establish a novel role for hippocampal pattern separation and conjunctive representation in reinforcement learning.
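To see why the AB+, B-, AC-, C+ design defeats purely feature-based learning, consider the sketch below. It assumes a linear model in which a stimulus's predicted outcome is the sum of the weights of its features, and compares it with a learner that stores one value per whole stimulus. Both learners, their names, and the learning-rate settings are illustrative assumptions, not the analysis used in the paper.

```python
from typing import Dict, FrozenSet, List, Tuple

FEATURES = ("A", "B", "C")
# Stimulus (set of features) and its outcome: 1.0 for "+", 0.0 for "-".
TRIALS: List[Tuple[FrozenSet[str], float]] = [
    (frozenset("AB"), 1.0),
    (frozenset("B"), 0.0),
    (frozenset("AC"), 0.0),
    (frozenset("C"), 1.0),
]


def fit_feature_weights(lr: float = 0.1, n_sweeps: int = 2000) -> Dict[str, float]:
    """Batch delta rule on additive feature weights (hypothetical feature learner)."""
    w = {f: 0.0 for f in FEATURES}
    for _ in range(n_sweeps):
        grad = {f: 0.0 for f in FEATURES}
        for stim, outcome in TRIALS:
            error = outcome - sum(w[f] for f in stim)
            for f in stim:
                grad[f] += error
        for f in FEATURES:
            w[f] += lr * grad[f]
    return w


def fit_conjunctive_values(lr: float = 0.1, n_sweeps: int = 2000) -> Dict[FrozenSet[str], float]:
    """Delta rule with one value per whole stimulus (hypothetical conjunctive learner)."""
    v = {stim: 0.0 for stim, _ in TRIALS}
    for _ in range(n_sweeps):
        for stim, outcome in TRIALS:
            v[stim] += lr * (outcome - v[stim])
    return v


if __name__ == "__main__":
    w = fit_feature_weights()
    v = fit_conjunctive_values()
    for stim, outcome in TRIALS:
        feature_pred = sum(w[f] for f in stim)
        print(sorted(stim), outcome, round(feature_pred, 3), round(v[stim], 3))
```

Under these assumptions the additive model cannot separate the four stimuli: AB+ and B- together imply a positive weight for A, while AC- and C+ imply a negative one, so the best compromise predicts the same intermediate value for every stimulus. The conjunctive learner recovers each outcome.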
|
33
|
Abstract
Habits form a crucial component of behavior. In recent years, key computational models have conceptualized habits as arising from model-free reinforcement learning mechanisms, which typically select between available actions based on the future value expected to result from each. Traditionally, however, habits have been understood as behaviors that can be triggered directly by a stimulus, without requiring the animal to evaluate expected outcomes. Here, we develop a computational model instantiating this traditional view, in which habits develop through the direct strengthening of recently taken actions rather than through the encoding of outcomes. We demonstrate that this model accounts for key behavioral manifestations of habits, including insensitivity to outcome devaluation and contingency degradation, as well as the effects of reinforcement schedule on the rate of habit formation. The model also explains the prevalent observation of perseveration in repeated-choice tasks as an additional behavioral manifestation of the habit system. We suggest that mapping habitual behaviors onto value-free mechanisms provides a parsimonious account of existing behavioral and neural data. This mapping may provide a new foundation for building robust and comprehensive models of the interaction of habits with other, more goal-directed types of behaviors and help to better guide research into the neural mechanisms underlying control of instrumental behavior more generally. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
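The idea of habits strengthening through repetition alone, without any encoding of outcomes, can be sketched as follows. This is a simplified illustration and not the authors' full model: the function name and the step size `alpha_h` are assumptions. The point to notice is that the update takes the chosen action as input but never the reward.

```python
from typing import List


def update_habit_strengths(
    habits: List[float],
    chosen: int,
    alpha_h: float = 0.1,  # hypothetical habit learning rate
) -> List[float]:
    """Move the chosen action's strength toward 1 and all others toward 0.

    No outcome or reward enters this update, so the resulting strengths
    are insensitive to later changes in outcome value.
    """
    return [
        h + alpha_h * ((1.0 if i == chosen else 0.0) - h)
        for i, h in enumerate(habits)
    ]


if __name__ == "__main__":
    h = [0.0, 0.0]
    for action in (0, 0, 1):
        h = update_habit_strengths(h, action, alpha_h=0.5)
        print(action, h)
```

Because repeated choices of the same action keep raising its strength, a learner of this kind also tends to repeat itself, which is one way to read the perseveration mentioned in the abstract.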
Affiliation(s)
| | - Amitai Shenhav
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown Institute for Brain Science, Brown University
|
34
|
The algorithmic architecture of exploration in the human brain. Curr Opin Neurobiol 2018; 55:7-14. [PMID: 30529148 DOI: 10.1016/j.conb.2018.11.003] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 06/07/2018] [Revised: 09/18/2018] [Accepted: 11/19/2018] [Indexed: 11/20/2022]
Abstract
Balancing exploration and exploitation is one of the central problems in reinforcement learning. We review recent studies that have identified multiple algorithmic strategies underlying exploration. In particular, humans use a combination of random and uncertainty-directed exploration strategies, which rely on different brain systems, have different developmental trajectories, and are sensitive to different task manipulations. Humans are also able to exploit sophisticated structural knowledge to aid their exploration, such as information about correlations between options. New computational models, drawing inspiration from machine learning, have begun to formalize these ideas and offer new ways to understand the neural basis of reinforcement learning.
|
35
|
Fonzo GA. Diminished positive affect and traumatic stress: A biobehavioral review and commentary on trauma affective neuroscience. Neurobiol Stress 2018; 9:214-230. [PMID: 30450386 PMCID: PMC6234277 DOI: 10.1016/j.ynstr.2018.10.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 05/15/2018] [Revised: 07/20/2018] [Accepted: 10/17/2018] [Indexed: 11/28/2022] Open
Abstract
Post-traumatic stress manifests in disturbed affect and emotion, including exaggerated severity and frequency of negative valence emotions, e.g., fear, anxiety, anger, shame, and guilt. However, another core feature of common post-trauma psychopathologies, i.e. post-traumatic stress disorder (PTSD) and major depression, is diminished positive affect, or reduced frequency and intensity of positive emotions and affective states such as happiness, joy, love, interest, and desire/capacity for interpersonal affiliation. There remains a stark imbalance in the degree to which the neuroscience of each affective domain has been probed and characterized in PTSD, with our knowledge of post-trauma diminished positive affect remaining comparatively underdeveloped. This remains a prominent barrier to realizing the clinical breakthroughs likely to be afforded by the increasing availability of neuroscience assessment and intervention tools. In this review and commentary, the author summarizes the modest extant neuroimaging literature that has probed diminished positive affect in PTSD using reward processing behavioral paradigms, first briefly reviewing and outlining the neurocircuitry implicated in reward and positive emotion and its interrelationship with negative emotion and negative valence circuitry. Specific research guidelines are then offered to best and most efficiently develop the knowledge base in this area in a way that is clinically translatable and will exert a positive impact on routine clinical care. The author concludes with the prediction that the development of an integrated, bivalent theoretical and predictive model of how trauma impacts affective neurocircuitry to promote post-trauma psychopathology will ultimately lead to breakthroughs in how trauma treatments are conceptualized mechanistically and developed pragmatically.
Affiliation(s)
- Gregory A. Fonzo
- Department of Psychiatry & Behavioral Sciences, Stanford University School of Medicine, Sierra-Pacific Mental Illness Research, Education, and Clinical Center (MIRECC), Veterans Affairs Palo Alto Healthcare System, 401 Quarry Road, MC 5722, Stanford, CA, 94305, USA.
|
36
|
Reward Learning over Weeks Versus Minutes Increases the Neural Representation of Value in the Human Brain. J Neurosci 2018; 38:7649-7666. [PMID: 30061189 DOI: 10.1523/jneurosci.0075-18.2018] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 01/05/2018] [Revised: 06/12/2018] [Accepted: 06/27/2018] [Indexed: 12/13/2022] Open
Abstract
Over the past few decades, neuroscience research has illuminated the neural mechanisms supporting learning from reward feedback. Learning paradigms are increasingly being extended to study mood and psychiatric disorders as well as addiction. However, one potentially critical characteristic that this research ignores is the effect of time on learning: human feedback learning paradigms are usually conducted in a single rapidly paced session, whereas learning experiences in ecologically relevant circumstances and in animal research are almost always separated by longer periods of time. In our experiments, we examined reward learning in short condensed sessions distributed across weeks versus learning completed in a single "massed" session in male and female participants. As expected, we found that after equal amounts of training, accuracy was matched between the spaced and massed conditions. However, in a 3-week follow-up, we found that participants exhibited significantly greater memory for the value of spaced-trained stimuli. Supporting a role for short-term memory in massed learning, we found a significant positive correlation between initial learning and working memory capacity. Neurally, we found that patterns of activity in the medial temporal lobe and prefrontal cortex showed stronger discrimination of spaced- versus massed-trained reward values. Further, patterns in the striatum discriminated between spaced- and massed-trained stimuli overall. Our results indicate that single-session learning tasks engage partially distinct learning mechanisms from distributed training. Our studies begin to address a large gap in our knowledge of human learning from reinforcement, with potential implications for our understanding of mood disorders and addiction.
SIGNIFICANCE STATEMENT: Humans and animals learn to associate predictive value with stimuli and actions, and these values then guide future behavior. Such reinforcement-based learning often happens over long time periods, in contrast to most studies of reward-based learning in humans. In experiments that tested the effect of spacing on learning, we found that associations learned in a single massed session were correlated with short-term memory and significantly decayed over time, whereas associations learned in short massed sessions over weeks were well maintained. Additionally, patterns of activity in the medial temporal lobe and prefrontal cortex discriminated the values of stimuli learned over weeks but not minutes. These results highlight the importance of studying learning over time, with potential applications to drug addiction and psychiatry.
|
37
|
Gerraty RT, Davidow JY, Foerde K, Galvan A, Bassett DS, Shohamy D. Dynamic Flexibility in Striatal-Cortical Circuits Supports Reinforcement Learning. J Neurosci 2018; 38:2442-2453. [PMID: 29431652 PMCID: PMC5858591 DOI: 10.1523/jneurosci.2084-17.2018] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 07/22/2017] [Revised: 01/15/2018] [Accepted: 01/21/2018] [Indexed: 12/19/2022] Open
Abstract
Complex learned behaviors must involve the integrated action of distributed brain circuits. Although the contributions of individual regions to learning have been extensively investigated, much less is known about how distributed brain networks orchestrate their activity over the course of learning. To address this gap, we used fMRI combined with tools from dynamic network neuroscience to obtain time-resolved descriptions of network coordination during reinforcement learning in humans. We found that learning to associate visual cues with reward involves dynamic changes in network coupling between the striatum and distributed brain regions, including visual, orbitofrontal, and ventromedial prefrontal cortex (n = 22; 13 females). Moreover, we found that this flexibility in striatal network coupling correlates with participants' learning rate and inverse temperature, two parameters derived from reinforcement learning models. Finally, we found that episodic learning, measured separately in the same participants at the same time, was related to dynamic connectivity in distinct brain networks. These results suggest that dynamic changes in striatal-centered networks provide a mechanism for information integration during reinforcement learning.
SIGNIFICANCE STATEMENT: Learning from the outcomes of actions, referred to as reinforcement learning, is an essential part of life. The roles of individual brain regions in reinforcement learning have been well characterized in terms of updating values for actions or cues. Missing from this account, however, is an understanding of how different brain areas interact during learning to integrate sensory and value information. Here we characterize flexible striatal-cortical network dynamics that relate to reinforcement learning behavior.
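For readers unfamiliar with the inverse temperature parameter named in this abstract, the sketch below shows the softmax choice rule in which it commonly appears. The function name and example values are assumptions for illustration and do not reproduce the model fitted in the study. A value of zero yields random choice, and larger values make choices follow the learned values more deterministically.

```python
import math
from typing import List, Sequence


def softmax_choice_probabilities(
    values: Sequence[float],
    inverse_temperature: float,
) -> List[float]:
    """Convert option values into choice probabilities with a softmax rule."""
    scaled = [inverse_temperature * v for v in values]
    shift = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    learned_values = [0.2, 0.8]  # hypothetical learned values for two cues
    for beta in (0.0, 1.0, 5.0):
        print(beta, softmax_choice_probabilities(learned_values, beta))
```

The learning rate, the other parameter mentioned, governs how quickly the values fed into this rule change after each outcome.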
Affiliation(s)
- Raphael T Gerraty
  - Department of Psychology, Columbia University, New York, New York 10027
- Juliet Y Davidow
  - Department of Psychology, Harvard University, Cambridge, Massachusetts 02138
- Karin Foerde
  - Department of Psychology, New York University, New York, New York 10003
- Adriana Galvan
  - Department of Psychology, UCLA, Los Angeles, California 90095
- Danielle S Bassett
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104
  - Department of Electrical & Systems Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Daphna Shohamy
  - Department of Psychology, Columbia University, New York, New York 10027
  - Zuckerman Mind Brain Behavior Institute and Kavli Institute for Brain Science, Columbia University, New York, New York 10027
|
38
|
Deserno L, Heinz A, Schlagenhauf F. Computational approaches to schizophrenia: A perspective on negative symptoms. Schizophr Res 2017; 186:46-54. [PMID: 27986430 DOI: 10.1016/j.schres.2016.10.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 12/23/2015] [Revised: 09/22/2016] [Accepted: 10/01/2016] [Indexed: 12/30/2022]
Abstract
Schizophrenia is a heterogeneous spectrum disorder often associated with detrimental negative symptoms. In recent years, computational approaches to psychiatry have attracted growing attention. Negative symptoms have shown some overlap with general cognitive impairments and were also linked to impaired motivational processing in brain circuits implementing reward prediction. In this review, we outline how computational approaches may help to provide a better understanding of negative symptoms in terms of the potentially underlying behavioural and biological mechanisms. First, we describe the idea that negative symptoms could arise from a failure to represent reward expectations to enable flexible behavioural adaptation. It has been proposed that these impairments arise from a failure to use prediction errors to update expectations. Important previous studies focused on processing of so-called model-free prediction errors where learning is determined by past rewards only. However, learning and decision-making arise from multiple cognitive mechanisms functioning simultaneously, and dissecting them via well-designed tasks in conjunction with computational modelling is a promising avenue. Second, we move on to a proof-of-concept example on how generative models of functional imaging data from a cognitive task enable the identification of subgroups of patients mapping on different levels of negative symptoms. Combining the latter approach with behavioural studies regarding learning and decision-making may allow the identification of key behavioural and biological parameters distinctive for different dimensions of negative symptoms versus a general cognitive impairment. We conclude with an outlook on how this computational framework could, at some point, enrich future clinical studies.
Affiliation(s)
- Lorenz Deserno
  - Max Planck Fellow Group 'Cognitive and Affective Control of Behavioral Adaptation', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité-Universitätsmedizin Berlin, Berlin, Germany; Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, University of Leipzig, Leipzig, Germany
- Andreas Heinz
  - Max Planck Fellow Group 'Cognitive and Affective Control of Behavioral Adaptation', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Florian Schlagenhauf
  - Max Planck Fellow Group 'Cognitive and Affective Control of Behavioral Adaptation', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité-Universitätsmedizin Berlin, Berlin, Germany
|
39
|
Kumaran D, Banino A, Blundell C, Hassabis D, Dayan P. Computations Underlying Social Hierarchy Learning: Distinct Neural Mechanisms for Updating and Representing Self-Relevant Information. Neuron 2017; 92:1135-1147. [PMID: 27930904 PMCID: PMC5158095 DOI: 10.1016/j.neuron.2016.10.052] [Citation(s) in RCA: 83] [Impact Index Per Article: 11.9] [Received: 07/10/2016] [Revised: 09/17/2016] [Accepted: 10/21/2016] [Indexed: 11/24/2022]
Abstract
Knowledge about social hierarchies organizes human behavior, yet we understand little about the underlying computations. Here we show that a Bayesian inference scheme, which tracks the power of individuals, better captures behavioral and neural data compared with a reinforcement learning model inspired by rating systems used in games such as chess. We provide evidence that the medial prefrontal cortex (MPFC) selectively mediates the updating of knowledge about one’s own hierarchy, as opposed to that of another individual, a process that underpinned successful performance and involved functional interactions with the amygdala and hippocampus. In contrast, we observed domain-general coding of rank in the amygdala and hippocampus, even when the task did not require it. Our findings reveal the computations underlying a core aspect of social cognition and provide new evidence that self-relevant information may indeed be afforded a unique representational status in the brain.
Highlights
- Social hierarchy learning accounted for by a Bayesian inference scheme
- Amygdala and hippocampus support domain-general social hierarchy learning
- Medial prefrontal cortex selectively updates knowledge about one’s own hierarchy
- Rank signals generated by these neural structures in absence of task demands
Affiliation(s)
- Dharshan Kumaran
  - Google DeepMind, 5 New Street Square, London EC4A 3TW, UK; Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Andrea Banino
  - Google DeepMind, 5 New Street Square, London EC4A 3TW, UK
- Demis Hassabis
  - Google DeepMind, 5 New Street Square, London EC4A 3TW, UK; Gatsby Computational Neuroscience Unit, 25 Howland Street, London W1T 4JG, UK
- Peter Dayan
  - Gatsby Computational Neuroscience Unit, 25 Howland Street, London W1T 4JG, UK
|
40
|
Behavioral and Neural Signatures of Reduced Updating of Alternative Options in Alcohol-Dependent Patients during Flexible Decision-Making. J Neurosci 2017; 36:10935-10948. [PMID: 27798176 DOI: 10.1523/jneurosci.4322-15.2016] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 11/30/2015] [Accepted: 08/14/2016] [Indexed: 01/09/2023] Open
Abstract
Addicted individuals continue substance use despite the knowledge of harmful consequences and often report having no choice but to consume. Computational psychiatry accounts have linked this clinical observation to difficulties in making flexible and goal-directed decisions in dynamic environments via consideration of potential alternative choices. To probe this in alcohol-dependent patients (n = 43) versus healthy volunteers (n = 35), human participants performed an anticorrelated decision-making task during functional neuroimaging. Via computational modeling, we investigated behavioral and neural signatures of inference regarding the alternative option. While healthy control subjects exploited the anticorrelated structure of the task to guide decision-making, alcohol-dependent patients were relatively better explained by a model-free strategy due to reduced inference on the alternative option after punishment. Whereas model-free prediction error signals were preserved, alcohol-dependent patients exhibited blunted medial prefrontal signatures of inference on the alternative option. This reduction was associated with patients' behavioral deficit in updating the alternative choice option and their obsessive-compulsive drinking habits. All results remained significant when adjusting for potential confounders (e.g., neuropsychological measures and gray matter density). A disturbed integration of alternative choice options implemented by the medial prefrontal cortex appears to be one important explanation for the puzzling question of why addicted individuals continue drug consumption despite negative consequences. SIGNIFICANCE STATEMENT In addiction, patients maintain substance use despite devastating consequences and often report having no choice but to consume. These clinical observations have been theoretically linked to disturbed mechanisms of inference, for example, to difficulties when learning statistical regularities of the environmental structure to guide decisions. Using computational modeling, we demonstrate disturbed inference on alternative choice options in alcohol addiction. Patients neglecting "what might have happened" was accompanied by blunted coding of inference regarding alternative choice options in the medial prefrontal cortex. An impaired integration of alternative choice options implemented by the medial prefrontal cortex might contribute to ongoing drug consumption in the face of evident negative consequences.
|
41
|
Reiter AMF, Heinze HJ, Schlagenhauf F, Deserno L. Impaired Flexible Reward-Based Decision-Making in Binge Eating Disorder: Evidence from Computational Modeling and Functional Neuroimaging. Neuropsychopharmacology 2017; 42:628-637. [PMID: 27301429 PMCID: PMC5240187 DOI: 10.1038/npp.2016.95] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 11/29/2015] [Revised: 05/11/2016] [Accepted: 05/24/2016] [Indexed: 12/17/2022]
Abstract
Despite its clinical relevance and the recent recognition as a diagnostic category in the DSM-5, binge eating disorder (BED) has rarely been investigated from a cognitive neuroscientific perspective targeting a more precise neurocognitive profiling of the disorder. BED patients suffer from a lack of behavioral control during recurrent binge eating episodes and thus fail to adapt their behavior in the face of negative consequences, e.g., high risk for obesity. To examine impairments in flexible reward-based decision-making, we exposed BED patients (n=22) and matched healthy individuals (n=22) to a reward-guided decision-making task during functional magnetic resonance imaging (fMRI). Performing fMRI analysis informed via computational modeling of choice behavior, we were able to identify specific signatures of altered decision-making in BED. On the behavioral level, we observed impaired behavioral adaptation in BED, which was due to enhanced switching behavior, a putative deficit in striking a balance between exploration and exploitation appropriately. This was accompanied by diminished activation related to exploratory decisions in the anterior insula/ventro-lateral prefrontal cortex. Moreover, although so-called model-free reward prediction errors remained intact, representation of ventro-medial prefrontal learning signatures, incorporating inference on unchosen options, was reduced in BED, which was associated with successful decision-making in the task. On the basis of a computational psychiatry account, the presented findings contribute to defining a neurocognitive phenotype of BED.
Affiliation(s)
- Andrea M F Reiter
  - Max Planck Fellow Group 'Cognitive and Affective Control of Behavioral Adaptation', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychology, TU Dresden, Dresden, Germany; Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstrasse 1a, 04103 Leipzig, Germany. Tel: +49 341 9940 2674, Fax: +49 341 9940 2221
- Hans-Jochen Heinze
  - Max Planck Fellow Group 'Cognitive and Affective Control of Behavioral Adaptation', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany; Department of Behavioral Neurology, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Florian Schlagenhauf
  - Max Planck Fellow Group 'Cognitive and Affective Control of Behavioral Adaptation', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Lorenz Deserno
  - Max Planck Fellow Group 'Cognitive and Affective Control of Behavioral Adaptation', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, Berlin, Germany
|
42
|
Quevedo K, Ng R, Scott H, Kodavaganti S, Smyda G, Diwadkar V, Phillips M. Ventral Striatum Functional Connectivity during Rewards and Losses and Symptomatology in Depressed Patients. Biol Psychol 2017; 123:62-73. [PMID: 27876651 PMCID: PMC5737904 DOI: 10.1016/j.biopsycho.2016.11.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 07/23/2015] [Revised: 11/09/2016] [Accepted: 11/09/2016] [Indexed: 12/22/2022]
Abstract
BACKGROUND: The ventral striatum (VS) and striatal network support goal-motivated behavior. Identifying how depressed patients differ in their striatal network during the processing of emotionally salient events is a step towards uncovering biomarkers for diagnosis and treatment. METHODS: 38 depressed and 30 healthy adults completed a task that examined brain activation to the anticipation and receipt of monetary rewards and losses. Data were collected using a 3T Siemens Trio scanner. Functional connectivity (FC) differences were examined with seeds in the left or right VS. FC estimates were regressed on specific symptoms. RESULTS: Depressed patients displayed higher functional connectivity between the VS and midline cortical areas during loss versus reward trials. Anhedonia and depressed mood were associated with fairly similar striatal circuits, but suicidality was associated with a unique coupling between the VS and midline structures, while depression severity was linked to higher VS to caudate and precuneus connectivity during loss versus reward trials. CONCLUSIONS: Depression is characterized by excessive VS coupling to cognitive control and associative networks during losses versus rewards. High coupling between the VS and midline cortical structures may index suicidality.
Affiliation(s)
- Karina Quevedo
  - University of Minnesota, Department of Psychiatry, MN, USA
- Rowena Ng
  - University of Minnesota, Institute of Child Development, 51 East River Road, Minneapolis, MN, 55455, USA
- Hannah Scott
  - University of Minnesota, Department of Psychiatry, MN, USA
- Garry Smyda
  - University of Pittsburgh, School of Public Health, PA, USA
- Vaibhav Diwadkar
  - Dept. of Psychiatry & Behavioral Neurosciences, Wayne State University School of Medicine, Suite 5B, Tolan Park Medical Building, 3901 Chrysler Service Drive, Detroit, MI, 48201, USA
- Mary Phillips
  - Department of Psychiatry, Western Psychiatric Institute and Clinic, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
|
43
|
Zhang S, Hu S, Chao HH, Li CSR. Hemispheric lateralization of resting-state functional connectivity of the ventral striatum: an exploratory study. Brain Struct Funct 2017; 222:2573-2583. [PMID: 28110447 DOI: 10.1007/s00429-016-1358-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 01/29/2016] [Accepted: 12/21/2016] [Indexed: 01/01/2023]
Abstract
Resting-state functional connectivity (rsFC) is widely used to examine cerebral functional organization. The ventral striatum (VS) is critical to motivated behavior, with extant studies suggesting functional hemispheric asymmetry. The current work investigated differences in rsFC between the left (L) and right (R) VS and explored gender differences in the extent of functional lateralization. In 106 adults, we computed a laterality index (fcLI) to query whether a target region shows greater or less connectivity to the L vs R VS. A total of 45 target regions with hemispheric masks were examined from the Automated Anatomic Labeling atlas. One-sample t test was performed to explore significant laterality in the whole sample and in men and women separately. Two-sample t test was performed to examine gender differences in fcLI. At a corrected threshold (p < 0.05/45 = 0.0011), the dorsomedial prefrontal cortex (dmPFC) and posterior cingulate cortex (pCC) showed L lateralization and the intraparietal sulcus (IPS) and supramarginal gyrus (SMG) showed R lateralization in VS connectivity. Except for the pCC, these findings were replicated in a different data set (n = 97) from the Human Connectome Project. Furthermore, the fcLI of VS-pCC was negatively correlated with a novelty seeking trait in women but not in men. Together, the findings may suggest a more important role of the L VS in linking saliency response to self control and other internally directed processes. Right lateralization of VS connectivity to the SMG and IPS may support attention and action directed to external behavioral contingencies.
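The corrected threshold quoted in this abstract (p < 0.05/45 = 0.0011) is a Bonferroni correction across the 45 target regions. The arithmetic can be sketched as follows; the helper name and the example p-values are hypothetical and serve only to illustrate how the threshold would be applied, they are not results from the study.

```python
def bonferroni_threshold(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return family_alpha / n_tests

threshold = bonferroni_threshold(0.05, 45)
print(f"per-test threshold: {threshold:.4f}")  # prints "per-test threshold: 0.0011"

# Hypothetical p-values for illustration only; these are not results from the study.
example_p_values = {"region_a": 0.0004, "region_b": 0.0030, "region_c": 0.0200}
significant = [name for name, p in example_p_values.items() if p < threshold]
print(significant)  # prints "['region_a']"
```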
Affiliation(s)
- Sheng Zhang
  - Department of Psychiatry, Yale University School of Medicine, CMHC S112, 34 Park Street, New Haven, CT, 06519-1109, USA
- Sien Hu
  - Department of Psychiatry, Yale University School of Medicine, CMHC S112, 34 Park Street, New Haven, CT, 06519-1109, USA
- Herta H Chao
  - Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA; Veterans Administration Medical Center, West Haven, CT, USA
- Chiang-Shan R Li
  - Department of Psychiatry, Yale University School of Medicine, CMHC S112, 34 Park Street, New Haven, CT, 06519-1109, USA; Department of Neuroscience, Yale University School of Medicine, New Haven, CT, USA; Interdepartmental Neuroscience Program, Yale University School of Medicine, New Haven, CT, USA
|
44
|
Deserno L, Schlagenhauf F, Heinz A. Striatal dopamine, reward, and decision making in schizophrenia. Dialogues in Clinical Neuroscience 2017. [PMID: 27069382 PMCID: PMC4826774 DOI: 10.31887/dcns.2016.18.1/ldeserno] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 11/16/2022]
Abstract
Elevated striatal dopamine function is one of the best-established findings in schizophrenia. In this review, we discuss causes and consequences of this striatal dopamine alteration. We first summarize earlier findings regarding striatal reward processing and anticipation using functional neuroimaging. Secondly, we present a series of recent studies that are exemplary for a particular research approach: a combination of theory-driven reinforcement learning and decision-making tasks in combination with computational modeling and functional neuroimaging. We discuss why this approach represents a promising tool to understand underlying mechanisms of symptom dimensions by dissecting the contribution of multiple behavioral control systems working in parallel. We also discuss how it can advance our understanding of the neurobiological implementation of such functions. Thirdly, we review evidence regarding the topography of dopamine dysfunction within the striatum. Finally, we present conclusions and outline important aspects to be considered in future studies.
Affiliation(s)
- Lorenz Deserno
  - Max Planck Fellow Group "Cognitive and Affective Control of Behavioral Adaptation," Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, Germany; Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany
- Florian Schlagenhauf
  - Max Planck Fellow Group "Cognitive and Affective Control of Behavioral Adaptation," Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, Germany
- Andreas Heinz
  - Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, Germany
|
45
|
Alderson-Day B, Diederen K, Fernyhough C, Ford JM, Horga G, Margulies DS, McCarthy-Jones S, Northoff G, Shine JM, Turner J, van de Ven V, van Lutterveld R, Waters F, Jardri R. Auditory Hallucinations and the Brain's Resting-State Networks: Findings and Methodological Observations. Schizophr Bull 2016; 42:1110-23. [PMID: 27280452 PMCID: PMC4988751 DOI: 10.1093/schbul/sbw078] [Citation(s) in RCA: 91] [Impact Index Per Article: 11.4] [Indexed: 12/24/2022]
Abstract
In recent years, there has been increasing interest in the potential for alterations to the brain's resting-state networks (RSNs) to explain various kinds of psychopathology. RSNs provide an intriguing new explanatory framework for hallucinations, which can occur in different modalities and population groups, but which remain poorly understood. This collaboration from the International Consortium on Hallucination Research (ICHR) reports on the evidence linking resting-state alterations to auditory hallucinations (AH) and provides a critical appraisal of the methodological approaches used in this area. In the report, we describe findings from resting connectivity fMRI in AH (in schizophrenia and nonclinical individuals) and compare them with findings from neurophysiological research, structural MRI, and research on visual hallucinations (VH). In AH, various studies show resting connectivity differences in left-hemisphere auditory and language regions, as well as atypical interaction of the default mode network and RSNs linked to cognitive control and salience. As the latter are also evident in studies of VH, this points to a domain-general mechanism for hallucinations alongside modality-specific changes to RSNs in different sensory regions. However, we also observed high methodological heterogeneity in the current literature, affecting the ability to make clear comparisons between studies. To address this, we provide some methodological recommendations and options for future research on the resting state and hallucinations.
Affiliation(s)
- Kelly Diederen
  - Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Judith M. Ford
  - Department of Psychiatry, School of Medicine, University of California, San Francisco, San Francisco, CA
- Guillermo Horga
  - New York State Psychiatric Institute, Columbia University Medical Center, New York, NY
- Daniel S. Margulies
  - Max Planck Research Group for Neuroanatomy & Connectivity, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Georg Northoff
  - Mind, Brain Imaging and Neuroethics Research Unit, The Royal’s Institute of Mental Health Research, Ottawa, ON, Canada
- James M. Shine
  - Department of Psychology, Stanford University, Stanford, CA
- Jessica Turner
  - Department of Psychology, Neuroscience Institute, Georgia State University, Atlanta, GA
- Vincent van de Ven
  - Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Remko van Lutterveld
  - Center for Mindfulness, University of Massachusetts Medical School, Worcester, MA
- Flavie Waters
  - North Metro Health Service Mental Health, Graylands Health Campus, School of Psychiatry and Clinical Neurosciences, University of Western Australia, Crawley, WA, Australia
- Renaud Jardri
  - Univ Lille, CNRS (UMR 9193), SCALab & CHU Lille, Psychiatry dept. (CURE), Lille, France
|
46
|
Zaki J, Kallman S, Wimmer GE, Ochsner K, Shohamy D. Social Cognition as Reinforcement Learning: Feedback Modulates Emotion Inference. J Cogn Neurosci 2016; 28:1270-82. [DOI: 10.1162/jocn_a_00978] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 01/10/2023]
Abstract
Neuroscientific studies of social cognition typically employ paradigms in which perceivers draw single-shot inferences about the internal states of strangers. Real-world social inference features much different parameters: People often encounter and learn about particular social targets (e.g., friends) over time and receive feedback about whether their inferences are correct or incorrect. Here, we examined this process and, more broadly, the intersection between social cognition and reinforcement learning. Perceivers were scanned using fMRI while repeatedly encountering three social targets who produced conflicting visual and verbal emotional cues. Perceivers guessed how targets felt and received feedback about whether they had guessed correctly. Visual cues reliably predicted one target's emotion, verbal cues predicted a second target's emotion, and neither reliably predicted the third target's emotion. Perceivers successfully used this information to update their judgments over time. Furthermore, trial-by-trial learning signals—estimated using two reinforcement learning models—tracked activity in ventral striatum and ventromedial pFC, structures associated with reinforcement learning, and regions associated with updating social impressions, including TPJ. These data suggest that learning about others' emotions, like other forms of feedback learning, relies on domain-general reinforcement mechanisms as well as domain-specific social information processing.
|
47
|
Gaze data reveal distinct choice processes underlying model-based and model-free reinforcement learning. Nat Commun 2016; 7:12438. [PMID: 27511383 PMCID: PMC4987535 DOI: 10.1038/ncomms12438] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 09/23/2015] [Accepted: 07/03/2016] [Indexed: 11/08/2022] Open
Abstract
Organisms appear to learn and make decisions using different strategies known as model-free and model-based learning; the former is mere reinforcement of previously rewarded actions and the latter is a forward-looking strategy that involves evaluation of action-state transition probabilities. Prior work has used neural data to argue that both model-based and model-free learners implement a value comparison process at trial onset, but model-based learners assign more weight to forward-looking computations. Here using eye-tracking, we report evidence for a different interpretation of prior results: model-based subjects make their choices prior to trial onset. In contrast, model-free subjects tend to ignore model-based aspects of the task and instead seem to treat the decision problem as a simple comparison process between two differentially valued items, consistent with previous work on sequential-sampling models of decision making. These findings illustrate a problem with assuming that experimental subjects make their decisions at the same prescribed time.
|
48
|
Reiter AMF, Koch SP, Schröger E, Hinrichs H, Heinze HJ, Deserno L, Schlagenhauf F. The Feedback-related Negativity Codes Components of Abstract Inference during Reward-based Decision-making. J Cogn Neurosci 2016; 28:1127-38. [DOI: 10.1162/jocn_a_00957] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/11/2022]
Abstract
Behavioral control is influenced not only by learning from the choices made and the rewards obtained but also by “what might have happened,” that is, inference about unchosen options and their fictive outcomes. Substantial progress has been made in understanding the neural signatures of direct learning from choices that are actually made and their associated rewards via reward prediction errors (RPEs). However, electrophysiological correlates of abstract inference in decision-making are less clear. One seminal theory suggests that the so-called feedback-related negativity (FRN), an ERP peaking 200–300 msec after a feedback stimulus at frontocentral sites of the scalp, codes RPEs. Hitherto, the FRN has been predominantly related to a so-called “model-free” RPE: The difference between the observed outcome and what had been expected. Here, by means of computational modeling of choice behavior, we show that individuals employ abstract, “double-update” inference on the task structure by concurrently tracking values of chosen stimuli (associated with observed outcomes) and unchosen stimuli (linked to fictive outcomes). In a parametric analysis, model-free RPEs as well as their modification because of abstract inference were regressed against single-trial FRN amplitudes. We demonstrate that components related to abstract inference uniquely explain variance in the FRN beyond model-free RPEs. These findings advance our understanding of the FRN and its role in behavioral adaptation. This might further the investigation of disturbed abstract inference, as proposed, for example, for psychiatric disorders, and its underlying neural correlates.
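The contrast between a model-free update and the "double-update" inference described in this abstract can be sketched as below. This is a minimal illustration under stated assumptions: two options, outcomes coded as +1 (reward) and -1 (punishment), and a fictive outcome for the unchosen option equal to the negative of the observed outcome. The function names and the shared learning rate are hypothetical and may differ from the models fitted in the cited study.

```python
def single_update(q, chosen, reward, alpha):
    """Model-free update: only the chosen option's value changes."""
    q = list(q)  # copy so the caller's values are not modified
    rpe = reward - q[chosen]  # reward prediction error for the chosen option
    q[chosen] += alpha * rpe
    return q, rpe

def double_update(q, chosen, reward, alpha):
    """Double update: the unchosen option is also updated with a fictive outcome.

    Assumes exactly two options and an anticorrelated task structure, so the
    fictive outcome of the unchosen option is taken to be -reward.
    """
    q, rpe = single_update(q, chosen, reward, alpha)
    unchosen = 1 - chosen
    fictive_rpe = -reward - q[unchosen]  # prediction error on the fictive outcome
    q[unchosen] += alpha * fictive_rpe
    return q, rpe, fictive_rpe

q_single, rpe_single = single_update([0.0, 0.0], chosen=0, reward=1.0, alpha=0.5)
q_double, rpe_double, fictive = double_update([0.0, 0.0], chosen=0, reward=1.0, alpha=0.5)
print(q_single)  # prints "[0.5, 0.0]"
print(q_double)  # prints "[0.5, -0.5]"
```

After a single rewarded choice, the double-update learner has already lowered the value of the option it did not choose, whereas the model-free learner has learned nothing about it.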
Affiliation(s)
- Andrea M. F. Reiter
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - University of Leipzig
  - Technische Universität Dresden
- Hermann Hinrichs
  - Leibniz Institute for Neurobiology, Magdeburg, Germany
  - Otto-von-Guericke University Magdeburg
- Hans-Jochen Heinze
  - Leibniz Institute for Neurobiology, Magdeburg, Germany
  - Otto-von-Guericke University Magdeburg
- Lorenz Deserno
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Charité-Universitätsmedizin Berlin
  - Otto-von-Guericke University Magdeburg
- Florian Schlagenhauf
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Charité-Universitätsmedizin Berlin
|
49
|
MacInnes JJ, Dickerson KC, Chen NK, Adcock RA. Cognitive Neurostimulation: Learning to Volitionally Sustain Ventral Tegmental Area Activation. Neuron 2016; 89:1331-1342. [PMID: 26948894 PMCID: PMC5074682 DOI: 10.1016/j.neuron.2016.02.002] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 07/31/2015] [Revised: 11/03/2015] [Accepted: 02/01/2016] [Indexed: 12/29/2022]
Abstract
Activation of the ventral tegmental area (VTA) and mesolimbic networks is essential to motivation, performance, and learning. Humans routinely attempt to motivate themselves, with unclear efficacy or impact on VTA networks. Using fMRI, we found untrained participants' motivational strategies failed to consistently activate VTA. After real-time VTA neurofeedback training, however, participants volitionally induced VTA activation without external aids, relative to baseline, Pre-test, and control groups. VTA self-activation was accompanied by increased mesolimbic network connectivity. Among two comparison groups (no neurofeedback, false neurofeedback) and an alternate neurofeedback group (nucleus accumbens), none sustained activation in target regions of interest nor increased VTA functional connectivity. The results comprise two novel demonstrations: learning and generalization after VTA neurofeedback training and the ability to sustain VTA activation without external reward or reward cues. These findings suggest theoretical alignment of ideas about motivation and midbrain physiology and the potential for generalizable interventions to improve performance and learning.
Affiliation(s)
- Jeff J MacInnes
- Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, USA; Department of Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC 27710, USA
- Kathryn C Dickerson
- Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, USA; Department of Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC 27710, USA
- Nan-Kuei Chen
- Brain Imaging and Analysis Center, Duke University, Durham, NC 27710, USA; Department of Radiology, Duke University Medical Center, Durham, NC 27710, USA
- R Alison Adcock
- Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, USA; Department of Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC 27710, USA; Department of Psychology and Neuroscience, Duke University, Durham, NC 27708, USA; Department of Neurobiology, Duke University, Durham, NC 27710, USA.
|
50
|
Alós-Ferrer C, Hügelschäfer S, Li J. Inertia and Decision Making. Front Psychol 2016; 7:169. [PMID: 26909061 PMCID: PMC4754398 DOI: 10.3389/fpsyg.2016.00169] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 09/22/2015] [Accepted: 01/29/2016] [Indexed: 11/13/2022] Open
Abstract
Decision inertia is the tendency to repeat previous choices independently of the outcome, which can give rise to perseveration in suboptimal choices. We investigate this tendency in probability-updating tasks. Study 1 shows that, whenever decision inertia conflicts with normatively optimal behavior (Bayesian updating), error rates are larger and decisions are slower. This is consistent with a dual-process view of decision inertia as an automatic process conflicting with a more rational, controlled one. We find evidence of decision inertia in both required and autonomous decisions, but the effect of inertia is more clear in the latter. Study 2 considers more complex decision situations where further conflict arises due to reinforcement processes. We find the same effects of decision inertia when reinforcement is aligned with Bayesian updating, but if the two latter processes conflict, the effects are limited to autonomous choices. Additionally, both studies show that the tendency to rely on decision inertia is positively associated with preference for consistency.
Affiliation(s)
- Jiahui Li
- Department of Economics, University of Cologne, Cologne, Germany
|