Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Skatova A, Chan PA, Daw ND. Extraversion differentiates between model-based and model-free strategies in a reinforcement learning task. Front Hum Neurosci 2013;7:525. [PMID: 24027514 PMCID: PMC3760140 DOI: 10.3389/fnhum.2013.00525] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 03/17/2013] [Accepted: 08/13/2013] [Indexed: 11/20/2022] [Open Access]

Cited by Other Article(s)

Castro-Rodrigues P, Akam T, Snorasson I, Camacho M, Paixão V, Maia A, Barahona-Corrêa JB, Dayan P, Simpson HB, Costa RM, Oliveira-Maia AJ. Explicit knowledge of task structure is a primary determinant of human model-based action. Nat Hum Behav 2022;6:1126-1141. [PMID: 35589826 DOI: 10.1038/s41562-022-01346-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/11/2020] [Revised: 03/19/2022] [Accepted: 03/31/2022] [Indexed: 11/09/2022]

Affiliation(s)

Pedro Castro-Rodrigues Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Centro Hospitalar Psiquiátrico de Lisboa, Lisbon, Portugal
Thomas Akam Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; Department of Experimental Psychology, University of Oxford, Oxford, UK
Ivar Snorasson Center for Obsessive-Compulsive & Related Disorders, New York State Psychiatric Institute, New York, NY, USA
Marta Camacho Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; John Van Geest Center for Brain Repair, University of Cambridge, Cambridge, UK
Vitor Paixão Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal
Ana Maia Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Department of Psychiatry and Mental Health, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
J Bernardo Barahona-Corrêa Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal
Peter Dayan Max Planck Institute for Biological Cybernetics, Tübingen, Germany; The University of Tübingen, Tübingen, Germany
H Blair Simpson Center for Obsessive-Compulsive & Related Disorders, New York State Psychiatric Institute, New York, NY, USA; Department of Psychiatry, Columbia University, New York, NY, USA
Rui M Costa Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
Albino J Oliveira-Maia Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal

Allen TA, Schreiber AM, Hall NT, Hallquist MN. From Description to Explanation: Integrating Across Multiple Levels of Analysis to Inform Neuroscientific Accounts of Dimensional Personality Pathology. J Pers Disord 2020;34:650-676. [PMID: 33074057 PMCID: PMC7583665 DOI: 10.1521/pedi.2020.34.5.650] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/20/2022]

Task complexity interacts with state-space uncertainty in the arbitration between model-based and model-free learning. Nat Commun 2019;10:5738. [PMID: 31844060 PMCID: PMC6915739 DOI: 10.1038/s41467-019-13632-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/24/2018] [Accepted: 11/11/2019] [Indexed: 12/11/2022] [Open Access]

Hannikainen IR, Machery E, Rose D, Stich S, Olivola CY, Sousa P, Cova F, Buchtel EE, Alai M, Angelucci A, Berniûnas R, Chatterjee A, Cheon H, Cho IR, Cohnitz D, Dranseika V, Eraña Lagos Á, Ghadakpour L, Grinberg M, Hashimoto T, Horowitz A, Hristova E, Jraissati Y, Kadreva V, Karasawa K, Kim H, Kim Y, Lee M, Mauro C, Mizumoto M, Moruzzi S, Ornelas J, Osimani B, Romero C, Rosas López A, Sangoi M, Sereni A, Songhorian S, Struchiner N, Tripodi V, Usui N, Vázquez Del Mercado A, Vosgerichian HA, Zhang X, Zhu J. For Whom Does Determinism Undermine Moral Responsibility? Surveying the Conditions for Free Will Across Cultures. Front Psychol 2019;10:2428. [PMID: 31749739 PMCID: PMC6848273 DOI: 10.3389/fpsyg.2019.02428] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/30/2019] [Accepted: 10/14/2019] [Indexed: 11/22/2022] [Open Access]

Affiliation(s)

Ivar R Hannikainen Department of Law, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
Edouard Machery Department of History and Philosophy of Science, University of Pittsburgh, Pittsburgh, PA, United States
David Rose Department of Philosophy, Florida State University, Tallahassee, FL, United States
Stephen Stich Department of Philosophy, Rutgers University, New Brunswick, NJ, United States
Christopher Y Olivola Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA, United States
Paulo Sousa Institute of Cognition and Culture, Queen's University, Belfast, United Kingdom
Florian Cova Department of Philosophy, University of Geneva, Geneva, Switzerland
Emma E Buchtel Department of Psychology, The Education University of Hong Kong, Tai Po, Hong Kong
Mario Alai Department of Pure and Applied Sciences, University of Urbino Carlo Bo, Urbino, Italy
Adriano Angelucci Department of Pure and Applied Sciences, University of Urbino Carlo Bo, Urbino, Italy
Renatas Berniûnas Institute of Psychology, Vilnius University, Vilnius, Lithuania
Amita Chatterjee School of Cognitive Science, Jadavpur University, Kolkata, India
Hyundeuk Cheon Department of Philosophy, Seoul National University, Seoul, South Korea
In-Rae Cho Department of Philosophy, Seoul National University, Seoul, South Korea
Daniel Cohnitz Department of Philosophy and Religious Studies, Utrecht University, Utrecht, Netherlands
Vilius Dranseika Institute of Philosophy, Vilnius University, Vilnius, Lithuania
Ángeles Eraña Lagos Instituto de Investigaciones Filosóficas-UNAM, Mexico City, Mexico
Laleh Ghadakpour Independent Researcher, Tehran, Iran
Maurice Grinberg Department of Cognitive Science and Psychology, New Bulgarian University, Sofia, Bulgaria
Takaaki Hashimoto Department of Social Psychology, University of Tokyo, Tokyo, Japan
Amir Horowitz Department of History, Philosophy and Judaic Studies, Open University of Israel, Ra'anana, Israel
Evgeniya Hristova Department of Cognitive Science and Psychology, New Bulgarian University, Sofia, Bulgaria
Yasmina Jraissati Department of Philosophy, American University of Beirut, Beirut, Lebanon
Veselina Kadreva Department of Cognitive Science and Psychology, New Bulgarian University, Sofia, Bulgaria
Kaori Karasawa Department of Social Psychology, University of Tokyo, Tokyo, Japan
Hackjin Kim Department of Psychology, Korea University, Seoul, South Korea
Yeonjeong Kim Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA, United States
Minwoo Lee Department of Psychology, Korea University, Seoul, South Korea
Carlos Mauro CLOO Behavioral Insights Unit, Porto, Portugal
Masaharu Mizumoto School of Knowledge Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Sebastiano Moruzzi Department of Philosophy and Communication Studies, University of Bologna, Bologna, Italy
Jorge Ornelas Faculty of Social Sciences and Humanities, Universidad Autónoma de San Luis Potosí, San Luis Potosí, Mexico
Barbara Osimani Munich Center for Mathematical Philosophy, Ludwig Maximilians Universität, Munich, Germany
Carlos Romero Instituto de Investigaciones Filosóficas-UNAM, Mexico City, Mexico
Alejandro Rosas López Department of Philosophy, National University of Colombia, Bogotá, Colombia
Massimo Sangoi Department of Pure and Applied Sciences, University of Urbino Carlo Bo, Urbino, Italy
Andrea Sereni Faculty of Philosophy, Scuola Universitaria Superiore IUSS, Pavia, Italy
Sarah Songhorian Faculty of Philosophy, Vita-Salute San Raffaele University, Milan, Italy
Noel Struchiner Department of Law, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
Vera Tripodi Department of Philosophy and Educational Sciences, University of Turin, Turin, Italy
Naoki Usui Department of Humanities, Mie University, Tsu, Japan
Alejandro Vázquez Del Mercado Instituto de Investigaciones Filosóficas-UNAM, Mexico City, Mexico
Hrag A Vosgerichian Department of History, Philosophy and Judaic Studies, Open University of Israel, Ra'anana, Israel
Xueyi Zhang School of Humanities, Southeast University, Nanjing, China
Jing Zhu School of Information Management, Sun Yat-sen University, Guangzhou, China

Hasz BM, Redish AD. Deliberation and Procedural Automation on a Two-Step Task for Rats. Front Integr Neurosci 2018;12:30. [PMID: 30123115 PMCID: PMC6085996 DOI: 10.3389/fnint.2018.00030] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 12/08/2017] [Accepted: 07/02/2018] [Indexed: 11/25/2022] [Open Access]

A simple computational algorithm of model-based choice preference. Cogn Affect Behav Neurosci 2018;17:764-783. [PMID: 28573384 DOI: 10.3758/s13415-017-0511-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/08/2022]

Fakhari P, Khodadadi A, Busemeyer JR. The detour problem in a stochastic environment: Tolman revisited. Cogn Psychol 2018;101:29-49. [PMID: 29294373 DOI: 10.1016/j.cogpsych.2017.12.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/06/2017] [Revised: 12/22/2017] [Accepted: 12/23/2017] [Indexed: 10/18/2022]

When Does Model-Based Control Pay Off? PLoS Comput Biol 2016;12:e1005090. [PMID: 27564094 PMCID: PMC5001643 DOI: 10.1371/journal.pcbi.1005090] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Received: 04/07/2016] [Accepted: 08/01/2016] [Indexed: 12/02/2022] [Open Access]

Abstract

Many accounts of decision making and reinforcement learning posit the existence of two distinct systems that control choice: a fast, automatic system and a slow, deliberative system. Recent research formalizes this distinction by mapping these systems to “model-free” and “model-based” strategies in reinforcement learning. Model-free strategies are computationally cheap, but sometimes inaccurate, because action values can be accessed by inspecting a look-up table constructed through trial-and-error. In contrast, model-based strategies compute action values through planning in a causal model of the environment, which is more accurate but also more cognitively demanding. It is assumed that this trade-off between accuracy and computational demand plays an important role in the arbitration between the two strategies, but we show that the hallmark task for dissociating model-free and model-based strategies, as well as several related variants, do not embody such a trade-off. We describe five factors that reduce the effectiveness of the model-based strategy on these tasks by reducing its accuracy in estimating reward outcomes and decreasing the importance of its choices. Based on these observations, we describe a version of the task that formally and empirically obtains an accuracy-demand trade-off between model-free and model-based strategies. Moreover, we show that human participants spontaneously increase their reliance on model-based control on this task, compared to the original paradigm. Our novel task and our computational analyses may prove important in subsequent empirical investigations of how humans balance accuracy and demand.

When you make a choice about what groceries to get for dinner, you can rely on two different strategies. You can make your choice by relying on habit, simply buying the items you need to make a meal that is second nature to you. However, you can also plan your actions in a more deliberative way, realizing that the friend who will join you is a vegetarian, and therefore you should not make the burgers that have become a staple in your cooking. These two strategies differ in how computationally demanding and accurate they are. While the habitual strategy is less computationally demanding (costs less effort and time), the deliberative strategy is more accurate. Scientists have been able to study the distinction between these strategies using a task that allows them to measure how much people rely on habit and planning strategies. Interestingly, we have discovered that in this task, the deliberative strategy does not increase performance accuracy, and hence does not induce a trade-off between accuracy and demand. We describe why this happens, and improve the task so that it embodies an accuracy-demand trade-off, providing evidence for theories of cost-based arbitration between cognitive strategies.
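
The contrast drawn in this abstract, between reading an action value from a look-up table and computing it by planning over a model, can be sketched in a few lines. This is a minimal illustration and not the authors' model: the two-stage layout, the transition probabilities, the reward values, the learning rate, and the function names are all assumptions made for the example.

```python
# Hypothetical two-stage task: 2 first-stage actions, 2 second-stage states.
# TRANSITIONS[a][s] is the ASSUMED probability that action a leads to state s.
TRANSITIONS = [[0.7, 0.3],
               [0.3, 0.7]]


def model_free_update(q_table, action, reward, learning_rate=0.1):
    """Model-free: nudge the cached look-up table entry toward the observed reward."""
    updated = list(q_table)
    updated[action] += learning_rate * (reward - updated[action])
    return updated


def model_based_values(transitions, state_values):
    """Model-based: expected value of each action, computed from the transition model."""
    return [sum(p * v for p, v in zip(row, state_values)) for row in transitions]


# One rewarded trial after choosing action 0 (illustrative numbers only).
q_table = model_free_update([0.0, 0.0], action=0, reward=1.0)

# Planning with assumed second-stage state values of 1.0 and 0.0.
planned = model_based_values(TRANSITIONS, [1.0, 0.0])

print(q_table, planned)
```

The model-free learner only changes the entry for the action it just took, whereas the planner recomputes every action value whenever the state values or the transition model change. That extra computation is the "demand" side of the trade-off the abstract describes.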

Gaze data reveal distinct choice processes underlying model-based and model-free reinforcement learning. Nat Commun 2016;7:12438. [PMID: 27511383 PMCID: PMC4987535 DOI: 10.1038/ncomms12438] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 09/23/2015] [Accepted: 07/03/2016] [Indexed: 11/08/2022] [Open Access]

Raio CM, Goldfarb EV, Lempert KM, Sokol-Hessner P. Classifying emotion regulation strategies. Nat Rev Neurosci 2016;17:532. [PMID: 27277870 DOI: 10.1038/nrn.2016.78] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]

Sebold M, Schad DJ, Nebe S, Garbusow M, Jünger E, Kroemer NB, Kathmann N, Zimmermann US, Smolka MN, Rapp MA, Heinz A, Huys QJM. Don't Think, Just Feel the Music: Individuals with Strong Pavlovian-to-Instrumental Transfer Effects Rely Less on Model-based Reinforcement Learning. J Cogn Neurosci 2016;28:985-95. [PMID: 26942321 DOI: 10.1162/jocn_a_00945] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 12/11/2022]

Manza P, Hu S, Ide JS, Farr OM, Zhang S, Leung HC, Li CSR. The effects of methylphenidate on cerebral responses to conflict anticipation and unsigned prediction error in a stop-signal task. J Psychopharmacol 2016;30:283-93. [PMID: 26755547 PMCID: PMC4837899 DOI: 10.1177/0269881115625102] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/24/2022]

Gillan CM, Kosinski M, Whelan R, Phelps EA, Daw ND. Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife 2016;5. [PMID: 26928075 PMCID: PMC4786435 DOI: 10.7554/elife.11305] [Citation(s) in RCA: 280] [Impact Index Per Article: 35.0] [Received: 09/02/2015] [Accepted: 01/14/2016] [Indexed: 12/22/2022] [Open Access]

Abstract

Prominent theories suggest that compulsive behaviors, characteristic of obsessive-compulsive disorder and addiction, are driven by shared deficits in goal-directed control, which confers vulnerability for developing rigid habits. However, recent studies have shown that deficient goal-directed control accompanies several disorders, including those without an obvious compulsive element. Reasoning that this lack of clinical specificity might reflect broader issues with psychiatric diagnostic categories, we investigated whether a dimensional approach would better delineate the clinical manifestations of goal-directed deficits. Using large-scale online assessment of psychiatric symptoms and neurocognitive performance in two independent general-population samples, we found that deficits in goal-directed control were most strongly associated with a symptom dimension comprising compulsive behavior and intrusive thought. This association was highly specific when compared to other non-compulsive aspects of psychopathology. These data showcase a powerful new methodology and highlight the potential of a dimensional, biologically-grounded approach to psychiatry research.

DOI:http://dx.doi.org/10.7554/eLife.11305.001

When an individual resists the temptation to stay out late in order to get a good night’s sleep, he or she is exercising what is known as “goal-directed control”. This kind of control allows individuals to regulate their behaviour in a deliberate manner. It is thought that a reduction in goal-directed control may be linked to compulsiveness or compulsivity, a psychological trait that involves excessive repetition of thoughts or actions. Furthermore, evidence shows that goal-directed control is reduced in people with compulsive disorders, such as obsessive-compulsive disorder (or OCD) and drug addiction. However, failures of goal-directed control have also been reported in other mental health conditions that are not linked to compulsivity, such as social anxiety disorder.

The fact that reduced goal-directed control is found across various mental health conditions highlights a core issue in modern psychiatric research and treatment. Mental health conditions are typically defined and diagnosed by their clinical symptoms, not by their underlying psychological traits or biological abnormalities. This makes it difficult to determine the cause of a specific disorder, as its symptoms are often rooted in the same psychological and biological traits seen in other mental health conditions.

To start to tackle this issue, Gillan et al. used a strategy that allowed them to look at compulsivity as a “trans-diagnostic dimension”; that is, as something that exists on a spectrum and is not specific to one disorder but involved in numerous different mental health conditions. Nearly 2,000 people completed an online task that assessed goal-directed control, and filled in questionnaires that measured symptoms of various mental health conditions. Gillan et al. showed that, as expected, people with reduced goal-directed control were generally more compulsive, and that this relationship could be seen in the context of both OCD and other compulsive disorders such as addiction.

Further, by leveraging the efficiency of online data collection to collect such a large sample, Gillan et al. were also able to examine how much different symptoms co-occurred in people. This enabled them to use a statistical technique to pick out three trans-diagnostic dimensions – compulsive behaviour and intrusive thought, anxious-depression and social withdrawal – and found that only the compulsive factor was associated with reduced goal-directed control. In fact, reduced goal-directed control was found to be more closely related to compulsivity than the symptoms of traditional mental health disorders including OCD. These findings show that research into the causes of mental health conditions and perhaps ultimately diagnosis and treatment – all of which have traditionally approached specific disorders in isolation – would benefit greatly from a trans-diagnostic approach.

DOI:http://dx.doi.org/10.7554/eLife.11305.002

Akam T, Costa R, Dayan P. Simple Plans or Sophisticated Habits? State, Transition and Learning Interactions in the Two-Step Task. PLoS Comput Biol 2015;11:e1004648. [PMID: 26657806 PMCID: PMC4686094 DOI: 10.1371/journal.pcbi.1004648] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 06/18/2015] [Accepted: 11/09/2015] [Indexed: 11/28/2022] [Open Access]

Abstract

The recently developed ‘two-step’ behavioural task promises to differentiate model-based from model-free reinforcement learning, while generating neurophysiologically-friendly decision datasets with parametric variation of decision variables. These desirable features have prompted its widespread adoption. Here, we analyse the interactions between a range of different strategies and the structure of transitions and outcomes in order to examine constraints on what can be learned from behavioural performance. The task involves a trade-off between the need for stochasticity, to allow strategies to be discriminated, and a need for determinism, so that it is worth subjects’ investment of effort to exploit the contingencies optimally. We show through simulation that under certain conditions model-free strategies can masquerade as being model-based. We first show that seemingly innocuous modifications to the task structure can induce correlations between action values at the start of the trial and the subsequent trial events in such a way that analysis based on comparing successive trials can lead to erroneous conclusions. We confirm the power of a suggested correction to the analysis that can alleviate this problem. We then consider model-free reinforcement learning strategies that exploit correlations between where rewards are obtained and which actions have high expected value. These generate behaviour that appears model-based under these, and also more sophisticated, analyses. Exploiting the full potential of the two-step task as a tool for behavioural neuroscience requires an understanding of these issues.

Planning is the use of a predictive model of the consequences of actions to guide decision making. Planning plays a critical role in human behaviour, but isolating its contribution is challenging because it is complemented by control systems which learn values of actions directly from the history of reinforcement, resulting in automatized mappings from states to actions often termed habits. Our study examined a recently developed behavioural task which uses choices in a multi-step decision tree to differentiate planning from value-based control. We compared various strategies using simulations, showing a range that produce behaviour that resembles planning but in fact arises as a fixed mapping from particular sorts of states to action. These results show that when a planning problem is faced repeatedly, sophisticated automatization strategies may be developed which identify that there are in fact a limited number of relevant states of the world each with an appropriate fixed or habitual response. Understanding such strategies is important for the design and interpretation of tasks which aim to isolate the contribution of planning to behaviour. Such strategies are also of independent scientific interest as they may contribute to automatization of behaviour in complex environments.
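
One common form of the "analysis based on comparing successive trials" mentioned in this abstract is a stay-probability tabulation: how often a first-stage choice is repeated, split by whether the previous trial was rewarded and whether its transition was common or rare. The sketch below is an illustration only. The trial encoding, the function name, and the example data are assumptions, and it does not implement the correction the authors propose.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """Tabulate P(repeat previous first-stage choice).

    trials: ordered list of (choice, transition, rewarded) tuples, where
    transition is "common" or "rare" (a hypothetical encoding).
    Returns a dict keyed by (previous transition, previous reward).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [stays, total]
    for prev, curr in zip(trials, trials[1:]):
        key = (prev[1], prev[2])
        counts[key][0] += int(curr[0] == prev[0])
        counts[key][1] += 1
    return {key: stays / total for key, (stays, total) in counts.items()}


# Made-up trials, for illustration only.
example_trials = [
    (0, "common", True),
    (0, "rare", True),
    (1, "common", False),
    (0, "common", True),
    (0, "rare", False),
]

stay = stay_probabilities(example_trials)
print(stay)
```

In this kind of tabulation, an effect of reward alone is usually read as model-free and a reward-by-transition interaction as model-based. The abstract's point is that such readings can mislead when action values at the start of a trial are correlated with subsequent trial events.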

Model-Based Reasoning in Humans Becomes Automatic with Training. PLoS Comput Biol 2015;11:e1004463. [PMID: 26379239 PMCID: PMC4588166 DOI: 10.1371/journal.pcbi.1004463] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 03/09/2015] [Accepted: 06/13/2015] [Indexed: 11/19/2022] [Open Access]

Otto AR, Skatova A, Madlon-Kay S, Daw ND. Cognitive control predicts use of model-based reinforcement learning. J Cogn Neurosci 2015;27:319-33. [PMID: 25170791 DOI: 10.1162/jocn_a_00709] [Citation(s) in RCA: 108] [Impact Index Per Article: 12.0] [Indexed: 12/22/2022]

Pickering AD, Pesola F. Modeling dopaminergic and other processes involved in learning from reward prediction error: contributions from an individual differences perspective. Front Hum Neurosci 2014;8:740. [PMID: 25324752 PMCID: PMC4179695 DOI: 10.3389/fnhum.2014.00740] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 05/04/2014] [Accepted: 09/03/2014] [Indexed: 11/13/2022] [Open Access]

Multiple memory systems as substrates for multiple decision systems. Neurobiol Learn Mem 2014;117:4-13. [PMID: 24846190 DOI: 10.1016/j.nlm.2014.04.014] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.7] [Received: 11/19/2013] [Revised: 04/22/2014] [Accepted: 04/29/2014] [Indexed: 11/22/2022]

Abstract

It has recently become widely appreciated that value-based decision making is supported by multiple computational strategies. In particular, animal and human behavior in learning tasks appears to include habitual responses described by prominent model-free reinforcement learning (RL) theories, but also more deliberative or goal-directed actions that can be characterized by a different class of theories, model-based RL. The latter theories evaluate actions by using a representation of the contingencies of the task (as with a learned map of a spatial maze), called an "internal model." Given the evidence of behavioral and neural dissociations between these approaches, they are often characterized as dissociable learning systems, though they likely interact and share common mechanisms. In many respects, this division parallels a longstanding dissociation in cognitive neuroscience between multiple memory systems, describing, at the broadest level, separate systems for declarative and procedural learning. Procedural learning has notable parallels with model-free RL: both involve learning of habits and both are known to depend on parts of the striatum. Declarative memory, by contrast, supports memory for single events or episodes and depends on the hippocampus. The hippocampus is thought to support declarative memory by encoding temporal and spatial relations among stimuli and thus is often referred to as a relational memory system. Such relational encoding is likely to play an important role in learning an internal model, the representation that is central to model-based RL. Thus, insofar as the memory systems represent more general-purpose cognitive mechanisms that might subserve performance on many sorts of tasks including decision making, these parallels raise the question whether the multiple decision systems are served by multiple memory systems, such that one dissociation is grounded in the other. 
Here we investigated the relationship between model-based RL and relational memory by comparing individual differences across behavioral tasks designed to measure either capacity. Human subjects performed two tasks, a learning and generalization task (acquired equivalence) which involves relational encoding and depends on the hippocampus; and a sequential RL task that could be solved by either a model-based or model-free strategy. We assessed the correlation between subjects' use of flexible, relational memory, as measured by generalization in the acquired equivalence task, and their differential reliance on either RL strategy in the decision task. We observed a significant positive relationship between generalization and model-based, but not model-free, choice strategies. These results are consistent with the hypothesis that model-based RL, like acquired equivalence, relies on a more general-purpose relational memory system.
