Reference Citation Analysis

For: Moran R, Keramati M, Dayan P, Dolan RJ. Retrospective model-based inference guides model-free credit assignment. Nat Commun 2019;10:750. [PMID: 30765718 PMCID: PMC6375980 DOI: 10.1038/s41467-019-08662-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/03/2018] [Accepted: 01/17/2019] [Indexed: 11/09/2022] Open

Cited by Other Article(s)

Scholz V, Waltmann M, Herzog N, Reiter A, Horstmann A, Deserno L. Cortical Grey Matter Mediates Increases in Model-Based Control and Learning from Positive Feedback from Adolescence to Adulthood. J Neurosci 2023;43:2178-2189. [PMID: 36823039 PMCID: PMC10039741 DOI: 10.1523/jneurosci.1418-22.2023] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/21/2022] [Revised: 12/20/2022] [Accepted: 01/13/2023] [Indexed: 02/25/2023] Open

Abstract

Cognition and brain structure undergo significant maturation from adolescence into adulthood. Model-based (MB) control is known to increase across development, which is mediated by cognitive abilities. Here, we asked two questions unaddressed in previous developmental studies. First, what are the brain structural correlates of age-related increases in MB control? Second, how are age-related increases in MB control from adolescence to adulthood influenced by motivational context? A human developmental sample (n = 103; age, 12-50, male/female, 55:48) completed structural MRI and an established task to capture MB control. The task was modified with respect to outcome valence by including (1) reward and punishment blocks to manipulate the motivational context and (2) an additional choice test to assess learning from positive versus negative feedback. After replicating that an age-dependent increase in MB control is mediated by cognitive abilities, we demonstrate first-time evidence that gray matter density (GMD) in the parietal cortex mediates the increase of MB control with age. Although motivational context did not relate to age-related changes in MB control, learning from positive feedback improved with age. Meanwhile, negative feedback learning showed no age effects. We present a first report that an age-related increase in positive feedback learning was mediated by reduced GMD in the parietal, medial, and dorsolateral prefrontal cortex. Our findings indicate that brain maturation, putatively reflected in lower GMD, in distinct and partially overlapping brain regions could lead to a more efficient brain organization and might thus be a key developmental step toward age-related increases in planning and value-based choice.

SIGNIFICANCE STATEMENT Changes in model-based decision-making are paralleled by extensive maturation in cognition and brain structure across development. Still, to date the neuroanatomical underpinnings of these changes remain unclear. Here, we demonstrate for the first time that parietal GMD mediates age-dependent increases in model-based control. Age-related increases in positive feedback learning were mediated by reduced GMD in the parietal, medial, and dorsolateral prefrontal cortex. A manipulation of motivational context did not have an impact on age-related changes in model-based control. These findings highlight that brain maturation in distinct and overlapping cortical regions constitutes a key developmental step toward improved value-based choices.

Affiliation(s)

Vanessa Scholz Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany Donders Institute for Brain, Cognition and Behaviour, Radboud University, 6525 GD Nijmegen, The Netherlands
Maria Waltmann Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany
Nadine Herzog Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany Integrated Research and Treatment Center AdiposityDiseases, Leipzig University Medical Center, 04103 Leipzig, Germany
Andrea Reiter Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany Collaborative Research Center-940 Volition and Cognitive Control, Faculty of Psychology, Technical University Dresden, 01069 Dresden, Germany
Annette Horstmann Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany Integrated Research and Treatment Center AdiposityDiseases, Leipzig University Medical Center, 04103 Leipzig, Germany Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, 00014 Helsinki, Finland
Lorenz Deserno Department of Child and Adolescent Psychiatry, Psychosomatics and Psychotherapy, Centre of Mental Health, University of Würzburg, 97080 Würzburg, Germany Max Planck Institute for Cognition and Neuroscience, D-04103 Leipzig, Germany Integrated Research and Treatment Center AdiposityDiseases, Leipzig University Medical Center, 04103 Leipzig, Germany Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus, Technical University Dresden, 01069 Dresden, Germany

Model-based learning retrospectively updates model-free values. Sci Rep 2022;12:2358. [PMID: 35149713 PMCID: PMC8837618 DOI: 10.1038/s41598-022-05567-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2021] [Accepted: 12/16/2021] [Indexed: 12/02/2022] Open

Optimism and pessimism in optimised replay. PLoS Comput Biol 2022;18:e1009634. [PMID: 35020718 PMCID: PMC8809607 DOI: 10.1371/journal.pcbi.1009634] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/06/2021] [Revised: 02/02/2022] [Accepted: 11/12/2021] [Indexed: 11/24/2022] Open

Abstract

The replay of task-relevant trajectories is known to contribute to memory consolidation and improved task performance. A wide variety of experimental data show that the content of replayed sequences is highly specific and can be modulated by reward as well as other prominent task variables. However, the rules governing the choice of sequences to be replayed still remain poorly understood. One recent theoretical suggestion is that the prioritization of replay experiences in decision-making problems is based on their effect on the choice of action. We show that this implies that subjects should replay sub-optimal actions that they dysfunctionally choose rather than optimal ones, when, by being forgetful, they experience large amounts of uncertainty in their internal models of the world. We use this to account for recent experimental data demonstrating exactly pessimal replay, fitting model parameters to the individual subjects’ choices.

When animals are asleep or restfully awake, populations of neurons in their brains recapitulate activity associated with extended behaviourally-relevant experiences. This process is called replay, and it has been established for a long time in rodents, and very recently in humans, to be important for good performance in decision-making tasks. The specific experiences which are replayed during those epochs follow highly ordered patterns, but the mechanisms which establish their priority are still not fully understood. One promising theoretical suggestion is that each replay experience is chosen in such a way that the learning that ensues is most helpful for the subsequent performance of the animal. A very recent study reported a surprising result that humans who achieved high performance in a planning task tended to replay actions they found to be sub-optimal, and that this was associated with a useful deprecation of those actions in subsequent performance. In this study, we examine the nature of this pessimized form of replay and show that it is exactly appropriate for forgetful agents. We analyse the role of forgetting for replay choices of our model, and verify our predictions using human subject data.

Collins AGE, Shenhav A. Advances in modeling learning and decision-making in neuroscience. Neuropsychopharmacology 2022;47:104-118. [PMID: 34453117 PMCID: PMC8617262 DOI: 10.1038/s41386-021-01126-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 04/03/2021] [Revised: 07/14/2021] [Accepted: 07/22/2021] [Indexed: 02/07/2023]

Deserno L, Moran R, Michely J, Lee Y, Dayan P, Dolan RJ. Dopamine enhances model-free credit assignment through boosting of retrospective model-based inference. eLife 2021;10:e67778. [PMID: 34882092 PMCID: PMC8758138 DOI: 10.7554/elife.67778] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/22/2021] [Accepted: 12/08/2021] [Indexed: 11/13/2022] Open

Affiliation(s)

Lorenz Deserno Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, University of Würzburg, Würzburg, Germany Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
Rani Moran Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
Jochen Michely Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom Department of Psychiatry and Psychotherapy, Charité Universitätsmedizin Berlin, Berlin, Germany
Ying Lee Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
Peter Dayan Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom Max Planck Institute for Biological Cybernetics, Tübingen, Germany University of Tübingen, Tübingen, Germany
Raymond J Dolan Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom

Shahar N, Hauser TU, Moran R, Moutoussis M, Bullmore ET, Dolan RJ. Assigning the right credit to the wrong action: compulsivity in the general population is associated with augmented outcome-irrelevant value-based learning. Transl Psychiatry 2021;11:564. [PMID: 34741013 PMCID: PMC8571313 DOI: 10.1038/s41398-021-01642-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2021] [Revised: 09/01/2021] [Accepted: 09/21/2021] [Indexed: 11/08/2022] Open

Na S, Chung D, Hula A, Perl O, Jung J, Heflin M, Blackmore S, Fiore VG, Dayan P, Gu X. Humans use forward thinking to exploit social controllability. eLife 2021;10:64983. [PMID: 34711304 PMCID: PMC8555988 DOI: 10.7554/elife.64983] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 11/18/2020] [Accepted: 09/30/2021] [Indexed: 12/27/2022] Open

Yu LQ, Wilson RC, Nassar MR. Adaptive learning is structure learning in time. Neurosci Biobehav Rev 2021;128:270-281. [PMID: 34144114 PMCID: PMC8422504 DOI: 10.1016/j.neubiorev.2021.06.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/01/2020] [Revised: 04/19/2021] [Accepted: 06/11/2021] [Indexed: 10/21/2022]

Moran R, Dayan P, Dolan RJ. Efficiency and prioritization of inference-based credit assignment. Curr Biol 2021;31:2747-2756.e6. [PMID: 33887181 PMCID: PMC8279739 DOI: 10.1016/j.cub.2021.03.091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2020] [Revised: 02/11/2021] [Accepted: 03/29/2021] [Indexed: 11/16/2022]

Xia L, Collins AGE. Temporal and state abstractions for efficient learning, transfer, and composition in humans. Psychol Rev 2021;128:643-666. [PMID: 34014709 PMCID: PMC8485577 DOI: 10.1037/rev0000295] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/08/2022]

Wood AN. New roles for dopamine in motor skill acquisition: lessons from primates, rodents, and songbirds. J Neurophysiol 2021;125:2361-2374. [PMID: 33978497 DOI: 10.1152/jn.00648.2020] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/22/2022] Open

Human subjects exploit a cognitive map for credit assignment. Proc Natl Acad Sci U S A 2021;118:2016884118. [PMID: 33479182 PMCID: PMC7848688 DOI: 10.1073/pnas.2016884118] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 02/07/2023] Open

Abstract

Credit assignment (CA) to relevant actions poses a challenge because one is often flooded with reward feedback that is not easily causally attributed. We addressed this issue in a reinforcement learning framework wherein choice is mutually controlled by value-caching model-free (MF) and prospective, planning model-based (MB) systems. We find knowledge, stored in a cognitive map, filters exuberant reward feedback to guide CA in both systems but based on different attribute dimensions. In MF, CA is boosted for outcomes that are relevant (causally related) to one’s choice, whereas in MB, CA is enhanced for outcomes that attract greater attention during the deliberation process that preceded a choice. We consider normative and mechanistic accounts, including how these processes are instrumental to adaptation.

An influential reinforcement learning framework proposes that behavior is jointly governed by model-free (MF) and model-based (MB) controllers. The former learns the values of actions directly from past encounters, and the latter exploits a cognitive map of the task to calculate these prospectively. Considerable attention has been paid to how these systems interact during choice, but how and whether knowledge of a cognitive map contributes to the way MF and MB controllers assign credit (i.e., to how they revaluate actions and states following the receipt of an outcome) remains underexplored. Here, we examine such sophisticated credit assignment using a dual-outcome bandit task. We provide evidence that knowledge of a cognitive map influences credit assignment in both MF and MB systems, mediating subtly different aspects of apparent relevance. Specifically, we show MF credit assignment is enhanced for those rewards that are related to a choice, and this contrasted with choice-unrelated rewards that reinforced subsequent choices negatively. This modulation is only possible based on knowledge of task structure. On the other hand, MB credit assignment was boosted for outcomes that impacted on differences in values between offered bandits. We consider mechanistic accounts and the normative status of these findings. We suggest the findings extend the scope and sophistication of cognitive map-based credit assignment during reinforcement learning, with implications for understanding behavioral control.

Moran R, Keramati M, Dolan RJ. Model based planners reflect on their model-free propensities. PLoS Comput Biol 2021;17:e1008552. [PMID: 33411724 PMCID: PMC7817042 DOI: 10.1371/journal.pcbi.1008552] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/28/2020] [Revised: 01/20/2021] [Accepted: 11/23/2020] [Indexed: 12/19/2022] Open

Steinke A, Lange F, Kopp B. Parallel model-based and model-free reinforcement learning for card sorting performance. Sci Rep 2020;10:15464. [PMID: 32963297 PMCID: PMC7508815 DOI: 10.1038/s41598-020-72407-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/27/2020] [Accepted: 08/31/2020] [Indexed: 12/13/2022] Open

Collins AGE, Cockburn J. Beyond dichotomies in reinforcement learning. Nat Rev Neurosci 2020;21:576-586. [PMID: 32873936 DOI: 10.1038/s41583-020-0355-6] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Accepted: 07/20/2020] [Indexed: 11/09/2022]

FitzGerald THB, Penny WD, Bonnici HM, Adams RA. Retrospective Inference as a Form of Bounded Rationality, and Its Beneficial Influence on Learning. Front Artif Intell 2020;3:2. [PMID: 33733122 PMCID: PMC7861256 DOI: 10.3389/frai.2020.00002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/12/2019] [Accepted: 01/14/2020] [Indexed: 12/22/2022] Open

Abstract

Probabilistic models of cognition typically assume that agents make inferences about current states by combining new sensory information with fixed beliefs about the past, an approach known as Bayesian filtering. This is computationally parsimonious, but, in general, leads to suboptimal beliefs about past states, since it ignores the fact that new observations typically contain information about the past as well as the present. This is disadvantageous both because knowledge of past states may be intrinsically valuable, and because it impairs learning about fixed or slowly changing parameters of the environment. For these reasons, in offline data analysis it is usual to infer on every set of states using the entire time series of observations, an approach known as (fixed-interval) Bayesian smoothing. Unfortunately, however, this is impractical for real agents, since it requires the maintenance and updating of beliefs about an ever-growing set of states. We propose an intermediate approach, finite retrospective inference (FRI), in which agents update beliefs about a limited number of past states (formally, this represents online fixed-lag smoothing with a sliding window). This can be seen as a form of bounded rationality in which agents seek to optimize the accuracy of their beliefs subject to computational and other resource costs. We show through simulation that this approach has the capacity to significantly increase the accuracy of both inference and learning, using a simple variational scheme applied to both randomly generated Hidden Markov models (HMMs), and a specific application of the HMM, in the form of the widely used probabilistic reversal task. Our proposal thus constitutes a theoretical contribution to normative accounts of bounded rationality, which makes testable empirical predictions that can be explored in future work.
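
The contrast this abstract draws between Bayesian filtering and fixed-lag smoothing can be illustrated with a minimal sketch on a two-state HMM. It uses exact forward-backward recursions rather than the variational scheme the authors describe, and every name and number in it (`forward_filter`, `fixed_lag_smooth`, the matrices `A` and `B`, the observation sequence) is a hypothetical choice for illustration, not taken from the paper.

```python
import numpy as np

def forward_filter(A, B, pi, obs):
    """Filtering: p(z_t | x_1..x_t) for every t.

    A[i, j] = p(z_{t+1}=j | z_t=i), B[i, k] = p(x=k | z=i), pi = p(z_1).
    """
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    a = pi * B[:, obs[0]]
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ A) * B[:, obs[t]]  # predict, then weight by likelihood
        alpha[t] = a / a.sum()
    return alpha

def fixed_lag_smooth(A, B, pi, obs, lag):
    """Fixed-lag smoothing: p(z_t | x_1..x_{min(t+lag, T)}) for every t."""
    alpha = forward_filter(A, B, pi, obs)
    T, K = alpha.shape
    smoothed = np.zeros((T, K))
    for t in range(T):
        end = min(t + lag, T - 1)
        beta = np.ones(K)
        # Backward pass over the sliding window only (at most `lag` steps).
        for s in range(end, t, -1):
            beta = A @ (B[:, obs[s]] * beta)
            beta /= beta.sum()  # rescale for numerical stability
        post = alpha[t] * beta
        smoothed[t] = post / post.sum()
    return smoothed

# Hypothetical two-state example: sticky transitions, noisy observations.
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.8, 0.2], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
obs = [0, 0, 1, 1, 1, 0]

filtered = forward_filter(A, B, pi, obs)
lagged = fixed_lag_smooth(A, B, pi, obs, lag=2)
```

With `lag=0` the function reduces to filtering, and with a lag of at least `len(obs) - 1` it reduces to fixed-interval smoothing, so the lag acts as the resource budget that the abstract frames as bounded rationality.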

Kopp B, Steinke A, Bertram M, Skripuletz T, Lange F. Multiple Levels of Control Processes for Wisconsin Card Sorts: An Observational Study. Brain Sci 2019;9:brainsci9060141. [PMID: 31213007 PMCID: PMC6627185 DOI: 10.3390/brainsci9060141] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/21/2019] [Revised: 06/12/2019] [Accepted: 06/15/2019] [Indexed: 11/16/2022] Open

Radulescu A, Niv Y. State representation in mental illness. Curr Opin Neurobiol 2019;55:160-166. [PMID: 31051434 DOI: 10.1016/j.conb.2019.03.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/31/2018] [Revised: 03/10/2019] [Accepted: 03/25/2019] [Indexed: 10/26/2022]