1
Velázquez-Vargas CA, Daw ND, Taylor JA. The role of training variability for model-based and model-free learning of an arbitrary visuomotor mapping. PLoS Comput Biol 2024; 20:e1012471. [PMID: 39331685 PMCID: PMC11463753 DOI: 10.1371/journal.pcbi.1012471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Revised: 10/09/2024] [Accepted: 09/06/2024] [Indexed: 09/29/2024] Open
Abstract
A fundamental feature of the human brain is its capacity to learn novel motor skills. This capacity requires the formation of vastly different visuomotor mappings. Using a grid navigation task, we investigated whether training variability would enhance the flexible use of a visuomotor mapping (key-to-direction rule), leading to better generalization performance. Experiments 1 and 2 show that participants trained to move between multiple start-target pairs exhibited greater generalization to both distal and proximal targets compared to participants trained to move between a single pair. This finding suggests that limited variability can impair decisions even in simple tasks without planning. In addition, during the training phase, participants exposed to higher variability were more inclined to choose options that, counterintuitively, moved the cursor away from the target while minimizing its actual distance under the constrained mapping, suggesting a greater engagement in model-based computations. In Experiments 3 and 4, we showed that the limited generalization performance in participants trained with a single pair can be enhanced by a short period of variability introduced early in learning or by incorporating stochasticity into the visuomotor mapping. Our computational modeling analyses revealed that a hybrid of model-free and model-based computations, with different mixing weights for the training and generalization phases, best described participants' data. Importantly, the differences in the model-based weights between our experimental groups paralleled the behavioral findings during training and generalization. Taken together, our results suggest that training variability enables the flexible use of the visuomotor mapping, potentially by preventing the consolidation of habits due to the continuous demand to change responses.
Affiliation(s)
- Nathaniel D. Daw
- Department of Psychology, Princeton University, Princeton, New Jersey, United States of America
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, United States of America
- Jordan A. Taylor
- Department of Psychology, Princeton University, Princeton, New Jersey, United States of America
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, United States of America
2
Kessler F, Frankenstein J, Rothkopf CA. Human navigation strategies and their errors result from dynamic interactions of spatial uncertainties. Nat Commun 2024; 15:5677. [PMID: 38971789 PMCID: PMC11227593 DOI: 10.1038/s41467-024-49722-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 06/14/2024] [Indexed: 07/08/2024] Open
Abstract
Goal-directed navigation requires continuously integrating uncertain self-motion and landmark cues into an internal sense of location and direction, concurrently planning future paths, and sequentially executing motor actions. Here, we provide a unified account of these processes with a computational model of probabilistic path planning in the framework of optimal feedback control under uncertainty. This model gives rise to diverse human navigational strategies previously believed to be distinct behaviors and predicts quantitatively both the errors and the variability of navigation across numerous experiments. This furthermore explains how sequential egocentric landmark observations form an uncertain allocentric cognitive map, how this internal map is used both in route planning and during execution of movements, and reconciles seemingly contradictory results about cue-integration behavior in navigation. Taken together, the present work provides a parsimonious explanation of how patterns of human goal-directed navigation behavior arise from the continuous and dynamic interactions of spatial uncertainties in perception, cognition, and action.
Affiliation(s)
- Fabian Kessler
- Centre for Cognitive Science & Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany.
- Julia Frankenstein
- Centre for Cognitive Science & Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Constantin A Rothkopf
- Centre for Cognitive Science & Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Frankfurt Institute for Advanced Studies, Goethe University, Frankfurt, Germany
3
Katayama R, Shiraki R, Ishii S, Yoshida W. Belief inference for hierarchical hidden states in spatial navigation. Commun Biol 2024; 7:614. [PMID: 38773301 PMCID: PMC11109253 DOI: 10.1038/s42003-024-06316-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Accepted: 05/10/2024] [Indexed: 05/23/2024] Open
Abstract
Uncertainty abounds in the real world, and in environments with multiple layers of unobservable hidden states, decision-making requires resolving uncertainties based on mutual inference. Focusing on a spatial navigation problem, we develop a Tiger maze task that involved simultaneously inferring the local hidden state and the global hidden state from probabilistically uncertain observation. We adopt a Bayesian computational approach by proposing a hierarchical inference model. Applying this to human task behaviour, alongside functional magnetic resonance brain imaging, allows us to separate the neural correlates associated with reinforcement and reassessment of belief in hidden states. The imaging results also suggest that different layers of uncertainty differentially involve the basal ganglia and dorsomedial prefrontal cortex, and that the regions responsible are organised along the rostral axis of these areas according to the type of inference and the level of abstraction of the hidden state, i.e. higher-order state inference involves more anterior parts.
Affiliation(s)
- Risa Katayama
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan.
- Department of AI-Brain Integration, Advanced Telecommunications Research Institute International, Kyoto, 619-0288, Japan.
- Ryo Shiraki
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan
- Shin Ishii
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan
- Neural Information Analysis Laboratories, Advanced Telecommunications Research Institute International, Kyoto, 619-0288, Japan
- International Research Center for Neurointelligence, the University of Tokyo, Tokyo, 113-0033, Japan
- Wako Yoshida
- Department of Neural Computation for Decision-Making, Advanced Telecommunications Research Institute International, Kyoto, 619-0288, Japan
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
4
He Q, Liu JL, Eschapasse L, Zagora AK, Brown TI. The neural correlates of memory integration in value-based decision-making during human spatial navigation. Neuropsychologia 2024; 193:108758. [PMID: 38103679 DOI: 10.1016/j.neuropsychologia.2023.108758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 09/12/2023] [Accepted: 12/11/2023] [Indexed: 12/19/2023]
Abstract
In daily life, we often make decisions based on the relative value of the options, and we often derive these values from segmenting or integrating the outcomes of past episodes in memory. The neural correlates involved in value-based decision-making have been extensively studied in the literature, but few studies have investigated this topic in decisions that require segmenting or integrating episodic memory from related sources, and even fewer studies examine it in the context of spatial navigation. Building on the computational models from our previous studies, the current study investigates the neural substrates involved in decisions that require people to either segment or integrate wayfinding outcomes involving different goals, across virtual spatial navigation tasks with differing demands. We find that when decisions require computation of spatial distances for navigation options, but also evaluation of one's prior spatial navigation ability with the task, the estimated value of navigational choices (EV) modulates neural activity in the dorsomedial prefrontal cortex (dmPFC) and ventrolateral prefrontal cortex (vlPFC). However, superior parietal cortex tracked EV when decision-making tasks only require spatial distance memory but not evaluation of spatial navigation ability. Our findings reveal divergent neural substrates of memory integration in value-based decision-making under different spatial processing demands.
Affiliation(s)
- Qiliang He
- School of Psychology, Georgia Institute of Technology, USA.
- Jancy Ling Liu
- School of Economics, Georgia Institute of Technology, USA
- Lou Eschapasse
- School of Psychology, Georgia Institute of Technology, USA
- Anna K Zagora
- School of Biological Sciences, Georgia Institute of Technology, USA
5
Muhle-Karbe PS, Sheahan H, Pezzulo G, Spiers HJ, Chien S, Schuck NW, Summerfield C. Goal-seeking compresses neural codes for space in the human hippocampus and orbitofrontal cortex. Neuron 2023; 111:3885-3899.e6. [PMID: 37725981 DOI: 10.1016/j.neuron.2023.08.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/13/2023] [Revised: 05/10/2023] [Accepted: 08/18/2023] [Indexed: 09/21/2023]
Abstract
Humans can navigate flexibly to meet their goals. Here, we asked how the neural representation of allocentric space is distorted by goal-directed behavior. Participants navigated an agent to two successive goal locations in a grid world environment comprising four interlinked rooms, with a contextual cue indicating the conditional dependence of one goal location on another. Examining the neural geometry by which room and context were encoded in fMRI signals, we found that map-like representations of the environment emerged in both hippocampus and neocortex. Cognitive maps in hippocampus and orbitofrontal cortices were compressed so that locations cued as goals were coded together in neural state space, and these distortions predicted successful learning. This effect was captured by a computational model in which current and prospective locations are jointly encoded in a place code, providing a theory of how goals warp the neural representation of space in macroscopic neural signals.
Affiliation(s)
- Paul S Muhle-Karbe
- Department of Experimental Psychology, University of Oxford, Oxford OX2 6GG, UK; School of Psychology, University of Birmingham, Birmingham B15 2SA, UK; Centre for Human Brain Health, University of Birmingham, Birmingham B15 2SA, UK.
- Hannah Sheahan
- Department of Experimental Psychology, University of Oxford, Oxford OX2 6GG, UK; Google DeepMind, London EC4A 3TW, UK
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, 00185 Rome, Italy
- Hugo J Spiers
- Department of Experimental Psychology, University College London, London WC1E 6BT, UK
- Samson Chien
- Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, 14195 Berlin, Germany
- Nicolas W Schuck
- Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, 14195 Berlin, Germany; Max Planck UCL Centre for Computational Psychiatry and Aging Research, 14195 Berlin, Germany; Institute of Psychology, Universität Hamburg, 20146 Hamburg, Germany
- Christopher Summerfield
- Department of Experimental Psychology, University of Oxford, Oxford OX2 6GG, UK; Centre for Human Brain Health, University of Birmingham, Birmingham B15 2SA, UK.
6
Krausz TA, Comrie AE, Kahn AE, Frank LM, Daw ND, Berke JD. Dual credit assignment processes underlie dopamine signals in a complex spatial environment. Neuron 2023; 111:3465-3478.e7. [PMID: 37611585 PMCID: PMC10841332 DOI: 10.1016/j.neuron.2023.07.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/08/2023] [Revised: 06/23/2023] [Accepted: 07/25/2023] [Indexed: 08/25/2023]
Abstract
Animals frequently make decisions based on expectations of future reward ("values"). Values are updated by ongoing experience: places and choices that result in reward are assigned greater value. Yet, the specific algorithms used by the brain for such credit assignment remain unclear. We monitored accumbens dopamine as rats foraged for rewards in a complex, changing environment. We observed brief dopamine pulses both at reward receipt (scaling with prediction error) and at novel path opportunities. Dopamine also ramped up as rats ran toward reward ports, in proportion to the value at each location. By examining the evolution of these dopamine place-value signals, we found evidence for two distinct update processes: progressive propagation of value along taken paths, as in temporal difference learning, and inference of value throughout the maze, using internal models. Our results demonstrate that within rich, naturalistic environments dopamine conveys place values that are updated via multiple, complementary learning algorithms.
Affiliation(s)
- Timothy A Krausz
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA
- Alison E Comrie
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA
- Ari E Kahn
- Department of Psychology, and Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Loren M Frank
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA; Department of Physiology, University of California, San Francisco, San Francisco, CA 94158, USA
- Nathaniel D Daw
- Department of Psychology, and Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Joshua D Berke
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA; Kavli Institute for Fundamental Neuroscience, and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Neurology and Department of Psychiatry and Behavioral Science, University of California, San Francisco, San Francisco, CA 94158, USA.
7
Mehrotra D, Dubé L. Accounting for multiscale processing in adaptive real-world decision-making via the hippocampus. Front Neurosci 2023; 17:1200842. [PMID: 37732307 PMCID: PMC10508350 DOI: 10.3389/fnins.2023.1200842] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/05/2023] [Accepted: 08/25/2023] [Indexed: 09/22/2023] Open
Abstract
For adaptive real-time behavior in real-world contexts, the brain needs to allow past information over multiple timescales to influence current processing for making choices that create the best outcome as a person goes about making choices in their everyday life. The neuroeconomics literature on value-based decision-making has formalized such choice through reinforcement learning models for two extreme strategies. These strategies are model-free (MF), which is an automatic, stimulus-response type of action, and model-based (MB), which bases choice on cognitive representations of the world and causal inference on environment-behavior structure. The emphasis of examining the neural substrates of value-based decision making has been on the striatum and prefrontal regions, especially with regards to the "here and now" decision-making. Yet, such a dichotomy does not embrace all the dynamic complexity involved. In addition, despite robust research on the role of the hippocampus in memory and spatial learning, its contribution to value-based decision making is just starting to be explored. This paper aims to better appreciate the role of the hippocampus in decision-making and advance the successor representation (SR) as a candidate mechanism for encoding state representations in the hippocampus, separate from reward representations. To this end, we review research that relates hippocampal sequences to SR models showing that the implementation of such sequences in reinforcement learning agents improves their performance. This also enables the agents to perform multiscale temporal processing in a biologically plausible manner. Altogether, we articulate a framework to advance current striatal and prefrontal-focused decision making to better account for multiscale mechanisms underlying various real-world time-related concepts such as the self that cumulates over a person's life course.
Affiliation(s)
- Dhruv Mehrotra
- Integrated Program in Neuroscience, McGill University, Montréal, QC, Canada
- Montréal Neurological Institute, McGill University, Montréal, QC, Canada
- Laurette Dubé
- Desautels Faculty of Management, McGill University, Montréal, QC, Canada
- McGill Center for the Convergence of Health and Economics, McGill University, Montréal, QC, Canada
8
Krausz TA, Comrie AE, Frank LM, Daw ND, Berke JD. Dual credit assignment processes underlie dopamine signals in a complex spatial environment. bioRxiv 2023:2023.02.15.528738. [PMID: 36993482 PMCID: PMC10054934 DOI: 10.1101/2023.02.15.528738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Dopamine in the nucleus accumbens helps motivate behavior based on expectations of future reward ("values"). These values need to be updated by experience: after receiving reward, the choices that led to reward should be assigned greater value. There are multiple theoretical proposals for how this credit assignment could be achieved, but the specific algorithms that generate updated dopamine signals remain uncertain. We monitored accumbens dopamine as freely behaving rats foraged for rewards in a complex, changing environment. We observed brief pulses of dopamine both when rats received reward (scaling with prediction error), and when they encountered novel path opportunities. Furthermore, dopamine ramped up as rats ran towards reward ports, in proportion to the value at each location. By examining the evolution of these dopamine place-value signals, we found evidence for two distinct update processes: progressive propagation along taken paths, as in temporal-difference learning, and inference of value throughout the maze, using internal models. Our results demonstrate that within rich, naturalistic environments dopamine conveys place values that are updated via multiple, complementary learning algorithms.
Affiliation(s)
- Timothy A Krausz
- Neuroscience Graduate Program, University of California, San Francisco
- Alison E Comrie
- Neuroscience Graduate Program, University of California, San Francisco
- Loren M Frank
- Neuroscience Graduate Program, University of California, San Francisco
- Kavli Institute for Fundamental Neuroscience, and Weill Institute for Neurosciences, UCSF
- Howard Hughes Medical Institute
- Department of Physiology, UCSF
- Nathaniel D Daw
- Department of Psychology, and Princeton Neuroscience Institute, Princeton University, NJ
- Joshua D Berke
- Neuroscience Graduate Program, University of California, San Francisco
- Kavli Institute for Fundamental Neuroscience, and Weill Institute for Neurosciences, UCSF
- Department of Neurology, and Department of Psychiatry and Behavioral Science, UCSF
9
Rens N, Lancia GL, Eluchans M, Schwartenbeck P, Cunnington R, Pezzulo G. Evidence for entropy maximisation in human free choice behaviour. Cognition 2023; 232:105328. [PMID: 36463639 DOI: 10.1016/j.cognition.2022.105328] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/06/2021] [Revised: 11/10/2022] [Accepted: 11/12/2022] [Indexed: 12/05/2022]
Abstract
The freedom to choose between options is strongly linked to notions of free will. Accordingly, several studies have shown that individuals demonstrate a preference for choice, or the availability of multiple options, over and above utilitarian value. Yet we lack a decision-making framework that integrates preference for choice with traditional utility maximisation in free choice behaviour. Here we test the predictions of an inference-based model of decision-making in which an agent actively seeks states yielding entropy (availability of options) in addition to utility (economic reward). We designed a study in which participants freely navigated a virtual environment consisting of two consecutive choices leading to reward locations in separate rooms. Critically, the choice of one room always led to two final doors while, in the second room, only one door was permissible to choose. This design allowed us to separately determine the influence of utility and entropy on participants' choice behaviour and their self-evaluation of free will. We found that choice behaviour was better predicted by an inference-based model than by expected utility alone, and that both the availability of options and the value of the context positively influenced participants' perceived freedom of choice. Moreover, this consideration of options was apparent in the ongoing motion dynamics as individuals navigated the environment. In a second study, in which participants selected between rooms that gave access to three or four doors, we observed a similar pattern of results, with participants preferring the room that gave access to more options and feeling freer in it. These results suggest that free choice behaviour is well explained by an inference-based framework in which both utility and entropy are optimised and supports the idea that the feeling of having free will is tightly related to options availability.
Affiliation(s)
- Natalie Rens
- Queensland Brain Institute, The University of Queensland, St Lucia, Queensland 4072, Australia
- Gian Luca Lancia
- Institute of Cognitive Sciences and Technologies, National Research Council, Via S. Martino della Battaglia, 44, 00185 Rome, Italy; University of Rome "La Sapienza", Rome, Italy
- Mattia Eluchans
- Institute of Cognitive Sciences and Technologies, National Research Council, Via S. Martino della Battaglia, 44, 00185 Rome, Italy; University of Rome "La Sapienza", Rome, Italy
- Philipp Schwartenbeck
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom; Oxford Centre for Functional MRI of the Brain, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom; University of Tübingen, Tübingen, Germany; Max Planck Institute for Biological Cybernetics, Tübingen, Baden-Württemberg, Germany
- Ross Cunnington
- School of Psychology, The University of Queensland, St Lucia, Queensland 4072, Australia
- Giovanni Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Via S. Martino della Battaglia, 44, 00185 Rome, Italy.
10
Zhao Y, Wang D, Wang X, Chiu SC. Brain mechanisms underlying the influence of emotions on spatial decision-making: An EEG study. Front Neurosci 2022; 16:989988. [PMID: 36248638 PMCID: PMC9562092 DOI: 10.3389/fnins.2022.989988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/09/2022] [Accepted: 08/31/2022] [Indexed: 11/25/2022] Open
Abstract
It is common for people to make bad decisions because of their emotions in life. When these decisions are important, such as aeronautical decisions and driving decisions, the mistakes of decisions can cause irreversible damage. Therefore, it is important to explore how emotions influence decision-making, so as to avoid the negative influence of emotions on decision-making as much as possible. Although existing researchers have found some mechanisms of emotion's influence on decision-making, only a few studies focused on the influence of emotions on decision-making based on electroencephalography (EEG). In addition, most of them were focused on risky and uncertain decision-making. We designed a novel experimental task to explore the influence of emotion on spatial decision-making and recorded subjective data, decision-making behavioral data, and EEG data. By analyzing these data, we came to three conclusions. Firstly, we observed three similar event-related potentials (ERP) microstates in the decision-making process under different emotions by microstate analysis. Additionally, the prefrontal, parietal and occipital lobes played key roles in decision-making. Secondly, we found that the P2 component of the prefrontal lobe presented the influence of different emotions on decision-making by ERP analysis. Among them, positive emotion evoked the largest P2 amplitude compared to negative emotions and no stimuli. Thirdly, we found some graph metrics that were significantly associated with decision accuracy by effective connectivity analysis combined with graph theoretic analysis. In consequence, the finding of our study may shed more light on the brain mechanisms underlying the influence of emotions on spatial decision-making, thereby providing a basis for avoiding decision-making accidents caused by emotions and realizing better decision-making.
Affiliation(s)
- Yanyan Zhao
- State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Danli Wang
- State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- *Correspondence: Danli Wang
- Xinyuan Wang
- State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Steve C. Chiu
- ECE Department, Idaho State University, Pocatello, ID, United States
11
de Cothi W, Nyberg N, Griesbauer EM, Ghanamé C, Zisch F, Lefort JM, Fletcher L, Newton C, Renaudineau S, Bendor D, Grieves R, Duvelle É, Barry C, Spiers HJ. Predictive maps in rats and humans for spatial navigation. Curr Biol 2022; 32:3676-3689.e5. [PMID: 35863351 PMCID: PMC9616735 DOI: 10.1016/j.cub.2022.06.090] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 03/15/2022] [Revised: 05/19/2022] [Accepted: 06/29/2022] [Indexed: 11/25/2022]
Abstract
Much of our understanding of navigation comes from the study of individual species, often with specific tasks tailored to those species. Here, we provide a novel experimental and analytic framework integrating across humans, rats, and simulated reinforcement learning (RL) agents to interrogate the dynamics of behavior during spatial navigation. We developed a novel open-field navigation task ("Tartarus maze") requiring dynamic adaptation (shortcuts and detours) to frequently changing obstructions on the path to a hidden goal. Humans and rats were remarkably similar in their trajectories. Both species showed the greatest similarity to RL agents utilizing a "successor representation," which creates a predictive map. Humans also displayed trajectory features similar to model-based RL agents, which implemented an optimal tree-search planning procedure. Our results help refine models seeking to explain mammalian navigation in dynamic environments and highlight the utility of modeling the behavior of different species to uncover the shared mechanisms that support behavior.
Affiliation(s)
- William de Cothi
- Department of Cell and Developmental Biology, University College London, London, UK; Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK.
- Nils Nyberg
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Eva-Maria Griesbauer
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Carole Ghanamé
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Fiona Zisch
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; The Bartlett School of Architecture, University College London, London, UK
- Julie M Lefort
- Department of Cell and Developmental Biology, University College London, London, UK
- Lydia Fletcher
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Coco Newton
- Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Sophie Renaudineau
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Daniel Bendor
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Roddy Grieves
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Éléonore Duvelle
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - Caswell Barry
- Department of Cell and Developmental Biology, University College London, London, UK
| | - Hugo J Spiers
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK.
| |
Collapse
|
12
|
A comparison of reinforcement learning models of human spatial navigation. Sci Rep 2022; 12:13923. [PMID: 35978035 PMCID: PMC9385652 DOI: 10.1038/s41598-022-18245-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Accepted: 08/08/2022] [Indexed: 11/09/2022] Open
Abstract
Reinforcement learning (RL) models have been influential in characterizing human learning and decision making, but few studies apply them to characterizing human spatial navigation and even fewer systematically compare RL models under different navigation requirements. Because RL can characterize one's learning strategies quantitatively and in a continuous manner, and one's consistency of using such strategies, it can provide a novel and important perspective for understanding the marked individual differences in human navigation and disentangle navigation strategies from navigation performance. One-hundred and fourteen participants completed wayfinding tasks in a virtual environment where different phases manipulated navigation requirements. We compared performance of five RL models (3 model-free, 1 model-based and 1 "hybrid") at fitting navigation behaviors in different phases. Supporting implications from prior literature, the hybrid model provided the best fit regardless of navigation requirements, suggesting the majority of participants rely on a blend of model-free (route-following) and model-based (cognitive mapping) learning in such navigation scenarios. Furthermore, consistent with a key prediction, there was a correlation in the hybrid model between the weight on model-based learning (i.e., navigation strategy) and the navigator's exploration vs. exploitation tendency (i.e., consistency of using such navigation strategy), which was modulated by navigation task requirements. Together, we not only show how computational findings from RL align with the spatial navigation literature, but also reveal how the relationship between navigation strategy and a person's consistency using such strategies changes as navigation requirements change.
|
13
|
Fermin ASR, Friston K, Yamawaki S. An insula hierarchical network architecture for active interoceptive inference. Royal Society Open Science 2022; 9:220226. [PMID: 35774133 PMCID: PMC9240682 DOI: 10.1098/rsos.220226] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/22/2022] [Accepted: 06/09/2022] [Indexed: 05/05/2023]
Abstract
In the brain, the insular cortex receives a vast amount of interoceptive information, ascending through deep brain structures, from multiple visceral organs. The unique hierarchical and modular architecture of the insula suggests specialization for processing interoceptive afferents. Yet, the biological significance of the insula's neuroanatomical architecture, in relation to deep brain structures, remains obscure. In this opinion piece, we propose the Insula Hierarchical Modular Adaptive Interoception Control (IMAC) model to suggest that insula modules (granular, dysgranular and agranular), forming parallel networks with the prefrontal cortex and striatum, are specialized to form higher order interoceptive representations. These interoceptive representations are recruited in a context-dependent manner to support habitual, model-based and exploratory control of visceral organs and physiological processes. We discuss how insula interoceptive representations may give rise to conscious feelings that best explain lower order deep brain interoceptive representations, and how the insula may serve to defend the body and mind against pathological depression.
Affiliation(s)
- Alan S. R. Fermin
  - Center for Brain, Mind and Kansei Sciences Research, Hiroshima University, Hiroshima, Japan
- Karl Friston
  - The Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, England
- Shigeto Yamawaki
  - Center for Brain, Mind and Kansei Sciences Research, Hiroshima University, Hiroshima, Japan
|
14
|
Zhu S, Lakshminarasimhan KJ, Arfaei N, Angelaki DE. Eye movements reveal spatiotemporal dynamics of visually-informed planning in navigation. eLife 2022; 11:e73097. [PMID: 35503099 PMCID: PMC9135400 DOI: 10.7554/elife.73097] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/16/2021] [Accepted: 05/01/2022] [Indexed: 11/28/2022] Open
Abstract
Goal-oriented navigation is widely understood to depend upon internal maps. Although this may be the case in many settings, humans tend to rely on vision in complex, unfamiliar environments. To study the nature of gaze during visually-guided navigation, we tasked humans to navigate to transiently visible goals in virtual mazes of varying levels of difficulty, observing that they took near-optimal trajectories in all arenas. By analyzing participants' eye movements, we gained insights into how they performed visually-informed planning. The spatial distribution of gaze revealed that environmental complexity mediated a striking trade-off in the extent to which attention was directed towards two complementary aspects of the world model: the reward location and task-relevant transitions. The temporal evolution of gaze revealed rapid, sequential prospection of the future path, evocative of neural replay. These findings suggest that the spatiotemporal characteristics of gaze during navigation are significantly shaped by the unique cognitive computations underlying real-world, sequential decision making.
Affiliation(s)
- Seren Zhu
  - Center for Neural Science, New York University, New York, United States
- Nastaran Arfaei
  - Department of Psychology, New York University, New York, United States
- Dora E Angelaki
  - Center for Neural Science, New York University, New York, United States
  - Department of Mechanical and Aerospace Engineering, New York University, New York, United States
|
15
|
Lei Y, Solway A. Conflict and competition between model-based and model-free control. PLoS Comput Biol 2022; 18:e1010047. [PMID: 35511764 PMCID: PMC9070915 DOI: 10.1371/journal.pcbi.1010047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2021] [Accepted: 03/22/2022] [Indexed: 11/25/2022] Open
Abstract
A large literature has accumulated suggesting that human and animal decision making is driven by at least two systems, and that important functions of these systems can be captured by reinforcement learning algorithms. The "model-free" system caches and uses stimulus-value or stimulus-response associations, and the "model-based" system implements more flexible planning using a model of the world. However, it is not clear how the two systems interact during deliberation and how a single decision emerges from this process, especially when they disagree. Most previous work has assumed that while the systems operate in parallel, they do so independently, and they combine linearly to influence decisions. Using an integrated reinforcement learning/drift-diffusion model, we tested the hypothesis that the two systems interact in a non-linear fashion similar to other situations with cognitive conflict. We differentiated two forms of conflict: action conflict, a binary state representing whether the systems disagreed on the best action, and value conflict, a continuous measure of the extent to which the two systems disagreed on the difference in value between the available options. We found that decisions with greater value conflict were characterized by reduced model-based control and increased caution both with and without action conflict. Action conflict itself (the binary state) acted in the opposite direction, although its effects were less prominent. We also found that between-system conflict was highly correlated with within-system conflict, and although it is less clear a priori why the latter might influence the strength of each system above its standard linear contribution, we could not rule it out. Our work highlights the importance of non-linear conflict effects, and provides new constraints for more detailed process models of decision making. It also presents new avenues to explore with relation to disorders of compulsivity, where an imbalance between systems has been implicated.
Affiliation(s)
- Yuqing Lei
  - Department of Psychology, University of Maryland-College Park, College Park, Maryland, United States of America
- Alec Solway
  - Department of Psychology, University of Maryland-College Park, College Park, Maryland, United States of America
  - Program in Neuroscience and Cognitive Science, University of Maryland-College Park, College Park, Maryland, United States of America
|
16
|
Goodroe SC, Spiers HJ. Extending neural systems for navigation to hunting behavior. Curr Opin Neurobiol 2022; 73:102545. [PMID: 35483308 DOI: 10.1016/j.conb.2022.102545] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/15/2021] [Revised: 03/18/2022] [Accepted: 03/21/2022] [Indexed: 11/03/2022]
Abstract
For decades, a central question in neuroscience has been: How does the brain support navigation? Recent research on navigation has explored how brain regions support the capacity to adapt to changes in the environment and track the distance and direction to goal locations. Here, we provide a brief review of this literature and speculate how these neural systems may be involved in another, parallel behavior: hunting. Hunting shares many of the same challenges as navigation. Like navigation, hunting requires the hunter to orient towards a goal while minimizing their distance from it while traveling. Likewise, hunting may require the accommodation of detours to locate prey or the exploitation of shortcuts for a quicker capture. Recent research suggests that neurons in the periaqueductal gray, hypothalamus, and dorsal anterior cingulate play key roles in such hunting behavior. In this review, we speculate on how these regions may operate functionally with other key brain regions involved in navigation, such as the hippocampus, to support hunting. Additionally, we posit that hunting in a group presents an additional set of challenges, where success relies on multicentric tracking and prediction of prey position as well as the position of co-hunters.
Affiliation(s)
- Sarah C Goodroe
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Hugo J Spiers
  - Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, United Kingdom
|
17
|
Katayama R, Yoshida W, Ishii S. Confidence modulates the decodability of scene prediction during partially-observable maze exploration in humans. Commun Biol 2022; 5:367. [PMID: 35440615 PMCID: PMC9018866 DOI: 10.1038/s42003-022-03314-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Accepted: 03/23/2022] [Indexed: 11/23/2022] Open
Abstract
Prediction ability often involves some degree of uncertainty, a key determinant of confidence. Here, we sought to assess whether predictions are decodable in partially-observable environments where one's state is uncertain, and whether this information is sensitive to confidence produced by such uncertainty. We used functional magnetic resonance imaging-based, partially-observable maze navigation tasks in which subjects predicted upcoming scenes and reported their confidence regarding these predictions. Using a multi-voxel pattern analysis, we successfully decoded both scene predictions and subjective confidence from activities in the localized parietal and prefrontal regions. We also assessed confidence in their beliefs about where they were in the maze. Importantly, prediction decodability varied according to subjective scene confidence in the superior parietal lobule and state confidence estimated by the behavioral model in the inferior parietal lobule. These results demonstrate that prediction in uncertain environments depends on the prefrontal-parietal network within which prediction and confidence interact.
Affiliation(s)
- Risa Katayama
  - Graduate School of Informatics, Kyoto University, Kyoto, Kyoto, 606-8501, Japan
- Wako Yoshida
  - Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, OX3 9DU, UK
  - Department of Neural Computation for Decision-making, Advanced Telecommunications Research Institute International, Soraku-gun, Kyoto, 619-0288, Japan
- Shin Ishii
  - Graduate School of Informatics, Kyoto University, Kyoto, Kyoto, 606-8501, Japan
  - Neural Information Analysis Laboratories, Advanced Telecommunications Research Institute International, Soraku-gun, Kyoto, 619-0288, Japan
  - International Research Center for Neurointelligence, The University of Tokyo, Bunkyo-ku, Tokyo, 113-0033, Japan
|
18
|
Ott F, Legler E, Kiebel SJ. Forward planning driven by context-dependant conflict processing in anterior cingulate cortex. Neuroimage 2022; 256:119222. [PMID: 35447352 DOI: 10.1016/j.neuroimage.2022.119222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Revised: 03/08/2022] [Accepted: 04/16/2022] [Indexed: 11/17/2022] Open
Abstract
Cognitive control, and forward planning in particular, is costly and must therefore be regulated so that the amount of cognitive resources invested is adequate to the current situation. However, knowing in advance how beneficial forward planning will be in a given situation is hard. A way to know the exact value of planning would be to actually do it, which would ab initio defeat the purpose of regulating planning, i.e. the reduction of computational and time costs. One possible solution to this dilemma is that planning is regulated by learned associations between stimuli and the expected demand for planning. Such learning might be based on generalisation processes that cluster together stimulus states with similar control relevant properties into more general control contexts. In this way, the brain could infer the demand for planning, based on previous experience with situations that share some structural properties with the current situation. Here, we used a novel sequential task to test the hypothesis that people use control contexts to efficiently regulate their forward planning, using behavioural and functional magnetic resonance imaging data. Consistent with our hypothesis, reaction times increased with trial-by-trial conflict, where this increase was more pronounced in a context with a learned high demand for planning. Similarly, we found that fMRI activity in the dorsal anterior cingulate cortex (dACC) increased with conflict, and this increase was more pronounced in a context with generally high demand for planning. Taken together, the results indicate that the dACC integrates representations of planning demand at different levels of abstraction to regulate planning in an efficient and situation-appropriate way.
Affiliation(s)
- Florian Ott
  - Department of Psychology, Technische Universität Dresden, Dresden, Germany
- Eric Legler
  - Department of Psychology, Technische Universität Dresden, Dresden, Germany
- Stefan J Kiebel
  - Department of Psychology, Technische Universität Dresden, Dresden, Germany; Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, Dresden, Germany
|
19
|
Abstract
Recent breakthroughs in artificial intelligence (AI) have enabled machines to plan in tasks previously thought to be uniquely human. Meanwhile, the planning algorithms implemented by the brain itself remain largely unknown. Here, we review neural and behavioral data in sequential decision-making tasks that elucidate the ways in which the brain does, and does not, plan. To systematically review available biological data, we create a taxonomy of planning algorithms by summarizing the relevant design choices for such algorithms in AI. Across species, recording techniques, and task paradigms, we find converging evidence that the brain represents future states consistent with a class of planning algorithms within our taxonomy: focused, depth-limited, and serial. However, we argue that current data are insufficient for addressing more detailed algorithmic questions. We propose a new approach leveraging AI advances to drive experiments that can adjudicate between competing candidate algorithms.
|
20
|
Brunec IK, Momennejad I. Predictive Representations in Hippocampal and Prefrontal Hierarchies. J Neurosci 2022; 42:299-312. [PMID: 34799416 PMCID: PMC8802932 DOI: 10.1523/jneurosci.1327-21.2021] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 06/25/2021] [Revised: 10/19/2021] [Accepted: 10/22/2021] [Indexed: 11/21/2022] Open
Abstract
As we navigate the world, we use learned representations of relational structures to explore and to reach goals. Studies of how relational knowledge enables inference and planning are typically conducted in controlled small-scale settings. It remains unclear, however, how people use stored knowledge in continuously unfolding navigation (e.g., walking long distances in a city). We hypothesized that multiscale predictive representations guide naturalistic navigation in humans, and these scales are organized along posterior-anterior prefrontal and hippocampal hierarchies. We conducted model-based representational similarity analyses of neuroimaging data collected while male and female participants navigated realistically long paths in virtual reality. We tested the pattern similarity of each point, along each path, to a weighted sum of its successor points within predictive horizons of different scales. We found that anterior PFC showed the largest predictive horizons, posterior hippocampus the smallest, with the anterior hippocampus and orbitofrontal regions in between. Our findings offer novel insights into how cognitive maps support hierarchical planning at multiple scales.
SIGNIFICANCE STATEMENT: Whenever we navigate the world, we represent our journey at multiple horizons: from our immediate surroundings to our distal goal. How are such cognitive maps at different horizons simultaneously represented in the brain? Here, we applied a reinforcement learning-based analysis to neuroimaging data acquired while participants virtually navigated their hometown. We investigated neural patterns in the hippocampus and PFC, key cognitive map regions. We uncovered predictive representations with multiscale horizons in prefrontal and hippocampal gradients, with the longest predictive horizons in anterior PFC and the shortest in posterior hippocampus. These findings provide empirical support for the computational hypothesis that multiscale neural representations guide goal-directed navigation. This advances our understanding of hierarchical planning in everyday navigation of realistic distances.
Affiliation(s)
- Iva K Brunec
  - Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104
|
21
|
Kim D, Jeong J, Lee SW. Prefrontal solution to the bias-variance tradeoff during reinforcement learning. Cell Rep 2021; 37:110185. [PMID: 34965420 DOI: 10.1016/j.celrep.2021.110185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2021] [Revised: 08/09/2021] [Accepted: 12/07/2021] [Indexed: 11/17/2022] Open
Abstract
Evidence that the brain combines different value learning strategies to minimize prediction error is accumulating. However, the tradeoff between bias and variance error, which imposes different constraints on each learning strategy's performance, poses a challenge for value learning. While this tradeoff specifies the requirements for optimal learning, little has been known about how the brain deals with this issue. Here, we hypothesize that the brain adaptively resolves the bias-variance tradeoff during reinforcement learning. Our theory suggests that the solution necessitates baseline correction for prediction error, which offsets the adverse effects of irreducible error on value learning. We show behavioral evidence of adaptive control using a Markov decision task with context changes. The prediction error baseline seemingly signals context changes to improve adaptability. Critically, we identify multiplexed representations of prediction error baseline within the ventrolateral and ventromedial prefrontal cortex, key brain regions known to guide model-based and model-free reinforcement learning.
Affiliation(s)
- Dongjae Kim
  - Center for Neural Science, New York University, New York, NY, USA; Department of Psychology, New York University, New York, NY, USA
- Jaeseung Jeong
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), 34141 Daejeon, Republic of Korea; Program of Brain and Cognitive Engineering, Korea Advanced Institute of Science and Technology (KAIST), 34141 Daejeon, Republic of Korea
- Sang Wan Lee
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), 34141 Daejeon, Republic of Korea; Program of Brain and Cognitive Engineering, Korea Advanced Institute of Science and Technology (KAIST), 34141 Daejeon, Republic of Korea; KAIST Center for Neuroscience-inspired AI, Korea Advanced Institute of Science and Technology (KAIST), 34141 Daejeon, Republic of Korea; KI for Health Science and Technology, Korea Advanced Institute of Science and Technology (KAIST), 34141 Daejeon, Republic of Korea; KI for Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), 34141 Daejeon, Republic of Korea
|
22
|
Real-time processes in the development of action planning. Curr Biol 2021; 32:190-199.e3. [PMID: 34883048 DOI: 10.1016/j.cub.2021.11.018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2021] [Revised: 09/27/2021] [Accepted: 11/08/2021] [Indexed: 11/22/2022]
Abstract
Across species and ages, planning multi-step actions is a hallmark of intelligence and critical for survival. Traditionally, researchers adopt a "top-down" approach to action planning by focusing on the ability to create an internal representation of the world that guides the next step in a multi-step action. However, a top-down approach does not inform on underlying mechanisms, so researchers can only speculate about how and why improvements in planning occur. The current study takes a "bottom-up" approach by testing developmental changes in the real-time, moment-to-moment interplay among perceptual, neural, and motor components of action planning using simultaneous video, motion-tracking, head-mounted eye tracking, and electroencephalography (EEG). Preschoolers (n = 32) and adults (n = 22) grasped a hammer with their dominant hand to pound a peg when the hammer handle pointed in different directions. When the handle pointed toward their non-dominant hand, younger children ("nonadaptive planners") used a habitual overhand grip that interfered with wielding the hammer, whereas adults and older children ("adaptive planners") used an adaptive underhand grip. Adaptive and nonadaptive children differed in when and where they directed their gaze to obtain visual information, neural activation of the motor system before reaching, and straightness of their reach trajectories. Nonadaptive children immediately used a habitual overhand grip before gathering visual information, leaving insufficient time to form a plan before acting. Our novel bottom-up approach transcends mere speculation by providing converging evidence that the development of action planning depends on a real-time "tug of war" between habits and information gathering and processing.
|
23
|
Liakoni V, Lehmann MP, Modirshanechi A, Brea J, Lutti A, Gerstner W, Preuschoff K. Brain signals of a Surprise-Actor-Critic model: Evidence for multiple learning modules in human decision making. Neuroimage 2021; 246:118780. [PMID: 34875383 DOI: 10.1016/j.neuroimage.2021.118780] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/26/2021] [Revised: 08/03/2021] [Accepted: 12/04/2021] [Indexed: 11/25/2022] Open
Abstract
Learning how to reach a reward over long series of actions is a remarkable capability of humans, and potentially guided by multiple parallel learning modules. Current brain imaging of learning modules is limited by (i) simple experimental paradigms, (ii) entanglement of brain signals of different learning modules, and (iii) a limited number of computational models considered as candidates for explaining behavior. Here, we address these three limitations and (i) introduce a complex sequential decision making task with surprising events that allows us to (ii) dissociate correlates of reward prediction errors from those of surprise in functional magnetic resonance imaging (fMRI); and (iii) we test behavior against a large repertoire of model-free, model-based, and hybrid reinforcement learning algorithms, including a novel surprise-modulated actor-critic algorithm. Surprise, derived from an approximate Bayesian approach for learning the world-model, is extracted in our algorithm from a state prediction error. Surprise is then used to modulate the learning rate of a model-free actor, which itself learns via the reward prediction error from model-free value estimation by the critic. We find that action choices are well explained by pure model-free policy gradient, but reaction times and neural data are not. We identify signatures of both model-free and surprise-based learning signals in blood oxygen level dependent (BOLD) responses, supporting the existence of multiple parallel learning modules in the brain. Our results extend previous fMRI findings to a multi-step setting and emphasize the role of policy gradient and surprise signalling in human learning.
Affiliation(s)
- Vasiliki Liakoni
  - École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
- Marco P Lehmann
  - École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
- Alireza Modirshanechi
  - École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
- Johanni Brea
  - École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
- Antoine Lutti
  - Laboratoire de recherche en neuroimagerie (LREN), Department of Clinical Neurosciences, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
- Wulfram Gerstner
  - École Polytechnique Fédérale de Lausanne (EPFL), School of Computer and Communication Sciences and School of Life Sciences, Lausanne, Switzerland
- Kerstin Preuschoff
  - Geneva Finance Research Institute & Interfaculty Center for Affective Sciences, University of Geneva, Geneva, Switzerland
|
24
|
The Neural Instantiation of an Abstract Cognitive Map for Economic Choice. Neuroscience 2021; 477:106-114. [PMID: 34543674 DOI: 10.1016/j.neuroscience.2021.09.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/14/2021] [Revised: 09/01/2021] [Accepted: 09/09/2021] [Indexed: 11/24/2022]
Abstract
Since the discovery of cognitive maps in rodent hippocampus (HC), the cognitive map has evolved from originally referring to spatial representations encoding locations and objects in Euclidean spaces to a general low-dimensional organization of information along selected feature dimensions. A cognitive map includes hypothetical constructs that bridge between environmental stimuli and the final overt behavior. To neuroeconomists, utility and utility functions are such constructs, with a neurobiological basis, that drive choice behavior. Emergence of distinct functional neuron groups in the primate orbitofrontal cortex (OFC) during simple economic choice indicates the formation of an abstract cognitive map for organizing information about goods for value computation. Experimental evidence suggests that the organization of neuronal activity in such a cognitive map reflects the abstraction of core task features. Thus, such a map can be adapted to accommodate economic choices under various task contexts.
|
25
|
|
26
|
Duvelle É, Grieves RM, Liu A, Jedidi-Ayoub S, Holeniewska J, Harris A, Nyberg N, Donnarumma F, Lefort JM, Jeffery KJ, Summerfield C, Pezzulo G, Spiers HJ. Hippocampal place cells encode global location but not connectivity in a complex space. Curr Biol 2021; 31:1221-1233.e9. [PMID: 33581073 PMCID: PMC7988036 DOI: 10.1016/j.cub.2021.01.005] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 11/13/2020] [Revised: 12/22/2020] [Accepted: 01/05/2021] [Indexed: 11/20/2022]
Abstract
Flexible navigation relies on a cognitive map of space, thought to be implemented by hippocampal place cells: neurons that exhibit location-specific firing. In connected environments, optimal navigation requires keeping track of one's location and of the available connections between subspaces. We examined whether the dorsal CA1 place cells of rats encode environmental connectivity in four geometrically identical boxes arranged in a square. Rats moved between boxes by pushing saloon-type doors that could be locked in one or both directions. Although rats demonstrated knowledge of environmental connectivity, their place cells did not respond to connectivity changes, nor did they represent doorways differently from other locations. Place cells coded location in a global reference frame, with a different map for each box and minimal repetitive fields despite the repetitive geometry. These results suggest that CA1 place cells provide a spatial map that does not explicitly include connectivity.
Affiliation(s)
- Éléonore Duvelle
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Roddy M Grieves
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Anyi Liu
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK
- Selim Jedidi-Ayoub
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK
- Joanna Holeniewska
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK
- Adam Harris
  - Department of Experimental Psychology, University of Oxford, OX2 6BW Oxford, UK
- Nils Nyberg
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK
- Francesco Donnarumma
  - Institute of Cognitive Sciences and Technologies, National Research Council, via S. Martino d. Battaglia 44, 00185 Rome, Italy
- Julie M Lefort
  - University College London, Department of Cell and Developmental Biology, London, UK
- Kate J Jeffery
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK
- Giovanni Pezzulo
  - Institute of Cognitive Sciences and Technologies, National Research Council, via S. Martino d. Battaglia 44, 00185 Rome, Italy
- Hugo J Spiers
  - Department of Experimental Psychology, Institute of Behavioural Neuroscience, University College London, London, UK
|
27
|
Patai EZ, Spiers HJ. The Versatile Wayfinder: Prefrontal Contributions to Spatial Navigation. Trends Cogn Sci 2021; 25:520-533. [PMID: 33752958 DOI: 10.1016/j.tics.2021.02.010] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 11/11/2020] [Revised: 02/22/2021] [Accepted: 02/23/2021] [Indexed: 12/15/2022]
Abstract
The prefrontal cortex (PFC) supports decision-making, goal tracking, and planning. Spatial navigation is a behavior that taxes these cognitive processes, yet the role of the PFC in models of navigation has been largely overlooked. In humans, activity in dorsolateral PFC (dlPFC) and ventrolateral PFC (vlPFC) during detours reveals a role in inhibition and replanning. Dorsal anterior cingulate cortex (dACC) is implicated in planning and spontaneous internally-generated changes of route. Orbitofrontal cortex (OFC) integrates representations of the environment with the value of actions, providing a 'map' of possible decisions. In rodents, medial frontal areas interact with hippocampus during spatial decisions and switching between navigation strategies. In reviewing these advances, we provide a framework for how different prefrontal regions may contribute to different stages of navigation.
Affiliation(s)
- Eva Zita Patai
  - Centre for Neuroimaging Sciences, Institute of Psychiatry, Psychology and Neuroscience (IoPPN), King's College London, UK; Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, UK
- Hugo J Spiers
  - Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, UK
|
28
|
The Best Laid Plans: Computational Principles of Anterior Cingulate Cortex. Trends Cogn Sci 2021; 25:316-329. [PMID: 33593641 DOI: 10.1016/j.tics.2021.01.008] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 08/13/2020] [Revised: 01/17/2021] [Accepted: 01/19/2021] [Indexed: 12/26/2022]
Abstract
Despite continual debate for the past 30 years about the function of anterior cingulate cortex (ACC), its key contribution to neurocognition remains unknown. However, recent computational modeling work has provided insight into this question. Here we review computational models that illustrate three core principles of ACC function, related to hierarchy, world models, and cost. We also discuss four constraints on the neural implementation of these principles, related to modularity, binding, encoding, and learning and regulation. These observations suggest a role for ACC in hierarchical model-based hierarchical reinforcement learning (HMB-HRL), which instantiates a mechanism motivating the execution of high-level plans.
|
29
|
Neural Mechanisms of Human Decision-Making. COGNITIVE AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2021; 21:35-57. [PMID: 33409958 DOI: 10.3758/s13415-020-00842-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/28/2020] [Indexed: 11/08/2022]
Abstract
We present a theory and neural network model of the neural mechanisms underlying human decision-making. We propose a detailed model of the interaction between brain regions, under a proposer-predictor-actor-critic framework. This theory is based on detailed animal data and theories of action-selection. Those theories are adapted to serial operation to bridge levels of analysis and explain human decision-making. Task-relevant areas of cortex propose a candidate plan using fast, model-free, parallel neural computations. Other areas of cortex and medial temporal lobe can then predict likely outcomes of that plan in this situation. This optional prediction- (or model-) based computation can produce better accuracy and generalization, at the expense of speed. Next, linked regions of basal ganglia act to accept or reject the proposed plan based on its reward history in similar contexts. If that plan is rejected, the process repeats to consider a new option. The reward-prediction system acts as a critic to determine the value of the outcome relative to expectations and produce dopamine as a training signal for cortex and basal ganglia. By operating sequentially and hierarchically, the same mechanisms previously proposed for animal action-selection could explain the most complex human plans and decisions. We discuss explanations of model-based decisions, habitization, and risky behavior based on the computational model.
|
30
|
Mollick JA, Kober H. Computational models of drug use and addiction: A review. JOURNAL OF ABNORMAL PSYCHOLOGY 2020; 129:544-555. [PMID: 32757599 PMCID: PMC7416739 DOI: 10.1037/abn0000503] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 11/08/2022]
Abstract
In this brief review, we describe current computational models of drug-use and addiction that fall into 2 broad categories: mathematically based models that rely on computational theories, and brain-based models that link computations to brain areas or circuits. Across categories, many are models of learning and decision-making, which may be compromised in addiction. Several mathematical models take predictive coding approaches, focusing on Bayesian prediction error. Other models focus on learning processes and (traditional) prediction error. Brain-based models have incorporated prefrontal cortex, basal ganglia, and the dopamine system, based on the effects of drugs on dopamine, motivation, and executive control circuits. Several models specifically describe how behavioral control may transition from habitual to goal-directed systems, consistent with computational accounts of compromised "model-based" control. Some brain-based models have linked this to the transition of behavioral control from ventral to dorsal striatum. Overall, we propose that while computational models capture some aspects of addiction and have advanced our thinking, most have focused on the effects of drug use rather than addiction per se, most have not been tested on and/or supported by human data, and few capture multiple stages and symptoms of addiction. We conclude by suggesting a path forward for computational models of addiction.
Affiliation(s)
- Jessica A Mollick
  - Clinical and Affective Neuroscience Lab, Department of Psychiatry, Yale University
- Hedy Kober
  - Clinical and Affective Neuroscience Lab, Department of Psychiatry, Yale University
|
31
|
Abstract
The commentaries suggest many important improvements to the target article. They clearly distinguish two varieties of rationalization - the traditional "motivated reasoning" model, and the proposed representational exchange model - and show that they have distinct functions and consequences. They describe how representational exchange occurs not only by post hoc rationalization but also by ex ante rationalization and other more dynamic processes. They argue that the social benefits of representational exchange are at least as important as its direct personal benefits. Finally, they construe our search for meaning, purpose, and narrative - both individually and collectively - as a variety of representational exchange. The result is a theory of rationalization as representational exchange both wider in scope and better defined in mechanism.
|
32
|
Huang Y, Yaple ZA, Yu R. Goal-oriented and habitual decisions: Neural signatures of model-based and model-free learning. Neuroimage 2020; 215:116834. [PMID: 32283275 DOI: 10.1016/j.neuroimage.2020.116834] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/26/2019] [Revised: 03/03/2020] [Accepted: 04/08/2020] [Indexed: 11/26/2022] Open
Abstract
Human decision-making is mainly driven by two fundamental learning processes: a slow, deliberative, goal-directed model-based process that maps out the potential outcomes of all options and a rapid habitual model-free process that enables reflexive repetition of previously successful choices. Although many model-informed neuroimaging studies have examined the neural correlates of model-based and model-free learning, the concordant activity among these two processes remains unclear. We used quantitative meta-analyses of functional magnetic resonance imaging experiments to identify the concordant activity pertaining to model-based and model-free learning over a range of reward-related paradigms. We found that: 1) both processes yielded concordant ventral striatum activity, 2) model-based learning activated the medial prefrontal cortex and orbital frontal cortex, and 3) model-free learning specifically activated the left globus pallidus and right caudate head. Our findings suggest that model-free and model-based decision making engage overlapping yet distinct neural regions. These stereotaxic maps improve our understanding of how deliberative goal-directed and reflexive habitual learning are implemented in the brain.
Affiliation(s)
- Yi Huang
  - NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore
- Zachary A Yaple
  - Department of Psychology, National University of Singapore, Singapore
- Rongjun Yu
  - NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore; Department of Psychology, National University of Singapore, Singapore
|
33
|
Franklin NT, Frank MJ. Generalizing to generalize: Humans flexibly switch between compositional and conjunctive structures during reinforcement learning. PLoS Comput Biol 2020; 16:e1007720. [PMID: 32282795 PMCID: PMC7179934 DOI: 10.1371/journal.pcbi.1007720] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/25/2019] [Revised: 04/23/2020] [Accepted: 02/11/2020] [Indexed: 12/02/2022] Open
Abstract
Humans routinely face novel environments in which they have to generalize in order to act adaptively. However, doing so involves the non-trivial challenge of deciding which aspects of a task domain to generalize. While it is sometimes appropriate to simply re-use a learned behavior, often adaptive generalization entails recombining distinct components of knowledge acquired across multiple contexts. Theoretical work has suggested a computational trade-off in which it can be more or less useful to learn and generalize aspects of task structure jointly or compositionally, depending on previous task statistics, but it is unknown whether humans modulate their generalization strategy accordingly. Here we develop a series of navigation tasks that separately manipulate the statistics of goal values ("what to do") and state transitions ("how to do it") across contexts and assess whether human subjects generalize these task components separately or conjunctively. We find that human generalization is sensitive to the statistics of the previously experienced task domain, favoring compositional or conjunctive generalization when the task statistics are indicative of such structures, and a mixture of the two when they are more ambiguous. These results support a normative "meta-generalization" account and suggest that people not only generalize previous task components but also generalize the statistical structure most likely to support generalization.
Affiliation(s)
- Nicholas T. Franklin
  - Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, Rhode Island, United States of America
- Michael J. Frank
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, Rhode Island, United States of America
  - Carney Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
|
34
|
O C Jordan H, Navarro DM, Stringer SM. The formation and use of hierarchical cognitive maps in the brain: A neural network model. NETWORK (BRISTOL, ENGLAND) 2020; 31:37-141. [PMID: 32746663 DOI: 10.1080/0954898x.2020.1798531] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/18/2020] [Revised: 06/21/2020] [Accepted: 07/16/2020] [Indexed: 06/11/2023]
Abstract
Many researchers have tried to model how environmental knowledge is learned by the brain and used in the form of cognitive maps. However, previous work was limited in various important ways: there was little consensus on how these cognitive maps were formed and represented, the planning mechanism was inherently limited to performing relatively simple tasks, and there was little consideration of how these mechanisms would scale up. This paper makes several significant advances. Firstly, the planning mechanism used by the majority of previous work propagates a decaying signal through the network to create a gradient that points towards the goal. However, this decaying signal limits the scale and complexity of tasks that can be solved in this manner. Here we propose several ways in which a network can self-organize a novel planning mechanism that does not require decaying activity. We also extend this model with a hierarchical planning mechanism: a layer of cells that identify frequently-used sequences of actions and reuse them to significantly increase the efficiency of planning. We speculate that our results may explain the apparent ability of humans and animals to perform model-based planning on both small and large scales without a noticeable loss of efficiency.
|
35
|
Gahnstrom CJ, Spiers HJ. Striatal and hippocampal contributions to flexible navigation in rats and humans. Brain Neurosci Adv 2020; 4:2398212820979772. [PMID: 33426302 PMCID: PMC7755934 DOI: 10.1177/2398212820979772] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/12/2020] [Accepted: 11/16/2020] [Indexed: 12/13/2022] Open
Abstract
The hippocampus has been firmly established as playing a crucial role in flexible navigation. Recent evidence suggests that dorsal striatum may also play an important role in such goal-directed behaviour in both rodents and humans. Across recent studies, activity in the caudate nucleus has been linked to forward planning and adaptation to changes in the environment. In particular, several human neuroimaging studies have found that the caudate nucleus tracks information traditionally associated with the hippocampus. In this brief review, we examine this evidence and argue that the dorsal striatum encodes the transition structure of the environment during flexible, goal-directed behaviour. We highlight that future research should: (1) investigate neural responses during spatial navigation via a biophysically plausible framework explained by reinforcement learning models and (2) observe the interaction between cortical areas and both the dorsal striatum and hippocampus during flexible navigation.
Affiliation(s)
- Christoffer J. Gahnstrom
  - Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Hugo J. Spiers
  - Institute of Behavioural Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
|
36
|
Task complexity interacts with state-space uncertainty in the arbitration between model-based and model-free learning. Nat Commun 2019; 10:5738. [PMID: 31844060 PMCID: PMC6915739 DOI: 10.1038/s41467-019-13632-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/24/2018] [Accepted: 11/11/2019] [Indexed: 12/11/2022] Open
Abstract
It has previously been shown that the relative reliability of model-based and model-free reinforcement-learning (RL) systems plays a role in the allocation of behavioral control between them. However, the role of task complexity in the arbitration between these two strategies remains largely unknown. Here, using a combination of novel task design, computational modelling, and model-based fMRI analysis, we examined the role of task complexity alongside state-space uncertainty in the arbitration process. Participants tended to increase model-based RL control in response to increasing task complexity. However, they resorted to model-free RL when both uncertainty and task complexity were high, suggesting that these two variables interact during the arbitration process. Computational fMRI revealed that task complexity interacts with neural representations of the reliability of the two systems in the inferior prefrontal cortex. The brain dynamically arbitrates between model-based and model-free reinforcement learning (RL). Here, the authors show that participants tended to increase model-based control in response to increasing task complexity, but resorted to model-free control when both uncertainty and task complexity were high.
|
37
|
Rusu SI, Pennartz CMA. Learning, memory and consolidation mechanisms for behavioral control in hierarchically organized cortico-basal ganglia systems. Hippocampus 2019; 30:73-98. [PMID: 31617622 PMCID: PMC6972576 DOI: 10.1002/hipo.23167] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/02/2018] [Revised: 09/09/2019] [Accepted: 09/11/2019] [Indexed: 01/05/2023]
Abstract
This article aims to provide a synthesis on the question of how brain structures cooperate to accomplish hierarchically organized behaviors, characterized by low-level, habitual routines nested in larger sequences of planned, goal-directed behavior. The functioning of a connected set of brain structures (prefrontal cortex, hippocampus, striatum, and dopaminergic mesencephalon) is reviewed in relation to two important distinctions: (a) goal-directed as opposed to habitual behavior and (b) model-based and model-free learning. Recent evidence indicates that the orbitomedial prefrontal cortices not only subserve goal-directed behavior and model-based learning, but also code the “landscape” (task space) of behaviorally relevant variables. While the hippocampus stands out for its role in coding and memorizing world state representations, it is argued to function in model-based learning but is not required for coding of action–outcome contingencies, illustrating that goal-directed behavior is not congruent with model-based learning. While the dorsolateral and dorsomedial striatum largely conform to the dichotomy between habitual and goal-directed behavior, ventral striatal functions go beyond this distinction. Next, we contextualize findings on coding of reward-prediction errors by ventral tegmental dopamine neurons to suggest a broader role of mesencephalic dopamine cells, viz. in behavioral reactivity and signaling unexpected sensory changes. We hypothesize that goal-directed behavior is hierarchically organized in interconnected cortico-basal ganglia loops, where a limbic-affective prefrontal-ventral striatal loop controls action selection in a dorsomedial prefrontal–striatal loop, which in turn regulates activity in sensorimotor-dorsolateral striatal circuits. This structure for behavioral organization requires alignment with mechanisms for memory formation and consolidation. We propose that frontal corticothalamic circuits form a high-level loop for memory processing that initiates and temporally organizes nested activities in lower-level loops, including the hippocampus and the ripple-associated replay it generates. The evidence on hierarchically organized behavior converges with that on consolidation mechanisms in suggesting a frontal-to-caudal directionality in processing control.
Affiliation(s)
- Silviu I Rusu
  - Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands; Research Priority Program Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
- Cyriel M A Pennartz
  - Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands; Research Priority Program Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
|
38
|
|
39
|
Pezzulo G, Donnarumma F, Maisto D, Stoianov I. Planning at decision time and in the background during spatial navigation. Curr Opin Behav Sci 2019. [DOI: 10.1016/j.cobeha.2019.04.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022]
|
40
|
Solway A, Lohrenz T, Montague PR. Loss Aversion Correlates With the Propensity to Deploy Model-Based Control. Front Neurosci 2019; 13:915. [PMID: 31555082 PMCID: PMC6743018 DOI: 10.3389/fnins.2019.00915] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/23/2019] [Accepted: 08/16/2019] [Indexed: 11/13/2022] Open
Abstract
Reward-based decision making is thought to be driven by at least two different types of decision systems: a simple stimulus–response cache-based system which embodies the common-sense notion of “habit,” for which model-free reinforcement learning serves as a computational substrate, and a more deliberate, prospective, model-based planning system. Previous work has shown that loss aversion, a well-studied measure of how much more on average individuals weigh losses relative to gains during decision making, is reduced when participants take all possible decisions and outcomes into account including future ones, relative to when they myopically focus on the current decision. Model-based control offers a putative mechanism for implementing such foresight. Using a well-powered data set (N = 117) in which participants completed two different tasks designed to measure each of the two quantities of interest, and four models of choice data for these tasks, we found consistent evidence of a relationship between loss aversion and model-based control but in the direction opposite to that expected based on previous work: loss aversion had a positive relationship with model-based control. We did not find evidence for a relationship between either decision system and risk aversion, a related aspect of subjective utility.
Affiliation(s)
- Alec Solway
  - Virginia Tech Carilion Research Institute, Roanoke, VA, United States
- Terry Lohrenz
  - Virginia Tech Carilion Research Institute, Roanoke, VA, United States
- P Read Montague
  - Virginia Tech Carilion Research Institute, Roanoke, VA, United States; Department of Physics, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States; Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
|
41
|
Javadi AH, Patai EZ, Marin-Garcia E, Margolis A, Tan HRM, Kumaran D, Nardini M, Penny W, Duzel E, Dayan P, Spiers HJ. Prefrontal Dynamics Associated with Efficient Detours and Shortcuts: A Combined Functional Magnetic Resonance Imaging and Magnetoencenphalography Study. J Cogn Neurosci 2019; 31:1227-1247. [DOI: 10.1162/jocn_a_01414] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 11/04/2022]
Abstract
Central to the concept of the “cognitive map” is that it confers behavioral flexibility, allowing animals to take efficient detours, exploit shortcuts, and avoid alluring, but unhelpful, paths. The neural underpinnings of such naturalistic and flexible behavior remain unclear. In two neuroimaging experiments, we tested human participants on their ability to navigate to a set of goal locations in a virtual desert island riven by lava, which occasionally spread to block selected paths (necessitating detours) or receded to open new paths (affording real shortcuts or false shortcuts to be avoided). Detours activated a network of frontal regions compared with shortcuts. Activity in the right dorsolateral PFC specifically increased when participants encountered tempting false shortcuts that led along suboptimal paths and needed to be differentiated from real shortcuts. We also report modulation in event-related fields and theta power in these situations, providing insight into the temporal evolution of the response to encountering detours and shortcuts. These results help inform current models of how the brain supports navigation and planning in dynamic environments.
Affiliation(s)
- Peter Dayan
  - Max Planck Institute for Biological Cybernetics
|
42
|
Vikbladh OM, Meager MR, King J, Blackmon K, Devinsky O, Shohamy D, Burgess N, Daw ND. Hippocampal Contributions to Model-Based Planning and Spatial Memory. Neuron 2019; 102:683-693.e4. [PMID: 30871859 DOI: 10.1016/j.neuron.2019.02.014] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 05/29/2018] [Revised: 12/21/2018] [Accepted: 02/08/2019] [Indexed: 12/30/2022]
Abstract
Little is known about the neural mechanisms that allow humans and animals to plan actions using knowledge of task contingencies. Emerging theories hypothesize that it involves the same hippocampal mechanisms that support self-localization and memory for locations. Yet limited direct evidence supports the link between planning and the hippocampal place map. We addressed this by investigating model-based planning and place memory in healthy controls and epilepsy patients treated using unilateral anterior temporal lobectomy with hippocampal resection. Both functions were impaired in the patient group. Specifically, the planning impairment was related to right hippocampal lesion size, controlling for overall lesion size. Furthermore, although planning and boundary-driven place memory covaried in the control group, this relationship was attenuated in patients, consistent with both functions relying on the same structure in the healthy brain. These findings clarify both the neural mechanism of model-based planning and the scope of hippocampal contributions to behavior.
Affiliation(s)
- Oliver M Vikbladh
  - Center for Neural Science, New York University School of Arts and Science, New York, NY 10003, USA
- Michael R Meager
  - Department of Psychology, New York University School of Arts and Science, New York, NY 10003, USA; Department of Neurology, New York University School of Medicine, New York, NY 10016, USA
- John King
  - Division of Psychology & Language Sciences, Department of Clinical, Educational & Health Psychology, University College London, London WC1H 0AP, UK
- Karen Blackmon
  - Department of Physiology, Neuroscience, and Behavioral Sciences, St. George's University School of Medicine, St. George, Grenada, West Indies
- Orrin Devinsky
  - Department of Neurology, New York University School of Medicine, New York, NY 10016, USA; Department of Neurosurgery, New York University School of Medicine, New York, NY 10016, USA; Department of Psychiatry, New York University School of Medicine, New York, NY 10016, USA
- Daphna Shohamy
  - Department of Psychology, Columbia University, New York, NY 10027, USA; Zuckerman Mind Brain Behavior Institute and Kavli Institute for Brain Science, Columbia University, New York, NY 10027, USA
- Neil Burgess
  - Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK; Institute of Neurology, University College London, London WC1N 3BG, UK
- Nathaniel D Daw
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08540, USA; Department of Psychology, Princeton University, Princeton, NJ 08540, USA
|
43
|
Information Theory and Cognition: A Review. ENTROPY 2018; 20:e20090706. [PMID: 33265795 PMCID: PMC7513233 DOI: 10.3390/e20090706] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/24/2018] [Revised: 08/31/2018] [Accepted: 09/08/2018] [Indexed: 01/12/2023]
Abstract
We examine how information theory has been used to study cognition over the last seven decades. After an initial burst of activity in the 1950s, the backlash that followed stopped most work in this area. The last couple of decades have seen both a revival of interest and a more firmly grounded, experimentally justified use of information theory. We can view cognition as the process of transforming perceptions into information, where we use "information" in the colloquial sense of the word. This last clarification points to one of the problems we run into when trying to use information theoretic principles to understand or analyze cognition. Information theory is mathematical, while cognition is a subjective phenomenon. It is relatively easy to discern a subjective connection between cognition and information; it is a different matter altogether to apply the rigor of information theory to the process of cognition. In this paper, we will look at the many ways in which people have tried to alleviate this problem. These approaches range from narrowing the focus to only quantifiable aspects of cognition to borrowing conceptual machinery from information theory to address issues of cognition. We describe applications of information theory across a range of cognition research, from neural coding to cognitive control and predictive coding.
|
44
|
Hasz BM, Redish AD. Deliberation and Procedural Automation on a Two-Step Task for Rats. Front Integr Neurosci 2018; 12:30. [PMID: 30123115 PMCID: PMC6085996 DOI: 10.3389/fnint.2018.00030] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 12/08/2017] [Accepted: 07/02/2018] [Indexed: 11/25/2022] Open
Abstract
Current theories suggest that decision-making arises from multiple, competing action-selection systems. Rodent studies dissociate deliberation and procedural behavior, and find a transition from deliberative to procedural behavior with experience. However, it remains unknown how this transition from deliberative to procedural control evolves within single trials, or within blocks of repeated choices. We adapted for rats a two-step task which has been used to dissociate model-based from model-free decisions in humans. We found that a mixture of model-based and model-free algorithms was more likely to explain rat choice strategies on the task than either model-based or model-free algorithms alone. This task contained two choices per trial, which provides a more complex and non-discrete per-trial choice structure. This task structure enabled us to evaluate how deliberative and procedural behavior evolved within-trial and within blocks of repeated choice sequences. We found that vicarious trial and error (VTE), a behavioral correlate of deliberation in rodents, was correlated between the two choice points on a given lap. We also found that behavioral stereotypy, a correlate of procedural automation, increased with the number of repeated choices. While VTE at the first choice point decreased [corrected] with the number of repeated choices, VTE at the second choice point did not, and only increased after unexpected transitions within the task. This suggests that deliberation at the beginning of trials may correspond to changes in choice patterns, while mid-trial deliberation may correspond to an interruption of a procedural process.
Affiliation(s)
- Brendan M. Hasz
- Graduate Program in Neuroscience, University of Minnesota Twin Cities, Minneapolis, MN, United States
- A. David Redish
- Department of Neuroscience, University of Minnesota Twin Cities, Minneapolis, MN, United States
|
45
|
Herweg NA, Kahana MJ. Spatial Representations in the Human Brain. Front Hum Neurosci 2018; 12:297. [PMID: 30104966 PMCID: PMC6078001 DOI: 10.3389/fnhum.2018.00297] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 02/23/2018] [Accepted: 07/06/2018] [Indexed: 11/13/2022] Open
Abstract
While extensive research on the neurophysiology of spatial memory has been carried out in rodents, memory research in humans had traditionally focused on more abstract, language-based tasks. Recent studies have begun to address this gap using virtual navigation tasks in combination with electrophysiological recordings in humans. These studies suggest that the human medial temporal lobe (MTL) is equipped with a population of place and grid cells similar to that previously observed in the rodent brain. Furthermore, theta oscillations have been linked to spatial navigation and, more specifically, to the encoding and retrieval of spatial information. While some studies suggest a single navigational theta rhythm which is of lower frequency in humans than rodents, other studies advocate for the existence of two functionally distinct delta-theta frequency bands involved in both spatial and episodic memory. Despite the general consensus between rodent and human electrophysiology, behavioral work in humans does not unequivocally support the use of a metric Euclidean map for navigation. Formal models of navigational behavior, which specifically consider the spatial scale of the environment and complementary learning mechanisms, may help to better understand different navigational strategies and their neurophysiological mechanisms. Finally, the functional overlap of spatial and declarative memory in the MTL calls for a unified theory of MTL function. Such a theory will critically rely upon linking task-related phenomena at multiple temporal and spatial scales. Understanding how single cell responses relate to ongoing theta oscillations during both the encoding and retrieval of spatial and non-spatial associations appears to be key toward developing a more mechanistic understanding of memory processes in the MTL.
Affiliation(s)
- Nora A. Herweg
- Computational Memory Lab, Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States
- Michael J. Kahana
- Computational Memory Lab, Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States
|
46
|
Anggraini D, Glasauer S, Wunderlich K. Neural signatures of reinforcement learning correlate with strategy adoption during spatial navigation. Sci Rep 2018; 8:10110. [PMID: 29973606 PMCID: PMC6031619 DOI: 10.1038/s41598-018-28241-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 04/09/2018] [Accepted: 06/19/2018] [Indexed: 12/14/2022] Open
Abstract
Human navigation is generally believed to rely on two types of strategy adoption, route-based and map-based strategies. Both types of navigation require making spatial decisions along the traversed way although formal computational and neural links between navigational strategies and mechanisms of value-based decision making have so far been underexplored in humans. Here we employed functional magnetic resonance imaging (fMRI) while subjects located different objects in a virtual environment. We then modelled their paths using reinforcement learning (RL) algorithms, which successfully explained decision behavior and its neural correlates. Our results show that subjects used a mixture of route and map-based navigation and their paths could be well explained by the model-free and model-based RL algorithms. Furthermore, the value signals of model-free choices during route-based navigation modulated the BOLD signals in the ventro-medial prefrontal cortex (vmPFC), whereas the BOLD signals in parahippocampal and hippocampal regions pertained to model-based value signals during map-based navigation. Our findings suggest that the brain might share computational mechanisms and neural substrates for navigation and value-based decisions such that model-free choice guides route-based navigation and model-based choice directs map-based navigation. These findings open new avenues for computational modelling of wayfinding by directing attention to value-based decision, differing from common direction and distances approaches.
Affiliation(s)
- Dian Anggraini
- Department of Psychology, Ludwig-Maximilians-Universität München, Munich, 80802, Germany; Graduate School of Systemic Neuroscience LMU Munich, Planegg, Martinsried, 82152, Germany
- Stefan Glasauer
- Center for Sensorimotor Research, Department of Neurology, Ludwig-Maximilians-Universität München Klinikum Grosshadern, Munich, 81377, Germany; Bernstein Center for Computational Neuroscience Munich, Planegg, Martinsried, 82152, Germany; Graduate School of Systemic Neuroscience LMU Munich, Planegg, Martinsried, 82152, Germany
- Klaus Wunderlich
- Department of Psychology, Ludwig-Maximilians-Universität München, Munich, 80802, Germany; Bernstein Center for Computational Neuroscience Munich, Planegg, Martinsried, 82152, Germany; Graduate School of Systemic Neuroscience LMU Munich, Planegg, Martinsried, 82152, Germany
|
47
|
Goodroe SC, Starnes J, Brown TI. The Complex Nature of Hippocampal-Striatal Interactions in Spatial Navigation. Front Hum Neurosci 2018; 12:250. [PMID: 29977198 PMCID: PMC6021746 DOI: 10.3389/fnhum.2018.00250] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 03/15/2018] [Accepted: 05/30/2018] [Indexed: 12/15/2022] Open
Abstract
Decades of research have established the importance of the hippocampus for episodic and spatial memory. In spatial navigation tasks, the role of the hippocampus has been classically juxtaposed with the role of the dorsal striatum, the latter of which has been characterized as a system important for implementing stimulus-response and action-outcome associations. In many neuroimaging paradigms, this has been explored through contrasting way finding and route-following behavior. The distinction between the contributions of the hippocampus and striatum to spatial navigation has been supported by extensive literature. Convergent research has also underscored the fact that these different memory systems can interact in dynamic ways and contribute to a broad range of navigational scenarios. For example, although familiar routes may often be navigable based on stimulus-response associations, hippocampal episodic memory mechanisms can also contribute to egocentric route-oriented memory, enabling recall of context-dependent sequences of landmarks or the actions to be made at decision points. Additionally, the literature has stressed the importance of subdividing the striatum into functional gradients—with more ventral and medial components being important for the behavioral expression of hippocampal-dependent spatial memories. More research is needed to reveal how networks involving these regions process and respond to dynamic changes in memory and control demands over the course of navigational events. In this Perspective article, we suggest that a critical direction for navigation research is to further characterize how hippocampal and striatal subdivisions interact in different navigational contexts.
Affiliation(s)
- Sarah C Goodroe
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States
- Jon Starnes
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States
- Thackery I Brown
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States
|
48
|
Codol O, Holland PJ, Galea JM. The relationship between reinforcement and explicit control during visuomotor adaptation. Sci Rep 2018; 8:9121. [PMID: 29904096 PMCID: PMC6002524 DOI: 10.1038/s41598-018-27378-1] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 10/24/2017] [Accepted: 06/01/2018] [Indexed: 11/22/2022] Open
Abstract
The motor system’s ability to adapt to environmental changes is essential for maintaining accurate movements. Such adaptation recruits several distinct systems: cerebellar sensory-prediction error learning, success-based reinforcement, and explicit control. Although much work has focused on the relationship between cerebellar learning and explicit control, there is little research regarding how reinforcement and explicit control interact. To address this, participants first learnt a 20° visuomotor displacement. After reaching asymptotic performance, binary, hit-or-miss feedback (BF) was introduced either with or without visual feedback, the latter promoting reinforcement. Subsequently, retention was assessed using no-feedback trials, with half of the participants in each group being instructed to stop aiming off target. Although BF led to an increase in retention of the visuomotor displacement, instructing participants to stop re-aiming nullified this effect, suggesting explicit control is critical to BF-based reinforcement. In a second experiment, we prevented the expression or development of explicit control during BF performance, by either constraining participants to a short preparation time (expression) or by introducing the displacement gradually (development). Both manipulations strongly impaired BF performance, suggesting reinforcement requires both recruitment and expression of an explicit component. These results emphasise the pivotal role explicit control plays in reinforcement-based motor learning.
Affiliation(s)
- Olivier Codol
- School of Psychology, University of Birmingham, Birmingham, UK
- Peter J Holland
- School of Psychology, University of Birmingham, Birmingham, UK
- Joseph M Galea
- School of Psychology, University of Birmingham, Birmingham, UK
|
49
|
Loh E, Kurth-Nelson Z, Berron D, Dayan P, Duzel E, Dolan R, Guitart-Masip M. Parsing the Role of the Hippocampus in Approach-Avoidance Conflict. Cereb Cortex 2018; 27:201-215. [PMID: 27993819 PMCID: PMC5939226 DOI: 10.1093/cercor/bhw378] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 02/04/2016] [Accepted: 11/11/2016] [Indexed: 01/07/2023] Open
Abstract
The hippocampus plays a central role in the approach-avoidance conflict that is central to the genesis of anxiety. However, its exact functional contribution has yet to be identified. We designed a novel gambling task that generated approach-avoidance conflict while controlling for spatial processing. We fit subjects' behavior using a model that quantified the subjective values of choice options, and recorded neural signals using functional magnetic resonance imaging (fMRI). Distinct functional signals were observed in anterior hippocampus, with inferior hippocampus selectively recruited when subjects rejected a gamble, to a degree that covaried with individual differences in anxiety. The superior anterior hippocampus, in contrast, uniquely demonstrated value signals that were potentiated in the context of approach-avoidance conflict. These results implicate the anterior hippocampus in behavioral avoidance and choice monitoring, in a manner relevant to understanding its role in anxiety. Our findings highlight interactions between subregions of the hippocampus as an important focus for future study.
Affiliation(s)
- Eleanor Loh
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK
- Zeb Kurth-Nelson
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London WC1B 5EH, UK
- David Berron
- Institute of Cognitive Neurology and Dementia Research, Otto-von-Guericke University, D-39120 Magdeburg, Germany
- Peter Dayan
- Gatsby Computational Neuroscience Unit, University College London, London W1T 4JG, UK
- Emrah Duzel
- Institute of Cognitive Neurology and Dementia Research, Otto-von-Guericke University, D-39120 Magdeburg, Germany; Institute of Cognitive Neuroscience, University College London, London WC1N 3AR, UK
- Ray Dolan
- Wellcome Trust Centre for Neuroimaging, University College London, London WC1N 3BG, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London WC1B 5EH, UK
- Marc Guitart-Masip
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London WC1B 5EH, UK; Ageing Research Centre, Karolinska Institute Stockholm, SE-11330 Stockholm, Sweden
|
50
|
Fakhari P, Khodadadi A, Busemeyer JR. The detour problem in a stochastic environment: Tolman revisited. Cogn Psychol 2018; 101:29-49. [PMID: 29294373 DOI: 10.1016/j.cogpsych.2017.12.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/06/2017] [Revised: 12/22/2017] [Accepted: 12/23/2017] [Indexed: 10/18/2022]
Abstract
We designed a grid world task to study human planning and re-planning behavior in an unknown stochastic environment. In our grid world, participants were asked to travel from a random starting point to a random goal position while maximizing their reward. Because they were not familiar with the environment, they needed to learn its characteristics from experience to plan optimally. Later in the task, we randomly blocked the optimal path to investigate whether and how people adjust their original plans to find a detour. To this end, we developed and compared 12 different models. These models differed in how they learned and represented the environment and how they planned to reach the goal. The majority of our participants were able to plan optimally. We also showed that people were capable of revising their plans when an unexpected event occurred. The results of the model comparison showed that the model-based reinforcement learning approach provided the best account of the data and outperformed heuristics in explaining the behavioral data in the re-planning trials.
Affiliation(s)
- Pegah Fakhari
- Indiana University, Department of Psychological and Brain Sciences, Bloomington, IN, United States
- Arash Khodadadi
- Indiana University, Department of Psychological and Brain Sciences, Bloomington, IN, United States
- Jerome R Busemeyer
- Indiana University, Department of Psychological and Brain Sciences, Bloomington, IN, United States
|