1
Tarder-Stoll H, Baldassano C, Aly M. Consolidation Enhances Sequential Multistep Anticipation but Diminishes Access to Perceptual Features. Psychol Sci 2024:9567976241256617. [PMID: 39110746] [DOI: 10.1177/09567976241256617]
Abstract
Many experiences unfold predictably over time. Memory for these temporal regularities enables anticipation of events multiple steps into the future. Because temporally predictable events repeat over days, weeks, and years, we must maintain, and potentially transform, memories of temporal structure to support adaptive behavior. We explored how individuals build durable models of temporal regularities to guide multistep anticipation. Healthy young adults (Experiment 1: N = 99, age range = 18-40 years; Experiment 2: N = 204, age range = 19-40 years) learned sequences of scene images that were predictable at the category level and contained incidental perceptual details. Individuals then anticipated upcoming scene categories multiple steps into the future, immediately and at a delay. Consolidation increased the efficiency of anticipation, particularly for events further in the future, but diminished access to perceptual features. Further, maintaining a link-based model of the sequence after consolidation improved anticipation accuracy. Consolidation may therefore promote efficient and durable models of temporal structure, thus facilitating anticipation of future events.
Affiliation(s)
- Hannah Tarder-Stoll
- Department of Psychology, Columbia University
- Baycrest Health Sciences, Rotman Research Institute, Toronto, Canada
- Mariam Aly
- Department of Psychology, Columbia University
2
Lohnas LJ, Howard MW. The influence of emotion on temporal context models. Cogn Emot 2024:1-29. [PMID: 39007902] [DOI: 10.1080/02699931.2024.2371075]
Abstract
Temporal context models (TCMs) have been influential in understanding episodic memory and its neural underpinnings. Recently, TCMs have been extended to explain emotional memory effects, one of the most clinically important findings in the field of memory research. This review covers recent advances in hypotheses for the neural representation of spatiotemporal context through the lens of TCMs, including their ability to explain the influence of emotion on episodic and temporal memory. In recent years, the limitations of the simplifying assumptions of "classical" TCMs, such as exponential trace decay and the mechanism by which temporal context is recovered, have become increasingly clear. The review also outlines how recent advances could be incorporated into a future TCM that goes beyond these classical assumptions to integrate emotional modulation.
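The "classical" context-drift assumption mentioned above can be written out in a few lines. This is a minimal sketch assuming orthonormal item features; the function name and the toy study list are ours, not the authors':

```python
import numpy as np

def update_context(c, f_in, beta):
    """Classical TCM drift: blend context c with item input f_in.

    rho is chosen so the context stays unit length when f_in is
    orthogonal to c; beta sets the drift rate.
    """
    rho = np.sqrt(1.0 - beta**2)
    return rho * c + beta * f_in

beta = 0.5
items = np.eye(5)          # orthonormal item features
c = items[0]               # arbitrary unit-length starting context
for f in items[1:]:        # study items 1..4 in sequence
    c = update_context(c, f, beta)
```

With drift rate `beta`, an item studied k steps ago retains weight `beta * rho**k` in the current context, which is the exponential trace decay the review discusses.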
Affiliation(s)
- Lynn J Lohnas
- Department of Psychology, Syracuse University, Syracuse, NY, USA
- Marc W Howard
- Department of Psychological and Brain Sciences, Boston University, Boston, MA, USA
3
Verosky NJ. Associative Learning of an Unnormalized Successor Representation. Neural Comput 2024; 36:1410-1423. [PMID: 38776964] [DOI: 10.1162/neco_a_01675]
Abstract
The successor representation is known to relate to temporal associations learned in the temporal context model (Gershman et al., 2012), and subsequent work suggests a wide relevance of the successor representation across spatial, visual, and abstract relational tasks. I demonstrate that the successor representation and purely associative learning have an even deeper relationship than initially indicated: Hebbian temporal associations are an unnormalized form of the successor representation, such that the two converge on an identical representation whenever all states are equally frequent and can correlate highly in practice even when the state distribution is nonuniform.
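The paper's central identity is easy to illustrate numerically. The sketch below (our own toy ring world, not the paper's simulations) uses the analytic expectation of Hebbian learning with a decaying trace rather than sampled updates: at stationarity the association matrix equals diag(p) @ M, so a uniform state distribution p makes it exactly an unnormalized SR:

```python
import numpy as np

# Ring world: move to the next state with prob 0.9, stay with prob 0.1.
n, gamma = 5, 0.8
T = 0.1 * np.eye(n) + 0.9 * np.roll(np.eye(n), 1, axis=1)

# Successor representation: discounted expected future state occupancies.
M = np.linalg.inv(np.eye(n) - gamma * T)

# Expected Hebbian temporal associations: a decaying trace of past states
# paired with the current state gives H[i, j] = p[i] * M[i, j] at
# stationarity, where p is the stationary state distribution.
p = np.ones(n) / n        # this chain is doubly stochastic, so p is uniform
H = np.diag(p) @ M

# With uniform p, H is M scaled by a constant: an unnormalized SR.
```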
Affiliation(s)
- Niels J Verosky
- Department of Psychology, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
4
Giallanza T, Campbell D, Cohen JD. Toward the Emergence of Intelligent Control: Episodic Generalization and Optimization. Open Mind (Camb) 2024; 8:688-722. [PMID: 38828434] [PMCID: PMC11142636] [DOI: 10.1162/opmi_a_00143]
Abstract
Human cognition is unique in its ability to perform a wide range of tasks and to learn new tasks quickly. Both abilities have long been associated with the acquisition of knowledge that can generalize across tasks and the flexible use of that knowledge to execute goal-directed behavior. We investigate how this emerges in a neural network by describing and testing the Episodic Generalization and Optimization (EGO) framework. The framework consists of an episodic memory module, which rapidly learns relationships between stimuli; a semantic pathway, which more slowly learns how stimuli map to responses; and a recurrent context module, which maintains a representation of task-relevant context information, integrates this over time, and uses it both to recall context-relevant memories (in episodic memory) and to bias processing in favor of context-relevant features and responses (in the semantic pathway). We use the framework to address empirical phenomena across reinforcement learning, event segmentation, and category learning, showing in simulations that the same set of underlying mechanisms accounts for human performance in all three domains. The results demonstrate how the components of the EGO framework can efficiently learn knowledge that can be flexibly generalized across tasks, furthering our understanding of how humans can quickly learn how to perform a wide range of tasks, a capability that is fundamental to human intelligence.
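The episodic component of the framework can be caricatured in a few lines. This is our own toy, not the authors' implementation: memories are stored as (context, item) pairs and recalled by cosine similarity between the current context and the stored contexts:

```python
import numpy as np

class EpisodicMemory:
    """Toy context-addressed episodic store (our sketch, not EGO itself)."""

    def __init__(self):
        self.contexts, self.items = [], []

    def store(self, context, item):
        self.contexts.append(np.asarray(context, dtype=float))
        self.items.append(item)

    def recall(self, context):
        # cosine similarity between the cue and every stored context
        C = np.stack(self.contexts)
        sims = C @ context / (np.linalg.norm(C, axis=1) * np.linalg.norm(context))
        return self.items[int(np.argmax(sims))]

mem = EpisodicMemory()
mem.store([1.0, 0.0, 0.0], "task-A response")
mem.store([0.0, 1.0, 0.0], "task-B response")
recalled = mem.recall(np.array([0.9, 0.1, 0.0]))   # noisy task-A context
```

A noisy version of a stored context is enough to retrieve the matching memory, which is the "recall context-relevant memories" role the context module plays in the abstract.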
Affiliation(s)
- Tyler Giallanza
- Department of Psychology, Princeton University, Princeton, NJ, USA
- Declan Campbell
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Jonathan D. Cohen
- Department of Psychology, Princeton University, Princeton, NJ, USA
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
5
Sagiv Y, Akam T, Witten IB, Daw ND. Prioritizing replay when future goals are unknown. bioRxiv 2024:2024.02.29.582822. [PMID: 38496674] [PMCID: PMC10942393] [DOI: 10.1101/2024.02.29.582822]
Abstract
Although hippocampal place cells replay nonlocal trajectories, the computational function of these events remains controversial. One hypothesis, formalized in a prominent reinforcement learning account, holds that replay plans routes to current goals. However, recent puzzling data appear to contradict this perspective by showing that replayed destinations lag current goals. These results may support an alternative hypothesis that replay updates route information to build a "cognitive map." Yet no similar theory exists to formalize this view, and it is unclear how such a map is represented or what role replay plays in computing it. We address these gaps by introducing a theory of replay that learns a map of routes to candidate goals, before reward is available or when its location may change. Our work extends the planning account to capture a general map-building function for replay, reconciling it with data, and revealing an unexpected relationship between the seemingly distinct hypotheses.
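One way to make the "map-building" idea concrete is a generic prioritized-replay sketch, under our own assumptions rather than the authors' model: before any reward is observed, replayed Bellman backups propagate route values outward from each candidate goal, with the most valuable updates replayed first.

```python
import numpy as np
from heapq import heappush, heappop

# 1-D corridor with deterministic left/right moves; two candidate goal sites.
n_states, gamma = 7, 0.9
neighbors = {s: {max(s - 1, 0), min(s + 1, n_states - 1)} for s in range(n_states)}
candidate_goals = [0, n_states - 1]

# One value map per candidate goal, built entirely by replay (no experience
# of reward is needed; reaching goal g is simply assumed to be worth 1).
V = {g: np.zeros(n_states) for g in candidate_goals}
for g in candidate_goals:
    queue = [(-1.0, g)]                       # seed replay at the goal itself
    while queue:
        _, s = heappop(queue)
        for pred in range(n_states):          # states that can step into s
            if s not in neighbors[pred] or pred == g:
                continue
            target = (1.0 if s == g else 0.0) + gamma * V[g][s]
            if target > V[g][pred] + 1e-9:    # replay only useful backups
                V[g][pred] = target
                heappush(queue, (-target, pred))
```

After replay, each map holds the discounted route value to its candidate goal, so whichever goal later becomes rewarded can be exploited immediately.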
Affiliation(s)
- Yotam Sagiv
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
- Thomas Akam
- Department of Experimental Psychology, Oxford University, Oxford, UK
- Ilana B Witten
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
- Nathaniel D Daw
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
6
Antony JW, Van Dam J, Massey JR, Barnett AJ, Bennion KA. Long-term, multi-event surprise correlates with enhanced autobiographical memory. Nat Hum Behav 2023; 7:2152-2168. [PMID: 37322234] [DOI: 10.1038/s41562-023-01631-8]
Abstract
Neurobiological and psychological models of learning emphasize the importance of prediction errors (surprises) for memory formation. This relationship has been shown for individual momentary surprising events; however, it is less clear whether surprise that unfolds across multiple events and timescales is also linked with better memory of those events. We asked basketball fans about their most positive and negative autobiographical memories of individual plays, games and seasons, allowing surprise measurements spanning seconds, hours and months. We used advanced analytics on National Basketball Association play-by-play data and betting odds spanning 17 seasons, more than 22,000 games and more than 5.6 million plays to compute and align the estimated surprise value of each memory. We found that surprising events were associated with better recall of positive memories on the scale of seconds and months and negative memories across all three timescales. Game and season memories could not be explained by surprise at shorter timescales, suggesting that long-term, multi-event surprise correlates with memory. These results expand notions of surprise in models of learning and reinforce its relevance in real-world domains.
Affiliation(s)
- James W Antony
- Department of Psychology and Child Development, California Polytechnic State University, San Luis Obispo, CA, USA.
- Jacob Van Dam
- Department of Psychology and Child Development, California Polytechnic State University, San Luis Obispo, CA, USA
- Jarett R Massey
- Department of Psychology and Child Development, California Polytechnic State University, San Luis Obispo, CA, USA
- Kelly A Bennion
- Department of Psychology and Child Development, California Polytechnic State University, San Luis Obispo, CA, USA
7
Mehrotra D, Dubé L. Accounting for multiscale processing in adaptive real-world decision-making via the hippocampus. Front Neurosci 2023; 17:1200842. [PMID: 37732307] [PMCID: PMC10508350] [DOI: 10.3389/fnins.2023.1200842]
Abstract
For adaptive real-time behavior in real-world contexts, the brain needs to allow past information over multiple timescales to influence current processing, so that the choices a person makes in everyday life produce the best outcomes. The neuroeconomics literature on value-based decision-making has formalized such choice through reinforcement learning models for two extreme strategies: model-free (MF) control, an automatic, stimulus-response type of action, and model-based (MB) control, which bases choice on cognitive representations of the world and causal inference about environment-behavior structure. Work on the neural substrates of value-based decision-making has emphasized the striatum and prefrontal regions, especially with regard to "here and now" decision-making. Yet such a dichotomy does not embrace all the dynamic complexity involved. In addition, despite robust research on the role of the hippocampus in memory and spatial learning, its contribution to value-based decision-making is only starting to be explored. This paper aims to better characterize the role of the hippocampus in decision-making and to advance the successor representation (SR) as a candidate mechanism for encoding state representations in the hippocampus, separate from reward representations. To this end, we review research relating hippocampal sequences to SR models, showing that implementing such sequences in reinforcement learning agents improves their performance and enables the agents to perform multiscale temporal processing in a biologically plausible manner. Altogether, we articulate a framework that extends current striatal- and prefrontal-focused accounts of decision-making to better capture the multiscale mechanisms underlying real-world, time-related concepts such as the self that accumulates over a person's life course.
Affiliation(s)
- Dhruv Mehrotra
- Integrated Program in Neuroscience, McGill University, Montréal, QC, Canada
- Montréal Neurological Institute, McGill University, Montréal, QC, Canada
- Laurette Dubé
- Desautels Faculty of Management, McGill University, Montréal, QC, Canada
- McGill Center for the Convergence of Health and Economics, McGill University, Montréal, QC, Canada
8
Shamash P, Lee S, Saxe AM, Branco T. Mice identify subgoal locations through an action-driven mapping process. Neuron 2023; 111:1966-1978.e8. [PMID: 37119818] [PMCID: PMC10636595] [DOI: 10.1016/j.neuron.2023.03.034]
Abstract
Mammals form mental maps of their environments by exploring their surroundings. Here, we investigate which elements of exploration are important for this process. We studied mouse escape behavior, in which mice are known to memorize subgoal locations (obstacle edges) in order to execute efficient escape routes to shelter. To test the role of exploratory actions, we developed closed-loop neural-stimulation protocols for interrupting various actions while mice explored. We found that blocking running movements directed at obstacle edges prevented subgoal learning; however, blocking several control movements had no effect. Reinforcement learning simulations and analysis of spatial data show that artificial agents can match these results if they have a region-level spatial representation and explore with object-directed movements. We conclude that mice employ an action-driven process for integrating subgoals into a hierarchical cognitive map. These findings broaden our understanding of the cognitive toolkit that mammals use to acquire spatial knowledge.
Affiliation(s)
- Philip Shamash
- UCL Sainsbury Wellcome Centre for Neural Circuits and Behaviour, London W1T 4JG, UK
- Sebastian Lee
- UCL Gatsby Computational Neuroscience Unit, London W1T 4JG, UK
- Andrew M Saxe
- UCL Gatsby Computational Neuroscience Unit, London W1T 4JG, UK
- Tiago Branco
- UCL Sainsbury Wellcome Centre for Neural Circuits and Behaviour, London W1T 4JG, UK.
9
Fang C, Aronov D, Abbott LF, Mackevicius EL. Neural learning rules for generating flexible predictions and computing the successor representation. eLife 2023; 12:e80680. [PMID: 36928104] [PMCID: PMC10019889] [DOI: 10.7554/elife.80680]
Abstract
The predictive nature of the hippocampus is thought to be useful for memory-guided cognitive behaviors. Inspired by the reinforcement learning literature, this notion has been formalized as a predictive map called the successor representation (SR). The SR captures a number of observations about hippocampal activity. However, the algorithm does not provide a neural mechanism for how such representations arise. Here, we show that the dynamics of a recurrent neural network naturally calculate the SR when the synaptic weights match the transition probability matrix. Interestingly, the predictive horizon can be flexibly modulated simply by changing the network gain. We derive simple, biologically plausible learning rules to learn the SR in a recurrent network. We test our model with realistic inputs and match hippocampal data recorded during random foraging. Taken together, our results suggest that the SR is more accessible in neural circuits than previously thought and can support a broad range of cognitive functions.
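The core claim, that a linear recurrent network whose weights match the transition matrix computes the SR at steady state with the gain acting as the discount factor, can be checked directly. This is a sketch in our own notation on a toy ring environment, not the paper's simulations:

```python
import numpy as np

n = 4
T = np.roll(np.eye(n), 1, axis=1)        # deterministic ring: s -> s + 1 mod n
gain = 0.7                                # plays the role of the SR discount

def steady_state(inp, W, g, steps=200):
    """Iterate linear rate dynamics x <- inp + g * W.T @ x to convergence."""
    x = np.zeros_like(inp)
    for _ in range(steps):
        x = inp + g * W.T @ x
    return x

M = np.linalg.inv(np.eye(n) - gain * T)   # analytic SR at discount = gain
x = steady_state(np.eye(n)[0], T, gain)   # probe the network at state 0

# The steady state reproduces the SR row for the probed state: the
# discounted predicted occupancy of every future state.  Lowering the gain
# shortens the predictive horizon without changing the weights.
```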
Affiliation(s)
- Ching Fang
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- Dmitriy Aronov
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- LF Abbott
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- Emily L Mackevicius
- Zuckerman Institute, Department of Neuroscience, Columbia University, New York, United States
- Basis Research Institute, New York, United States
10
Uncertainty-aware transfer across tasks using hybrid model-based successor feature reinforcement learning. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.076]
11
Duvelle É, Grieves RM, van der Meer MAA. Temporal context and latent state inference in the hippocampal splitter signal. eLife 2023; 12:e82357. [PMID: 36622350] [PMCID: PMC9829411] [DOI: 10.7554/elife.82357]
Abstract
The hippocampus is thought to enable the encoding and retrieval of ongoing experience, the organization of that experience into structured representations like contexts, maps, and schemas, and the use of these structures to plan for the future. A central goal is to understand what the core computations supporting these functions are, and how these computations are realized in the collective action of single neurons. A potential access point into this issue is provided by 'splitter cells', hippocampal neurons that fire differentially on the overlapping segment of trajectories that differ in their past and/or future. However, the literature on splitter cells has been fragmented and confusing, owing to differences in terminology, behavioral tasks, and analysis methods across studies. In this review, we synthesize consistent findings from this literature, establish a common set of terms, and translate between single-cell and ensemble perspectives. Most importantly, we examine the combined findings through the lens of two major theoretical ideas about hippocampal function: representation of temporal context and latent state inference. We find that unique signature properties of each of these models are necessary to account for the data, but neither theory, by itself, explains all of its features. Specifically, the temporal gradedness of the splitter signal is strong support for temporal context, but is hard to explain using state models, while its flexibility and task-dependence is naturally accounted for using state inference, but poses a challenge otherwise. These theories suggest a number of avenues for future work, and we believe their application to splitter cells is a timely and informative domain for testing and refining theoretical ideas about hippocampal function.
Affiliation(s)
- Éléonore Duvelle
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States
- Roddy M Grieves
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States
12
Morita K, Shimomura K, Kawaguchi Y. Opponent Learning with Different Representations in the Cortico-Basal Ganglia Circuits. eNeuro 2023; 10:ENEURO.0422-22.2023. [PMID: 36653187] [PMCID: PMC9884109] [DOI: 10.1523/eneuro.0422-22.2023]
Abstract
The direct and indirect pathways of the basal ganglia (BG) have been suggested to learn mainly from positive and negative feedback, respectively. Since these pathways unevenly receive inputs from different cortical neuron types and/or regions, they may preferentially use different state/action representations. We explored whether such a combined use of different representations, coupled with different learning rates for positive and negative reward prediction errors (RPEs), has computational benefits. We modeled the animal as an agent equipped with two learning systems, each adopting either an individual representation (IR) or a successor representation (SR) of states. Varying the combination of IR or SR and the learning rates from positive and negative RPEs in each system, we examined how the agent performed in a dynamic reward navigation task. We found that combining an SR-based system that learns mainly from positive RPEs with an IR-based system that learns mainly from negative RPEs achieved good performance in the task, compared with other combinations. In such a combination of appetitive SR-based and aversive IR-based systems, both systems show activities of comparable magnitude with opposite signs, consistent with the suggested profiles of the two BG pathways. Moreover, the architecture of such a combination provides a novel, coherent explanation for the functional significance and underlying mechanisms of diverse findings about the cortico-BG circuits. These results suggest that combining different representations with appetitive and aversive learning can be an effective strategy in certain dynamic environments, and that it might actually be implemented in the cortico-BG circuits.
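A stripped-down version of the winning combination can be sketched as follows, under our own simplifications (a ring environment, an SR assumed already learned, and our parameter choices rather than the paper's): an SR-based value learner updated mainly by positive RPEs is paired with an IR-based learner updated mainly by negative RPEs, and the agent's value is their sum.

```python
import numpy as np

n, gamma = 5, 0.7
T = np.roll(np.eye(n), 1, axis=1)             # deterministic ring: s -> s + 1
SR = np.linalg.inv(np.eye(n) - gamma * T)     # SR features (assumed learned)
IR = np.eye(n)                                 # individual (one-hot) features

w_sr = np.zeros(n)                             # appetitive, SR-based system
w_ir = np.zeros(n)                             # aversive, IR-based system
lr_sr = {"pos": 0.1, "neg": 0.01}              # learns mainly from gains
lr_ir = {"pos": 0.01, "neg": 0.1}              # learns mainly from losses

def value(s):
    return SR[s] @ w_sr + IR[s] @ w_ir         # summed two-system value

reward = np.zeros(n)
reward[3] = 1.0                                # reward on entering state 3
s = 0
for _ in range(500):
    s_next = int(np.argmax(T[s]))
    rpe = reward[s_next] + gamma * value(s_next) - value(s)
    sign = "pos" if rpe > 0 else "neg"
    w_sr += lr_sr[sign] * rpe * SR[s]          # asymmetric TD updates
    w_ir += lr_ir[sign] * rpe * IR[s]
    s = s_next
```

After learning, states closer to the reward carry higher summed value, with the appetitive SR system contributing the bulk of the positive signal.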
Affiliation(s)
- Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan
- International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo 113-0033, Japan
- Kanji Shimomura
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan
- Department of Behavioral Medicine, National Institute of Mental Health, National Center of Neurology and Psychiatry, Kodaira 187-8551, Japan
- Yasuo Kawaguchi
- Brain Science Institute, Tamagawa University, Machida 194-8610, Japan
- National Institute for Physiological Sciences (NIPS), Okazaki 444-8787, Japan
13
McNamee DC, Stachenfeld KL, Botvinick MM, Gershman SJ. Compositional Sequence Generation in the Entorhinal-Hippocampal System. Entropy (Basel) 2022; 24:1791. [PMID: 36554196] [PMCID: PMC9778317] [DOI: 10.3390/e24121791]
Abstract
Neurons in the medial entorhinal cortex exhibit multiple, periodically organized, firing fields which collectively appear to form an internal representation of space. Neuroimaging data suggest that this grid coding is also present in other cortical areas such as the prefrontal cortex, indicating that it may be a general principle of neural functionality in the brain. In a recent analysis through the lens of dynamical systems theory, we showed how grid coding can lead to the generation of a diversity of empirically observed sequential reactivations of hippocampal place cells corresponding to traversals of cognitive maps. Here, we extend this sequence generation model by describing how the synthesis of multiple dynamical systems can support compositional cognitive computations. To empirically validate the model, we simulate two experiments demonstrating compositionality in space or in time during sequence generation. Finally, we describe several neural network architectures supporting various types of compositionality based on grid coding and highlight connections to recent work in machine learning leveraging analogous techniques.
Affiliation(s)
- Daniel C. McNamee
- Neuroscience Programme, Champalimaud Research, 1400-038 Lisbon, Portugal
- Matthew M. Botvinick
- Google DeepMind, London N1C 4DN, UK
- Gatsby Computational Neuroscience Unit, University College London, London W1T 4JG, UK
- Samuel J. Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Center for Brains, Minds and Machines, MIT, Cambridge, MA 02139, USA
14
de Cothi W, Nyberg N, Griesbauer EM, Ghanamé C, Zisch F, Lefort JM, Fletcher L, Newton C, Renaudineau S, Bendor D, Grieves R, Duvelle É, Barry C, Spiers HJ. Predictive maps in rats and humans for spatial navigation. Curr Biol 2022; 32:3676-3689.e5. [PMID: 35863351] [PMCID: PMC9616735] [DOI: 10.1016/j.cub.2022.06.090]
Abstract
Much of our understanding of navigation comes from the study of individual species, often with specific tasks tailored to those species. Here, we provide a novel experimental and analytic framework integrating across humans, rats, and simulated reinforcement learning (RL) agents to interrogate the dynamics of behavior during spatial navigation. We developed a novel open-field navigation task ("Tartarus maze") requiring dynamic adaptation (shortcuts and detours) to frequently changing obstructions on the path to a hidden goal. Humans and rats were remarkably similar in their trajectories. Both species showed the greatest similarity to RL agents utilizing a "successor representation," which creates a predictive map. Humans also displayed trajectory features similar to model-based RL agents, which implemented an optimal tree-search planning procedure. Our results help refine models seeking to explain mammalian navigation in dynamic environments and highlight the utility of modeling the behavior of different species to uncover the shared mechanisms that support behavior.
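The "predictive map" property that makes SR agents a good match to flexible navigation can be shown in a few lines (our toy corridor, not the Tartarus maze): once the SR M is learned under a random-walk policy, moving the goal only changes the reward vector r, and values are re-derived instantly as V = M @ r with no relearning of the map.

```python
import numpy as np

n, gamma = 6, 0.9
# Random walk on a corridor: step left or right with equal probability
# (boundary states bounce back onto themselves half the time).
T = np.zeros((n, n))
for s in range(n):
    T[s, max(s - 1, 0)] += 0.5
    T[s, min(s + 1, n - 1)] += 0.5

M = np.linalg.inv(np.eye(n) - gamma * T)   # predictive map (SR)

r_right = np.eye(n)[n - 1]                 # goal at the right end...
r_left = np.eye(n)[0]                      # ...then the goal moves

V_right = M @ r_right                      # values re-derived immediately,
V_left = M @ r_left                        # without relearning the map
```

This one-shot revaluation after a goal change is the kind of detour/shortcut flexibility the behavioral comparison probes.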
Affiliation(s)
- William de Cothi
- Department of Cell and Developmental Biology, University College London, London, UK; Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Nils Nyberg
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Eva-Maria Griesbauer
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Carole Ghanamé
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Fiona Zisch
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; The Bartlett School of Architecture, University College London, London, UK
- Julie M Lefort
- Department of Cell and Developmental Biology, University College London, London, UK
- Lydia Fletcher
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Coco Newton
- Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Sophie Renaudineau
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Daniel Bendor
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
- Roddy Grieves
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Éléonore Duvelle
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Caswell Barry
- Department of Cell and Developmental Biology, University College London, London, UK
- Hugo J Spiers
- Institute of Behavioral Neuroscience, Department of Experimental Psychology, Division of Psychology and Language Sciences, University College London, London, UK
15
Pudhiyidath A, Morton NW, Viveros Duran R, Schapiro AC, Momennejad I, Hinojosa-Rowland DM, Molitor RJ, Preston AR. Representations of Temporal Community Structure in Hippocampus and Precuneus Predict Inductive Reasoning Decisions. J Cogn Neurosci 2022; 34:1736-1760. [PMID: 35579986] [PMCID: PMC10262802] [DOI: 10.1162/jocn_a_01864]
Abstract
Our understanding of the world is shaped by inferences about underlying structure. For example, at the gym, you might notice that the same people tend to arrive around the same time and infer that they are friends that work out together. Consistent with this idea, after participants are presented with a temporal sequence of objects that follows an underlying community structure, they are biased to infer that objects from the same community share the same properties. Here, we used fMRI to measure neural representations of objects after temporal community structure learning and examine how these representations support inference about object relationships. We found that community structure learning affected inferred object similarity: When asked to spatially group items based on their experience, participants tended to group together objects from the same community. Neural representations in perirhinal cortex predicted individual differences in object grouping, suggesting that high-level object representations are affected by temporal community learning. Furthermore, participants were biased to infer that objects from the same community would share the same properties. Using computational modeling of temporal learning and inference decisions, we found that inductive reasoning is influenced by both detailed knowledge of temporal statistics and abstract knowledge of the temporal communities. The fidelity of temporal community representations in hippocampus and precuneus predicted the degree to which temporal community membership biased reasoning decisions. Our results suggest that temporal knowledge is represented at multiple levels of abstraction, and that perirhinal cortex, hippocampus, and precuneus may support inference based on this knowledge.
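The temporal-community effect described above has a compact formal analogue. In this sketch (our own toy graph, and the SR is just one candidate formalism for the temporal knowledge the paper studies), a random walk on two communities joined by a bridge yields successor representations that are more similar within a community than between communities, mirroring the grouping bias participants showed:

```python
import numpy as np

# Two fully connected 4-node communities joined by a single bridge edge.
n = 8
A = np.zeros((n, n))
for comm in ([0, 1, 2, 3], [4, 5, 6, 7]):
    for i in comm:
        for j in comm:
            if i != j:
                A[i, j] = 1
A[3, 4] = A[4, 3] = 1                      # bridge between the communities

T = A / A.sum(axis=1, keepdims=True)       # random-walk transition matrix
M = np.linalg.inv(np.eye(n) - 0.9 * T)     # successor representation

def sim(i, j):
    """Cosine similarity between the SR rows of two items."""
    return M[i] @ M[j] / (np.linalg.norm(M[i]) * np.linalg.norm(M[j]))

within, between = sim(0, 1), sim(0, 5)
```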
Collapse
|
16
|
Fountas Z, Sylaidi A, Nikiforou K, Seth AK, Shanahan M, Roseboom W. A Predictive Processing Model of Episodic Memory and Time Perception. Neural Comput 2022; 34:1501-1544. [PMID: 35671462 DOI: 10.1162/neco_a_01514] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2021] [Accepted: 03/06/2022] [Indexed: 11/04/2022]
Abstract
Human perception and experience of time are strongly influenced by ongoing stimulation, memory of past experiences, and required task context. When paying attention to time, time experience seems to expand; when distracted, it seems to contract. When considering time based on memory, the experience may differ from what it was in the moment, exemplified by sayings like "time flies when you're having fun." Experience of time also depends on the content of perceptual experience: rapidly changing or complex perceptual scenes seem longer in duration than less dynamic ones. The complexity of interactions among attention, memory, and perceptual stimulation is a likely reason that an overarching theory of time perception has been difficult to achieve. Here, we introduce a model of perceptual processing and episodic memory that makes use of hierarchical predictive coding, short-term plasticity, spatiotemporal attention, and episodic memory formation and recall, and apply this model to the problem of human time perception. In an experiment with approximately 13,000 human participants, we investigated the effects of memory, cognitive load, and stimulus content on duration reports of dynamic natural scenes up to about 1 minute long. Using our model to generate duration estimates, we compared human and model performance. Model-based estimates replicated key qualitative biases, including differences by cognitive load (attention), scene type (stimulation), and whether the judgment was made based on current or remembered experience (memory). Our work provides a comprehensive model of human time perception and a foundation for exploring the computational basis of episodic memory within a hierarchical predictive coding framework.
Collapse
Affiliation(s)
- Zafeirios Fountas
- Emotech Labs, London N1 7EU, U.K.; Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1N 3AR, U.K.
| | | | | | - Anil K Seth
- Department of Informatics and Sackler Centre for Consciousness Science, University of Sussex, Brighton BN1 9RH, U.K.; Canadian Institute for Advanced Research Program on Brain, Mind, and Consciousness, Toronto, ON M5G 1M1, Canada
| | - Murray Shanahan
- Department of Computing, Imperial College London, London, SW7 2RH, U.K.
| | - Warrick Roseboom
- Department of Informatics and Sackler Centre for Consciousness Science, University of Sussex, Brighton BN1 9RH, U.K.
| |
Collapse
|
17
|
Stiso J, Lynn CW, Kahn AE, Rangarajan V, Szymula KP, Archer R, Revell A, Stein JM, Litt B, Davis KA, Lucas TH, Bassett DS. Neurophysiological Evidence for Cognitive Map Formation during Sequence Learning. eNeuro 2022; 9:ENEURO.0361-21.2022. [PMID: 35105662 PMCID: PMC8896554 DOI: 10.1523/eneuro.0361-21.2022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2021] [Revised: 12/03/2021] [Accepted: 01/03/2022] [Indexed: 12/29/2022] Open
Abstract
Humans deftly parse statistics from sequences. Some theories posit that humans learn these statistics by forming cognitive maps, or underlying representations of the latent space which links items in the sequence. Here, an item in the sequence is a node, and the probability of transitioning between two items is an edge. Sequences can then be generated from walks through the latent space, with different spaces giving rise to different sequence statistics. Individual or group differences in sequence learning can be modeled by changing the time scale over which estimates of transition probabilities are built, or in other words, by changing the amount of temporal discounting. Latent space models with temporal discounting bear a resemblance to models of navigation through Euclidean spaces. However, few explicit links have been made between predictions from Euclidean spatial navigation and neural activity during human sequence learning. Here, we use a combination of behavioral modeling and intracranial electroencephalography (iEEG) recordings to investigate how neural activity might support the formation of space-like cognitive maps through temporal discounting during sequence learning. Specifically, we acquire human reaction times from a sequential reaction time task, to which we fit a model that formulates the amount of temporal discounting as a single free parameter. From the parameter, we calculate each individual's estimate of the latent space. We find that neural activity reflects these estimates mostly in the temporal lobe, including areas involved in spatial navigation. Similar to spatial navigation, we find that low-dimensional representations of neural activity allow for easy separation of important features, such as modules, in the latent space. Lastly, we take advantage of the high temporal resolution of iEEG data to determine the time scale on which latent spaces are learned. We find that learning typically happens within the first 500 trials, and is modulated by the underlying latent space and the amount of temporal discounting characteristic of each participant. Ultimately, this work provides important links between behavioral models of sequence learning and neural activity during the same behavior, and contextualizes these results within a broader framework of domain general cognitive maps.
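The behavioral model sketched in this abstract, in which a learner's transition estimates are built from past item pairs down-weighted by temporal lag, can be illustrated in a few lines. The function name, discount parameterization, and row normalization are my own choices for illustration, not the authors' code:

```python
import numpy as np

def discounted_transition_estimate(seq, n_states, eta=0.5):
    """Estimate a learner's internal transition matrix from a sequence.

    All earlier->later item pairs are counted, down-weighted
    exponentially by temporal lag. eta = 0 recovers veridical one-step
    transition frequencies; larger eta blurs in longer-range structure
    (more temporal discounting)."""
    counts = np.zeros((n_states, n_states))
    for t in range(1, len(seq)):
        for lag in range(1, t + 1):
            counts[seq[t - lag], seq[t]] += eta ** (lag - 1)
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1.0  # avoid dividing by zero for unseen states
    return counts / rows   # normalize rows to probabilities
```

With eta = 0 only adjacent pairs contribute, so the estimate matches the true one-step statistics; increasing eta mixes multi-step transitions into the estimated map.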
Collapse
Affiliation(s)
- Jennifer Stiso
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104
| | - Christopher W Lynn
- Initiative for the Theoretical Sciences, Graduate Center, City University of New York, New York, NY 10016
- Joseph Henry Laboratories of Physics, Princeton University, Princeton, NJ 08544
| | - Ari E Kahn
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544
| | - Vinitha Rangarajan
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104
| | - Karol P Szymula
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104
| | - Ryan Archer
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA 19104
| | - Andrew Revell
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA 19104
| | - Joel M Stein
- Department of Radiology, Hospital of the University of Pennsylvania, Philadelphia, PA 19104
| | - Brian Litt
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA 19104
| | - Kathryn A Davis
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA 19104
| | - Timothy H Lucas
- Department of Neurology, Hospital of the University of Pennsylvania, Philadelphia, PA 19104
| | - Dani S Bassett
- Department of Bioengineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104
- Department of Electrical and Systems Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104
- Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Department of Physics and Astronomy, College of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104
- The Santa Fe Institute, Santa Fe, NM 87501
| |
Collapse
|
18
|
|
19
|
Benna MK, Fusi S. Place cells may simply be memory cells: Memory compression leads to spatial tuning and history dependence. Proc Natl Acad Sci U S A 2021; 118:e2018422118. [PMID: 34916282 PMCID: PMC8713479 DOI: 10.1073/pnas.2018422118] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/02/2021] [Indexed: 11/18/2022] Open
Abstract
The observation of place cells has suggested that the hippocampus plays a special role in encoding spatial information. However, place cell responses are modulated by several nonspatial variables and reported to be rather unstable. Here, we propose a memory model of the hippocampus that provides an interpretation of place cells consistent with these observations. We hypothesize that the hippocampus is a memory device that takes advantage of the correlations between sensory experiences to generate compressed representations of the episodes that are stored in memory. A simple neural network model that can efficiently compress information naturally produces place cells that are similar to those observed in experiments. It predicts that the activity of these cells is variable and that the fluctuations of the place fields encode information about the recent history of sensory experiences. Place cells may simply be a consequence of a memory compression process implemented in the hippocampus.
Collapse
Affiliation(s)
- Marcus K Benna
- Center for Theoretical Neuroscience, Columbia University, New York, NY 10027;
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027
- Neurobiology Section, Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093
| | - Stefano Fusi
- Center for Theoretical Neuroscience, Columbia University, New York, NY 10027;
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027
- Kavli Institute for Brain Sciences, Columbia University, New York, NY 10027
| |
Collapse
|
20
|
K Namboodiri VM, Stuber GD. The learning of prospective and retrospective cognitive maps within neural circuits. Neuron 2021; 109:3552-3575. [PMID: 34678148 PMCID: PMC8809184 DOI: 10.1016/j.neuron.2021.09.034] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2021] [Revised: 08/26/2021] [Accepted: 09/16/2021] [Indexed: 11/18/2022]
Abstract
Brain circuits are thought to form a "cognitive map" to process and store statistical relationships in the environment. A cognitive map is commonly defined as a mental representation that describes environmental states (i.e., variables or events) and the relationship between these states. This process is commonly conceptualized as a prospective process, as it is based on the relationships between states in chronological order (e.g., does reward follow a given state?). In this perspective, we expand this concept on the basis of recent findings to postulate that in addition to a prospective map, the brain forms and uses a retrospective cognitive map (e.g., does a given state precede reward?). In doing so, we demonstrate that many neural signals and behaviors (e.g., habits) that seem inflexible and non-cognitive can result from retrospective cognitive maps. Together, we present a significant conceptual reframing of the neurobiological study of associative learning, memory, and decision making.
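The contrast this abstract draws between prospective and retrospective maps can be made concrete with a toy tally over an event sequence: prospective, "does reward follow a given state?" vs. retrospective, "does a given state precede reward?". This construction is mine, for illustration only, not the authors' formalism:

```python
from collections import Counter

def prospective_retrospective(events, target="reward"):
    """Tally two conditionals from a sequence of events:
    prospective:   P(target follows state)
    retrospective: P(state preceded target)"""
    follows, precedes, state_count = Counter(), Counter(), Counter()
    target_count = 0
    for prev, nxt in zip(events, events[1:]):
        if prev != target:
            state_count[prev] += 1
            if nxt == target:
                follows[prev] += 1
        if nxt == target:
            target_count += 1
            if prev != target:
                precedes[prev] += 1
    prospective = {s: follows[s] / state_count[s] for s in state_count}
    retrospective = ({s: precedes[s] / target_count for s in state_count}
                     if target_count else {})
    return prospective, retrospective
```

Note that the two maps can dissociate: a state may always be followed by reward (prospective probability 1) while accounting for only a fraction of the reward's predecessors (retrospective probability below 1).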
Collapse
Affiliation(s)
- Vijay Mohan K Namboodiri
- Department of Neurology, Center for Integrative Neuroscience, Kavli Institute for Fundamental Neuroscience, Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA.
| | - Garret D Stuber
- Center for the Neurobiology of Addiction, Pain, and Emotion, Department of Anesthesiology and Pain Medicine, Department of Pharmacology, Neuroscience Graduate Program, University of Washington, Seattle, WA 98195, USA.
| |
Collapse
|
21
|
Feng Z, Nagase AM, Morita K. A Reinforcement Learning Approach to Understanding Procrastination: Does Inaccurate Value Approximation Cause Irrational Postponing of a Task? Front Neurosci 2021; 15:660595. [PMID: 34602962 PMCID: PMC8481628 DOI: 10.3389/fnins.2021.660595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Accepted: 08/16/2021] [Indexed: 11/27/2022] Open
Abstract
Procrastination is the voluntary but irrational postponing of a task despite being aware that the delay can lead to worse consequences. It has been studied extensively in psychology, from contributing factors to theoretical models. From a value-based decision-making and reinforcement learning (RL) perspective, procrastination has been suggested to be caused by non-optimal choice resulting from cognitive limitations. Exactly what sort of cognitive limitations are involved, however, remains elusive. In the current study, we examined whether a particular type of cognitive limitation, namely, inaccurate valuation resulting from inadequate state representation, would cause procrastination. Recent work has suggested that humans may adopt a particular type of state representation called the successor representation (SR) and that humans can learn to represent states by relatively low-dimensional features. Combining these suggestions, we assumed a dimension-reduced version of SR. We modeled a series of behaviors of a "student" doing assignments during the school term, when putting off doing the assignments (i.e., procrastination) is not allowed, and during the vacation, when whether to procrastinate or not can be freely chosen. We assumed that the "student" had acquired a rigid reduced SR of each state, corresponding to each step in completing an assignment, under the policy without procrastination. The "student" learned the approximated value of each state, computed as a linear function of features of the states in the rigid reduced SR, through temporal-difference (TD) learning. During the vacation, the "student" decided at each time step whether to procrastinate based on these approximated values. Simulation results showed that the reduced SR-based RL model generated procrastination behavior, which worsened across episodes. According to the values approximated by the "student," to procrastinate was the better choice, whereas not to procrastinate was mostly better according to the true values. Thus, the current model generated procrastination behavior caused by inaccurate value approximation, which resulted from the adoption of the reduced SR as state representation. These findings indicate that the reduced SR, or more generally, dimension reduction in state representation, can be a potential form of cognitive limitation that leads to procrastination.
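The value-learning step described here, temporal-difference learning of a value that is a linear function of state features (such as rows of a reduced SR), can be sketched as follows. The episode structure and parameter names are illustrative assumptions, not the paper's exact model:

```python
import numpy as np

def td_linear(features, rewards, episodes, alpha=0.1, gamma=0.9):
    """TD(0) with linear value approximation, V(s) = w . phi(s).

    `features` is an (n_states, n_features) matrix of state features
    (e.g., a dimension-reduced successor representation). Each episode
    is a fixed walk through states 0..n_states-1, receiving
    rewards[s+1] on entering state s+1."""
    w = np.zeros(features.shape[1])
    n_states = features.shape[0]
    for _ in range(episodes):
        for s in range(n_states - 1):
            # TD error: reward plus discounted next-state value minus current value
            delta = rewards[s + 1] + gamma * features[s + 1] @ w - features[s] @ w
            w += alpha * delta * features[s]
    return features @ w  # approximated state values
```

With identity features this reduces to tabular TD(0); with a rigid, low-dimensional feature matrix, the learned values can systematically deviate from the true ones, which is the mechanism the abstract appeals to.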
Collapse
Affiliation(s)
- Zheyu Feng
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
| | - Asako Mitsuto Nagase
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Division of Neurology, Department of Brain and Neurosciences, Faculty of Medicine, Tottori University, Yonago, Japan
- Research Fellowship for Young Scientists, Japan Society for the Promotion of Science, Tokyo, Japan
- Department of Neurology, Faculty of Medicine, Shimane University, Izumo, Japan
| | - Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo, Japan
| |
Collapse
|
22
|
Shimomura K, Kato A, Morita K. Rigid reduced successor representation as a potential mechanism for addiction. Eur J Neurosci 2021; 53:3768-3790. [PMID: 33840120 PMCID: PMC8252639 DOI: 10.1111/ejn.15227] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2020] [Revised: 03/30/2021] [Accepted: 04/07/2021] [Indexed: 12/14/2022]
Abstract
Difficulty in cessation of drinking, smoking, or gambling has been widely recognized. Conventional theories proposed relative dominance of habitual over goal-directed control, but human studies have not convincingly supported them. Referring to the recently suggested "successor representation (SR)" of states that enables partially goal-directed control, we propose a dopamine-related mechanism that makes resistance to habitual reward-obtaining particularly difficult. We considered that long-standing behavior towards a certain reward without resisting temptation can, though not always, lead to the formation of a rigid dimension-reduced SR based on the goal state, which cannot be updated. Then, in our model assuming such rigid reduced SR, whereas no reward prediction error (RPE) is generated at the goal while no resistance is made, a sustained large positive RPE is generated upon goal reaching once the person starts resisting temptation. Such a sustained RPE is somewhat similar to the hypothesized sustained fictitious RPE caused by drug-induced dopamine. In contrast, if rigid reduced SR is not formed and states are represented individually as in simple reinforcement learning models, no sustained RPE is generated at the goal. Formation of rigid reduced SR also attenuates the resistance-dependent decrease in the value of the cue for behavior, makes subsequent introduction of punishment after the goal ineffective, and potentially enhances the propensity of nonresistance through the influence of RPEs via the spiral striatum-midbrain circuit. These results suggest that formation of rigid reduced SR makes cessation of habitual reward-obtaining particularly difficult and can thus be a mechanism for addiction, common to substance and nonsubstance reward.
Collapse
Affiliation(s)
- Kanji Shimomura
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Department of Behavioral Medicine, National Institute of Mental Health, National Center of Neurology and Psychiatry, Kodaira, Japan
| | - Ayaka Kato
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Laboratory for Circuit Mechanisms of Sensory Perception, RIKEN Center for Brain Science, Wako, Japan
- Research Fellowship for Young Scientists, Japan Society for the Promotion of Science, Tokyo, Japan
| | - Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo, Japan
| |
Collapse
|
23
|
|
24
|
Abstract
Humans and other animals use multiple strategies for making decisions. Reinforcement-learning theory distinguishes between stimulus-response (model-free; MF) learning and deliberative (model-based; MB) planning. The spatial-navigation literature presents a parallel dichotomy between navigation strategies. In "response learning," associated with the dorsolateral striatum (DLS), decisions are anchored to an egocentric reference frame. In "place learning," associated with the hippocampus, decisions are anchored to an allocentric reference frame. Emerging evidence suggests that the contribution of hippocampus to place learning may also underlie its contribution to MB learning by representing relational structure in a cognitive map. Here, we introduce a computational model in which hippocampus subserves place and MB learning by learning a "successor representation" of relational structure between states; DLS implements model-free response learning by learning associations between actions and egocentric representations of landmarks; and action values from either system are weighted by the reliability of its predictions. We show that this model reproduces a range of seemingly disparate behavioral findings in spatial and nonspatial decision tasks and explains the effects of lesions to DLS and hippocampus on these tasks. Furthermore, modeling place cells as driven by boundaries explains the observation that, unlike navigation guided by landmarks, navigation guided by boundaries is robust to "blocking" by prior state-reward associations due to learned associations between place cells. Our model, originally shaped by detailed constraints in the spatial literature, successfully characterizes the hippocampal-striatal system as a general system for decision making via adaptive combination of stimulus-response learning and the use of a cognitive map.
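The arbitration scheme this abstract describes, weighting each system's action values by the reliability of its predictions, can be sketched minimally. The specific reliability measure and the function names are my own assumptions, not the paper's exact scheme:

```python
import numpy as np

def reliability(pred_errors, window=5):
    """Map a history of unsigned prediction errors to a score in (0, 1]:
    low recent error -> high reliability."""
    return 1.0 / (1.0 + np.mean(np.abs(pred_errors[-window:])))

def combine_action_values(q_mf, q_mb, err_mf, err_mb):
    """Mix model-free and model-based action values, weighting each
    controller by the reliability of its recent predictions."""
    r_mf, r_mb = reliability(err_mf), reliability(err_mb)
    w = r_mb / (r_mf + r_mb)  # weight on the model-based (cognitive-map) system
    return w * np.asarray(q_mb) + (1.0 - w) * np.asarray(q_mf)
```

A controller that has recently predicted well thus dominates choice, while its influence wanes as its prediction errors grow.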
Collapse
|
25
|
Mark S, Moran R, Parr T, Kennerley SW, Behrens TEJ. Transferring structural knowledge across cognitive maps in humans and models. Nat Commun 2020; 11:4783. [PMID: 32963219 PMCID: PMC7508979 DOI: 10.1038/s41467-020-18254-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2019] [Accepted: 08/14/2020] [Indexed: 01/15/2023] Open
Abstract
Relations between task elements often follow hidden underlying structural forms such as periodicities or hierarchies, whose inference fosters performance. However, transferring structural knowledge to novel environments requires flexible representations that are generalizable over the particularities of the current environment, such as its stimuli and size. We suggest that humans represent structural forms as abstract basis sets and that in novel tasks, the structural form is inferred and the relevant basis set is transferred. Using a computational model, we show that such a representation allows inference of the underlying structural form, important task states, effective behavioural policies and the existence of unobserved state-trajectories. In two experiments, participants learned three abstract graphs over two successive days. We tested how structural knowledge acquired on Day-1 affected Day-2 performance. In line with our model, participants who had a correct structural prior were able to infer the existence of unobserved state-trajectories and appropriate behavioural policies.
Collapse
Affiliation(s)
- Shirley Mark
- Wellcome Trust Centre for Neuroimaging, UCL. Queen Square 12, London, WC1N 3BG, UK.
| | - Rani Moran
- Max Planck UCL Center for Computational Psychiatry and Aging Research, Russell Square 10-12, London, WC1B 5EH, UK
| | - Thomas Parr
- Wellcome Trust Centre for Neuroimaging, UCL. Queen Square 12, London, WC1N 3BG, UK
| | - Steve W Kennerley
- Sobell Department of Motor Neuroscience, University College London, London, UK
| | - Timothy E J Behrens
- Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London, WC1N 3BG, UK
| |
Collapse
|
26
|
Stefanini F, Kushnir L, Jimenez JC, Jennings JH, Woods NI, Stuber GD, Kheirbek MA, Hen R, Fusi S. A Distributed Neural Code in the Dentate Gyrus and in CA1. Neuron 2020; 107:703-716.e4. [PMID: 32521223 PMCID: PMC7442694 DOI: 10.1016/j.neuron.2020.05.022] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 04/01/2020] [Accepted: 05/15/2020] [Indexed: 01/28/2023]
Abstract
Neurons are often considered specialized functional units that encode a single variable. However, many neurons are observed to respond to a mix of disparate sensory, cognitive, and behavioral variables. For such representations, information is distributed across multiple neurons. Here we find this distributed code in the dentate gyrus and CA1 subregions of the hippocampus. Using calcium imaging in freely moving mice, we decoded an animal's position, direction of motion, and speed from the activity of hundreds of cells. The response properties of individual neurons were only partially predictive of their importance for encoding position. Non-place cells encoded position and contributed to position encoding when combined with other cells. Indeed, disrupting the correlations between neural activities decreased decoding performance, mostly in CA1. Our analysis indicates that population methods rather than classical analyses based on single-cell response properties may more accurately characterize the neural code in the hippocampus.
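The population-decoding logic in this abstract, reading position out of the joint activity of many cells rather than single-cell tuning, can be illustrated with a toy nearest-template decoder. This is a stand-in of my own; the paper's decoders were more elaborate:

```python
import numpy as np

def decode_position(train_act, train_pos, test_act):
    """Toy population decoder: average the training activity within
    each position bin into a template, then assign each test pattern
    to the nearest template (Euclidean distance)."""
    bins = np.unique(train_pos)
    templates = np.stack([train_act[train_pos == b].mean(axis=0) for b in bins])
    dists = ((test_act[:, None, :] - templates[None, :, :]) ** 2).sum(axis=-1)
    return bins[dists.argmin(axis=1)]
```

Because the templates pool over all recorded cells, even cells without classical place fields can contribute to the readout, echoing the abstract's point about non-place cells.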
Collapse
Affiliation(s)
- Fabio Stefanini
- Center for Theoretical Neuroscience, College of Physicians and Surgeons, Columbia University, New York, NY, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
| | - Lyudmila Kushnir
- GNT-LNC, Départment d'Études Cognitives, École Normale Supérieure, INSERM, PSL Research University, 75005 Paris, France
| | - Jessica C Jimenez
- Departments of Neuroscience, Psychiatry, & Pharmacology, Columbia University, New York, NY, USA; Division of Integrative Neuroscience, Department of Psychiatry, New York State Psychiatric Institute, New York, NY, USA
| | - Joshua H Jennings
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - Nicholas I Woods
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA, USA; Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA, USA
| | - Garret D Stuber
- Center for the Neurobiology of Addiction, Pain, and Emotion, Department of Anesthesiology and Pain Medicine, Department of Pharmacology, University of Washington, Seattle, WA 98195, USA
| | - Mazen A Kheirbek
- Neuroscience Graduate Program, University of California, San Francisco, San Francisco, CA, USA; Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA; Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, CA, USA.
| | - René Hen
- Departments of Neuroscience, Psychiatry, & Pharmacology, Columbia University, New York, NY, USA; Division of Integrative Neuroscience, Department of Psychiatry, New York State Psychiatric Institute, New York, NY, USA; Kavli Institute for Brain Sciences, Columbia University, New York, NY, USA.
| | - Stefano Fusi
- Center for Theoretical Neuroscience, College of Physicians and Surgeons, Columbia University, New York, NY, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; Kavli Institute for Brain Sciences, Columbia University, New York, NY, USA.
| |
Collapse
|
27
|
Baladron J, Hamker FH. Habit learning in hierarchical cortex-basal ganglia loops. Eur J Neurosci 2020; 52:4613-4638. [PMID: 32237250 DOI: 10.1111/ejn.14730] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2020] [Revised: 03/21/2020] [Accepted: 03/22/2020] [Indexed: 12/17/2022]
Abstract
How do the multiple cortico-basal ganglia-thalamo-cortical loops interact? Are they parallel and fully independent or controlled by an arbitrator, or are they hierarchically organized? We introduce here a set of four key concepts, integrated and evaluated by means of a neuro-computational model, that bring together current ideas regarding cortex-basal ganglia interactions in the context of habit learning. According to key concept 1, each loop learns to select an intermediate objective at a different abstraction level, moving from goals in the ventral striatum to motor commands in the putamen. Key concept 2 proposes that the cortex integrates the basal ganglia selection with environmental information regarding the achieved objective. Key concept 3 posits shortcuts between loops, and key concept 4 predicts that loops compute their own prediction error signal for learning. Computational benefits of the key concepts are demonstrated. In contrast with former concepts of habit learning, the loops collaborate to select goal-directed actions, while the slower training of shortcuts develops habitual responses.
Collapse
Affiliation(s)
- Javier Baladron
- Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany
| | - Fred H Hamker
- Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany
| |
Collapse
|
28
|
Lynn CW, Kahn AE, Nyema N, Bassett DS. Abstract representations of events arise from mental errors in learning and memory. Nat Commun 2020; 11:2313. [PMID: 32385232 PMCID: PMC7210268 DOI: 10.1038/s41467-020-15146-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2019] [Accepted: 02/13/2020] [Indexed: 11/17/2022] Open
Abstract
Humans are adept at uncovering abstract associations in the world around them, yet the underlying mechanisms remain poorly understood. Intuitively, learning the higher-order structure of statistical relationships should involve complex mental processes. Here we propose an alternative perspective: that higher-order associations instead arise from natural errors in learning and memory. Using the free energy principle, which bridges information theory and Bayesian inference, we derive a maximum entropy model of people's internal representations of the transitions between stimuli. Importantly, our model (i) affords a concise analytic form, (ii) qualitatively explains the effects of transition network structure on human expectations, and (iii) quantitatively predicts human reaction times in probabilistic sequential motor tasks. Together, these results suggest that mental errors influence our abstract representations of the world in significant and predictable ways, with direct implications for the study and design of optimally learnable information sources.
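The model's analytic form, an internal transition estimate built from discount-weighted powers of the true transition matrix, can be sketched numerically. The truncation depth and row normalization here are my simplifications of the closed-form expression:

```python
import numpy as np

def internal_transition_estimate(A, beta, k_max=50):
    """Internal estimate of transition structure under memory errors:
    a discount-weighted average of powers of the true transition
    matrix, A_hat proportional to sum_k exp(-beta * k) * A^(k+1).
    Large beta -> near-veridical one-step structure; small beta ->
    blurred estimate that mixes in higher-order (e.g., community)
    structure."""
    A = np.asarray(A, dtype=float)
    A_hat = np.zeros_like(A)
    for k in range(k_max):
        A_hat += np.exp(-beta * k) * np.linalg.matrix_power(A, k + 1)
    return A_hat / A_hat.sum(axis=1, keepdims=True)
```

As beta shrinks, the estimate increasingly reflects multi-step reachability rather than one-step transitions, which is how mental errors can give rise to abstract, higher-order associations.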
Collapse
Affiliation(s)
- Christopher W Lynn
- Department of Physics & Astronomy, College of Arts & Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Ari E Kahn
- Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Nathaniel Nyema
- Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Danielle S Bassett
- Department of Physics & Astronomy, College of Arts & Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Department of Electrical & Systems Engineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
- Santa Fe Institute, Santa Fe, NM, 87501, USA.
| |
Collapse
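The closed analytic form this abstract mentions can be sketched numerically. Under a geometric memory-error distribution with parameter η (my notation; the weights and the toy ring network below are illustrative assumptions, not taken from the paper), the learner's expected internal transition estimate works out to a discounted sum over powers of the true transition matrix:

```python
import numpy as np

def learned_representation(A, eta):
    """Expected internal transition estimate when memory of when each
    transition occurred decays geometrically (weight (1-eta)*eta**dt for
    a transition dt steps in the past):
    A_hat = (1-eta) * sum_t eta**t A**(t+1) = (1-eta) * A @ inv(I - eta*A)."""
    n = A.shape[0]
    return (1 - eta) * A @ np.linalg.inv(np.eye(n) - eta * A)

# Random walk on a 4-cycle: each node steps to either neighbor.
A = np.zeros((4, 4))
for i in range(4):
    A[i, (i - 1) % 4] = A[i, (i + 1) % 4] = 0.5

A_hat = learned_representation(A, eta=0.3)
# A_hat remains a stochastic matrix but mixes in higher powers of A,
# so higher-order (multi-step) relationships gain probability mass.
print(A_hat.round(3))
```

The qualitative point is visible directly: pairs of states with no direct edge acquire nonzero estimated transition probability, which is how mental errors produce sensitivity to higher-order network structure.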
|
29
|
Bornstein AM, Pickard H. "Chasing the first high": memory sampling in drug choice. Neuropsychopharmacology 2020; 45:907-915. [PMID: 31896119 PMCID: PMC7162911 DOI: 10.1038/s41386-019-0594-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/04/2019] [Revised: 11/21/2019] [Accepted: 12/16/2019] [Indexed: 02/02/2023]
Abstract
Although vivid memories of drug experiences are prevalent within clinical contexts and addiction folklore ("chasing the first high"), little is known about the relevance of cognitive processes governing memory retrieval to substance use disorder. Drawing on recent work that identifies episodic memory's influence on decisions for reward, we propose a framework in which drug choices are biased by selective sampling of individual memories during two phases of addiction: (i) downward spiral into persistent use and (ii) relapse. Consideration of how memory retrieval influences the addiction process suggests novel treatment strategies. Rather than try to break learned associations between drug cues and drug rewards, treatment should aim to strengthen existing and/or create new associations between drug cues and drug-inconsistent rewards.
Collapse
Affiliation(s)
- Aaron M Bornstein
- Department of Cognitive Sciences, University of California, Irvine, CA, 92617, USA.
- Center for the Neurobiology of Learning and Memory, University of California, Irvine, CA, 92697, USA.
- Institute for Mathematical Behavioral Sciences, University of California, Irvine, CA, 92697, USA.
| | - Hanna Pickard
- Department of Philosophy, Johns Hopkins University, Baltimore, MD, 21218, USA.
- Berman Institute of Bioethics, Johns Hopkins University, Baltimore, MD, 21205, USA.
| |
Collapse
|
30
|
Momennejad I. Learning Structures: Predictive Representations, Replay, and Generalization. Curr Opin Behav Sci 2020; 32:155-166. [DOI: 10.1016/j.cobeha.2020.02.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
|
31
|
Bouchacourt F, Palminteri S, Koechlin E, Ostojic S. Temporal chunking as a mechanism for unsupervised learning of task-sets. eLife 2020; 9:50469. [PMID: 32149602 PMCID: PMC7108869 DOI: 10.7554/elife.50469] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2019] [Accepted: 02/24/2020] [Indexed: 12/26/2022] Open
Abstract
Depending on environmental demands, humans can learn and exploit multiple concurrent sets of stimulus-response associations. Mechanisms underlying the learning of such task-sets remain unknown. Here we investigate the hypothesis that task-set learning relies on unsupervised chunking of stimulus-response associations that occur in temporal proximity. We examine behavioral and neural data from a task-set learning experiment using a network model. We first show that task-set learning can be achieved provided the timescale of chunking is slower than the timescale of stimulus-response learning. Fitting the model to behavioral data on a subject-by-subject basis confirmed this expectation and led to specific predictions linking chunking and task-set retrieval that were borne out by behavioral performance and reaction times. Comparing the model activity with BOLD signal allowed us to identify neural correlates of task-set retrieval in a functional network involving ventral and dorsal prefrontal cortex, with the dorsal system preferentially engaged when retrievals are used to improve performance.
Collapse
Affiliation(s)
- Flora Bouchacourt
- Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Sante et de la Recherche Medicale, Paris, France
- Departement d'Etudes Cognitives, Ecole Normale Superieure, Paris, France
| | - Stefano Palminteri
- Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Sante et de la Recherche Medicale, Paris, France
- Departement d'Etudes Cognitives, Ecole Normale Superieure, Paris, France
- Institut d'Etudes de la Cognition, Universite de Recherche Paris Sciences et Lettres, Paris, France
| | - Etienne Koechlin
- Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Sante et de la Recherche Medicale, Paris, France
- Departement d'Etudes Cognitives, Ecole Normale Superieure, Paris, France
| | - Srdjan Ostojic
- Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Sante et de la Recherche Medicale, Paris, France
- Departement d'Etudes Cognitives, Ecole Normale Superieure, Paris, France
- Institut d'Etudes de la Cognition, Universite de Recherche Paris Sciences et Lettres, Paris, France
| |
Collapse
|
32
|
Abstract
The capacity to search memory for events learned in a particular context stands as one of the most remarkable feats of the human brain. How is memory search accomplished? First, I review the central ideas investigated by theorists developing models of memory. Then, I review select benchmark findings concerning memory search and analyze two influential computational approaches to modeling memory search: dual-store theory and retrieved context theory. Finally, I discuss the key theoretical ideas that have emerged from these modeling studies and the open questions that need to be answered by future research.
Collapse
Affiliation(s)
- Michael J Kahana
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| |
Collapse
|
33
|
|
34
|
Tiganj Z, Gershman SJ, Sederberg PB, Howard MW. Estimating Scale-Invariant Future in Continuous Time. Neural Comput 2019; 31:681-709. [PMID: 30764739 PMCID: PMC6959535 DOI: 10.1162/neco_a_01171] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Natural learners must compute an estimate of future outcomes that follow from a stimulus in continuous time. Widely used reinforcement learning algorithms discretize continuous time and estimate either transition functions from one step to the next (model-based algorithms) or a scalar value of exponentially discounted future reward using the Bellman equation (model-free algorithms). An important drawback of model-based algorithms is that computational cost grows linearly with the amount of time to be simulated. An important drawback of model-free algorithms is the need to select a timescale required for exponential discounting. We present a computational mechanism, developed based on work in psychology and neuroscience, for computing a scale-invariant timeline of future outcomes. This mechanism efficiently computes an estimate of inputs as a function of future time on a logarithmically compressed scale and can be used to generate a scale-invariant power-law-discounted estimate of expected future reward. The representation of future time retains information about what will happen when. The entire timeline can be constructed in a single parallel operation that generates concrete behavioral and neural predictions. This computational mechanism could be incorporated into future reinforcement learning algorithms.
Collapse
Affiliation(s)
- Zoran Tiganj
- Center for Memory and Brain, Department of Psychological and Brain Sciences, Boston, MA 02215, U.S.A.
| | - Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, U.S.A.
| | - Per B Sederberg
- Department of Psychology, University of Virginia, Charlottesville, VA, 22904, U.S.A.
| | - Marc W Howard
- Center for Memory and Brain, Department of Psychological and Brain Sciences, Boston, MA 02215, U.S.A.
| |
Collapse
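The scale-invariant, power-law discounting this mechanism produces can be illustrated with a toy calculation: a logarithmically spaced bank of exponential decay rates, weighted appropriately, sums to an approximately power-law discount. This is only a numerical sketch of that general principle (rates, weights, and grid are my own choices), not the paper's Laplace-transform construction:

```python
import numpy as np

# Bank of exponential decay rates, logarithmically spaced, as in
# log-compressed timeline models (constants chosen for illustration).
s = np.logspace(-3, 2, 2000)          # decay rates
dlog_s = np.diff(np.log(s)).mean()    # uniform log-spacing

def discount(t, alpha=1.0):
    """Riemann sum of s**alpha * exp(-s*t) over d(log s),
    which approximates Gamma(alpha) * t**(-alpha)."""
    return np.sum(s**alpha * np.exp(-s * t)) * dlog_s

# Doubling the horizon roughly halves the weight at every scale --
# the signature of scale-invariant (1/t) discounting.
for t in (1.0, 2.0, 4.0, 8.0):
    print(t, discount(t))
```

An exponential discount, by contrast, would fall by a horizon-dependent factor, which is why a single timescale must be hand-picked in standard model-free algorithms.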
|
35
|
Jung MW, Lee H, Jeong Y, Lee JW, Lee I. Remembering rewarding futures: A simulation-selection model of the hippocampus. Hippocampus 2018; 28:913-930. [PMID: 30155938 PMCID: PMC6587829 DOI: 10.1002/hipo.23023] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2018] [Revised: 07/06/2018] [Accepted: 08/23/2018] [Indexed: 02/06/2023]
Abstract
Despite tremendous progress, the neural circuit dynamics underlying hippocampal mnemonic processing remain poorly understood. We propose a new model for hippocampal function-the simulation-selection model-based on recent experimental findings and neuroecological considerations. Under this model, the mammalian hippocampus evolved to simulate and evaluate arbitrary navigation sequences. Specifically, we suggest that CA3 simulates unexperienced navigation sequences in addition to remembering experienced ones, and CA1 selects from among these CA3-generated sequences, reinforcing those that are likely to maximize reward during offline idling states. High-value sequences reinforced in CA1 may allow flexible navigation toward a potential rewarding location during subsequent navigation. We argue that the simulation-selection functions of the hippocampus have evolved in mammals mostly because of the unique navigational needs of land mammals. Our model may account for why the mammalian hippocampus has evolved not only to remember, but also to imagine episodes, and how this might be implemented in its neural circuits.
Collapse
Affiliation(s)
- Min Whan Jung
- Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon, South Korea
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
| | - Hyunjung Lee
- Department of Anatomy, Kyungpook National University School of Medicine, Daegu, South Korea
| | - Yeongseok Jeong
- Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon, South Korea
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
| | - Jong Won Lee
- Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon, South Korea
| | - Inah Lee
- Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
| |
Collapse
|
36
|
Gardner MPH, Schoenbaum G, Gershman SJ. Rethinking dopamine as generalized prediction error. Proc Biol Sci 2018; 285:20181645. [PMID: 30464063 PMCID: PMC6253385 DOI: 10.1098/rspb.2018.1645] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2018] [Accepted: 11/02/2018] [Indexed: 01/10/2023] Open
Abstract
Midbrain dopamine neurons are commonly thought to report a reward prediction error (RPE), as hypothesized by reinforcement learning (RL) theory. While this theory has been highly successful, several lines of evidence suggest that dopamine activity also encodes sensory prediction errors unrelated to reward. Here, we develop a new theory of dopamine function that embraces a broader conceptualization of prediction errors. By signalling errors in both sensory and reward predictions, dopamine supports a form of RL that lies between model-based and model-free algorithms. This account remains consistent with current canon regarding the correspondence between dopamine transients and RPEs, while also accounting for new data suggesting a role for these signals in phenomena such as sensory preconditioning and identity unblocking, which ostensibly draw upon knowledge beyond reward predictions.
Collapse
Affiliation(s)
- Matthew P H Gardner
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Bethesda, MD, USA
| | - Geoffrey Schoenbaum
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Bethesda, MD, USA
- Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA
| |
Collapse
|
37
|
Braun EK, Wimmer GE, Shohamy D. Retroactive and graded prioritization of memory by reward. Nat Commun 2018; 9:4886. [PMID: 30459310 PMCID: PMC6244210 DOI: 10.1038/s41467-018-07280-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2017] [Accepted: 10/24/2018] [Indexed: 11/08/2022] Open
Abstract
Many decisions are based on an internal model of the world. Yet, how such a model is constructed from experience and represented in memory remains unknown. We test the hypothesis that reward shapes memory for sequences of events by retroactively prioritizing memory for objects as a function of their distance from reward. Human participants encountered neutral objects while exploring a series of mazes for reward. Across six data sets, we find that reward systematically modulates memory for neutral objects, retroactively prioritizing memory for objects closest to the reward. This effect of reward on memory emerges only after a 24-hour delay and is stronger for mazes followed by a longer rest interval, suggesting a role for post-reward replay and overnight consolidation, as predicted by neurobiological data in animals. These findings demonstrate that reward retroactively prioritizes memory along a sequential gradient, consistent with the role of memory in supporting adaptive decision-making.
Collapse
Affiliation(s)
- Erin Kendall Braun
- Department of Psychology, Columbia University, 406 Schermerhorn Hall, 1190 Amsterdam Ave MC 5501, New York, NY, 10027, USA.
| | - G Elliott Wimmer
- Max Planck University College London Centre for Computational Psychiatry and Ageing Research and Wellcome Centre for Human Neuroimaging, University College London, London, WC1B 5EH, UK
| | - Daphna Shohamy
- Department of Psychology, Columbia University, 406 Schermerhorn Hall, 1190 Amsterdam Ave MC 5501, New York, NY, 10027, USA
- Zuckerman Mind Brain Behavior Institute and Kavli Institute for Brain Science, Columbia University, 3327 Broadway, New York, NY, 10027, USA
| |
Collapse
|
38
|
Gershman SJ. The Successor Representation: Its Computational Logic and Neural Substrates. J Neurosci 2018; 38:7193-7200. [PMID: 30006364 PMCID: PMC6096039 DOI: 10.1523/jneurosci.0151-18.2018] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2018] [Revised: 06/28/2018] [Accepted: 07/05/2018] [Indexed: 01/19/2023] Open
Abstract
Reinforcement learning is the process by which an agent learns to predict long-term future reward. We now understand a great deal about the brain's reinforcement learning algorithms, but we know considerably less about the representations of states and actions over which these algorithms operate. A useful starting point is asking what kinds of representations we would want the brain to have, given the constraints on its computational architecture. Following this logic leads to the idea of the successor representation, which encodes states of the environment in terms of their predictive relationships with other states. Recent behavioral and neural studies have provided evidence for the successor representation, and computational studies have explored ways to extend the original idea. This paper reviews progress on these fronts, organizing them within a broader framework for understanding how the brain negotiates tradeoffs between efficiency and flexibility for reinforcement learning.
Collapse
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138
| |
Collapse
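The successor representation reviewed here has a simple incremental form: each state's row of the matrix M(s, s') — expected discounted future occupancy of s' from s — can be learned by temporal-difference updates and converges to the analytic solution (I − γT)⁻¹. A minimal sketch on an invented environment (the ring, learning rate, and discount are my assumptions, not from the review):

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma, alpha = 5, 0.9, 0.05

# Environment: uniform random walk on a 5-state ring.
T = np.zeros((n, n))
for i in range(n):
    T[i, (i - 1) % n] = T[i, (i + 1) % n] = 0.5

# TD(0) learning of the successor representation:
# M[s] <- M[s] + alpha * (onehot(s) + gamma * M[s'] - M[s]).
M = np.zeros((n, n))
s = 0
for _ in range(200_000):
    s_next = (s + rng.choice([-1, 1])) % n
    M[s] += alpha * (np.eye(n)[s] + gamma * M[s_next] - M[s])
    s = s_next

# Analytic fixed point: M = (I - gamma * T)^-1.
M_analytic = np.linalg.inv(np.eye(n) - gamma * T)
print(np.abs(M - M_analytic).max())
```

Because M stores predictions rather than values, cached long-run structure can be reused for any reward function — the efficiency/flexibility tradeoff the review organizes.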
|
39
|
Sadacca BF, Wied HM, Lopatina N, Saini GK, Nemirovsky D, Schoenbaum G. Orbitofrontal neurons signal sensory associations underlying model-based inference in a sensory preconditioning task. eLife 2018; 7:e30373. [PMID: 29513220 PMCID: PMC5847331 DOI: 10.7554/elife.30373] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2017] [Accepted: 03/02/2018] [Indexed: 11/20/2022] Open
Abstract
Using knowledge of the structure of the world to infer value is at the heart of model-based reasoning and relies on a circuit that includes the orbitofrontal cortex (OFC). Some accounts link this to the representation of biological significance or value by neurons in OFC, while other models focus on the representation of associative structure or cognitive maps. Here we tested between these accounts by recording OFC neurons in rats during an OFC-dependent sensory preconditioning task. We found that while OFC neurons were strongly driven by biological significance or reward predictions at the end of training, they also showed clear evidence of acquiring the incidental stimulus-stimulus pairings in the preconditioning phase, prior to reward training. These results support a role for OFC in representing associative structure, independent of value.
Collapse
Affiliation(s)
- Brian F Sadacca
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Baltimore, United States
| | - Heather M Wied
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Baltimore, United States
| | - Nina Lopatina
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Baltimore, United States
| | - Gurpreet K Saini
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Baltimore, United States
| | - Daniel Nemirovsky
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Baltimore, United States
| | - Geoffrey Schoenbaum
- Intramural Research Program of the National Institute on Drug Abuse, NIH, Baltimore, United States
- Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, United States
- Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, United States
| |
Collapse
|
40
|
Fakhari P, Khodadadi A, Busemeyer JR. The detour problem in a stochastic environment: Tolman revisited. Cogn Psychol 2018; 101:29-49. [PMID: 29294373 DOI: 10.1016/j.cogpsych.2017.12.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2017] [Revised: 12/22/2017] [Accepted: 12/23/2017] [Indexed: 10/18/2022]
Abstract
We designed a grid-world task to study human planning and re-planning behavior in an unknown stochastic environment. In our grid world, participants were asked to travel from a random starting point to a random goal position while maximizing their reward. Because they were not familiar with the environment, they needed to learn its characteristics from experience to plan optimally. Later in the task, we randomly blocked the optimal path to investigate whether and how people adjust their original plans to find a detour. To this end, we developed and compared 12 different models, which differed in how they learned and represented the environment and how they planned to reach the goal. The majority of our participants were able to plan optimally. We also showed that people were capable of revising their plans when an unexpected event occurred. The model comparison showed that the model-based reinforcement learning approach provided the best account of the data, outperforming heuristics in explaining behavior on the re-planning trials.
Collapse
Affiliation(s)
- Pegah Fakhari
- Indiana University, Department of Psychological and Brain Sciences, Bloomington, IN, United States.
| | - Arash Khodadadi
- Indiana University, Department of Psychological and Brain Sciences, Bloomington, IN, United States
| | - Jerome R Busemeyer
- Indiana University, Department of Psychological and Brain Sciences, Bloomington, IN, United States
| |
Collapse
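The kind of model-based planning and re-planning favored by this comparison can be sketched with value iteration on a small deterministic grid. This is a generic illustration on an invented maze (grid size, wall placement, and step-cost scheme are mine), not a reconstruction of any of the authors' 12 models:

```python
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def neighbors(s, V):
    """Adjacent cells that exist and are not blocked."""
    return [(s[0] + dx, s[1] + dy) for dx, dy in MOVES
            if (s[0] + dx, s[1] + dy) in V]

def plan(size, start, goal, blocked, max_steps=50):
    """Value-iterate a deterministic grid (cost 1 per step), then follow
    the greedy policy from start; returns the number of steps to goal."""
    INF = float("inf")
    V = {(x, y): INF for x in range(size) for y in range(size)
         if (x, y) not in blocked}
    V[goal] = 0.0
    for _ in range(size * size):            # enough sweeps to converge
        for s in V:
            if s != goal:
                V[s] = 1 + min((V[n] for n in neighbors(s, V)), default=INF)
    s, steps = start, 0
    while s != goal and steps < max_steps:
        s = min(neighbors(s, V), key=V.get)  # greedy descent on V
        steps += 1
    return steps

# Original plan: straight across a 4x4 grid, 3 steps.
print(plan(4, (0, 0), (3, 0), blocked=set()))
# Re-planning: a wall blocks the direct path (gap at the top), so the
# planner re-derives a 9-step detour from the same model.
print(plan(4, (0, 0), (3, 0), blocked={(1, 0), (1, 1), (1, 2)}))
```

The detour falls out of re-running value iteration on the updated model — the hallmark that distinguishes model-based re-planning from cached heuristics.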
|
41
|
Stachenfeld KL, Botvinick MM, Gershman SJ. The hippocampus as a predictive map. Nat Neurosci 2017; 20:1643-1653. [PMID: 28967910 DOI: 10.1038/nn.4650] [Citation(s) in RCA: 369] [Impact Index Per Article: 52.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2016] [Accepted: 08/29/2017] [Indexed: 12/19/2022]
Abstract
A cognitive map has long been the dominant metaphor for hippocampal function, embracing the idea that place cells encode a geometric representation of space. However, evidence for predictive coding, reward sensitivity and policy dependence in place cells suggests that the representation is not purely spatial. We approach this puzzle from a reinforcement learning perspective: what kind of spatial representation is most useful for maximizing future reward? We show that the answer takes the form of a predictive representation. This representation captures many aspects of place cell responses that fall outside the traditional view of a cognitive map. Furthermore, we argue that entorhinal grid cells encode a low-dimensionality basis set for the predictive representation, useful for suppressing noise in predictions and extracting multiscale structure for hierarchical planning.
Collapse
Affiliation(s)
- Kimberly L Stachenfeld
- DeepMind, London, UK
- Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA
| | - Matthew M Botvinick
- DeepMind, London, UK
- Gatsby Computational Neuroscience Unit, University College London, London, UK
| | - Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts, USA
| |
Collapse
|
42
|
Abstract
Rational analyses of memory suggest that retrievability of past experience depends on its usefulness for predicting the future: memory is adapted to the temporal structure of the environment. Recent research has enriched this view by applying it to semantic memory and reinforcement learning. This paper describes how multiple forms of memory can be linked via common predictive principles, possibly subserved by a shared neural substrate in the hippocampus. Predictive principles offer an explanation for a wide range of behavioral and neural phenomena, including semantic fluency, temporal contiguity effects in episodic memory, and the topological properties of hippocampal place cells.
Collapse
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University
| |
Collapse
|
43
|
Gravina MT, Sederberg PB. The neural architecture of prediction over a continuum of spatiotemporal scales. Curr Opin Behav Sci 2017. [DOI: 10.1016/j.cobeha.2017.09.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
|
44
|
Momennejad I, Russek EM, Cheong JH, Botvinick MM, Daw ND, Gershman SJ. The successor representation in human reinforcement learning. Nat Hum Behav 2017; 1:680-692. [PMID: 31024137 PMCID: PMC6941356 DOI: 10.1038/s41562-017-0180-8] [Citation(s) in RCA: 145] [Impact Index Per Article: 20.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2016] [Accepted: 07/07/2017] [Indexed: 11/08/2022]
Abstract
Theories of reward learning in neuroscience have focused on two families of algorithms thought to capture deliberative versus habitual choice. 'Model-based' algorithms compute the value of candidate actions from scratch, whereas 'model-free' algorithms make choice more efficient but less flexible by storing pre-computed action values. We examine an intermediate algorithmic family, the successor representation, which balances flexibility and efficiency by storing partially computed action values: predictions about future events. These pre-computation strategies differ in how they update their choices following changes in a task. The successor representation's reliance on stored predictions about future states predicts a unique signature of insensitivity to changes in the task's sequence of events, but flexible adjustment following changes to rewards. We provide evidence for such differential sensitivity in two behavioural studies with humans. These results suggest that the successor representation is a computational substrate for semi-flexible choice in humans, introducing a subtler, more cognitive notion of habit.
Collapse
Affiliation(s)
- I Momennejad
- Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
| | - E M Russek
- Center for Neural Science, New York University, New York, NY, USA
| | - J H Cheong
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
| | - M M Botvinick
- DeepMind and Gatsby Computational Neuroscience Unit, University College London, London, UK
| | - N D Daw
- Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
| | - S J Gershman
- Department of Psychology, Center for Brain Science, Harvard University, Cambridge, MA, USA
| |
Collapse
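The "unique signature" this abstract describes — insensitivity to transition changes but flexible adjustment to reward changes — follows directly from caching predictions. A toy illustration on a three-state chain of my own construction (not the authors' task):

```python
import numpy as np

gamma = 0.9

def sr(T):
    """Analytic successor representation for transition matrix T."""
    return np.linalg.inv(np.eye(len(T)) - gamma * T)

# Chain 0 -> 1 -> 2, with state 2 absorbing and rewarded.
T = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 1.]])
r = np.array([0., 0., 1.])

M = sr(T)              # predictions cached during learning
V = M @ r              # state values from cached predictions

# Reward revaluation: a new r propagates instantly through cached M.
V_reward = M @ np.array([0., 0., 5.])

# Transition revaluation: the world now routes 0 -> 2 directly, but the
# cached M still reflects the old transitions until it is relearned.
T_new = T.copy()
T_new[0] = [0., 0., 1.]
V_stale = M @ r                 # unchanged: insensitive to the new map
V_true = sr(T_new) @ r          # what full relearning would give
print(V_reward[0], V_stale[0], V_true[0])
```

Values respond immediately to the reward change but lag behind the transition change — exactly the behavioral asymmetry the two studies test for.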
|
45
|
Russek EM, Momennejad I, Botvinick MM, Gershman SJ, Daw ND. Predictive representations can link model-based reinforcement learning to model-free mechanisms. PLoS Comput Biol 2017; 13:e1005768. [PMID: 28945743 PMCID: PMC5628940 DOI: 10.1371/journal.pcbi.1005768] [Citation(s) in RCA: 127] [Impact Index Per Article: 18.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2016] [Revised: 10/05/2017] [Accepted: 09/04/2017] [Indexed: 11/19/2022] Open
Abstract
Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that neural circuits supporting model-based behavior are structurally homologous to and overlapping with those thought to carry out model-free temporal difference (TD) learning. Here, we lay out a family of approaches by which model-based computation may be built upon a core of TD learning. The foundation of this framework is the successor representation, a predictive state representation that, when combined with TD learning of value predictions, can produce a subset of the behaviors associated with model-based learning, while requiring less decision-time computation than dynamic programming. Using simulations, we delineate the precise behavioral capabilities enabled by evaluating actions using this approach, and compare them to those demonstrated by biological organisms. We then introduce two new algorithms that build upon the successor representation while progressively mitigating its limitations. Because this framework can account for the full range of observed putatively model-based behaviors while still utilizing a core TD framework, we suggest that it represents a neurally plausible family of mechanisms for model-based evaluation.
Collapse
Affiliation(s)
- Evan M. Russek
- Center for Neural Science, New York University, New York, NY, United States of America
| | - Ida Momennejad
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, United States of America
| | - Matthew M. Botvinick
- DeepMind, London, United Kingdom and Gatsby Computational Neuroscience Unit, University College London, United Kingdom
| | - Samuel J. Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, United States of America
| | - Nathaniel D. Daw
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, United States of America
| |
Collapse
|
46
|
Tartaglia EM, Clarke AM, Herzog MH. What to Choose Next? A Paradigm for Testing Human Sequential Decision Making. Front Psychol 2017; 8:312. [PMID: 28326050 PMCID: PMC5339299 DOI: 10.3389/fpsyg.2017.00312] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2016] [Accepted: 02/20/2017] [Indexed: 11/13/2022] Open
Abstract
Many of the decisions we make in our everyday lives are sequential and entail sparse rewards. While sequential decision-making has been extensively investigated in theory (e.g., by reinforcement learning models), there is no systematic experimental paradigm to test it. Here, we developed such a paradigm and investigated key components of reinforcement learning models: the eligibility trace (i.e., the memory trace of previous decision steps), the external reward, and the ability to exploit the statistics of the environment's structure (model-free vs. model-based mechanisms). We show that the eligibility trace decays not with sheer time, but rather with the number of discrete decision steps made by the participants. We further show that, unexpectedly, neither monetary rewards nor the environment's spatial regularity significantly modulate behavioral performance. Finally, we found that model-free learning algorithms describe human performance better than model-based algorithms.
Collapse
Affiliation(s)
- Elisa M. Tartaglia
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Aging in Vision and Action Lab, Sorbonne Universités, UPMC Univ Paris 06, INSERM, CNRS, Institut de la Vision, Paris, France
| | - Aaron M. Clarke
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Psychology Department and Neuroscience Department, Aysel Sabuncu Brain Research Center, Bilkent University, Ankara, Turkey
| | - Michael H. Herzog
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| |
Collapse
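The study's central finding — an eligibility trace that decays with discrete decision steps rather than clock time — corresponds to a trace that fades by a constant factor at every choice, regardless of how long the choice took. A minimal sketch (the decay constant and state labels are illustrative, not fitted values from the paper):

```python
# Eligibility trace that decays per decision step, not per unit of time.
def decay_trace(trace, decay=0.8):
    """One decision step: every stored trace fades by a constant factor."""
    return {state: value * decay for state, value in trace.items()}

trace = {}
for state in ["A", "B", "A", "C"]:
    trace = decay_trace(trace)
    trace[state] = trace.get(state, 0.0) + 1.0   # mark the visited state

# "A" was visited 3 steps ago and 1 step ago, so its trace is
# 0.8**3 + 0.8**1: recency in steps, not seconds, determines credit.
print(trace)
```

A reward delivered now would be assigned to states in proportion to these trace values, so states visited fewer decision steps ago receive more credit, however much wall-clock time elapsed between choices.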
|
47
|
Kaplan R, King J, Koster R, Penny WD, Burgess N, Friston KJ. The Neural Representation of Prospective Choice during Spatial Planning and Decisions. PLoS Biol 2017; 15:e1002588. [PMID: 28081125 PMCID: PMC5231323 DOI: 10.1371/journal.pbio.1002588] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2016] [Accepted: 12/14/2016] [Indexed: 01/17/2023] Open
Abstract
We are remarkably adept at inferring the consequences of our actions, yet the neuronal mechanisms that allow us to plan a sequence of novel choices remain unclear. We used functional magnetic resonance imaging (fMRI) to investigate how the human brain plans the shortest path to a goal in novel mazes with one (shallow maze) or two (deep maze) choice points. We observed two distinct anterior prefrontal responses to demanding choices at the second choice point: one in rostrodorsal medial prefrontal cortex (rd-mPFC)/superior frontal gyrus (SFG) that was also sensitive to (deactivated by) demanding initial choices and another in lateral frontopolar cortex (lFPC), which was only engaged by demanding choices at the second choice point. Furthermore, we identified hippocampal responses during planning that correlated with subsequent choice accuracy and response time, particularly in mazes affording sequential choices. Psychophysiological interaction (PPI) analyses showed that coupling between the hippocampus and rd-mPFC increases during sequential (deep versus shallow) planning and is higher before correct versus incorrect choices. In short, using a naturalistic spatial planning paradigm, we reveal how the human brain represents sequential choices during planning without extensive training. Our data highlight a network centred on the cortical midline and hippocampus that allows us to make prospective choices while maintaining initial choices during planning in novel environments. Using neuroimaging and computational modelling, this study explains how the human brain represents initial versus subsequent choices during spatial planning in novel environments. We are remarkably adept at inferring the consequences of our actions, even in novel situations. However, the neuronal mechanisms that allow us to plan a sequence of novel choices remain a mystery. 
One hypothesis is that anterior prefrontal brain regions can jump ahead from an initial decision to evaluate subsequent choices. Here, we examine how the brain represents initial versus subsequent choices of varying difficulty during spatial planning in novel environments. Specifically, participants visually searched for the shortest path to a goal in pictures of novel mazes that contained one or two path junctions. We monitored the participants’ brain activity during the task with functional magnetic resonance imaging (fMRI). We observed, in the anterior prefrontal brain, two distinct responses to demanding choices at the second junction: one in the rostrodorsal medial prefrontal cortex (rd-mPFC), which also signalled less demanding initial choices, and another one in the lateral frontopolar cortex (lFPC), which was only engaged by demanding choices at the second junction. Notably, interactions of the rd-mPFC with the hippocampus, a region associated with memory, increased when planning required extensive deliberation and particularly when planning led to accurate choices. Our findings show how humans can rapidly formulate a plan in novel environments. More broadly, these data uncover potential neural mechanisms underlying how we make inferences about states beyond a current subjective state.
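The planning problem in this task (finding the shortest path to a goal through a maze with one or two junctions) can be sketched as breadth-first search over a small graph. The maze below is a made-up illustration of a "deep" two-junction layout, not one of the study's stimuli:

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search: returns the shortest path from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

# A hypothetical "deep" maze with two sequential choice points (J1, J2):
# the wrong branch at either junction leads to a dead end.
maze = {
    "start": ["J1"],
    "J1": ["dead_end_1", "J2"],    # first choice point
    "J2": ["dead_end_2", "goal"],  # second choice point
    "dead_end_1": [],
    "dead_end_2": [],
    "goal": [],
}

print(shortest_path(maze, "start", "goal"))
# ['start', 'J1', 'J2', 'goal']
```

A correct plan here requires committing to a branch at J1 while still evaluating the choice waiting at J2, which is the sequential structure the deep-maze condition was designed to probe.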
Collapse
Affiliation(s)
- Raphael Kaplan
- Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, University College London, London, United Kingdom
| | - John King
- UCL Institute of Cognitive Neuroscience, University College London, London, United Kingdom
- Clinical, Education and Health Psychology, University College London, London, United Kingdom
| | - Raphael Koster
- Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, University College London, London, United Kingdom
- UCL Institute of Cognitive Neuroscience, University College London, London, United Kingdom
| | - William D. Penny
- Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, University College London, London, United Kingdom
| | - Neil Burgess
- Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, University College London, London, United Kingdom
- UCL Institute of Cognitive Neuroscience, University College London, London, United Kingdom
- UCL Institute of Neurology, University College London, London, United Kingdom
| | - Karl J. Friston
- Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, University College London, London, United Kingdom
| |
Collapse
|
48
|
Abstract
To many, the poster child for David Marr's famous three levels of scientific inquiry is reinforcement learning-a computational theory of reward optimization, which readily prescribes algorithmic solutions that evidence striking resemblance to signals found in the brain, suggesting a straightforward neural implementation. Here we review questions that remain open at each level of analysis, concluding that the path forward to their resolution calls for inspiration across levels, rather than a focus on mutual constraints.
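The algorithmic level alluded to above is typically temporal-difference (TD) learning, whose prediction-error term is the signal said to resemble phasic dopamine responses. A minimal sketch (the learning rate, discount factor, and single-cue setup are illustrative, not drawn from any dataset):

```python
def td_update(V, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) update; delta is the reward prediction error."""
    delta = reward + gamma * V.get(next_state, 0.0) - V[state]
    V[state] += alpha * delta
    return delta

V = {"cue": 0.0}  # value estimate for a reward-predicting cue
for _ in range(200):
    delta = td_update(V, "cue", reward=1.0, next_state="end")

# As learning proceeds, V["cue"] approaches the true reward (1.0) and the
# prediction error shrinks, mirroring how dopamine responses to a fully
# predicted reward diminish with training.
```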
Collapse
Affiliation(s)
- Yael Niv
- Psychology Department & Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, 08540
| | - Angela Langdon
- Psychology Department & Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, 08540
| |
Collapse
|
49
|
Marblestone AH, Wayne G, Kording KP. Toward an Integration of Deep Learning and Neuroscience. Front Comput Neurosci 2016; 10:94. [PMID: 27683554 PMCID: PMC5021692 DOI: 10.3389/fncom.2016.00094] [Citation(s) in RCA: 243] [Impact Index Per Article: 30.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2016] [Accepted: 08/24/2016] [Indexed: 01/22/2023] Open
Abstract
Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) the cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. In support of these hypotheses, we argue that a range of implementations of credit assignment through multiple layers of neurons are compatible with our current knowledge of neural circuitry, and that the brain's specialized systems can be interpreted as enabling efficient optimization for specific problem classes. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
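The core ingredients of the hypotheses above, namely a cost function and credit assignment through multiple layers, can be illustrated with a toy two-layer network trained by explicit backpropagation. Everything here (data, architecture, learning rate) is invented for illustration and is not a model proposed in the review:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem: inputs X, smooth nonlinear target y.
X = rng.normal(size=(32, 3))
y = np.sin(X.sum(axis=1, keepdims=True))

W1 = rng.normal(scale=0.5, size=(3, 8))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(8, 1))   # hidden -> output weights

losses = []
lr = 0.05
for _ in range(500):
    # Forward pass.
    h = np.tanh(X @ W1)
    pred = h @ W2
    err = pred - y                           # cost: mean squared error
    losses.append(float((err ** 2).mean()))
    # Backward pass: the chain rule assigns credit for the output error
    # first to W2, then (through the tanh nonlinearity) to W1.
    grad_W2 = h.T @ err / len(X)
    grad_h = err @ W2.T * (1 - h ** 2)
    grad_W1 = X.T @ grad_h / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

# The cost falls as multi-layer credit assignment proceeds.
print(losses[0], losses[-1])
```

Whether and how neural circuits could implement something functionally equivalent to this backward pass is exactly the question the hypotheses above address.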
Collapse
Affiliation(s)
- Adam H. Marblestone
- Synthetic Neurobiology Group, Massachusetts Institute of Technology, Media Lab, Cambridge, MA, USA
| | | | - Konrad P. Kording
- Rehabilitation Institute of Chicago, Northwestern University, Chicago, IL, USA
| |
Collapse
|
50
|
Gaze data reveal distinct choice processes underlying model-based and model-free reinforcement learning. Nat Commun 2016; 7:12438. [PMID: 27511383 PMCID: PMC4987535 DOI: 10.1038/ncomms12438] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2015] [Accepted: 07/03/2016] [Indexed: 11/08/2022] Open
Abstract
Organisms appear to learn and make decisions using different strategies known as model-free and model-based learning; the former is mere reinforcement of previously rewarded actions and the latter is a forward-looking strategy that involves evaluation of action-state transition probabilities. Prior work has used neural data to argue that both model-based and model-free learners implement a value comparison process at trial onset, but model-based learners assign more weight to forward-looking computations. Here using eye-tracking, we report evidence for a different interpretation of prior results: model-based subjects make their choices prior to trial onset. In contrast, model-free subjects tend to ignore model-based aspects of the task and instead seem to treat the decision problem as a simple comparison process between two differentially valued items, consistent with previous work on sequential-sampling models of decision making. These findings illustrate a problem with assuming that experimental subjects make their decisions at the same prescribed time.
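The two strategies can be contrasted directly in code: a model-based learner combines known action-state transition probabilities with state values, while a model-free learner merely caches rewards per action and ignores the transition structure. The two-stage task below is a hypothetical sketch with made-up probabilities, not the task from this study:

```python
import random

random.seed(0)

# Stage-1 action a leads to state s0 or s1 with these probabilities...
TRANSITIONS = {0: (0.7, 0.3), 1: (0.3, 0.7)}  # action -> (P(s0), P(s1))
# ...and each second-stage state pays reward with its own probability.
REWARD_P = [0.8, 0.2]                          # P(reward | s0), P(reward | s1)

def model_based_values():
    """Forward-looking: combine known transitions with state values."""
    return {a: p0 * REWARD_P[0] + p1 * REWARD_P[1]
            for a, (p0, p1) in TRANSITIONS.items()}

def model_free_values(n_trials=5000, alpha=0.01):
    """Cache rewards per action, blind to the transition structure."""
    Q = {0: 0.0, 1: 0.0}
    for _ in range(n_trials):
        a = random.choice([0, 1])
        s = 0 if random.random() < TRANSITIONS[a][0] else 1
        r = 1.0 if random.random() < REWARD_P[s] else 0.0
        Q[a] += alpha * (r - Q[a])
    return Q

print(model_based_values())  # exact expected values: {0: 0.62, 1: 0.38}
print(model_free_values())   # noisy estimates of the same quantities
```

Both learners eventually prefer action 0 here, but only the model-based values update instantly if the transition probabilities change, which is the behavioural signature the two-stage task exploits.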
Collapse
|