1. Urbaniak R, Xie M, Mackevicius E. Linking cognitive strategy, neural mechanism, and movement statistics in group foraging behaviors. Sci Rep 2024; 14:21770. PMID: 39294261; PMCID: PMC11411083; DOI: 10.1038/s41598-024-71931-0.
Abstract
Foraging for food is a rich and ubiquitous animal behavior that involves complex cognitive decisions and interactions among different individuals and species. There has been exciting recent progress in understanding multi-agent foraging behavior from cognitive, neuroscience, and statistical perspectives, but integrating these perspectives can be elusive. This paper seeks to unify them, allowing statistical analysis of observational animal movement data to shed light on the viability of cognitive models of foraging strategies. We start with cognitive agents whose internal preferences are expressed as value functions, and implement these agents both in a biologically plausible neural network and in an equivalent statistical model, where statistical predictors of agents' movements correspond to the components of the value functions. We test this framework by simulating foraging agents and using Bayesian statistical modeling to correctly identify the factors that best predict the agents' behavior. As further validation, we use this framework to analyze an open-source locust foraging dataset. Finally, we collect new multi-agent real-world bird foraging data and apply this method to analyze the preferences of different species. Together, this work provides an initial roadmap for integrating cognitive, neuroscience, and statistical approaches to reasoning about animal foraging in complex multi-agent environments.
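The abstract's core statistical move, movement predictors that mirror the components of an agent's value function, can be illustrated with a toy sketch. Everything below (the two features, the weights, and the maximum-likelihood multinomial-logit fit used as a stand-in for the paper's Bayesian modeling) is hypothetical, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical value function: agents prefer food-rich moves (weight +2)
# and avoid crowding (weight -1). These weights are illustrative.
TRUE_W = np.array([2.0, -1.0])

def simulate(n_steps=2000, k=5):
    """Agent picks one of k candidate moves by softmax over its value function."""
    X = rng.normal(size=(n_steps, k, 2))   # two value-function components per move
    logits = X @ TRUE_W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    y = np.array([rng.choice(k, p=pi) for pi in p])
    return X, y

def fit(X, y, lr=0.5, iters=400):
    """Multinomial-logit fit by gradient ascent on the log-likelihood:
    the statistical predictors correspond to value-function components."""
    w = np.zeros(2)
    n = len(y)
    for _ in range(iters):
        logits = X @ w
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        chosen = X[np.arange(n), y]                       # features of chosen moves
        grad = (chosen - (p[..., None] * X).sum(axis=1)).mean(axis=0)
        w += lr * grad
    return w

X, y = simulate()
w_hat = fit(X, y)
print(np.round(w_hat, 1))   # recovers approximately [2.0, -1.0]
```

Recovering the generating weights from simulated choices is exactly the validation step the abstract describes, here in miniature.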
Affiliation(s)
- Marjorie Xie
  - Basis Research Institute, New York, 10026, USA
  - Arizona State University, School for the Future of Innovation in Society, Tempe, 85287, USA
  - New York Academy of Sciences, New York, 10006, USA
  - Columbia University, New York, 10027, USA
- Emily Mackevicius
  - Basis Research Institute, New York, 10026, USA
  - Columbia University, New York, 10027, USA
2. Kessler F, Frankenstein J, Rothkopf CA. Human navigation strategies and their errors result from dynamic interactions of spatial uncertainties. Nat Commun 2024; 15:5677. PMID: 38971789; PMCID: PMC11227593; DOI: 10.1038/s41467-024-49722-y.
Abstract
Goal-directed navigation requires continuously integrating uncertain self-motion and landmark cues into an internal sense of location and direction, concurrently planning future paths, and sequentially executing motor actions. Here, we provide a unified account of these processes with a computational model of probabilistic path planning in the framework of optimal feedback control under uncertainty. This model gives rise to diverse human navigational strategies previously believed to be distinct behaviors, and it quantitatively predicts both the errors and the variability of navigation across numerous experiments. The model furthermore explains how sequential egocentric landmark observations form an uncertain allocentric cognitive map and how this internal map is used both in route planning and during the execution of movements, and it reconciles seemingly contradictory results about cue-integration behavior in navigation. Taken together, the present work provides a parsimonious explanation of how patterns of human goal-directed navigation behavior arise from the continuous and dynamic interactions of spatial uncertainties in perception, cognition, and action.
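The reliability-weighted cue combination at the heart of such uncertainty-based models can be sketched in a few lines; the 1D setting and all numbers below are illustrative, not the paper's full feedback-control model.

```python
# Minimal 1D sketch: a navigator fuses an uncertain path-integration estimate
# with an uncertain landmark cue, weighting each by its reliability
# (inverse variance), as in optimal Bayesian cue integration.
def fuse(mu_self, var_self, mu_landmark, var_landmark):
    w = (1 / var_self) / (1 / var_self + 1 / var_landmark)
    mu = w * mu_self + (1 - w) * mu_landmark
    var = 1 / (1 / var_self + 1 / var_landmark)   # fused uncertainty shrinks
    return mu, var

# landmark is 4x more reliable here, so the fused estimate lands near it
mu, var = fuse(mu_self=4.0, var_self=4.0, mu_landmark=6.0, var_landmark=1.0)
print(mu, var)   # ≈ 5.6, 0.8
```

Note that the fused variance (0.8) is smaller than either cue's alone, which is the signature of integration rather than cue switching.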
Affiliation(s)
- Fabian Kessler
  - Centre for Cognitive Science & Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Julia Frankenstein
  - Centre for Cognitive Science & Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Constantin A Rothkopf
  - Centre for Cognitive Science & Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
  - Frankfurt Institute for Advanced Studies, Goethe University, Frankfurt, Germany
3. Alejandro RJ, Holroyd CB. Hierarchical control over foraging behavior by anterior cingulate cortex. Neurosci Biobehav Rev 2024; 160:105623. PMID: 38490499; DOI: 10.1016/j.neubiorev.2024.105623.
Abstract
Foraging is a natural behavior that involves making sequential decisions to maximize rewards while minimizing the costs incurred in doing so. The prevalence of foraging across species suggests that a common brain computation underlies its implementation. Although the anterior cingulate cortex (ACC) is believed to contribute to foraging behavior, its specific role has been contentious, with predominant theories arguing that it encodes either environmental value or choice difficulty. Additionally, recent attempts to characterize foraging have taken place within the reinforcement learning framework, with increasingly complex models scaling with task complexity. Here we review reinforcement learning models of foraging, highlighting the hierarchical structure of many foraging problems. We extend this literature by proposing that the ACC guides foraging according to principles of model-based hierarchical reinforcement learning. On this view, ACC function is organized hierarchically along a rostral-caudal gradient, with rostral structures monitoring the status and completion of high-level task goals (like finding food), and midcingulate structures overseeing the execution of task options (subgoals, like harvesting fruit) and lower-level actions (such as grabbing an apple).
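The options-style decomposition the review invokes (high-level goals, subgoal "options", primitive actions) can be sketched as a toy program. The foraging names below are ours, and the review's proposal about ACC is a conceptual claim, not this code.

```python
# A minimal options-style sketch of hierarchical control: a high-level
# policy pursues subgoals in order; each subgoal ("option") runs primitive
# actions until its termination condition is met.
def run_option(state, subgoal):
    """Primitive level: move one unit toward the subgoal until it is reached."""
    steps = []
    while state != subgoal:
        state += 1 if subgoal > state else -1
        steps.append(state)
    return state, steps

def forage(state, subgoals):
    """High level: complete subgoals in sequence (e.g., reach tree, then return)."""
    trace = []
    for g in subgoals:
        state, steps = run_option(state, g)
        trace.extend(steps)
    return state, trace

final, trace = forage(0, subgoals=[3, 1])
print(final, trace)   # 1 [1, 2, 3, 2, 1]
```

The point of the decomposition is that the high level never issues primitive moves; it only monitors subgoal completion, mirroring the rostral/midcingulate division the review proposes.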
Affiliation(s)
- Clay B Holroyd
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
4. Shahidi N, Franch M, Parajuli A, Schrater P, Wright A, Pitkow X, Dragoi V. Population coding of strategic variables during foraging in freely moving macaques. Nat Neurosci 2024; 27:772-781. PMID: 38443701; PMCID: PMC11001579; DOI: 10.1038/s41593-024-01575-w.
Abstract
Until now, it has been difficult to examine the neural bases of foraging in naturalistic environments because previous approaches have relied on restrained animals performing trial-based foraging tasks. Here we allowed unrestrained monkeys to freely interact with concurrent reward options while we wirelessly recorded population activity in the dorsolateral prefrontal cortex. The animals decided when and where to forage based on whether their prediction of reward was fulfilled or violated. This prediction was not solely based on a history of reward delivery, but also on the understanding that waiting longer improves the chance of reward. The task variables were continuously represented in a subspace of the high-dimensional population activity, and this compressed representation predicted the animal's subsequent choices better than the true task variables and as well as the raw neural activity. Our results indicate that monkeys' foraging strategies are based on a cortical model of reward dynamics as animals freely explore their environment.
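The idea that a compressed subspace of population activity predicts choices can be illustrated with synthetic data. The simulation below (one latent strategic variable mixed into many neurons, read out from the top principal component) is a hypothetical stand-in for the recordings, not the paper's pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic population: a 1-D "strategic variable" (e.g., reward expectation)
# is linearly mixed into 50 neurons with additive noise.
n_trials, n_neurons = 500, 50
latent = rng.normal(size=n_trials)
mixing = rng.normal(size=n_neurons)
activity = np.outer(latent, mixing) + 0.5 * rng.normal(size=(n_trials, n_neurons))
choice = (latent > 0).astype(int)                 # e.g., stay vs. leave

# the compressed "subspace": top principal component of population activity
centered = activity - activity.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
compressed = centered @ vt[0]

# sign-align the component with the latent, then read out the choice
sign = np.sign(np.corrcoef(compressed, latent)[0, 1])
accuracy = ((sign * compressed > 0).astype(int) == choice).mean()
print(round(accuracy, 2))   # well above chance
```

A single compressed dimension suffices here because the task variable is low-dimensional, which is the spirit of the paper's subspace result.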
Affiliation(s)
- Neda Shahidi
  - Department of Neurobiology and Anatomy, McGovern Medical School, University of Texas, Houston, Houston, TX, USA
  - Georg-Elias-Müller-Institute for Psychology, Georg August-Universität, Göttingen, Germany
  - Cognitive Neuroscience Laboratory, German Primate Center, Göttingen, Germany
- Melissa Franch
  - Department of Neurobiology and Anatomy, McGovern Medical School, University of Texas, Houston, Houston, TX, USA
- Arun Parajuli
  - Department of Neurobiology and Anatomy, McGovern Medical School, University of Texas, Houston, Houston, TX, USA
- Paul Schrater
  - Department of Computer Science, University of Minnesota, Minneapolis, MN, USA
  - Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- Anthony Wright
  - Department of Neurobiology and Anatomy, McGovern Medical School, University of Texas, Houston, Houston, TX, USA
- Xaq Pitkow
  - Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
  - Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
  - Department of Machine Learning, Carnegie Mellon University, Pittsburgh, PA, USA
- Valentin Dragoi
  - Department of Neurobiology and Anatomy, McGovern Medical School, University of Texas, Houston, Houston, TX, USA
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
  - Neuroengineering Initiative, Rice University, Houston, TX, USA
5. Thomas T, Straub D, Tatai F, Shene M, Tosik T, Kersting K, Rothkopf CA. Modelling dataset bias in machine-learned theories of economic decision-making. Nat Hum Behav 2024; 8:679-691. PMID: 38216691; PMCID: PMC11045447; DOI: 10.1038/s41562-023-01784-6.
Abstract
Normative and descriptive models have long vied to explain and predict human risky choices, such as those between goods or gambles. A recent study reported the discovery of a new, more accurate model of human decision-making by training neural networks on a new online large-scale dataset, choices13k. Here we systematically analyse the relationships between several models and datasets using machine-learning methods and find evidence for dataset bias. Because participants' choices in stochastically dominated gambles were consistently skewed towards equipreference in the choices13k dataset, we hypothesized that this reflected increased decision noise. Indeed, a probabilistic generative model adding structured decision noise to a neural network trained on data from a laboratory study transferred best, that is, outperformed all models apart from those trained on choices13k. We conclude that a careful combination of theory and data analysis is still required to understand the complex interactions of machine-learning models and data of human risky choices.
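The structured-decision-noise hypothesis can be miniaturized as a lapse parameter mixed into a value-based choice rule: the more noise, the more choices in dominated gambles drift toward equipreference. The parameters below are illustrative, not fitted values from the paper.

```python
import math

# Choice probability for the higher-valued option: a softmax (logistic) of the
# value difference, mixed with uniform guessing via a lapse parameter.
def choice_prob(value_diff, temperature=1.0, lapse=0.0):
    p = 1 / (1 + math.exp(-value_diff / temperature))
    return (1 - lapse) * p + lapse * 0.5   # lapse pulls choices toward 0.5

clean = choice_prob(2.0)               # dominant option chosen ~88% of the time
noisy = choice_prob(2.0, lapse=0.5)    # same values, heavy decision noise
print(round(clean, 3), round(noisy, 3))   # 0.881 0.69
```

Identical preferences, different noise: exactly the kind of dataset-level signature the authors use to diagnose bias in choices13k.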
Affiliation(s)
- Tobias Thomas
  - Centre for Cognitive Science and Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
  - Hessian Center for Artificial Intelligence, Darmstadt, Germany
- Dominik Straub
  - Centre for Cognitive Science and Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Fabian Tatai
  - Centre for Cognitive Science and Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Megan Shene
  - Centre for Cognitive Science and Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Tümer Tosik
  - Centre for Cognitive Science and Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
- Kristian Kersting
  - Hessian Center for Artificial Intelligence, Darmstadt, Germany
  - Centre for Cognitive Science and Computer Science Department, Technical University of Darmstadt, Darmstadt, Germany
- Constantin A Rothkopf
  - Centre for Cognitive Science and Institute of Psychology, Technical University of Darmstadt, Darmstadt, Germany
  - Hessian Center for Artificial Intelligence, Darmstadt, Germany
6. Hennig JA, Romero Pinto SA, Yamaguchi T, Linderman SW, Uchida N, Gershman SJ. Emergence of belief-like representations through reinforcement learning. PLoS Comput Biol 2023; 19:e1011067. PMID: 37695776; PMCID: PMC10513382; DOI: 10.1371/journal.pcbi.1011067.
Abstract
To behave adaptively, animals must learn to predict future reward, or value. To do this, animals are thought to learn reward predictions using reinforcement learning. However, in contrast to classical models, animals must learn to estimate value using only incomplete state information. Previous work suggests that animals estimate value in partially observable tasks by first forming "beliefs": optimal Bayesian estimates of the hidden states in the task. Although this is one way to solve the problem of partial observability, it is not the only way, nor is it the most computationally scalable solution in complex, real-world environments. Here we show that a recurrent neural network (RNN) can learn to estimate value directly from observations, generating reward prediction errors that resemble those observed experimentally, without any explicit objective of estimating beliefs. We integrate statistical, functional, and dynamical systems perspectives on beliefs to show that the RNN's learned representation encodes belief information, but only when the RNN's capacity is sufficiently large. These results illustrate how animals can estimate value in tasks without explicitly estimating beliefs, yielding a representation useful for systems with limited capacity.
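The underlying learning rule, values updated from reward prediction errors, can be shown in tabular form. This TD(0) sketch on a linear track is ours and is far simpler than the paper's RNN; it only illustrates the delta = r + gamma*V(s') - V(s) update the abstract presupposes.

```python
import numpy as np

# Tabular TD(0) on a 5-state linear track with reward on the final step.
# Each update is driven by a reward prediction error (delta), the quantity
# that experimentally resembles dopamine signals.
n_states, gamma, alpha = 5, 1.0, 0.1
V = np.zeros(n_states + 1)            # one extra terminal state, value 0
for _ in range(1000):                 # repeated traversals of the track
    for s in range(n_states):
        r = 1.0 if s == n_states - 1 else 0.0
        delta = r + gamma * V[s + 1] - V[s]    # reward prediction error
        V[s] += alpha * delta
print(np.round(V[:n_states], 2))      # values approach 1 in every state
```

As values converge, the prediction errors vanish everywhere except at surprising events, which is the classic TD account of phasic dopamine.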
Affiliation(s)
- Jay A. Hennig
  - Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
  - Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America
- Sandra A. Romero Pinto
  - Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
  - Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, Massachusetts, United States of America
- Takahiro Yamaguchi
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
  - Future Research Department, Toyota Research Institute of North America, Toyota Motor North America, Ann Arbor, Michigan, United States of America
- Scott W. Linderman
  - Wu Tsai Neurosciences Institute, Stanford University, Stanford, California, United States of America
  - Department of Statistics, Stanford University, Stanford, California, United States of America
- Naoshige Uchida
  - Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Samuel J. Gershman
  - Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
  - Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America
7. Maselli A, Gordon J, Eluchans M, Lancia GL, Thiery T, Moretti R, Cisek P, Pezzulo G. Beyond simple laboratory studies: Developing sophisticated models to study rich behavior. Phys Life Rev 2023; 46:220-244. PMID: 37499620; DOI: 10.1016/j.plrev.2023.07.006.
Abstract
Psychology and neuroscience are concerned with the study of behavior, of internal cognitive processes, and their neural foundations. However, most laboratory studies use constrained experimental settings that greatly limit the range of behaviors that can be expressed. While focusing on restricted settings ensures methodological control, it risks impoverishing the object of study: by restricting behavior, we might miss key aspects of cognitive and neural functions. In this article, we argue that psychology and neuroscience should increasingly adopt innovative experimental designs, measurement methods, analysis techniques and sophisticated computational models to probe rich, ecologically valid forms of behavior, including social behavior. We discuss the challenges of studying rich forms of behavior as well as the novel opportunities offered by state-of-the-art methodologies and new sensing technologies, and we highlight the importance of developing sophisticated formal models. We exemplify our arguments by reviewing some recent streams of research in psychology, neuroscience and other fields (e.g., sports analytics, ethology and robotics) that have addressed rich forms of behavior in a model-based manner. We hope that these "success cases" will encourage psychologists and neuroscientists to extend their toolbox of techniques with sophisticated behavioral models - and to use them to study rich forms of behavior as well as the cognitive and neural processes that they engage.
Affiliation(s)
- Antonella Maselli
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Jeremy Gordon
  - University of California, Berkeley, Berkeley, CA, 94704, United States
- Mattia Eluchans
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
  - University of Rome "La Sapienza", Rome, Italy
- Gian Luca Lancia
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
  - University of Rome "La Sapienza", Rome, Italy
- Thomas Thiery
  - Department of Psychology, University of Montréal, Montréal, Québec, Canada
- Riccardo Moretti
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
  - University of Rome "La Sapienza", Rome, Italy
- Paul Cisek
  - Department of Neuroscience, University of Montréal, Montréal, Québec, Canada
- Giovanni Pezzulo
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
8. Lakshminarasimhan KJ, Avila E, Pitkow X, Angelaki DE. Dynamical latent state computation in the male macaque posterior parietal cortex. Nat Commun 2023; 14:1832. PMID: 37005470; PMCID: PMC10067966; DOI: 10.1038/s41467-023-37400-4.
Abstract
Success in many real-world tasks depends on our ability to dynamically track hidden states of the world. We hypothesized that neural populations estimate these states by processing sensory history through recurrent interactions which reflect the internal model of the world. To test this, we recorded brain activity in posterior parietal cortex (PPC) of monkeys navigating by optic flow to a hidden target location within a virtual environment, without explicit position cues. In addition to sequential neural dynamics and strong interneuronal interactions, we found that the hidden state (the monkey's displacement from the goal) was encoded in single neurons, and could be dynamically decoded from population activity. The decoded estimates predicted navigation performance on individual trials. Task manipulations that perturbed the world model induced substantial changes in neural interactions, and modified the neural representation of the hidden state, while representations of sensory and motor variables remained stable. The findings were recapitulated by a task-optimized recurrent neural network model, suggesting that task demands shape the neural interactions in PPC, leading them to embody a world model that consolidates information and tracks task-relevant hidden states.
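The decoding claim, a hidden state read out linearly from population activity, can be sketched with synthetic data; the linear embedding and noise level below are assumptions for illustration, not the recorded data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "population": a hidden state (distance to goal, shrinking over a
# trial) is linearly embedded in 30 neurons with additive noise.
n_time, n_neurons = 400, 30
hidden = np.linspace(10, 0, n_time)                      # displacement from goal
activity = np.outer(hidden, rng.normal(size=n_neurons))
activity += 0.3 * rng.normal(size=(n_time, n_neurons))

# Ordinary least squares decoder: weights mapping activity -> hidden state.
w, *_ = np.linalg.lstsq(activity, hidden, rcond=None)
decoded = activity @ w
err = np.abs(decoded - hidden).mean()
print(round(err, 3))   # small mean decoding error
```

In the actual experiment the interesting part is that such decodes predict trial-by-trial behavior; here the sketch only shows the linear-readout step.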
Affiliation(s)
- Eric Avila
  - Center for Neural Science, New York University, New York City, NY, USA
- Xaq Pitkow
  - Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
  - Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX, USA
  - Electrical & Computer Engineering, Rice University, Houston, TX, USA
- Dora E Angelaki
  - Center for Neural Science, New York University, New York City, NY, USA
  - Department of Mechanical and Aerospace Engineering, New York University, New York City, NY, USA
9. Alefantis P, Lakshminarasimhan K, Avila E, Noel JP, Pitkow X, Angelaki DE. Sensory evidence accumulation using optic flow in a naturalistic navigation task. J Neurosci 2022; 42:5451-5462. PMID: 35641186; PMCID: PMC9270913; DOI: 10.1523/jneurosci.2203-21.2022.
Abstract
Sensory evidence accumulation is considered a hallmark of decision-making in noisy environments. Integration of sensory inputs has traditionally been studied using passive stimuli, segregating perception from action. Lessons learned from this approach, however, may not generalize to ethological behaviors like navigation, where there is an active interplay between perception and action. We designed a sensory-based sequential decision task in virtual reality in which humans and monkeys navigated to a memorized location by integrating optic flow generated by their own joystick movements. A major challenge in such closed-loop tasks is that subjects' actions determine future sensory input, causing ambiguity about whether they rely on sensory input rather than on expectations based solely on a learned model of the dynamics. To test whether subjects integrated optic flow over time, we used three independent experimental manipulations: unpredictable optic flow perturbations, which pushed subjects off their trajectory; gain manipulation of the joystick controller, which changed the consequences of actions; and manipulation of the optic flow density, which changed the information borne by the sensory evidence. Our results suggest that both macaques (male) and humans (female/male) relied heavily on optic flow, demonstrating a critical role for sensory evidence accumulation during naturalistic action-perception closed-loop tasks.
SIGNIFICANCE STATEMENT: The temporal integration of evidence is a fundamental component of mammalian intelligence. Yet it has traditionally been studied using experimental paradigms that fail to capture the closed-loop interaction between actions and sensations inherent in real-world continuous behaviors. These conventional paradigms use binary decision tasks and passive stimuli with statistics that remain stationary over time. Instead, we developed a naturalistic visuomotor navigation paradigm that mimics the causal structure of real-world sensorimotor interactions and probed the extent to which participants integrate sensory evidence by adding task manipulations that reveal complementary aspects of the computation.
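The logic behind the perturbation manipulation can be sketched as a trivial integrator: an agent that truly integrates optic flow ends up displaced by an unpredictable perturbation, whereas a pure motor-replay strategy would not. Everything below is illustrative.

```python
# Toy optic-flow integrator: position is the running sum of gain-scaled
# velocity signals; an unpredictable perturbation displaces the trajectory.
def integrate(velocities, gain=1.0, perturbation=0.0, perturb_step=None):
    pos = 0.0
    for t, v in enumerate(velocities):
        pos += gain * v            # optic flow generated by joystick movement
        if t == perturb_step:
            pos += perturbation    # unpredictable external displacement
    return pos

plain = integrate([1.0] * 10)
pushed = integrate([1.0] * 10, perturbation=-3.0, perturb_step=4)
print(plain, pushed)   # 10.0 7.0
```

The gain manipulation in the paper corresponds to changing `gain` here: a subject integrating sensory evidence compensates for it, while one replaying motor commands overshoots or undershoots.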
Affiliation(s)
- Panos Alefantis
  - Center for Neural Science, New York University, New York, New York 10003
- Eric Avila
  - Center for Neural Science, New York University, New York, New York 10003
- Jean-Paul Noel
  - Center for Neural Science, New York University, New York, New York 10003
- Xaq Pitkow
  - Department of Neuroscience, Baylor College of Medicine, Houston, Texas 77030
  - Department of Electrical and Computer Engineering, Rice University, Houston, Texas 77005-1892
  - Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas 77030
- Dora E Angelaki
  - Center for Neural Science, New York University, New York, New York 10003
  - Tandon School of Engineering, New York University, New York, New York 11201
10.
Abstract
Recent breakthroughs in artificial intelligence (AI) have enabled machines to plan in tasks previously thought to be uniquely human. Meanwhile, the planning algorithms implemented by the brain itself remain largely unknown. Here, we review neural and behavioral data in sequential decision-making tasks that elucidate the ways in which the brain does, and does not, plan. To systematically review available biological data, we create a taxonomy of planning algorithms by summarizing the relevant design choices for such algorithms in AI. Across species, recording techniques, and task paradigms, we find converging evidence that the brain represents future states consistent with a class of planning algorithms within our taxonomy: focused, depth-limited, and serial. However, we argue that current data are insufficient for addressing more detailed algorithmic questions. We propose a new approach leveraging AI advances to drive experiments that can adjudicate between competing candidate algorithms.
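The algorithm class the review converges on (focused, depth-limited, serial) can be sketched as a depth-limited tree search that expands one branch at a time and falls back on a value estimate at the depth cutoff. The toy tree and value function below are ours, purely for illustration.

```python
# Depth-limited serial planning: recurse down the tree one branch at a time
# (serial), stop at a fixed depth, and use a heuristic value at the cutoff.
def plan(state, depth, successors, heuristic):
    """Best value reachable from `state` when searching `depth` plies."""
    kids = successors(state)
    if depth == 0 or not kids:
        return heuristic(state)
    return max(plan(k, depth - 1, successors, heuristic) for k in kids)

# toy tree: states are integers, children of s are 2s and 2s+1, leaves at s >= 8
succ = lambda s: [2 * s, 2 * s + 1] if s < 8 else []
value = lambda s: s   # pretend larger states are better
print(plan(1, 2, succ, value))   # 7: depth-2 search stops short of the leaves
```

Deepening the search changes the answer (depth 3 reaches the best leaf, 15), which is why depth limits are a behaviorally identifiable design choice.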
11. Noel JP, Caziot B, Bruni S, Fitzgerald NE, Avila E, Angelaki DE. Supporting generalization in non-human primate behavior by tapping into structural knowledge: Examples from sensorimotor mappings, inference, and decision-making. Prog Neurobiol 2021; 201:101996. PMID: 33454361; PMCID: PMC8096669; DOI: 10.1016/j.pneurobio.2021.101996.
Abstract
The complex behaviors we ultimately wish to understand are far from those currently used in systems neuroscience laboratories. A salient difference is the closed loop between action and perception, prominently present in natural but not laboratory behaviors. The framework of reinforcement learning and control naturally wades across action and perception, and thus is poised to inform the neurosciences of tomorrow, not only as a data analysis and modeling framework, but also in guiding experimental design. We argue that this theoretical framework emphasizes active sensing, dynamical planning, and the leveraging of structural regularities as key operations for intelligent behavior within uncertain, time-varying environments. Similarly, we argue that we may study natural task strategies and their neural circuits without over-training animals when the tasks we use tap into our animals' structural knowledge. As proof of principle, we teach animals to navigate through a virtual environment (i.e., to explore a well-defined and repetitive structure governed by the laws of physics) using a joystick. Once these animals have learned to 'drive', without further training they naturally (i) show zero- or one-shot learning of novel sensorimotor contingencies, (ii) infer the evolving path of dynamically changing latent variables, and (iii) make decisions consistent with maximizing reward rate. Such task designs allow for the study of flexible and generalizable, yet controlled, behaviors. In turn, they allow for the exploitation of pillars of intelligence (flexibility, prediction, and generalization), properties whose neural underpinnings have remained elusive.
Affiliation(s)
- Jean-Paul Noel
  - Center for Neural Science, New York University, New York, USA
- Baptiste Caziot
  - Center for Neural Science, New York University, New York, USA
- Stefania Bruni
  - Center for Neural Science, New York University, New York, USA
- Eric Avila
  - Center for Neural Science, New York University, New York, USA
- Dora E Angelaki
  - Center for Neural Science, New York University, New York, USA
  - Tandon School of Engineering, New York University, New York, USA
12. Kwon M, Daptardar S, Schrater P, Pitkow X. Inverse rational control with partially observable continuous nonlinear dynamics. Adv Neural Inf Process Syst 2020; 33:7898-7909. PMID: 34712038; PMCID: PMC8549572.
Abstract
A fundamental question in neuroscience is how the brain creates an internal model of the world to guide actions using sequences of ambiguous sensory information. This is naturally formulated as a reinforcement learning problem under partial observations, where an agent must estimate relevant latent variables in the world from its evidence, anticipate possible future states, and choose actions that optimize total expected reward. This problem can be solved by control theory, which allows us to find the optimal actions for a given system dynamics and objective function. However, animals often appear to behave suboptimally. Why? We hypothesize that animals have their own flawed internal model of the world, and choose actions with the highest expected subjective reward according to that flawed model. We describe this behavior as rational but not optimal. The problem of Inverse Rational Control (IRC) aims to identify which internal model would best explain an agent's actions. Our contribution here generalizes past work on Inverse Rational Control which solved this problem for discrete control in partially observable Markov decision processes. Here we accommodate continuous nonlinear dynamics and continuous actions, and impute sensory observations corrupted by unknown noise that is private to the animal. We first build an optimal Bayesian agent that learns an optimal policy generalized over the entire model space of dynamics and subjective rewards using deep reinforcement learning. Crucially, this allows us to compute a likelihood over models for experimentally observable action trajectories acquired from a suboptimal agent. We then find the model parameters that maximize the likelihood using gradient ascent. Our method successfully recovers the true model of rational agents. This approach provides a foundation for interpreting the behavioral and neural dynamics of animal brains during complex tasks.
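The inverse step, recovering an agent's subjective model from its observed actions by maximum likelihood, can be shown in a one-parameter toy. The paper's Inverse Rational Control handles far richer partially observable nonlinear dynamics; below, a single subjective parameter is recovered by a simple likelihood grid search.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated agent: on each trial it sees a value difference x and chooses the
# first option with probability sigmoid(theta * x), where theta is its
# (possibly flawed) subjective sensitivity. theta_true is what we try to recover.
theta_true = 1.5
x = rng.normal(size=1000)
p_true = 1 / (1 + np.exp(-theta_true * x))
choices = (rng.random(1000) < p_true).astype(int)

def neg_log_lik(theta):
    """Negative log-likelihood of the observed choices under parameter theta."""
    p = 1 / (1 + np.exp(-theta * x))
    return -(choices * np.log(p) + (1 - choices) * np.log(1 - p)).sum()

# grid search over candidate internal models (gradient ascent in the paper)
grid = np.linspace(0.1, 3.0, 291)
theta_hat = grid[np.argmin([neg_log_lik(t) for t in grid])]
print(round(theta_hat, 1))   # close to 1.5
```

The recovered parameter describes the agent as rational with respect to its own model, which is the interpretive move IRC generalizes to full control problems.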
Affiliation(s)
- Minhae Kwon
  - School of Electronic Engineering, Soongsil University, Seoul, Republic of Korea
- Paul Schrater
  - Department of Computer Science, University of Minnesota, Minneapolis, MN, USA
- Xaq Pitkow
  - Electrical and Computer Engineering, Rice University, Houston, TX, USA
13. Shiffrin RM, Bassett DS, Kriegeskorte N, Tenenbaum JB. The brain produces mind by modeling. Proc Natl Acad Sci U S A 2020; 117:29299-29301. PMID: 33229525; PMCID: PMC7703556; DOI: 10.1073/pnas.1912340117.
Affiliation(s)
- Richard M Shiffrin
  - Psychological and Brain Sciences Department, Indiana University, Bloomington, IN 47405
- Danielle S Bassett
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104
- Joshua B Tenenbaum
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139-4307