1
|
Huang J, Zhang Z, Ruan X. An Improved Dyna-Q Algorithm Inspired by the Forward Prediction Mechanism in the Rat Brain for Mobile Robot Path Planning. Biomimetics (Basel) 2024; 9:315. [PMID: 38921195 PMCID: PMC11202125 DOI: 10.3390/biomimetics9060315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Revised: 05/04/2024] [Accepted: 05/08/2024] [Indexed: 06/27/2024] Open
Abstract
The traditional Model-Based Reinforcement Learning (MBRL) algorithm has high computational cost, poor convergence, and poor performance in robot spatial cognition and navigation tasks, and it cannot fully explain the ability of animals to quickly adapt to environmental changes and learn a variety of complex tasks. Studies have shown that vicarious trial and error (VTE) and the hippocampus forward prediction mechanism in rats and other mammals can be used as key components of action selection in MBRL to support "goal-oriented" behavior. Therefore, we propose an improved Dyna-Q algorithm inspired by the forward prediction mechanism of the hippocampus to solve the above problems and tackle the exploration-exploitation dilemma of Reinforcement Learning (RL). This algorithm alternately presents the potential path in the future for mobile robots and dynamically adjusts the sweep length according to the decision certainty, so as to determine action selection. We test the performance of the algorithm in a two-dimensional maze environment with static and dynamic obstacles, respectively. Compared with classic RL algorithms like State-Action-Reward-State-Action (SARSA) and Dyna-Q, the algorithm can speed up spatial cognition and improve the global search ability of path planning. In addition, our method reflects key features of how the brain organizes MBRL to effectively solve difficult tasks such as navigation, and it provides a new idea for spatial cognitive tasks from a biological perspective.
Affiliation(s)
- Jing Huang
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  - Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
- Ziheng Zhang
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  - Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
- Xiaogang Ruan
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  - Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
|
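For orientation on entry 1: the abstract names Dyna-Q as a baseline. The following is a minimal sketch of the classic tabular Dyna-Q loop (one real step, then several simulated updates drawn from a learned model). It is not the authors' improved variant with adaptive sweep length. The corridor environment, the function names, and all hyperparameters are illustrative assumptions.

```python
import random
from collections import defaultdict

ACTIONS = (-1, +1)  # move left / move right
GOAL = 4            # hypothetical 5-cell corridor with states 0..4


def corridor_step(state, action):
    """Toy deterministic environment (an assumption for this sketch)."""
    nxt = min(max(state + action, 0), GOAL)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def dyna_q(step, start, actions, episodes=200, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, max_steps=1000, seed=0):
    """Classic tabular Dyna-Q; `step(s, a)` returns (next_state, reward, done)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, action)]
    model = {}              # learned model: (s, a) -> (s2, r, done)

    def greedy(state):
        best = max(q[(state, a)] for a in actions)
        return rng.choice([a for a in actions if q[(state, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning backup
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            # Epsilon-greedy action selection in the real environment
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            s2, r, done = step(s, a)
            update(s, a, r, s2, done)      # direct RL from real experience
            model[(s, a)] = (s2, r, done)  # model learning
            for _ in range(planning_steps):
                # Planning: replay a randomly chosen remembered transition
                (ps, pa), (ps2, pr, pdone) = rng.choice(list(model.items()))
                update(ps, pa, pr, ps2, pdone)
            s = s2
            if done:
                break
    return q


q = dyna_q(corridor_step, start=0, actions=ACTIONS)
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
```

Here `planning_steps` is fixed; the abstract describes adjusting the sweep length dynamically according to decision certainty, which this sketch does not attempt.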
2
|
Reinshagen A. Grid cells: the missing link in understanding Parkinson's disease? Front Neurosci 2024; 18:1276714. [PMID: 38389787 PMCID: PMC10881698 DOI: 10.3389/fnins.2024.1276714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 01/24/2024] [Indexed: 02/24/2024] Open
Abstract
The mechanisms underlying Parkinson's disease (PD) are complex and not fully understood, and the box-and-arrow model, among other current models, presents significant challenges. This paper explores the potential role of the allocentric brain and especially its grid cells in several PD motor symptoms, including bradykinesia, kinesia paradoxa, freezing of gait, the bottleneck phenomenon, and their dependency on cueing. It is argued that central hubs, like the locus coeruleus and the pedunculopontine nucleus, often narrowly interpreted in the context of PD, play an equally important role in governing the allocentric brain as the basal ganglia. Consequently, the motor and secondary motor (e.g., spatially related) symptoms of PD linked with dopamine depletion may be more closely tied to erroneous computation by grid cells than to the basal ganglia alone. Because grid cells and their associated central hubs introduce both spatial and temporal information to the brain, influencing velocity perception, they may cause bradykinesia or hyperkinesia as well. In summary, PD motor symptoms may primarily be an allocentric disturbance resulting from virtual faulty computation by grid cells revealed by dopamine depletion in PD.
|
3
|
Pezzulo G, Parr T, Cisek P, Clark A, Friston K. Generating meaning: active inference and the scope and limits of passive AI. Trends Cogn Sci 2024; 28:97-112. [PMID: 37973519 DOI: 10.1016/j.tics.2023.10.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/08/2023] [Revised: 10/03/2023] [Accepted: 10/05/2023] [Indexed: 11/19/2023]
Abstract
Prominent accounts of sentient behavior depict brains as generative models of organismic interaction with the world, evincing intriguing similarities with current advances in generative artificial intelligence (AI). However, because they contend with the control of purposive, life-sustaining sensorimotor interactions, the generative models of living organisms are inextricably anchored to the body and world. Unlike the passive models learned by generative AI systems, they must capture and control the sensory consequences of action. This allows embodied agents to intervene upon their worlds in ways that constantly put their best models to the test, thus providing a solid bedrock that is, we argue, essential to the development of genuine understanding. We review the resulting implications and consider future directions for generative AI.
Affiliation(s)
- Giovanni Pezzulo
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy.
- Thomas Parr
  - Nuffield Department of Clinical Neurosciences, University of Oxford
- Paul Cisek
  - Department of Neuroscience, University of Montréal, Montréal, Québec, Canada
- Andy Clark
  - Department of Philosophy, University of Sussex, Brighton, UK; Department of Informatics, University of Sussex, Brighton, UK; Department of Philosophy, Macquarie University, Sydney, New South Wales, Australia
- Karl Friston
  - Wellcome Centre for Human Neuroimaging, Queen Square Institute of Neurology, University College London, London, UK; VERSES AI Research Lab, Los Angeles, CA, USA
|
4
|
Huo H, Lesage E, Dong W, Verguts T, Seger CA, Diao S, Feng T, Chen Q. The neural substrates of how model-based learning affects risk taking: Functional coupling between right cerebellum and left caudate. Brain Cogn 2023; 172:106088. [PMID: 37783018 DOI: 10.1016/j.bandc.2023.106088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 09/19/2023] [Accepted: 09/20/2023] [Indexed: 10/04/2023]
Abstract
Higher executive control capacity allows people to appropriately evaluate risk and avoid both excessive risk aversion and excessive risk-taking. The neural mechanisms underlying this relationship between executive function and risk taking are still unknown. We used voxel-based morphometry (VBM) analysis combined with resting-state functional connectivity (rs-FC) to evaluate how one component of executive function, model-based learning, relates to risk taking. We measured individuals' use of the model-based learning system with the two-step task, and risk taking with the Balloon Analogue Risk Task. Behavioral results indicated that risk taking was positively correlated with the model-based weighting parameter ω. The VBM results showed a positive association between model-based learning and gray matter volume in the right cerebellum (RCere) and left inferior parietal lobule (LIPL). Functional connectivity results suggested that the coupling between RCere and the left caudate (LCAU) was correlated with both model-based learning and risk taking. Mediation analysis indicated that RCere-LCAU functional connectivity completely mediated the effect of model-based learning on risk taking. These results indicate that learners who favor model-based strategies also engage in more appropriate risky behaviors through interactions between reward-based learning, error-based learning and executive control subserved by a caudate, cerebellar and parietal network.
Affiliation(s)
- Hangfeng Huo
  - Department of Psychology, Faculty of Education, Guangxi Normal University, Guilin, China; School of Psychology, South China Normal University, 510631 Guangzhou, China; Center for Studies of Psychological Application, South China Normal University, 510631 Guangzhou, China; Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 510631 Guangzhou, China
- Elise Lesage
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Wenshan Dong
  - School of Psychology, South China Normal University, 510631 Guangzhou, China; Center for Studies of Psychological Application, South China Normal University, 510631 Guangzhou, China; Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 510631 Guangzhou, China
- Tom Verguts
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Carol A Seger
  - School of Psychology, South China Normal University, 510631 Guangzhou, China; Center for Studies of Psychological Application, South China Normal University, 510631 Guangzhou, China; Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 510631 Guangzhou, China; Department of Psychology and Program in Molecular, Cellular, and Integrative Neurosciences, Colorado State University, Fort Collins, CO, 80523, USA
- Sitong Diao
  - School of Psychology, Shenzhen University, 518060 Shenzhen, China
- Tingyong Feng
  - Research Center of Psychology and Social Development, Faculty of Psychology, Southwest University, Chongqing, China; Key Laboratory of Cognition and Personality, Ministry of Education, Chongqing, China.
- Qi Chen
  - School of Psychology, Shenzhen University, 518060 Shenzhen, China.
|
5
|
Yamada K, Toda K. Habit formation viewed as structural change in the behavioral network. Commun Biol 2023; 6:303. [PMID: 37016036 PMCID: PMC10073220 DOI: 10.1038/s42003-023-04500-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2022] [Accepted: 01/18/2023] [Indexed: 04/06/2023] Open
Abstract
Habit formation is a process in which an action becomes involuntary. While goal-directed behavior is driven by its consequences, habits are elicited by a situation rather than its consequences. Existing theories have proposed that these actions are controlled by two corresponding, distinct systems. Although canonical theories based on such distinctions are starting to be challenged, there are a few theoretical frameworks that implement goal-directed behavior and habits within a single system. Here, we propose a novel theoretical framework by hypothesizing that behavior is a network composed of several responses. With this framework, we have shown that the transition of goal-directed actions to habits is caused by a change in a single network structure. Furthermore, we confirmed that the proposed network model behaves in a manner consistent with the existing experimental results reported in animal behavioral studies. Our results revealed that habit could be formed under the control of a single system rather than two distinct systems. By capturing the behavior as a single network change, this framework provides a new perspective on studying the structure of the behavior for experimental and theoretical research.
Affiliation(s)
- Kota Yamada
  - Department of Psychology, Keio University, Tokyo, Japan.
  - Japan Society for Promotion of Science, Tokyo, Japan.
- Koji Toda
  - Department of Psychology, Keio University, Tokyo, Japan
|
6
|
Lind EB, Sweis BM, Asp AJ, Esguerra M, Silvis KA, David Redish A, Thomas MJ. A quadruple dissociation of reward-related behaviour in mice across excitatory inputs to the nucleus accumbens shell. Commun Biol 2023; 6:119. [PMID: 36717646 PMCID: PMC9886947 DOI: 10.1038/s42003-023-04429-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/25/2021] [Accepted: 01/05/2023] [Indexed: 02/01/2023] Open
Abstract
The nucleus accumbens shell (NAcSh) is critically important for reward valuations, yet it remains unclear how valuation information is integrated in this region to drive behaviour during reinforcement learning. Using an optogenetic spatial self-stimulation task in mice, here we show that contingent activation of different excitatory inputs to the NAcSh changes the expression of different reward-related behaviours. Our data indicate that medial prefrontal inputs support place preference via repeated actions, ventral hippocampal inputs consistently promote place preferences, basolateral amygdala inputs produce modest place preferences but as a byproduct of increased sensitivity to time investments, and paraventricular inputs reduce place preferences yet do not produce full avoidance behaviour. These findings suggest that each excitatory input provides distinct information to the NAcSh, and we propose that this reflects the reinforcement of different credit assignment functions. Our finding of a quadruple dissociation of NAcSh input-specific behaviours provides insights into how types of information carried by distinct inputs to the NAcSh could be integrated to help drive reinforcement learning and situationally appropriate behavioural responses.
Affiliation(s)
- Erin B Lind
  - Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
  - Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
- Brian M Sweis
  - Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
  - Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
  - Department of Psychiatry, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY, 10029, USA
- Anders J Asp
  - Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
  - Rehabilitation Medicine Research Center, Department of Physical Medicine and Rehabilitation, Mayo Clinic, 200 First St SW, Rochester, MN, 55905, USA
- Manuel Esguerra
  - Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
  - Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
- Keelia A Silvis
  - Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
  - Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
- A David Redish
  - Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
  - Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
- Mark J Thomas
  - Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA.
  - Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA.
|
7
|
Priorelli M, Stoianov IP. Flexible intentions: An Active Inference theory. Front Comput Neurosci 2023; 17:1128694. [PMID: 37021085 PMCID: PMC10067605 DOI: 10.3389/fncom.2023.1128694] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/23/2022] [Accepted: 03/03/2023] [Indexed: 04/07/2023] Open
Abstract
We present a normative computational theory of how the brain may support visually-guided goal-directed actions in dynamically changing environments. It extends the Active Inference theory of cortical processing according to which the brain maintains beliefs over the environmental state, and motor control signals try to fulfill the corresponding sensory predictions. We propose that the neural circuitry in the Posterior Parietal Cortex (PPC) computes flexible intentions (or motor plans from a belief over targets) to dynamically generate goal-directed actions, and we develop a computational formalization of this process. A proof-of-concept agent embodying visual and proprioceptive sensors and an actuated upper limb was tested on target-reaching tasks. The agent behaved correctly under various conditions, including static and dynamic targets, different sensory feedbacks, sensory precisions, intention gains, and movement policies; limiting conditions were also identified. Active Inference driven by dynamic and flexible intentions can thus support goal-directed behavior in constantly changing environments, and the PPC might putatively host its core intention mechanism. More broadly, the study provides a normative computational basis for research on goal-directed behavior in end-to-end settings and further advances mechanistic theories of active biological systems.
|
8
|
Stoianov I, Maisto D, Pezzulo G. The hippocampal formation as a hierarchical generative model supporting generative replay and continual learning. Prog Neurobiol 2022; 217:102329. [PMID: 35870678 DOI: 10.1016/j.pneurobio.2022.102329] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/09/2021] [Revised: 07/15/2022] [Accepted: 07/19/2022] [Indexed: 11/28/2022]
Abstract
We advance a novel computational theory of the hippocampal formation as a hierarchical generative model that organizes sequential experiences, such as rodent trajectories during spatial navigation, into coherent spatiotemporal contexts. We propose that the hippocampal generative model is endowed with inductive biases to identify individual items of experience (first hierarchical layer), organize them into sequences (second layer) and cluster them into maps (third layer). This theory entails a novel characterization of hippocampal reactivations as generative replay: the offline resampling of fictive sequences from the generative model, which supports the continual learning of multiple sequential experiences. We show that the model learns and efficiently retains multiple spatial navigation trajectories, by organizing them into spatial maps. Furthermore, the model reproduces flexible and prospective aspects of hippocampal dynamics that are challenging to explain within existing frameworks. This theory reconciles multiple roles of the hippocampal formation in map-based navigation, episodic memory and imagination.
Affiliation(s)
- Ivilin Stoianov
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Domenico Maisto
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Giovanni Pezzulo
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy.
|
9
|
Badea A, Li D, Niculescu AR, Anderson RJ, Stout JA, Williams CL, Colton CA, Maeda N, Dunson DB. Absolute Winding Number Differentiates Mouse Spatial Navigation Strategies With Genetic Risk for Alzheimer's Disease. Front Neurosci 2022; 16:848654. [PMID: 35784847 PMCID: PMC9247395 DOI: 10.3389/fnins.2022.848654] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/04/2022] [Accepted: 04/29/2022] [Indexed: 11/13/2022] Open
Abstract
Spatial navigation and orientation are emerging as promising markers for altered cognition in prodromal Alzheimer's disease, and even in cognitively normal individuals at risk for Alzheimer's disease. The different APOE gene alleles confer various degrees of risk. The APOE2 allele is considered protective, APOE3 is seen as control, while APOE4 carriage is the major known genetic risk for Alzheimer's disease. We have used mouse models carrying the three humanized APOE alleles and tested them in a spatial memory task in the Morris water maze. We introduce a new metric, the absolute winding number, to characterize the spatial search strategy, through the shape of the swim path. We show that this metric is robust to noise, and works for small group samples. Moreover, the absolute winding number better differentiated APOE3 carriers, through their straighter swim paths relative to both APOE2 and APOE4 genotypes. Finally, this novel metric supported increased vulnerability in APOE4 females. We hypothesized differences in spatial memory and navigation strategies are linked to differences in brain networks, and showed that different genotypes have different reliance on the hippocampal and caudate putamen circuits, pointing to a role for white matter connections. Moreover, differences were most pronounced in females. This departure from a hippocampal centric to a brain network approach may open avenues for identifying regions linked to increased risk for Alzheimer's disease, before overt disease manifestation. Further exploration of novel biomarkers based on spatial navigation strategies may enlarge the windows of opportunity for interventions. The proposed framework will be significant in dissecting vulnerable circuits associated with cognitive changes in prodromal Alzheimer's disease.
Affiliation(s)
- Alexandra Badea
  - Department of Radiology, Duke University, Durham, NC, United States
  - Department of Neurology, Duke University, Durham, NC, United States
  - Brain Imaging and Analysis Center, Duke University, Durham, NC, United States
  - Biomedical Engineering, Duke University, Durham, NC, United States
- Didong Li
  - Department of Computer Science, Princeton University, Princeton, NJ, United States
  - Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA, United States
- Jacques A. Stout
  - Brain Imaging and Analysis Center, Duke University, Durham, NC, United States
- Christina L. Williams
  - Department of Psychology and Neuroscience, Duke University, Durham, NC, United States
- Carol A. Colton
  - Department of Neurology, Duke University, Durham, NC, United States
- Nobuyo Maeda
  - Department of Pathology and Laboratory Medicine, The University of North Carolina, Chapel Hill, Chapel Hill, NC, United States
- David B. Dunson
  - Department of Statistical Science, Duke University, Durham, NC, United States
|
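For orientation on entry 9: the abstract introduces the absolute winding number as a path-shape metric but does not give its formula. The sketch below shows one plausible discretization, assuming the metric is the sum of absolute turning angles along a sampled path divided by 2π, so that straighter paths score lower. That definition, the function name, and the sample paths are assumptions for illustration, not the authors' implementation.

```python
import math


def absolute_winding_number(points):
    """Sum of |turning angle| along a polyline, in units of full turns.

    Assumed definition (not taken from the abstract): total absolute
    heading change between consecutive segments, divided by 2*pi.
    """
    total = 0.0
    prev_heading = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # skip repeated samples, which have no heading
        heading = math.atan2(dy, dx)
        if prev_heading is not None:
            # Wrap the heading change into [-pi, pi) before taking |.|
            turn = (heading - prev_heading + math.pi) % (2 * math.pi) - math.pi
            total += abs(turn)
        prev_heading = heading
    return total / (2 * math.pi)


straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
square_loop = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0), (1, 0)]
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]

awn_straight = absolute_winding_number(straight)
awn_square = absolute_winding_number(square_loop)
awn_zigzag = absolute_winding_number(zigzag)
```

Because the turns are summed in absolute value, a zigzag that turns left and right by equal amounts still accumulates a positive score, unlike a signed winding number, where those turns would cancel.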
10
|
Dennison JB, Sazhin D, Smith DV. Decision neuroscience and neuroeconomics: Recent progress and ongoing challenges. Wiley Interdisciplinary Reviews. Cognitive Science 2022; 13:e1589. [PMID: 35137549 PMCID: PMC9124684 DOI: 10.1002/wcs.1589] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/05/2020] [Revised: 11/28/2021] [Accepted: 12/21/2021] [Indexed: 01/10/2023]
Abstract
In the past decade, decision neuroscience and neuroeconomics have developed many new insights in the study of decision making. This review provides an overarching update on how the field has advanced in this time period. Although our initial review a decade ago outlined several theoretical, conceptual, methodological, empirical, and practical challenges, there has only been limited progress in resolving these challenges. We summarize significant trends in decision neuroscience through the lens of the challenges outlined for the field and review examples where the field has had significant, direct, and applicable impacts across economics and psychology. First, we review progress on topics including reward learning, explore-exploit decisions, risk and ambiguity, intertemporal choice, and valuation. Next, we assess the impacts of emotion, social rewards, and social context on decision making. Then, we follow up with how individual differences impact choices and new exciting developments in the prediction and neuroforecasting of future decisions. Finally, we consider how trends in decision-neuroscience research reflect progress toward resolving past challenges, discuss new and exciting applications of recent research, and identify new challenges for the field. This article is categorized under: Psychology > Reasoning and Decision Making; Psychology > Emotion and Motivation.
Affiliation(s)
- Jeffrey B Dennison
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
- Daniel Sazhin
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
- David V Smith
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
|
11
|
Gmaz JM, van der Meer MAA. Context coding in the mouse nucleus accumbens modulates motivationally relevant information. PLoS Biol 2022; 20:e3001338. [PMID: 35486662 PMCID: PMC9094556 DOI: 10.1371/journal.pbio.3001338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 05/11/2022] [Accepted: 04/04/2022] [Indexed: 11/18/2022] Open
Abstract
Neural activity in the nucleus accumbens (NAc) is thought to track fundamentally value-centric quantities linked to reward and effort. However, the NAc also contributes to flexible behavior in ways that are difficult to explain based on value signals alone, raising the question of if and how nonvalue signals are encoded in NAc. We recorded NAc neural ensembles while head-fixed mice performed an odor-based biconditional discrimination task where an initial discrete cue modulated the behavioral significance of a subsequently presented reward-predictive cue. We extracted single-unit and population-level correlates related to the cues and found value-independent coding for the initial, context-setting cue. This context signal occupied a population-level coding space orthogonal to outcome-related representations and was predictive of subsequent behaviorally relevant responses to the reward-predictive cues. Together, these findings support a gating model for how the NAc contributes to behavioral flexibility and provide a novel population-level perspective from which to view NAc computations.
Affiliation(s)
- Jimmie M. Gmaz
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States of America
- Matthijs A. A. van der Meer
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States of America
|
12
|
|
13
|
Bermudez-Contreras E, Clark BJ, Wilber A. The Neuroscience of Spatial Navigation and the Relationship to Artificial Intelligence. Front Comput Neurosci 2020; 14:63. [PMID: 32848684 PMCID: PMC7399088 DOI: 10.3389/fncom.2020.00063] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/31/2019] [Accepted: 05/28/2020] [Indexed: 11/13/2022] Open
Abstract
Recent advances in artificial intelligence (AI) and neuroscience are impressive. In AI, this includes the development of computer programs that can beat a grandmaster at Go or outperform human radiologists at cancer detection. Many of these technological developments are directly related to progress in artificial neural networks, which were initially inspired by our knowledge of how the brain carries out computation. In parallel, neuroscience has also experienced significant advances in understanding the brain. For example, in the field of spatial navigation, knowledge about the mechanisms and brain regions involved in neural computations of cognitive maps (an internal representation of space) recently received the Nobel Prize in medicine. Much of the recent progress in neuroscience has partly been due to the development of technology used to record from very large populations of neurons in multiple regions of the brain with exquisite temporal and spatial resolution in behaving animals. With the advent of the vast quantities of data that these techniques allow us to collect, there has been an increased interest in the intersection between AI and neuroscience; many of these intersections involve using AI as a novel tool to explore and analyze these large data sets. However, given the common initial motivation point (to understand the brain), these disciplines could be more strongly linked. Currently much of this potential synergy is not being realized. We propose that spatial navigation is an excellent area in which these two disciplines can converge to help advance what we know about the brain. In this review, we first summarize progress in the neuroscience of spatial navigation and reinforcement learning. We then turn our attention to discuss how spatial navigation has been modeled using descriptive, mechanistic, and normative approaches and the use of AI in such models. Next, we discuss how AI can advance neuroscience, how neuroscience can advance AI, and the limitations of these approaches. We finally conclude by highlighting promising lines of research in which spatial navigation can be the point of intersection between neuroscience and AI and how this can contribute to the advancement of the understanding of intelligent behavior.
Affiliation(s)
- Benjamin J. Clark
  - Department of Psychology, University of New Mexico, Albuquerque, NM, United States
- Aaron Wilber
  - Department of Psychology, Program in Neuroscience, Florida State University, Tallahassee, FL, United States
|
14
|
Huang Y, Yaple ZA, Yu R. Goal-oriented and habitual decisions: Neural signatures of model-based and model-free learning. Neuroimage 2020; 215:116834. [PMID: 32283275 DOI: 10.1016/j.neuroimage.2020.116834] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/26/2019] [Revised: 03/03/2020] [Accepted: 04/08/2020] [Indexed: 11/26/2022] Open
Abstract
Human decision-making is mainly driven by two fundamental learning processes: a slow, deliberative, goal-directed model-based process that maps out the potential outcomes of all options and a rapid habitual model-free process that enables reflexive repetition of previously successful choices. Although many model-informed neuroimaging studies have examined the neural correlates of model-based and model-free learning, the concordant activity among these two processes remains unclear. We used quantitative meta-analyses of functional magnetic resonance imaging experiments to identify the concordant activity pertaining to model-based and model-free learning over a range of reward-related paradigms. We found that: 1) both processes yielded concordant ventral striatum activity, 2) model-based learning activated the medial prefrontal cortex and orbital frontal cortex, and 3) model-free learning specifically activated the left globus pallidus and right caudate head. Our findings suggest that model-free and model-based decision making engage overlapping yet distinct neural regions. These stereotaxic maps improve our understanding of how deliberative goal-directed and reflexive habitual learning are implemented in the brain.
Affiliation(s)
- Yi Huang
- NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore
- Zachary A Yaple
- Department of Psychology, National University of Singapore, Singapore
- Rongjun Yu
- NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore; Department of Psychology, National University of Singapore, Singapore.
|
15
|
Rusu SI, Pennartz CMA. Learning, memory and consolidation mechanisms for behavioral control in hierarchically organized cortico-basal ganglia systems. Hippocampus 2019; 30:73-98. [PMID: 31617622 PMCID: PMC6972576 DOI: 10.1002/hipo.23167] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/02/2018] [Revised: 09/09/2019] [Accepted: 09/11/2019] [Indexed: 01/05/2023]
Abstract
This article aims to provide a synthesis on the question of how brain structures cooperate to accomplish hierarchically organized behaviors, characterized by low‐level, habitual routines nested in larger sequences of planned, goal‐directed behavior. The functioning of a connected set of brain structures (prefrontal cortex, hippocampus, striatum, and dopaminergic mesencephalon) is reviewed in relation to two important distinctions: (a) goal‐directed as opposed to habitual behavior and (b) model‐based and model‐free learning. Recent evidence indicates that the orbitomedial prefrontal cortices not only subserve goal‐directed behavior and model‐based learning, but also code the “landscape” (task space) of behaviorally relevant variables. While the hippocampus stands out for its role in coding and memorizing world state representations, it is argued to function in model‐based learning but is not required for coding of action–outcome contingencies, illustrating that goal‐directed behavior is not congruent with model‐based learning. While the dorsolateral and dorsomedial striatum largely conform to the dichotomy between habitual and goal‐directed behavior, ventral striatal functions go beyond this distinction. Next, we contextualize findings on coding of reward‐prediction errors by ventral tegmental dopamine neurons to suggest a broader role of mesencephalic dopamine cells, viz. in behavioral reactivity and signaling unexpected sensory changes. We hypothesize that goal‐directed behavior is hierarchically organized in interconnected cortico‐basal ganglia loops, where a limbic‐affective prefrontal‐ventral striatal loop controls action selection in a dorsomedial prefrontal–striatal loop, which in turn regulates activity in sensorimotor‐dorsolateral striatal circuits. This structure for behavioral organization requires alignment with mechanisms for memory formation and consolidation.
We propose that frontal corticothalamic circuits form a high‐level loop for memory processing that initiates and temporally organizes nested activities in lower‐level loops, including the hippocampus and the ripple‐associated replay it generates. The evidence on hierarchically organized behavior converges with that on consolidation mechanisms in suggesting a frontal‐to‐caudal directionality in processing control.
Affiliation(s)
- Silviu I Rusu
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands; Research Priority Program Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
- Cyriel M A Pennartz
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands; Research Priority Program Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
|
16
|
Pezzulo G, Donnarumma F, Maisto D, Stoianov I. Planning at decision time and in the background during spatial navigation. Curr Opin Behav Sci 2019. [DOI: 10.1016/j.cobeha.2019.04.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022]
|
17
|
Abstract
A popular distinction in the human and animal learning literature is between deliberate (or willed) and habitual (or automatic) modes of control. Extensive evidence indicates that, after sufficient learning, living organisms develop behavioural habits that permit them to save computational resources. Furthermore, humans and other animals are able to transfer control from deliberate to habitual modes (and vice versa), efficiently trading off flexibility and parsimony - an ability that is currently unparalleled by artificial control systems. Here, we discuss a computational implementation of habit formation, and the transfer of control from deliberate to habitual modes (and vice versa) within Active Inference: a computational framework that merges aspects of cybernetic theory and of Bayesian inference. To model habit formation, we endow an Active Inference agent with a mechanism to "cache" (or memorize) policy probabilities from previous trials, and reuse them to skip - in part or in full - the inferential steps of deliberative processing. We exploit the fact that the relative quality of policies, conditioned upon hidden states, is constant over trials, provided that contingencies and prior preferences do not change. This means the only quantity that can change policy selection is the prior distribution over the initial state - where this prior is based upon the posterior beliefs from previous trials. Thus, an agent that caches the quality (or the probability) of policies can safely reuse cached values to save on cognitive and computational resources - unless contingencies change. Our simulations illustrate the computational benefits, but also the limits, of three caching schemes under Active Inference. They suggest that key aspects of habitual behaviour - such as perseveration - can be explained in terms of caching policy probabilities.
Furthermore, they suggest that there may be many kinds (or stages) of habitual behaviour, each associated with a different caching scheme; for example, caching associated or not associated with contextual estimation. These schemes are more or less impervious to contextual and contingency changes.
Affiliation(s)
- D Maisto
- Institute for High Performance Computing and Networking, National Research Council, Via P. Castellino, 111, Naples 80131, Italy
- K Friston
- The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, UK
- G Pezzulo
- Institute of Cognitive Sciences and Technologies, National Research Council, Via San Martino della Battaglia 44, Rome 00185, Italy
|
18
|
Abstract
We discuss how uncertainty underwrites exploration and epistemic foraging from the perspective of active inference: a generic scheme that places pragmatic (utility maximization) and epistemic (uncertainty minimization) imperatives on an equal footing - as primary determinants of proximal behavior. This formulation contextualizes the complementary motivational incentives for reward-related stimuli and environmental uncertainty, offering a normative treatment of their trade-off.
|
19
|
Making the Environment an Informative Place: A Conceptual Analysis of Epistemic Policies and Sensorimotor Coordination. Entropy 2019; 21:e21040350. [PMID: 33267064 PMCID: PMC7514834 DOI: 10.3390/e21040350] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/20/2018] [Revised: 03/20/2019] [Accepted: 03/25/2019] [Indexed: 01/02/2023]
Abstract
How do living organisms decide and act with limited and uncertain information? Here, we discuss two computational approaches to solving these challenging problems: a "cognitive" and a "sensorimotor" enrichment of stimuli, respectively. In both approaches, the key notion is that agents can strategically modulate their behavior in informative ways, e.g., to disambiguate amongst alternative hypotheses or to favor the perception of stimuli providing the information necessary to later act appropriately. We discuss how, despite their differences, both approaches appeal to the notion that actions must obey both epistemic (i.e., information-gathering or uncertainty-reducing) and pragmatic (i.e., goal- or reward-maximizing) imperatives and balance them. Our computationally-guided analysis reveals that epistemic behavior is fundamental to understanding several facets of cognitive processing, including perception, decision making, and social interaction.
|