1. Buckley M, McGregor A, Ihssen N, Austen J, Thurlbeck S, Smith SP, Heinecke A, Lew AR. The well-worn route revisited: Striatal and hippocampal system contributions to familiar route navigation. Hippocampus 2024; 34:310-326. [PMID: 38721743] [DOI: 10.1002/hipo.23607]
Abstract
Classic research has shown a division in the neuroanatomical structures that support flexible (e.g., short-cutting) and habitual (e.g., familiar route following) navigational behavior, with hippocampal-caudate systems associated with the former and putamen systems with the latter. There is, however, disagreement about whether the neural structures involved in navigation process particular forms of spatial information, such as associations between constellations of cues forming a cognitive map, versus single landmark-action associations, or alternatively, perform particular reinforcement learning algorithms that allow the use of different spatial strategies, so-called model-based (flexible) or model-free (habitual) forms of learning. We sought to test these theories by asking participants (N = 24) to navigate within a virtual environment through a previously learned, 9-junction route with distinctive landmarks at each junction while undergoing functional magnetic resonance imaging (fMRI). In a series of probe trials, we distinguished knowledge of individual landmark-action associations along the route versus knowledge of the correct sequence of landmark-action associations, either by having absent landmarks, or "out-of-sequence" landmarks. Under a map-based perspective, sequence knowledge would not require hippocampal systems, because there are no constellations of cues available for cognitive map formation. Within a learning-based model, however, responding based on knowledge of sequence would require hippocampal systems because prior context has to be utilized. We found that hippocampal-caudate systems were more active in probes requiring sequence knowledge, supporting the learning-based model. However, we also found greater putamen activation in probes where navigation based purely on sequence memory could be planned, supporting models of putamen function that emphasize its role in action sequencing.
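The landmark-action versus sequence-knowledge contrast at the heart of this study can be made concrete in code. The sketch below is purely illustrative, not the authors' task or analysis: the route, landmark names, learning rule, and probe types are all made up for the example. A model-free "habit" learner stores bare landmark-action associations, while a sequence-based learner can still respond when a landmark is absent or out of sequence.

```python
import random
random.seed(0)

ROUTE = ["oak", "fountain", "statue", "bridge"]           # landmark at junctions 0..3
CORRECT = {"oak": "L", "fountain": "R", "statue": "L", "bridge": "R"}
ACTIONS = ["L", "R"]

# --- Model-free learner: stateless landmark -> action values (habit) ---
q = {lm: {a: 0.0 for a in ACTIONS} for lm in ROUTE}
alpha = 0.5
for _ in range(200):                                      # trial-and-error practice runs
    for lm in ROUTE:
        a = random.choice(ACTIONS)
        r = 1.0 if a == CORRECT[lm] else 0.0
        q[lm][a] += alpha * (r - q[lm][a])                # simple delta rule

def habit(landmark):
    """Respond to the landmark in view, ignoring where we are in the route."""
    return max(q[landmark], key=q[landmark].get)

# --- Sequence-based learner: stores the route order itself ---
# Because it knows the junction order, it can act from position alone
# (absent-landmark probe) or detect an out-of-sequence landmark.
def planned(position, landmark=None):
    lm = landmark if landmark is not None else ROUTE[position]
    return CORRECT[lm]

print(habit("fountain"))            # habit works while the landmark is present
print(planned(2))                   # sequence knowledge covers an absent landmark
```

The point of the toy contrast: only the sequence-based learner has anything to say on an absent-landmark probe, which is why such probes can dissociate the two kinds of knowledge behaviorally.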
Affiliation(s)
- Niklas Ihssen
- Department of Psychology, Durham University, Durham, UK
- Joseph Austen
- Department of Psychology, Durham University, Durham, UK
- Shamus P Smith
- School of Information and Physical Sciences, University of Newcastle Australia, Callaghan, New South Wales, Australia
- Adina R Lew
- Department of Psychology, Lancaster University, Lancaster, UK

2. Parrini M, Tricot G, Caroni P, Spolidoro M. Circuit mechanisms of navigation strategy learning in mice. Curr Biol 2024; 34:79-91.e4. [PMID: 38101403] [DOI: 10.1016/j.cub.2023.11.047]
Abstract
Navigation tasks involve the gradual selection and deployment of increasingly effective searching procedures to reach targets. The brain mechanisms underlying such complex behavior are poorly understood, but their elucidation might provide insights into the systems linking exploration and decision making in complex learning. Here, we developed a trial-by-trial goal-related search strategy analysis as mice learned to navigate identical water mazes encompassing distinct goal-related rules and monitored the strategy deployment process throughout learning. We found that navigation learning involved the following three distinct phases: an early phase during which maze-specific search strategies are deployed in a minority of trials, a second phase of preferential increasing deployment of one search strategy, and a final phase of increasing commitment to this strategy only. The three maze learning phases were affected differently by inhibition of retrosplenial cortex (RSC), dorsomedial striatum (DMS), or dorsolateral striatum (DLS). Through brain region-specific inactivation experiments and gain-of-function experiments involving activation of learning-related cFos+ ensembles, we unraveled how goal-related strategy selection relates to deployment throughout these sequential processes. We found that RSC is critically important for search strategy selection, DMS mediates strategy deployment, and DLS ensures searching consistency throughout maze learning. Notably, activation of specific learning-related ensembles was sufficient to direct strategy selection (RSC) or strategy deployment (DMS) in a different maze. Our results establish a goal-related search strategy deployment approach to dissect unsupervised navigation learning processes and suggest that effective searching in navigation involves evidence-based goal-related strategy direction by RSC, reinforcement-modulated strategy deployment through DMS, and online guidance through DLS.
Affiliation(s)
- Martina Parrini
- Friedrich Miescher Institute for Biomedical Research, 4058 Basel, Switzerland
- Guillaume Tricot
- Friedrich Miescher Institute for Biomedical Research, 4058 Basel, Switzerland
- Pico Caroni
- Friedrich Miescher Institute for Biomedical Research, 4058 Basel, Switzerland
- Maria Spolidoro
- Friedrich Miescher Institute for Biomedical Research, 4058 Basel, Switzerland

3. Diekmann N, Vijayabaskaran S, Zeng X, Kappel D, Menezes MC, Cheng S. CoBeL-RL: A neuroscience-oriented simulation framework for complex behavior and learning. Front Neuroinform 2023; 17:1134405. [PMID: 36970657] [PMCID: PMC10033763] [DOI: 10.3389/fninf.2023.1134405]
Abstract
Reinforcement learning (RL) has become a popular paradigm for modeling animal behavior, analyzing neuronal representations, and studying their emergence during learning. This development has been fueled by advances in understanding the role of RL in both the brain and artificial intelligence. However, while in machine learning a set of tools and standardized benchmarks facilitate the development of new methods and their comparison to existing ones, in neuroscience the software infrastructure is much more fragmented. Even when they share theoretical principles, computational studies rarely share software frameworks, which impedes the integration and comparison of different results. Machine learning tools are also difficult to port to computational neuroscience, since the experimental requirements are usually not well aligned. To address these challenges, we introduce CoBeL-RL, a closed-loop simulator of complex behavior and learning based on RL and deep neural networks. It provides a neuroscience-oriented framework for efficiently setting up and running simulations. CoBeL-RL offers a set of virtual environments, e.g., T-maze and Morris water maze, which can be simulated at different levels of abstraction, e.g., a simple gridworld or a 3D environment with complex visual stimuli, and set up using intuitive GUI tools. A range of RL algorithms, e.g., Dyna-Q and deep Q-network algorithms, is provided and can be easily extended. CoBeL-RL provides tools for monitoring and analyzing behavior and unit activity, and allows fine-grained control of the simulation via interfaces to relevant points in its closed loop. In summary, CoBeL-RL fills an important gap in the software toolbox of computational neuroscience.
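The abstract names Dyna-Q on a gridworld as one of the packaged agent/environment combinations. CoBeL-RL's actual API is not reproduced here; the following is a generic, self-contained Dyna-Q sketch on a one-dimensional corridor, with every name and parameter chosen for illustration only:

```python
import random
random.seed(1)

# 1-D gridworld corridor: states 0..4, goal at 4; actions +1 (right), -1 (left)
N, GOAL = 5, 4
ACTIONS = [1, -1]
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
model = {}                                   # (s, a) -> (reward, next state), learned online
alpha, gamma, eps, n_planning = 0.5, 0.9, 0.1, 10

def step(s, a):
    s2 = min(max(s + a, 0), N - 1)
    return (1.0 if s2 == GOAL else 0.0), s2

for _ in range(50):                          # episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        r, s2 = step(s, a)
        # direct RL update from real experience
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        model[(s, a)] = (r, s2)              # record experience in the model
        # planning: replay simulated experience from the model (the Dyna part)
        for _ in range(n_planning):
            ps, pa = random.choice(list(model))
            pr, ps2 = model[(ps, pa)]
            Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, b)] for b in ACTIONS) - Q[(ps, pa)])
        s = s2

greedy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N - 1)]
print(greedy)                                # expect [1, 1, 1, 1]: head right from every state
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real step is amplified by `n_planning` simulated updates drawn from the learned model.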
Affiliation(s)
- Nicolas Diekmann
- Faculty for Computer Science, Institute for Neural Computation, Ruhr University Bochum, Bochum, Germany
- International Graduate School of Neuroscience, Ruhr University Bochum, Bochum, Germany
- Sandhiya Vijayabaskaran
- Faculty for Computer Science, Institute for Neural Computation, Ruhr University Bochum, Bochum, Germany
- Xiangshuai Zeng
- Faculty for Computer Science, Institute for Neural Computation, Ruhr University Bochum, Bochum, Germany
- International Graduate School of Neuroscience, Ruhr University Bochum, Bochum, Germany
- David Kappel
- Faculty for Computer Science, Institute for Neural Computation, Ruhr University Bochum, Bochum, Germany
- Matheus Chaves Menezes
- Laboratory of Artificial Cognition Methods for Optimisation and Robotics, Federal University of Maranhão, São Luís, Brazil
- Sen Cheng
- Faculty for Computer Science, Institute for Neural Computation, Ruhr University Bochum, Bochum, Germany
- Correspondence: Sen Cheng

4. Kassim FM, Lahooti SK, Keay EA, Iyyalol R, Rodger J, Albrecht MA, Martin-Iverson MT. Dexamphetamine widens temporal and spatial binding windows in healthy participants. J Psychiatry Neurosci 2023; 48:E90-E98. [PMID: 36918195] [PMCID: PMC10019325] [DOI: 10.1503/jpn.220149]
Abstract
BACKGROUND The pathophysiology of psychosis is complex, but a better understanding of stimulus binding windows (BWs) could improve our knowledge base. Previous studies have shown that dopamine release is associated with psychosis and widened BWs, and BW mechanisms can be probed using drugs of specific interest to psychosis. We were therefore interested in how manipulation of the dopamine or catecholamine systems affects psychosis and BWs, and aimed to investigate the effect of dexamphetamine, a dopamine-releasing stimulant, on BWs in a unimodal illusion: the tactile funneling illusion (TFI). METHODS We conducted a randomized, double-blind, counterbalanced, placebo-controlled crossover study of funneling and errors of localization. We administered dexamphetamine (0.45 mg/kg) to 46 participants and manipulated 5 spatial (5-1 cm) and 3 temporal (0, 500 and 750 ms) conditions in the TFI. RESULTS Dexamphetamine increased the funneling illusion (p = 0.009) and increased the error of localization in a delay-dependent manner (p = 0.03). Dexamphetamine also significantly increased the error of localization at 500 ms temporal separation and 4 cm spatial separation (interaction p = 0.009; 500 ms, 4 cm vs. baseline p = 0.01). LIMITATIONS Although amphetamine-induced models of psychosis are a useful approach to understanding the physiology of psychosis related to dopamine hyperactivity, dexamphetamine is equally effective at releasing noradrenaline and dopamine, so we were unable to tease apart the effects of the 2 systems on BWs. CONCLUSION Dexamphetamine increases illusory perception on the unimodal TFI in healthy participants, suggesting that dopamine or other catecholamines play a role in widening tactile spatial and temporal BWs.
Affiliation(s)
- Faiz M Kassim
- Samra Krakonja Lahooti
- Elizabeth Ann Keay
- Rajan Iyyalol
- Jennifer Rodger
- Matthew A Albrecht
- Mathew T Martin-Iverson
- From the Department of Psychiatry, St. Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia (Kassim); the Psychopharmacology Unit, School of Biomedical Sciences, University of Western Australia, Perth, WA, Australia (Kassim, Lahooti, Keay, Martin-Iverson); Psychiatry, Graylands Hospital, Mt Claremont, Perth, WA, Australia (Iyyalol); Experimental and Regenerative Neurosciences, School of Biological Sciences, University of Western Australia, Crawley, WA, Australia (Rodger); the Brain Plasticity Group, Perron Institute for Neurological and Translational Science, Nedlands, WA, Australia (Rodger); the Western Australian Centre for Road Safety Research, School of Psychological Science, University of Western Australia, Perth, WA, Australia (Albrecht)

5. Reducing Computational Cost During Robot Navigation and Human–Robot Interaction with a Human-Inspired Reinforcement Learning Architecture. Int J Soc Robot 2022. [DOI: 10.1007/s12369-022-00942-6]

6. Vijayabaskaran S, Cheng S. Navigation task and action space drive the emergence of egocentric and allocentric spatial representations. PLoS Comput Biol 2022; 18:e1010320. [PMID: 36315587] [PMCID: PMC9648855] [DOI: 10.1371/journal.pcbi.1010320]
Abstract
In general, strategies for spatial navigation can employ one of two spatial reference frames: egocentric or allocentric. Notwithstanding intuitive explanations, it remains unclear, however, under what circumstances one strategy is chosen over the other, and how neural representations relate to the chosen strategy. Here, we first use a deep reinforcement learning model to investigate whether a particular type of navigation strategy arises spontaneously during spatial learning without imposing a bias onto the model. We then examine the spatial representations that emerge in the network to support navigation. To this end, we study two tasks that are ethologically valid for mammals: guidance, where the agent has to navigate to a goal location fixed in allocentric space, and aiming, where the agent navigates to a visible cue. We find that when both navigation strategies are available to the agent, the solutions it develops for guidance and aiming are heavily biased towards the allocentric or the egocentric strategy, respectively, as one might expect. Nevertheless, the agent can learn both tasks using either type of strategy. Furthermore, we find that place-cell-like allocentric representations emerge preferentially in guidance when using an allocentric strategy, whereas egocentric vector representations emerge when using an egocentric strategy in aiming. We thus find that alongside the type of navigational strategy, the nature of the task plays a pivotal role in the type of spatial representations that emerge.

Most species rely on navigation in space to find water, food, and mates, as well as to return home. When navigating, humans and animals can use one of two reference frames: one based on stable landmarks in the external environment, such as moving due north and then east, or one centered on oneself, such as moving forward and turning left. However, it remains unclear how these reference frames are chosen and interact in navigation tasks, and how they are supported by representations in the brain. We therefore modeled two navigation tasks that would each benefit from using one of these reference frames, and trained an artificial agent to learn to solve them through trial and error. Our results show that when given the choice, the agent leveraged the appropriate reference frame to solve the task, but surprisingly could also use the other reference frame when constrained to do so. We also show that the representations that emerge to enable the agent to solve the tasks exist on a spectrum, and are more complex than commonly thought. These representations reflect both the task and the reference frame being used, and provide useful insights for the design of experimental tasks to study the use of navigational strategies.
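The egocentric/allocentric action-space distinction this study manipulates can be made concrete with a toy sketch. The grid coordinates, heading labels, and action names below are illustrative assumptions, not the paper's implementation:

```python
# Allocentric actions: displacements fixed in world coordinates,
# independent of which way the agent is facing ("move due east").
ALLO = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}

def allo_step(pos, action):
    dx, dy = ALLO[action]
    return (pos[0] + dx, pos[1] + dy)

# Egocentric actions: defined relative to the agent's current heading
# ("move forward", "turn left"), so the same action means different
# world displacements depending on the heading.
HEADINGS = ["N", "E", "S", "W"]               # clockwise order

def ego_step(pos, heading, action):
    i = HEADINGS.index(heading)
    if action == "turn_left":
        return pos, HEADINGS[(i - 1) % 4]
    if action == "turn_right":
        return pos, HEADINGS[(i + 1) % 4]
    return allo_step(pos, heading), heading   # "forward": move along the heading

print(allo_step((0, 0), "E"))                 # (1, 0), whatever the agent faces
print(ego_step((0, 0), "N", "turn_right"))    # ((0, 0), 'E')
print(ego_step((0, 0), "E", "forward"))       # ((1, 0), 'E')
```

An agent given only `ALLO` actions never needs to track its heading, while one given only egocentric actions must, which is one intuition for why the available action space shapes the representations that emerge.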
Affiliation(s)
- Sen Cheng
- Faculty of Computer Science, Ruhr University Bochum, Bochum, Germany

7. Qiao H, Chen J, Huang X. A Survey of Brain-Inspired Intelligent Robots: Integration of Vision, Decision, Motion Control, and Musculoskeletal Systems. IEEE Trans Cybern 2022; 52:11267-11280. [PMID: 33909584] [DOI: 10.1109/tcyb.2021.3071312]
Abstract
Current robotic studies focus on the performance of specific tasks. However, such tasks cannot be generalized, and some special tasks, such as compliant and precise manipulation, fast and flexible response, and deep collaboration between humans and robots, cannot be realized. Brain-inspired intelligent robots imitate humans and animals, from inner mechanisms to external structures, through an integration of visual cognition, decision making, motion control, and musculoskeletal systems. Such robots are more likely to realize the functions that current robots cannot and to become companions to humans. With a focus on the development of brain-inspired intelligent robots, this article reviews cutting-edge research in the areas of brain-inspired visual cognition, decision making, musculoskeletal robots, motion control, and their integration. It aims to provide greater insight into brain-inspired intelligent robots and to attract more attention to this field from the global research community.

8. A Brain-Inspired Model of Hippocampal Spatial Cognition Based on a Memory-Replay Mechanism. Brain Sci 2022; 12:1176. [PMID: 36138911] [PMCID: PMC9496859] [DOI: 10.3390/brainsci12091176]
Abstract
Since the hippocampus plays an important role in memory and spatial cognition, spatial computation models inspired by the hippocampus have attracted much attention. Such models rely mainly on reward signals to learn the environment and plan paths. Because reward signals attenuate sharply in complex or large-scale environments, the spatial cognition and path-planning performance of these models degrades accordingly. To solve this problem, we present a Memory-Replay Mechanism inspired by the reactivation function of place cells in the hippocampus. We classify path memories according to their reward information and find the overlapping place cells in different categories of path memory to segment and reconstruct the memories into a "virtual path", replaying the memory in association with the reward information. We conducted a series of navigation experiments in a simple environment, the Morris water maze (MWM), and in a complex environment, and compared our model with a reinforcement learning (RL) model and other brain-inspired models. The experimental results show that, under the same conditions, our model explores the environment at a higher rate with more stable signal transmission, and the average reward obtained under stable conditions was 14.12% higher than RL with random-experience replay. Our model also performs well in complex maze environments where signals are easily attenuated, and its behavior at bifurcations is consistent with neurophysiological studies.
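The memory-segmentation idea, splicing remembered paths at overlapping place-cell activity to form a never-travelled "virtual path", can be sketched minimally. The place names and splice rule below are illustrative assumptions, not the paper's model:

```python
# Two remembered paths that share a place ("C"); splice them at the overlap
# to construct a route the agent has never actually travelled end to end.
path_to_reward = ["A", "B", "C", "D", "goal"]     # a rewarded route passing through C
new_path = ["X", "Y", "C"]                        # an unrewarded route ending at C

def splice_virtual_path(unrewarded, rewarded):
    """Join two paths at their first shared place to form a 'virtual path'."""
    for i, place in enumerate(unrewarded):
        if place in rewarded:                     # overlapping place cell found
            j = rewarded.index(place)
            return unrewarded[:i] + rewarded[j:]  # prefix of one + rewarded tail of the other
    return None                                   # no overlap: nothing to splice

print(splice_virtual_path(new_path, path_to_reward))   # ['X', 'Y', 'C', 'D', 'goal']
```

Replaying such spliced paths lets reward information propagate to states that were never directly followed by reward, which is one way to counter reward attenuation in large environments.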

9. Suzuki M, Nishimura Y. The ventral striatum contributes to the activity of the motor cortex and motor outputs in monkeys. Front Syst Neurosci 2022; 16:979272. [PMID: 36211590] [PMCID: PMC9540202] [DOI: 10.3389/fnsys.2022.979272]
Abstract
The ventral striatum (VSt) is thought to be involved in the vigor of motivated behavior and is suggested to be a limbic-motor interface between limbic areas involved in motivational processes and neural circuits regulating behavioral outputs. However, there is little direct evidence demonstrating the involvement of the VSt in motor control for motivated behaviors. To clarify the functional role of the VSt in motor control, we investigated the effect of reversible pharmacological inactivation of the VSt on the oscillatory activity of the sensorimotor cortices and motor outputs in two macaque monkeys. VSt inactivation reduced movement-related activities of the primary motor cortex and premotor area at 15–120 Hz and increased those at 5–7 Hz. These changes were accompanied by reduced torque outputs but had no effect on the correct performance rate. The present study provides direct evidence that the VSt regulates activities of the motor cortices and motor output.
Affiliation(s)
- Michiaki Suzuki
- Division of Behavioral Development, Department of Developmental Physiology, National Institute for Physiological Sciences, Okazaki, Japan
- Department of Physiological Sciences, School of Life Science, SOKENDAI, Hayama, Japan
- Department of Neuroscience, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Japan Society for the Promotion of Science, Tokyo, Japan
- Neural Prosthetics Project, Department of Brain and Neurosciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Yukio Nishimura
- Division of Behavioral Development, Department of Developmental Physiology, National Institute for Physiological Sciences, Okazaki, Japan
- Department of Physiological Sciences, School of Life Science, SOKENDAI, Hayama, Japan
- Department of Neuroscience, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Neural Prosthetics Project, Department of Brain and Neurosciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Correspondence: Yukio Nishimura

10. Massi E, Barthélemy J, Mailly J, Dromnelle R, Canitrot J, Poniatowski E, Girard B, Khamassi M. Model-Based and Model-Free Replay Mechanisms for Reinforcement Learning in Neurorobotics. Front Neurorobot 2022; 16:864380. [PMID: 35812782] [PMCID: PMC9263850] [DOI: 10.3389/fnbot.2022.864380]
Abstract
Experience replay is widely used in AI to bootstrap reinforcement learning (RL) by enabling an agent to remember and reuse past experiences. Classical techniques include shuffled, reversed-order, and prioritized memory buffers, which have different properties and advantages depending on the nature of the data and problem. Interestingly, recent computational neuroscience work has shown that these techniques are relevant for modeling hippocampal reactivations recorded during rodent navigation. Nevertheless, the brain mechanisms for orchestrating hippocampal replay are still unclear. In this paper, we present recent neurorobotics research aiming to endow a navigating robot with a neuro-inspired RL architecture, including different learning strategies, such as model-based (MB) and model-free (MF) learning, and different replay techniques. We illustrate through a series of numerical simulations how the specificities of robotic experimentation (e.g., autonomous state decomposition by the robot, noisy perception, state transition uncertainty, non-stationarity) can shed new light on which replay techniques turn out to be more efficient in different situations. Finally, we close the loop by raising new hypotheses for neuroscience from such robotic models of hippocampal replay.
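The three classical replay orderings mentioned above can be sketched in a few lines. The toy trajectory is made up, and reward magnitude is used here as a simple stand-in for the TD-error priority that real prioritized replay would use:

```python
import random
random.seed(2)

# A toy trajectory of (state, action, reward, next_state) transitions,
# with reward only on the final step.
trajectory = [(s, "fwd", 1.0 if s == 3 else 0.0, s + 1) for s in range(4)]

def shuffled_replay(buf):
    out = list(buf)
    random.shuffle(out)                       # decorrelates consecutive samples
    return out

def reversed_replay(buf):
    return list(reversed(buf))                # propagates reward backwards quickly

def prioritized_replay(buf):
    # Replay high-|reward| transitions first (stand-in for TD-error priority).
    return sorted(buf, key=lambda t: abs(t[2]), reverse=True)

print(reversed_replay(trajectory)[0])         # the rewarded final transition comes first
print(prioritized_replay(trajectory)[0][2])   # 1.0
```

Reversed and prioritized orderings both surface the rewarded transition early, which is why they speed up value propagation relative to shuffled replay on sparse-reward trajectories.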

11. Gmaz JM, van der Meer MAA. Context coding in the mouse nucleus accumbens modulates motivationally relevant information. PLoS Biol 2022; 20:e3001338. [PMID: 35486662] [PMCID: PMC9094556] [DOI: 10.1371/journal.pbio.3001338]
Abstract
Neural activity in the nucleus accumbens (NAc) is thought to track fundamentally value-centric quantities linked to reward and effort. However, the NAc also contributes to flexible behavior in ways that are difficult to explain based on value signals alone, raising the question of whether and how nonvalue signals are encoded in the NAc. We recorded NAc neural ensembles while head-fixed mice performed an odor-based biconditional discrimination task in which an initial discrete cue modulated the behavioral significance of a subsequently presented reward-predictive cue. We extracted single-unit and population-level correlates related to the cues and found value-independent coding for the initial, context-setting cue. This context signal occupied a population-level coding space orthogonal to outcome-related representations and was predictive of subsequent behaviorally relevant responses to the reward-predictive cues. Together, these findings support a gating model for how the NAc contributes to behavioral flexibility and provide a novel population-level perspective from which to view NAc computations.
Affiliation(s)
- Jimmie M. Gmaz
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States of America
- Matthijs A. A. van der Meer
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States of America

12. Feng Z, Nagase AM, Morita K. A Reinforcement Learning Approach to Understanding Procrastination: Does Inaccurate Value Approximation Cause Irrational Postponing of a Task? Front Neurosci 2021; 15:660595. [PMID: 34602962] [PMCID: PMC8481628] [DOI: 10.3389/fnins.2021.660595]
Abstract
Procrastination is the voluntary but irrational postponing of a task despite being aware that the delay can lead to worse consequences. It has been studied extensively in psychology, from contributing factors to theoretical models. From a value-based decision-making and reinforcement learning (RL) perspective, procrastination has been suggested to be caused by non-optimal choices resulting from cognitive limitations. Exactly what sort of cognitive limitations are involved, however, remains elusive. In the current study, we examined whether a particular type of cognitive limitation, namely inaccurate valuation resulting from inadequate state representation, can cause procrastination. Recent work has suggested that humans may adopt a particular type of state representation called the successor representation (SR) and that humans can learn to represent states by relatively low-dimensional features. Combining these suggestions, we assumed a dimension-reduced version of the SR. We modeled a series of behaviors of a "student" doing assignments during the school term, when putting off doing the assignments (i.e., procrastination) is not allowed, and during the vacation, when whether to procrastinate can be freely chosen. We assumed that the "student" had acquired a rigid reduced SR of each state, corresponding to each step in completing an assignment, under the policy without procrastination. The "student" learned the approximated value of each state, computed as a linear function of the features of the states in the rigid reduced SR, through temporal-difference (TD) learning. During the vacation, the "student" decided at each time step whether to procrastinate based on these approximated values. Simulation results showed that the reduced SR-based RL model generated procrastination behavior, which worsened across episodes.
According to the values approximated by the "student," to procrastinate was the better choice, whereas not to procrastinate was mostly better according to the true values. Thus, the current model generated procrastination behavior caused by inaccurate value approximation, which resulted from the adoption of the reduced SR as state representation. These findings indicate that the reduced SR, or more generally, the dimension reduction in state representation, can be a potential form of cognitive limitation that leads to procrastination.
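The learning scheme the abstract describes, TD learning of a linear value function over fixed, dimension-reduced SR-style features, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the number of states, the feature matrix, and all parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5-state task: each step of an assignment is a state,
# with reward delivered only on completion.
n_states, n_features = 5, 2
true_reward = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
gamma = 0.9

# A rigid, dimension-reduced state representation: each state is
# described by a fixed low-dimensional feature vector.
phi = rng.random((n_states, n_features))

# TD(0) learning of a linear value approximation V(s) = phi[s] @ w.
w = np.zeros(n_features)
alpha = 0.1
for sweep in range(500):
    for s in range(n_states):
        s_next = min(s + 1, n_states - 1)
        v = phi[s] @ w
        v_next = 0.0 if s == n_states - 1 else phi[s_next] @ w  # terminal state
        delta = true_reward[s] + gamma * v_next - v             # TD error
        w += alpha * delta * phi[s]

values = phi @ w
# With only 2 features for 5 states, the approximated values are
# systematically distorted relative to the true values -- the kind of
# inaccuracy the model links to procrastination.
print(values)
```

Because five states are compressed into two features, the fitted values cannot match the true values exactly, which is the approximation error the model exploits.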
Affiliation(s)
- Zheyu Feng
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Asako Mitsuto Nagase
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Division of Neurology, Department of Brain and Neurosciences, Faculty of Medicine, Tottori University, Yonago, Japan
- Research Fellowship for Young Scientists, Japan Society for the Promotion of Science, Tokyo, Japan
- Department of Neurology, Faculty of Medicine, Shimane University, Izumo, Japan
- Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo, Japan
13
Wittkuhn L, Chien S, Hall-McMaster S, Schuck NW. Replay in minds and machines. Neurosci Biobehav Rev 2021; 129:367-388. [PMID: 34371078 DOI: 10.1016/j.neubiorev.2021.08.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2021] [Revised: 07/19/2021] [Accepted: 08/01/2021] [Indexed: 11/19/2022]
Abstract
Experience-related brain activity patterns reactivate during sleep, wakeful rest, and brief pauses from active behavior. In parallel, machine learning research has found that experience replay can lead to substantial performance improvements in artificial agents. Together, these lines of research suggest replay has a variety of computational benefits for decision-making and learning. Here, we provide an overview of putative computational functions of replay as suggested by machine learning and neuroscientific research. We show that replay can lead to faster learning, less forgetting, reorganization or augmentation of experiences, and support planning and generalization. In addition, we highlight the benefits of reactivating abstracted internal representations rather than veridical memories, and discuss how replay could provide a mechanism to build internal representations that improve learning and decision-making.
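In machine-learning terms, the experience replay discussed in this review amounts to storing transitions and re-sampling them for additional learning updates. A generic sketch (not code from the review; the class, capacities, and task are illustrative):

```python
import random
from collections import deque

class ReplayBuffer:
    """Store experienced transitions and re-sample them for learning."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest experiences drop out

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Random re-sampling breaks temporal correlations between
        # consecutive experiences, one of replay's computational benefits.
        return random.sample(self.buffer, batch_size)

random.seed(0)
buffer = ReplayBuffer(capacity=100)
for t in range(50):
    buffer.add(state=t, action=t % 4, reward=float(t == 49), next_state=t + 1)

batch = buffer.sample(8)
print(len(batch))  # transitions drawn for one off-line learning update
```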
Affiliation(s)
- Lennart Wittkuhn
- Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, Lentzeallee 94, D-14195 Berlin, Germany; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Lentzeallee 94, D-14195 Berlin, Germany
- Samson Chien
- Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, Lentzeallee 94, D-14195 Berlin, Germany; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Lentzeallee 94, D-14195 Berlin, Germany
- Sam Hall-McMaster
- Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, Lentzeallee 94, D-14195 Berlin, Germany; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Lentzeallee 94, D-14195 Berlin, Germany
- Nicolas W Schuck
- Max Planck Research Group NeuroCode, Max Planck Institute for Human Development, Lentzeallee 94, D-14195 Berlin, Germany; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Lentzeallee 94, D-14195 Berlin, Germany
14
Humphries MD, Gurney K. Making decisions in the dark basement of the brain: A look back at the GPR model of action selection and the basal ganglia. BIOLOGICAL CYBERNETICS 2021; 115:323-329. [PMID: 34272969 DOI: 10.1007/s00422-021-00887-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/29/2021] [Accepted: 07/06/2021] [Indexed: 06/13/2023]
Abstract
How does your brain decide what you will do next? Over the past few decades compelling evidence has emerged that the basal ganglia, a collection of nuclei in the fore- and mid-brain of all vertebrates, are vital to action selection. Gurney, Prescott, and Redgrave published an influential computational account of this idea in Biological Cybernetics in 2001. Here we take a look back at this pair of papers, outlining the "GPR" model contained therein, the context of that model's development, and the influence it has had over the past twenty years. Tracing its lineage into models and theories still emerging now, we are encouraged that the GPR model is that rare thing, a computational model of a brain circuit whose advances were directly built on by others.
15
Abstract
Blindsight is the residual visuo-motor ability, without subjective awareness, observed after lesions of the primary visual cortex (V1). Various visual functions are retained; however, instrumental visual associative learning remained to be investigated. Here we examined the secondary reinforcing properties of visual cues presented to the hemianopic field of macaque monkeys with unilateral V1 lesions. Our aim was to test the potential role of visual pathways bypassing V1 in reinforcing visual instrumental learning. When the monkeys learned the location of a hidden area in an oculomotor search task, conditioned visual cues presented to the lesion-affected hemifield operated as an effective secondary reinforcer. We noted that not only the hidden area location, but also the vector of the saccade entering the target area, was reinforced. Importantly, when the visual reinforcement signal was presented in the lesion-affected field, the monkeys continued searching, as opposed to stopping when the cue was presented in the intact field. This suggests the monkeys were less confident that the target location had been discovered when the reinforcement cue was presented in the affected field. These results indicate that visual signals mediated by the residual visual pathways after V1 lesions can access fundamental reinforcement mechanisms, but with impaired visual awareness.
16
Parkin regulates drug-taking behavior in rat model of methamphetamine use disorder. Transl Psychiatry 2021; 11:293. [PMID: 34001858 PMCID: PMC8129108 DOI: 10.1038/s41398-021-01387-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/16/2020] [Revised: 03/25/2021] [Accepted: 04/14/2021] [Indexed: 01/02/2023] Open
Abstract
There is no FDA-approved medication for methamphetamine (METH) use disorder. New therapeutic approaches are needed, especially for people who use METH heavily and are at high risk for overdose. This study used genetically engineered rats to evaluate PARKIN as a potential target for METH use disorder. PARKIN knockout, PARKIN-overexpressing, and wild-type young adult male Long Evans rats were trained to self-administer high doses of METH using an extended-access METH self-administration paradigm. Reinforcing/rewarding properties of METH were assessed by quantifying drug-taking behavior and time spent in a METH-paired environment. PARKIN knockout rats self-administered more METH and spent more time in the METH-paired environment than wild-type rats. Wild-type rats overexpressing PARKIN self-administered less METH and spent less time in the METH-paired environment. PARKIN knockout rats overexpressing PARKIN self-administered less METH during the first half of drug self-administration days than PARKIN-deficient rats. The results indicate that rats with PARKIN excess or PARKIN deficit are useful models for studying neural substrates underlying "resilience" or vulnerability to METH use disorder and identify PARKIN as a novel potential drug target to treat heavy use of METH.
17
Huang X, Wu W, Qiao H. Computational Modeling of Emotion-Motivated Decisions for Continuous Control of Mobile Robots. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2019.2963545] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
18
Neural Mechanisms of Human Decision-Making. COGNITIVE AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2021; 21:35-57. [PMID: 33409958 DOI: 10.3758/s13415-020-00842-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 09/28/2020] [Indexed: 11/08/2022]
Abstract
We present a theory and neural network model of the neural mechanisms underlying human decision-making. We propose a detailed model of the interaction between brain regions, under a proposer-predictor-actor-critic framework. This theory is based on detailed animal data and theories of action-selection. Those theories are adapted to serial operation to bridge levels of analysis and explain human decision-making. Task-relevant areas of cortex propose a candidate plan using fast, model-free, parallel neural computations. Other areas of cortex and medial temporal lobe can then predict likely outcomes of that plan in this situation. This optional prediction- (or model-) based computation can produce better accuracy and generalization, at the expense of speed. Next, linked regions of basal ganglia act to accept or reject the proposed plan based on its reward history in similar contexts. If that plan is rejected, the process repeats to consider a new option. The reward-prediction system acts as a critic to determine the value of the outcome relative to expectations and produce dopamine as a training signal for cortex and basal ganglia. By operating sequentially and hierarchically, the same mechanisms previously proposed for animal action-selection could explain the most complex human plans and decisions. We discuss explanations of model-based decisions, habitization, and risky behavior based on the computational model.
19
Tessereau C, O’Dea R, Coombes S, Bast T. Reinforcement learning approaches to hippocampus-dependent flexible spatial navigation. Brain Neurosci Adv 2021; 5:2398212820975634. [PMID: 33954259 PMCID: PMC8042550 DOI: 10.1177/2398212820975634] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2020] [Accepted: 10/21/2020] [Indexed: 11/17/2022] Open
Abstract
Humans and non-human animals show great flexibility in spatial navigation, including the ability to return to specific locations based on as few as one single experience. To study spatial navigation in the laboratory, watermaze tasks, in which rats have to find a hidden platform in a pool of cloudy water surrounded by spatial cues, have long been used. Analogous tasks have been developed for human participants using virtual environments. Spatial learning in the watermaze is facilitated by the hippocampus. In particular, rapid, one-trial, allocentric place learning, as measured in the delayed-matching-to-place variant of the watermaze task, which requires rodents to learn repeatedly new locations in a familiar environment, is hippocampal dependent. In this article, we review some computational principles, embedded within a reinforcement learning framework, that utilise hippocampal spatial representations for navigation in watermaze tasks. We consider which key elements underlie their efficacy, and discuss their limitations in accounting for hippocampus-dependent navigation, both in terms of behavioural performance (i.e. how well do they reproduce behavioural measures of rapid place learning) and neurobiological realism (i.e. how well do they map to neurobiological substrates involved in rapid place learning). We discuss how an actor-critic architecture, enabling simultaneous assessment of the value of the current location and of the optimal direction to follow, can reproduce one-trial place learning performance as shown on watermaze and virtual delayed-matching-to-place tasks by rats and humans, respectively, if complemented with map-like place representations. The contribution of actor-critic mechanisms to delayed-matching-to-place performance is consistent with neurobiological findings implicating the striatum and hippocampo-striatal interaction in delayed-matching-to-place performance, given that the striatum has been associated with actor-critic mechanisms. 
Moreover, we illustrate that hierarchical computations embedded within an actor-critic architecture may help to account for aspects of flexible spatial navigation. The hierarchical reinforcement learning approach separates trajectory control via a temporal-difference error from goal selection via a goal prediction error and may account for flexible, trial-specific, navigation to familiar goal locations, as required in some arm-maze place memory tasks, although it does not capture one-trial learning of new goal locations, as observed in open field, including watermaze and virtual, delayed-matching-to-place tasks. Future models of one-shot learning of new goal locations, as observed on delayed-matching-to-place tasks, should incorporate hippocampal plasticity mechanisms that integrate new goal information with allocentric place representation, as such mechanisms are supported by substantial empirical evidence.
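The actor-critic architecture described above, operating over map-like place representations, can be sketched as follows. This is a schematic illustration under assumed parameters (a 1-D track, Gaussian place cells, softmax action selection), not the authors' implementation: the critic learns the value of the current location, the actor learns the preferred direction, and both are trained by the same TD error.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D track: positions 0..5, hidden goal at position 5.
# Gaussian place cells tile the track (a map-like place representation).
n_pos, n_cells = 6, 6
centers = np.linspace(0, n_pos - 1, n_cells)

def place_features(pos):
    return np.exp(-0.5 * (pos - centers) ** 2)

w_critic = np.zeros(n_cells)       # critic: value of current location
w_actor = np.zeros((2, n_cells))   # actor: preference to move left/right
alpha, gamma = 0.1, 0.95

for episode in range(300):
    pos = 0
    for step in range(50):
        phi = place_features(pos)
        prefs = w_actor @ phi
        p = np.exp(prefs - prefs.max())
        p /= p.sum()                              # softmax over directions
        a = rng.choice(2, p=p)                    # 0 = left, 1 = right
        new_pos = max(0, min(n_pos - 1, pos + (1 if a == 1 else -1)))
        r = 1.0 if new_pos == n_pos - 1 else 0.0
        v_next = 0.0 if r else w_critic @ place_features(new_pos)
        delta = r + gamma * v_next - w_critic @ phi   # shared TD error
        w_critic += alpha * delta * phi               # critic update
        w_actor[a] += alpha * delta * phi * (1 - p[a])       # actor update
        w_actor[1 - a] -= alpha * delta * phi * p[1 - a]     # (policy gradient)
        pos = new_pos
        if r:
            break

prefs_at_start = w_actor @ place_features(0)
print(prefs_at_start)  # direction preferences at the start position
```

After training, the actor's preferences at the start location reflect the learned direction toward the goal, while the critic's weights encode a value gradient over the place-cell map.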
Affiliation(s)
- Charline Tessereau
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
- School of Psychology, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
- Reuben O’Dea
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
- Stephen Coombes
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
- Tobias Bast
- School of Psychology, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
20
Bermudez-Contreras E, Clark BJ, Wilber A. The Neuroscience of Spatial Navigation and the Relationship to Artificial Intelligence. Front Comput Neurosci 2020; 14:63. [PMID: 32848684 PMCID: PMC7399088 DOI: 10.3389/fncom.2020.00063] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2019] [Accepted: 05/28/2020] [Indexed: 11/13/2022] Open
Abstract
Recent advances in artificial intelligence (AI) and neuroscience are impressive. In AI, this includes the development of computer programs that can beat a grandmaster at GO or outperform human radiologists at cancer detection. Many of these technological developments are directly related to progress in artificial neural networks, initially inspired by our knowledge about how the brain carries out computation. In parallel, neuroscience has also experienced significant advances in understanding the brain. For example, in the field of spatial navigation, work on the mechanisms and brain regions involved in neural computations of cognitive maps (an internal representation of space) was recently recognized with the Nobel Prize in medicine. Much of the recent progress in neuroscience has been due in part to the development of technology used to record from very large populations of neurons in multiple regions of the brain, with exquisite temporal and spatial resolution, in behaving animals. With the advent of the vast quantities of data that these techniques allow us to collect, there has been an increased interest in the intersection between AI and neuroscience; many of these intersections involve using AI as a novel tool to explore and analyze these large data sets. However, given the common initial motivation (to understand the brain), these disciplines could be more strongly linked. Currently, much of this potential synergy is not being realized. We propose that spatial navigation is an excellent area in which these two disciplines can converge to help advance what we know about the brain. In this review, we first summarize progress in the neuroscience of spatial navigation and reinforcement learning. We then turn our attention to how spatial navigation has been modeled using descriptive, mechanistic, and normative approaches, and to the use of AI in such models. Next, we discuss how AI can advance neuroscience, how neuroscience can advance AI, and the limitations of these approaches. We conclude by highlighting promising lines of research in which spatial navigation can be the point of intersection between neuroscience and AI, and how this can contribute to the advancement of the understanding of intelligent behavior.
Affiliation(s)
- Benjamin J. Clark
- Department of Psychology, University of New Mexico, Albuquerque, NM, United States
- Aaron Wilber
- Department of Psychology, Program in Neuroscience, Florida State University, Tallahassee, FL, United States
21
Khamassi M, Girard B. Modeling awake hippocampal reactivations with model-based bidirectional search. BIOLOGICAL CYBERNETICS 2020; 114:231-248. [PMID: 32065253 DOI: 10.1007/s00422-020-00817-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/30/2019] [Accepted: 01/21/2020] [Indexed: 06/10/2023]
Abstract
Hippocampal offline reactivations during reward-based learning, usually categorized as replay events, have been found to be important for performance improvement over time and for memory consolidation. Recent computational work has linked these phenomena to the need to transform reward information into state-action values for decision making and to propagate it to all relevant states of the environment. Nevertheless, it is still unclear whether an integrated reinforcement learning mechanism could account for the variety of awake hippocampal reactivations, including variety in order (forward and reverse reactivated trajectories) and variety in the location where they occur (reward site or decision-point). Here, we present a model-based bidirectional search model which accounts for a variety of hippocampal reactivations. The model combines forward trajectory sampling from current position and backward sampling through prioritized sweeping from states associated with large reward prediction errors until the two trajectories connect. This is repeated until stabilization of state-action values (convergence), which could explain why hippocampal reactivations drastically diminish when the animal's performance stabilizes. Simulations in a multiple T-maze task show that forward reactivations are prominently found at decision-points while backward reactivations are exclusively generated at reward sites. Finally, the model can generate imaginary trajectories that are not allowed to the agent during task performance. We raise some experimental predictions and implications for future studies of the role of the hippocampo-prefronto-striatal network in learning.
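The backward-sampling component of the model, prioritized sweeping from states with large reward prediction errors, can be illustrated on a toy linear track. This is a hedged sketch rather than the paper's model: the environment, the known-model assumption, and the priority threshold are all assumed for illustration.

```python
import heapq
import numpy as np

# Hypothetical linear track: states 0..4, moving forward; reward at state 4.
n_states = 5
gamma = 0.9
V = np.zeros(n_states)
reward = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
predecessors = {s: [s - 1] for s in range(1, n_states)}  # learned world model

# Seed the queue with the state carrying a large reward prediction error,
# as when the animal first finds reward at the goal site.
pq = [(-1.0, n_states - 1)]  # (negative priority, state) for a max-heap
while pq:
    _, s = heapq.heappop(pq)
    v_next = 0.0 if s == n_states - 1 else gamma * V[s + 1]
    delta = reward[s] + v_next - V[s]
    V[s] = reward[s] + v_next  # full backup using the model
    # Sweep backward: predecessors of an updated state inherit a priority
    # proportional to the change, yielding reverse-ordered reactivations.
    if abs(delta) > 1e-6:
        for p in predecessors.get(s, []):
            heapq.heappush(pq, (-abs(delta), p))

print(V)  # values propagated backward from the reward site
```

The update order (goal state first, then its predecessors in turn) mirrors the reverse reactivations the model generates at reward sites, and the queue empties once values stabilize, matching the observation that reactivations diminish as performance stabilizes.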
Affiliation(s)
- Mehdi Khamassi
- Institute of Intelligent Systems and Robotics (ISIR), Sorbonne Université and CNRS (Centre National de la Recherche Scientifique), 75005, Paris, France
- Benoît Girard
- Institute of Intelligent Systems and Robotics (ISIR), Sorbonne Université and CNRS (Centre National de la Recherche Scientifique), 75005, Paris, France
22
CB1 Activity Drives the Selection of Navigational Strategies: A Behavioral and c-Fos Immunoreactivity Study. Int J Mol Sci 2020; 21:ijms21031072. [PMID: 32041135 PMCID: PMC7036945 DOI: 10.3390/ijms21031072] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2019] [Revised: 01/29/2020] [Accepted: 01/31/2020] [Indexed: 11/26/2022] Open
Abstract
To promote efficient explorative behaviors, subjects adaptively select spatial navigational strategies based on landmarks or a cognitive map. The hippocampus works alone or in conjunction with the dorsal striatum, both representing the neuronal underpinnings of the navigational strategies organized on the basis of different systems of spatial coordinate integration. The high expression of cannabinoid type 1 (CB1) receptors in structures related to spatial learning—such as the hippocampus, dorsal striatum and amygdala—renders the endocannabinoid system a critical target to study the balance between landmark- and cognitive map-based navigational strategies. In the present study, mice treated with the CB1-inverse agonist/antagonist AM251 or vehicle were trained on a Circular Hole Board, a task that could be solved through either navigational strategy. At the end of the behavioral testing, c-Fos immunoreactivity was evaluated in specific nuclei of the hippocampus, dorsal striatum and amygdala. AM251 treatment impaired spatial learning and modified the pattern of the performed navigational strategies as well as the c-Fos immunoreactivity in the hippocampus, dorsal striatum and amygdala. The present findings shed light on the involvement of CB1 receptors as part of the selection system of the navigational strategies implemented to efficiently solve the spatial problem.
23
The Nucleus Accumbens Core Is Necessary for Responding to Incentive But Not Instructive Stimuli. J Neurosci 2019; 40:1332-1343. [PMID: 31862857 DOI: 10.1523/jneurosci.0194-19.2019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2019] [Revised: 12/11/2019] [Accepted: 12/12/2019] [Indexed: 11/21/2022] Open
Abstract
An abundant literature has highlighted the importance of the nucleus accumbens core (NAcC) in behavioral tasks dependent on external stimuli. Yet, some studies have also reported an absence of NAcC involvement in stimulus processing. We aimed to compare, in male rats, the underlying neuronal determinants of incentive and instructive stimuli in the same task. We developed a variant of a GO/NOGO task that reveals important differences between these two types of stimuli. The incentive stimulus invites the rat to engage in the task sequence. Once the rat has decided to initiate a trial, it remains engaged in the task until the end of the trial. This task revealed the differential contribution of the NAcC to responding to different types of stimuli: responding to the incentive stimulus depended on NAcC AMPA/NMDA and dopamine D1 receptors, but the retrieval of the response associated with the instructive stimuli (lever pressing on GO, withholding on NOGO) did not. Our electrophysiological study showed that more NAcC neurons responded more strongly to the incentive than to the instructive stimuli. Furthermore, when animals did not respond to the incentive stimulus, the induced excitation was suppressed for most projection neurons, whereas interneurons were strongly activated at a latency preceding that found in projection neurons. This work provides insight into the underlying neuronal processes explaining the preferential implication of the NAcC in deciding whether and when to engage in reward-seeking, rather than in deciding which action to perform.
SIGNIFICANCE STATEMENT: The nucleus accumbens core (NAcC) is essential to process information carried by reward-predicting stimuli. Yet, stimuli have distinct properties: incentive stimuli orient attention toward reward-seeking, whereas instructive stimuli inform about the action to perform. Our study shows that, in male rats, NAcC perturbation with glutamate or dopamine antagonists impeded responses to the incentive but not to the instructive stimulus. NAcC neuronal recordings revealed a stronger representation of incentive than instructive stimuli. Furthermore, we found that interneurons are recruited when rats fail to respond to incentive stimuli. This work provides insight into the underlying neuronal processes explaining the preferential implication of the NAcC in deciding whether and when to engage in reward-seeking, rather than in deciding which action to perform.
24
Kwak S, Jung MW. Distinct roles of striatal direct and indirect pathways in value-based decision making. eLife 2019; 8:46050. [PMID: 31310237 PMCID: PMC6658164 DOI: 10.7554/elife.46050] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2019] [Accepted: 07/09/2019] [Indexed: 12/12/2022] Open
Abstract
The striatum is critically involved in value-based decision making. However, it is unclear how striatal direct and indirect pathways work together to make optimal choices in a dynamic and uncertain environment. Here, we examined the effects of selectively inactivating D1 receptor (D1R)- or D2 receptor (D2R)-expressing dorsal striatal neurons (corresponding to direct- and indirect-pathway neurons, respectively) on mouse choice behavior in a reversal task with progressively increasing reversal frequency and a dynamic two-armed bandit task. Inactivation of either D1R- or D2R-expressing striatal neurons impaired performance in both tasks, but the pattern of altered choice behavior differed between the two animal groups. A reinforcement learning model-based analysis indicated that inactivation of D1R- and D2R-expressing striatal neurons selectively impairs value-dependent action selection and value learning, respectively. Our results suggest differential contributions of striatal direct and indirect pathways to two distinct steps in value-based decision making.
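Reinforcement learning models of the kind used in this analysis make the two steps separable: a learning rate governs value learning, while a softmax inverse temperature governs value-dependent action selection. A generic sketch of such a model (the task, parameters, and agent are hypothetical, not the paper's fitted model):

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_bandit(alpha, beta, n_trials=200, p_reward=(0.8, 0.2)):
    """Two-armed bandit agent: alpha sets the value-learning step,
    beta (inverse temperature) sets value-dependent action selection."""
    q = np.zeros(2)
    choices = []
    for _ in range(n_trials):
        p = np.exp(beta * q) / np.exp(beta * q).sum()  # softmax selection
        c = int(rng.random() < p[1])                   # pick arm 1 w.p. p[1]
        r = float(rng.random() < p_reward[c])          # stochastic reward
        q[c] += alpha * (r - q[c])                     # value-learning step
        choices.append(c)
    return np.mean(np.array(choices) == 0)  # fraction of better-arm choices

# Degrading selection (low beta) versus degrading learning (low alpha)
# produces different patterns of choice behavior, analogous to the
# dissociation the model-based analysis draws between the two pathways.
print(simulate_bandit(alpha=0.3, beta=5.0))
print(simulate_bandit(alpha=0.3, beta=0.5))
```

Fitting alpha and beta separately to each condition is what lets such an analysis attribute an impairment to value learning versus value-dependent action selection.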
Affiliation(s)
- Shinae Kwak
- Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon, Republic of Korea; Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Min Whan Jung
- Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon, Republic of Korea; Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
25
Cazé R, Khamassi M, Aubin L, Girard B. Hippocampal replays under the scrutiny of reinforcement learning models. J Neurophysiol 2018; 120:2877-2896. [DOI: 10.1152/jn.00145.2018] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open
Abstract
Multiple in vivo studies have shown that place cells from the hippocampus replay previously experienced trajectories. These replays are commonly considered to mainly reflect memory consolidation processes. Some data, however, have highlighted a functional link between replays and reinforcement learning (RL). This theory, extensively used in machine learning, has introduced efficient algorithms and can explain various behavioral and physiological measures from different brain regions. RL algorithms could constitute a mechanistic description of replays and explain how replays can reduce the number of iterations required to explore the environment during learning. We review the main findings concerning the different hippocampal replay types and the possible associated RL models (either model-based, model-free, or hybrid model types). We conclude by tying these frameworks together. We illustrate the link between data and RL through a series of model simulations. This review, at the frontier between informatics and biology, paves the way for future work on replays.
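The claim that replays can reduce the number of iterations required during learning is made concrete by Dyna-style algorithms, in which stored transitions are replayed as extra off-line updates between real steps. A minimal sketch under assumed parameters (not a reimplementation of the reviewed models):

```python
import numpy as np

rng = np.random.default_rng(2)

def td_update(Q, transition, alpha=0.5, gamma=0.9):
    s, a, r, s2 = transition
    target = r + (0.0 if r else gamma * Q[s2].max())  # reward state is terminal
    Q[s, a] += alpha * (target - Q[s, a])

def run(n_replay, n_episodes=5):
    """Tabular Q-learning on a 6-state chain under a random exploration
    policy, with n_replay extra updates per step drawn from remembered
    transitions (Dyna-style replay)."""
    n_states = 6
    Q = np.zeros((n_states, 2))  # actions: 0 = left, 1 = right; reward at end
    memory = []
    for _ in range(n_episodes):
        s = 0
        for _ in range(40):
            a = int(rng.integers(2))
            s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
            r = 1.0 if s2 == n_states - 1 else 0.0
            memory.append((s, a, r, s2))
            td_update(Q, memory[-1])           # on-line update
            for _ in range(n_replay):          # replayed off-line updates
                td_update(Q, memory[rng.integers(len(memory))])
            s = s2
            if r:
                break
    return Q

Q_no_replay = run(n_replay=0)
Q_with_replay = run(n_replay=10)
# Replayed updates propagate reward information to early states in far
# fewer environment interactions than on-line updates alone.
print(Q_no_replay[0, 1], Q_with_replay[0, 1])
```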
Affiliation(s)
- Romain Cazé
- Institute of Intelligent Systems and Robotics, Sorbonne Université, CNRS, Paris, France
- Mehdi Khamassi
- Institute of Intelligent Systems and Robotics, Sorbonne Université, CNRS, Paris, France
- Lise Aubin
- Institute of Intelligent Systems and Robotics, Sorbonne Université, CNRS, Paris, France
- Benoît Girard
- Institute of Intelligent Systems and Robotics, Sorbonne Université, CNRS, Paris, France
26
Gmaz JM, Carmichael JE, van der Meer MA. Persistent coding of outcome-predictive cue features in the rat nucleus accumbens. eLife 2018; 7:37275. [PMID: 30234485 PMCID: PMC6195350 DOI: 10.7554/elife.37275] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2018] [Accepted: 09/15/2018] [Indexed: 01/09/2023] Open
Abstract
The nucleus accumbens (NAc) is important for learning from feedback, and for biasing and invigorating behaviour in response to cues that predict motivationally relevant outcomes. NAc encodes outcome-related cue features such as the magnitude and identity of reward. However, little is known about how features of cues themselves are encoded. We designed a decision making task where rats learned multiple sets of outcome-predictive cues, and recorded single-unit activity in the NAc during performance. We found that coding of cue identity and location occurred alongside coding of expected outcome. Furthermore, this coding persisted both during a delay period, after the rat made a decision and was waiting for an outcome, and after the outcome was revealed. Encoding of cue features in the NAc may enable contextual modulation of on-going behaviour, and provide an eligibility trace of outcome-predictive stimuli for updating stimulus-outcome associations to inform future behaviour.
Affiliation(s)
- Jimmie M Gmaz
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States
- James E Carmichael
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, United States
27
Boraud T, Leblois A, Rougier NP. A natural history of skills. Prog Neurobiol 2018; 171:114-124. [PMID: 30171867 DOI: 10.1016/j.pneurobio.2018.08.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2018] [Revised: 07/19/2018] [Accepted: 08/21/2018] [Indexed: 10/28/2022]
Abstract
The dorsal pallium (a.k.a. cortex in mammals) forms a loop circuit with the basal ganglia and the thalamus that is known to control and adapt behavior, but the respective functional roles of these structures are still debated. Influenced by the triune brain theory proposed in the early sixties, many current theories propose a hierarchical organization, on top of which stands the cortex, to which the subcortical structures are subordinated. In particular, habit formation has been proposed to reflect a switch from conscious, on-line control of behavior by the cortex to fully automated subcortical control. In this review, we propose to re-evaluate the function of the network in light of current experimental evidence concerning the anatomy and physiology of basal ganglia-cortical circuits in vertebrates. We briefly review the current theories and show that they can be encompassed in a broader framework of skill learning and performance. Then, after recalling the state of the art concerning the anatomical architecture of the network and the underlying dynamic processes, we summarize the evolution of the anatomical and physiological substrate of skill learning and performance among vertebrates. We then review experimental evidence supporting the hypothesis that the development of automatized skills relies on the basal ganglia teaching cortical circuits, and is in fact a late feature linked with the development of a specialized cortex or pallium that evolved in parallel in different taxa. We finally propose a minimal computational framework in which this hypothesis can be explicitly implemented and tested.
Affiliation(s)
- Thomas Boraud
- CNRS, UMR 5293, IMN, 33000 Bordeaux, France; University of Bordeaux, UMR 5293, IMN, 33000 Bordeaux, France; CNRS, French-Israeli Neuroscience Lab, 33000 Bordeaux, France; CHU de Bordeaux, IMN Clinique, 33000 Bordeaux, France
- Arthur Leblois
- CNRS, UMR 5293, IMN, 33000 Bordeaux, France; University of Bordeaux, UMR 5293, IMN, 33000 Bordeaux, France; CNRS, French-Israeli Neuroscience Lab, 33000 Bordeaux, France
- Nicolas P Rougier
- University of Bordeaux, UMR 5293, IMN, 33000 Bordeaux, France; INRIA Bordeaux Sud-Ouest, 33405 Talence, France; LaBRI, University of Bordeaux, IPB, CNRS, UMR 5800, 33405 Talence, France

28
Herweg NA, Kahana MJ. Spatial Representations in the Human Brain. Front Hum Neurosci 2018; 12:297. [PMID: 30104966] [PMCID: PMC6078001] [DOI: 10.3389/fnhum.2018.00297]
Abstract
While extensive research on the neurophysiology of spatial memory has been carried out in rodents, memory research in humans had traditionally focused on more abstract, language-based tasks. Recent studies have begun to address this gap using virtual navigation tasks in combination with electrophysiological recordings in humans. These studies suggest that the human medial temporal lobe (MTL) is equipped with a population of place and grid cells similar to that previously observed in the rodent brain. Furthermore, theta oscillations have been linked to spatial navigation and, more specifically, to the encoding and retrieval of spatial information. While some studies suggest a single navigational theta rhythm which is of lower frequency in humans than rodents, other studies advocate for the existence of two functionally distinct delta-theta frequency bands involved in both spatial and episodic memory. Despite the general consensus between rodent and human electrophysiology, behavioral work in humans does not unequivocally support the use of a metric Euclidean map for navigation. Formal models of navigational behavior, which specifically consider the spatial scale of the environment and complementary learning mechanisms, may help to better understand different navigational strategies and their neurophysiological mechanisms. Finally, the functional overlap of spatial and declarative memory in the MTL calls for a unified theory of MTL function. Such a theory will critically rely upon linking task-related phenomena at multiple temporal and spatial scales. Understanding how single cell responses relate to ongoing theta oscillations during both the encoding and retrieval of spatial and non-spatial associations appears to be key toward developing a more mechanistic understanding of memory processes in the MTL.
Affiliation(s)
- Nora A. Herweg
- Computational Memory Lab, Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States
- Michael J. Kahana
- Computational Memory Lab, Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States

29
Goodroe SC, Starnes J, Brown TI. The Complex Nature of Hippocampal-Striatal Interactions in Spatial Navigation. Front Hum Neurosci 2018; 12:250. [PMID: 29977198] [PMCID: PMC6021746] [DOI: 10.3389/fnhum.2018.00250]
Abstract
Decades of research have established the importance of the hippocampus for episodic and spatial memory. In spatial navigation tasks, the role of the hippocampus has been classically juxtaposed with the role of the dorsal striatum, the latter of which has been characterized as a system important for implementing stimulus-response and action-outcome associations. In many neuroimaging paradigms, this has been explored through contrasting wayfinding and route-following behavior. The distinction between the contributions of the hippocampus and striatum to spatial navigation has been supported by extensive literature. Convergent research has also underscored the fact that these different memory systems can interact in dynamic ways and contribute to a broad range of navigational scenarios. For example, although familiar routes may often be navigable based on stimulus-response associations, hippocampal episodic memory mechanisms can also contribute to egocentric route-oriented memory, enabling recall of context-dependent sequences of landmarks or the actions to be made at decision points. Additionally, the literature has stressed the importance of subdividing the striatum into functional gradients, with more ventral and medial components being important for the behavioral expression of hippocampal-dependent spatial memories. More research is needed to reveal how networks involving these regions process and respond to dynamic changes in memory and control demands over the course of navigational events. In this Perspective article, we suggest that a critical direction for navigation research is to further characterize how hippocampal and striatal subdivisions interact in different navigational contexts.
Affiliation(s)
- Sarah C Goodroe
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States
- Jon Starnes
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States
- Thackery I Brown
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, United States

30
Dollé L, Chavarriaga R, Guillot A, Khamassi M. Interactions of spatial strategies producing generalization gradient and blocking: A computational approach. PLoS Comput Biol 2018; 14:e1006092. [PMID: 29630600] [PMCID: PMC5908205] [DOI: 10.1371/journal.pcbi.1006092]
Abstract
We present a computational model of spatial navigation comprising the different learning mechanisms found in mammals, i.e., associative, cognitive-mapping, and parallel systems. This model reproduces a large number of experimental results in different variants of the Morris water maze task, including standard associative phenomena (spatial generalization gradient and blocking) as well as navigation based on cognitive mapping. Furthermore, we show that competitive and cooperative patterns between different navigation strategies in the model can explain previous, apparently contradictory results supporting either associative or cognitive mechanisms for spatial learning. The key computational mechanism that reconciles experimental results showing different influences of distal and proximal cues on behavior, different learning times, and different abilities of individuals to alternate between spatial and response strategies is the dynamic coordination of navigation strategies, whose performance is evaluated online with a common currency through a modular approach. We provide a set of concrete experimental predictions to further test the computational model. Overall, this computational work sheds new light on inter-individual differences in navigation learning, and provides a formal and mechanistic approach for testing various theories of spatial cognition in mammals.
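The coordination idea, several strategy modules evaluated online in a shared "common currency" with a gating mechanism selecting among them, can be sketched in a few lines. The greedy gating rule, the two-strategy task, and all parameters below are illustrative assumptions, not the model's actual equations:

```python
class StrategySelector:
    """Sketch of modular strategy coordination: each navigation strategy
    keeps a running performance estimate in a shared 'common currency',
    and the gating mechanism selects the strategy that currently scores
    highest."""

    def __init__(self, strategies, lr=0.1):
        self.perf = {name: 0.0 for name in strategies}
        self.lr = lr

    def choose(self):
        # Greedy gating on the current performance estimates.
        return max(self.perf, key=self.perf.get)

    def update(self, name, reward):
        # Online evaluation: exponential moving average of recent reward.
        self.perf[name] += self.lr * (reward - self.perf[name])

selector = StrategySelector(["place", "response"])
# Suppose the response (habitual) strategy is rewarded more reliably.
for _ in range(50):
    selector.update("response", 1.0)
    selector.update("place", 0.2)
assert selector.choose() == "response"
```

Because both modules are scored in the same currency, control can shift between them as their relative performance changes, which is the property the model uses to reconcile competing accounts.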
Affiliation(s)
- Laurent Dollé
- Institute of Intelligent Systems and Robotics, Sorbonne Université, CNRS, F-75005 Paris, France
- Ricardo Chavarriaga
- Defitech Chair in Brain-Machine Interface, Center for Neuroprosthetics, Institute of Bioengineering and School of Engineering, EPFL, Geneva, Switzerland
- Agnès Guillot
- Institute of Intelligent Systems and Robotics, Sorbonne Université, CNRS, F-75005 Paris, France
- Mehdi Khamassi
- Institute of Intelligent Systems and Robotics, Sorbonne Université, CNRS, F-75005 Paris, France

31
A hippocampo-cerebellar centred network for the learning and execution of sequence-based navigation. Sci Rep 2017; 7:17812. [PMID: 29259243] [PMCID: PMC5736633] [DOI: 10.1038/s41598-017-18004-7]
Abstract
How do we translate self-motion into goal-directed actions? Here we investigate the cognitive architecture underlying self-motion processing during exploration and goal-directed behaviour. The task, performed in an environment with limited and ambiguous external landmarks, constrained mice to use self-motion based information for sequence-based navigation. The post-behavioural analysis combined brain network characterization based on c-Fos imaging and graph theory analysis as well as computational modelling of the learning process. The study revealed a widespread network centred around the cerebral cortex and basal ganglia during the exploration phase, while a network dominated by hippocampal and cerebellar activity appeared to sustain sequence-based navigation. The learning process could be modelled by an algorithm combining memory of past actions and model-free reinforcement learning, which parameters pointed toward a central role of hippocampal and cerebellar structures for learning to translate self-motion into a sequence of goal-directed actions.
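The class of algorithm described, memory of past actions combined with model-free reinforcement learning, can be sketched roughly as follows. Using the action history itself as the state is one simple way to realize "memory of past actions"; the three-step task and all parameters are illustrative assumptions, not the paper's fitted model:

```python
import random

# Sketch: learn a fixed sequence of goal-directed actions with model-free
# Q-learning, where the agent's state is its memory of past actions.
random.seed(1)
target = ("fwd", "left", "fwd")        # rewarded action sequence (assumed)
actions = ["fwd", "left", "right"]
alpha, epsilon = 0.3, 0.2
Q = {}

def q(state, a):
    return Q.get((state, a), 0.0)

for _ in range(3000):
    history = ()                       # memory of past actions = the state
    for _step in range(len(target)):
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda x: q(history, x))
        new_hist = history + (a,)
        # Reward only if the full target sequence was produced.
        r = 1.0 if new_hist == target else 0.0
        done = len(new_hist) == len(target)
        boot = 0.0 if done else max(q(new_hist, x) for x in actions)
        Q[(history, a)] = q(history, a) + alpha * (r + boot - q(history, a))
        history = new_hist

# The greedy policy reproduces the target sequence.
h = ()
for want in target:
    a = max(actions, key=lambda x: q(h, x))
    assert a == want
    h += (a,)
```

The sketch shows why such learning is memory-hungry: each distinct action history is a separate state, which is consistent with the paper's point that sequence-based navigation recruits structures beyond a pure stimulus-response system.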
32
Russek EM, Momennejad I, Botvinick MM, Gershman SJ, Daw ND. Predictive representations can link model-based reinforcement learning to model-free mechanisms. PLoS Comput Biol 2017; 13:e1005768. [PMID: 28945743] [PMCID: PMC5628940] [DOI: 10.1371/journal.pcbi.1005768]
Abstract
Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that neural circuits supporting model-based behavior are structurally homologous to and overlapping with those thought to carry out model-free temporal difference (TD) learning. Here, we lay out a family of approaches by which model-based computation may be built upon a core of TD learning. The foundation of this framework is the successor representation, a predictive state representation that, when combined with TD learning of value predictions, can produce a subset of the behaviors associated with model-based learning, while requiring less decision-time computation than dynamic programming. Using simulations, we delineate the precise behavioral capabilities enabled by evaluating actions using this approach, and compare them to those demonstrated by biological organisms. We then introduce two new algorithms that build upon the successor representation while progressively mitigating its limitations. Because this framework can account for the full range of observed putatively model-based behaviors while still utilizing a core TD framework, we suggest that it represents a neurally plausible family of mechanisms for model-based evaluation.
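The successor representation at the foundation of this framework can be illustrated with a small sketch: a matrix M of discounted expected future state occupancies is learned by TD, and values follow as V = Mw, so changing the reward vector w revalues states without further experience. The chain task and parameters below are illustrative, not drawn from the paper's simulations:

```python
import numpy as np

# Successor representation (SR) sketch: M[s, s'] estimates discounted
# expected future occupancy of s' starting from s, learned with a TD rule.
n_states, gamma, alpha = 4, 0.9, 0.1
M = np.eye(n_states)          # SR matrix, initialized to the identity
w = np.zeros(n_states)
w[3] = 1.0                    # reward at terminal state 3

# Experience a deterministic chain 0 -> 1 -> 2 -> 3, repeatedly.
for _ in range(500):
    for s, s_next in [(0, 1), (1, 2), (2, 3)]:
        one_hot = np.eye(n_states)[s]
        td_error = one_hot + gamma * M[s_next] - M[s]
        M[s] += alpha * td_error

V = M @ w
# States closer to the reward have higher value.
assert V[2] > V[1] > V[0]
# Revaluation: changing the reward vector updates values with no new
# experience, a hallmark of (partially) model-based behavior.
w_new = np.array([0.0, 0.0, 0.0, 2.0])
assert (M @ w_new)[0] > V[0]
```

The key design point is that the expensive predictive structure (M) is learned with the same cheap TD machinery as model-free values, while decision-time evaluation remains a single matrix-vector product.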
Affiliation(s)
- Evan M. Russek
- Center for Neural Science, New York University, New York, NY, United States of America
- Ida Momennejad
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, United States of America
- Matthew M. Botvinick
- DeepMind, London, United Kingdom and Gatsby Computational Neuroscience Unit, University College London, United Kingdom
- Samuel J. Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, United States of America
- Nathaniel D. Daw
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, United States of America

33
Murata S, Yamashita Y, Arie H, Ogata T, Sugano S, Tani J. Learning to Perceive the World as Probabilistic or Deterministic via Interaction With Others: A Neuro-Robotics Experiment. IEEE Trans Neural Netw Learn Syst 2017; 28:830-848. [PMID: 26595928] [DOI: 10.1109/tnnls.2015.2492140]
Abstract
We suggest that different behavior generation schemes, such as sensory reflex behavior and intentional proactive behavior, can be developed by a newly proposed dynamic neural network model, named the stochastic multiple timescale recurrent neural network (S-MTRNN). The model learns to predict subsequent sensory inputs, generating both their means and their uncertainty levels in terms of variance (or inverse precision) by utilizing its multiple-timescale property. This model was employed in robotics learning experiments in which one robot controlled by the S-MTRNN was required to interact with another robot under the condition of uncertainty about the other's behavior. The experimental results show that self-organized sensory reflex behavior, based on probabilistic prediction, emerges when learning proceeds without a precise specification of initial conditions. In contrast, intentional proactive behavior with deterministic predictions emerges when precise initial conditions are available. The results also showed that, in situations where unanticipated behavior of the other robot was perceived, the behavioral context was revised adequately by adaptation of the internal neural dynamics to respond to sensory inputs during sensory reflex behavior generation. On the other hand, during intentional proactive behavior generation, an error regression scheme, by which the internal neural activity was modified in the direction of minimizing prediction errors, was needed for adequately revising the behavioral context. These results indicate that two different ways of treating uncertainty about perceptual events in learning, namely probabilistic modeling and deterministic modeling, contribute to the development of different dynamic neuronal structures governing the two types of behavior generation schemes.
34
Savalia T, Shukla A, Bapi RS. A Unified Theoretical Framework for Cognitive Sequencing. Front Psychol 2016; 7:1821. [PMID: 27917146] [PMCID: PMC5114455] [DOI: 10.3389/fpsyg.2016.01821]
Abstract
The capacity to sequence information is central to human performance. Sequencing ability forms the foundation stone for higher-order cognition related to language and goal-directed planning. Information related to the order of items, their timing, chunking, and hierarchical organization are important aspects of sequencing. Past research on sequencing has emphasized two distinct and independent dichotomies: implicit vs. explicit and goal-directed vs. habits. We propose a theoretical framework unifying these two streams. Our proposal relies on the brain's ability to implicitly extract statistical regularities from the stream of stimuli and, with attentional engagement, to organize sequences explicitly and hierarchically. Similarly, sequences that need to be assembled purposively to accomplish a goal require the engagement of attentional processes. With repetition, these goal-directed plans become habits, with concomitant disengagement of attention. Thus, attention and awareness play a crucial role in the implicit-to-explicit transition as well as in how goal-directed plans become automatic habits. Cortico-subcortical loops (basal ganglia-frontal cortex and hippocampus-frontal cortex) mediate the transition process. We show how the computational principles of model-free and model-based learning paradigms, along with a pivotal role for attention and awareness, offer a unifying framework for these two dichotomies. Based on this framework, we make testable predictions related to the potential influence of response-to-stimulus interval (RSI) on developing awareness in implicit learning tasks.
Affiliation(s)
- Tejas Savalia
- Cognitive Science Lab, International Institute of Information Technology, Hyderabad, India
- Anuj Shukla
- Cognitive Science Lab, International Institute of Information Technology, Hyderabad, India
- Raju S Bapi
- Cognitive Science Lab, International Institute of Information Technology, Hyderabad, India; School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India

35
Kato A, Morita K. Forgetting in Reinforcement Learning Links Sustained Dopamine Signals to Motivation. PLoS Comput Biol 2016; 12:e1005145. [PMID: 27736881] [PMCID: PMC5063413] [DOI: 10.1371/journal.pcbi.1005145]
Abstract
It has been suggested that dopamine (DA) represents reward-prediction-error (RPE) as defined in reinforcement learning, and that DA therefore responds to unpredicted but not predicted reward. However, recent studies have found DA responses sustained towards predictable reward in tasks involving self-paced behavior, and suggested that this response represents a motivational signal. We have previously shown that RPE can be sustained if there is decay/forgetting of learned values, which can be implemented as decay of the synaptic strengths storing learned values. This account, however, did not explain the suggested link between tonic/sustained DA and motivation. In the present work, we explored the motivational effects of the value decay in self-paced approach behavior, modeled as a series of 'Go' or 'No-Go' selections towards a goal. Through simulations, we found that the value decay can enhance motivation, specifically, facilitate fast goal-reaching, albeit counterintuitively. Mathematical analyses revealed that the underlying potential mechanisms are twofold: (1) decay-induced sustained RPE creates a gradient of 'Go' values towards a goal, and (2) value contrasts between 'Go' and 'No-Go' are generated because, while chosen values are continually updated, unchosen values simply decay. Our model provides potential explanations for the key experimental findings that suggest DA's roles in motivation: (i) slowdown of behavior by post-training blockade of DA signaling, (ii) observations that DA blockade severely impairs effortful actions to obtain rewards while largely sparing seeking of easily obtainable rewards, and (iii) relationships between the reward amount, the level of motivation reflected in the speed of behavior, and the average level of DA. These results indicate that reinforcement learning with value decay, or forgetting, provides a parsimonious mechanistic account for DA's roles in value learning and motivation. Our results also suggest that when biological systems for value learning are active even though learning has apparently converged, the systems might be in a state of dynamic equilibrium, where learning and forgetting are balanced.

Dopamine (DA) has been suggested to have two reward-related roles: (1) representing reward-prediction-error (RPE), and (2) providing motivational drive. Role (1) is based on the physiological results that DA responds to unpredicted but not predicted reward, whereas role (2) is supported by the pharmacological results that blockade of DA signaling causes motivational impairments such as slowdown of self-paced behavior. So far, these two roles have been considered to be played by two different temporal patterns of DA signals: role (1) by phasic signals and role (2) by tonic/sustained signals. However, recent studies have found sustained DA signals with features indicative of both roles (1) and (2), complicating this picture. Meanwhile, whereas synaptic/circuit mechanisms for role (1), i.e., how RPE is calculated upstream of DA neurons and how RPE-dependent updates of learned values occur through DA-dependent synaptic plasticity, have now become clarified, mechanisms for role (2) remain unclear. In this work, we modeled self-paced behavior by a series of 'Go' or 'No-Go' selections in the framework of reinforcement learning assuming DA's role (1), and demonstrated that incorporation of decay/forgetting of learned values, presumably implemented as decay of the synaptic strengths storing learned values, provides a potential unified mechanistic account for DA's two roles, together with its various temporal patterns.
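The core mechanism, value learning with decay producing a sustained RPE at a fully predicted reward, can be demonstrated in a minimal chain task. The task, the per-step decay rule, and all constants are illustrative assumptions, not the paper's model:

```python
# Sketch of value learning with decay ("forgetting"): learned values decay
# toward zero each step, so the reward-prediction error (RPE) at a fully
# predicted reward never vanishes, mimicking sustained dopamine signals.
alpha, gamma, decay = 0.5, 0.9, 0.02
n = 5                      # chain of states 0..4; reward delivered at state 4
V = [0.0] * n

def run_episode(V):
    """One pass through the chain; returns the RPE at the reward step."""
    rpe = 0.0
    for s in range(n):
        r = 1.0 if s == n - 1 else 0.0
        v_next = V[s + 1] if s < n - 1 else 0.0
        rpe = r + gamma * v_next - V[s]
        V[s] += alpha * rpe
        # Forgetting: every learned value decays a little at each step.
        for i in range(n):
            V[i] *= 1.0 - decay
    return rpe

for _ in range(200):
    final_rpe = run_episode(V)

# Without decay, the RPE at a predicted reward converges to ~0; with decay
# it settles at a sustained positive value.
assert final_rpe > 0.05
```

At equilibrium, learning and forgetting balance, which is exactly the "dynamic equilibrium" the abstract describes.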
Affiliation(s)
- Ayaka Kato
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan

36
Neuronal activity in dorsomedial and dorsolateral striatum under the requirement for temporal credit assignment. Sci Rep 2016; 6:27056. [PMID: 27245401] [PMCID: PMC4887996] [DOI: 10.1038/srep27056]
Abstract
To investigate neural processes underlying temporal credit assignment in the striatum, we recorded neuronal activity in the dorsomedial and dorsolateral striatum (DMS and DLS, respectively) of rats performing a dynamic foraging task in which a choice has to be remembered until its outcome is revealed for correct credit assignment. Choice signals appeared sequentially, initially in the DMS and then in the DLS, and they were combined with action value and reward signals in the DLS when choice outcome was revealed. Unlike in conventional dynamic foraging tasks, neural signals for chosen value were elevated in neither brain structure. These results suggest that dynamics of striatal neural signals related to evaluating choice outcome might differ drastically depending on the requirement for temporal credit assignment. In a behavioral context requiring temporal credit assignment, the DLS, but not the DMS, might be in charge of updating the value of chosen action by integrating choice, action value, and reward signals together.
37
Menegas W, Bergan JF, Ogawa SK, Isogai Y, Umadevi Venkataraju K, Osten P, Uchida N, Watabe-Uchida M. Dopamine neurons projecting to the posterior striatum form an anatomically distinct subclass. eLife 2015; 4:e10032. [PMID: 26322384] [PMCID: PMC4598831] [DOI: 10.7554/elife.10032]
Abstract
Combining rabies-virus tracing, optical clearing (CLARITY), and whole-brain light-sheet imaging, we mapped the monosynaptic inputs to midbrain dopamine neurons projecting to different targets (different parts of the striatum, cortex, amygdala, etc.) in mice. We found that most populations of dopamine neurons receive a similar set of inputs rather than forming strong reciprocal connections with their target areas. A common feature among most populations of dopamine neurons was the existence of dense 'clusters' of inputs within the ventral striatum. However, we found that dopamine neurons projecting to the posterior striatum were outliers, receiving relatively few inputs from the ventral striatum and instead receiving more inputs from the globus pallidus, subthalamic nucleus, and zona incerta. These results lay a foundation for understanding the input/output structure of the midbrain dopamine circuit and demonstrate that dopamine neurons projecting to the posterior striatum constitute a unique class of dopamine neurons regulated by different inputs.

Most neurons send their messages to recipient neurons by releasing a substance called a 'neurotransmitter' that binds to receptors on the target cell. The sites of this type of signal transmission are called synapses. Some small populations of neurons modulate the activity of hundreds or thousands of these synapses all across the brain by releasing 'neuromodulators' that affect how they work. These neuromodulators are essential because they broadcast information that is likely to be useful to many brain regions, like a 'news channel' for the brain. One important neuromodulator in the mammalian brain is dopamine, which contributes to motivation, learning, and the control of movement. Clusters of cells deep in the brain release dopamine, and people with Parkinson's disease gradually lose these cells. This makes it increasingly difficult for their brains to produce the correct amount of dopamine, and results in symptoms such as tremors and stiff muscles. Individual dopamine neurons typically send information to a single part of the brain. This suggests that dopamine neurons with different targets might have different roles. To explore this possibility, Menegas et al. classified dopamine neurons in the mouse brain into eight types based on the areas to which they project, and then mapped which neurons send input signals to each type. These inputs are likely to shape the activity of each type (that is, their 'message' to the rest of the brain). The mapping revealed that most dopamine neurons do not receive substantial input from the area to which they project (i.e., they do not form 'closed loops'). Instead, most of their input comes from a common set of brain regions, including a particularly large number of inputs from the ventral striatum. However, Menegas et al. found one exception. Dopamine neurons that target part of the brain called the posterior striatum receive relatively little input from the ventral striatum. Their input comes instead from a set of other brain structures, and in particular from a region called the subthalamic nucleus. Electrical stimulation of the subthalamic nucleus can help to relieve the symptoms of Parkinson's disease. Therefore, the results presented by Menegas et al. suggest that this population of dopamine neurons might be particularly relevant to Parkinson's disease and that focusing future studies on them could ultimately be beneficial for patients.
Affiliation(s)
- William Menegas
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
- Joseph F Bergan
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
- Sachie K Ogawa
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
- Yoh Isogai
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
- Pavel Osten
- Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
- Naoshige Uchida
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States
- Mitsuko Watabe-Uchida
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, United States

38
Viejo G, Khamassi M, Brovelli A, Girard B. Modeling choice and reaction time during arbitrary visuomotor learning through the coordination of adaptive working memory and reinforcement learning. Front Behav Neurosci 2015; 9:225. [PMID: 26379518] [PMCID: PMC4549628] [DOI: 10.3389/fnbeh.2015.00225]
Abstract
Current learning theory provides a comprehensive description of how humans and other animals learn, and places behavioral flexibility and automaticity at the heart of adaptive behaviors. However, the computations supporting the interactions between goal-directed and habitual decision-making systems are still poorly understood. Previous functional magnetic resonance imaging (fMRI) results suggest that the brain hosts complementary computations that may differentially support goal-directed and habitual processes in the form of a dynamical interplay rather than a serial recruitment of strategies. To better elucidate the computations underlying flexible behavior, we develop a dual-system computational model that can predict both performance (i.e., participants' choices) and modulations in reaction times during learning of a stimulus-response association task. The habitual system is modeled with a simple Q-learning algorithm (QL). For the goal-directed system, we propose a new Bayesian Working Memory (BWM) model that searches for information in the history of previous trials in order to minimize Shannon entropy. We also propose a model for QL and BWM coordination in which the expensive memory manipulation is under the control of, among other factors, the level of convergence of the habitual learning. We test the ability of QL or BWM alone to explain human behavior, and compare them with the performance of model combinations, to highlight the need for such combinations to explain behavior. Two of the tested combination models are derived from the literature, and the third is our new proposal. In conclusion, all subjects were better explained by model combinations, and the majority of them are best explained by our new coordination proposal.
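The habitual module described, tabular Q-learning over stimulus-response associations, can be sketched in a few lines. The three-stimulus task, the hidden mapping, and all parameters are illustrative assumptions, not the paper's fitted QL module:

```python
import random

# Sketch of the habitual system as tabular Q-learning over
# stimulus-response associations.
random.seed(0)
alpha, epsilon = 0.2, 0.2
stimuli, responses = [0, 1, 2], [0, 1, 2]
correct = {0: 2, 1: 0, 2: 1}            # hidden stimulus -> response mapping
Q = {(s, a): 0.0 for s in stimuli for a in responses}

def act(s):
    # Epsilon-greedy: mostly exploit the learned association, sometimes explore.
    if random.random() < epsilon:
        return random.choice(responses)
    return max(responses, key=lambda a: Q[(s, a)])

for _ in range(1000):
    s = random.choice(stimuli)
    a = act(s)
    r = 1.0 if a == correct[s] else 0.0
    # One-step update; there is no successor state in this association task.
    Q[(s, a)] += alpha * (r - Q[(s, a)])

# After training, the greedy response reproduces the hidden mapping.
for s in stimuli:
    assert max(responses, key=lambda a: Q[(s, a)]) == correct[s]
```

Such a module learns slowly but acts cheaply, which is why, in the paper's coordination scheme, the expensive working-memory search can be handed over to it as its values converge.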
Affiliation(s)
- Guillaume Viejo
- Sorbonne Université, Université Pierre et Marie Curie, Univ Paris 06, UMR 7222, Institut des Systèmes Intelligents et de Robotique, Paris, France; Centre National de la Recherche Scientifique, UMR 7222, ISIR, Paris, France
- Mehdi Khamassi
- Sorbonne Université, Université Pierre et Marie Curie, Univ Paris 06, UMR 7222, Institut des Systèmes Intelligents et de Robotique, Paris, France; Centre National de la Recherche Scientifique, UMR 7222, ISIR, Paris, France
- Andrea Brovelli
- Institut de Neurosciences de la Timone, UMR 7289, Centre National de la Recherche Scientifique - Aix Marseille Université, Marseille, France
- Benoît Girard
- Sorbonne Université, Université Pierre et Marie Curie, Univ Paris 06, UMR 7222, Institut des Systèmes Intelligents et de Robotique, Paris, France; Centre National de la Recherche Scientifique, UMR 7222, ISIR, Paris, France

39
Chou TS, Bucci LD, Krichmar JL. Learning touch preferences with a tactile robot using dopamine modulated STDP in a model of insular cortex. Front Neurorobot 2015; 9:6. [PMID: 26257639] [PMCID: PMC4510776] [DOI: 10.3389/fnbot.2015.00006]
Abstract
Neurorobots enable researchers to study how behaviors are produced by neural mechanisms in an uncertain, noisy, real-world environment. To investigate how the somatosensory system processes noisy, real-world touch inputs, we introduce a neurorobot called CARL-SJR, which has a full-body tactile sensory area. The design of CARL-SJR is such that it encourages people to communicate with it through gentle touch. CARL-SJR provides feedback to users by displaying bright colors on its surface. In the present study, we show that CARL-SJR is capable of learning associations between conditioned stimuli (CS; a color pattern on its surface) and unconditioned stimuli (US; a preferred touch pattern) by applying a spiking neural network (SNN) with neurobiologically inspired plasticity. Specifically, we modeled the primary somatosensory cortex, prefrontal cortex, striatum, and the insular cortex, which is important for hedonic touch, to process noisy data generated directly from CARL-SJR's tactile sensory area. To facilitate learning, we applied dopamine-modulated Spike Timing Dependent Plasticity (STDP) to our simulated prefrontal cortex, striatum, and insular cortex. To cope with noisy, varying inputs, the SNN was tuned to produce traveling waves of activity that carried spatiotemporal information. Despite the noisy tactile sensors, spike trains, and variations in subject hand swipes, the learning was quite robust. Further, insular cortex activity in the incremental pathway of the dopaminergic reward system allowed us to control CARL-SJR's preference for touch direction without heavily pre-processed inputs. The emergent behaviors found in this model match animals' behaviors, wherein they prefer touch in particular areas and directions. Thus, the results in this paper could serve as an explanation of the underlying neural mechanisms for developing tactile preferences and hedonic touch.
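Dopamine-modulated STDP, the learning rule named here, is commonly formalized as a three-factor rule: a pre/post spike pairing sets an eligibility trace via an exponential STDP window, and the weight only changes when a dopamine signal arrives while the trace is active. A minimal sketch, with illustrative time constants and amplitudes (not the paper's SNN parameters):

```python
import math

def stdp_trace(dt, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Eligibility from one spike pairing; dt = t_post - t_pre (ms)."""
    if dt >= 0:
        return a_plus * math.exp(-dt / tau)   # pre before post: LTP-like
    return -a_minus * math.exp(dt / tau)      # post before pre: LTD-like

def weight_update(w, dt, dopamine, lr=0.05, w_min=0.0, w_max=1.0):
    """Three-factor rule: dw = lr * dopamine * eligibility, clipped."""
    w += lr * dopamine * stdp_trace(dt)
    return min(max(w, w_min), w_max)

w = 0.5
for _ in range(50):            # rewarded causal pairings strengthen the synapse
    w = weight_update(w, dt=10.0, dopamine=1.0)
print(round(w, 3))             # saturates at the upper bound
```

Without the dopamine factor (dopamine=0.0) the trace decays unused and the weight is unchanged, which is the property that lets reward gate which correlations are learned.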
Affiliation(s)
- Ting-Shuo Chou
- Department of Computer Sciences, University of California, Irvine, Irvine, CA, USA
- Liam D Bucci
- Department of Cognitive Sciences, University of California, Irvine, Irvine, CA, USA
- Jeffrey L Krichmar
- Department of Computer Sciences, University of California, Irvine, Irvine, CA, USA; Department of Cognitive Sciences, University of California, Irvine, Irvine, CA, USA
40
Gurney KN, Humphries MD, Redgrave P. A new framework for cortico-striatal plasticity: behavioural theory meets in vitro data at the reinforcement-action interface. PLoS Biol 2015; 13:e1002034. [PMID: 25562526] [PMCID: PMC4285402] [DOI: 10.1371/journal.pbio.1002034]
Abstract
A computational model yields new insights into the bewildering complexity of cortico-striatal plasticity and its rationale for supporting operant learning. Operant learning requires that reinforcement signals interact with action representations at a suitable neural interface. Much evidence suggests that this occurs when phasic dopamine, acting as a reinforcement prediction error, gates plasticity at cortico-striatal synapses, and thereby changes the future likelihood of selecting the action(s) coded by striatal neurons. But this hypothesis faces serious challenges. First, cortico-striatal plasticity is inexplicably complex, depending on spike timing, dopamine level, and dopamine receptor type. Second, there is a credit assignment problem—action selection signals occur long before the consequent dopamine reinforcement signal. Third, the two types of striatal output neuron have apparently opposite effects on action selection. Whether these factors rule out the interface hypothesis and how they interact to produce reinforcement learning is unknown. We present a computational framework that addresses these challenges. We first predict the expected activity changes over an operant task for both types of action-coding striatal neuron, and show they co-operate to promote action selection in learning and compete to promote action suppression in extinction. Separately, we derive a complete model of dopamine and spike-timing dependent cortico-striatal plasticity from in vitro data. We then show this model produces the predicted activity changes necessary for learning and extinction in an operant task, a remarkable convergence of a bottom-up data-driven plasticity model with the top-down behavioural requirements of learning theory. Moreover, we show the complex dependencies of cortico-striatal plasticity are not only sufficient but necessary for learning and extinction. 
Validating the model, we show it can account for behavioural data describing extinction, renewal, and reacquisition, and replicate in vitro experimental data on cortico-striatal plasticity. By bridging the levels between the single synapse and behaviour, our model shows how striatum acts as the action-reinforcement interface. A key component of survival is the ability to learn which actions, in what contexts, yield useful and rewarding outcomes. Actions are encoded in the brain in the cortex but, as many actions are possible at any one time, there needs to be a mechanism to select which one is to be performed. This problem of action selection is mediated by a set of nuclei known as the basal ganglia, which receive convergent “action requests” from all over the cortex and select the one that is currently most important. Working out which is most important is determined by the strength of the input from each action request: the stronger the connection, the more important that action. Understanding learning thus requires understanding how that strength is changed by the outcome of each action. We built a computational model that demonstrates how the brain's internal signal for outcome (carried by the neurotransmitter dopamine) changes the strength of these cortical connections to learn the selection of rewarded actions, and the suppression of unrewarded ones. Our model shows how several known signals in the brain work together to shape the influence of cortical inputs to the basal ganglia at the interface between our actions and their outcomes.
Affiliation(s)
- Kevin N. Gurney
- Department of Psychology, Adaptive Behaviour Research Group, University of Sheffield, United Kingdom
- INSIGNEO Institute for In Silico Medicine, University of Sheffield, United Kingdom
- Peter Redgrave
- Department of Psychology, Adaptive Behaviour Research Group, University of Sheffield, United Kingdom
41
Affiliation(s)
- Stan B. Floresco
- Department of Psychology and Brain Research Center, University of British Columbia, Vancouver, British Columbia, V6T 1Z4 Canada;
42
Woolley DG, Mantini D, Coxon JP, D'Hooge R, Swinnen SP, Wenderoth N. Virtual water maze learning in human increases functional connectivity between posterior hippocampus and dorsal caudate. Hum Brain Mapp 2014; 36:1265-77. [PMID: 25418860] [DOI: 10.1002/hbm.22700]
Abstract
Recent work has demonstrated that functional connectivity between remote brain regions can be modulated by task learning or the performance of an already well-learned task. Here, we investigated the extent to which initial learning and stable performance of a spatial navigation task modulates functional connectivity between subregions of hippocampus and striatum. Subjects actively navigated through a virtual water maze environment and used visual cues to learn the position of a fixed spatial location. Resting-state functional magnetic resonance imaging scans were collected before and after virtual water maze navigation in two scan sessions conducted 1 week apart, with a behavior-only training session in between. There was a large significant reduction in the time taken to intercept the target location during scan session 1 and a small significant reduction during the behavior-only training session. No further reduction was observed during scan session 2. This indicates that scan session 1 represented initial learning and scan session 2 represented stable performance. We observed an increase in functional connectivity between left posterior hippocampus and left dorsal caudate that was specific to scan session 1. Importantly, the magnitude of the increase in functional connectivity was correlated with offline gains in task performance. Our findings suggest cooperative interaction occurs between posterior hippocampus and dorsal caudate during awake rest following the initial phase of spatial navigation learning. Furthermore, we speculate that the increase in functional connectivity observed during awake rest after initial learning might reflect consolidation-related processing.
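Functional connectivity between two regions is typically quantified as the Pearson correlation of their resting-state time series. A synthetic sketch: a shared component whose weight rises from "pre" to "post" stands in for the learning-related coupling increase. The signal construction is illustrative, not the study's seed-based analysis pipeline.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(0)
shared = [random.gauss(0, 1) for _ in range(200)]     # common drive

def roi(weight):
    """Synthetic ROI time series: shared component plus private noise."""
    return [weight * s + random.gauss(0, 1) for s in shared]

r_pre = pearson(roi(0.2), roi(0.2))    # weak hippocampus-caudate coupling
r_post = pearson(roi(0.8), roi(0.8))   # stronger coupling after learning
print(round(r_pre, 2), round(r_post, 2))
```

The larger shared-component weight in the "post" pair yields a reliably higher correlation, mirroring the direction of the reported connectivity increase.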
Affiliation(s)
- Daniel G Woolley
- Department of Kinesiology, Movement Control and Neuroplasticity Research Group, KU Leuven, Leuven, Belgium
43
Matsuda E, Hubert J, Ikegami T. A robotic approach to understanding the role and the mechanism of vicarious trial-and-error in a T-maze task. PLoS One 2014; 9:e102708. [PMID: 25050548] [PMCID: PMC4106851] [DOI: 10.1371/journal.pone.0102708]
Abstract
Vicarious trial-and-error (VTE) is a behavior observed in rat experiments that seems to suggest self-conflict. This behavior is seen mainly when the rats are uncertain about making a decision. The presence of VTE is regarded as an indicator of a deliberative decision-making process, that is, searching, predicting, and evaluating outcomes. This process is slower than automated decision-making processes, such as reflex or habituation, but it allows for flexible and ongoing control of behavior. In this study, we propose for the first time a robotic model of VTE to see if VTE can emerge just from a body-environment interaction, and to show the underlying mechanism responsible for the observation of VTE and the advantages it provides. We tried several robots with different parameters and found that they showed three different types of VTE: high numbers of VTE at the beginning of learning that decrease afterward (the pattern seen in experiments with rats), low numbers during the whole learning period, and high numbers throughout. We were therefore able to reproduce the phenomenon of VTE in a model robot using only a simple dynamical neural network with Hebbian learning, which suggests that VTE is an emergent property of a plastic and embodied neural network. From a comparison of the three types of VTE, we demonstrated that 1) VTE is associated with chaotic activity of neurons in our model and 2) VTE-showing robots were robust to environmental perturbations. We suggest that the instability of neuronal activity found in VTE allows ongoing learning to rebuild its strategy continuously, which creates robust behavior. Based on these results, we suggest that VTE is caused by a similar mechanism in biology and leads to robust decision making in an analogous way.
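The plasticity rule named here, Hebbian learning in a small dynamical network, can be sketched in a few lines: weights grow when pre- and post-synaptic activities are correlated, with a decay term to keep them bounded. Network size, rates, and the tanh activation are illustrative choices, not the paper's robot controller.

```python
import math

def step(acts, w, inputs, eta=0.05, decay=0.01):
    """One network update followed by a decaying Hebbian weight update."""
    n = len(acts)
    new_acts = [math.tanh(inputs[i] + sum(w[i][j] * acts[j] for j in range(n)))
                for i in range(n)]
    for i in range(n):
        for j in range(n):
            w[i][j] += eta * new_acts[i] * acts[j] - decay * w[i][j]
    return new_acts, w

n = 3
acts = [0.0] * n
w = [[0.0] * n for _ in range(n)]
for _ in range(200):                       # units 0 and 1 receive correlated drive
    acts, w = step(acts, w, inputs=[1.0, 1.0, 0.0])

print(w[0][1] > w[0][2])                   # the co-driven pair strengthens more
```

Co-active units wire together while the silent unit's connections stay flat, which is the minimal ingredient behind the self-organizing behavior the authors build on.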
Affiliation(s)
- Eiko Matsuda
- Department of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- School of Informatics and Engineering, University of Sussex, Brighton, United Kingdom
- Japan Society for the Promotion of Science, Tokyo, Japan
- Julien Hubert
- Department of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Takashi Ikegami
- Department of Arts and Sciences, The University of Tokyo, Tokyo, Japan
44
Experimental predictions drawn from a computational model of sign-trackers and goal-trackers. J Physiol Paris 2014; 109:78-86. [DOI: 10.1016/j.jphysparis.2014.06.001]
Abstract
Gaining a better understanding of the biological mechanisms underlying individual variation in response to rewards and reward cues could help to identify and treat individuals more prone to disorders of impulse control, such as addiction. Variation in response to reward cues is captured in rats undergoing autoshaping experiments, in which the appearance of a lever precedes food delivery. Although no response is required for food to be delivered, some rats (goal-trackers) learn to approach and avidly engage the magazine until food delivery, whereas other rats (sign-trackers) come to approach and avidly engage the lever. The impulsive and often maladaptive characteristics of the latter response are reminiscent of addictive behaviour in humans. In a previous article, we developed a computational model accounting for a set of experimental data regarding sign-trackers and goal-trackers. Here we show new simulations of the model to draw experimental predictions that could help further validate or refute it. In particular, we apply the model to new experimental protocols, such as injecting flupentixol locally into the core of the nucleus accumbens rather than systemically, and lesioning the core of the nucleus accumbens before or after conditioning. In addition, we discuss the possibility of removing the food magazine during the inter-trial interval. The predictions from this revised model will help us better understand the role of different brain regions in the behaviours expressed by sign-trackers and goal-trackers.
45
Easy rider: monkeys learn to drive a wheelchair to navigate through a complex maze. PLoS One 2014; 9:e96275. [PMID: 24831130] [PMCID: PMC4022652] [DOI: 10.1371/journal.pone.0096275]
Abstract
The neurological bases of spatial navigation are mainly investigated in rodents and seldom in primates. The few studies of spatial navigation in human and non-human primates have been performed in virtual, not real, environments, mostly because of the methodological difficulties inherent in conducting research on freely-moving monkeys in real-world environments. There is some uncertainty, however, regarding the extrapolation of rodent spatial navigation strategies to primates. Here we present an entirely new platform for investigating real spatial navigation in rhesus monkeys, and we show that monkeys can learn a pathway by using different strategies. In these experiments, three monkeys learned to drive the wheelchair and to follow a specified route through a real maze. After learning the route, probe tests revealed that animals successively use three distinct navigation strategies based on (i) the place of the reward, (ii) the direction taken to obtain the reward, or (iii) a cue indicating the reward location. The strategy used depended on the options proposed and the duration of learning. This study reveals that monkeys, like rodents and humans, switch between different spatial navigation strategies with extended practice, implying well-conserved brain learning systems across species. This task with freely driving monkeys provides a good platform for electrophysiological and pharmacological investigation of spatial navigation in the real world.
46
Retailleau A, Boraud T. The Michelin red guide of the brain: role of dopamine in goal-oriented navigation. Front Syst Neurosci 2014; 8:32. [PMID: 24672436] [PMCID: PMC3957057] [DOI: 10.3389/fnsys.2014.00032]
Abstract
Spatial learning has been recognized over the years to be under the control of the hippocampus and related temporal lobe structures. Hippocampal damage often causes severe impairments in the ability to learn and remember a location in space defined by distal visual cues. Such cognitive disabilities are found in Parkinsonian patients. We recently investigated the role of dopamine in navigation in the 6-hydroxydopamine (6-OHDA) rat, a model of Parkinson's disease (PD) commonly used to investigate the pathophysiology of dopamine depletion (Retailleau et al., 2013). We demonstrated that dopamine (DA) is essential to spatial learning, as its depletion results in spatial impairments. Our results showed that the behavioral effect of DA depletion is correlated with modification of the neural encoding of spatial features and decision-making processes in the hippocampus. However, the origin of these alterations in the neural processing of spatial information needs to be clarified. It could result from a local effect: dopamine depletion directly disturbs the processing of relevant spatial information at the hippocampal level. Alternatively, it could result from a more distributed network effect: dopamine depletion elsewhere in the brain (entorhinal cortex, striatum, etc.) modifies the way the hippocampus processes spatial information. Recent experimental evidence in rodents has indeed demonstrated that other brain areas are involved in the acquisition of spatial information. Amongst these, the cortex-basal ganglia (BG) loop is known to be involved in reinforcement learning and has been identified as an important contributor to spatial learning. In particular, it has been shown that altered activity of the BG striatal complex can impair the ability to perform spatial learning tasks. The present review provides a glimpse of the findings obtained over the past decade that support a dialog between these two structures during spatial learning under DA control.
Affiliation(s)
- Aude Retailleau
- Sagol Department of Neurobiology, University of Haifa, Haifa, Israel
- Thomas Boraud
- Institut des Maladies Neurodégénératives, UMR 5293, University of Bordeaux, Bordeaux, France; Institut des Maladies Neurodégénératives, UMR 5293, CNRS, Bordeaux, France
47
Lesaint F, Sigaud O, Flagel SB, Robinson TE, Khamassi M. Modelling individual differences in the form of Pavlovian conditioned approach responses: a dual learning systems approach with factored representations. PLoS Comput Biol 2014; 10:e1003466. [PMID: 24550719] [PMCID: PMC3923662] [DOI: 10.1371/journal.pcbi.1003466]
Abstract
Reinforcement Learning has greatly influenced models of conditioning, providing powerful explanations of acquired behaviour and underlying physiological observations. However, in recent autoshaping experiments in rats, variation in the form of Pavlovian conditioned responses (CRs) and in associated dopamine activity has called into question the classical hypothesis that phasic dopamine activity corresponds to a reward prediction error-like signal arising from a classical Model-Free system, necessary for Pavlovian conditioning. Over the course of Pavlovian conditioning using food as the unconditioned stimulus (US), some rats (sign-trackers) come to approach and engage the conditioned stimulus (CS) itself - a lever - more and more avidly, whereas other rats (goal-trackers) learn to approach the location of food delivery upon CS presentation. Importantly, although both sign-trackers and goal-trackers learn the CS-US association equally well, only in sign-trackers does phasic dopamine activity show classical reward prediction error-like bursts. Furthermore, neither the acquisition nor the expression of a goal-tracking CR is dopamine-dependent. Here we present a computational model that can account for such individual variations. We show that a combination of a Model-Based system and a revised Model-Free system can account for the development of distinct CRs in rats. Moreover, we show that revising a classical Model-Free system to process stimuli individually, using factored representations, can explain why classical dopaminergic patterns may be observed for some rats and not for others depending on the CR they develop. In addition, the model can account for other behavioural and pharmacological results obtained using the same, or similar, autoshaping procedures. Finally, the model makes it possible to draw a set of experimental predictions that may be verified in a modified experimental protocol.
We suggest that further investigation of factored representations in computational neuroscience studies may be useful.
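The "factored representations" idea, attaching learned values to individual stimuli rather than to whole states, can be illustrated with a toy update in which a shared prediction error is split across the stimuli present on a trial. The stimulus names, the uniform credit split, and the learning rate are illustrative assumptions, not the paper's model.

```python
def factored_update(values, present, reward, alpha=0.2):
    """Distribute one reward-prediction-error over the stimuli present."""
    pred = sum(values[s] for s in present)        # value of the compound
    delta = reward - pred                         # shared prediction error
    for s in present:
        values[s] += alpha * delta / len(present)
    return values, delta

v = {"lever": 0.0, "magazine": 0.0}
deltas = []
for _ in range(200):              # CS (lever) and food magazine co-occur
    v, d = factored_update(v, ["lever", "magazine"], reward=1.0)
    deltas.append(d)

print(round(v["lever"] + v["magazine"], 2))   # compound value approaches 1.0
print(round(deltas[-1], 3))                   # prediction error decays away
```

Because each stimulus carries its own value, biasing which stimulus accrues credit (lever vs. magazine) gives the model a handle on why dopaminergic prediction-error patterns can differ between sign-trackers and goal-trackers.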
Affiliation(s)
- Florian Lesaint
- Institut des Systèmes Intelligents et de Robotique, UMR 7222, UPMC Univ Paris 06, Paris, France
- Institut des Systèmes Intelligents et de Robotique, UMR 7222, CNRS, Paris, France
- Olivier Sigaud
- Institut des Systèmes Intelligents et de Robotique, UMR 7222, UPMC Univ Paris 06, Paris, France
- Institut des Systèmes Intelligents et de Robotique, UMR 7222, CNRS, Paris, France
- Shelly B. Flagel
- Department of Psychiatry, University of Michigan, Ann Arbor, Michigan, United States of America
- Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, United States of America
- Terry E. Robinson
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, United States of America
- Mehdi Khamassi
- Institut des Systèmes Intelligents et de Robotique, UMR 7222, UPMC Univ Paris 06, Paris, France
- Institut des Systèmes Intelligents et de Robotique, UMR 7222, CNRS, Paris, France
48
Renaudo E, Girard B, Chatila R, Khamassi M. Design of a Control Architecture for Habit Learning in Robots. Biomimetic and Biohybrid Systems 2014. [DOI: 10.1007/978-3-319-09435-9_22]
49
Penny WD, Zeidman P, Burgess N. Forward and backward inference in spatial cognition. PLoS Comput Biol 2013; 9:e1003383. [PMID: 24348230] [PMCID: PMC3861045] [DOI: 10.1371/journal.pcbi.1003383]
Abstract
This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
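The forward inference for self-localization described above is a Bayesian filter: a motion-model predict step (path integration) followed by a multiplication by the sensory likelihood. A discrete one-dimensional sketch; the corridor size, slip probability, and sensor model are illustrative, not the paper's continuous-state model.

```python
N = 5                                   # corridor positions 0..4
belief = [1.0 / N] * N                  # uniform prior over position

def predict(belief, p_move=0.8):
    """Motion step: intended +1 move, with slip probability (stay put)."""
    new = [0.0] * N
    for i, p in enumerate(belief):
        new[min(i + 1, N - 1)] += p * p_move
        new[i] += p * (1 - p_move)
    return new

def update(belief, likelihood):
    """Sensory step: multiply by the observation likelihood and normalize."""
    post = [b * l for b, l in zip(belief, likelihood)]
    z = sum(post)
    return [p / z for p in post]

# A landmark reliably visible only at position 2 makes that cell informative.
sensor = [[0.9 if i == j else 0.1 for i in range(N)] for j in range(N)]

belief = predict(belief)                # path integration: one step forward
belief = update(belief, sensor[2])      # sensory input: see the landmark
print(max(range(N), key=lambda i: belief[i]))   # most probable position
```

The same two primitives, run forward over candidate action sequences, give the goal-likelihood computations the abstract describes; backward inference runs the smoothing pass in the opposite direction.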
Affiliation(s)
- Will D. Penny
- Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Peter Zeidman
- Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Neil Burgess
- Institute for Cognitive Neuroscience, University College London, London, United Kingdom
50
Mannella F, Gurney K, Baldassarre G. The nucleus accumbens as a nexus between values and goals in goal-directed behavior: a review and a new hypothesis. Front Behav Neurosci 2013; 7:135. [PMID: 24167476] [PMCID: PMC3805952] [DOI: 10.3389/fnbeh.2013.00135]
Abstract
Goal-directed behavior is a fundamental means by which animals can flexibly solve the challenges posed by variable external and internal conditions. Recently, the processes and brain mechanisms underlying such behavior have been extensively studied from behavioral, neuroscientific and computational perspectives. This research has highlighted the processes underlying goal-directed behavior and associated brain systems including prefrontal cortex, basal ganglia and, in particular therein, the nucleus accumbens (NAcc). This paper focuses on one particular process at the core of goal-directed behavior: how motivational value is assigned to goals on the basis of internal states and environmental stimuli, and how this supports goal selection processes. Various biological and computational accounts have been given of this problem and of related neural and behavioral phenomena, but we still lack an integrated hypothesis on the generation and use of value for goal selection. This paper proposes a hypothesis that aims to solve this problem and is based on these key elements: (a) amygdala and hippocampus establish the motivational value of stimuli and goals; (b) prefrontal cortex encodes various types of action outcomes; (c) NAcc integrates different sources of value, representing them in terms of a common currency with the aid of dopamine, and thereby plays a major role in selecting action outcomes within prefrontal cortex. The "goals" pursued by the organism are the outcomes selected by these processes. The hypothesis is developed in the context of a critical review of the relevant biological and computational literature, which offers it support. The paper shows how the hypothesis has the potential to integrate existing interpretations of motivational value and goal selection.
Affiliation(s)
- Francesco Mannella
- Laboratory of Computational Embodied Neuroscience, Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy