1
Jamey K, Foster NEV, Hyde KL, Dalla Bella S. Does music training improve inhibition control in children? A systematic review and meta-analysis. Cognition 2024; 252:105913. PMID: 39197250. DOI: 10.1016/j.cognition.2024.105913.
Abstract
Inhibition control is an essential executive function in children's development, underpinning self-regulation and the acquisition of social and language abilities. This executive function is intensely engaged by music training: learning an instrument is a complex multisensory task that requires monitoring motor performance and prioritizing auditory streams. This meta-analysis examined the effect of music-based training on inhibition control in children. Records from 1980 to 2023 yielded 22 longitudinal studies with control groups (N = 1734), including 8 randomized controlled trials (RCTs) and 14 non-randomized studies. A random-effects meta-analysis showed that music training improved inhibition control, with a moderate-to-large effect size in the RCTs and a small-to-moderate effect size across the full set of 22 longitudinal studies. Music training plays a privileged role compared with other activities (sports, visual arts, drama) in improving children's executive functioning, with a particular effect on inhibition control. We recommend music training as a complement to education and as a clinical tool for inhibition control remediation (e.g., in autism and ADHD).
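The pooled effect sizes above come from a random-effects model. As a minimal sketch of how such pooling works (DerSimonian-Laird estimation, using hypothetical per-study effect sizes and variances, not the study's data):

```python
import math

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling of per-study effect sizes."""
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q and the between-study variance tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # Random-effects weights add tau^2 to every study's variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se

# Hypothetical standardized mean differences and their variances
effects = [0.6, 0.4, 0.8, 0.3, 0.5]
variances = [0.05, 0.04, 0.08, 0.03, 0.06]
pooled, se = random_effects_pool(effects, variances)
```

When between-study heterogeneity (tau^2) is large, the random-effects weights flatten toward equality, so no single large study dominates the pooled estimate.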
Affiliation(s)
- Kevin Jamey
- International Laboratory for Brain, Music, and Sound Research (BRAMS), Montreal, Canada; Department of Psychology, University of Montreal, Montreal, Canada; Centre for Research on Brain, Language and Music (CRBLM), Montreal, Canada.
- Nicholas E V Foster
- International Laboratory for Brain, Music, and Sound Research (BRAMS), Montreal, Canada; Department of Psychology, University of Montreal, Montreal, Canada; Centre for Research on Brain, Language and Music (CRBLM), Montreal, Canada
- Krista L Hyde
- International Laboratory for Brain, Music, and Sound Research (BRAMS), Montreal, Canada; Department of Psychology, University of Montreal, Montreal, Canada; Centre for Research on Brain, Language and Music (CRBLM), Montreal, Canada
- Simone Dalla Bella
- International Laboratory for Brain, Music, and Sound Research (BRAMS), Montreal, Canada; Department of Psychology, University of Montreal, Montreal, Canada; Centre for Research on Brain, Language and Music (CRBLM), Montreal, Canada; University of Economics and Human Sciences in Warsaw, Warsaw, Poland.
2
Moskovitz T, Miller KJ, Sahani M, Botvinick MM. Understanding dual process cognition via the minimum description length principle. PLoS Comput Biol 2024; 20:e1012383. PMID: 39423224. DOI: 10.1371/journal.pcbi.1012383.
Abstract
Dual-process theories play a central role in both psychology and neuroscience, figuring prominently in domains ranging from executive control to reward-based learning to judgment and decision making. In each of these domains, two mechanisms appear to operate concurrently, one relatively high in computational complexity, the other relatively simple. Why is neural information processing organized in this way? We propose an answer to this question based on the notion of compression. The key insight is that dual-process structure can enhance adaptive behavior by allowing an agent to minimize the description length of its own behavior. We apply a single model based on this observation to findings from research on executive control, reward-based learning, and judgment and decision making, showing that seemingly diverse dual-process phenomena can be understood as domain-specific consequences of a single underlying set of computational principles.
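The compression idea is concrete enough to sketch. Below is a toy two-part code comparison, not the paper's model: a policy's total description length is the bits needed to specify the policy plus the bits needed to encode the observed actions under it. The complexity costs (1 vs. 4 bits) and the action sequence are invented for illustration.

```python
import math

def description_length(policy_probs, actions, complexity_bits):
    """Total code length: bits to describe the policy itself, plus bits to
    encode the action sequence under it (negative log2 likelihood)."""
    data_bits = -sum(math.log2(policy_probs[a]) for a in actions)
    return complexity_bits + data_bits

# Hypothetical two-action task: a cheap uniform policy vs. a costlier
# policy that fits the observed 3:1 bias toward action 0.
actions = [0, 0, 1, 0, 0, 0, 1, 0]
simple = description_length({0: 0.5, 1: 0.5}, actions, complexity_bits=1)
complex_ = description_length({0: 0.75, 1: 0.25}, actions, complexity_bits=4)
```

With only eight observed actions, the simpler policy yields the shorter total code (9 bits vs. about 10.5); on a longer run of the same biased behavior, the complex policy's tighter fit would eventually pay for its extra description cost, which is the flavor of tradeoff the MDL account exploits.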
Affiliation(s)
- Ted Moskovitz
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Google DeepMind, London, United Kingdom
- Kevin J Miller
- Google DeepMind, London, United Kingdom
- Department of Ophthalmology, University College London, London, United Kingdom
- Maneesh Sahani
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Matthew M Botvinick
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Google DeepMind, London, United Kingdom
3
Köhler RM, Binns TS, Merk T, Zhu G, Yin Z, Zhao B, Chikermane M, Vanhoecke J, Busch JL, Habets JGV, Faust K, Schneider GH, Cavallo A, Haufe S, Zhang J, Kühn AA, Haynes JD, Neumann WJ. Dopamine and deep brain stimulation accelerate the neural dynamics of volitional action in Parkinson's disease. Brain 2024; 147:3358-3369. PMID: 38954651. PMCID: PMC11449126. DOI: 10.1093/brain/awae219.
Abstract
The ability to initiate volitional action is fundamental to human behaviour. Loss of dopaminergic neurons in Parkinson's disease is associated with impaired action initiation, also termed akinesia. Both dopamine and subthalamic deep brain stimulation (DBS) can alleviate akinesia, but the underlying mechanisms are unknown. An important question is whether dopamine and DBS facilitate a de novo build-up of neural dynamics for motor execution or accelerate existing cortical movement initiation signals through shared modulatory circuit effects. Answering these questions can provide the foundation for new closed-loop neurotherapies with adaptive DBS, but objectively quantifying neural processing delays prior to the performance of volitional action remains a significant challenge. To overcome this challenge, we studied readiness potentials and trained brain signal decoders on invasive neurophysiology signals in 25 DBS patients (12 female) with Parkinson's disease during performance of self-initiated movements. Combined sensorimotor cortex electrocorticography and subthalamic local field potential recordings were performed OFF therapy (n = 22), ON dopaminergic medication (n = 18) and ON subthalamic DBS (n = 8). This allowed us to compare the therapeutic effects on the neural latency between the earliest cortical representation of movement intention, as decoded by linear discriminant analysis classifiers, and the onset of muscle activation recorded with electromyography. In the hypodopaminergic OFF state, we observed long latencies between motor intention and motor execution for both readiness potentials and machine learning classifications. Both dopamine and DBS significantly shortened these latencies, pointing to a shared therapeutic mechanism for the alleviation of akinesia. To investigate this further, we analysed directional cortico-subthalamic oscillatory communication with multivariate Granger causality. Strikingly, we found that both therapies independently shifted cortico-subthalamic oscillatory information flow from antikinetic beta (13-35 Hz) to prokinetic theta (4-10 Hz) rhythms, a shift that correlated with latencies in motor execution. Our study reveals a shared brain network modulation pattern of dopamine and DBS that may underlie the acceleration of neural dynamics for the augmentation of movement initiation in Parkinson's disease. Rather than producing or increasing preparatory brain signals, both therapies modulate oscillatory communication. These insights link the pathophysiology of akinesia and its therapeutic alleviation to oscillatory network changes in other motor and non-motor domains, e.g. those related to hyperkinesia or effort and reward perception. In the future, our study may inspire the development of clinical brain-computer interfaces based on brain signal decoders to provide temporally precise support for action initiation in patients with brain disorders.
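To make the latency measurement concrete: the sketch below uses entirely synthetic data of our own construction (no relation to the study's recordings or decoders). It detects the earliest sustained threshold crossing of a decoded "intention" signal and reports its lead time over EMG onset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single trial: a decoded "intention" signal ramps up over the
# 500 ms before EMG onset at t = 0 (one sample per millisecond).
t = np.arange(-2000, 500)
intention = np.clip((t + 500) / 500.0, 0, 1) + rng.normal(0, 0.05, t.size)

def neural_onset(signal, times, thresh=0.5, sustain=50):
    """Earliest time at which the signal stays above threshold for
    `sustain` consecutive samples (guards against noise blips)."""
    above = signal > thresh
    for i in range(len(times) - sustain):
        if above[i:i + sustain].all():
            return times[i]
    return None

onset = neural_onset(intention, t)
latency_ms = 0 - onset  # lead of decoded intention over EMG onset
```

The `sustain` window trades sensitivity for robustness: a shorter window detects intention earlier but is more easily fooled by noise, which matters if such a latency estimate is to trigger closed-loop stimulation.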
Affiliation(s)
- Richard M Köhler
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Thomas S Binns
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Bernstein Center for Computational Neuroscience, Berlin 10115, Germany
- Timon Merk
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Guanyu Zhu
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
- Beijing Key Laboratory of Neurostimulation, Beijing 100070, China
- Zixiao Yin
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
- Beijing Key Laboratory of Neurostimulation, Beijing 100070, China
- Baotian Zhao
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
- Beijing Key Laboratory of Neurostimulation, Beijing 100070, China
- Meera Chikermane
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Jojo Vanhoecke
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Johannes L Busch
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Jeroen G V Habets
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Katharina Faust
- Department of Neurosurgery, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Gerd-Helge Schneider
- Department of Neurosurgery, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Alessia Cavallo
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Stefan Haufe
- Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Bernstein Center for Computational Neuroscience, Berlin 10115, Germany
- Research Group for Uncertainty, Inverse Modeling and Machine Learning, Technische Universität Berlin, Berlin 10623, Germany
- Physikalisch-Technische Bundesanstalt Braunschweig und Berlin, Berlin 10587, Germany
- Berlin Center for Advanced Neuroimaging, Bernstein Center for Computational Neuroscience, Berlin 10117, Germany
- Jianguo Zhang
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
- Beijing Key Laboratory of Neurostimulation, Beijing 100070, China
- Andrea A Kühn
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Bernstein Center for Computational Neuroscience, Berlin 10115, Germany
- NeuroCure Clinical Research Centre, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin 10115, Germany
- John-Dylan Haynes
- Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Bernstein Center for Computational Neuroscience, Berlin 10115, Germany
- Physikalisch-Technische Bundesanstalt Braunschweig und Berlin, Berlin 10587, Germany
- Berlin Center for Advanced Neuroimaging, Bernstein Center for Computational Neuroscience, Berlin 10117, Germany
- NeuroCure Clinical Research Centre, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin 10115, Germany
- Wolf-Julian Neumann
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin 10117, Germany
- Bernstein Center for Computational Neuroscience, Berlin 10115, Germany
4
Gershman SJ, Assad JA, Datta SR, Linderman SW, Sabatini BL, Uchida N, Wilbrecht L. Explaining dopamine through prediction errors and beyond. Nat Neurosci 2024; 27:1645-1655. PMID: 39054370. DOI: 10.1038/s41593-024-01705-4.
Abstract
The most influential account of phasic dopamine holds that it reports reward prediction errors (RPEs). The RPE-based interpretation of dopamine signaling is, in its original form, probably too simple and fails to explain all the properties of phasic dopamine observed in behaving animals. This Perspective helps to resolve some of the conflicting interpretations of dopamine that currently exist in the literature. We focus on the following three empirical challenges to the RPE theory of dopamine: why does dopamine (1) ramp up as animals approach rewards, (2) respond to sensory and motor features and (3) influence action selection? We argue that the prediction error concept, once it has been suitably modified and generalized based on an analysis of each computational problem, answers each challenge. Nonetheless, there are a number of additional empirical findings that appear to demand fundamentally different theoretical explanations beyond encoding RPE. Therefore, looking forward, we discuss the prospects for a unifying theory that respects the diversity of dopamine signaling and function as well as the complex circuitry that both underlies and responds to dopaminergic transmission.
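The RPE account referenced above is standard temporal-difference learning. A minimal sketch (a generic textbook illustration, not the authors' model) of how the phasic error at reward shrinks as a predictive cue is learned:

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference step: the reward prediction error (RPE) is
    the difference between received and predicted discounted value."""
    rpe = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * rpe
    return rpe

# A cue reliably followed by reward (r = 1): the RPE at reward delivery
# shrinks as the prediction is learned, the classic phasic-dopamine pattern.
V = {}
rpes = []
for _ in range(100):
    td_update(V, 'cue', 0.0, 'food')
    rpes.append(td_update(V, 'food', 1.0, 'end'))
```

The empirical challenges listed in the abstract (ramping, sensory/motor responses, action selection) are precisely the phenomena this scalar update does not capture without modification.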
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA.
- John A Assad
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Scott W Linderman
- Department of Statistics and Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Bernardo L Sabatini
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Linda Wilbrecht
- Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
5
Maroto-Gómez M, Burguete-Alventosa J, Álvarez-Arias S, Malfaz M, Salichs MÁ. A Bio-Inspired Dopamine Model for Robots with Autonomous Decision-Making. Biomimetics (Basel) 2024; 9:504. PMID: 39194483. DOI: 10.3390/biomimetics9080504.
Abstract
Decision-making systems allow artificial agents to adapt their behaviour to the information they perceive from the environment and to internal processes. Human beings possess unique decision-making capabilities, adapting to current situations and anticipating future challenges. Autonomous robots whose adaptive, anticipatory decision-making emulates humans can exhibit skills that users understand more easily. Human decisions depend strongly on dopamine, a brain substance that regulates motivation and reward, registering positive and negative situations. Drawing on recent neuroscience findings about the role of dopamine in the human brain and its influence on decision-making and motivated behaviour, this paper proposes a model based on how dopamine drives human motivation and decision-making. The model allows robots to behave autonomously in dynamic environments, learning the best action-selection strategy and anticipating future rewards. The results show the model's performance in five scenarios, emphasising how dopamine levels vary with the robot's situation and perceived stimuli. Moreover, we integrate the model into the Mini social robot to show how dopamine levels drive motivated autonomous behaviour by regulating biologically inspired internal processes emulated in the robot.
Affiliation(s)
- Marcos Maroto-Gómez
- Department of Systems Engineering and Automation, University Carlos III of Madrid, Av. de la Universidad, 30, 28911 Leganes, Madrid, Spain
- Javier Burguete-Alventosa
- Department of Systems Engineering and Automation, University Carlos III of Madrid, Av. de la Universidad, 30, 28911 Leganes, Madrid, Spain
- Sofía Álvarez-Arias
- Department of Systems Engineering and Automation, University Carlos III of Madrid, Av. de la Universidad, 30, 28911 Leganes, Madrid, Spain
- María Malfaz
- Department of Systems Engineering and Automation, University Carlos III of Madrid, Av. de la Universidad, 30, 28911 Leganes, Madrid, Spain
- Miguel Ángel Salichs
- Department of Systems Engineering and Automation, University Carlos III of Madrid, Av. de la Universidad, 30, 28911 Leganes, Madrid, Spain
6
Lee RS, Sagiv Y, Engelhard B, Witten IB, Daw ND. A feature-specific prediction error model explains dopaminergic heterogeneity. Nat Neurosci 2024; 27:1574-1586. PMID: 38961229. DOI: 10.1038/s41593-024-01689-1.
Abstract
The hypothesis that midbrain dopamine (DA) neurons broadcast a reward prediction error (RPE) is among the great successes of computational neuroscience. However, recent results contradict a core aspect of this theory: specifically, that the neurons convey a scalar, homogeneous signal. While the predominant family of extensions to the RPE model replicates the classic model in multiple parallel circuits, we argue that these models are ill suited to explain reports of heterogeneity in task variable encoding across DA neurons. Instead, we introduce a complementary 'feature-specific RPE' model, positing that individual ventral tegmental area DA neurons report RPEs for different aspects of an animal's moment-to-moment situation. Further, we show how our framework can be extended to explain patterns of heterogeneity in action responses reported among substantia nigra pars compacta DA neurons. This theory reconciles new observations of DA heterogeneity with classic ideas about RPE coding while also providing a new perspective on how the brain performs reinforcement learning in high-dimensional environments.
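One crude way to caricature the contrast between a scalar RPE and feature-specific errors, under assumptions of our own rather than the paper's (a linear value function over task features, with each model "neuron" reporting the error credited to its own feature):

```python
import numpy as np

def step(w, x, r, x_next, gamma=0.9, alpha=0.1):
    """Scalar TD error, then per-'neuron' signals restricted to each
    feature's contribution (a deliberate caricature of the idea)."""
    scalar_rpe = r + gamma * (w @ x_next) - w @ x
    per_neuron = scalar_rpe * x      # each unit carries error for its feature
    w += alpha * per_neuron          # vector-valued learning, in place
    return scalar_rpe, per_neuron

# Hypothetical 3-feature task in which only feature 0 is active and rewarded.
w = np.zeros(3)
x, x_next = np.array([1.0, 0.0, 0.0]), np.zeros(3)
for _ in range(50):
    step(w, x, 1.0, x_next)
scalar, vec = step(w, x, 1.0, x_next)
```

In this toy, units tied to inactive features stay silent while the unit tied to the rewarded feature carries the shrinking error, so apparent response heterogeneity coexists with a single underlying learning rule.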
Affiliation(s)
- Rachel S Lee
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Yotam Sagiv
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Ben Engelhard
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Nathaniel D Daw
- Princeton Neuroscience Institute, Princeton, NJ, USA.
- Department of Psychology, Princeton University, Princeton, NJ, USA.
7
Colas JT, O’Doherty JP, Grafton ST. Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts. PLoS Comput Biol 2024; 20:e1011950. PMID: 38552190. PMCID: PMC10980507. DOI: 10.1371/journal.pcbi.1011950.
Abstract
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as those of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by the physical aspects of actions. Beyond the effects of reward outcomes on learning, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What are the effects of action bias (a preference for certain actions per se) and of action hysteresis, determined by the history of previously chosen actions? The present study addressed these questions through the incremental assembly of models for sequential choice data from a task with hierarchical structure, adding complexity to learning. Through systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well showed the greatest biases, but those who learned accurately were also significantly biased. The direction of hysteresis varied among individuals, appearing as repetition or, more commonly, alternation biases persisting across multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. In light of how bias and hysteresis function as heuristics for efficient control, adapting to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
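Bias and hysteresis of the kind described above are commonly modeled as additive terms in a softmax policy's logits. The sketch below is our own illustration under that convention, not the authors' model: a static per-action bias and a previous-action kernel shift choice probabilities even when learned values are identical.

```python
import math

def choice_probs(q_values, bias, prev_action, beta=3.0, kappa=1.0):
    """Softmax policy whose logits mix learned values with a static action
    bias and a hysteresis term: kappa > 0 favours repeating the previous
    action, kappa < 0 favours alternating away from it."""
    logits = []
    for a, q in enumerate(q_values):
        h = kappa if a == prev_action else 0.0
        logits.append(beta * q + bias[a] + h)
    m = max(logits)                       # subtract max for stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Equal learned values: choice is driven entirely by hysteresis.
p_rep = choice_probs([0.5, 0.5], bias=[0.0, 0.0], prev_action=1, kappa=1.0)
p_alt = choice_probs([0.5, 0.5], bias=[0.0, 0.0], prev_action=1, kappa=-1.0)
```

Fitting `bias` and `kappa` alongside the learning parameters is what lets a model separate genuine value learning from the "nonexpert" forces on choice; omitting them misattributes that variance to learning.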
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- John P. O’Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
8
Wientjes S, Holroyd CB. The successor representation subserves hierarchical abstraction for goal-directed behavior. PLoS Comput Biol 2024; 20:e1011312. PMID: 38377074. PMCID: PMC10906840. DOI: 10.1371/journal.pcbi.1011312.
Abstract
Humans have the ability to craft abstract, temporally extended and hierarchically organized plans. For instance, when considering how to make spaghetti for dinner, we typically concern ourselves with useful "subgoals" in the task, such as cutting onions, boiling pasta, and cooking a sauce, rather than particulars such as how many cuts to make to the onion, or exactly which muscles to contract. A core question is how such decomposition of a more abstract task into logical subtasks happens in the first place. Previous research has shown that humans are sensitive to a form of higher-order statistical learning known as "community structure". Community structure is a common feature of abstract tasks characterized by a logical ordering of subtasks. This structure can be captured by a model in which humans learn predictions of upcoming events multiple steps into the future, discounting predictions of events further away in time. One such model is the "successor representation", which has been argued to be useful for hierarchical abstraction. To date, however, no study has convincingly shown that this hierarchical abstraction can be put to use for goal-directed behavior. Here, we investigate whether participants utilize learned community structure to craft hierarchically informed action plans for goal-directed behavior. Participants were asked to search for paintings in a virtual museum, where the paintings were grouped together in "wings" representing community structure in the museum. We find that participants' choices accord with the hierarchical structure of the museum and that their response times are best predicted by a successor representation. The degree to which the response times reflect the community structure of the museum correlates with several measures of performance, including the ability to craft temporally abstract action plans. These results suggest that successor representation learning subserves hierarchical abstractions relevant for goal-directed behavior.
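The successor representation has a standard temporal-difference learning rule. A minimal sketch on a hypothetical four-state loop (not the museum task): each row of M comes to encode the discounted expected future occupancy of every state, which is what lets temporally adjacent "community" states acquire similar representations.

```python
import numpy as np

def sr_td_update(M, s, s_next, alpha=0.1, gamma=0.95):
    """TD learning of the successor representation: move row s toward the
    current occupancy (one-hot) plus the discounted successor row."""
    target = np.eye(M.shape[0])[s] + gamma * M[s_next]
    M[s] += alpha * (target - M[s])

# Hypothetical deterministic 4-state loop: 0 -> 1 -> 2 -> 3 -> 0 -> ...
n = 4
M = np.zeros((n, n))
s = 0
for _ in range(2000):
    s_next = (s + 1) % n
    sr_td_update(M, s, s_next)
    s = s_next
```

After learning, row 0 ranks states by temporal proximity (0 > 1 > 2 > 3), so distances in SR space mirror how soon states will be visited rather than their raw transition probabilities.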
Affiliation(s)
- Sven Wientjes
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Clay B. Holroyd
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
9
Garau C, Hayes J, Chiacchierini G, McCutcheon JE, Apergis-Schoute J. Involvement of A13 dopaminergic neurons in prehensile movements but not reward in the rat. Curr Biol 2023; 33:4786-4797.e4. PMID: 37816347. DOI: 10.1016/j.cub.2023.09.044.
Abstract
Tyrosine hydroxylase (TH)-containing neurons of the dopamine (DA) cell group A13 are well positioned to impact known DA-related functions, as their descending projections innervate target regions that regulate vigilance, sensory integration, and motor execution. Despite this connectivity, little is known regarding the functionality of A13-DA circuits. Using TH-specific loss-of-function methodology and techniques to monitor population activity in transgenic rats in vivo, we investigated the contribution of A13-DA neurons to reward and movement-related actions. Our work demonstrates a role for A13-DA neurons in grasping and handling of objects but not reward. A13-DA neurons responded strongly when animals grabbed and manipulated food items, whereas their inactivation or degeneration prevented animals from successfully doing so, a deficit partially attributable to a reduction in grip strength. By contrast, there was no relation between A13-DA activity and food-seeking behavior when animals were tested on a reward-based task that did not include a reaching/grasping response. Motivation for food was unaffected, as goal-directed behavior for food items was generally intact following A13 neuronal inactivation/degeneration. An anatomical investigation confirmed that A13-DA neurons project to the superior colliculus (SC) and also demonstrated a novel A13-DA projection to the reticular formation (RF). These results establish a functional role for A13-DA neurons in prehensile actions that are uncoupled from the motivational factors contributing to the initiation of forelimb movements, and help position A13-DA circuits within the functional framework of centrally located DA populations and their ability to coordinate movement.
Affiliation(s)
- Celia Garau
- Department of Neuroscience, Psychology & Behaviour, University of Leicester, University Road, Leicester LE1 9HN, UK.
- Jessica Hayes
- Department of Neuroscience, Psychology & Behaviour, University of Leicester, University Road, Leicester LE1 9HN, UK
- Giulia Chiacchierini
- Department of Neuroscience, Psychology & Behaviour, University of Leicester, University Road, Leicester LE1 9HN, UK; Department of Physiology and Pharmacology, La Sapienza University of Rome, 00185 Rome, Italy; Laboratory of Neuropsychopharmacology, Santa Lucia Foundation, 00143 Rome, Italy
- James E McCutcheon
- Department of Neuroscience, Psychology & Behaviour, University of Leicester, University Road, Leicester LE1 9HN, UK; Department of Psychology, UiT The Arctic University of Norway, Huginbakken 32, 9037 Tromsø, Norway
- John Apergis-Schoute
- Department of Neuroscience, Psychology & Behaviour, University of Leicester, University Road, Leicester LE1 9HN, UK; Department of Biological and Experimental Psychology, Queen Mary University of London, London E1 4NS, UK.
10
Browning M. Enhancing reward learning in the absence of an effect on reward. Brain 2023; 146:3574-3575. PMID: 37471505. DOI: 10.1093/brain/awad248.
Abstract
This scientific commentary refers to ‘Impulse control disorder in Parkinson’s disease is associated with abnormal frontal value signalling’ by Tichelaar et al. (https://doi.org/10.1093/brain/awad162).
Affiliation(s)
- Michael Browning: Department of Psychiatry, University of Oxford, UK; Oxford Health NHS Trust, Oxford, UK.
11
Barnett WH, Kuznetsov A, Lapish CC. Distinct cortico-striatal compartments drive competition between adaptive and automatized behavior. PLoS One 2023; 18:e0279841. [PMID: 36943842] [PMCID: PMC10030038] [DOI: 10.1371/journal.pone.0279841]
Abstract
Cortical and basal ganglia circuits play a crucial role in the formation of goal-directed and habitual behaviors. In this study, we investigate the cortico-striatal circuitry involved in learning and the role of this circuitry in the emergence of inflexible behaviors such as those observed in addiction. Specifically, we develop a computational model of cortico-striatal interactions that performs concurrent goal-directed and habit learning. The model accomplishes this by distinguishing learning processes in the dorsomedial striatum (DMS) that rely on reward prediction error signals as distinct from the dorsolateral striatum (DLS) where learning is supported by salience signals. These striatal subregions each operate on unique cortical input: the DMS receives input from the prefrontal cortex (PFC) which represents outcomes, and the DLS receives input from the premotor cortex which determines action selection. Following an initial learning of a two-alternative forced choice task, we subjected the model to reversal learning, reward devaluation, and learning a punished outcome. Behavior driven by stimulus-response associations in the DLS resisted goal-directed learning of new reward feedback rules despite devaluation or punishment, indicating the expression of habit. We repeated these simulations after the impairment of executive control, which was implemented as poor outcome representation in the PFC. The degraded executive control reduced the efficacy of goal-directed learning, and stimulus-response associations in the DLS were even more resistant to the learning of new reward feedback rules. In summary, this model describes how circuits of the dorsal striatum are dynamically engaged to control behavior and how the impairment of executive control by the PFC enhances inflexible behavior.
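The arbitration the abstract describes, between a value-driven (DMS-like) controller and a habit (DLS-like) controller, can be caricatured in a few lines. This is a toy illustration under stated assumptions, not the authors' published model: the class name and parameters are ours, and the salience-driven DLS update is reduced to simple use-based strengthening.

```python
class DualController:
    """Toy two-controller agent for a 2-choice task: a goal-directed
    system updated by signed reward prediction errors (DMS-like) and a
    habit system strengthened by mere repetition (a crude stand-in for
    the salience-driven DLS learning described above)."""

    def __init__(self, alpha_gd=0.3, alpha_habit=0.05):
        self.q = [0.0, 0.0]   # goal-directed action values
        self.h = [0.0, 0.0]   # habit strengths
        self.alpha_gd = alpha_gd
        self.alpha_habit = alpha_habit

    def choose(self, habit_weight=0.5):
        # Greedy arbitration over a weighted mix of value and habit.
        total = [self.q[a] + habit_weight * self.h[a] for a in (0, 1)]
        return 0 if total[0] >= total[1] else 1

    def learn(self, action, reward):
        rpe = reward - self.q[action]
        self.q[action] += self.alpha_gd * rpe   # value tracks reward
        self.h[action] += self.alpha_habit      # habit tracks use

agent = DualController()
for _ in range(200):   # extended training: action 0 pays off
    agent.learn(0, 1.0)
for _ in range(50):    # devaluation: action 0 no longer pays
    agent.learn(0, 0.0)
```

After devaluation the goal-directed value collapses while the habit strength persists, so the devalued action keeps being emitted when habitual control dominates, which is the signature of inflexible behavior the model captures.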
Affiliation(s)
- William H. Barnett: Department of Psychology, Indiana University—Purdue University Indianapolis, Indianapolis, Indiana, United States of America.
- Alexey Kuznetsov: Department of Mathematics, Indiana University—Purdue University Indianapolis, Indianapolis, Indiana, United States of America.
- Christopher C. Lapish: Department of Psychology, Indiana University—Purdue University Indianapolis, Indianapolis, Indiana, United States of America; Stark Neurosciences Research Institute, Indiana University—Purdue University Indianapolis, Indianapolis, Indiana, United States of America.
12
Morita K, Shimomura K, Kawaguchi Y. Opponent Learning with Different Representations in the Cortico-Basal Ganglia Circuits. eNeuro 2023; 10:ENEURO.0422-22.2023. [PMID: 36653187] [PMCID: PMC9884109] [DOI: 10.1523/eneuro.0422-22.2023]
Abstract
The direct and indirect pathways of the basal ganglia (BG) have been suggested to learn mainly from positive and negative feedback, respectively. Since these pathways unevenly receive inputs from different cortical neuron types and/or regions, they may preferentially use different state/action representations. We explored whether such a combined use of different representations, coupled with different learning rates for positive and negative reward prediction errors (RPEs), has computational benefits. We modeled the animal as an agent equipped with two learning systems, each adopting either an individual representation (IR) or a successor representation (SR) of states. Varying each system's representation (IR or SR) and its learning rates for positive and negative RPEs, we examined how the agent performed in a dynamic reward navigation task. We found that the combination of an SR-based system learning mainly from positive RPEs and an IR-based system learning mainly from negative RPEs achieved better performance in the task than other combinations. In such a combination of appetitive SR-based and aversive IR-based systems, the two systems showed activities of comparable magnitude with opposite signs, consistent with the suggested profiles of the two BG pathways. Moreover, the architecture of such a combination provides a novel coherent explanation for the functional significance and underlying mechanisms of diverse findings about the cortico-BG circuits. These results suggest that combining different representations with appetitive and aversive learning can be an effective learning strategy in certain dynamic environments, and that it might actually be implemented in the cortico-BG circuits.
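The opponent, asymmetric-learning-rate aspect of this architecture can be sketched as two systems that share one value estimate but weight positive and negative RPEs differently. This sketch deliberately drops the SR/IR distinction and keeps only the learning-rate asymmetry; all names and numbers are ours, not the paper's.

```python
def opponent_step(v_app, v_ave, reward,
                  lr_app=(0.3, 0.05), lr_ave=(0.05, 0.3)):
    """One update of an appetitive system (learning mainly from positive
    RPEs) and an aversive system (learning mainly from negative RPEs).
    The agent's overall value estimate is their sum; the (positive-RPE,
    negative-RPE) learning rates are illustrative, not fitted."""
    rpe = reward - (v_app + v_ave)   # shared prediction error
    i = 0 if rpe >= 0 else 1         # select the pos- or neg-RPE rate
    return v_app + lr_app[i] * rpe, v_ave + lr_ave[i] * rpe

# A reward of 1 is delivered for a while, then withdrawn; after the
# switch the two systems settle at comparable magnitudes with opposite
# signs, echoing the appetitive/aversive profile described above.
v_app = v_ave = 0.0
for _ in range(200):
    v_app, v_ave = opponent_step(v_app, v_ave, 1.0)
for _ in range(200):
    v_app, v_ave = opponent_step(v_app, v_ave, 0.0)
```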
Affiliation(s)
- Kenji Morita: Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan; International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo 113-0033, Japan.
- Kanji Shimomura: Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan; Department of Behavioral Medicine, National Institute of Mental Health, National Center of Neurology and Psychiatry, Kodaira 187-8551, Japan.
- Yasuo Kawaguchi: Brain Science Institute, Tamagawa University, Machida 194-8610, Japan; National Institute for Physiological Sciences (NIPS), Okazaki 444-8787, Japan.
13
Striatal D2: Where habits and newly learned actions meet. Learn Behav 2022; 50:267-268. [DOI: 10.3758/s13420-022-00526-4]
14
Raj V, Thekkuveettil A. Dopamine plays a critical role in the olfactory adaptive learning pathway in Caenorhabditis elegans. J Neurosci Res 2022; 100:2028-2043. [PMID: 35906758] [DOI: 10.1002/jnr.25112]
Abstract
Encoding and consolidating information through learning and memory is vital for adaptation and survival. Dopamine (DA) is a critical neurotransmitter that modulates behavior, but its role in learning and memory processes is not well defined. Here, we used an olfactory adaptive learning paradigm in Caenorhabditis elegans to elucidate the role of DA in the memory pathway. cat-2 mutant worms, which have low DA synthesis, showed a significant reduction in chemotaxis index (CI) compared to the wild type (WT) after short-term conditioning. dat-1::ICE worms, in which DA neurons degenerate, showed a significant reduction in adaptive learning and memory. When worms were trained in the presence of exogenous DA (10 mM) instead of food, a substantial increase in CI was observed. Furthermore, our results suggest that both dop-1 and dop-3 DA receptors are involved in memory retention, and that DA release during conditioning is essential to initiate the learning pathway. We also noted enhanced cholinergic receptor activity in the absence of dopaminergic neurons. Strains expressing GCaMP6 in DA neurons (pdat-1::GCaMP-6::mCherry) showed a rise in intracellular calcium influx in the presence of the conditioned stimulus after training, suggesting that DA neurons are activated during memory recall. These results reveal a critical role for DA in adaptive learning and memory and indicate that DA neurons contribute to the effective processing of cognitive function.
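The chemotaxis index (CI) used as the learning read-out above is conventionally computed from worm counts on the assay plate. The function below follows the common definition; exact scoring conventions vary between labs, and the example counts are invented.

```python
def chemotaxis_index(n_attractant, n_control, n_total):
    """CI = (worms at the attractant spot - worms at the control spot)
    divided by total worms scored.  Ranges from -1 (full avoidance)
    to +1 (full attraction), with 0 indicating indifference."""
    if n_total <= 0:
        raise ValueError("no worms scored")
    return (n_attractant - n_control) / n_total

# e.g. 70 of 100 scored worms at the attractant, 10 at the control spot
ci = chemotaxis_index(70, 10, 100)
```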
Affiliation(s)
- Vishnu Raj: Division of Molecular Medicine, Sree Chitra Tirunal Institute for Medical Sciences and Technology, BMT Wing, Trivandrum, India.
- Anoopkumar Thekkuveettil: Division of Molecular Medicine, Sree Chitra Tirunal Institute for Medical Sciences and Technology, BMT Wing, Trivandrum, India.
15
Karin O, Alon U. The dopamine circuit as a reward-taxis navigation system. PLoS Comput Biol 2022; 18:e1010340. [PMID: 35877694] [PMCID: PMC9352198] [DOI: 10.1371/journal.pcbi.1010340]
Abstract
Studying the brain circuits that control behavior is challenging, since in addition to their structural complexity there are continuous feedback interactions between actions and sensed inputs from the environment. It is therefore important to identify mathematical principles that can be used to develop testable hypotheses. In this study, we use ideas and concepts from systems biology to study the dopamine system, which controls learning, motivation, and movement. Using data from neuronal recordings in behavioral experiments, we developed a mathematical model for dopamine responses and the effect of dopamine on movement. We show that the dopamine system shares core functional analogies with bacterial chemotaxis. Just as chemotaxis robustly climbs chemical attractant gradients, the dopamine circuit performs ‘reward-taxis’ where the attractant is the expected value of reward. The reward-taxis mechanism provides a simple explanation for scale-invariant dopaminergic responses and for matching in free operant settings, and makes testable quantitative predictions. We propose that reward-taxis is a simple and robust navigation strategy that complements other, more goal-directed navigation mechanisms.
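The chemotaxis analogy can be made concrete with a toy run-and-tumble agent on a one-dimensional reward landscape. This is our sketch of the idea, not the authors' model: the tumble rule responds to the change in log expected reward, which is one way to obtain the scale-invariant responses the abstract mentions (multiplying the reward by a constant leaves behavior unchanged). All constants are illustrative.

```python
import math
import random

def reward_taxis(reward_fn, steps=3000, seed=0):
    """Run-and-tumble 'reward-taxis': keep heading while expected reward
    is rising, tumble (reverse) more often while it is falling.  The
    response depends only on d(log reward), so rescaling reward_fn by
    any positive constant does not change the policy."""
    rng = random.Random(seed)
    x, heading = 0.0, 1
    prev = math.log(max(reward_fn(x), 1e-300))
    for _ in range(steps):
        x += 0.01 * heading
        cur = math.log(max(reward_fn(x), 1e-300))
        # Sigmoidal tumble probability: low while log-reward increases
        p_tumble = 0.1 / (1.0 + math.exp(50.0 * (cur - prev)))
        if rng.random() < p_tumble:
            heading = -heading
        prev = cur
    return x

peak_at_5 = lambda x: math.exp(-(x - 5.0) ** 2)   # reward peaks at x = 5
a = reward_taxis(peak_at_5)
b = reward_taxis(lambda x: 1000.0 * peak_at_5(x))  # rescaled landscape
```

Both runs climb to and hover around the reward peak, illustrating gradient climbing on expected value rather than on a physical chemical gradient.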
Affiliation(s)
- Omer Karin: Dept. of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel; Dept. of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom; Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, United Kingdom.
- Uri Alon: Dept. of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel.
|
16
|
Codol O, Gribble PL, Gurney KN. Differential Dopamine Receptor-Dependent Sensitivity Improves the Switch Between Hard and Soft Selection in a Model of the Basal Ganglia. Neural Comput 2022; 34:1588-1615. [PMID: 35671472] [DOI: 10.1162/neco_a_01517]
Abstract
The problem of selecting one action from a set of possible actions, referred to simply as the problem of action selection, is a ubiquitous challenge in the animal world. For vertebrates, the basal ganglia (BG) are widely thought to implement the core computation that solves this problem, as their anatomy and physiology are well suited to this end. However, the BG still display physiological features whose role in achieving efficient action selection remains unclear. In particular, the two types of dopaminergic receptors present in the BG (D1 and D2) are known to give rise to mechanistically different responses. The overall effect is a difference in sensitivity to dopamine, which may have ramifications for action selection. However, which receptor type leads to a stronger response is unclear due to the complexity of the intracellular mechanisms involved. In this study, we use an existing high-level computational model of the BG, which assumes that dopamine contributes to action selection by enabling a switch between different selection regimes, to predict which of D1 or D2 has the greater sensitivity. That is, we ask: assuming dopamine enables a switch between action selection regimes in the BG, which functional sensitivity values would result in an improved action selection computation? To answer this, we quantitatively assessed the model's capacity to perform action selection as we parametrically manipulated the sensitivity weights of D1 and D2. We show that differential (rather than equal) D1 and D2 sensitivity to dopaminergic input improves the switch between selection regimes during the action selection computation in our model; specifically, greater D2 sensitivity than D1 led to these improvements.
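The "hard" versus "soft" selection regimes themselves can be illustrated with a single softmax gain parameter standing in for the dopamine-dependent effect; in the published model, the D1 and D2 sensitivity weights determine how strongly dopamine moves the circuit between such regimes. Everything below is a toy illustration, not the Gurney-style BG model itself.

```python
import math

def selection_probs(saliences, gain):
    """Softmax over channel saliences.  Low gain gives 'soft' selection
    (graded, exploratory); high gain approaches 'hard' winner-take-all
    selection of the most salient channel."""
    exps = [math.exp(gain * s) for s in saliences]
    z = sum(exps)
    return [e / z for e in exps]

# Same three competing channels under a weak vs strong dopamine effect
soft = selection_probs([1.0, 0.9, 0.2], gain=1.0)
hard = selection_probs([1.0, 0.9, 0.2], gain=20.0)
```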
Affiliation(s)
- Olivier Codol: Department of Psychology and Department of Physiology and Pharmacology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON N6A 3K7, Canada.
- Paul L Gribble: Department of Psychology and Department of Physiology and Pharmacology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON N6A 3K7, Canada; Haskins Laboratories, New Haven, CT 06511, U.S.A.
- Kevin N Gurney: Department of Psychology, University of Sheffield, Sheffield S10 2TN, U.K.
17
Möller M, Manohar S, Bogacz R. Uncertainty-guided learning with scaled prediction errors in the basal ganglia. PLoS Comput Biol 2022; 18:e1009816. [PMID: 35622863] [PMCID: PMC9182698] [DOI: 10.1371/journal.pcbi.1009816]
Abstract
To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when the observations are noisy, the individual rewards should have less influence on tracking of average reward, and the estimate of the mean reward should be updated to a smaller extent after each observation. However, it is not known how the magnitude of the observation noise might be tracked and used to control prediction updates in the brain reward system. Here, we introduce a new model that uses simple, tractable learning rules that track the mean and standard deviation of reward, and leverages prediction errors scaled by uncertainty as the central feedback signal. We show that the new model has an advantage over conventional reinforcement learning models in a value tracking task, and approaches a theoretic limit of performance provided by the Kalman filter. Further, we propose a possible biological implementation of the model in the basal ganglia circuit. In the proposed network, dopaminergic neurons encode reward prediction errors scaled by standard deviation of rewards. We show that such scaling may arise if the striatal neurons learn the standard deviation of rewards and modulate the activity of dopaminergic neurons. The model is consistent with experimental findings concerning dopamine prediction error scaling relative to reward magnitude, and with many features of striatal plasticity. Our results span across the levels of implementation, algorithm, and computation, and might have important implications for understanding the dopaminergic prediction error signal and its relation to adaptive and effective learning.
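A minimal version of such uncertainty-scaled updating (our sketch with illustrative constants, not the paper's exact equations) tracks the mean and spread of reward and divides the prediction error by the spread before updating the mean:

```python
import random

def scaled_pe_step(m, s, reward, lr_m=0.1, lr_s=0.1, s_floor=1e-3):
    """One update with an uncertainty-scaled prediction error.
    m: estimated mean reward; s: estimated spread (mean absolute error).
    Dividing the RPE by s means noisy environments (large s) produce
    smaller updates of the mean, as described above."""
    delta = (reward - m) / max(s, s_floor)   # scaled prediction error
    m = m + lr_m * delta
    s = s + lr_s * (abs(reward - m) - s)     # track typical error size
    return m, s

# Noisy reward source: true mean 5, Gaussian noise with SD 2
rng = random.Random(0)
m, s = 0.0, 1.0
for _ in range(5000):
    m, s = scaled_pe_step(m, s, 5.0 + rng.gauss(0.0, 2.0))
```

Over many draws, m settles near the true mean while s settles near the mean absolute deviation of the rewards, so the effective learning rate automatically adapts to the noise level.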
Affiliation(s)
- Moritz Möller: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom.
- Sanjay Manohar: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom; Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom.
- Rafal Bogacz: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom.
18
Model-based learning retrospectively updates model-free values. Sci Rep 2022; 12:2358. [PMID: 35149713] [PMCID: PMC8837618] [DOI: 10.1038/s41598-022-05567-3]
Abstract
Reinforcement learning (RL) is widely regarded as divisible into two distinct computational strategies. Model-free learning is a simple RL process in which a value is associated with actions, whereas model-based learning relies on the formation of internal models of the environment to maximise reward. Recently, theoretical and animal work has suggested that such models might be used to train model-free behaviour, reducing the burden of costly forward planning. Here we devised a way to probe this possibility in human behaviour. We adapted a two-stage decision task and found evidence that model-based processes at the time of learning can alter model-free valuation in healthy individuals. We asked people to rate subjective value of an irrelevant feature that was seen at the time a model-based decision would have been made. These irrelevant feature value ratings were updated by rewards, but in a way that accounted for whether the selected action retrospectively ought to have been taken. This model-based influence on model-free value ratings was best accounted for by a reward prediction error that was calculated relative to the decision path that would most likely have led to the reward. This effect occurred independently of attention and was not present when participants were not explicitly told about the structure of the environment. These findings suggest that current conceptions of model-based and model-free learning require updating in favour of a more integrated approach. Our task provides an empirical handle for further study of the dialogue between these two learning systems in the future.
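The retrospective logic can be sketched for a two-stage task with a known transition model. This is our toy rendering of the idea, with invented names and parameters: the prediction error is referred to whichever first-stage option most plausibly produced the observed outcome, rather than to the option actually chosen.

```python
def retrospective_update(values, outcome_state, reward, trans, lr=0.2):
    """Model-based retrospective credit assignment.  `trans` maps each
    first-stage action to its outcome-state probabilities.  Credit (and
    the RPE) goes to the action most likely to have led to the observed
    outcome, even if a different action was actually chosen."""
    credited = max(trans, key=lambda a: trans[a][outcome_state])
    values[credited] += lr * (reward - values[credited])
    return credited

# Choosing 'A' but landing, via a rare transition, in the state that
# 'B' usually reaches: a reward there retrospectively credits 'B'.
trans = {'A': {'X': 0.7, 'Y': 0.3}, 'B': {'X': 0.3, 'Y': 0.7}}
values = {'A': 0.0, 'B': 0.0}
credited = retrospective_update(values, outcome_state='Y',
                                reward=1.0, trans=trans)
```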
19
Hamid AA. Dopaminergic specializations for flexible behavioral control: linking levels of analysis and functional architectures. Curr Opin Behav Sci 2021. [DOI: 10.1016/j.cobeha.2021.07.005]
20
Moeller M, Grohn J, Manohar S, Bogacz R. An association between prediction errors and risk-seeking: Theory and behavioral evidence. PLoS Comput Biol 2021; 17:e1009213. [PMID: 34270552] [PMCID: PMC8318232] [DOI: 10.1371/journal.pcbi.1009213]
Abstract
Reward prediction errors (RPEs) and risk preferences have two things in common: both can shape decision making behavior, and both are commonly associated with dopamine. RPEs drive value learning and are thought to be represented in the phasic release of striatal dopamine. Risk preferences bias choices towards or away from uncertainty; they can be manipulated with drugs that target the dopaminergic system. Based on the common neural substrate, we hypothesize that RPEs and risk preferences are linked on the level of behavior as well. Here, we develop this hypothesis theoretically and test it empirically. First, we apply a recent theory of learning in the basal ganglia to predict how RPEs influence risk preferences. We find that positive RPEs should cause increased risk-seeking, while negative RPEs should cause risk-aversion. We then test our behavioral predictions using a novel bandit task in which value and risk vary independently across options. Critically, conditions are included where options vary in risk but are matched for value. We find that our prediction was correct: participants become more risk-seeking if choices are preceded by positive RPEs, and more risk-averse if choices are preceded by negative RPEs. These findings cannot be explained by other known effects, such as nonlinear utility curves or dynamic learning rates.
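The behavioral prediction can be captured in a toy choice rule (ours, not the paper's derivation from basal ganglia learning): a recent positive RPE adds a bonus proportional to an option's outcome spread, and a negative RPE a penalty, biasing choice toward or away from risk even when expected values are matched.

```python
import math

def p_choose_risky(ev_safe, ev_risky, spread, last_rpe, k=0.5, beta=5.0):
    """Probability of choosing the risky option under a softmax rule in
    which the last RPE scales a risk bonus/penalty on the option's
    outcome spread.  k and beta are illustrative constants."""
    u_risky = ev_risky + k * last_rpe * spread
    return 1.0 / (1.0 + math.exp(-beta * (u_risky - ev_safe)))

# Matched expected values: risk attitude is driven by the last RPE alone
after_gain = p_choose_risky(0.5, 0.5, spread=0.5, last_rpe=+1.0)
after_loss = p_choose_risky(0.5, 0.5, spread=0.5, last_rpe=-1.0)
```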
Affiliation(s)
- Moritz Moeller: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom.
- Jan Grohn: Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom.
- Sanjay Manohar: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom; Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom.
- Rafal Bogacz: Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom.
21
Gilbertson T, Steele D. Tonic dopamine, uncertainty and basal ganglia action selection. Neuroscience 2021; 466:109-124. [PMID: 34015370] [DOI: 10.1016/j.neuroscience.2021.05.010]
Abstract
Making optimal decisions in uncertain circumstances requires flexible adaptation of behaviour: exploring alternatives when the best choice is unknown, and exploiting what is known when that is best. Using a computational model of the basal ganglia, we propose that switches between exploratory and exploitative decisions are mediated by the interaction between tonic dopamine and cortical input to the basal ganglia. We show that a biologically detailed action selection circuit model, endowed with dopamine-dependent striatal plasticity, can optimally solve the explore-exploit problem, estimating the true underlying state of a noisy Gaussian diffusion process. Critical to the model's performance was a fluctuating level of tonic dopamine that increased under conditions of uncertainty. Within an optimal range of tonic dopamine, explore-exploit decisions were mediated by the effects of tonic dopamine on the precision of the model's action selection mechanism. Under conditions of uncertain reward pay-out, the model's reduced selectivity allowed disinhibition of multiple alternative actions to be explored at random. Conversely, when uncertainty about reward pay-out was low, enhanced selectivity of the action selection circuit facilitated exploitation of the high-value choice. Model performance matched that of a Kalman filter, which provides an optimal solution for the task. These simulations support the idea that this subcortical neural circuit may have evolved to facilitate decision making in non-stationary reward environments. The model generates several experimental predictions relevant to abnormal decision making in neuropsychiatric and neurological disease.
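The proposed interaction can be caricatured in a bandit agent whose softmax selectivity is gated by a running uncertainty estimate standing in for tonic dopamine. All names and constants below are ours, and the biophysical detail of the published circuit model is deliberately absent.

```python
import math
import random

def run_bandit(payoffs, trials=500, seed=1):
    """Two-armed bandit agent: the mean absolute prediction error acts
    as a tonic-dopamine-like uncertainty signal.  High uncertainty
    lowers the selection gain, disinhibiting exploration; low
    uncertainty sharpens selection toward exploitation."""
    rng = random.Random(seed)
    q = [0.0] * len(payoffs)
    u = 1.0                        # uncertainty ('tonic dopamine' proxy)
    choices = []
    for _ in range(trials):
        gain = 4.0 / (0.5 + u)     # selective only when certain
        exps = [math.exp(gain * v) for v in q]
        r = rng.random() * sum(exps)
        a = 0 if r <= exps[0] else 1
        reward = payoffs[a]()
        pe = reward - q[a]
        q[a] += 0.2 * pe           # value learning
        u += 0.1 * (abs(pe) - u)   # uncertainty tracks surprise
        choices.append(a)
    return q, choices

# Arm 0 always pays 1, arm 1 pays 0: early choices are exploratory,
# then falling uncertainty locks the agent onto the better arm.
q, choices = run_bandit([lambda: 1.0, lambda: 0.0])
```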
Affiliation(s)
- Tom Gilbertson: Department of Neurology, Level 6, South Block, Ninewells Hospital & Medical School, Dundee DD2 4BF, UK; Division of Imaging Science and Technology, Medical School, University of Dundee, DD2 4BF, UK.
- Douglas Steele: Division of Imaging Science and Technology, Medical School, University of Dundee, DD2 4BF, UK.
22
Kim CS. Bayesian mechanics of perceptual inference and motor control in the brain. Biol Cybern 2021; 115:87-102. [PMID: 33471182] [PMCID: PMC7925488] [DOI: 10.1007/s00422-021-00859-9]
Abstract
The free energy principle (FEP) in the neurosciences stipulates that all viable agents induce and minimize informational free energy in the brain to fit their environmental niche. In this study, we continue our effort to make the FEP a more physically principled formalism by implementing free energy minimization based on the principle of least action. We build a Bayesian mechanics (BM) by extending the formulation reported in an earlier publication (Kim, Neural Comput 30:2616-2659, 2018, https://doi.org/10.1162/neco_a_01115) from passive perception to active inference. The BM is a neural implementation of variational Bayes under the FEP in continuous time. The resulting BM takes the form of an effective Hamilton's equation of motion, subject to a control signal arising from the brain's prediction errors at the proprioceptive level. To demonstrate the utility of our approach, we adopt a simple agent-based model and present a concrete numerical illustration of the brain performing recognition dynamics by integrating the BM in neural phase space. Furthermore, we recapitulate the major theoretical architectures of the FEP by comparing our approach with common state-space formulations.
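For orientation, the recognition (perception) and active-inference (action) dynamics that such formulations implement are commonly written, in the Friston-style notation this work builds on, as gradient flows on the free energy F. The paper's own least-action recasting differs in form, so the following is only the standard reference point, not the paper's equations:

```latex
% Perceptual inference: expectations \tilde{\mu} (in generalized
% coordinates, with derivative operator D) flow along the trajectory
% while descending the free-energy gradient:
\dot{\tilde{\mu}} = D\tilde{\mu} - \partial_{\tilde{\mu}} F(\tilde{\mu}, \tilde{y})
% Active inference: action a changes sensory input \tilde{y}(a) so as
% to reduce free energy (proprioceptive prediction error):
\dot{a} = -\partial_{a} F(\tilde{\mu}, \tilde{y}(a))
```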
Affiliation(s)
- Chang Sub Kim: Department of Physics, Chonnam National University, Gwangju 61186, Republic of Korea.