1.
Hanssen R, Rigoux L, Kuzmanovic B, Iglesias S, Kretschmer AC, Schlamann M, Albus K, Edwin Thanarajah S, Sitnikow T, Melzer C, Cornely OA, Brüning JC, Tittgemeyer M. Liraglutide restores impaired associative learning in individuals with obesity. Nat Metab 2023; 5:1352-1363. PMID: 37592007; PMCID: PMC10447249; DOI: 10.1038/s42255-023-00859-y.
Abstract
Survival under selective pressure is driven by the ability of our brain to use sensory information to our advantage to control physiological needs. To that end, neural circuits receive and integrate external environmental cues and internal metabolic signals to form learned sensory associations, consequently motivating and adapting our behaviour. The dopaminergic midbrain plays a crucial role in learning adaptive behaviour and is particularly sensitive to peripheral metabolic signals, including intestinal peptides, such as glucagon-like peptide 1 (GLP-1). In a single-blinded, randomized, controlled, crossover basic human functional magnetic resonance imaging study relying on a computational model of the adaptive learning process underlying behavioural responses, we show that adaptive learning is reduced when metabolic sensing is impaired in obesity, as indexed by reduced insulin sensitivity (participants: N = 30 with normal insulin sensitivity; N = 24 with impaired insulin sensitivity). Treatment with the GLP-1 receptor agonist liraglutide normalizes impaired learning of sensory associations in men and women with obesity. Collectively, our findings reveal that GLP-1 receptor activation modulates associative learning in people with obesity via its central effects within the mesoaccumbens pathway. These findings provide evidence for how metabolic signals can act as neuromodulators to adapt our behaviour to our body's internal state and how GLP-1 receptor agonists work in clinics.
Affiliation(s)
- Ruth Hanssen
- Max Planck Institute for Metabolism Research, Cologne, Germany
- Faculty of Medicine and University Hospital Cologne, Policlinic for Endocrinology, Diabetology and Preventive Medicine (PEPD), University of Cologne, Cologne, Germany
- Lionel Rigoux
- Max Planck Institute for Metabolism Research, Cologne, Germany
- Sandra Iglesias
- Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich and Swiss Federal Institute of Technology, Zurich, Switzerland
- Alina C Kretschmer
- Faculty of Medicine and University Hospital Cologne, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf (CIO ABCD) and Excellence Center for Medical Mycology (ECMM), University of Cologne, Cologne, Germany
- Marc Schlamann
- Faculty of Medicine and University Hospital Cologne, Institute for Diagnostic and Interventional Radiology, University of Cologne, Cologne, Germany
- Kerstin Albus
- Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, Cologne, Germany
- Sharmili Edwin Thanarajah
- Max Planck Institute for Metabolism Research, Cologne, Germany
- Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital Frankfurt, Frankfurt am Main, Germany
- Tamara Sitnikow
- Faculty of Medicine and University Hospital Cologne, Policlinic for Endocrinology, Diabetology and Preventive Medicine (PEPD), University of Cologne, Cologne, Germany
- Corina Melzer
- Max Planck Institute for Metabolism Research, Cologne, Germany
- Oliver A Cornely
- Faculty of Medicine and University Hospital Cologne, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf (CIO ABCD) and Excellence Center for Medical Mycology (ECMM), University of Cologne, Cologne, Germany
- Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, Cologne, Germany
- German Centre for Infection Research (DZIF), Partner Site Bonn-Cologne, Cologne, Germany
- Faculty of Medicine and University Hospital Cologne, Clinical Trials Centre Cologne (ZKS Köln), University of Cologne, Cologne, Germany
- Jens C Brüning
- Max Planck Institute for Metabolism Research, Cologne, Germany
- Faculty of Medicine and University Hospital Cologne, Policlinic for Endocrinology, Diabetology and Preventive Medicine (PEPD), University of Cologne, Cologne, Germany
- Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, Cologne, Germany
- Marc Tittgemeyer
- Max Planck Institute for Metabolism Research, Cologne, Germany
- Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, Cologne, Germany
2.
Woo JH, Aguirre CG, Bari BA, Tsutsui KI, Grabenhorst F, Cohen JY, Schultz W, Izquierdo A, Soltani A. Mechanisms of adjustments to different types of uncertainty in the reward environment across mice and monkeys. Cogn Affect Behav Neurosci 2023; 23:600-619. PMID: 36823249; PMCID: PMC10444905; DOI: 10.3758/s13415-022-01059-z.
Abstract
Despite being unpredictable and uncertain, reward environments often exhibit certain regularities, and animals navigating these environments try to detect and utilize such regularities to adapt their behavior. However, successful learning requires that animals also adjust to uncertainty associated with those regularities. Here, we analyzed choice data from two comparable dynamic foraging tasks in mice and monkeys to investigate mechanisms underlying adjustments to different types of uncertainty. In these tasks, animals selected between two choice options that delivered reward probabilistically, while baseline reward probabilities changed after a variable number (block) of trials without any cues to the animals. To measure adjustments in behavior, we applied multiple metrics based on information theory that quantify consistency in behavior, and fit choice data using reinforcement learning models. We found that in both species, learning and choice were affected by uncertainty about reward outcomes (in terms of determining the better option) and by expectation about when the environment may change. However, these effects were mediated through different mechanisms. First, more uncertainty about the better option resulted in slower learning and forgetting in mice, whereas it had no significant effect in monkeys. Second, expectation of block switches accompanied slower learning, faster forgetting, and increased stochasticity in choice in mice, whereas it only reduced learning rates in monkeys. Overall, while demonstrating the usefulness of metrics based on information theory in examining adaptive behavior, our study provides evidence for multiple types of adjustments in learning and choice behavior according to uncertainty in the reward environment.
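The model class fit to these foraging data — incremental value learning from the chosen option's outcome, forgetting (decay) of the unchosen option's value, and a softmax choice rule — can be sketched in a few lines. This is a generic illustration of that model class, not the authors' fitted model; the parameter names and values (`alpha`, `gamma`, `beta`) are illustrative.

```python
import numpy as np

def simulate_forager(p_reward, alpha=0.3, gamma=0.1, beta=5.0, n_trials=200, seed=0):
    """Simulate a two-option dynamic foraging task with a learning rate
    (alpha) for the chosen option and a forgetting rate (gamma) that
    decays the unchosen option's value toward zero."""
    rng = np.random.default_rng(seed)
    q = np.zeros(2)                       # value estimates for the two options
    choices, rewards = [], []
    for _ in range(n_trials):
        # logistic (two-option softmax) choice based on the value difference
        p_choose_1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))
        c = int(rng.random() < p_choose_1)
        r = float(rng.random() < p_reward[c])
        q[c] += alpha * (r - q[c])        # learn from the chosen option's outcome
        q[1 - c] *= (1.0 - gamma)         # forget the unchosen option's value
        choices.append(c)
        rewards.append(r)
    return np.array(choices), np.array(rewards)

choices, rewards = simulate_forager(p_reward=[0.2, 0.8])
print(choices.mean())  # fraction of trials spent on the richer option
```

Slower learning (smaller `alpha`) and faster forgetting (larger `gamma`) are the kinds of adjustments the abstract reports in mice under higher uncertainty.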
Affiliation(s)
- Jae Hyung Woo
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Claudia G Aguirre
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- Bilal A Bari
- Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- Ken-Ichiro Tsutsui
- Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, UK
- Laboratory of Systems Neuroscience, Tohoku University Graduate School of Life Sciences, Sendai, Japan
- Fabian Grabenhorst
- Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, UK
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Jeremiah Y Cohen
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Allen Institute for Neural Dynamics, Seattle, WA, USA
- Wolfram Schultz
- Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge, UK
- Alicia Izquierdo
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
- The Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA
- Alireza Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
3.
Wittmann MK, Scheuplein M, Gibbons SG, Noonan MP. Local and global reward learning in the lateral frontal cortex show differential development during human adolescence. PLoS Biol 2023; 21:e3002010. PMID: 36862726; PMCID: PMC10013901; DOI: 10.1371/journal.pbio.3002010.
Abstract
Reward-guided choice is fundamental for adaptive behaviour and depends on several component processes supported by prefrontal cortex. Here, across three studies, we show that two such component processes, linking reward to specific choices and estimating the global reward state, develop during human adolescence and are linked to the lateral portions of the prefrontal cortex. These processes reflect the assignment of rewards contingently to local choices, or noncontingently, to choices that make up the global reward history. Using matched experimental tasks and analysis platforms, we show that the influence of both mechanisms increases during adolescence (study 1) and that lesions to lateral frontal cortex (that included and/or disconnected both orbitofrontal and insula cortex) in human adult patients (study 2) and macaque monkeys (study 3) impair both local and global reward learning. Developmental effects were distinguishable from the influence of a decision bias on choice behaviour, known to depend on medial prefrontal cortex. Differences in local and global assignments of reward to choices across adolescence, in the context of delayed grey matter maturation of the lateral orbitofrontal and anterior insula cortex, may underlie changes in adaptive behaviour.
Affiliation(s)
- Marco K. Wittmann
- Department of Experimental Psychology, University of Oxford, Radcliffe Observatory, Oxford, United Kingdom
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Headington, Oxford, United Kingdom
- Department of Experimental Psychology, University College London, London, United Kingdom
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, United Kingdom
- Maximilian Scheuplein
- Department of Experimental Psychology, University of Oxford, Radcliffe Observatory, Oxford, United Kingdom
- Institute of Education and Child Studies, Leiden University, Leiden, the Netherlands
- Sophie G. Gibbons
- Department of Experimental Psychology, University of Oxford, Radcliffe Observatory, Oxford, United Kingdom
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- MaryAnn P. Noonan
- Department of Experimental Psychology, University of Oxford, Radcliffe Observatory, Oxford, United Kingdom
- Department of Psychology, University of York, York, United Kingdom
4.
Experiential values are underweighted in decisions involving symbolic options. Nat Hum Behav 2023; 7:611-626. PMID: 36604497; DOI: 10.1038/s41562-022-01496-3.
Abstract
Standard models of decision-making assume each option is associated with subjective value, regardless of whether this value is inferred from experience (experiential) or explicitly instructed probabilistic outcomes (symbolic). In this study, we present results that challenge the assumption of unified representation of experiential and symbolic value. Across nine experiments, we presented participants with hybrid decisions between experiential and symbolic options. Participants' choices exhibited a pattern consistent with a systematic neglect of the experiential values. This normatively irrational decision strategy held after accounting for alternative explanations, and persisted even when it bore an economic cost. Overall, our results demonstrate that experiential and symbolic values are not symmetrically considered in hybrid decisions, suggesting they recruit different representational systems that may be assigned different priority levels in the decision process. These findings challenge the dominant models commonly used in value-based decision-making research.
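The systematic neglect of experiential values described here can be expressed as a choice model in which the experiential option's value enters the decision with a weight below one. The sketch below is a hypothetical illustration of that idea, not the paper's fitted model; `w` and `beta` are made-up free parameters.

```python
import math

def p_choose_experiential(v_exp, v_sym, w=0.5, beta=3.0):
    """Logistic choice between an experiential and a symbolic option in a
    hybrid trial. w < 1 implements underweighting of the experiential
    value; w = 1 recovers the standard unified-value model."""
    return 1.0 / (1.0 + math.exp(-beta * (w * v_exp - v_sym)))

# With equal true values, underweighting (w = 0.5) biases choice toward
# the symbolic option:
print(p_choose_experiential(0.8, 0.8, w=1.0))  # 0.5: no bias
print(p_choose_experiential(0.8, 0.8, w=0.5))  # below 0.5: experiential neglected
```

Fitting `w` per participant and testing whether it is reliably below one would be one way to quantify the asymmetry the abstract reports.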
5.
Scott DN, Frank MJ. Adaptive control of synaptic plasticity integrates micro- and macroscopic network function. Neuropsychopharmacology 2023; 48:121-144. PMID: 36038780; PMCID: PMC9700774; DOI: 10.1038/s41386-022-01374-6.
Abstract
Synaptic plasticity configures interactions between neurons and is therefore likely to be a primary driver of behavioral learning and development. How this microscopic-macroscopic interaction occurs is poorly understood, as researchers frequently examine models within particular ranges of abstraction and scale. Computational neuroscience and machine learning models offer theoretically powerful analyses of plasticity in neural networks, but results are often siloed and only coarsely linked to biology. In this review, we examine connections between these areas, asking how network computations change as a function of diverse features of plasticity and vice versa. We review how plasticity can be controlled at synapses by calcium dynamics and neuromodulatory signals, the manifestation of these changes in networks, and their impacts in specialized circuits. We conclude that metaplasticity-defined broadly as the adaptive control of plasticity-forges connections across scales by governing what groups of synapses can and can't learn about, when, and to what ends. The metaplasticity we discuss acts by co-opting Hebbian mechanisms, shifting network properties, and routing activity within and across brain systems. Asking how these operations can go awry should also be useful for understanding pathology, which we address in the context of autism, schizophrenia and Parkinson's disease.
Affiliation(s)
- Daniel N Scott
- Cognitive Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA.
- Carney Institute for Brain Science, Brown University, Providence, RI, USA.
| | - Michael J Frank
- Cognitive Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA.
- Carney Institute for Brain Science, Brown University, Providence, RI, USA.
| |
6.
Jedlicka P, Tomko M, Robins A, Abraham WC. Contributions by metaplasticity to solving the Catastrophic Forgetting Problem. Trends Neurosci 2022; 45:656-666. PMID: 35798611; DOI: 10.1016/j.tins.2022.06.002.
Abstract
Catastrophic forgetting (CF) refers to the sudden and severe loss of prior information in learning systems when acquiring new information. CF has been an Achilles heel of standard artificial neural networks (ANNs) when learning multiple tasks sequentially. The brain, by contrast, has solved this problem during evolution. Modellers now use a variety of strategies to overcome CF, many of which have parallels to cellular and circuit functions in the brain. One common strategy, based on metaplasticity phenomena, controls the future rate of change at key connections to help retain previously learned information. However, the metaplasticity properties so far used are only a subset of those existing in neurobiology. We propose that as models become more sophisticated, there could be value in drawing on a richer set of metaplasticity rules, especially when promoting continual learning in agents moving about the environment.
Affiliation(s)
- Peter Jedlicka
- ICAR3R - Interdisciplinary Centre for 3Rs in Animal Research, Faculty of Medicine, Justus Liebig University, Giessen, Germany; Institute of Clinical Neuroanatomy, Neuroscience Center, Goethe University Frankfurt, Frankfurt/Main, Germany; Frankfurt Institute for Advanced Studies, Frankfurt 60438, Germany.
| | - Matus Tomko
- ICAR3R - Interdisciplinary Centre for 3Rs in Animal Research, Faculty of Medicine, Justus Liebig University, Giessen, Germany; Institute of Molecular Physiology and Genetics, Centre of Biosciences, Slovak Academy of Sciences, Bratislava, Slovakia
| | - Anthony Robins
- Department of Computer Science, University of Otago, Dunedin 9016, New Zealand
| | - Wickliffe C Abraham
- Department of Psychology, Brain Health Research Centre, University of Otago, Dunedin 9054, New Zealand.
| |
7.
Klein-Flügge MC, Bongioanni A, Rushworth MFS. Medial and orbital frontal cortex in decision-making and flexible behavior. Neuron 2022; 110:2743-2770. PMID: 35705077; DOI: 10.1016/j.neuron.2022.05.022.
Abstract
The medial frontal cortex and adjacent orbitofrontal cortex have been the focus of investigations of decision-making, behavioral flexibility, and social behavior. We review studies conducted in humans, macaques, and rodents and argue that several regions with different functional roles can be identified in the dorsal anterior cingulate cortex, perigenual anterior cingulate cortex, anterior medial frontal cortex, ventromedial prefrontal cortex, and medial and lateral parts of the orbitofrontal cortex. There is increasing evidence that the manner in which these areas represent the value of the environment and specific choices is different from subcortical brain regions and more complex than previously thought. Although activity in some regions reflects distributions of reward and opportunities across the environment, in other cases, activity reflects the structural relationships between features of the environment that animals can use to infer what decision to take even if they have not encountered identical opportunities in the past.
Affiliation(s)
- Miriam C Klein-Flügge
- Wellcome Centre for Integrative Neuroimaging (WIN), Department of Experimental Psychology, University of Oxford, Tinsley Building, Mansfield Road, Oxford OX1 3TA, UK; Wellcome Centre for Integrative Neuroimaging (WIN), Centre for Functional MRI of the Brain (FMRIB), University of Oxford, Nuffield Department of Clinical Neurosciences, Level 6, West Wing, John Radcliffe Hospital, Oxford OX3 9DU, UK; Department of Psychiatry, University of Oxford, Warneford Lane, Headington, Oxford OX3 7JX, UK
- Alessandro Bongioanni
- Wellcome Centre for Integrative Neuroimaging (WIN), Department of Experimental Psychology, University of Oxford, Tinsley Building, Mansfield Road, Oxford OX1 3TA, UK
- Matthew F S Rushworth
- Wellcome Centre for Integrative Neuroimaging (WIN), Department of Experimental Psychology, University of Oxford, Tinsley Building, Mansfield Road, Oxford OX1 3TA, UK; Wellcome Centre for Integrative Neuroimaging (WIN), Centre for Functional MRI of the Brain (FMRIB), University of Oxford, Nuffield Department of Clinical Neurosciences, Level 6, West Wing, John Radcliffe Hospital, Oxford OX3 9DU, UK
8.
Palminteri S, Lebreton M. The computational roots of positivity and confirmation biases in reinforcement learning. Trends Cogn Sci 2022; 26:607-621. PMID: 35662490; DOI: 10.1016/j.tics.2022.04.005.
Abstract
Humans do not integrate new information objectively: outcomes carrying a positive affective value and evidence confirming one's own prior belief are overweighed. Until recently, theoretical and empirical accounts of the positivity and confirmation biases assumed them to be specific to 'high-level' belief updates. We present evidence against this account. Learning rates in reinforcement learning (RL) tasks, estimated across different contexts and species, generally present the same characteristic asymmetry, suggesting that belief and value updating processes share key computational principles and distortions. This bias generates over-optimistic expectations about the probability of making the right choices and, consequently, generates over-optimistic reward expectations. We discuss the normative and neurobiological roots of these RL biases and their position within the greater picture of behavioral decision-making theories.
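The characteristic learning-rate asymmetry described here can be written as a Rescorla-Wagner update with separate rates for positive and negative prediction errors. A minimal sketch (the rates are illustrative, not values fitted in any study):

```python
def update_value(q, outcome, alpha_plus=0.4, alpha_minus=0.1):
    """One asymmetric delta-rule update: positive prediction errors are
    weighted more heavily than negative ones (alpha_plus > alpha_minus)."""
    delta = outcome - q                     # prediction error
    alpha = alpha_plus if delta > 0 else alpha_minus
    return q + alpha * delta

# Strictly alternating rewards (1) and omissions (0), i.e. a true 50%
# reward rate, drive the value estimate above 0.5 when updates are
# asymmetric: the over-optimistic expectation the review describes.
q = 0.5
for outcome in [1, 0] * 50:
    q = update_value(q, outcome)
print(q)
```

Setting `alpha_plus == alpha_minus` removes the bias and the estimate settles near the true reward rate.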
Affiliation(s)
- Stefano Palminteri
- Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et Recherche Médicale, Paris, France; Département d'Études Cognitives, Ecole Normale Supérieure, Paris, France; Université de Recherche Paris Sciences et Lettres, Paris, France
- Maël Lebreton
- Paris School of Economics, Paris, France; LabNIC, Department of Fundamental Neurosciences, University of Geneva, Geneva, Switzerland; Swiss Center for Affective Science, Geneva, Switzerland
9.
Rational arbitration between statistics and rules in human sequence processing. Nat Hum Behav 2022; 6:1087-1103. PMID: 35501360; DOI: 10.1038/s41562-021-01259-6.
Abstract
Detecting and learning temporal regularities is essential to accurately predict the future. A long-standing debate in cognitive science concerns the existence in humans of a dissociation between two systems, one for handling statistical regularities governing the probabilities of individual items and their transitions, and another for handling deterministic rules. Here, to address this issue, we used finger tracking to continuously monitor the online build-up of evidence, confidence, false alarms and changes-of-mind during sequence processing. All these aspects of behaviour conformed tightly to a hierarchical Bayesian inference model with distinct hypothesis spaces for statistics and rules, yet linked by a single probabilistic currency. Alternative models based either on a single statistical mechanism or on two non-commensurable systems were rejected. Our results indicate that a hierarchical Bayesian inference mechanism, capable of operating over distinct hypothesis spaces for statistics and rules, underlies the human capability for sequence processing.
10.
Monosov IE, Rushworth MFS. Interactions between ventrolateral prefrontal and anterior cingulate cortex during learning and behavioural change. Neuropsychopharmacology 2022; 47:196-210. PMID: 34234288; PMCID: PMC8617208; DOI: 10.1038/s41386-021-01079-2.
Abstract
Hypotheses and beliefs guide credit assignment - the process of determining which previous events or actions caused an outcome. Adaptive hypothesis formation and testing are crucial in uncertain and changing environments in which associations and meanings are volatile. Despite primates' abilities to form and test hypotheses, establishing what is causally responsible for the occurrence of particular outcomes remains a fundamental challenge for credit assignment and learning. Hypotheses about what surprises are due to stochasticity inherent in an environment as opposed to real, systematic changes are necessary for identifying the environment's predictive features, but are often hard to test. We review evidence that two highly interconnected frontal cortical regions, anterior cingulate cortex and ventrolateral prefrontal area 47/12o, provide a biological substrate for linking two crucial components of hypothesis-formation and testing: the control of information seeking and credit assignment. Neuroimaging, targeted disruptions, and neurophysiological studies link an anterior cingulate - 47/12o circuit to generation of exploratory behaviour, non-instrumental information seeking, and interpretation of subsequent feedback in the service of credit assignment. Our observations support the idea that information seeking and credit assignment are linked at the level of neural circuits and explain why this circuit is important for ensuring behaviour is flexible and adaptive.
Affiliation(s)
- Ilya E Monosov
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO, USA
- Department of Biomedical Engineering, Washington University, St. Louis, MO, USA
- Department of Electrical Engineering, Washington University, St. Louis, MO, USA
- Department of Neurosurgery, Washington University, St. Louis, MO, USA
- Pain Center, Washington University, St. Louis, MO, USA
- Matthew F S Rushworth
- Wellcome Centre for Integrative Neuroimaging (WIN), Department of Experimental Psychology, University of Oxford, Oxford, UK
11.
Rudebeck PH, Izquierdo A. Foraging with the frontal cortex: A cross-species evaluation of reward-guided behavior. Neuropsychopharmacology 2022; 47:134-146. PMID: 34408279; PMCID: PMC8617092; DOI: 10.1038/s41386-021-01140-0.
Abstract
Efficient foraging is essential to survival and depends on frontal cortex in mammals. Because of its role in psychiatric disorders, frontal cortex and its contributions to reward procurement have been studied extensively in both rodents and non-human primates. How frontal cortex of these animal models compares is a source of intense debate. Here we argue that translating findings from rodents to non-human primates requires an appreciation of both the niche in which each animal forages as well as the similarities in frontal cortex anatomy and function. Consequently, we highlight similarities and differences in behavior and anatomy, before focusing on points of convergence in how parts of frontal cortex contribute to distinct aspects of foraging in rats and macaques more specifically. In doing so, our aim is to emphasize where translation of frontal cortex function between species is clearer, where there is divergence, and where future work should focus. We finish by highlighting aspects of foraging that have received less attention but that we believe are critical to uncovering how frontal cortex promotes survival in each species.
Affiliation(s)
- Alicia Izquierdo
- Department of Psychology, UCLA, Los Angeles, CA, USA
- The Brain Research Institute, UCLA, Los Angeles, CA, USA
- Integrative Center for Learning and Memory, UCLA, Los Angeles, CA, USA
- Integrative Center for Addictions, UCLA, Los Angeles, CA, USA
12.
Soltani A, Koechlin E. Computational models of adaptive behavior and prefrontal cortex. Neuropsychopharmacology 2022; 47:58-71. PMID: 34389808; PMCID: PMC8617006; DOI: 10.1038/s41386-021-01123-1.
Abstract
The real world is uncertain, and while ever changing, it constantly presents itself in terms of new sets of behavioral options. To attain the flexibility required to tackle these challenges successfully, most mammalian brains are equipped with certain computational abilities that rely on the prefrontal cortex (PFC). By examining learning in terms of internal models associating stimuli, actions, and outcomes, we argue here that adaptive behavior relies on specific interactions between multiple systems including: (1) selective models learning stimulus-action associations through rewards; (2) predictive models learning stimulus- and/or action-outcome associations through statistical inferences anticipating behavioral outcomes; and (3) contextual models learning external cues associated with latent states of the environment. Critically, the PFC combines these internal models by forming task sets to drive behavior and, moreover, constantly evaluates the reliability of actor task sets in predicting external contingencies to switch between task sets or create new ones. We review different models of adaptive behavior to demonstrate how their components map onto this unifying framework and specific PFC regions. Finally, we discuss how our framework may help to better understand the neural computations and the cognitive architecture of PFC regions guiding adaptive behavior.
Affiliation(s)
- Alireza Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Etienne Koechlin
- Institut National de la Santé et de la Recherche Médicale, Université Pierre et Marie Curie, École Normale Supérieure, Paris, France
13.
Farashahi S, Soltani A. Computational mechanisms of distributed value representations and mixed learning strategies. Nat Commun 2021; 12:7191. PMID: 34893597; PMCID: PMC8664930; DOI: 10.1038/s41467-021-27413-2.
Abstract
Learning appropriate representations of the reward environment is challenging in the real world, where there are many options, each with multiple attributes or features. Despite the existence of alternative solutions for this challenge, the neural mechanisms underlying the emergence and adoption of value representations and learning strategies remain unknown. To address this, we measure learning and choice during a multi-dimensional probabilistic learning task in humans and train recurrent neural networks (RNNs) to capture our experimental observations. We find that human participants estimate stimulus-outcome associations by learning and combining estimates of reward probabilities associated with the informative feature followed by those of informative conjunctions. Through analyzing representations, connectivity, and lesioning of the RNNs, we demonstrate that this mixed learning strategy relies on a distributed neural code and opponency between excitatory and inhibitory neurons through value-dependent disinhibition. Together, our results suggest computational and neural mechanisms underlying the emergence of complex learning strategies in naturalistic settings.
Affiliation(s)
- Shiva Farashahi: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA; Center for Computational Neuroscience, Flatiron Institute, Simons Foundation, New York, NY, USA.
- Alireza Soltani: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA.
14. Womelsdorf T, Watson MR, Tiesinga P. Learning at Variable Attentional Load Requires Cooperation of Working Memory, Meta-learning, and Attention-augmented Reinforcement Learning. J Cogn Neurosci 2021; 34:79-107. PMID: 34813644; PMCID: PMC9830786; DOI: 10.1162/jocn_a_01780.
Abstract
Flexible learning of changing reward contingencies can be realized with different strategies. A fast learning strategy involves using working memory of recently rewarded objects to guide choices. A slower learning strategy uses prediction errors to gradually update value expectations to improve choices. How the fast and slow strategies work together in scenarios with real-world stimulus complexity is not well known. Here, we aim to disentangle their relative contributions in rhesus monkeys while they learned the relevance of object features at variable attentional load. We found that learning behavior across six monkeys is consistently best predicted with a model combining (i) fast working memory and (ii) slower reinforcement learning from differently weighted positive and negative prediction errors as well as (iii) selective suppression of nonchosen feature values and (iv) a meta-learning mechanism that enhances exploration rates based on a memory trace of recent errors. The optimal model parameter settings suggest that these mechanisms cooperate differently at low and high attentional loads. Whereas working memory was essential for efficient learning at lower attentional loads, enhanced weighting of negative prediction errors and meta-learning were essential for efficient learning at higher attentional loads. Together, these findings pinpoint a canonical set of learning mechanisms and suggest how they may cooperate when subjects flexibly adjust to environments with variable real-world attentional demands.
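Component (ii) of the winning model, reinforcement learning with differently weighted positive and negative prediction errors, can be sketched as a single asymmetric value update. The parameter names are assumptions for illustration, not the paper's fitted values.

```python
def asymmetric_rl_update(value, reward, lr_pos=0.2, lr_neg=0.4):
    """One reinforcement-learning update in which positive and negative
    reward prediction errors are weighted by separate learning rates,
    as in component (ii) of the model described above. Enhanced weighting
    of negative errors (lr_neg > lr_pos) was key at high attentional load."""
    pe = reward - value                      # reward prediction error
    lr = lr_pos if pe >= 0 else lr_neg       # asymmetric weighting
    return value + lr * pe
```

A working-memory component (i) would sit alongside this, caching recently rewarded objects directly rather than updating values incrementally.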
Affiliation(s)
- Thilo Womelsdorf: Department of Psychology, Vanderbilt University, Nashville, TN 37240.
- Marcus R. Watson: School of Kinesiology and Health Science, Centre for Vision Research, York University, 4700 Keele Street, Toronto, Ontario M6J 1P3, Canada.
- Paul Tiesinga: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen 6525 EN, Netherlands.
15. Foucault C, Meyniel F. Gated recurrence enables simple and accurate sequence prediction in stochastic, changing, and structured environments. eLife 2021; 10:71801. PMID: 34854377; PMCID: PMC8735865; DOI: 10.7554/elife.71801.
Abstract
From decision making to perception to language, predicting what is coming next is crucial. It is also challenging in stochastic, changing, and structured environments; yet the brain makes accurate predictions in many situations. What computational architecture could enable this feat? Bayesian inference makes optimal predictions but is prohibitively difficult to compute. Here, we show that a specific recurrent neural network architecture enables simple and accurate solutions in several environments. This architecture relies on three mechanisms: gating, lateral connections, and recurrent weight training. Like the optimal solution and the human brain, such networks develop internal representations of their changing environment (including estimates of the environment’s latent variables and the precision of these estimates), leverage multiple levels of latent structure, and adapt their effective learning rate to changes without changing their connection weights. Being ubiquitous in the brain, gated recurrence could therefore serve as a generic building block to predict in real-life environments.
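The gating mechanism this abstract highlights can be illustrated with a single scalar GRU-style update: the gate decides how much of the previous state is overwritten, which is how a trained network can modulate its effective learning rate without changing its weights. The scalar weights below are illustrative, not trained values.

```python
import math

def gru_step(x, h, wz, uz, bz, wh, uh, bh):
    """One scalar step of a minimal gated recurrent unit. The update
    gate z controls how much of the previous state h is replaced by the
    candidate state, acting as a per-step effective learning rate.
    The recurrent terms (uz, uh) play the role of lateral connections."""
    z = 1.0 / (1.0 + math.exp(-(wz * x + uz * h + bz)))  # update gate in (0, 1)
    h_cand = math.tanh(wh * x + uh * h + bh)             # candidate state
    return (1.0 - z) * h + z * h_cand
```

Driving the gate shut preserves the old state (slow learning); driving it open overwrites the state with new evidence (fast learning), without any weight change.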
Affiliation(s)
- Cédric Foucault: INSERM, CEA, Université Paris-Saclay, Gif sur Yvette, France.
16. Piray P, Daw ND. A model for learning based on the joint estimation of stochasticity and volatility. Nat Commun 2021; 12:6587. PMID: 34782597; PMCID: PMC8592992; DOI: 10.1038/s41467-021-26731-9.
Abstract
Previous research has stressed the importance of uncertainty for controlling the speed of learning, and how such control depends on the learner inferring the noise properties of the environment, especially volatility: the speed of change. However, learning rates are jointly determined by the comparison between volatility and a second factor, moment-to-moment stochasticity. Yet much previous research has focused on simplified cases corresponding to estimation of either factor alone. Here, we introduce a learning model, in which both factors are learned simultaneously from experience, and use the model to simulate human and animal data across many seemingly disparate neuroscientific and behavioral phenomena. By considering the full problem of joint estimation, we highlight a set of previously unappreciated issues, arising from the mutual interdependence of inference about volatility and stochasticity. This interdependence complicates and enriches the interpretation of previous results, such as pathological learning in individuals with anxiety and following amygdala damage.
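In Kalman-filter terms, the trade-off the model resolves can be sketched as follows: the learning rate rises with volatility (real change in the latent reward rate) and falls with stochasticity (moment-to-moment observation noise). The variable names are assumptions for this sketch, not the paper's notation.

```python
def kalman_learning_rate(w, v, s):
    """Learning rate (Kalman gain) for tracking a reward rate that
    diffuses with volatility v under observation noise (stochasticity) s;
    w is the current estimation uncertainty. Captures the core trade-off:
    volatility increases the gain, stochasticity decreases it."""
    total = w + v                  # predictive uncertainty after a diffusion step
    k = total / (total + s)        # Kalman gain = effective learning rate
    w_new = (1.0 - k) * total      # posterior uncertainty
    return k, w_new
```

The mutual interdependence the paper stresses arises because v and s themselves must be inferred from the same prediction errors, so misattributing noise to change (or vice versa) distorts the gain.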
Affiliation(s)
- Payam Piray: Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA.
- Nathaniel D Daw: Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA.
17.
Abstract
We live in a world that changes on many timescales. To learn and make decisions appropriately, the human brain has evolved to integrate various types of information, such as sensory evidence and reward feedback, on multiple timescales. This is reflected in cortical hierarchies of timescales consisting of heterogeneous neuronal activities and expression of genes related to neurotransmitters critical for learning. We review the recent findings on how timescales of sensory and reward integration are affected by the temporal properties of sensory and reward signals in the environment. Despite existing evidence linking behavioral and neuronal timescales, future studies must examine how neural computations at multiple timescales are adjusted and combined to influence behavior flexibly.
Affiliation(s)
- Alireza Soltani: Department of Psychological and Brain Sciences, Dartmouth College, Moore Hall, 3 Maynard St, Hanover, NH 03755.
- John D. Murray: Department of Psychiatry, Yale School of Medicine, 300 George Street, New Haven, CT 06511.
- Hyojung Seo: Department of Psychiatry, Yale School of Medicine, 300 George Street, New Haven, CT 06511.
- Daeyeol Lee: The Zanvyl Krieger Mind/Brain Institute, Department of Neuroscience, Department of Psychological Sciences, Kavli Neuroscience Discovery Institute, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218.
18. Ghambaryan A, Gutkin B, Klucharev V, Koechlin E. Additively Combining Utilities and Beliefs: Research Gaps and Algorithmic Developments. Front Neurosci 2021; 15:704728. PMID: 34658760; PMCID: PMC8517513; DOI: 10.3389/fnins.2021.704728.
Abstract
Value-based decision making in complex environments, such as those with uncertain and volatile mappings of reward probabilities onto options, may engender computational strategies that are not necessarily optimal in terms of normative frameworks but may ensure effective learning and behavioral flexibility under limited neural computational resources. In this article, we review one such suboptimal strategy: additively combining the reward magnitude and reward probability attributes of options for value-based decision making. In addition, we present the computational intricacies of a recently developed model (named the MIX model) that provides an algorithmic implementation of the additive strategy in sequential decision making with two options. We also discuss its opportunities as well as conceptual, inferential, and generalization issues. Furthermore, we suggest future studies that will reveal the potential of, and support the further development of, the MIX model as a general model of value-based choice.
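The contrast at the heart of this review, additive versus normative multiplicative combination of magnitude and probability, can be sketched in two lines. The weight `w` and the assumption that both attributes are rescaled to [0, 1] are illustrative; this is not the MIX model itself.

```python
def additive_value(magnitude, probability, w=0.5):
    """Additive (suboptimal but cheap) combination of reward magnitude
    and reward probability; w is a hypothetical attribute weight."""
    return w * magnitude + (1 - w) * probability

def expected_value(magnitude, probability):
    """Normative multiplicative combination (expected value)."""
    return magnitude * probability
```

The two rules can reverse preferences: for options (magnitude, probability) of (0.9, 0.1) versus (0.3, 0.5), expected value favors the second while the additive rule favors the first, which is the kind of systematic deviation such models are built to capture.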
Affiliation(s)
- Anush Ghambaryan: Centre for Cognition and Decision Making, HSE University, Moscow, Russia; Ecole Normale Supérieure, PSL Research University, Paris, France.
- Boris Gutkin: Centre for Cognition and Decision Making, HSE University, Moscow, Russia; Ecole Normale Supérieure, PSL Research University, Paris, France.
- Vasily Klucharev: Centre for Cognition and Decision Making, HSE University, Moscow, Russia.
- Etienne Koechlin: Ecole Normale Supérieure, PSL Research University, Paris, France.
19.
20. Findling C, Wyart V. Computation noise in human learning and decision-making: origin, impact, function. Curr Opin Behav Sci 2021. DOI: 10.1016/j.cobeha.2021.02.018.
21. Inglis JB, Valentin VV, Ashby FG. Modulation of Dopamine for Adaptive Learning: A Neurocomputational Model. Computational Brain & Behavior 2021; 4:34-52. PMID: 34151186; PMCID: PMC8210637; DOI: 10.1007/s42113-020-00083-x.
Abstract
There have been many proposals that learning rates in the brain are adaptive, in the sense that they increase or decrease depending on environmental conditions. The majority of these models are abstract and make no attempt to describe the neural circuitry that implements the proposed computations. This article describes a biologically detailed computational model that overcomes this shortcoming. Specifically, we propose a neural circuit that implements adaptive learning rates by modulating the gain on the dopamine response to reward prediction errors, and we model activity within this circuit at the level of spiking neurons. The model generates a dopamine signal that depends on the size of the tonically active dopamine neuron population and the phasic spike rate. The model was tested successfully against results from two single-neuron recording studies and a fast-scan cyclic voltammetry study. We conclude by discussing the general applicability of the model to dopamine-mediated tasks that transcend the experimental phenomena it was initially designed to address.
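The core idea, adapting the learning rate by scaling the gain on the dopamine prediction-error response, reduces to a simple identity at the algorithmic level: effective learning rate = base rate x gain. The sketch below is a drastic simplification of the spiking circuit described above, with hypothetical parameter names.

```python
def dopamine_gain_update(value, reward, gain, base_lr=0.1):
    """One value update where learning speed is controlled by scaling
    the gain on the phasic dopamine response to the prediction error,
    rather than by changing the learning rule itself. `gain` and
    `base_lr` are hypothetical parameters for this sketch."""
    pe = reward - value       # reward prediction error
    da = gain * pe            # gain-modulated dopamine signal
    return value + base_lr * da
```

Doubling the gain doubles the effective learning rate, so upstream circuits that set the gain (e.g., via the tonically active population size in the full model) can speed or slow learning as conditions demand.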
Affiliation(s)
- Jeffrey B Inglis: Interdepartmental Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara.
- Vivian V Valentin: Department of Psychological & Brain Sciences, University of California, Santa Barbara.
- F Gregory Ashby: Department of Psychological & Brain Sciences, University of California, Santa Barbara.
22. Martínez Oportus XP. Efecto de la respiración consciente en la tarea de atención en adultos. Revista Scientific 2021. DOI: 10.29394/scientific.issn.2542-2987.2021.6.19.20.383-401.
Abstract
This essay reviews the literature on attention tasks in adults in light of the impact of conscious breathing. Breathing techniques from different styles of meditation have become relevant for evaluating the teaching-learning process in children, mainly for certain higher cognitive functions such as inhibitory control. In adults, there is only diffuse, unsystematized information on how these practices might affect the teaching-learning process, considering that adults show suppression of neurogenesis and neuroprotection, leading to pathological alterations in mood, attention, memory, and learning, as described by Innes and Selfe (2014). The evidence indicates that it is feasible to design an intervention to improve the learning environment, based on the impact conscious breathing has on attentional processes. This impact could inform public policies or interventions by public or private institutions aimed at enhancing learning in adults and limiting their cognitive decline through the stimulation of the cognitive functions engaged by conscious breathing.
23. Gijsen S, Grundei M, Lange RT, Ostwald D, Blankenburg F. Neural surprise in somatosensory Bayesian learning. PLoS Comput Biol 2021; 17:e1008068. PMID: 33529181; PMCID: PMC7880500; DOI: 10.1371/journal.pcbi.1008068.
Abstract
Tracking statistical regularities of the environment is important for shaping human behavior and perception. Evidence suggests that the brain learns environmental dependencies using Bayesian principles. However, much remains unknown about the employed algorithms, for somesthesis in particular. Here, we describe the cortical dynamics of the somatosensory learning system to investigate both the form of the generative model as well as its neural surprise signatures. Specifically, we recorded EEG data from 40 participants subjected to a somatosensory roving-stimulus paradigm and performed single-trial modeling across peri-stimulus time in both sensor and source space. Our Bayesian model selection procedure indicates that evoked potentials are best described by a non-hierarchical learning model that tracks transitions between observations using leaky integration. From around 70 ms post-stimulus onset, secondary somatosensory cortices are found to represent confidence-corrected surprise as a measure of model inadequacy. Indications of Bayesian surprise encoding, reflecting model updating, are found in primary somatosensory cortex from around 140 ms. This dissociation is compatible with the idea that early surprise signals may control subsequent model update rates. In sum, our findings support the hypothesis that early somatosensory processing reflects Bayesian perceptual learning and contribute to an understanding of its underlying mechanisms.
Our environment features statistical regularities, such as a drop of rain predicting imminent rainfall. Despite their importance for behavior and survival, much remains unknown about how these dependencies are learned, particularly for somatosensation. As surprise signalling about novel observations indicates a mismatch between one's beliefs and the world, it has been hypothesized that surprise computation plays an important role in perceptual learning. By analyzing EEG data from human participants receiving sequences of tactile stimulation, we compare different formulations of surprise and investigate the underlying learning model. Our results indicate that the brain estimates transitions between observations. Furthermore, we identified different signatures of surprise computation and thereby provide a dissociation of the neural correlates of belief inadequacy and belief updating. Specifically, early surprise responses from around 70 ms were found to signal the need for changes to the model, with encoding of its subsequent updating occurring from around 140 ms. These results provide insights into how somatosensory surprise signals may contribute to the learning of environmental statistics.
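A minimal version of the winning model class, leaky integration of transition counts with predictive surprise read out as -log p, might look like the sketch below. The parameter names (`leak`, `prior`) are assumptions for illustration, not the paper's fitted values.

```python
import math

def leaky_transition_surprise(seq, leak=0.9, prior=1.0):
    """Predictive surprise (-log p) for a binary stimulus sequence under
    leaky integration of transition counts: older evidence decays by
    `leak` each trial, so the estimate tracks changing statistics."""
    counts = {(a, b): prior for a in (0, 1) for b in (0, 1)}  # counts[(prev, next)]
    surprises = []
    prev = seq[0]
    for obs in seq[1:]:
        p = counts[(prev, obs)] / (counts[(prev, 0)] + counts[(prev, 1)])
        surprises.append(-math.log(p))
        for key in counts:              # leaky decay of all counts
            counts[key] *= leak
        counts[(prev, obs)] += 1.0      # add the new observation
        prev = obs
    return surprises
```

On a run of repeated stimuli followed by a deviant, surprise falls across the repeats and spikes at the deviant, the signature a roving-stimulus paradigm is designed to elicit.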
Affiliation(s)
- Sam Gijsen: Neurocomputation and Neuroimaging Unit, Freie Universität Berlin, Germany; Berlin School of Mind and Brain, Faculty of Philosophy, Humboldt-Universität zu Berlin, Berlin, Germany.
- Miro Grundei: Neurocomputation and Neuroimaging Unit, Freie Universität Berlin, Germany; Berlin School of Mind and Brain, Faculty of Philosophy, Humboldt-Universität zu Berlin, Berlin, Germany.
- Robert T. Lange: Berlin Institute of Technology, Berlin, Germany; Einstein Center for Neurosciences, Berlin, Germany.
- Dirk Ostwald: Computational Cognitive Neuroscience, Freie Universität Berlin, Germany.
- Felix Blankenburg: Neurocomputation and Neuroimaging Unit, Freie Universität Berlin, Germany.
24. Levy I, Schiller D. Neural Computations of Threat. Trends Cogn Sci 2021; 25:151-171. PMID: 33384214; PMCID: PMC8084636; DOI: 10.1016/j.tics.2020.11.007.
Abstract
A host of learning, memory, and decision-making processes form the individual's response to threat and may be disrupted in anxiety and post-trauma psychopathology. Here we review the neural computations of threat, from the first encounter with a dangerous situation, through learning, storing, and updating cues that predict it, to making decisions about the optimal course of action. The overview highlights the interconnected nature of these processes and their reliance on shared neural and computational mechanisms. We propose an integrative approach to the study of threat-related processes, in which specific computations are studied across the various stages of threat experience rather than in isolation. This approach can generate new insights about the evolution, diagnosis, and treatment of threat-related psychopathology.
Affiliation(s)
- Ifat Levy: Departments of Comparative Medicine, Neuroscience, and Psychology, Yale University, New Haven, CT, USA.
- Daniela Schiller: Department of Psychiatry, Department of Neuroscience, and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
25. Monosov IE, Haber SN, Leuthardt EC, Jezzini A. Anterior Cingulate Cortex and the Control of Dynamic Behavior in Primates. Curr Biol 2020; 30:R1442-R1454. PMID: 33290716; PMCID: PMC8197026; DOI: 10.1016/j.cub.2020.10.009.
Abstract
The brain mechanism for controlling continuous behavior in dynamic contexts must mediate action selection and learning across many timescales, responding differentially to the level of environmental uncertainty and volatility. In this review, we argue that a part of the frontal cortex known as the anterior cingulate cortex (ACC) is particularly well suited for this function. First, the ACC is interconnected with prefrontal, parietal, and subcortical regions involved in valuation and action selection. Second, the ACC integrates diverse, behaviorally relevant information across multiple timescales, producing output signals that temporally encapsulate decision and learning processes and encode high-dimensional information about the value and uncertainty of future outcomes and subsequent behaviors. Third, the ACC signals behaviorally relevant information flexibly, displaying the capacity to represent information about current and future states in a valence-, context-, task- and action-specific manner. Fourth, the ACC dynamically controls instrumental- and non-instrumental information seeking behaviors to resolve uncertainty about future outcomes. We review electrophysiological and circuit disruption studies in primates to develop this point, discuss its relationship to novel therapeutics for neuropsychiatric disorders in humans, and conclude by relating ongoing research in primates to studies of medial frontal cortical regions in rodents.
Affiliation(s)
- Ilya E Monosov: Department of Neuroscience, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Biomedical Engineering, Washington University, St. Louis, MO 63130, USA; Department of Electrical Engineering, Washington University, St. Louis, MO 63130, USA; Department of Neurosurgery, Washington University School of Medicine, St. Louis, MO 63110, USA; Pain Center, Washington University School of Medicine, St. Louis, MO 63110, USA.
- Suzanne N Haber: Department of Pharmacology and Physiology, University of Rochester, Rochester, NY 14627, USA; Basic Neuroscience, McLean Hospital, Harvard Medical School, Belmont, MA 02478, USA.
- Eric C Leuthardt: Department of Biomedical Engineering, Washington University, St. Louis, MO 63130, USA; Department of Neurosurgery, Washington University School of Medicine, St. Louis, MO 63110, USA.
- Ahmad Jezzini: Department of Neuroscience, Washington University School of Medicine, St. Louis, MO 63110, USA.
26. Blain B, Rutledge RB. Momentary subjective well-being depends on learning and not reward. eLife 2020; 9:57977. PMID: 33200989; PMCID: PMC7755387; DOI: 10.7554/elife.57977.
Abstract
Subjective well-being or happiness is often associated with wealth. Recent studies suggest that momentary happiness is associated with reward prediction error, the difference between experienced and predicted reward, a key component of adaptive behaviour. We tested subjects in a reinforcement learning task in which reward size and probability were uncorrelated, allowing us to dissociate between the contributions of reward and learning to happiness. Using computational modelling, we found convergent evidence across stable and volatile learning tasks that happiness, like behaviour, is sensitive to learning-relevant variables (i.e. probability prediction error). Unlike behaviour, happiness is not sensitive to learning-irrelevant variables (i.e. reward prediction error). Increasing volatility reduces how many past trials influence behaviour but not happiness. Finally, depressive symptoms reduce happiness more in volatile than stable environments. Our results suggest that how we learn about our world may be more important for how we feel than the rewards we actually receive.
Many people believe they would be happier if only they had more money. And events such as winning the lottery or receiving a large pay rise do make people happy, at least temporarily. But recent studies suggest that the main factor driving happiness on such occasions is not the size of the reward received. Instead, it is how well that reward matches up with expectations. Receiving a 10% pay rise when you were expecting 1% will make you feel happier than receiving 10% when you had been expecting 20%. This difference between an expected and an actual reward is referred to as a reward prediction error. Reward prediction errors have a key role in learning. They motivate people to repeat behaviours that led to unexpectedly large rewards. But they also enable people to update their beliefs about the world, which is rewarding in itself.
Could it be that reward prediction errors are associated with happiness mainly because they help us understand the world a little better than before? To test this idea, Blain and Rutledge designed a task in which the likelihood of receiving a reward was unrelated to the size of the reward. This study design makes it possible to separate out the contributions of learning versus reward to moment-by-moment happiness. In the task, volunteers had to decide which of two cars would win a race. In the 'stable' condition, one of the cars always had an 80% chance of winning. In the 'volatile' condition, one car had an 80% chance of winning for the first 20 trials; the other car then had an 80% chance of winning for the next 20 trials. The volunteers were not told these probabilities in advance but had to work them out by playing the game. However, on every trial, the volunteers were shown the reward they would receive if they chose either of the cars and that car went on to win. The size of the rewards varied at random and was unrelated to the likelihood of a car winning. Every few trials, the volunteers were asked to indicate their current level of happiness on a scale.
The results showed that volunteers were happier after winning than after losing. On average they were also happier in the stable condition than in the volatile condition. This was especially true for volunteers with pre-existing symptoms of depression. Moreover, happiness after wins did not depend on how large the received reward was, but simply on how surprised they were to win. These results suggest that how we learn about the world around us can be more important for how we feel than the rewards we receive directly. Measuring happiness in various types of environment could help us understand factors affecting mental health. The current results suggest, for example, that uncertain environments may be especially unpleasant for people with depression. Further research is needed to understand why this might be the case. In the real world, rewards are often uncertain and infrequent, but learning may nevertheless have the potential to boost happiness.
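The recency-weighted happiness model used in this line of work can be sketched as follows. The restriction to probability prediction errors (win indicator minus predicted win probability) follows the abstract's finding; the weights, decay parameter, and function names are hypothetical.

```python
def momentary_happiness(events, w0=0.0, w_ppe=1.0, gamma=0.7):
    """Sketch of an exponentially decaying happiness model: momentary
    happiness is a baseline plus a recency-weighted sum of past
    probability prediction errors. `events` is a list of
    (won, predicted_win_probability) tuples, oldest first."""
    h = w0
    t = len(events)
    for j, (won, p_win) in enumerate(events, start=1):
        ppe = (1.0 if won else 0.0) - p_win      # probability prediction error
        h += w_ppe * (gamma ** (t - j)) * ppe    # older trials decay by gamma
    return h
```

Under this model, a surprising win (low predicted probability) lifts happiness more than an expected one, while the size of the reward does not enter at all, matching the dissociation reported above.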
Affiliation(s)
- Bastien Blain: Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom.
- Robb B Rutledge: Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom; Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom; Department of Psychology, Yale University, New Haven, United States.
27. Soltani A, Rakhshan M, Schafer RJ, Burrows BE, Moore T. Separable Influences of Reward on Visual Processing and Choice. J Cogn Neurosci 2020; 33:248-262. PMID: 33166195; DOI: 10.1162/jocn_a_01647.
Abstract
Primate vision is characterized by constant, sequential processing and selection of visual targets to fixate. Although expected reward is known to influence both processing and selection of visual targets, similarities and differences between these effects remain unclear, mainly because they have been measured in separate tasks. Using a novel paradigm, we simultaneously measured the effects of reward outcomes and expected reward on target selection and sensitivity to visual motion in monkeys. Monkeys freely chose between two visual targets and received a juice reward with varying probability for eye movements made to either of them. Targets were stationary apertures of drifting gratings, causing the end points of eye movements to these targets to be systematically biased in the direction of motion. We used this motion-induced bias as a measure of sensitivity to visual motion on each trial. We then performed different analyses to explore effects of objective and subjective reward values on choice and sensitivity to visual motion to find similarities and differences between reward effects on these two processes. Specifically, we used different reinforcement learning models to fit choice behavior and estimate subjective reward values based on the integration of reward outcomes over multiple trials. Moreover, to compare the effects of subjective reward value on choice and sensitivity to motion directly, we considered correlations between each of these variables and integrated reward outcomes on a wide range of timescales. We found that, in addition to choice, sensitivity to visual motion was also influenced by subjective reward value, although the motion was irrelevant for receiving reward. Unlike choice, however, sensitivity to visual motion was not affected by objective measures of reward value. Moreover, choice was determined by the difference in subjective reward values of the two options, whereas sensitivity to motion was influenced by the sum of values. Finally, models that best predicted visual processing and choice used sets of estimated reward values based on different types of reward integration and timescales. Together, our results demonstrate separable influences of reward on visual processing and choice, and point to the presence of multiple brain circuits for the integration of reward outcomes.
28. Monosov IE. How Outcome Uncertainty Mediates Attention, Learning, and Decision-Making. Trends Neurosci 2020; 43:795-809. PMID: 32736849; PMCID: PMC8153236; DOI: 10.1016/j.tins.2020.06.009.
Abstract
Animals and humans evolved sophisticated nervous systems that endowed them with the ability to form internal models or beliefs and make predictions about the future to survive and flourish in a world in which future outcomes are often uncertain. Crucial to this capacity is the ability to adjust behavioral and learning policies in response to the level of uncertainty. Until recently, the neuronal mechanisms that could underlie such uncertainty-guided control have been largely unknown. In this review, I discuss newly discovered neuronal circuits in primates that represent uncertainty about future rewards and propose how they guide information-seeking, attention, decision-making, and learning to help us survive in an uncertain world. Lastly, I discuss the possible relevance of these findings to learning in artificial systems.
Affiliation(s)
- Ilya E Monosov: Department of Neuroscience and Neurosurgery, Washington University School of Medicine in St. Louis, MO, USA; Department of Biomedical Engineering, Washington University School of Medicine in St. Louis, MO, USA; Washington University Pain Center, Washington University School of Medicine in St. Louis, MO, USA.
29. Farashahi S, Xu J, Wu SW, Soltani A. Learning arbitrary stimulus-reward associations for naturalistic stimuli involves transition from learning about features to learning about objects. Cognition 2020; 205:104425. PMID: 32958287; DOI: 10.1016/j.cognition.2020.104425.
Abstract
Most cognitive processes are studied using abstract or synthetic stimuli with specific features to fully control what is presented to subjects. However, recent studies have revealed enhancements of cognitive capacities (such as working memory) when processing naturalistic versus abstract stimuli. Using abstract stimuli constructed from distinct visual features (e.g., color and shape), we have recently shown that human subjects can learn multidimensional stimulus-reward associations via initially estimating reward value of individual features (feature-based learning) before gradually switching to learning about reward value of individual stimuli (object-based learning). Here, we examined whether similar strategies are adopted during learning about naturalistic stimuli that are clearly perceived as objects (instead of a combination of features) and contain both task-relevant and irrelevant features. We found that similar to learning about abstract stimuli, subjects initially adopted feature-based learning more strongly before transitioning to object-based learning. However, there were three key differences between learning about naturalistic and abstract stimuli. First, compared with abstract stimuli, the initial learning strategy was less feature-based for naturalistic stimuli. Second, subjects transitioned to object-based learning faster for naturalistic stimuli. Third, unexpectedly, subjects were more likely to adopt feature-based learning for naturalistic stimuli, both at the steady state and overall. These results suggest that despite the stronger tendency to perceive naturalistic stimuli as objects, which leads to greater likelihood of using object-based learning as the initial strategy and a faster transition to object-based learning, the influence of individual features on learning is stronger for these stimuli such that ultimately the object-based strategy is adopted less. Overall, our findings suggest that feature-based learning is a general initial strategy for learning about reward value of all types of multi-dimensional stimuli.
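The feature-based versus object-based distinction above can be made concrete with a toy contrast between the two value-update schemes. This is an illustrative sketch only: the stimuli, the 0.5 baseline value, and the learning rate are assumptions, not the paper's task or fitted parameters.

```python
def update_feature_values(fvals, stimulus, reward, lr=0.2):
    """Feature-based learning: one value per feature; a stimulus estimate
    is the average of its features' values (assumed baseline 0.5)."""
    for feature in stimulus:                    # e.g. ("red", "square")
        v = fvals.get(feature, 0.5)
        fvals[feature] = v + lr * (reward - v)  # error-correcting update per feature
    return sum(fvals[f] for f in stimulus) / len(stimulus)

def update_object_value(ovals, stimulus, reward, lr=0.2):
    """Object-based learning: one value per whole stimulus."""
    v = ovals.get(stimulus, 0.5)
    ovals[stimulus] = v + lr * (reward - v)     # error-correcting update per object
    return ovals[stimulus]

fvals, ovals = {}, {}
update_feature_values(fvals, ("red", "square"), reward=1)
update_object_value(ovals, ("red", "square"), reward=1)

# Feature-based learning generalizes: an unseen red circle already inherits
# value through the shared "red" feature; the object learner knows nothing.
unseen_feature_est = sum(fvals.get(f, 0.5) for f in ("red", "circle")) / 2
unseen_object_est = ovals.get(("red", "circle"), 0.5)
```

This generalization-versus-specificity trade-off is why feature-based learning is an efficient initial strategy before enough experience accrues to value whole objects.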
Affiliation(s)
- Shiva Farashahi
- Department of Psychological and Brain Sciences, Dartmouth College, NH 03755, United States of America; Flatiron Institute, Simons Foundation, New York, NY 10010, United States of America
- Jane Xu
- Department of Psychological and Brain Sciences, Dartmouth College, NH 03755, United States of America
- Shih-Wei Wu
- Institute of Neuroscience, National Yang-Ming University, Taipei, Taiwan; Brain Research Center, National Yang-Ming University, Taipei, Taiwan
- Alireza Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, NH 03755, United States of America.
30
Shinn M, Ehrlich DB, Lee D, Murray JD, Seo H. Confluence of Timing and Reward Biases in Perceptual Decision-Making Dynamics. J Neurosci 2020; 40:7326-7342. [PMID: 32839233 PMCID: PMC7534922 DOI: 10.1523/jneurosci.0544-20.2020]
Abstract
Although the decisions of our daily lives often occur in the context of temporal and reward structures, the impact of such regularities on decision-making strategy is poorly understood. Here, to explore how temporal and reward context modulate strategy, we trained 2 male rhesus monkeys to perform a novel perceptual decision-making task with asymmetric rewards and time-varying evidence reliability. To model the choice and response time patterns, we developed a computational framework for fitting generalized drift-diffusion models, which flexibly accommodate diverse evidence accumulation strategies. We found that a dynamic urgency signal and leaky integration, in combination with two independent forms of reward biases, best capture behavior. We also tested how temporal structure influences urgency by systematically manipulating the temporal structure of sensory evidence, and found that the time course of urgency was affected by temporal context. Overall, our approach identified key components of cognitive mechanisms for incorporating temporal and reward structure into decisions.
SIGNIFICANCE STATEMENT: In everyday life, decisions are influenced by many factors, including reward structures and stimulus timing. While reward and timing have been characterized in isolation, ecologically valid decision-making involves a multiplicity of factors acting simultaneously. This raises questions about whether the same decision-making strategy is used when these two factors are concurrently manipulated. To address these questions, we trained rhesus monkeys to perform a novel decision-making task with both reward asymmetry and temporal uncertainty. To understand their strategy and hint at its neural mechanisms, we used the new generalized drift-diffusion modeling framework to model both reward and timing mechanisms. We found that two reward mechanisms and two timing mechanisms are necessary to explain our data.
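The drift-diffusion idea underlying the work above can be illustrated with a minimal simulation. This is a hedged sketch of a standard DDM, not the authors' generalized model with urgency signals and leaky integration; the drift, bound, noise, and time-step values are arbitrary assumptions.

```python
import random

def ddm_trial(drift=0.2, bound=1.0, noise=1.0, dt=0.01, max_t=5.0, rng=random):
    """One trial: evidence x accumulates drift plus Gaussian noise until it
    crosses +bound (correct) or -bound (error). Returns (choice, rt)."""
    x, t = 0.0, 0.0
    while t < max_t:
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
        if x >= bound:
            return +1, t
        if x <= -bound:
            return -1, t
    return None, t  # no decision within max_t

rng = random.Random(0)
trials = [ddm_trial(rng=rng) for _ in range(500)]
accuracy = sum(1 for choice, _ in trials if choice == +1) / len(trials)
```

Positive drift biases accumulation toward the correct bound, so accuracy exceeds chance while response times emerge from the same process; reward and timing biases of the kind studied above can then be modeled as offsets to the starting point, the drift, or a time-varying (urgency) bound.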
Affiliation(s)
- Maxwell Shinn
- Department of Psychiatry, Yale University, New Haven, Connecticut 06511
- Interdepartmental Neuroscience Program, Yale University, New Haven, Connecticut 06520
- Daniel B Ehrlich
- Department of Psychiatry, Yale University, New Haven, Connecticut 06511
- Interdepartmental Neuroscience Program, Yale University, New Haven, Connecticut 06520
- Daeyeol Lee
- Department of Neuroscience, Yale University, New Haven, Connecticut
- Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore, Maryland 21218
- Kavli Discovery Neuroscience Institute, Johns Hopkins University, Baltimore, Maryland 21218
- Department of Psychological and Brain Sciences and Department of Neuroscience, Johns Hopkins University, Baltimore, Maryland 21218
- John D Murray
- Department of Psychiatry, Yale University, New Haven, Connecticut 06511
- Interdepartmental Neuroscience Program, Yale University, New Haven, Connecticut 06520
- Hyojung Seo
- Department of Psychiatry, Yale University, New Haven, Connecticut 06511
- Interdepartmental Neuroscience Program, Yale University, New Haven, Connecticut 06520
31
Multiple timescales of neural dynamics and integration of task-relevant signals across cortex. Proc Natl Acad Sci U S A 2020; 117:22522-22531. [PMID: 32839338 DOI: 10.1073/pnas.2005993117]
Abstract
A long-standing challenge in neuroscience has been to find a set of principles that could be used to organize the brain into distinct areas with specific functions. Recent studies have proposed the orderly progression in the time constants of neural dynamics as an organizational principle of cortical computations. However, relationships between these timescales and their dependence on response properties of individual neurons are unknown, making it impossible to determine how mechanisms underlying such a computational principle are related to other aspects of neural processing. Here, we developed a comprehensive method to simultaneously estimate multiple timescales in neuronal dynamics and integration of task-relevant signals along with selectivity to those signals. By applying our method to neural and behavioral data during a dynamic decision-making task, we found that most neurons exhibited multiple timescales in their response, which consistently increased from parietal to prefrontal and cingulate cortex. While predicting rates of behavioral adjustments, these timescales were not correlated across individual neurons in any cortical area, resulting in independent parallel hierarchies of timescales. Additionally, none of these timescales depended on selectivity to task-relevant signals. Our results not only suggest the existence of multiple canonical mechanisms for increasing timescales of neural dynamics across cortex but also point to additional mechanisms that allow decorrelation of these timescales to enable more flexibility.
32
Martínez Oportus XP. Pandemia por COVID-19 y la virtualización de las aulas: La importancia del juego [The COVID-19 pandemic and the virtualization of classrooms: the importance of play]. REVISTA SCIENTIFIC 2020. [DOI: 10.29394/scientific.issn.2542-2987.2020.5.17.20.370-383]
Abstract
At present, the pandemic affecting the entire planet has produced far-reaching changes. To put this in context, according to Arrizabalaga (1992), the word "pandemic" derives etymologically from the Greek expression pandêmon nosêma, translated as "disease of the whole people"; the pandemic has forced a range of decisions about how to approach the teaching-learning process. This essay examines the importance of play, and of creating playful learning experiences, within the new methodology imposed by the health emergency. To analyse the object of study, a systematic review of formal sources was carried out, covering the effects of screen use, the digitalization of teaching, and how to make the most of these tools in order to achieve the intended learning outcomes.
33
Global reward state affects learning and activity in raphe nucleus and anterior insula in monkeys. Nat Commun 2020; 11:3771. [PMID: 32724052 PMCID: PMC7387352 DOI: 10.1038/s41467-020-17343-w]
Abstract
People and other animals learn the values of choices by observing the contingencies between them and their outcomes. However, decisions are not guided by choice-linked reward associations alone; macaques also maintain a memory of the general, average reward rate - the global reward state - in an environment. Remarkably, global reward state affects the way that each choice outcome is valued and influences future decisions so that the impact of both choice success and failure is different in rich and poor environments. Successful choices are more likely to be repeated but this is especially the case in rich environments. Unsuccessful choices are more likely to be abandoned but this is especially likely in poor environments. Functional magnetic resonance imaging (fMRI) revealed two distinct patterns of activity, one in anterior insula and one in the dorsal raphe nucleus, that track global reward state as well as specific outcome events.
34
A simple model for learning in volatile environments. PLoS Comput Biol 2020; 16:e1007963. [PMID: 32609755 PMCID: PMC7329063 DOI: 10.1371/journal.pcbi.1007963]
Abstract
Sound principles of statistical inference dictate that uncertainty shapes learning. In this work, we revisit the question of learning in volatile environments, in which both the first- and second-order statistics of observations dynamically evolve over time. We propose a new model, the volatile Kalman filter (VKF), which is based on a tractable state-space model of uncertainty and extends the Kalman filter algorithm to volatile environments. The proposed model is algorithmically simple and encompasses the Kalman filter as a special case. Specifically, in addition to the error-correcting rule of the Kalman filter for learning observations, the VKF learns volatility according to a second error-correcting rule. These dual updates echo and contextualize classical psychological models of learning, in particular hybrid accounts of Pearce-Hall and Rescorla-Wagner. At the computational level, compared with existing models, the VKF gives up some flexibility in the generative model to enable a more faithful approximation to exact inference. When fit to empirical data, the VKF is better behaved than alternatives and better captures human choice data in two independent datasets of probabilistic learning tasks. The proposed model provides a coherent account of learning in stable or volatile environments and has implications for decision neuroscience research.
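The dual error-correcting scheme described above can be sketched in a few lines. This is an illustrative approximation in the spirit of the VKF, not its exact update equations; the learning-rate mapping alpha = v/(v+1) and the volatility step eta are assumptions.

```python
def dual_update(value, volatility, outcome, eta=0.1):
    """One trial: value is updated with a volatility-scaled learning rate,
    then volatility is updated from the size of the prediction error."""
    alpha = volatility / (volatility + 1.0)    # higher volatility -> faster learning
    pred_error = outcome - value
    value = value + alpha * pred_error         # first error-correcting rule (value)
    vol_error = pred_error ** 2 - volatility   # surprise vs. expected variability
    volatility = max(volatility + eta * vol_error, 1e-6)  # second rule (volatility)
    return value, volatility

value, vol = 0.0, 0.5
for outcome in [1, 1, 1, 1, 0, 0, 0, 0]:       # a reversal halfway through
    value, vol = dual_update(value, vol, outcome)
```

When outcomes match predictions, volatility (and hence the learning rate) decays; surprising outcomes, as at the reversal above, push it back up, which is the hybrid Rescorla-Wagner/Pearce-Hall behaviour the abstract alludes to.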
35
Bartolo R, Averbeck BB. Prefrontal Cortex Predicts State Switches during Reversal Learning. Neuron 2020; 106:1044-1054.e4. [PMID: 32315603 DOI: 10.1016/j.neuron.2020.03.024]
Abstract
Reinforcement learning allows organisms to predict future outcomes and to update their beliefs about value in the world. The dorsolateral prefrontal cortex (dlPFC) integrates information carried by reward circuits, which can be used to infer the current state of the world under uncertainty. Here, we explored the dlPFC computations related to updating current beliefs during stochastic reversal learning. We recorded the activity of populations of up to 1,000 neurons simultaneously in two male macaques while they executed a two-armed bandit reversal learning task. Behavioral analyses using a Bayesian framework showed that animals inferred reversals and switched their choice preference rapidly, rather than slowly updating choice values, consistent with state inference. Furthermore, dlPFC neural populations accurately encoded choice preference switches. These results suggest that prefrontal neurons dynamically encode decisions associated with Bayesian subjective values, highlighting the role of the PFC in representing a belief about the current state of the world.
Affiliation(s)
- Ramon Bartolo
- Laboratory of Neuropsychology, National Institute of Mental Health/National Institutes of Health, Bethesda, MD 20892-4415, USA.
- Bruno B Averbeck
- Laboratory of Neuropsychology, National Institute of Mental Health/National Institutes of Health, Bethesda, MD 20892-4415, USA
36
Soltani A, Izquierdo A. Adaptive learning under expected and unexpected uncertainty. Nat Rev Neurosci 2020; 20:635-644. [PMID: 31147631 DOI: 10.1038/s41583-019-0180-y]
Abstract
The outcome of a decision is often uncertain, and outcomes can vary over repeated decisions. Whether decision outcomes should substantially affect behaviour and learning depends on whether they are representative of a typically experienced range of outcomes or signal a change in the reward environment. Successful learning and decision-making therefore require the ability to estimate expected uncertainty (related to the variability of outcomes) and unexpected uncertainty (related to the variability of the environment). Understanding the bases and effects of these two types of uncertainty and the interactions between them - at the computational and the neural level - is crucial for understanding adaptive learning. Here, we examine computational models and experimental findings to distil computational principles and neural mechanisms for adaptive learning under uncertainty.
Affiliation(s)
- Alireza Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA.
- Alicia Izquierdo
- Department of Psychology, The Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA.
37
Cook JL, Swart JC, Froböse MI, Diaconescu AO, Geurts DEM, den Ouden HEM, Cools R. Catecholaminergic modulation of meta-learning. eLife 2019; 8:e51439. [PMID: 31850844 PMCID: PMC6974360 DOI: 10.7554/elife.51439]
Abstract
The remarkable expedience of human learning is thought to be underpinned by meta-learning, whereby slow accumulative learning processes are rapidly adjusted to the current learning environment. To date, the neurobiological implementation of meta-learning remains unclear. A burgeoning literature argues for an important role for the catecholamines dopamine and noradrenaline in meta-learning. Here, we tested the hypothesis that enhancing catecholamine function modulates the ability to optimise a meta-learning parameter (learning rate) as a function of environmental volatility. 102 participants completed a task which required learning in stable phases, where the probability of reinforcement was constant, and volatile phases, where probabilities changed every 10-30 trials. The catecholamine transporter blocker methylphenidate enhanced participants' ability to adapt their learning rate: under methylphenidate, compared with placebo, participants exhibited higher learning rates in volatile relative to stable phases. Furthermore, this effect was significant only for direct learning based on the participants' own experience; there was no significant effect on inferred-value learning, where stimulus values had to be inferred. These data demonstrate a causal link between catecholaminergic modulation and the adjustment of the meta-learning parameter learning rate.
Affiliation(s)
- Jennifer L Cook
- School of Psychology, University of Birmingham, Birmingham, United Kingdom
- Jennifer C Swart
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Monja I Froböse
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Andreea O Diaconescu
- Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland
- Department of Psychiatry, University of Basel, Basel, Switzerland
- Krembil Centre for Neuroinformatics, CAMH, University of Toronto, Toronto, Canada
- Dirk EM Geurts
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Department of Psychiatry, Radboud University Medical Centre, Nijmegen, Netherlands
- Hanneke EM den Ouden
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Roshan Cools
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Department of Psychiatry, Radboud University Medical Centre, Nijmegen, Netherlands
38
Rakhshan M, Lee V, Chu E, Harris L, Laiks L, Khorsand P, Soltani A. Influence of Expected Reward on Temporal Order Judgment. J Cogn Neurosci 2019; 32:674-690. [PMID: 31851591 DOI: 10.1162/jocn_a_01516]
Abstract
Perceptual decision-making has been shown to be influenced by reward expected from alternative options or actions, but the underlying neural mechanisms are currently unknown. More specifically, it is debated whether reward effects are mediated through changes in sensory processing, later stages of decision-making, or both. To address this question, we conducted two experiments in which human participants made saccades to what they perceived to be either the first or second of two visually identical but asynchronously presented targets while we manipulated expected reward from correct and incorrect responses on each trial. By comparing reward-induced bias in target selection (i.e., reward bias) during the two experiments, we determined whether reward caused changes in sensory or decision-making processes. We found similar reward biases in the two experiments indicating that reward information mainly influenced later stages of decision-making. Moreover, the observed reward biases were independent of the individual's sensitivity to sensory signals. This suggests that reward effects were determined heuristically via modulation of decision-making processes instead of sensory processing. To further explain our findings and uncover plausible neural mechanisms, we simulated our experiments with a cortical network model and tested alternative mechanisms for how reward could exert its influence. We found that our experimental observations are more compatible with reward-dependent input to the output layer of the decision circuit. Together, our results suggest that, during a temporal judgment task, reward exerts its influence via changing later stages of decision-making (i.e., response bias) rather than early sensory processing (i.e., perceptual bias).
39
Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nat Neurosci 2019; 22:2066-2077. [PMID: 31659343 DOI: 10.1038/s41593-019-0518-9]
Abstract
When learning the value of actions in volatile environments, humans often make seemingly irrational decisions that fail to maximize expected value. We reasoned that these 'non-greedy' decisions, instead of reflecting information seeking during choice, may be caused by computational noise in the learning of action values. Here using reinforcement learning models of behavior and multimodal neurophysiological data, we show that the majority of non-greedy decisions stem from this learning noise. The trial-to-trial variability of sequential learning steps and their impact on behavior could be predicted both by blood oxygen level-dependent responses to obtained rewards in the dorsal anterior cingulate cortex and by phasic pupillary dilation, suggestive of neuromodulatory fluctuations driven by the locus coeruleus-norepinephrine system. Together, these findings indicate that most behavioral variability, rather than reflecting human exploration, is due to the limited computational precision of reward-guided learning.
40
Stolyarova A, Rakhshan M, Hart EE, O'Dell TJ, Peters MAK, Lau H, Soltani A, Izquierdo A. Contributions of anterior cingulate cortex and basolateral amygdala to decision confidence and learning under uncertainty. Nat Commun 2019; 10:4704. [PMID: 31624264 PMCID: PMC6797780 DOI: 10.1038/s41467-019-12725-1]
Abstract
The subjective sense of certainty, or confidence, in ambiguous sensory cues can alter the interpretation of reward feedback and facilitate learning. We trained rats to report the orientation of ambiguous visual stimuli according to a spatial stimulus-response rule that must be learned. Following choice, rats could wait a self-timed delay for reward or initiate a new trial. Waiting times increase with discrimination accuracy, demonstrating that this measure can be used as a proxy for confidence. Chemogenetic silencing of BLA shortens waiting times overall whereas ACC inhibition renders waiting times insensitive to confidence-modulating attributes of visual stimuli, suggesting contribution of ACC but not BLA to confidence computations. Subsequent reversal learning is enhanced by confidence. Both ACC and BLA inhibition block this enhancement but via differential adjustments in learning strategies and consistent use of learned rules. Altogether, we demonstrate dissociable roles for ACC and BLA in transmitting confidence and learning under uncertainty.
Affiliation(s)
- A Stolyarova
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- M Rakhshan
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, 03755, USA
- E E Hart
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- T J O'Dell
- Department of Physiology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- The Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- M A K Peters
- Department of Bioengineering, University of California, Riverside, Riverside, CA, 92521, USA
- Department of Psychology, University of California, Riverside, Riverside, CA, 92521, USA
- Interdepartmental Graduate Program in Neuroscience, University of California, Riverside, Riverside, CA, 92521, USA
- H Lau
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- The Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Psychology, The University of Hong Kong, Pok Fu Lam, Hong Kong
- State Key Laboratory for Brain and Cognitive Sciences, The University of Hong Kong, Pok Fu Lam, Hong Kong
- A Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, 03755, USA.
- A Izquierdo
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- The Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
41
42
Siniscalchi MJ, Wang H, Kwan AC. Enhanced Population Coding for Rewarded Choices in the Medial Frontal Cortex of the Mouse. Cereb Cortex 2019; 29:4090-4106. [PMID: 30615132 PMCID: PMC6735259 DOI: 10.1093/cercor/bhy292]
Abstract
Instrumental behavior is characterized by the selection of actions based on the degree to which they lead to a desired outcome. However, we lack a detailed understanding of how rewarded actions are reinforced and preferentially implemented. In rodents, the medial frontal cortex is hypothesized to play an important role in this process, based in part on its capacity to encode chosen actions and their outcomes. We therefore asked how neural representations of choice and outcome might interact to facilitate instrumental behavior. To investigate this question, we imaged neural ensemble activity in layer 2/3 of the secondary motor region (M2) while mice engaged in a two-choice auditory discrimination task with probabilistic outcomes. Correct choices could result in one of three reward amounts (single, double or omitted reward), which allowed us to measure neural and behavioral effects of reward magnitude, as well as its categorical presence or absence. Single-unit and population decoding analyses revealed a consistent influence of outcome on choice signals in M2. Specifically, rewarded choices were more robustly encoded relative to unrewarded choices, with little dependence on the exact magnitude of reinforcement. Our results provide insight into the integration of past choices and outcomes in the rodent brain during instrumental behavior.
Affiliation(s)
- Michael J Siniscalchi
- Interdepartmental Neuroscience Program, Yale University School of Medicine, New Haven, CT, USA
- Hongli Wang
- Interdepartmental Neuroscience Program, Yale University School of Medicine, New Haven, CT, USA
- Alex C Kwan
- Interdepartmental Neuroscience Program, Yale University School of Medicine, New Haven, CT, USA
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Department of Neuroscience, Yale University School of Medicine, New Haven, CT, USA
43
Flexible combination of reward information across primates. Nat Hum Behav 2019; 3:1215-1224. [PMID: 31501543 PMCID: PMC6856432 DOI: 10.1038/s41562-019-0714-3]
Abstract
A fundamental but rarely contested assumption in economics and neuroeconomics is that decision-makers compute subjective values of risky options by multiplying functions of reward probability and magnitude. In contrast, an additive strategy for valuation allows flexible combination of reward information required in uncertain or changing environments. We hypothesized that the level of uncertainty in the reward environment should determine the strategy used for valuation and choice. To test this hypothesis, we examined choice between risky options in humans and monkeys across three tasks with different levels of uncertainty. We found that whereas humans and monkeys adopted a multiplicative strategy under risk when probabilities are known, both species spontaneously adopted an additive strategy under uncertainty when probabilities must be learned. Additionally, the level of volatility influenced relative weighting of certain and uncertain reward information and this was reflected in the encoding of reward magnitude by neurons in the dorsolateral prefrontal cortex.
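The multiplicative and additive valuation strategies contrasted above can be written down concretely. This is an illustrative sketch: the utility exponent alpha and the weight w_prob are assumed values, not the paper's fitted parameters, and magnitudes are normalized to [0, 1] so the additive combination is well scaled.

```python
def multiplicative_value(prob, magnitude, alpha=0.8):
    """Subjective value = probability x power-law utility of magnitude."""
    return prob * magnitude ** alpha

def additive_value(prob, magnitude, w_prob=0.6):
    """Subjective value = weighted sum of probability and magnitude."""
    return w_prob * prob + (1.0 - w_prob) * magnitude

# Two options as (reward probability, normalized magnitude)
safe, risky = (0.9, 0.2), (0.3, 1.0)
mult_prefers_risky = multiplicative_value(*risky) > multiplicative_value(*safe)
add_prefers_risky = additive_value(*risky) > additive_value(*safe)
```

With these assumed parameters the two strategies disagree on the same pair of options (the multiplicative rule favours the risky gamble, the additive rule the safe one), which is why the choice pattern is diagnostic of how reward information is combined.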
44
Spitmaan M, Chu E, Soltani A. Salience-Driven Value Construction for Adaptive Choice under Risk. J Neurosci 2019; 39:5195-5209. [PMID: 31023835 PMCID: PMC6595946 DOI: 10.1523/jneurosci.2522-18.2019]
Abstract
Decisions we face in real life are inherently risky and can result in one of many possible outcomes. However, most of what we know about choice under risk is based on studies that use options with only two possible outcomes (simple gambles), so it remains unclear how the brain constructs reward values for more complex risky options faced in real life. To address this question, we combined experimental and modeling approaches to examine choice between pairs of simple gambles and pairs of three-outcome gambles in male and female human subjects. We found that subjects evaluated individual outcomes of three-outcome gambles by multiplying functions of reward magnitude and probability. To construct the overall value of each gamble, however, most subjects differentially weighted possible outcomes based on either reward magnitude or probability. These results reveal a novel dissociation between how reward information is processed when evaluating complex gambles: valuation of each outcome is based on a combination of reward information whereas weighting of possible outcomes mainly relies on a single piece of reward information. We show that differential weighting of possible outcomes could enable subjects to make decisions more easily and quickly. Together, our findings reveal a plausible mechanism for how salience, in terms of possible reward magnitude or probability, can influence the construction of subjective values for complex gambles. They also point to separable neural mechanisms for how reward value controls choice and attention to allow for more adaptive decision making under risk.SIGNIFICANCE STATEMENT Real-life decisions are inherently risky and can result in one of many possible outcomes, but how does the brain integrate information from all these outcomes to make decisions? To address this question, we examined choice between pairs of gambles with multiple outcomes using various computational models. 
We found that subjects evaluated individual outcomes by multiplying functions of reward magnitude and probability. To construct the overall value of each gamble, however, they differentially weighted possible outcomes based on either reward magnitude or probability. By doing so, they were able to make decisions more easily and quickly. Our findings illustrate how salience, in terms of possible reward magnitude or probability, can influence the construction of subjective values for more adaptive choice.
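The valuation scheme this abstract describes can be sketched in a few lines: each outcome's value is a utility function of magnitude multiplied by a weighting function of probability, and the gamble's overall value weights outcomes by a single salient attribute. The power utility, the one-parameter probability-weighting function, and all parameter values below are illustrative assumptions, not the paper's fitted forms.

```python
def outcome_value(magnitude, probability, rho=0.8, gamma=0.7):
    """Value of one outcome: utility(magnitude) * weight(probability).
    Power utility and a one-parameter weighting curve are stand-ins for
    whatever functional forms were actually fit."""
    utility = magnitude ** rho
    weight = probability ** gamma / (
        (probability ** gamma + (1 - probability) ** gamma) ** (1 / gamma))
    return utility * weight

def gamble_value(outcomes, salience_key=0):
    """Overall gamble value with differential weighting: each outcome's
    contribution is scaled by one salient attribute only
    (salience_key 0 = magnitude, 1 = probability)."""
    saliences = [o[salience_key] for o in outcomes]
    total = sum(saliences)
    return sum((s / total) * outcome_value(m, p)
               for s, (m, p) in zip(saliences, outcomes))

# A three-outcome gamble as (magnitude, probability) pairs.
g = [(10, 0.2), (5, 0.5), (1, 0.3)]
v_magnitude = gamble_value(g, salience_key=0)   # magnitude-salient weighting
v_probability = gamble_value(g, salience_key=1)  # probability-salient weighting
```

The two weighting schemes generally yield different values for the same gamble, which is what lets behavioral data arbitrate between them.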
Affiliation(s)
- Mehran Spitmaan: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- Emily Chu: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
- Alireza Soltani: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire 03755
45
Heilbron M, Meyniel F. Confidence resets reveal hierarchical adaptive learning in humans. PLoS Comput Biol 2019; 15:e1006972. [PMID: 30964861] [PMCID: PMC6474633] [DOI: 10.1371/journal.pcbi.1006972]
Abstract
Hierarchical processing is pervasive in the brain, but its computational significance for learning under uncertainty is disputed. On the one hand, hierarchical models provide an optimal framework and are becoming increasingly popular to study cognition. On the other hand, non-hierarchical (flat) models remain influential and can learn efficiently, even in uncertain and changing environments. Here, we show that previously proposed hallmarks of hierarchical learning, which relied on reports of learned quantities or choices in simple experiments, are insufficient to categorically distinguish hierarchical from flat models. Instead, we present a novel test which leverages a more complex task, whose hierarchical structure allows generalization between different statistics tracked in parallel. We use reports of confidence to quantitatively and qualitatively arbitrate between the two accounts of learning. Our results support the hierarchical learning framework, and demonstrate how confidence can be a useful metric in learning theory.
Affiliation(s)
- Micha Heilbron: Cognitive Neuroimaging Unit / NeuroSpin center / Institute for Life Sciences Frédéric Joliot / Fundamental Research Division / Commissariat à l'Energie Atomique et aux énergies alternatives; INSERM, Université Paris-Sud; Université Paris-Saclay; Gif-sur-Yvette, France
- Florent Meyniel: Cognitive Neuroimaging Unit / NeuroSpin center / Institute for Life Sciences Frédéric Joliot / Fundamental Research Division / Commissariat à l'Energie Atomique et aux énergies alternatives; INSERM, Université Paris-Sud; Université Paris-Saclay; Gif-sur-Yvette, France
46
Maheu M, Dehaene S, Meyniel F. Brain signatures of a multiscale process of sequence learning in humans. eLife 2019; 8:41541. [PMID: 30714904] [PMCID: PMC6361584] [DOI: 10.7554/elife.41541]
Abstract
Extracting the temporal structure of sequences of events is crucial for perception, decision-making, and language processing. Here, we investigate the mechanisms by which the brain acquires knowledge of sequences and the possibility that successive brain responses reflect the progressive extraction of sequence statistics at different timescales. We measured brain activity using magnetoencephalography in humans exposed to auditory sequences with various statistical regularities, and we modeled this activity as theoretical surprise levels using several learning models. Successive brain waves were related to different types of statistical inference. Early post-stimulus brain waves denoted a sensitivity to a simple statistic, the frequency of items estimated over a long timescale (habituation). Mid-latency and late brain waves conformed qualitatively and quantitatively to the computational properties of a more complex inference: the learning of recent transition probabilities. Our findings thus support the existence of multiple computational systems for sequence processing involving statistical inferences at multiple scales.
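The two statistics contrasted in this abstract, long-timescale item frequency and shorter-timescale transition probabilities, can both be estimated online with leaky (exponentially forgetting) counts. The sketch below is a generic leaky-integration estimator for a binary sequence; the timescale constants and the Laplace priors are illustrative assumptions, not the paper's models.

```python
import math

def leaky_frequency(seq, tau=100.0):
    """Item-frequency estimate with slow exponential forgetting
    (a habituation-like, long-timescale statistic)."""
    decay = math.exp(-1.0 / tau)
    counts = {0: 1.0, 1: 1.0}  # Laplace prior over the two items
    estimates = []
    for x in seq:
        estimates.append(counts[x] / (counts[0] + counts[1]))
        counts = {k: v * decay for k, v in counts.items()}
        counts[x] += 1.0
    return estimates

def leaky_transition_probs(seq, tau=10.0):
    """Estimate of p(x_t | x_{t-1}) with faster forgetting, tracking
    recent transition probabilities over a shorter timescale."""
    decay = math.exp(-1.0 / tau)
    counts = {(a, b): 1.0 for a in (0, 1) for b in (0, 1)}
    estimates = [0.5]  # no prediction is possible for the first item
    for prev, nxt in zip(seq, seq[1:]):
        row = counts[(prev, 0)] + counts[(prev, 1)]
        estimates.append(counts[(prev, nxt)] / row)
        counts = {k: v * decay for k, v in counts.items()}
        counts[(prev, nxt)] += 1.0
    return estimates
```

Surprise under either statistic is then simply the negative log of the predicted probability of the observed item, which is the quantity regressed against brain responses.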
Affiliation(s)
- Maxime Maheu: Cognitive Neuroimaging Unit, CEA DRF/JOLIOT, INSERM, Université Paris-Sud, Université Paris-Saclay, NeuroSpin center, Gif-sur-Yvette, France; Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Stanislas Dehaene: Cognitive Neuroimaging Unit, CEA DRF/JOLIOT, INSERM, Université Paris-Sud, Université Paris-Saclay, NeuroSpin center, Gif-sur-Yvette, France; Collège de France, Paris, France
- Florent Meyniel: Cognitive Neuroimaging Unit, CEA DRF/JOLIOT, INSERM, Université Paris-Sud, Université Paris-Saclay, NeuroSpin center, Gif-sur-Yvette, France
47
Computing Value from Quality and Quantity in Human Decision-Making. J Neurosci 2018; 39:163-176. [PMID: 30455186] [PMCID: PMC6325261] [DOI: 10.1523/jneurosci.0706-18.2018]
Abstract
How organisms learn the value of single stimuli through experience is well described. In many decisions, however, value estimates are computed “on the fly” by combining multiple stimulus attributes. The neural basis of this computation is poorly understood. Here we explore a common scenario in which decision-makers must combine information about quality and quantity to determine the best option. Using fMRI, we examined the neural representation of quality, quantity, and their integration into a subjective value signal in humans of both genders. We found that activity within inferior frontal gyrus (IFG) correlated with offer quality, while activity in the intraparietal sulcus (IPS) specifically correlated with offer quantity. Several brain regions, including the anterior cingulate cortex (ACC), were sensitive to an interaction of quality and quantity. However, the ACC was uniquely activated by quality, quantity, and their interaction, suggesting that this region provides a substrate for flexible computation of value from both quality and quantity. Furthermore, ACC signals across subjects correlated with the strength of quality and quantity signals in IFG and IPS, respectively. ACC tracking of subjective value also correlated with choice predictability. Finally, activity in the ACC was elevated for choice trials, suggesting that ACC provides a nexus for the computation of subjective value in multiattribute decision-making.

SIGNIFICANCE STATEMENT: Would you prefer three apples or two oranges? Many choices we make each day require us to weigh up the quality and quantity of different outcomes. Using fMRI, we show that option quality is selectively represented in the inferior frontal gyrus, while option quantity correlates with areas of the intraparietal sulcus that have previously been associated with numerical processing.
We show that information about the two is integrated into a value signal in the anterior cingulate cortex, and the fidelity of this integration predicts choice predictability. Our results demonstrate how on-the-fly value estimates are computed from multiple attributes in human value-based decision-making.
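One simple hypothesis for the on-the-fly integration described above is a multiplicative combination: quantity scaled by a subjective quality weight. This is only an illustrative sketch; the study tests quality, quantity, and their interaction as separate neural regressors rather than committing to this particular functional form, and the quality values below are made up.

```python
def subjective_value(quality, quantity, w=1.0):
    """Multiplicative quality * quantity value, with w as an optional
    overall scaling (an assumption, not the paper's fitted model)."""
    return w * quality * quantity

# Three apples vs two oranges, with hypothetical per-item qualities:
apples = subjective_value(quality=0.6, quantity=3)   # ~1.8
oranges = subjective_value(quality=1.0, quantity=2)  # 2.0
choice = "oranges" if oranges > apples else "apples"
```

Under these assumed qualities the model prefers two high-quality oranges over three mediocre apples, illustrating how the same attributes can trade off against each other.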
48
Zhang JJ, Haubrich J, Bernabo M, Finnie PS, Nader K. Limits on lability: Boundaries of reconsolidation and the relationship to metaplasticity. Neurobiol Learn Mem 2018; 154:78-86. [DOI: 10.1016/j.nlm.2018.02.018]
49
Zimmermann J, Glimcher PW, Louie K. Multiple timescales of normalized value coding underlie adaptive choice behavior. Nat Commun 2018; 9:3206. [PMID: 30097577] [PMCID: PMC6086888] [DOI: 10.1038/s41467-018-05507-8]
Abstract
Adaptation is a fundamental process crucial for the efficient coding of sensory information. Recent evidence suggests that similar coding principles operate in decision-related brain areas, where neural value coding adapts to recent reward history. However, the circuit mechanism for value adaptation is unknown, and the link between changes in adaptive value coding and choice behavior is unclear. Here we show that choice behavior in nonhuman primates varies with the statistics of recent rewards. Consistent with efficient coding theory, decision-making shows increased choice sensitivity in lower variance reward environments. Both the average adaptation effect and across-session variability are explained by a novel multiple timescale dynamical model of value representation implementing divisive normalization. The model predicts empirical variance-driven changes in behavior despite having no explicit knowledge of environmental statistics, suggesting that distributional characteristics can be captured by dynamic model architectures. These findings highlight the importance of treating decision-making as a dynamic process and the role of normalization as a unifying computation for contextual phenomena in choice.
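The core computation here, divisive normalization of value by reward history tracked at several timescales, can be sketched as follows. The two time constants, the semi-saturation term sigma, and the equal mixing of traces are illustrative assumptions, not the paper's fitted dynamical model.

```python
import math

def normalized_values(rewards, taus=(2.0, 20.0), sigma=1.0):
    """Divisively normalize a reward stream by a normalization signal
    built from exponential averages at multiple timescales."""
    alphas = [1.0 - math.exp(-1.0 / t) for t in taus]  # per-timescale learning rates
    traces = [0.0] * len(taus)
    out = []
    for r in rewards:
        norm = sum(traces) / len(traces)   # mix the timescale traces equally
        out.append(r / (sigma + norm))     # divisively normalized value
        traces = [(1 - a) * tr + a * r     # update each exponential average
                  for a, tr in zip(alphas, traces)]
    return out

# In a low-variance environment the normalization signal hugs the rewards,
# so small reward differences produce relatively larger normalized contrasts,
# consistent with higher choice sensitivity when reward variance is low.
low_var = normalized_values([10, 11, 10, 11, 10, 11])
high_var = normalized_values([1, 20, 1, 20, 1, 20])
```

Because the traces adapt purely through their own dynamics, the model captures variance-driven behavioral changes without any explicit representation of the environment's statistics, which is the abstract's central point.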
Affiliation(s)
- Jan Zimmermann: Center for Neural Science, New York University, 4 Washington Place Room 809, New York, NY, 10003, USA
- Paul W Glimcher: Center for Neural Science, New York University, 4 Washington Place Room 809, New York, NY, 10003, USA; Institute for the Study of Decision Making, New York University, 4 Washington Place Room 809, New York, NY, 10003, USA
- Kenway Louie: Center for Neural Science, New York University, 4 Washington Place Room 809, New York, NY, 10003, USA; Institute for the Study of Decision Making, New York University, 4 Washington Place Room 809, New York, NY, 10003, USA
50
Elston TW, Kalhan S, Bilkey DK. Conflict and adaptation signals in the anterior cingulate cortex and ventral tegmental area. Sci Rep 2018; 8:11732. [PMID: 30082775] [PMCID: PMC6079061] [DOI: 10.1038/s41598-018-30203-4]
Abstract
Integrating and using feedback to determine which decision strategy to apply in different contexts is at the core of executive function. The anterior cingulate cortex (ACC) is central to these processes, but how feedback is made available to the ACC is unclear. To address this question, we trained rats with implants in the ACC and the ventral tegmental area (VTA), a dopaminergic brain region implicated in feedback processing, in a spatial decision reversal task with rule switching occurring approximately every 12 trials. Following a rule switch, the rats had to shift and sustain responses to the alternative side in order to obtain reward. Partial directed coherence (PDC) models of signal directionality between the ACC and VTA indicated that VTA → ACC communication (near 4 Hz) increased immediately prior to incorrect choices and during post-error decisions. This increase did not occur during correct choices. These data indicate that the VTA provides a feedback-driven, bottom-up modulating signal to the ACC which may be involved in assessing, and correcting for, decision conflict.
Affiliation(s)
- Thomas W Elston: Department of Psychology, University of Otago, Dunedin, 9016, New Zealand; Brain Health Research Centre, University of Otago, Dunedin, 9016, New Zealand; Institute for Neurobiology, University of Tübingen, Tübingen, 72076, Germany
- Shivam Kalhan: Department of Psychology, University of Otago, Dunedin, 9016, New Zealand; Brain Health Research Centre, University of Otago, Dunedin, 9016, New Zealand
- David K Bilkey: Department of Psychology, University of Otago, Dunedin, 9016, New Zealand; Brain Health Research Centre, University of Otago, Dunedin, 9016, New Zealand