51
Nguyen QN, Reinagel P. Different Forms of Variability Could Explain a Difference Between Human and Rat Decision Making. Front Neurosci 2022; 16:794681. [PMID: 35273473; PMCID: PMC8902138; DOI: 10.3389/fnins.2022.794681]
Abstract
When observers make rapid, difficult perceptual decisions, their response time is highly variable from trial to trial. In a visual motion discrimination task, it has been reported that human accuracy declines with increasing response time, whereas rat accuracy increases with response time. This is of interest because different mathematical theories of decision-making differ in their predictions regarding the correlation of accuracy with response time. On the premise that perceptual decision-making mechanisms are likely to be conserved among mammals, we seek to unify the rodent and primate results in a common theoretical framework. We show that a bounded drift diffusion model (DDM) can explain both effects with variable parameters: trial-to-trial variability in the starting point of the diffusion process produces the pattern typically observed in rats, whereas variability in the drift rate produces the pattern typically observed in humans. We further show that the same effects can be produced by deterministic biases, even in the absence of parameter stochasticity or parameter change within a trial.
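Both effects can be reproduced in a few lines of simulation. The sketch below is illustrative (parameters and implementation are not taken from the paper): it implements a bounded DDM with either trial-to-trial drift-rate variability (`sv`) or starting-point variability (`sz`), then median-splits trials by response time.

```python
import random

def simulate_ddm(n_trials, drift=1.0, bound=1.0, dt=0.01, noise=1.0,
                 sv=0.0, sz=0.0, seed=0):
    """Simulate a bounded drift diffusion model with optional trial-to-trial
    parameter variability. sv is the SD of a Gaussian drift-rate perturbation;
    sz is the half-range of a uniform starting-point offset.
    Returns a list of (correct, response_time) pairs."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        v = rng.gauss(drift, sv)          # this trial's drift rate
        x = rng.uniform(-sz, sz)          # this trial's starting point
        t = 0.0
        while abs(x) < bound:
            x += v * dt + rng.gauss(0.0, noise * dt ** 0.5)
            t += dt
        trials.append((x >= bound, t))    # upper bound counts as correct
    return trials

def accuracy_fast_slow(trials):
    """Median-split RTs; return (accuracy on fast half, accuracy on slow half)."""
    med = sorted(t for _, t in trials)[len(trials) // 2]
    fast = [c for c, t in trials if t <= med]
    slow = [c for c, t in trials if t > med]
    return sum(fast) / len(fast), sum(slow) / len(slow)
```

With `sv > 0`, low-drift trials are both slow and error-prone, so accuracy declines with response time (the human-like pattern); with `sz > 0`, starts near the wrong bound produce fast errors, so accuracy increases with response time (the rat-like pattern).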
Affiliation(s)
- Pamela Reinagel
- Section of Neurobiology, Division of Biological Sciences, University of California, San Diego, San Diego, CA, United States
52
Grossman CD, Bari BA, Cohen JY. Serotonin neurons modulate learning rate through uncertainty. Curr Biol 2022; 32:586-599.e7. [PMID: 34936883; PMCID: PMC8825708; DOI: 10.1016/j.cub.2021.12.006]
Abstract
Regulating how fast to learn is critical for flexible behavior. Learning about the consequences of actions should be slow in stable environments but accelerate when the environment changes. Recognizing stability and detecting change are difficult in environments with noisy relationships between actions and outcomes. Under these conditions, theories propose that uncertainty can be used to modulate learning rates ("meta-learning"). We show that the choice behavior of mice performing a dynamic foraging task varied as a function of two forms of uncertainty estimated from a meta-learning model. The activity of dorsal raphe serotonin neurons tracked both types of uncertainty in the foraging task, as well as in a dynamic Pavlovian task. Reversible inhibition of serotonin neurons in the foraging task reproduced the changes in learning predicted by a simulated lesion of meta-learning in the model. We thus provide a quantitative link between serotonin neuron activity, learning, and decision making.
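The idea of uncertainty-modulated learning rates can be illustrated with a Pearce-Hall-style associability update. This is a minimal sketch of the general principle, not the meta-learning model fitted in the paper:

```python
def pearce_hall_update(value, alpha, reward, eta=0.1, kappa=0.5):
    """One step of a Pearce-Hall-style update in which the learning rate
    itself is learned: alpha ("associability") tracks recent unsigned
    prediction error, so surprising outcomes speed up subsequent learning
    while stable outcomes slow it down."""
    delta = reward - value                      # signed prediction error
    new_value = value + kappa * alpha * delta   # uncertainty-scaled update
    new_alpha = (1.0 - eta) * alpha + eta * abs(delta)
    return new_value, new_alpha
```

Running this on a block of rewarded trials followed by a reversal shows associability collapsing during the stable period and rebounding after the change, which is what allows learning to re-accelerate.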
Affiliation(s)
- Cooper D Grossman
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, USA
- Bilal A Bari
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, USA
- Jeremiah Y Cohen
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, USA
53
Liu Z, Liu S, Li S, Li L, Zheng L, Weng X, Guo X, Lu Y, Men W, Gao J, You X. Dissociating Value-Based Neurocomputation from Subsequent Selection-Related Activations in Human Decision-Making. Cereb Cortex 2022; 32:4141-4155. [PMID: 35024797; DOI: 10.1093/cercor/bhab471]
Abstract
Human decision-making requires the brain to compute benefit and risk and then to select between options. It remains unclear how value-based neural computation and subsequent brain activity evolve to reach a final decision, and which process is modulated by irrational factors. We adopted a sequential risk-taking task that asked participants to decide successively whether to open each of eight boxes per trial, each containing potential reward or punishment, or to stop. With time-resolved multivariate pattern analyses, we decoded electroencephalography and magnetoencephalography responses to two successive low- and high-risk boxes preceding the open-box action. Because the decoding-accuracy peak marks the completion of a first processing stage, we used it as a demarcation to dissociate the neural time course of decision-making into valuation and selection stages. Behavioral hierarchical drift diffusion modeling confirmed that different information is processed in the two stages: the valuation stage was related to the drift rate of evidence accumulation, whereas the selection stage was related to the nondecision time spent producing a response. We further observed that the medial orbitofrontal cortex participated in the valuation stage, whereas the superior frontal gyrus engaged in the selection stage of irrational open-box decisions. Finally, we found that irrational factors influenced decision-making through the selection stage rather than the valuation stage.
Affiliation(s)
- Zhiyuan Liu
- Shaanxi Key Laboratory of Behavior and Cognitive Neuroscience, School of Psychology, Shaanxi Normal University, Xi'an 710062, China
- Sijia Liu
- Department of Psychology and Behavioral Sciences, Zhejiang University, Hangzhou 310007, China
- Shuang Li
- School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Lin Li
- School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Li Zheng
- School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Xue Weng
- School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Xiuyan Guo
- Department of Psychology and Behavioral Sciences, Zhejiang University, Hangzhou 310007, China; Shanghai Key Laboratory of Magnetic Resonance, School of Physics and Materials Science, East China Normal University, Shanghai 200062, China
- Yang Lu
- School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Weiwei Men
- Center for MRI Research, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100091, China; Beijing City Key Laboratory for Medical Physics and Engineering, Institute of Heavy Ion Physics, School of Physics, Peking University, Beijing 100091, China
- Jiahong Gao
- Beijing City Key Laboratory for Medical Physics and Engineering, Institute of Heavy Ion Physics, School of Physics, Peking University, Beijing 100091, China; Center for MRI Research and McGovern Institute for Brain Research, Peking University, Beijing 100091, China
- Xuqun You
- Shaanxi Key Laboratory of Behavior and Cognitive Neuroscience, School of Psychology, Shaanxi Normal University, Xi'an 710062, China
54
Averbeck B, O'Doherty JP. Reinforcement-learning in fronto-striatal circuits. Neuropsychopharmacology 2022; 47:147-162. [PMID: 34354249; PMCID: PMC8616931; DOI: 10.1038/s41386-021-01108-0]
Abstract
We review the current state of knowledge on the computational and neural mechanisms of reinforcement-learning, with a particular focus on fronto-striatal circuits. We divide the literature in this area into five broad research themes: the target of the learning, whether it be the value of stimuli or the value of actions; the nature and complexity of the algorithm used to drive the learning and inference process; how learned values get converted into choices and associated actions; and the nature of state representations, and of other cognitive machinery, that support the implementation of various reinforcement-learning operations. An emerging fifth theme focuses on how the brain allocates or arbitrates control over different reinforcement-learning sub-systems or "experts". We outline what is known about the role of the prefrontal cortex and striatum in implementing each of these functions. We conclude by arguing that it will be necessary to build bridges from algorithmic-level descriptions of computational reinforcement-learning to implementational-level models to better understand how reinforcement-learning emerges from multiple distributed neural networks in the brain.
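One of the themes above, how learned values get converted into choices, is commonly modeled with a delta-rule value update and a softmax choice function. A minimal sketch (illustrative task and parameters, not drawn from the review):

```python
import math
import random

def softmax_choice(q_values, beta, rng):
    """Convert learned action values into a sampled choice via softmax."""
    exps = [math.exp(beta * q) for q in q_values]
    r, acc = rng.random() * sum(exps), 0.0
    for action, e in enumerate(exps):
        acc += e
        if r < acc:
            return action
    return len(exps) - 1

def q_learning_bandit(reward_probs, n_trials=1000, alpha=0.1, beta=5.0, seed=1):
    """Action-value learning in a two-armed bandit with softmax choice."""
    rng = random.Random(seed)
    q = [0.0] * len(reward_probs)
    choices = []
    for _ in range(n_trials):
        action = softmax_choice(q, beta, rng)
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        q[action] += alpha * (reward - q[action])  # delta rule on chosen action
        choices.append(action)
    return q, choices
```

The inverse temperature `beta` controls the value-to-choice conversion: high values make choice nearly deterministic on the learned values, low values make it exploratory.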
Affiliation(s)
- John P O'Doherty
- Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
55
Lyu N, Hu Y, Zhang J, Lloyd H, Sun YH, Tao Y. Switching costs in stochastic environments drive the emergence of matching behaviour in animal decision-making through the promotion of reward learning strategies. Sci Rep 2021; 11:23593. [PMID: 34880339; PMCID: PMC8654859; DOI: 10.1038/s41598-021-02979-5]
Abstract
A principle of choice named probability matching (PM) has long been observed in animal decision-making, and it can arise from different decision-making strategies. Little is known about how environmental stochasticity may influence the switching time of these different decision-making strategies. Here we address this problem using a combination of behavioral and theoretical approaches, and show that, although a simple win-stay/lose-shift (WSLS) strategy can generate PM in binary-choice tasks theoretically, budgerigars (Melopsittacus undulatus) actually apply a range of sub-tactics more often when they are expected to make more accurate decisions. Surprisingly, budgerigars did not obtain more rewards than would be predicted under a WSLS strategy, and their decisions also exhibited PM. Instead, budgerigars followed a learning strategy based on reward history, which potentially benefits individuals indirectly through lower switching costs. Furthermore, our data suggest that more stochastic environments may promote reward learning through significantly less switching. We suggest that switching costs driven by the stochasticity of an environmental niche can represent an important selection pressure on decision-making that may play a key role in driving the evolution of complex cognition in animals.
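The theoretical point that WSLS generates probability matching is easy to verify: in a task where exactly one of two options is correct on each trial, win-stay/lose-shift makes the probability of choosing an option equal to its probability of being correct. A minimal check (illustrative code, not the authors'):

```python
import random

def wsls_probability_matching(p_a=0.7, n_trials=20000, seed=0):
    """Win-stay/lose-shift in a task where exactly one option is correct
    per trial (option A with probability p_a). Returns the fraction of
    trials on which A was chosen, which converges to p_a (matching)."""
    rng = random.Random(seed)
    choice = "A"
    n_a = 0
    for _ in range(n_trials):
        correct = "A" if rng.random() < p_a else "B"
        n_a += choice == "A"
        if choice != correct:                # lose-shift; win-stay keeps choice
            choice = "B" if choice == "A" else "A"
    return n_a / n_trials
```

The reason is simple: under WSLS the probability of choosing A on the next trial equals p_a regardless of the current choice, so the stationary choice probability matches the reward probability.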
Affiliation(s)
- Nan Lyu
- Ministry of Education Key Laboratory for Biodiversity and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Yunbiao Hu
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Jiahua Zhang
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Huw Lloyd
- Department of Natural Sciences, Faculty of Science and Engineering, Manchester Metropolitan University, Manchester, UK
- Yue-Hua Sun
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Yi Tao
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
56
Trepka E, Spitmaan M, Bari BA, Costa VD, Cohen JY, Soltani A. Entropy-based metrics for predicting choice behavior based on local response to reward. Nat Commun 2021; 12:6567. [PMID: 34772943; PMCID: PMC8590026; DOI: 10.1038/s41467-021-26784-w]
Abstract
For decades, behavioral scientists have used the matching law to quantify how animals distribute their choices between multiple options in response to the reinforcement they receive. More recently, many reinforcement learning (RL) models have been developed to explain choice by integrating reward feedback over time. Despite the reasonable success of RL models in capturing choice on a trial-by-trial basis, these models cannot capture variability in matching behavior. To address this, we developed metrics based on information theory and applied them to choice data from dynamic learning tasks in mice and monkeys. We found that a single entropy-based metric can explain 50% and 41% of the variance in matching in mice and monkeys, respectively. We then used the limitations of existing RL models in capturing entropy-based metrics to construct more accurate models of choice. Together, our entropy-based metrics provide a model-free tool to predict adaptive choice behavior and reveal underlying neural mechanisms.
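A metric of this flavor can be computed from the entropy of stay/switch behavior conditioned on the previous outcome. The sketch below is a generic illustration of the approach; the paper's exact metric definitions may differ:

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a discrete distribution; ignores zeros."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def reward_dependent_strategy_entropy(choices, rewards):
    """Entropy of the joint stay/switch-by-previous-reward distribution.

    choices: sequence of 0/1 choices; rewards: sequence of 0/1 outcomes.
    Counts the four events win-stay, win-switch, lose-stay, lose-switch
    and returns the entropy of their empirical frequencies (max 2 bits)."""
    counts = {"ws": 0, "wsw": 0, "ls": 0, "lsw": 0}
    for t in range(1, len(choices)):
        stay = choices[t] == choices[t - 1]
        won = rewards[t - 1] == 1
        key = ("ws" if stay else "wsw") if won else ("ls" if stay else "lsw")
        counts[key] += 1
    total = sum(counts.values())
    return entropy([c / total for c in counts.values()])
```

A perfectly stereotyped local strategy (e.g. pure win-stay) yields zero entropy, while noisy, outcome-insensitive responding approaches the 2-bit maximum, giving a single model-free number that summarizes local response to reward.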
Affiliation(s)
- Ethan Trepka
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Mehran Spitmaan
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Bilal A Bari
- The Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Brain Science Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Vincent D Costa
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
- Jeremiah Y Cohen
- The Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Brain Science Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Alireza Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
57
Joo HR, Liang H, Chung JE, Geaghan-Breiner C, Fan JL, Nachman BP, Kepecs A, Frank LM. Rats use memory confidence to guide decisions. Curr Biol 2021; 31:4571-4583.e4. [PMID: 34473948; DOI: 10.1016/j.cub.2021.08.013]
Abstract
Memory enables access to past experiences to guide future behavior. Humans can determine which memories to trust (high confidence) and which to doubt (low confidence). How memory retrieval, memory confidence, and memory-guided decisions are related, however, is not understood. In particular, how confidence in memories is used in decision making is unknown. We developed a spatial memory task in which rats were incentivized to gamble their time: betting more following a correct choice yielded greater reward. Rat behavior reflected memory confidence, with higher temporal bets following correct choices. We applied machine learning to identify a memory decision variable and built a generative model of memories evolving over time that accurately predicted both choices and confidence reports. Our results reveal in rats an ability thought to exist exclusively in primates and introduce a unified model of memory dynamics, retrieval, choice, and confidence.
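The link between graded confidence and graded time bets can be illustrated with a simple signal-detection sketch, where confidence is the posterior probability of being correct given the magnitude of noisy memory evidence. This illustrates the logic only; it is not the authors' generative model, and the 10-second maximum bet is hypothetical:

```python
import math
import random

def simulate_confidence_bets(n_trials=5000, d_prime=1.5, seed=0):
    """Choice from the sign of noisy memory evidence, confidence from its
    magnitude, time bet proportional to confidence. Returns the mean bet
    on correct trials and the mean bet on error trials."""
    rng = random.Random(seed)
    bets_correct, bets_error = [], []
    for _ in range(n_trials):
        evidence = rng.gauss(d_prime / 2, 1.0)   # true answer mapped to +
        correct = evidence > 0
        # Posterior probability the choice is correct, given |evidence|
        p_correct = 1.0 / (1.0 + math.exp(-d_prime * abs(evidence)))
        bet_seconds = 10.0 * p_correct           # bet time scales w/ confidence
        (bets_correct if correct else bets_error).append(bet_seconds)
    return (sum(bets_correct) / len(bets_correct),
            sum(bets_error) / len(bets_error))
```

Because errors arise mostly from weak evidence, this scheme reproduces the behavioral signature described in the abstract: longer temporal bets following correct choices than following errors.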
Affiliation(s)
- Hannah R Joo
- Medical Scientist Training Program, University of California, San Francisco, 513 Parnassus Avenue, San Francisco, CA 94143, USA; Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA; Department of Physiology, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA; Department of Psychiatry, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA
- Hexin Liang
- Neuroscience Graduate Program, The Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, USA
- Jason E Chung
- Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA; Department of Physiology, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA; Department of Psychiatry, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA; Department of Neurological Surgery, University of California, San Francisco, 505 Parnassus Avenue, San Francisco, CA 94143, USA
- Charlotte Geaghan-Breiner
- Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA; Department of Physiology, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA; Department of Psychiatry, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA
- Jiang Lan Fan
- Bioengineering Graduate Program, University of California, Berkeley/University of California, San Francisco, 1675 Owens Street, San Francisco, CA 94158, USA
- Benjamin P Nachman
- Physics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA; Berkeley Institute of Data Science, University of California, Berkeley, 190 Doe Library, Berkeley, CA 94720, USA
- Adam Kepecs
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Avenue, St. Louis, MO 63110, USA
- Loren M Frank
- Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA; Department of Physiology, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA; Department of Psychiatry, University of California, San Francisco, 401 Parnassus Avenue, San Francisco, CA 94158, USA; Howard Hughes Medical Institute, 4000 Jones Bridge Road, Chevy Chase, MD 20815, USA
58
Zhang W, Li G, Manza P, Hu Y, Wang J, Lv G, He Y, von Deneen KM, Yu J, Han Y, Cui G, Volkow ND, Nie Y, Ji G, Wang GJ, Zhang Y. Functional Abnormality of the Executive Control Network in Individuals With Obesity During Delay Discounting. Cereb Cortex 2021; 32:2013-2021. [PMID: 34649270; DOI: 10.1093/cercor/bhab333]
Abstract
Individuals with obesity (OB) prefer immediate rewards of food intake over the delayed reward of healthy well-being achieved through diet management and physical activity, compared with normal-weight controls (NW). This may reflect heightened impulsivity, an important factor contributing to the development and maintenance of obesity. However, the neural mechanisms underlying the greater impulsivity in OB remain unclear. Therefore, the current study employed functional magnetic resonance imaging with a delay discounting (DD) task to examine the association between impulsive choice and altered neural mechanisms in OB. During decision-making in the DD task, OB compared with NW had greater activation in the dorsolateral prefrontal cortex (DLPFC) and posterior parietal cortex, which was associated with greater discounting rate and weaker cognitive control as measured with the Three-Factor Eating Questionnaire (TFEQ). In addition, the association between DLPFC activation and cognitive control (TFEQ) was mediated by discounting rate. Psychophysiological interaction analysis showed decreased connectivity of DLPFC-inferior parietal cortex (within executive control network [ECN]) and angular gyrus-caudate (ECN-reward) in OB relative to NW. These findings reveal that the aberrant function and connectivity in core regions of ECN and striatal brain reward regions underpin the greater impulsivity in OB and contribute to abnormal eating behaviors.
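Delay discounting tasks of this kind typically summarize behavior with a discounting rate k. The hyperbolic form below is the most common model, though the abstract does not state which functional form was fitted; the dollar amounts in the usage note are purely illustrative:

```python
def hyperbolic_value(amount, delay, k):
    """Subjective value of a delayed reward under hyperbolic discounting:
    V = A / (1 + k * D), where larger k means steeper devaluation."""
    return amount / (1.0 + k * delay)

def prefers_immediate(immediate, delayed, delay, k):
    """True when the immediate option beats the discounted delayed one."""
    return immediate > hyperbolic_value(delayed, delay, k)
```

For example, with k = 0.05 per day, a reward of 20 delivered in 30 days is worth 20 / (1 + 1.5) = 8 now, so an immediate 10 is preferred; a shallower discounter (k = 0.01) would wait. A greater fitted discounting rate is the operational meaning of "more impulsive choice" here.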
Affiliation(s)
- Wenchao Zhang
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Guanya Li
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Peter Manza
- Laboratory of Neuroimaging, National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD 20892, USA
- Yang Hu
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Jia Wang
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Ganggang Lv
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Yang He
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Karen M von Deneen
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Juan Yu
- State Key Laboratory of Cancer Biology, National Clinical Research Center for Digestive Diseases and Xijing Hospital of Digestive Diseases, The Air Force Medical University, Xi'an, Shaanxi 710032, China
- Yu Han
- Department of Radiology, Tangdu Hospital, The Air Force Medical University, Xi'an, Shaanxi 710038, China
- Guangbin Cui
- Department of Radiology, Tangdu Hospital, The Air Force Medical University, Xi'an, Shaanxi 710038, China
- Nora D Volkow
- Laboratory of Neuroimaging, National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD 20892, USA
- Yongzhan Nie
- State Key Laboratory of Cancer Biology, National Clinical Research Center for Digestive Diseases and Xijing Hospital of Digestive Diseases, The Air Force Medical University, Xi'an, Shaanxi 710032, China
- Gang Ji
- State Key Laboratory of Cancer Biology, National Clinical Research Center for Digestive Diseases and Xijing Hospital of Digestive Diseases, The Air Force Medical University, Xi'an, Shaanxi 710032, China
- Gene-Jack Wang
- Laboratory of Neuroimaging, National Institute on Alcohol Abuse and Alcoholism, Bethesda, MD 20892, USA
- Yi Zhang
- Center for Brain Imaging, School of Life Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
59
Choice history effects in mice and humans improve reward harvesting efficiency. PLoS Comput Biol 2021; 17:e1009452. [PMID: 34606493; PMCID: PMC8516315; DOI: 10.1371/journal.pcbi.1009452]
Abstract
Choice history effects describe how future choices depend on the history of past choices. In experimental tasks this is typically framed as a bias because it often diminishes the experienced reward rates. However, in natural habitats, choices made in the past constrain choices that can be made in the future. For foraging animals, the probability of earning a reward in a given patch depends on the degree to which the animals have exploited the patch in the past. One problem with many experimental tasks that show choice history effects is that such tasks artificially decouple choice history from its consequences on reward availability over time. To circumvent this, we use a variable interval (VI) reward schedule that reinstates a more natural contingency between past choices and future reward availability. By examining the behavior of optimal agents in the VI task we discover that choice history effects observed in animals serve to maximize reward harvesting efficiency. We further distil the function of choice history effects by manipulating first- and second-order statistics of the environment. We find that choice history effects primarily reflect the growth rate of the reward probability of the unchosen option, whereas reward history effects primarily reflect environmental volatility. Based on observed choice history effects in animals, we develop a reinforcement learning model that explicitly incorporates choice history over multiple time scales into the decision process, and we assess its adequacy in accounting for the associated behavior. We show that this new variant, known as the double trace model, predicts choice data better and shows near-optimal reward harvesting efficiency in simulated environments. These results suggest that choice history effects may be adaptive for natural contingencies between consumption and reward availability, lending credence to a normative account of choice history effects that extends beyond their description as a bias.

Animals foraging for food in natural habitats compete to obtain better-quality food patches. To achieve this goal, animals can rely on memory and choose the same patches that have provided higher-quality food in the past. However, in natural habitats simply identifying better food patches may not be sufficient to compete successfully with conspecifics, as food resources can grow over time. It therefore makes sense to visit from time to time those patches that were associated with lower food quality in the past. This demands that optimal foraging animals keep in memory not only which food patches provided the best food quality, but also which food patches they visited recently. To see whether animals track their history of visits and use it to maximize food harvesting efficiency, we subjected them to experimental conditions that mimicked natural foraging behavior. In our behavioral tasks, we replaced food foraging with a two-choice task that provided rewards to mice and humans. By developing a new computational model and subjecting animals to various behavioral manipulations, we demonstrate that keeping a memory of past visits helps animals to optimize the efficiency with which they can harvest rewards.
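The double trace idea, choice traces maintained at two time scales entering the decision alongside learned reward values, can be sketched as follows. The structure and all parameters here are illustrative and differ in detail from the published model:

```python
import math
import random

def double_trace_choice_sim(n_trials=500, alpha=0.2, beta=3.0,
                            tau_fast=0.7, tau_slow=0.98,
                            w_fast=0.5, w_slow=-0.3, seed=0):
    """Softmax choice driven by learned reward values plus choice traces
    at two time scales: a fast trace that promotes staying with recent
    choices, and a slow, negatively weighted trace that pushes the agent
    to revisit long-unchosen options (useful when unchosen options
    regrow in value, as under a VI schedule)."""
    rng = random.Random(seed)
    q = [0.0, 0.0]          # learned reward values
    fast = [0.0, 0.0]       # short-time-scale choice trace
    slow = [0.0, 0.0]       # long-time-scale choice trace
    choices = []
    for _ in range(n_trials):
        logit = [q[a] + w_fast * fast[a] + w_slow * slow[a] for a in (0, 1)]
        p0 = 1.0 / (1.0 + math.exp(-beta * (logit[0] - logit[1])))
        a = 0 if rng.random() < p0 else 1
        reward = 1.0 if rng.random() < (0.6 if a == 0 else 0.4) else 0.0
        q[a] += alpha * (reward - q[a])
        for i in (0, 1):                 # decay traces, bump the chosen one
            fast[i] *= tau_fast
            slow[i] *= tau_slow
        fast[a] += 1.0 - tau_fast
        slow[a] += 1.0 - tau_slow
        choices.append(a)
    return choices
```

The sign and weight of each trace determine whether recent choices attract (perseveration) or repel (alternation) future ones, which is the mechanism by which choice history enters the decision process over multiple time scales.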
60
D'Souza JF, Price NSC, Hagan MA. Marmosets: a promising model for probing the neural mechanisms underlying complex visual networks such as the frontal-parietal network. Brain Struct Funct 2021; 226:3007-3022. [PMID: 34518902; PMCID: PMC8541938; DOI: 10.1007/s00429-021-02367-9]
Abstract
The technology, methodology and models used by visual neuroscientists have provided great insights into the structure and function of individual brain areas. However, complex cognitive functions arise in the brain from networks comprising multiple interacting cortical areas that are wired together with precise anatomical connections. A prime example of this phenomenon is the frontal–parietal network and two key regions within it: the frontal eye fields (FEF) and the lateral intraparietal area (area LIP). Activity in these cortical areas has independently been tied to oculomotor control, motor preparation, visual attention and decision-making. Strong, bidirectional anatomical connections have also been traced between FEF and area LIP, suggesting that the aforementioned visual functions depend on these inter-area interactions. However, advances in our knowledge about the interactions between area LIP and FEF have been limited with the main animal model, the rhesus macaque, because these key regions are buried in the sulci of the brain. In this review, we propose that the common marmoset is an ideal model for investigating how anatomical connections give rise to functionally complex cognitive visual behaviours, such as those modulated by the frontal–parietal network, because of the homology of its cortical networks with those of humans and macaques, its amenability to transgenic technology, and its rich behavioural repertoire. Furthermore, the lissencephalic structure of the marmoset brain enables the application of powerful techniques, such as array-based electrophysiology and optogenetics, which are critical for bridging the gaps in our knowledge about structure and function in the brain.
Affiliation(s)
- Joanita F D'Souza
- Department of Physiology and Neuroscience Program, Biomedicine Discovery Institute, Monash University, 26 Innovation Walk, Clayton, VIC, 3800, Australia; Australian Research Council, Centre of Excellence for Integrative Brain Function, Monash University Node, Clayton, VIC, 3800, Australia
- Nicholas S C Price
- Department of Physiology and Neuroscience Program, Biomedicine Discovery Institute, Monash University, 26 Innovation Walk, Clayton, VIC, 3800, Australia; Australian Research Council, Centre of Excellence for Integrative Brain Function, Monash University Node, Clayton, VIC, 3800, Australia
- Maureen A Hagan
- Department of Physiology and Neuroscience Program, Biomedicine Discovery Institute, Monash University, 26 Innovation Walk, Clayton, VIC, 3800, Australia; Australian Research Council, Centre of Excellence for Integrative Brain Function, Monash University Node, Clayton, VIC, 3800, Australia
61
Zhou Y, Rosen MC, Swaminathan SK, Masse NY, Zhu O, Freedman DJ. Distributed functions of prefrontal and parietal cortices during sequential categorical decisions. eLife 2021; 10:e58782. [PMID: 34491201; PMCID: PMC8423442; DOI: 10.7554/elife.58782]
Abstract
Comparing sequential stimuli is crucial for guiding complex behaviors. To understand mechanisms underlying sequential decisions, we compared neuronal responses in the prefrontal cortex (PFC), the lateral intraparietal (LIP), and medial intraparietal (MIP) areas in monkeys trained to decide whether sequentially presented stimuli were from matching (M) or nonmatching (NM) categories. We found that PFC leads M/NM decisions, whereas LIP and MIP appear more involved in stimulus evaluation and motor planning, respectively. Compared to LIP, PFC showed greater nonlinear integration of currently visible and remembered stimuli, which correlated with the monkeys' M/NM decisions. Furthermore, multi-module recurrent networks trained on the same task exhibited key features of PFC and LIP encoding, including nonlinear integration in the PFC-like module, which was causally involved in the networks' decisions. Network analysis found that nonlinear units have stronger and more widespread connections with input, output, and within-area units, indicating putative circuit-level mechanisms for sequential decisions.
Affiliation(s)
- Yang Zhou: Department of Neurobiology, The University of Chicago, Chicago, United States; School of Psychological and Cognitive Sciences, PKU-IDG/McGovern Institute for Brain Research, Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Matthew C Rosen, Nicolas Y Masse, Ou Zhu: Department of Neurobiology, The University of Chicago, Chicago, United States
- David J Freedman: Department of Neurobiology and Neuroscience Institute, The University of Chicago, Chicago, United States
|
62
|
Abstract
Choking under pressure is a frustrating phenomenon experienced sometimes by skilled performers as well as during everyday life. The phenomenon has been extensively studied in humans, but it has not been previously shown whether animals also choke under pressure. Here we report that rhesus monkeys also choke under pressure. This indicates that there may be shared neural mechanisms that underlie the behavior in both humans and monkeys. Introducing an animal model for choking under pressure allows for opportunities to study the neural causes of this paradoxical behavior.

In high-stakes situations, people sometimes exhibit a frustrating phenomenon known as "choking under pressure." Usually, we perform better when the potential payoff is larger. However, once potential rewards get too high, performance paradoxically decreases—we "choke." Why do we choke under pressure? An animal model of choking would facilitate the investigation of its neural basis. However, it could be that choking is a uniquely human occurrence. To determine whether animals also choke, we trained three rhesus monkeys to perform a difficult reaching task in which they knew in advance the amount of reward to be given upon successful completion. Like humans, monkeys performed worse when potential rewards were exceptionally valuable. Failures that occurred at the highest level of reward were due to overly cautious reaching, in line with the psychological theory that explicit monitoring of behavior leads to choking. Our results demonstrate that choking under pressure is not unique to humans, and thus, its neural basis might be conserved across species.
|
63
|
Wolf C, Lappe M. Vision as oculomotor reward: cognitive contributions to the dynamic control of saccadic eye movements. Cogn Neurodyn 2021; 15:547-568. [PMID: 34367360] [PMCID: PMC8286912] [DOI: 10.1007/s11571-020-09661-y]
Abstract
Humans and other primates are equipped with a foveated visual system. As a consequence, we reorient our fovea to objects and targets in the visual field that are conspicuous or that we consider relevant or worth looking at. These reorientations are achieved by means of saccadic eye movements. Where we saccade to depends on various low-level factors, such as a target's luminance, but also crucially on high-level factors, like the expected reward or a target's relevance for perception and subsequent behavior. Here, we review recent findings on how the control of saccadic eye movements is influenced by higher-level cognitive processes. We first describe the pathways by which cognitive contributions can influence the neural oculomotor circuit. Second, we summarize what saccade parameters reveal about cognitive mechanisms, particularly saccade latencies, saccade kinematics and changes in saccade gain. Finally, we review findings on what renders a saccade target valuable, as reflected in oculomotor behavior. We emphasize that foveal vision of the target after the saccade can constitute an internal reward for the visual system and that this is reflected in oculomotor dynamics that serve to quickly and accurately provide detailed foveal vision of relevant targets in the visual field.
Affiliation(s)
- Christian Wolf, Markus Lappe: Institute for Psychology, University of Muenster, Fliednerstrasse 21, 48149 Münster, Germany
|
64
|
Smith AP, Beckmann JS. Quantifying value-based determinants of drug and non-drug decision dynamics. Psychopharmacology (Berl) 2021; 238:2047-2057. [PMID: 33839902] [PMCID: PMC8529627] [DOI: 10.1007/s00213-021-05830-x]
Abstract
RATIONALE: A growing body of research suggests that substance use disorder (SUD) may be characterized as a disorder of decision making. However, drug choice studies assessing drug-associated decision making often lack more complex and dynamic conditions that better approximate contexts outside the laboratory, and may therefore lead to incomplete conclusions regarding the nature of drug-associated value.
OBJECTIVES: The current study assessed isomorphic (choice between identical food options) and allomorphic (choice between remifentanil [REMI] and food) choice across dynamically changing reward probabilities, magnitudes, and differentially reward-predictive stimuli in male rats to better understand determinants of drug value. Choice data were analyzed at aggregate and choice-by-choice levels using quantitative matching and reinforcement learning (RL) models, respectively.
RESULTS: Reductions in reward probability or magnitude independently reduced preferences for food and REMI commodities. Inclusion of reward-predictive cues significantly increased preference for food and REMI rewards. Model comparisons revealed that reward-predictive stimuli significantly altered the economic substitutability of food and REMI rewards at both levels of analysis. Furthermore, model comparisons supported reformulating reward value updating in RL models from independent terms to a shared, relative term, more akin to matching models.
CONCLUSIONS: The results indicate that value-based quantitative choice models can accurately capture choice determinants within complex decision-making contexts and corroborate drug choice as a multidimensional valuation process. Collectively, the present study indicates commonalities in decision making for drug and non-drug rewards, validates the use of economics-based SUD therapies (e.g., contingency management), and implicates the neurobehavioral processes underlying drug-associated decision making as a potential avenue for future SUD treatment.
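The modeling distinction reported in the results — value updating with independent terms versus a shared, relative term — can be sketched in a few lines. This is an illustrative toy (function names and parameters are ours, not the authors' implementation):

```python
# Illustrative sketch only: contrasts "independent" value updating
# (standard delta-rule Q-learning) with a "shared, relative" update in
# which the prediction error is referenced to the total value across
# options, more akin to matching models. Hypothetical code, not the
# paper's actual model.

def update_independent(q, choice, reward, alpha=0.1):
    # Each option tracks its own reward estimate; unchosen values are untouched.
    q = list(q)
    q[choice] += alpha * (reward - q[choice])
    return q

def update_relative(q, choice, reward, alpha=0.1):
    # The error term uses sum(q), so the update for one option depends
    # on the current values of all options (a shared, relative term).
    q = list(q)
    q[choice] += alpha * (reward - sum(q))
    return q

print(update_independent([0.0, 0.0], choice=0, reward=1.0))  # [0.1, 0.0]
print(update_relative([0.5, 0.5], choice=0, reward=1.0))     # [0.5, 0.5]
```

In the relative form, a reward produces no update when total value already matches it — one way a learner's values can come to reflect relative rather than absolute reward rates.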
Affiliation(s)
- Aaron P Smith: Cofrin Logan Center for Addiction Research and Treatment, University of Kansas, Lawrence, KS, USA
- Joshua S Beckmann: Department of Psychology, University of Kentucky, Lexington, KY, USA
|
65
|
Pfeffer T, Ponce-Alvarez A, Tsetsos K, Meindertsma T, Gahnström CJ, van den Brink RL, Nolte G, Engel AK, Deco G, Donner TH. Circuit mechanisms for the chemical modulation of cortex-wide network interactions and behavioral variability. Sci Adv 2021; 7:eabf5620. [PMID: 34272245] [PMCID: PMC8284895] [DOI: 10.1126/sciadv.abf5620]
Abstract
Influential theories postulate distinct roles of catecholamines and acetylcholine in cognition and behavior. However, previous physiological work reported similar effects of these neuromodulators on the response properties (specifically, the gain) of individual cortical neurons. Here, we show a double dissociation between the effects of catecholamines and acetylcholine at the level of large-scale interactions between cortical areas in humans. A pharmacological boost of catecholamine levels increased cortex-wide interactions during a visual task, but not rest. An acetylcholine boost decreased interactions during rest, but not task. Cortical circuit modeling explained this dissociation by differential changes in two circuit properties: the local excitation-inhibition balance (more strongly increased by catecholamines) and intracortical transmission (more strongly reduced by acetylcholine). The inferred catecholaminergic mechanism also predicted noisier decision-making, which we confirmed for both perceptual and value-based choice behavior. Our work highlights specific circuit mechanisms for shaping cortical network interactions and behavioral variability by key neuromodulatory systems.
Affiliation(s)
- Thomas Pfeffer: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; Center for Brain and Cognition, Computational Neuroscience Group, Universitat Pompeu Fabra, Barcelona, Spain
- Adrian Ponce-Alvarez: Center for Brain and Cognition, Computational Neuroscience Group, Universitat Pompeu Fabra, Barcelona, Spain
- Konstantinos Tsetsos, Christoffer Julius Gahnström, Ruud Lucas van den Brink, Guido Nolte, Andreas Karl Engel: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Thomas Meindertsma: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Gustavo Deco: Center for Brain and Cognition, Computational Neuroscience Group, Universitat Pompeu Fabra, Barcelona, Spain; Institució Catalana de la Recerca i Estudis Avançats (ICREA), Barcelona, Spain; Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; School of Psychological Sciences, Monash University, Melbourne, Australia
- Tobias Hinrich Donner: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany; Department of Psychology, University of Amsterdam, Amsterdam, Netherlands; Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands; Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany
|
67
|
Meta-analytic clustering dissociates brain activity and behavior profiles across reward processing paradigms. Cogn Affect Behav Neurosci 2021; 20:215-235. [PMID: 31872334] [DOI: 10.3758/s13415-019-00763-7]
Abstract
Reward learning is a ubiquitous cognitive mechanism guiding adaptive choices and behaviors, and when impaired, can lead to considerable mental health consequences. Reward-related functional neuroimaging studies have begun to implicate networks of brain regions essential for processing various peripheral influences (e.g., risk, subjective preference, delay, social context) involved in the multifaceted reward processing construct. To provide a more complete neurocognitive perspective on reward processing that synthesizes findings across the literature while also appreciating these peripheral influences, we used emerging meta-analytic techniques to elucidate brain regions, and in turn networks, consistently engaged in distinct aspects of reward processing. Using a data-driven, meta-analytic, k-means clustering approach, we dissociated seven meta-analytic groupings (MAGs) of neuroimaging results (i.e., brain activity maps) from 749 experimental contrasts across 176 reward processing studies involving 13,358 healthy participants. We then performed an exploratory functional decoding approach to gain insight into the putative functions associated with each MAG. We identified a seven-MAG clustering solution that represented dissociable patterns of convergent brain activity across reward processing tasks. Additionally, our functional decoding analyses revealed that each of these MAGs mapped onto discrete behavior profiles that suggested specialized roles in predicting value (MAG-1 & MAG-2) and processing a variety of emotional (MAG-3), external (MAG-4 & MAG-5), and internal (MAG-6 & MAG-7) influences across reward processing paradigms. These findings support and extend aspects of well-accepted reward learning theories and highlight large-scale brain network activity associated with distinct aspects of reward processing.
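The data-driven clustering step described above can be illustrated with a toy sketch (ours, not the authors' neuroimaging pipeline): a minimal k-means grouping of "activation maps" represented as feature vectors over brain regions, with similar maps falling into the same meta-analytic grouping.

```python
# Toy k-means illustration (hypothetical data and code, not the paper's
# pipeline): each "map" is a vector of activation weights over regions.

def kmeans(points, k, iters=20):
    # Deterministic init: use the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: centroid = mean of its assigned points.
        for c in range(k):
            members = [p for i, p in enumerate(points) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

# Two obvious groupings: maps loading on the first two "regions"
# versus maps loading on the third.
maps = [(1, 1, 0), (1, 0.9, 0), (0, 0, 1), (0, 0.1, 1)]
labels, _ = kmeans(maps, k=2)
print(labels)  # first two maps share a cluster; last two share the other
```

Real meta-analytic clustering operates on whole-brain modeled-activation maps and selects k by fit criteria, but the grouping logic is the same.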
|
68
|
Abstract
Remapping is a property of some cortical and subcortical neurons that update their responses around the time of an eye movement to account for the shift of stimuli on the retina due to the saccade. Physiologically, remapping is traditionally tested by briefly presenting a single stimulus around the time of the saccade and looking at the onset of the response and the locations in space to which the neuron is responsive. Here we suggest that a better way to understand the functional role of remapping is to look at the time at which the neural signal emerges when saccades are made across a stable scene. Based on data obtained using this approach, we suggest that remapping in the lateral intraparietal area is sufficient to play a role in maintaining visual stability across saccades, whereas in the frontal eye field, remapped activity carries information that affects future saccadic choices and, in a separate subset of neurons, is used to maintain a map of locations in the scene that have been previously fixated.
Affiliation(s)
- James W Bisley: Department of Neurobiology, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Jules Stein Eye Institute, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Department of Psychology and the Brain Research Institute, UCLA, Los Angeles, CA, USA
- Koorosh Mirpour, Yelda Alkan: Department of Neurobiology, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
|
69
|
Mirpour K, Bisley JW. The roles of the lateral intraparietal area and frontal eye field in guiding eye movements in free viewing search behavior. J Neurophysiol 2021; 125:2144-2157. [PMID: 33949898] [DOI: 10.1152/jn.00559.2020]
Abstract
The lateral intraparietal area (LIP) and frontal eye field (FEF) have been shown to play significant roles in oculomotor control, yet most studies have found that the two areas behave similarly. To identify the unique roles each area plays in guiding eye movements, we recorded 200 LIP neurons and 231 FEF neurons from four animals performing a free viewing visual foraging task. We analyzed how neuronal responses were modulated by stimulus identity and the animals' choice of where to make a saccade. We additionally analyzed the comodulation of the sensory signals and the choice signal to identify how the sensory signals drove the choice. We found a clearly defined division of labor: LIP provided a stable map integrating task rules and stimulus identity, whereas FEF responses were dynamic, representing more complex information and, just before the saccade, were integrated with task rules and stimulus identity to decide where to move the eye.
NEW & NOTEWORTHY: The lateral intraparietal area (LIP) and frontal eye field (FEF) are known to contribute to guiding eye movements, but little is known about the unique roles that each area plays. Using a free viewing visual search task, we found that LIP provides a stable map of the visual world, integrating task rules and stimulus identity. FEF activity is consistently modulated by more complex information but, just before the saccade, integrates all the information to make the final decision about where to move.
Affiliation(s)
- Koorosh Mirpour: Department of Neurobiology, David Geffen School of Medicine at UCLA, Los Angeles, California
- James W Bisley: Department of Neurobiology, David Geffen School of Medicine at UCLA, Los Angeles, California; Jules Stein Eye Institute, David Geffen School of Medicine at UCLA, Los Angeles, California; Department of Psychology and the Brain Research Institute, UCLA, Los Angeles, California
|
70
|
Hennig JA, Oby ER, Golub MD, Bahureksa LA, Sadtler PT, Quick KM, Ryu SI, Tyler-Kabara EC, Batista AP, Chase SM, Yu BM. Learning is shaped by abrupt changes in neural engagement. Nat Neurosci 2021; 24:727-736. [PMID: 33782622] [DOI: 10.1038/s41593-021-00822-8]
Abstract
Internal states such as arousal, attention and motivation modulate brain-wide neural activity, but how these processes interact with learning is not well understood. During learning, the brain modifies its neural activity to improve behavior. How do internal states affect this process? Using a brain-computer interface learning paradigm in monkeys, we identified large, abrupt fluctuations in neural population activity in motor cortex indicative of arousal-like internal state changes, which we term 'neural engagement.' In a brain-computer interface, the causal relationship between neural activity and behavior is known, allowing us to understand how neural engagement impacted behavioral performance for different task goals. We observed stereotyped changes in neural engagement that occurred regardless of how they impacted performance. This allowed us to predict how quickly different task goals were learned. These results suggest that changes in internal states, even those seemingly unrelated to goal-seeking behavior, can systematically influence how behavior improves with learning.
Affiliation(s)
- Jay A Hennig: Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Emily R Oby: Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Matthew D Golub: Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Lindsay A Bahureksa: Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Patrick T Sadtler, Kristin M Quick, Aaron P Batista: Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Stephen I Ryu: Department of Electrical Engineering, Stanford University, Stanford, CA, USA; Department of Neurosurgery, Palo Alto Medical Foundation, Palo Alto, CA, USA
- Elizabeth C Tyler-Kabara: Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Physical Medicine and Rehabilitation, University of Pittsburgh, Pittsburgh, PA, USA; Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA; Department of Neurosurgery, Dell Medical School, University of Texas at Austin, Austin, TX, USA
- Steven M Chase: Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Byron M Yu: Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
|
71
|
Wojtak W, Ferreira F, Vicente P, Louro L, Bicho E, Erlhagen W. A neural integrator model for planning and value-based decision making of a robotics assistant. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05224-8]
|
72
|
Mochol G, Kiani R, Moreno-Bote R. Prefrontal cortex represents heuristics that shape choice bias and its integration into future behavior. Curr Biol 2021; 31:1234-1244.e6. [PMID: 33639107] [PMCID: PMC8095400] [DOI: 10.1016/j.cub.2021.01.068]
Abstract
Goal-directed behavior requires integrating sensory information with prior knowledge about the environment. Behavioral biases that arise from these priors could increase positive outcomes when the priors match the true structure of the environment, but mismatches also happen frequently and could cause unfavorable outcomes. Biases that reduce gains and fail to vanish with training indicate fundamental suboptimalities arising from ingrained heuristics of the brain. Here, we report systematic, gain-reducing choice biases in highly trained monkeys performing a motion direction discrimination task where only the current stimulus is behaviorally relevant. The monkey's bias fluctuated at two distinct time scales: slow, spanning tens to hundreds of trials, and fast, arising from choices and outcomes of the most recent trials. Our findings enabled single trial prediction of biases, which influenced the choice especially on trials with weak stimuli. The pre-stimulus activity of neuronal ensembles in the monkey prearcuate gyrus represented these biases as an offset along the decision axis in the state space. This offset persisted throughout the stimulus viewing period, when sensory information was integrated, leading to a biased choice. The pre-stimulus representation of history-dependent bias was functionally indistinguishable from the neural representation of upcoming choice before stimulus onset, validating our model of single-trial biases and suggesting that pre-stimulus representation of choice could be fully defined by biases inferred from behavioral history. Our results indicate that the prearcuate gyrus reflects intrinsic heuristics that compute bias signals, as well as the mechanisms that integrate them into the oculomotor decision-making process.
Affiliation(s)
- Gabriela Mochol, Rubén Moreno-Bote: Center for Brain and Cognition and Department of Information and Communications Technologies, Pompeu Fabra University, Barcelona, Spain
- Roozbeh Kiani: Center for Neural Science, New York University, New York, NY 10003, USA; Neuroscience Institute, NYU Langone Medical Center, New York, NY 10016, USA; Department of Psychology, New York University, New York, NY 10003, USA
|
73
|
Houston AI, Trimmer PC, McNamara JM. Matching Behaviours and Rewards. Trends Cogn Sci 2021; 25:403-415. [PMID: 33612384] [DOI: 10.1016/j.tics.2021.01.011]
Abstract
Matching describes how behaviour is related to rewards. The matching law holds when the ratio of an individual's behaviours equals the ratio of the rewards obtained. From its origins in the study of pigeons working for food in the laboratory, the law has been applied to a range of species, both in the laboratory and outside it (e.g., human sporting decisions). Probability matching occurs when the probability of a behaviour equals the probability of being rewarded. Input matching predicts the distribution of individuals across habitats. We evaluate the rationality of the matching law and probability matching, expose the logic of matching in real-world cases, review how recent neuroscience findings relate to matching, and suggest future research directions.
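The two laws defined in this abstract can be made concrete with a small numerical sketch (illustrative only; the function names are ours, not the authors'). Under strict matching, B1/B2 = R1/R2, so the fraction of behaviour allocated to an option equals that option's share of obtained rewards.

```python
# Illustrative sketch of the matching law and probability matching
# (hypothetical helper, not from the paper).

def matching_allocation(r1, r2):
    # Fraction of behaviour allocated to option 1 under strict matching:
    # B1 / (B1 + B2) = R1 / (R1 + R2).
    return r1 / (r1 + r2)

# If option 1 yields 30 rewards and option 2 yields 10, matching
# predicts 75% of responses go to option 1.
print(matching_allocation(30, 10))  # 0.75

# Probability matching: the probability of choosing an option equals
# its reward probability (0.7 here), rather than always choosing the
# better option as a reward-maximizer would.
p_reward_1 = 0.7
p_choose_1 = p_reward_1
print(p_choose_1)  # 0.7
```

Note that probability matching is generally suboptimal for a lone forager (always choosing the 0.7 option earns more), which is part of what makes its rationality worth evaluating.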
Affiliation(s)
- Alasdair I Houston: School of Biological Sciences, University of Bristol, Life Sciences Building, 24 Tyndall Avenue, Bristol, BS8 1TQ, UK
- Pete C Trimmer: Department of Psychology, University of Warwick, Coventry, CV4 7AL, UK
- John M McNamara: School of Mathematics, University of Bristol, Fry Building, Woodland Road, Bristol, BS8 1UG, UK
|
74
|
Clancy KB, Mrsic-Flogel TD. The sensory representation of causally controlled objects. Neuron 2021; 109:677-689.e4. [PMID: 33357383] [PMCID: PMC7889580] [DOI: 10.1016/j.neuron.2020.12.001]
Abstract
Intentional control over external objects is informed by our sensory experience of them. To study how causal relationships are learned and effected, we devised a brain machine interface (BMI) task using wide-field calcium signals. Mice learned to entrain activity patterns in arbitrary pairs of cortical regions to guide a visual cursor to a target location for reward. Brain areas that were normally correlated could be rapidly reconfigured to exert control over the cursor in a sensory-feedback-dependent manner. Higher visual cortex was more engaged when expert but not naive animals controlled the cursor. Individual neurons in higher visual cortex responded more strongly to the cursor when mice controlled it than when they passively viewed it, with the greatest response boosting as the cursor approached the target location. Thus, representations of causally controlled objects are sensitive to intention and proximity to the subject's goal, potentially strengthening sensory feedback to allow more fluent control.
Affiliation(s)
- Kelly B Clancy: Biozentrum, University of Basel, 70 Klingelbergstrasse, 4056 Basel, Switzerland
|
75
|
Bari BA, Cohen JY. Dynamic decision making and value computations in medial frontal cortex. Int Rev Neurobiol 2021; 158:83-113. [PMID: 33785157] [DOI: 10.1016/bs.irn.2020.12.001]
Abstract
Dynamic decision making requires an intact medial frontal cortex. Recent work has combined theory and single-neuron measurements in frontal cortex to advance models of decision making. We review behavioral tasks that have been used to study dynamic decision making and algorithmic models of these tasks using reinforcement learning theory. We discuss studies linking neurophysiology and quantitative decision variables. We conclude with hypotheses about the role of other cortical and subcortical structures in dynamic decision making, including ascending neuromodulatory systems.
Affiliation(s)
- Bilal A Bari, Jeremiah Y Cohen: The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, United States
|
76
|
Thiery T, Saive AL, Combrisson E, Dehgan A, Bastin J, Kahane P, Berthoz A, Lachaux JP, Jerbi K. Decoding the neural dynamics of free choice in humans. PLoS Biol 2020; 18:e3000864. [PMID: 33301439] [PMCID: PMC7755286] [DOI: 10.1371/journal.pbio.3000864]
Abstract
How do we choose a particular action among equally valid alternatives? Nonhuman primate findings have shown that decision-making implicates modulations in unit firing rates and local field potentials (LFPs) across frontal and parietal cortices. Yet the electrophysiological brain mechanisms that underlie free choice in humans remain ill defined. Here, we address this question using rare intracerebral electroencephalography (EEG) recordings in surgical epilepsy patients performing a delayed oculomotor decision task. We find that the temporal dynamics of high-gamma (HG, 60-140 Hz) neural activity in distinct frontal and parietal brain areas robustly discriminate free choice from instructed saccade planning at the level of single trials. Classification analysis was applied to the LFP signals to isolate decision-related activity from sensory and motor planning processes. Compared with instructed saccades, free-choice trials exhibited delayed and longer-lasting HG activity during the delay period. The temporal dynamics of the decision-specific sustained HG activity indexed the unfolding of a deliberation process, rather than memory maintenance. Taken together, these findings provide the first direct electrophysiological evidence in humans for the role of sustained high-frequency neural activation in frontoparietal cortex in mediating the intrinsically driven process of freely choosing among competing behavioral alternatives.
Affiliation(s)
- Thomas Thiery
- Cognitive & Computational Neuroscience Lab, Psychology Department, University of Montreal, Québec, Canada
- Anne-Lise Saive
- Cognitive & Computational Neuroscience Lab, Psychology Department, University of Montreal, Québec, Canada
- Etienne Combrisson
- Cognitive & Computational Neuroscience Lab, Psychology Department, University of Montreal, Québec, Canada
- Centre de Recherche en Neurosciences de Lyon (CRNL), Lyon, France
- Arthur Dehgan
- Cognitive & Computational Neuroscience Lab, Psychology Department, University of Montreal, Québec, Canada
- Julien Bastin
- Grenoble Institut des Neurosciences, Grenoble, France
- Karim Jerbi
- Cognitive & Computational Neuroscience Lab, Psychology Department, University of Montreal, Québec, Canada
- MILA (Québec Artificial Intelligence Institute), Montréal, Québec, Canada
- Centre UNIQUE (Union Neurosciences & Intelligence Artificielle), Montréal, Québec, Canada
|
77
|
Chen X, Zirnsak M, Vega GM, Moore T. Frontal eye field neurons selectively signal the reward value of prior actions. Prog Neurobiol 2020; 195:101881. [PMID: 32628973 PMCID: PMC7736534 DOI: 10.1016/j.pneurobio.2020.101881] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/17/2020] [Revised: 06/21/2020] [Accepted: 06/26/2020] [Indexed: 12/14/2022]
Abstract
The consequences of individual actions are typically unknown until well after they are executed. This fact necessitates a mechanism that bridges delays between specific actions and reward outcomes. We looked for the presence of such a mechanism in the post-movement activity of neurons in the frontal eye field (FEF), a visuomotor area in prefrontal cortex. Monkeys performed an oculomotor gamble task in which they made eye movements to different locations associated with dynamically varying reward outcomes. Behavioral data showed that monkeys tracked reward history and made choices according to their own risk preferences. Consistent with previous studies, we observed that the activity of FEF neurons is correlated with the expected reward value of different eye movements before a target appears. Moreover, we observed that the activity of FEF neurons continued to signal the direction of eye movements, the expected reward value, and their interaction well after the movements were completed and when targets were no longer within the neuronal response field. In addition, this post-movement information was also observed in local field potentials, particularly in low-frequency bands. These results show that neural signals of prior actions and expected reward value persist across delays between those actions and their experienced outcomes. These memory traces may serve a role in reward-based learning in which subjects need to learn actions predicting delayed reward.
Affiliation(s)
- Xiaomo Chen
- Department of Neurobiology and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- Marc Zirnsak
- Department of Neurobiology and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- Gabriel M Vega
- Department of Neurobiology and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- Tirin Moore
- Department of Neurobiology and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
|
78
|
Loganathan K, Lv J, Cropley V, Ho ETW, Zalesky A. Associations Between Delay Discounting and Connectivity of the Valuation-control System in Healthy Young Adults. Neuroscience 2020; 452:295-310. [PMID: 33242540 DOI: 10.1016/j.neuroscience.2020.11.026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/17/2020] [Revised: 11/05/2020] [Accepted: 11/13/2020] [Indexed: 01/04/2023]
Abstract
The process of valuation assists in determining whether an object or course of action is rewarding. Delay discounting is the observed decay of a reward's subjective value over time. Encoding the subjective value of rewards across a spectrum has been attributed to brain regions belonging to the valuation and executive control systems. The valuation system (VS) encodes reward value over short and long delays, influencing reinforcement learning and reward representation. The executive control system (ECS) becomes more active as choice difficulty increases, integrating contextual and mnemonic information with salience signals in the modulation of decision-making. Here, we aimed to identify resting-state functional connectivity-based patterns of the VS and ECS correlated with value-setting and delay discounting (outside-scanner paradigm) in a large (n = 992) cohort of healthy young adults from the Human Connectome Project (HCP). Results suggest the VS may be involved in value-setting of small, immediate rewards, while the ECS may be involved in value-setting and delay discounting for large and small rewards over a range of delays. We observed magnitude-sensitive connections involving the posterior cingulate cortex and time-sensitive connections with the ventromedial and lateral prefrontal cortex, while connections involving the posterior parietal cortex appeared both magnitude- and time-sensitive. The ventromedial prefrontal cortex and posterior parietal cortex could act as "comparator" regions, weighing the value of small rewards against large rewards across various delay durations to aid in decision-making.
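The abstract above does not state a functional form, but delay discounting is conventionally modeled as a hyperbolic decay of subjective value with delay. A minimal sketch of that standard form (the discount rate `k` and the reward amounts are illustrative, not taken from this study):

```python
def hyperbolic_value(amount, delay, k=0.05):
    """Subjective value of a delayed reward: V = A / (1 + k * D).

    The discount rate k controls how quickly value decays with delay:
    higher k means steeper discounting of delayed rewards.
    """
    return amount / (1.0 + k * delay)

# With k = 0.05, a reward of 20 received now has the same subjective
# value as a reward of 50 delayed by 30 time units: 50 / (1 + 1.5) = 20.
small_now = hyperbolic_value(20, 0)
large_later = hyperbolic_value(50, 30)
```

A decision-maker's `k` is typically estimated from a series of such small-now versus large-later choices, which is the kind of outside-scanner paradigm the study describes.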
Affiliation(s)
- Kavinash Loganathan
- Centre for Intelligent Signal & Imaging Research, Universiti Teknologi PETRONAS, Perak, Malaysia.
- Jinglei Lv
- Sydney Imaging & School of Biomedical Engineering, The University of Sydney, Sydney, Australia
- Melbourne Neuropsychiatry Centre, Department of Psychiatry, University of Melbourne & Melbourne Health, Melbourne, Australia
- Department of Biomedical Engineering, University of Melbourne, Melbourne, Australia
- Vanessa Cropley
- Melbourne Neuropsychiatry Centre, Department of Psychiatry, University of Melbourne & Melbourne Health, Melbourne, Australia
- Eric Tatt Wei Ho
- Centre for Intelligent Signal & Imaging Research, Universiti Teknologi PETRONAS, Perak, Malaysia
- Department of Electrical & Electronics Engineering, Universiti Teknologi PETRONAS, Perak, Malaysia
- Andrew Zalesky
- Melbourne Neuropsychiatry Centre, Department of Psychiatry, University of Melbourne & Melbourne Health, Melbourne, Australia
- Department of Biomedical Engineering, University of Melbourne, Melbourne, Australia
|
79
|
Abstract
Complex behaviors are often driven by an internal model, which integrates sensory information over time and facilitates long-term planning to reach subjective goals. A fundamental challenge in neuroscience is how to use behavior and neural activity to understand this internal model and its dynamic latent variables. Here we interpret behavioral data by assuming an agent behaves rationally—that is, it takes actions that optimize its subjective reward according to its understanding of the task and its relevant causal variables. We apply a method, inverse rational control (IRC), to learn an agent's internal model and reward function by maximizing the likelihood of its measured sensory observations and actions. This extracts rational and interpretable thoughts of the agent from its behavior. We also provide a framework for interpreting encoding, recoding, and decoding of neural data in light of this rational model for behavior. When applied to behavioral and neural data from simulated agents performing suboptimally on a naturalistic foraging task, this method successfully recovers their internal model and reward function, as well as the Markovian computational dynamics within the neural manifold that represent the task. This work lays a foundation for discovering how the brain represents and computes with dynamic latent variables.
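IRC as described here learns a full internal model and reward function; its core move, choosing the model parameters that maximize the likelihood of the agent's observed actions, can be illustrated with a deliberately reduced sketch. The one-parameter softmax decision rule, the simulated data, and all names below are illustrative, not the paper's actual model:

```python
import math
import random

def choice_prob(value_diff, beta):
    """Softmax probability of choosing option A given its value advantage."""
    return 1.0 / (1.0 + math.exp(-beta * value_diff))

def log_likelihood(choices, value_diffs, beta):
    """Log-likelihood of the observed choices under decision parameter beta."""
    ll = 0.0
    for c, d in zip(choices, value_diffs):
        p = choice_prob(d, beta)
        ll += math.log(p if c == 1 else 1.0 - p)
    return ll

# Simulate an agent with a hidden decision parameter, then recover it
# from behavior alone by grid-search maximum likelihood.
random.seed(0)
true_beta = 2.0
diffs = [random.uniform(-1, 1) for _ in range(2000)]
choices = [1 if random.random() < choice_prob(d, true_beta) else 0 for d in diffs]

betas = [b / 10 for b in range(1, 51)]
best = max(betas, key=lambda b: log_likelihood(choices, diffs, b))
```

With enough trials, `best` lands close to the generative `true_beta`, which is the sense in which rational-agent fitting "extracts interpretable thoughts" from behavior.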
Affiliation(s)
- Zhengwei Wu
- Department of Neuroscience, Baylor College of Medicine, Houston, TX 77030
- Department of Electrical and Computer Engineering, Rice University, Houston, TX 77030
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX 77030
- Minhae Kwon
- Department of Neuroscience, Baylor College of Medicine, Houston, TX 77030
- Department of Electrical and Computer Engineering, Rice University, Houston, TX 77030
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX 77030
- Saurabh Daptardar
- Department of Electrical and Computer Engineering, Rice University, Houston, TX 77030
- Geo Core Data, Atomic Maps, Google, Mountain View, CA 94043
- Paul Schrater
- Department of Psychology, University of Minnesota, Minneapolis, MN 55455
- Department of Computer Science, University of Minnesota, Minneapolis, MN 55455
- Xaq Pitkow
- Department of Neuroscience, Baylor College of Medicine, Houston, TX 77030
- Department of Electrical and Computer Engineering, Rice University, Houston, TX 77030
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, TX 77030
|
80
|
Soltani A, Rakhshan M, Schafer RJ, Burrows BE, Moore T. Separable Influences of Reward on Visual Processing and Choice. J Cogn Neurosci 2020; 33:248-262. [PMID: 33166195 DOI: 10.1162/jocn_a_01647] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/04/2022]
Abstract
Primate vision is characterized by constant, sequential processing and selection of visual targets to fixate. Although expected reward is known to influence both processing and selection of visual targets, similarities and differences between these effects remain unclear mainly because they have been measured in separate tasks. Using a novel paradigm, we simultaneously measured the effects of reward outcomes and expected reward on target selection and sensitivity to visual motion in monkeys. Monkeys freely chose between two visual targets and received a juice reward with varying probability for eye movements made to either of them. Targets were stationary apertures of drifting gratings, causing the end points of eye movements to these targets to be systematically biased in the direction of motion. We used this motion-induced bias as a measure of sensitivity to visual motion on each trial. We then performed different analyses to explore effects of objective and subjective reward values on choice and sensitivity to visual motion to find similarities and differences between reward effects on these two processes. Specifically, we used different reinforcement learning models to fit choice behavior and estimate subjective reward values based on the integration of reward outcomes over multiple trials. Moreover, to compare the effects of subjective reward value on choice and sensitivity to motion directly, we considered correlations between each of these variables and integrated reward outcomes on a wide range of timescales. We found that, in addition to choice, sensitivity to visual motion was also influenced by subjective reward value, although the motion was irrelevant for receiving reward. Unlike choice, however, sensitivity to visual motion was not affected by objective measures of reward value. Moreover, choice was determined by the difference in subjective reward values of the two options, whereas sensitivity to motion was influenced by the sum of values. Finally, models that best predicted visual processing and choice used sets of estimated reward values based on different types of reward integration and timescales. Together, our results demonstrate separable influences of reward on visual processing and choice, and point to the presence of multiple brain circuits for the integration of reward outcomes.
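The reinforcement-learning models referred to above typically update a subjective value estimate toward each reward outcome with a learning rate that sets the integration timescale. A generic delta-rule sketch of that idea (not the authors' exact models; values and learning rates are illustrative):

```python
def update_value(v, reward, alpha):
    """Delta-rule update: v <- v + alpha * (reward - v).

    A small alpha integrates outcomes over a long timescale (slow, stable
    estimate); a large alpha weights recent outcomes heavily (fast, labile).
    """
    return v + alpha * (reward - v)

outcomes = [1, 1, 0, 1, 0, 0, 1]  # reward (1) / no reward (0) on each trial
v_fast, v_slow = 0.5, 0.5
for r in outcomes:
    v_fast = update_value(v_fast, r, alpha=0.7)
    v_slow = update_value(v_slow, r, alpha=0.1)
# v_fast tracks the last few outcomes; v_slow approximates the long-run rate.
```

Fitting several such models with different `alpha` values, and asking which best predicts each behavioral measure, is one standard way to compare the reward-integration timescales underlying choice versus sensory sensitivity.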
|
81
|
Ottenheimer DJ, Wang K, Tong X, Fraser KM, Richard JM, Janak PH. Reward activity in ventral pallidum tracks satiety-sensitive preference and drives choice behavior. Sci Adv 2020; 6:eabc9321. [PMID: 33148649 PMCID: PMC7673692 DOI: 10.1126/sciadv.abc9321] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/21/2020] [Accepted: 09/18/2020] [Indexed: 06/11/2023]
Abstract
A key function of the nervous system is producing adaptive behavior across changing conditions, like physiological state. Although states like thirst and hunger are known to impact decision-making, the neurobiology of this phenomenon has been studied minimally. Here, we tracked evolving preference for sucrose and water as rats proceeded from a thirsty to sated state. As rats shifted from water choices to sucrose choices across the session, the activity of a majority of neurons in the ventral pallidum, a region crucial for reward-related behaviors, closely matched the evolving behavioral preference. The timing of this signal followed the pattern of a reward prediction error, occurring at the cue or the reward depending on when reward identity was revealed. Additionally, optogenetic stimulation of ventral pallidum neurons at the time of reward was able to reverse behavioral preference. Our results suggest that ventral pallidum neurons guide reward-related decisions across changing physiological states.
Affiliation(s)
- David J Ottenheimer
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Karen Wang
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Xiao Tong
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Kurt M Fraser
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Jocelyn M Richard
- Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA
- Patricia H Janak
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins University, Baltimore, MD, USA
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
|
82
|
Olivers CN, Roelfsema PR. Attention for action in visual working memory. Cortex 2020; 131:179-194. [DOI: 10.1016/j.cortex.2020.07.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/16/2020] [Revised: 06/22/2020] [Accepted: 07/14/2020] [Indexed: 12/27/2022]
|
83
|
Zhou Y, Liu Y, Zhang M. Neuronal Correlates of Many-To-One Sensorimotor Mapping in Lateral Intraparietal Cortex. Cereb Cortex 2020; 30:5583-5596. [PMID: 32488241 DOI: 10.1093/cercor/bhaa145] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/09/2019] [Revised: 04/01/2020] [Accepted: 05/09/2020] [Indexed: 11/14/2022] Open
Abstract
Efficiently mapping sensory stimuli onto motor programs is crucial for rapidly choosing appropriate behavioral responses. While neuronal mechanisms underlying simple, one-to-one sensorimotor mapping have been extensively studied, how the brain achieves complex, many-to-one sensorimotor mapping remains unclear. Here, we recorded single neuron activity from the lateral intraparietal (LIP) cortex of monkeys trained to map multiple spatial positions of a visual cue onto two opposite saccades. We found that LIP neurons' activity was consistent with directly mapping multiple cue positions to the associated saccadic direction (SDir), regardless of whether the visual cue appeared inside or outside neurons' receptive fields. Unlike the explicit encoding of visual categories, such cue-target mapping (CTM)-related activity covaried with the associated SDirs. Furthermore, the CTM was preferentially mediated by visual neurons identified by memory-guided saccade. These results indicate that LIP plays a crucial role in the early stage of many-to-one sensorimotor transformation.
Affiliation(s)
- Yang Zhou
- State Key Laboratory of Cognitive Neuroscience and Learning; IDG/McGovern Institute for Brain Research at BNU; Division of Psychology, Beijing Normal University, Beijing 100875, China
- Institute of Neuroscience, State Key Laboratory for Neuroscience, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Department of Neurobiology, The University of Chicago, Chicago, IL 60637, USA
- Yining Liu
- The First Affiliated Hospital of Zhengzhou University, Henan 450052, China
- Mingsha Zhang
- State Key Laboratory of Cognitive Neuroscience and Learning; IDG/McGovern Institute for Brain Research at BNU; Division of Psychology, Beijing Normal University, Beijing 100875, China
|
84
|
Kim AJ, Anderson BA. The effect of concurrent reward on aversive information processing in the brain. Neuroimage 2020; 217:116890. [PMID: 32360930 PMCID: PMC7474551 DOI: 10.1016/j.neuroimage.2020.116890] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/10/2020] [Revised: 04/22/2020] [Accepted: 04/24/2020] [Indexed: 11/28/2022] Open
Abstract
Neural networks for the processing of appetitive and aversive information, in isolation, have been well characterized. However, how the brain integrates competing signals associated with simultaneous appetitive and aversive information is less clear. In particular, it is unknown how the presence of concurrent reward modulates the processing of an aversive event throughout the brain. Here, we utilized a four-armed bandit task in an fMRI study to measure the representation of an aversive electric shock with and without the simultaneous receipt of monetary reward. Using a region of interest (ROI) approach, we first identified regions activated by the experience of aversive electric shock, and then measured how this shock-related activation is modulated by concurrent reward using independent data. Informed by prior literature and our own preliminary data, analyses focused on the dorsolateral prefrontal cortex, anterior and posterior insula, anterior cingulate cortex, and the thalamus and somatosensory cortex. We hypothesized that the neural response to punishment in these ROIs would be attenuated by the presence of concurrent reward. However, we found no evidence of concurrent reward attenuating the neural response to punishment in any ROI and also no evidence of concurrent punishment attenuating the neural response to reward in exploratory analyses. Altogether, our findings are consistent with the idea that neural networks responsible for the processing of reward and punishment signals are largely independent of one another, and that representations of overall value or utility are arrived at through the integration of separate reward and punishment signals at later stages of information processing.
Affiliation(s)
- Andy J Kim
- Texas A&M University, Department of Psychological & Brain Sciences, Texas A&M Institute for Neuroscience, 4235 TAMU College Station, TX, 77843-4235, USA.
- Brian A Anderson
- Texas A&M University, Department of Psychological & Brain Sciences, Texas A&M Institute for Neuroscience, 4235 TAMU College Station, TX, 77843-4235, USA
|
85
|
Takagaki K, Krug K. The effects of reward and social context on visual processing for perceptual decision-making. Curr Opin Physiol 2020. [DOI: 10.1016/j.cophys.2020.08.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
86
|
Kuo CC, Hsieh JC, Tsai HC, Kuo YS, Yau HJ, Chen CC, Chen RF, Yang HW, Min MY. Inhibitory interneurons regulate phasic activity of noradrenergic neurons in the mouse locus coeruleus and functional implications. J Physiol 2020; 598:4003-4029. [PMID: 32598024 DOI: 10.1113/jp279557] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/10/2020] [Accepted: 06/25/2020] [Indexed: 01/16/2023] Open
Abstract
KEY POINTS: The locus coeruleus (LC) contains noradrenergic (NA) neurons that respond to novel stimuli in the environment with phasic activation to initiate an orienting response; phasic LC activation is also triggered by stimuli representing the outcome of task-related decision processes, to facilitate ensuing behaviours and help optimize task performance. Here, we report that LC-NA neurons exhibit bursts of action potentials in vitro resembling phasic LC activation in vivo, and that this activity is gated by inhibitory interneurons (I-INs) located in the peri-LC. We also observe that inhibition of peri-LC I-INs enhances prepulse inhibition, and that axons from cortical areas that play important roles in evaluating the cost/reward of a stimulus synapse on both peri-LC I-INs and LC-NA neurons. The results help us understand the cellular mechanisms underlying the generation and regulation of phasic LC activation, with a focus on the role of peri-LC I-INs.
ABSTRACT: Noradrenergic (NA) neurons in the locus coeruleus (LC) have global axonal projections to the brain. These neurons discharge action potentials phasically in response either to novel stimuli in the environment, to initiate an orienting behaviour, or to stimuli representing the outcome of task-related decision processes, to facilitate ensuing behaviours and help optimize task performance. Nevertheless, the cellular mechanisms underlying the generation and regulation of phasic LC activation remain unknown. We report here that LC-NA neurons recorded in brain slices exhibit bursts of action potentials that resembled the phasic activation-pause profile observed in animals. This activity was referred to as phasic-like activity (PLA) and was suppressed and enhanced by blocking excitatory and inhibitory synaptic transmission, respectively. These results suggest the existence of a local circuit to drive PLA, and the activity could be regulated by the excitatory-inhibitory balance of the circuit. In support of this notion, we located a population of inhibitory interneurons (I-INs) in the medial part of the peri-LC that exerted feedforward inhibition of LC-NA neurons through GABAergic and glycinergic transmission. Selective inhibition of peri-LC I-INs with chemogenetic methods could enhance PLA in brain slices and increase prepulse inhibition in animals. Moreover, axons from the orbitofrontal and prelimbic cortices, which play important roles in evaluating the cost/reward of a stimulus, synapse on both peri-LC I-INs and LC-NA neurons. These observations demonstrate functional roles of peri-LC I-INs in integrating inputs of the frontal cortex onto LC-NA neurons and gating the phasic LC output.
Affiliation(s)
- Chao-Cheng Kuo
- Department of Life Science, College of Life Science, National Taiwan University, Taipei, 10617, Taiwan
- Jung-Chien Hsieh
- Department of Life Science, College of Life Science, National Taiwan University, Taipei, 10617, Taiwan
- Hsing-Chun Tsai
- Department of Life Science, College of Life Science, National Taiwan University, Taipei, 10617, Taiwan
- Yu-Shan Kuo
- Department of Life Science, College of Life Science, National Taiwan University, Taipei, 10617, Taiwan
- Departments of Biomedical Sciences and Medical Research, Chung-Shan Medical University and Chung-Shan Medical University Hospital, Taichung, 40201, Taiwan
- Hau-Jie Yau
- Graduate Institute of Brain and Mind Sciences, College of Medicine, National Taiwan University, Taipei, 10051, Taiwan
- Chih-Cheng Chen
- Institute of Biomedical Science, Academia Sinica, Taipei, 11529, Taiwan
- Ruei-Feng Chen
- Department of Life Science, College of Life Science, National Taiwan University, Taipei, 10617, Taiwan
- Hsiu-Wen Yang
- Departments of Biomedical Sciences and Medical Research, Chung-Shan Medical University and Chung-Shan Medical University Hospital, Taichung, 40201, Taiwan
- Ming-Yuan Min
- Department of Life Science, College of Life Science, National Taiwan University, Taipei, 10617, Taiwan
|
87
|
Treviño M, Medina-Coss y León R, Haro B. Adaptive Choice Biases in Mice and Humans. Front Behav Neurosci 2020; 14:99. [PMID: 32760255 PMCID: PMC7372118 DOI: 10.3389/fnbeh.2020.00099] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/24/2020] [Accepted: 05/22/2020] [Indexed: 12/31/2022] Open
Abstract
The contribution of non-sensory information processing to perceptual decision making is not fully understood. Choice biases have been described in mice and humans and are highly prevalent even when they decrease rewarding outcomes. Choice biases are usually reduced by discriminability because stimulus strength directly enables the adjustments in the decision strategies used by decision-makers. However, choice biases could also derive from functional asymmetries in sensory processing, decision making, or both. Here, we tested how particular experimental contingencies influenced the production of choice biases in mice and humans. Our main goal was to establish the tasks and methods to jointly characterize psychometric performance and innate side-choice behavior in mice and humans. We implemented forced and unforced visual tasks and found that both species displayed stable levels of side-choice biases, forming continuous distributions from low to high levels of choice stereotypy. Interestingly, stimulus discriminability reduced the side-choice biases in forced-choice, but not in free-choice, tasks. Choice biases were stable in appearance and intensity across experimental days and could be employed to identify mice and human participants. Additionally, side- and alternating choices could be reinforced for both mice and humans, implying that choice biases were adaptable to non-visual manipulations. Our results highlight the fact that internal and external elements can influence the production of choice biases. Adaptations of our tasks could become a helpful diagnostic tool to detect aberrant levels of choice variability.
Affiliation(s)
- Mario Treviño
- Laboratorio de Plasticidad Cortical y Aprendizaje Perceptual, Instituto de Neurociencias, Universidad de Guadalajara, Guadalajara, Mexico
|
88
|
Schonberg T, Katz LN. A Neural Pathway for Nonreinforced Preference Change. Trends Cogn Sci 2020; 24:504-514. [DOI: 10.1016/j.tics.2020.04.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/04/2020] [Revised: 04/16/2020] [Accepted: 04/16/2020] [Indexed: 01/12/2023]
|
89
|
The impact of learning on perceptual decisions and its implication for speed-accuracy tradeoffs. Nat Commun 2020; 11:2757. [PMID: 32488065 PMCID: PMC7265464 DOI: 10.1038/s41467-020-16196-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/09/2019] [Accepted: 04/01/2020] [Indexed: 11/16/2022] Open
Abstract
In standard models of perceptual decision-making, noisy sensory evidence is considered to be the primary source of choice errors, and the accumulation of evidence needed to overcome this noise gives rise to speed-accuracy tradeoffs. Here, we investigated how the history of recent choices and their outcomes interacts with these processes using a combination of theory and experiment. We found that the speed and accuracy of performance of rats on olfactory decision tasks could be best explained by a Bayesian model that combines reinforcement-based learning with accumulation of uncertain sensory evidence. This model predicted the specific pattern of trial history effects that were found in the data. The results suggest that learning is a critical factor contributing to speed-accuracy tradeoffs in decision-making, and that trial history effects are not simply biases but rather the signatures of an optimal learning strategy.
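The accumulation-to-bound framework this abstract builds on can be illustrated with a generic drift-diffusion simulation. This is a textbook sketch of the speed-accuracy tradeoff, not the authors' Bayesian learning model, and all parameter values are illustrative:

```python
import random

def accumulate(drift, bound, noise=1.0, dt=0.01):
    """Accumulate noisy evidence until +/-bound is crossed.

    Returns (correct, decision_time): the sign of the crossed bound
    relative to the (positive) drift, and the time taken to reach it.
    """
    x, t = 0.0, 0.0
    while abs(x) < bound:
        x += drift * dt + random.gauss(0.0, noise * dt ** 0.5)
        t += dt
    return x > 0, t

random.seed(1)
results = {}
for bound in (0.5, 2.0):
    trials = [accumulate(drift=1.0, bound=bound) for _ in range(500)]
    accuracy = sum(c for c, _ in trials) / len(trials)
    mean_rt = sum(t for _, t in trials) / len(trials)
    results[bound] = (accuracy, mean_rt)
# Raising the bound slows decisions but makes them more accurate:
# the speed-accuracy tradeoff that learning can modulate.
```

In the paper's account, trial history reshapes this process via learning rather than acting as a simple bias on the starting point or bound.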
|
90
|
Inference-Based Decisions in a Hidden State Foraging Task: Differential Contributions of Prefrontal Cortical Areas. Neuron 2020; 106:166-176.e6. [PMID: 32048995 PMCID: PMC7146546 DOI: 10.1016/j.neuron.2020.01.017] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 06/14/2019] [Revised: 10/24/2019] [Accepted: 01/14/2020] [Indexed: 11/23/2022]
Abstract
Essential features of the world are often hidden and must be inferred by constructing internal models based on indirect evidence. Here, to study the mechanisms of inference, we establish a foraging task that is naturalistic and easily learned yet can distinguish inference from simpler strategies such as the direct integration of sensory data. We show that both mice and humans learn a strategy consistent with optimal inference of a hidden state. However, humans acquire this strategy more than an order of magnitude faster than mice. Using optogenetics in mice, we show that orbitofrontal and anterior cingulate cortex inactivation impacts task performance, but only orbitofrontal inactivation reverts mice from an inference-based to a stimulus-bound decision strategy. These results establish a cross-species paradigm for studying the problem of inference-based decision making and begin to dissect the network of brain regions crucial for its performance.
|
91
|
van den Berg B, Geib BR, San Martín R, Woldorff MG. A key role for stimulus-specific updating of the sensory cortices in the learning of stimulus-reward associations. Soc Cogn Affect Neurosci 2020; 14:173-187. [PMID: 30576533 PMCID: PMC6374612 DOI: 10.1093/scan/nsy116] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/12/2018] [Revised: 10/19/2018] [Accepted: 10/31/2018] [Indexed: 11/12/2022] Open
Abstract
Successful adaptive behavior requires the learning of associations between stimulus-specific choices and rewarding outcomes. Most research on the mechanisms underlying such processes has focused on subcortical reward-processing regions, in conjunction with frontal circuits. Given the extensive stimulus-specific coding in the sensory cortices, we hypothesized they would play a key role in the learning of stimulus-specific reward associations. We recorded electrical brain activity (using electroencephalogram) during a learning-based decision-making gambling task where, on each trial, participants chose between a face and a house and then received feedback (gain or loss). Within each 20-trial set, either faces or houses were more likely to predict a gain. Results showed that early feedback processing (~200-1200 ms) was independent of the choice made. In contrast, later feedback processing (~1400-1800 ms) was stimulus-specific, reflected by decreased alpha power (reflecting increased cortical activity) over face-selective regions, for winning-vs-losing after a face choice but not after a house choice. Finally, as the reward association was learned in a set, there was an increasingly stronger attentional bias towards the more likely winning stimulus, reflected by increasing attentional orienting-related brain activity and increasing likelihood of choosing that stimulus. These results delineate the processes underlying the updating of stimulus-reward associations during feedback-guided learning, which then guide future attentional allocation and decision-making.
Affiliation(s)
- Berry van den Berg
- Center for Cognitive Neuroscience, Duke University, Durham, NC, United States; Department of Experimental Psychology, Faculty of Behavioural and Social Sciences, University of Groningen, Groningen, The Netherlands; Department of Social Psychology, Faculty of Behavioural and Social Sciences, University of Groningen, Groningen, The Netherlands
- Benjamin R Geib
- Center for Cognitive Neuroscience, Duke University, Durham, NC, United States
- Rene San Martín
- Centro de Neuroeconomia, Universidad Diego Portales, Santiago, Chile
- Marty G Woldorff
- Center for Cognitive Neuroscience, Duke University, Durham, NC, United States; Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, United States
92
Wisniewski D, Forstmann B, Brass M. Outcome contingency selectively affects the neural coding of outcomes but not of tasks. Sci Rep 2019; 9:19395. [PMID: 31852993 PMCID: PMC6920387 DOI: 10.1038/s41598-019-55887-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/25/2019] [Accepted: 11/29/2019] [Indexed: 01/07/2023]
Abstract
Value-based decision-making is ubiquitous in everyday life and critically depends on the contingency between choices and their outcomes. Only if outcomes are contingent on our choices can we make meaningful value-based decisions. Here, we investigate the effect of outcome contingency on the neural coding of rewards and tasks. Participants performed a reversal-learning paradigm in which reward outcomes were contingent on trial-by-trial choices, and a 'free choice' paradigm in which rewards were random and not contingent on choices. We hypothesized that contingent outcomes enhance the neural coding of rewards and tasks, which we tested using multivariate pattern analysis of fMRI data. Reward outcomes were encoded in a large network including the striatum, dmPFC and parietal cortex, and these representations were indeed amplified for contingent rewards. Tasks were encoded in the dmPFC at the time of decision-making, and in parietal cortex in a subsequent maintenance phase. We found no evidence for contingency-dependent modulations of task signals, demonstrating highly similar coding across contingency conditions. Our findings suggest selective effects of contingency on reward coding only, and further highlight the role of dmPFC and parietal cortex in value-based decision-making, as these were the only regions strongly involved in both reward and task coding.
Affiliation(s)
- David Wisniewski
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Birte Forstmann
- Integrative Model-Based Cognitive Neuroscience Research Unit, University of Amsterdam, Amsterdam, The Netherlands
- Marcel Brass
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
93
Medendorp WP, Heed T. State estimation in posterior parietal cortex: Distinct poles of environmental and bodily states. Prog Neurobiol 2019; 183:101691. [DOI: 10.1016/j.pneurobio.2019.101691] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 04/20/2019] [Revised: 08/12/2019] [Accepted: 08/29/2019] [Indexed: 01/06/2023]
94
Suriya-Arunroj L, Gail A. Complementary encoding of priors in monkey frontoparietal network supports a dual process of decision-making. eLife 2019; 8:47581. [PMID: 31612855 PMCID: PMC6794075 DOI: 10.7554/elife.47581] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/10/2019] [Accepted: 10/03/2019] [Indexed: 11/13/2022]
Abstract
Prior expectations of movement instructions can promote preliminary action planning and influence choices. We investigated how action priors affect action-goal encoding in premotor and parietal cortices, and whether they bias subsequent free choices. Monkeys planned reaches according to visual cues that indicated the relative probabilities of two possible goals. On instructed trials, the reach goal was determined by a secondary cue respecting these probabilities. On rarely interspersed free-choice trials without instruction, both goals offered equal reward. Action priors induced graded free-choice biases and graded frontoparietal motor-goal activity, complementarily in two subclasses of neurons. Down-regulating neurons co-encoded both possible goals and decreased opposite-to-preferred responses with decreasing prior, possibly supporting a process of choice by elimination. Up-regulating neurons showed increased preferred-direction responses with increasing prior, likely supporting a process of computing net likelihood. Action-selection signals emerged earliest in down-regulating neurons of premotor cortex, arguing for an initiation of selection in the frontal lobe.
Affiliation(s)
- Lalitta Suriya-Arunroj
- Sensorimotor Group, German Primate Center - Leibniz Institute for Primate Research, Göttingen, Germany
- Alexander Gail
- Sensorimotor Group, German Primate Center - Leibniz Institute for Primate Research, Göttingen, Germany; University of Göttingen, Göttingen, Germany; Leibniz Science Campus Primate Cognition, Göttingen, Germany; Bernstein Center for Computational Neuroscience, Göttingen, Germany
95
Zhou Y, Liu Y, Wu S, Zhang M. Neuronal Representation of the Saccadic Timing Signals in Macaque Lateral Intraparietal Area. Cereb Cortex 2019; 28:2887-2900. [PMID: 28968649 DOI: 10.1093/cercor/bhx166] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/22/2016] [Accepted: 06/15/2017] [Indexed: 11/13/2022]
Abstract
Primates frequently make saccades to direct the fovea onto objects of interest and obtain acute visual information. However, each saccade displaces the image on the retina and disrupts visual constancy. One possible mechanism for retaining visual constancy is to integrate presaccadic and postsaccadic visual information right at the time of the saccade, which makes the timing of the saccade crucial. So far, saccadic timing signals have been found only in subcortical regions, for example the cerebellum and superior colliculus, but not in the neocortex. Here we report two types of saccadic timing signals in the macaque lateral intraparietal area (LIP). First, many presaccadic response neurons began to reduce their activity either right around the start (saccade-on-decay) or the end (saccade-off-decay) of saccades. Notably, the time difference between saccade-off-decay and saccade-on-decay was highly correlated with the mean duration of saccades but not with individual saccade durations, and both saccade-off-decay and saccade-on-decay were better aligned with saccade end than with saccade start, reflecting prediction. Second, the peak activity plateau of a group of postsaccadic response neurons was highly correlated with the actual duration of the saccade, reflecting reality. While the predicted timing signals might facilitate the integration of visual information across saccades in LIP, the actual duration signals might calibrate the prediction errors.
Affiliation(s)
- Yang Zhou
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; Institute of Neuroscience, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, and University of Chinese Academy of Sciences, Shanghai, China; Department of Neurobiology, The University of Chicago, Chicago, IL, USA
- Yining Liu
- The First Affiliated Hospital of Zhengzhou University, Henan, China
- Si Wu
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Mingsha Zhang
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
96
Reitich-Stolero T, Aberg KC, Paz R. Re-exploring Mechanisms of Exploration. Neuron 2019; 103:360-363. [PMID: 31394060 DOI: 10.1016/j.neuron.2019.07.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Deciding when to exploit what is already known and when to explore new possibilities is crucial for adapting to novel and dynamic environments. Using reinforcement-based decision-making, Costa et al. (2019), in this issue of Neuron, find that neurons in the amygdala and ventral striatum differentially signal the benefit of exploring new options versus exploiting familiar ones.
Affiliation(s)
- Kristoffer C Aberg
- Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
- Rony Paz
- Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
97
Kourtzi Z, Welchman AE. Learning predictive structure without a teacher: decision strategies and brain routes. Curr Opin Neurobiol 2019; 58:130-134. [PMID: 31569060 DOI: 10.1016/j.conb.2019.09.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/11/2019] [Revised: 09/03/2019] [Accepted: 09/12/2019] [Indexed: 11/17/2022]
Abstract
Extracting the structure of complex environments is at the core of our ability to interpret the present and predict the future. This skill is important for a range of behaviours, from navigating a new city to learning music and language. Classical approaches to investigating our ability to extract the principles of organisation that govern complex environments focus on reward-based learning. Yet the human brain has been shown to be expert at learning generative structure from mere exposure, without explicit reward. Individuals adapt to changes in the environment's temporal statistics, unbeknownst to them, and predict future events. Further, we present evidence for a common brain architecture for unsupervised structure learning and reward-based learning, suggesting that the brain is built on the premise that 'learning is its own reward' to support adaptive behaviour.
Affiliation(s)
- Zoe Kourtzi
- Department of Psychology, University of Cambridge, Cambridge, UK
98
Costa VD, Mitz AR, Averbeck BB. Subcortical Substrates of Explore-Exploit Decisions in Primates. Neuron 2019; 103:533-545.e5. [PMID: 31196672 PMCID: PMC6687547 DOI: 10.1016/j.neuron.2019.05.017] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 10/08/2018] [Revised: 03/27/2019] [Accepted: 05/08/2019] [Indexed: 01/06/2023]
Abstract
The explore-exploit dilemma refers to the challenge of deciding when to forego immediate rewards and explore new opportunities that could lead to greater rewards in the future. While motivational neural circuits facilitate learning based on past choices and outcomes, it is unclear whether they also support computations relevant for deciding when to explore. We recorded neural activity in the amygdala and ventral striatum of rhesus macaques as they solved a task that required them to balance novelty-driven exploration with exploitation of what they had already learned. Using a partially observable Markov decision process (POMDP) model to quantify explore-exploit trade-offs, we identified that the ventral striatum and amygdala differ in how they represent the immediate value of exploitative choices and the future value of exploratory choices. These findings show that subcortical motivational circuits are important in guiding explore-exploit decisions.
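The distinction the abstract draws between the immediate value of exploiting and the future value of exploring can be illustrated with a toy Bayesian bandit. This is a simplified sketch, not the authors' POMDP: the Beta-Bernoulli posteriors, the one-step lookahead, the horizon length, and the 0.5 fallback value are all illustrative assumptions.

```python
def exploit_value(alpha, beta):
    # Immediate expected reward of an option under a Beta(alpha, beta) posterior.
    return alpha / (alpha + beta)

def explore_value(alpha, beta, horizon=10, fallback=0.5):
    """Value of sampling an option once, including what the sample teaches.

    After one hypothetical sample, the agent keeps the option for the
    remaining horizon only if its updated posterior mean beats an assumed
    fallback option; otherwise it switches back to the fallback.
    """
    p = exploit_value(alpha, beta)
    v_if_win = max(exploit_value(alpha + 1, beta), fallback)
    v_if_loss = max(exploit_value(alpha, beta + 1), fallback)
    future_per_trial = p * v_if_win + (1 - p) * v_if_loss
    # Exploration bonus: how much the better-informed future beats the fallback.
    return p + horizon * (future_per_trial - fallback)

# A novel option (flat Beta(1,1) prior) is worth more to sample than a
# familiar option with the same mean (Beta(5,5)), because a single outcome
# moves the novel option's posterior much further.
novel = explore_value(1, 1)
familiar = explore_value(5, 5)
```

Under this sketch, `novel` exceeds `familiar` even though both options have the same immediate expected reward, which is one way to formalize the novelty-driven exploration the task rewards.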
Affiliation(s)
- Vincent D Costa
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892, USA; Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR 97239, USA; Division of Neuroscience, Oregon National Primate Research Center, Beaverton, OR 97006, USA
- Andrew R Mitz
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892, USA
- Bruno B Averbeck
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892, USA
99
Grabenhorst F, Tsutsui KI, Kobayashi S, Schultz W. Primate prefrontal neurons signal economic risk derived from the statistics of recent reward experience. eLife 2019; 8:e44838. [PMID: 31343407 PMCID: PMC6658165 DOI: 10.7554/elife.44838] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/07/2019] [Accepted: 07/12/2019] [Indexed: 01/28/2023]
Abstract
Risk derives from the variation of rewards and governs economic decisions, yet how the brain calculates risk from the frequency of experienced events, rather than from explicit risk-descriptive cues, remains unclear. Here, we investigated whether neurons in dorsolateral prefrontal cortex process risk derived from reward experience. Monkeys performed a probabilistic choice task in which the statistical variance of experienced rewards evolved continually. During these choices, prefrontal neurons signaled the reward variance associated with specific objects ('object risk') or actions ('action risk'). Crucially, risk was not derived from explicit, risk-descriptive cues but calculated internally from the variance of recently experienced rewards. Support-vector-machine decoding demonstrated accurate neuronal risk discrimination. Within trials, neuronal signals transitioned from experienced reward to risk (risk updating) and from risk to upcoming choice (choice computation). Thus, prefrontal neurons encode the statistical variance of recently experienced rewards, complying with formal decision variables of object risk and action risk.
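The core computation this abstract describes, deriving risk from the variance of recently experienced rewards rather than from descriptive cues, can be sketched in a few lines. The window length and the choice of population variance are illustrative assumptions, not the authors' fitted model.

```python
import statistics

def experienced_risk(reward_history, window=10):
    """Estimate risk for an object or action as the variance of the most
    recent rewards obtained from it ('object risk' / 'action risk').
    """
    recent = list(reward_history)[-window:]
    if len(recent) < 2:
        return 0.0  # no variance estimate from fewer than two samples
    return statistics.pvariance(recent)

# A safe option pays the same amount every time; a risky option with the
# same mean pays 0 or 1 unpredictably, so its experienced risk is higher.
safe_rewards = [0.5] * 6
risky_rewards = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

Updating such an estimate after each reward, and then feeding it into the next choice, would mirror the within-trial transition from experienced reward to risk to upcoming choice that the study reports.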
Affiliation(s)
- Fabian Grabenhorst
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
- Ken-Ichiro Tsutsui
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
- Shunsuke Kobayashi
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
100
Freedman DJ, Ibos G. An Integrative Framework for Sensory, Motor, and Cognitive Functions of the Posterior Parietal Cortex. Neuron 2019; 97:1219-1234. [PMID: 29566792 DOI: 10.1016/j.neuron.2018.01.044] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 08/17/2017] [Revised: 01/12/2018] [Accepted: 01/23/2018] [Indexed: 11/28/2022]
Abstract
Throughout the history of modern neuroscience, the parietal cortex has been associated with a wide array of sensory, motor, and cognitive functions. The use of non-human primates as a model organism has been instrumental in our current understanding of how areas in the posterior parietal cortex (PPC) modulate our perception and influence our behavior. In this Perspective, we highlight a series of influential studies over the last five decades examining the role of the PPC in visual perception and motor planning. We also integrate long-standing views of PPC functions with more recent evidence to propose a more general model framework to explain integrative sensory, motor, and cognitive functions of the PPC.
Affiliation(s)
- David J Freedman
- Department of Neurobiology, The University of Chicago, Chicago, IL 60637, USA; Grossman Institute for Neuroscience, Quantitative Biology and Human Behavior, The University of Chicago, Chicago, IL 60637, USA
- Guilhem Ibos
- Department of Neurobiology, The University of Chicago, Chicago, IL 60637, USA; Institut de Neuroscience de la Timone, UMR 7289 CNRS & Aix-Marseille Université, Marseille, France