1
Zimmerman CA, Pan-Vazquez A, Wu B, Keppler EF, Guthman EM, Fetcho RN, Bolkan SS, McMannon B, Lee J, Hoag AT, Lynch LA, Janarthanan SR, López Luna JF, Bondy AG, Falkner AL, Wang SSH, Witten IB. A neural mechanism for learning from delayed postingestive feedback. bioRxiv 2024:2023.10.06.561214. [PMID: 37873112 PMCID: PMC10592633 DOI: 10.1101/2023.10.06.561214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Animals learn the value of foods based on their postingestive effects and thereby develop aversions to foods that are toxic [1-6] and preferences to those that are nutritious [7-14]. However, it remains unclear how the brain is able to assign credit to flavors experienced during a meal with postingestive feedback signals that can arise after a substantial delay. Here, we reveal an unexpected role for postingestive reactivation of neural flavor representations in this temporal credit assignment process. To begin, we leverage the fact that mice learn to associate novel [15-18], but not familiar, flavors with delayed gastric malaise signals to investigate how the brain represents flavors that support aversive postingestive learning. Surveying cellular resolution brainwide activation patterns reveals that a network of amygdala regions is unique in being preferentially activated by novel flavors across every stage of the learning process: the initial meal, delayed malaise, and memory retrieval. By combining high-density recordings in the amygdala with optogenetic stimulation of genetically defined hindbrain malaise cells, we find that postingestive malaise signals potently and specifically reactivate amygdalar novel flavor representations from a recent meal. The degree of malaise-driven reactivation of individual neurons predicts strengthening of flavor responses upon memory retrieval, leading to stabilization of the population-level representation of the recently consumed flavor. In contrast, meals without postingestive consequences degrade neural flavor representations as flavors become familiar and safe. Thus, our findings demonstrate that interoceptive reactivation of amygdalar flavor representations provides a neural mechanism to resolve the temporal credit assignment problem inherent to postingestive learning.
Affiliation(s)
- Bichan Wu
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Emma F Keppler
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Eartha Mae Guthman
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Robert N Fetcho
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Scott S Bolkan
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Brenna McMannon
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Junuk Lee
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Austin T Hoag
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Laura A Lynch
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Juan F López Luna
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Adrian G Bondy
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Annegret L Falkner
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Samuel S-H Wang
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
- Ilana B Witten
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
2
Parker NF, Baidya A, Cox J, Haetzel LM, Zhukovskaya A, Murugan M, Engelhard B, Goldman MS, Witten IB. Choice-selective sequences dominate in cortical relative to thalamic inputs to NAc to support reinforcement learning. Cell Rep 2022; 39:110756. [PMID: 35584665 PMCID: PMC9218875 DOI: 10.1016/j.celrep.2022.110756] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/28/2019] [Revised: 02/18/2022] [Accepted: 04/07/2022] [Indexed: 11/25/2022] Open
Abstract
How are actions linked with subsequent outcomes to guide choices? The nucleus accumbens, which is implicated in this process, receives glutamatergic inputs from the prelimbic cortex and midline regions of the thalamus. However, little is known about whether and how representations differ across these input pathways. By comparing these inputs during a reinforcement learning task in mice, we discovered that prelimbic cortical inputs preferentially represent actions and choices, whereas midline thalamic inputs preferentially represent cues. Choice-selective activity in the prelimbic cortical inputs is organized in sequences that persist beyond the outcome. Through computational modeling, we demonstrate that these sequences can support the neural implementation of reinforcement-learning algorithms, in both a circuit model based on synaptic plasticity and one based on neural dynamics. Finally, we test and confirm a prediction of our circuit models by direct manipulation of nucleus accumbens input neurons.
Affiliation(s)
- Nathan F Parker
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Avinash Baidya
- Center for Neuroscience, University of California, Davis, Davis, CA 95616, USA; Department of Physics and Astronomy, University of California, Davis, Davis, CA 95616, USA
- Julia Cox
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA; Department of Neuroscience, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Laura M Haetzel
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Anna Zhukovskaya
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Malavika Murugan
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Ben Engelhard
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
- Mark S Goldman
- Center for Neuroscience, University of California, Davis, Davis, CA 95616, USA; Department of Neurobiology, Physiology and Behavior, University of California, Davis, Davis, CA 95616, USA; Department of Ophthalmology and Vision Science, University of California, Davis, Davis, CA 95616, USA
- Ilana B Witten
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA; Department of Psychology, Princeton University, Princeton, NJ 08544, USA
3
Taghizadeh B, Foley NC, Karimimehr S, Cohanpour M, Semework M, Sheth SA, Lashgari R, Gottlieb J. Reward uncertainty asymmetrically affects information transmission within the monkey fronto-parietal network. Commun Biol 2020; 3:594. [PMID: 33087809 PMCID: PMC7578031 DOI: 10.1038/s42003-020-01320-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/29/2020] [Accepted: 09/25/2020] [Indexed: 01/02/2023] Open
Abstract
A central hypothesis in research on executive function is that controlled information processing is costly and is allocated according to the behavioral benefits it brings. However, while computational theories predict that the benefits of new information depend on prior uncertainty, the cellular effects of uncertainty on the executive network are incompletely understood. Using simultaneous recordings in monkeys, we describe several mechanisms by which the fronto-parietal network reacts to uncertainty. We show that the variance of expected rewards, independently of the value of the rewards, was encoded in single neuron and population spiking activity and local field potential (LFP) oscillations, and, importantly, asymmetrically affected fronto-parietal information transmission (measured through the coherence between spikes and LFPs). Higher uncertainty selectively enhanced information transmission from the parietal to the frontal lobe and suppressed it in the opposite direction, consistent with Bayesian principles that prioritize sensory information according to a decision maker’s prior uncertainty.
Bahareh Taghizadeh and Nicholas Foley et al. show that individual neuronal responses, population spiking activity, and local field potential oscillations encode the variance of expected rewards independent of their value. They also demonstrate that reward uncertainty asymmetrically affects neuronal transmission within the monkey fronto-parietal network.
Affiliation(s)
- Bahareh Taghizadeh
- Brain Engineering Research Center, Institute for Research in Fundamental Sciences, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
- Nicholas C Foley
- Department of Neuroscience, Columbia University, New York, NY, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Saeed Karimimehr
- Brain Engineering Research Center, Institute for Research in Fundamental Sciences, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
- Michael Cohanpour
- Department of Neuroscience, Columbia University, New York, NY, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Mulugeta Semework
- Department of Neuroscience, Columbia University, New York, NY, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Sameer A Sheth
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Reza Lashgari
- Brain Engineering Research Center, Institute for Research in Fundamental Sciences, Tehran, Iran; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Jacqueline Gottlieb
- Department of Neuroscience, Columbia University, New York, NY, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; The Kavli Institute for Brain Science, Columbia University, New York, NY, USA
5
Stolyarova A. Solving the Credit Assignment Problem With the Prefrontal Cortex. Front Neurosci 2018; 12:182. [PMID: 29636659 PMCID: PMC5881225 DOI: 10.3389/fnins.2018.00182] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/27/2017] [Accepted: 03/06/2018] [Indexed: 12/13/2022] Open
Abstract
In naturalistic multi-cue and multi-step learning tasks, where outcomes of behavior are delayed in time, discovering which choices are responsible for rewards can present a challenge, known as the credit assignment problem. In this review, I summarize recent work that highlighted a critical role for the prefrontal cortex (PFC) in assigning credit where it is due in tasks where only a few of the multitude of cues or choices are relevant to the final outcome of behavior. Collectively, these investigations have provided compelling support for specialized roles of the orbitofrontal (OFC), anterior cingulate (ACC), and dorsolateral prefrontal (dlPFC) cortices in contingent learning. However, recent work has similarly revealed shared contributions and emphasized rich and heterogeneous response properties of neurons in these brain regions. Such functional overlap is not surprising given the complexity of reciprocal projections spanning the PFC. In the concluding section, I overview the evidence suggesting that the OFC, ACC and dlPFC communicate extensively, sharing the information about presented options, executed decisions and received rewards, which enables them to assign credit for outcomes to choices on which they are contingent. This account suggests that lesion or inactivation/inhibition experiments targeting a localized PFC subregion will be insufficient to gain a fine-grained understanding of credit assignment during learning and instead poses refined questions for future research, shifting the focus from focal manipulations to experimental techniques targeting cortico-cortical projections.
Affiliation(s)
- Alexandra Stolyarova
- Department of Psychology, University of California, Los Angeles, Los Angeles, CA, United States
6
Abstract
To adapt successfully to our environments, we must use the outcomes of our choices to guide future behavior. Critically, we must be able to correctly assign credit for any particular outcome to the causal features which preceded it. In some cases, the causal features may be immediately evident, whereas in others they may be separated in time or intermingled with irrelevant environmental stimuli, creating a potentially nontrivial credit-assignment problem. We examined the neuronal representation of information relevant for credit assignment in the dorsolateral prefrontal cortex (dlPFC) of two male rhesus macaques performing a task that elicited key aspects of this problem. We found that neurons conveyed the information necessary for credit assignment. Specifically, neuronal activity reflected both the relevant cues and outcomes at the time of feedback and did so in a manner that was stable over time, in contrast to prior reports of representational instability in the dlPFC. Furthermore, these representations were most stable early in learning, when credit assignment was most needed. When the same features were not needed for credit assignment, these neuronal representations were much weaker or absent. These results demonstrate that the activity of dlPFC neurons conforms to the basic requirements of a system that performs credit assignment, and that spiking activity can serve as a stable mechanism that links causes and effects.
SIGNIFICANCE STATEMENT Credit assignment is the process by which we infer the causes of our successes and failures. We found that neuronal activity in the dorsolateral prefrontal cortex conveyed the necessary information for performing credit assignment. Importantly, while there are various potential mechanisms to retain a "trace" of the causal events over time, we observed that spiking activity was sufficiently stable to act as the link between causes and effects, in contrast to prior reports that suggested spiking representations were unstable over time. In addition, we observed that this stability varied as a function of learning, such that the neural code was more reliable over time during early learning, when it was most needed.
7
Jazayeri M, Shadlen MN. A Neural Mechanism for Sensing and Reproducing a Time Interval. Curr Biol 2015; 25:2599-609. [PMID: 26455307 DOI: 10.1016/j.cub.2015.08.038] [Citation(s) in RCA: 114] [Impact Index Per Article: 12.7] [Received: 07/15/2015] [Revised: 08/15/2015] [Accepted: 08/17/2015] [Indexed: 11/28/2022]
Abstract
Timing plays a crucial role in sensorimotor function. However, the neural mechanisms that enable the brain to flexibly measure and reproduce time intervals are not known. We recorded neural activity in parietal cortex of monkeys in a time reproduction task. Monkeys were trained to measure and immediately afterward reproduce different sample intervals. While measuring an interval, neural responses had a nonlinear profile that increased with the duration of the sample interval. Activity was reset during the transition from measurement to production and was followed by a ramping activity whose slope encoded the previously measured sample interval. We found that firing rates at the end of the measurement epoch were correlated with both the slope of the ramp and the monkey's corresponding production interval on a trial-by-trial basis. Analysis of response dynamics further linked the rate of change of firing rates in the measurement epoch to the slope of the ramp in the production epoch. These observations suggest that, during time reproduction, an interval is measured prospectively in relation to the desired motor plan to reproduce that interval.
Affiliation(s)
- Mehrdad Jazayeri
- Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Michael N Shadlen
- Department of Neuroscience, Zuckerman Mind Brain Behavior Institute, Kavli Institute of Brain Science, and Howard Hughes Medical Institute, Columbia University, New York, NY 10032, USA
8
Hernádi I, Grabenhorst F, Schultz W. Planning activity for internally generated reward goals in monkey amygdala neurons. Nat Neurosci 2015; 18:461-9. [PMID: 25622146 PMCID: PMC4340753 DOI: 10.1038/nn.3925] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 07/28/2014] [Accepted: 12/15/2014] [Indexed: 12/12/2022]
Abstract
The best rewards are often distant and can only be achieved by planning and decision-making over several steps. We designed a multi-step choice task in which monkeys followed internal plans to save rewards toward self-defined goals. During this self-controlled behavior, amygdala neurons showed future-oriented activity that reflected the animal's plan to obtain specific rewards several trials ahead. This prospective activity encoded crucial components of the animal's plan, including value and length of the planned choice sequence. It began on initial trials when a plan would be formed, reappeared step by step until reward receipt, and readily updated with a new sequence. It predicted performance, including errors, and typically disappeared during instructed behavior. Such prospective activity could underlie the formation and pursuit of internal plans characteristic of goal-directed behavior. The existence of neuronal planning activity in the amygdala suggests that this structure is important in guiding behavior toward internally generated, distant goals.
Affiliation(s)
- István Hernádi
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3DY, UK
- Fabian Grabenhorst
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3DY, UK
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3DY, UK