1
Carvalho W, Tomov MS, de Cothi W, Barry C, Gershman SJ. Predictive Representations: Building Blocks of Intelligence. Neural Comput 2024;36:2225-2298. PMID: 39212963. DOI: 10.1162/neco_a_01705.
Abstract
Adaptive behavior often requires predicting future events. The theory of reinforcement learning prescribes what kinds of predictive representations are useful and how to compute them. This review integrates these theoretical ideas with work on cognition and neuroscience. We pay special attention to the successor representation and its generalizations, which have been widely applied as both engineering tools and models of brain function. This convergence suggests that particular kinds of predictive representations may function as versatile building blocks of intelligence.
Affiliation(s)
- Wilka Carvalho
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA 02134, U.S.A.
- Momchil S Tomov
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02134, U.S.A.
- Motional AD LLC, Boston, MA 02210, U.S.A.
- William de Cothi
- Department of Cell and Developmental Biology, University College London, London WC1E 7JE, U.K.
- Caswell Barry
- Department of Cell and Developmental Biology, University College London, London WC1E 7JE, U.K.
- Samuel J Gershman
- Kempner Institute for the Study of Natural and Artificial Intelligence, and Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02134, U.S.A.
- Center for Brains, Minds, and Machines, MIT, Cambridge, MA 02139, U.S.A.
2
Gershman SJ, Assad JA, Datta SR, Linderman SW, Sabatini BL, Uchida N, Wilbrecht L. Explaining dopamine through prediction errors and beyond. Nat Neurosci 2024;27:1645-1655. PMID: 39054370. DOI: 10.1038/s41593-024-01705-4.
Abstract
The most influential account of phasic dopamine holds that it reports reward prediction errors (RPEs). The RPE-based interpretation of dopamine signaling is, in its original form, probably too simple and fails to explain all the properties of phasic dopamine observed in behaving animals. This Perspective helps to resolve some of the conflicting interpretations of dopamine that currently exist in the literature. We focus on the following three empirical challenges to the RPE theory of dopamine: why does dopamine (1) ramp up as animals approach rewards, (2) respond to sensory and motor features and (3) influence action selection? We argue that the prediction error concept, once it has been suitably modified and generalized based on an analysis of each computational problem, answers each challenge. Nonetheless, there are a number of additional empirical findings that appear to demand fundamentally different theoretical explanations beyond encoding RPE. Therefore, looking forward, we discuss the prospects for a unifying theory that respects the diversity of dopamine signaling and function as well as the complex circuitry that both underlies and responds to dopaminergic transmission.
Affiliation(s)
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA.
- John A Assad
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
- Scott W Linderman
- Department of Statistics and Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA.
- Bernardo L Sabatini
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA.
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
- Linda Wilbrecht
- Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA.
3
Rahman S, Terao K, Hashimoto K, Mizunami M. Independent operations of appetitive and aversive conditioning systems lead to simultaneous production of conflicting memories in an insect. Proc Biol Sci 2024;291:20241273. PMID: 39317316. PMCID: PMC11421932. DOI: 10.1098/rspb.2024.1273.
Abstract
Pavlovian conditioning is a ubiquitous form of associative learning that enables animals to remember appetitive and aversive experiences, which are memorized and retrieved by separate appetitive and aversive conditioning systems. Here, we asked whether competing appetitive and aversive information is integrated during encoding of the experience or during memory retrieval. We developed novel experimental procedures in crickets (Gryllus bimaculatus) that allowed selective blockade of the expression of appetitive and aversive memories by injecting octopamine and dopamine receptor antagonists. We conditioned an odour (conditioned stimulus 1, CS1) first with water and then with sodium chloride solution. At 24 h after conditioning, crickets retained both appetitive and aversive memories, which were integrated to produce a conditioned response. Importantly, when a visual pattern (CS2) was conditioned with CS1, appetitive and aversive memories formed simultaneously, indicating that appetitive and aversive second-order conditioning occur at the same time; these memories were likewise integrated to produce a conditioned response. We conclude that appetitive and aversive conditioning systems operate independently to form parallel appetitive and aversive memories, which compete to produce learned behaviour in crickets.
Affiliation(s)
- Sadniman Rahman
- Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan.
- Kanta Terao
- Academic Assembly Institute of Science and Engineering, Shimane University, Matsue, Shimane 690-8504, Japan.
- Kohei Hashimoto
- Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan.
- Makoto Mizunami
- Research Institute for Electronic Science, Hokkaido University, Sapporo 060-0812, Japan.
- Faculty of Science, Hokkaido University, Sapporo 060-0810, Japan.
4
Delaney J, Nathani S, Tan V, Chavez C, Orr A, Paek J, Faraji M, Setlow B, Urs NM. Enhanced cognitive flexibility and phasic striatal dopamine dynamics in a mouse model of low striatal tonic dopamine. Neuropsychopharmacology 2024;49:1600-1608. PMID: 38698264. PMCID: PMC11319590. DOI: 10.1038/s41386-024-01868-5.
Abstract
The catecholamine neuromodulators dopamine and norepinephrine are implicated in motor function, motivation, and cognition. Although roles for striatal dopamine in these aspects of behavior are well established, the specific roles for cortical catecholamines in regulating striatal dopamine dynamics and behavior are less clear. We recently showed that elevating cortical dopamine but not norepinephrine suppresses hyperactivity in dopamine transporter knockout (DAT-KO) mice, which have elevated striatal dopamine levels. In contrast, norepinephrine transporter knockout (NET-KO) mice have a phenotype distinct from DAT-KO mice, as they show elevated extracellular cortical catecholamines but reduced baseline striatal dopamine levels. Here we evaluated the consequences of altered catecholamine levels in NET-KO mice on cognitive flexibility and striatal dopamine dynamics. In a probabilistic reversal learning task, NET-KO mice showed enhanced reversal learning, which was consistent with larger phasic dopamine transients (dLight) in the dorsomedial striatum (DMS) during reward delivery and reward omission, compared to WT controls. Selective depletion of dorsal medial prefrontal cortex (mPFC) norepinephrine in WT mice did not alter performance on the reversal learning task but reduced nestlet shredding. Surprisingly, NET-KO mice did not show altered breakpoints in a progressive ratio task, suggesting intact food motivation. Collectively, these studies show novel roles of cortical catecholamines in the regulation of tonic and phasic striatal dopamine dynamics and cognitive flexibility, updating our current views on dopamine regulation and informing future therapeutic strategies to counter multiple psychiatric disorders.
Affiliation(s)
- Jena Delaney
- Department of Pharmacology and Therapeutics, University of Florida, Gainesville, FL 32610, USA.
- Sanya Nathani
- Department of Pharmacology and Therapeutics, University of Florida, Gainesville, FL 32610, USA.
- Victor Tan
- Department of Pharmacology and Therapeutics, University of Florida, Gainesville, FL 32610, USA.
- Carson Chavez
- Department of Pharmacology and Therapeutics, University of Florida, Gainesville, FL 32610, USA.
- Alexander Orr
- Department of Pharmacology and Therapeutics, University of Florida, Gainesville, FL 32610, USA.
- Joon Paek
- Department of Pharmacology and Therapeutics, University of Florida, Gainesville, FL 32610, USA.
- Mojdeh Faraji
- Department of Psychiatry, University of Florida, Gainesville, FL 32610, USA.
- Barry Setlow
- Department of Psychiatry, University of Florida, Gainesville, FL 32610, USA.
- Nikhil M Urs
- Department of Pharmacology and Therapeutics, University of Florida, Gainesville, FL 32610, USA.
5
Chalmers E, Luczak A. A bio-inspired reinforcement learning model that accounts for fast adaptation after punishment. Neurobiol Learn Mem 2024;215:107974. PMID: 39209018. DOI: 10.1016/j.nlm.2024.107974.
Abstract
Humans and animals can quickly learn a new strategy when a previously-rewarding strategy is punished. It is difficult to model this with reinforcement learning methods, because they tend to perseverate on previously-learned strategies - a hallmark of impaired response to punishment. Past work has addressed this by augmenting conventional reinforcement learning equations with ad hoc parameters or parallel learning systems. This produces reinforcement learning models that account for reversal learning, but are more abstract, complex, and somewhat detached from neural substrates. Here we use a different approach: we generalize a recently-discovered neuron-level learning rule, on the assumption that it captures a basic principle of learning that may occur at the whole-brain-level. Surprisingly, this gives a new reinforcement learning rule that accounts for adaptation and lose-shift behavior, and uses only the same parameters as conventional reinforcement learning equations. In the new rule, the normal reward prediction errors that drive reinforcement learning are scaled by the likelihood the agent assigns to the action that triggered a reward or punishment. The new rule demonstrates quick adaptation in card sorting and variable Iowa gambling tasks, and also exhibits a human-like paradox-of-choice effect. It will be useful for experimental researchers modeling learning and behavior.
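The core idea of the rule described above, scaling the usual reward prediction error by the probability the agent assigns to the action it just took, can be sketched in a few lines. This is an illustrative sketch only, not the authors' code; the function and parameter names (and values) are assumptions:

```python
import numpy as np

def softmax(q, beta=3.0):
    """Action probabilities from action values."""
    e = np.exp(beta * (q - q.max()))
    return e / e.sum()

def likelihood_scaled_update(q, action, reward, alpha=0.5, beta=3.0):
    """One value update in which the standard RPE is scaled by the
    likelihood the agent currently assigns to the chosen action
    (sketch of the rule summarized in the abstract)."""
    p = softmax(q, beta)[action]      # likelihood of the taken action
    rpe = reward - q[action]          # standard reward prediction error
    q = q.copy()
    q[action] += alpha * p * rpe      # likelihood-scaled update
    return q
```

A strongly preferred action that is suddenly punished has a likelihood near 1, so its value drops in large steps, which is the fast post-punishment adaptation the model is built to capture.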
Affiliation(s)
- Eric Chalmers
- Department of Mathematics and Computing, Mount Royal University, 4825 Mt Royal Gate SW, Calgary, AB T3E 6K6, Canada.
- Artur Luczak
- Canadian Center for Behavioral Neuroscience, University of Lethbridge, 4401 University Dr W, Lethbridge, AB T1K 3M4, Canada.
6
Robke R, Arbab T, Smith R, Willuhn I. Value-Driven Adaptations of Mesolimbic Dopamine Release Are Governed by Both Model-Based and Model-Free Mechanisms. eNeuro 2024;11:ENEURO.0223-24.2024. PMID: 38918053. PMCID: PMC11223458. DOI: 10.1523/eneuro.0223-24.2024.
Abstract
The magnitude of dopamine signals elicited by rewarding events and their predictors is updated when reward value changes. How readily these dopamine signals adapt, and whether their adaptation aligns with model-free or model-based reinforcement-learning principles, is actively debated. To investigate this, we trained male rats in a Pavlovian-conditioning paradigm and measured dopamine release in the nucleus accumbens core in response to food reward (unconditioned stimulus) and reward-predictive conditioned stimuli (CS), both before and after reward devaluation induced via either sensory-specific or nonspecific satiety. We demonstrate that (1) such devaluation reduces CS-induced dopamine release rapidly, without additional pairing of the CS with the devalued reward and irrespective of whether the devaluation was sensory-specific or nonspecific. In contrast, (2) reward devaluation did not decrease food reward-induced dopamine release. Surprisingly, (3) postdevaluation reconditioning, by additional pairing of the CS with the devalued reward, rapidly reinstated CS-induced dopamine signals to predevaluation levels. Taken together, we identify distinct, divergent adaptations in dopamine-signal magnitude when reward value is decreased: CS dopamine diminishes but is reinstated quickly, whereas reward dopamine is resistant to change. These findings imply that (1) CS dopamine may be governed by a model-based mechanism and (2) reward dopamine by a model-free one, where (3) the latter may contribute to swift reinstatement of the former. However, changes in CS dopamine were not selective for the sensory specificity of reward devaluation, which is inconsistent with model-based processes. Thus, mesolimbic dopamine signaling incorporates both model-free and model-based mechanisms and is not exclusively governed by either.
Affiliation(s)
- Rhiannon Robke
- The Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences, Amsterdam 1105BA, The Netherlands.
- Department of Psychiatry, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam 1105AZ, The Netherlands.
- Tara Arbab
- The Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences, Amsterdam 1105BA, The Netherlands.
- Department of Psychiatry, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam 1105AZ, The Netherlands.
- Rachel Smith
- The Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences, Amsterdam 1105BA, The Netherlands.
- Ingo Willuhn
- The Netherlands Institute for Neuroscience, Royal Netherlands Academy of Arts and Sciences, Amsterdam 1105BA, The Netherlands.
- Department of Psychiatry, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam 1105AZ, The Netherlands.
7
Schütt HH, Kim D, Ma WJ. Reward prediction error neurons implement an efficient code for reward. Nat Neurosci 2024;27:1333-1339. PMID: 38898182. DOI: 10.1038/s41593-024-01671-x.
Abstract
We use efficient coding principles borrowed from sensory neuroscience to derive the optimal neural population to encode a reward distribution. We show that the responses of dopaminergic reward prediction error neurons in mouse and macaque are similar to those of the efficient code in the following ways: the neurons have a broad distribution of midpoints covering the reward distribution; neurons with higher thresholds have higher gains, more convex tuning functions and lower slopes; and their slope is higher when the reward distribution is narrower. Furthermore, we derive learning rules that converge to the efficient code. The learning rule for the position of the neuron on the reward axis closely resembles distributional reinforcement learning. Thus, reward prediction error neuron responses may be optimized to broadcast an efficient reward signal, forming a connection between efficient coding and reinforcement learning, two of the most successful theories in computational neuroscience.
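The abstract notes that the learning rule for a neuron's position on the reward axis resembles distributional reinforcement learning. One common distributional-RL scheme uses asymmetric (expectile-style) updates, sketched below under the assumption that this is the intended resemblance; names and constants are illustrative, not taken from the paper:

```python
import numpy as np

def expectile_step(thresholds, reward, taus, alpha=0.05):
    """One asymmetric update of each unit's position on the reward axis:
    positive errors are weighted by tau, negative errors by 1 - tau,
    so units with different tau settle at different reward statistics
    and the population tiles the reward distribution."""
    delta = reward - thresholds
    scale = np.where(delta > 0, taus, 1.0 - taus)  # asymmetric learning rates
    return thresholds + alpha * scale * delta
```

Run on samples from a reward distribution, units with larger tau converge to higher points on the reward axis, giving the broad spread of midpoints the abstract describes.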
Affiliation(s)
- Heiko H Schütt
- Center for Neural Science and Department of Psychology, New York University, New York, NY, USA.
- Department of Behavioural and Cognitive Sciences, Université du Luxembourg, Esch-Belval, Luxembourg.
- Dongjae Kim
- Center for Neural Science and Department of Psychology, New York University, New York, NY, USA.
- Department of AI-Based Convergence, Dankook University, Yongin, Republic of Korea.
- Wei Ji Ma
- Center for Neural Science and Department of Psychology, New York University, New York, NY, USA.
8
Cowan RL, Davis T, Kundu B, Rahimpour S, Rolston JD, Smith EH. More widespread and rigid neuronal representation of reward expectation underlies impulsive choices. bioRxiv 2024:2024.04.11.588637. PMID: 38645037. PMCID: PMC11030340. DOI: 10.1101/2024.04.11.588637.
Abstract
Impulsive choices prioritize smaller, more immediate rewards over larger, delayed, or potentially uncertain rewards. Impulsive choices are a critical aspect of substance use disorders and maladaptive decision-making across the lifespan. Here, we sought to understand the neuronal underpinnings of expected reward and risk estimation on a trial-by-trial basis during impulsive choices. To do so, we acquired electrical recordings from the human brain while participants carried out a risky decision-making task designed to measure choice impulsivity. Behaviorally, we found a reward-accuracy tradeoff, whereby more impulsive choosers were more accurate at the task, opting for a more immediate reward while compromising overall task performance. We then examined how neuronal populations across frontal, temporal, and limbic brain regions parametrically encoded reinforcement learning model variables, namely reward and risk expectation and surprise, across trials. We found more widespread representations of reward value expectation and prediction error in more impulsive choosers, whereas less impulsive choosers preferentially represented risk expectation. A regional analysis of reward and risk encoding highlighted the anterior cingulate cortex for value expectation, the anterior insula for risk expectation and surprise, and distinct regional encoding between impulsivity groups. Beyond describing trial-by-trial population neuronal representations of reward and risk variables, these results suggest impaired inhibitory control and model-free learning underpinnings of impulsive choice. These findings shed light on neural processes underlying reinforced learning and decision-making in uncertain environments and how these processes may function in psychiatric disorders.
Affiliation(s)
- Rhiannon L Cowan
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA.
- Tyler Davis
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA.
- Bornali Kundu
- Department of Neurosurgery, University of Missouri, Columbia, MO 65212, USA.
- Shervin Rahimpour
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA.
- John D Rolston
- Department of Neurosurgery, Brigham & Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
- Elliot H Smith
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA.
9
Qian L, Burrell M, Hennig JA, Matias S, Murthy VN, Gershman SJ, Uchida N. The role of prospective contingency in the control of behavior and dopamine signals during associative learning. bioRxiv 2024:2024.02.05.578961. PMID: 38370735. PMCID: PMC10871210. DOI: 10.1101/2024.02.05.578961.
Abstract
Associative learning depends on contingency, the degree to which a stimulus predicts an outcome. Despite its importance, the neural mechanisms linking contingency to behavior remain elusive. Here we examined the dopamine activity in the ventral striatum - a signal implicated in associative learning - in a Pavlovian contingency degradation task in mice. We show that both anticipatory licking and dopamine responses to a conditioned stimulus decreased when additional rewards were delivered uncued, but remained unchanged if additional rewards were cued. These results conflict with contingency-based accounts using a traditional definition of contingency or a novel causal learning model (ANCCR), but can be explained by temporal difference (TD) learning models equipped with an appropriate inter-trial-interval (ITI) state representation. Recurrent neural networks trained within a TD framework develop state representations like our best 'handcrafted' model. Our findings suggest that the TD error can be a measure that describes both contingency and dopaminergic activity.
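The key ingredient the abstract identifies is a temporal difference (TD) model whose state space includes an explicit inter-trial-interval (ITI) state. A minimal tabular sketch of that idea follows; state indexing and parameters are assumptions, not the paper's actual model:

```python
import numpy as np

# States: an explicit ITI state alongside the cue and reward states.
ITI, CS, REWARD = 0, 1, 2

def td0_step(V, s, s_next, r, alpha=0.1, gamma=0.95):
    """One TD(0) update; the TD error delta plays the role of the
    dopamine signal in this class of models."""
    delta = r + gamma * V[s_next] - V[s]
    V = V.copy()
    V[s] += alpha * delta
    return V, delta
```

In such a model, delivering extra rewards uncued raises the value of the ITI state, shrinking the TD error at cue onset, which is one way the cited result (degraded contingency reduces the CS dopamine response) can fall out of TD learning.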
Affiliation(s)
- Lechen Qian
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
- These authors contributed equally.
- Mark Burrell
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
- These authors contributed equally.
- Jay A. Hennig
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Department of Psychology, Harvard University, Cambridge, MA, USA.
- Sara Matias
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Venkatesh N. Murthy
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Samuel J. Gershman
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Department of Psychology, Harvard University, Cambridge, MA, USA.
- Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
10
Shikano Y, Yagishita S, Tanaka KF, Takata N. Slow-rising and fast-falling dopaminergic dynamics jointly adjust negative prediction error in the ventral striatum. Eur J Neurosci 2023;58:4502-4522. PMID: 36843200. DOI: 10.1111/ejn.15945.
Abstract
The greater the reward expectation, the stronger the brain's physiological response to its outcome. Although it is well documented that better-than-expected outcomes are encoded quantitatively via midbrain dopaminergic (DA) activity, whether worse-than-expected outcomes are expressed quantitatively as well has received less experimental attention. We show that larger reward expectations upon unexpected reward omission are associated with a preceding slower rise and a following larger decrease (DA dip) in the DA concentration in the ventral striatum of mice. We set up a lever-press task on a fixed-ratio (FR) schedule requiring five lever presses as an effort for a food reward (FR5). The mice occasionally checked the food magazine without a reward before completing the task. The percentage of these premature magazine entries (PMEs) increased as the number of lever presses approached five, showing rising expectations with increasing proximity to task completion, and hence greater reward expectations. Fibre photometry of extracellular DA dynamics in the ventral striatum using a fluorescent protein (the genetically encoded GPCR-activation-based DA sensor GRABDA2m) revealed that the slow increase and fast decrease in DA levels around PMEs were correlated with the PME percentage, demonstrating a monotonic relationship between DA dip amplitude and the degree of expectation. Computational modelling of the lever-press task implementing temporal difference errors and state transitions replicated the observed correlation between PME frequency and DA dip amplitude in the FR5 task. Taken together, these findings indicate that the DA dip amplitude represents the degree of reward expectation monotonically, which may guide behavioural adjustment.
Affiliation(s)
- Yu Shikano
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan.
- Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, Tokyo, Japan.
- Sho Yagishita
- Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, Tokyo, Japan.
- Kenji F Tanaka
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan.
- Norio Takata
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan.
11
Luján MÁ, Covey DP, Young-Morrison R, Zhang L, Kim A, Morgado F, Patel S, Bass CE, Paladini C, Cheer JF. Mobilization of endocannabinoids by midbrain dopamine neurons is required for the encoding of reward prediction. Nat Commun 2023;14:7545. PMID: 37985770. PMCID: PMC10662422. DOI: 10.1038/s41467-023-43131-3.
Abstract
Brain levels of the endocannabinoid 2-arachidonoylglycerol (2-AG) shape motivated behavior and nucleus accumbens (NAc) dopamine release. However, it is not clear whether mobilization of 2-AG specifically from midbrain dopamine neurons is necessary for dopaminergic responses to external stimuli predicting forthcoming reward. Here, we use a viral-genetic strategy to prevent the expression of the 2-AG-synthesizing enzyme diacylglycerol lipase α (DGLα) from ventral tegmental area (VTA) dopamine cells in adult mice. We find that DGLα deletion from VTA dopamine neurons prevents depolarization-induced suppression of excitation (DSE), a form of 2-AG-mediated synaptic plasticity, in dopamine neurons. DGLα deletion also decreases effortful, cue-driven reward-seeking but has no effect on non-cued or low-effort operant tasks and other behaviors. Moreover, dopamine recording in the NAc reveals that deletion of DGLα impairs the transfer of accumbal dopamine signaling from a reward to its earliest predictors. These results demonstrate that 2-AG mobilization from VTA dopamine neurons is a necessary step for the generation of dopamine-based predictive associations that are required to direct and energize reward-oriented behavior.
Affiliation(s)
- Miguel Á Luján
- Department of Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Dan P Covey
- Department of Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Department of Neuroscience, Lovelace Biomedical Research Institute, Albuquerque, NM, USA.
- Reana Young-Morrison
- Department of Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.
- LanYuan Zhang
- Department of Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Andrew Kim
- Department of Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Fiorella Morgado
- Department of Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Sachin Patel
- Northwestern Center for Psychiatric Neuroscience, Department of Psychiatry and Behavioral Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, USA.
- Caroline E Bass
- Department of Pharmacology and Toxicology, University at Buffalo, State University of New York, Buffalo, NY, USA.
- Carlos Paladini
- UTSA Neuroscience Institute, University of Texas at San Antonio, San Antonio, TX, USA.
- Joseph F Cheer
- Department of Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA.
- Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA.
12
Stetsenko A, Koos T. Neuronal implementation of the temporal difference learning algorithm in the midbrain dopaminergic system. Proc Natl Acad Sci U S A 2023;120:e2309015120. PMID: 37903252. PMCID: PMC10636325. DOI: 10.1073/pnas.2309015120.
Abstract
The temporal difference learning (TDL) algorithm has been essential to conceptualizing the role of dopamine in reinforcement learning (RL). Despite its theoretical importance, it remains unknown whether a neuronal implementation of this algorithm exists in the brain. Here, we provide an interpretation of the recently described signaling properties of ventral tegmental area (VTA) GABAergic neurons and show that a circuitry of these neurons implements the TDL algorithm. Specifically, we identified the neuronal mechanism of three key components of the TDL model: a sustained state value signal encoded by an afferent input to the VTA, a temporal differentiation circuit formed by two types of VTA GABAergic neurons the combined output of which computes momentary reward prediction (RP) as the derivative of the state value, and the computation of reward prediction errors (RPEs) in dopamine neurons utilizing the output of the differentiation circuit. Using computational methods, we also show that this mechanism is optimally adapted to the biophysics of RPE signaling in dopamine neurons, mechanistically links the emergence of conditioned reinforcement to RP, and can naturally account for the temporal discounting of reinforcement. Elucidating the implementation of the TDL algorithm may further the investigation of RL in biological and artificial systems.
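The decomposition the abstract describes, a differentiation circuit computing reward prediction as a (discounted) temporal derivative of the state value, with dopamine neurons adding reward to form the RPE, can be written compactly in discrete time. This is a schematic rendering of the standard TD error, not the paper's continuous-time circuit model:

```python
def td_error(r, v_prev, v_next, gamma=0.98):
    """TD error split into the two stages described in the abstract:
    a temporal-difference-of-value term (the putative output of the
    two-population GABAergic differentiation circuit) plus reward."""
    reward_prediction = gamma * v_next - v_prev  # derivative-of-value stage
    return r + reward_prediction                 # dopamine RPE stage
```

With this split, an unexpected reward yields a positive RPE, a fully predicted outcome yields zero, and omission of a predicted reward yields a negative RPE.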
Affiliation(s)
- Anya Stetsenko
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102
- Tibor Koos
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102
13
Lloyd K, Dayan P. Reframing dopamine: A controlled controller at the limbic-motor interface. PLoS Comput Biol 2023; 19:e1011569. [PMID: 37847681 PMCID: PMC10610519 DOI: 10.1371/journal.pcbi.1011569] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2023] [Revised: 10/27/2023] [Accepted: 10/03/2023] [Indexed: 10/19/2023] Open
Abstract
Pavlovian influences notoriously interfere with operant behaviour. Evidence suggests this interference sometimes coincides with the release of the neuromodulator dopamine in the nucleus accumbens. Suppressing such interference is one of the targets of cognitive control. Here, using the examples of active avoidance and omission behaviour, we examine the possibility that direct manipulation of the dopamine signal is an instrument of control itself. In particular, when instrumental and Pavlovian influences come into conflict, dopamine levels might be affected by the controlled deployment of a reframing mechanism that recasts the prospect of possible punishment as an opportunity to approach safety, and the prospect of future reward in terms of a possible loss of that reward. We operationalize this reframing mechanism and fit the resulting model to rodent behaviour from two paradigmatic experiments in which accumbens dopamine release was also measured. We show that in addition to matching animals' behaviour, the model predicts dopamine transients that capture some key features of observed dopamine release at the time of discriminative cues, supporting the idea that modulation of this neuromodulator is amongst the repertoire of cognitive control strategies.
Affiliation(s)
- Kevin Lloyd
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Peter Dayan
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- University of Tübingen, Tübingen, Germany
14
Takahashi YK, Zhang Z, Montesinos-Cartegena M, Kahnt T, Langdon AJ, Schoenbaum G. Expectancy-related changes in firing of dopamine neurons depend on hippocampus. bioRxiv 2023:2023.07.19.549728. [PMID: 37781610 PMCID: PMC10541105 DOI: 10.1101/2023.07.19.549728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/03/2023]
Abstract
The orbitofrontal cortex (OFC) and hippocampus (HC) are both implicated in forming the cognitive or task maps that support flexible behavior. Previously, we used the dopamine neurons as a sensor or tool to measure the functional effects of OFC lesions (Takahashi et al., 2011). We recorded midbrain dopamine neurons as rats performed an odor-based choice task, in which errors in the prediction of reward were induced by manipulating the number or timing of the expected rewards across blocks of trials. We found that OFC lesions ipsilateral to the recording electrodes caused prediction errors to be degraded, consistent with a loss in the resolution of the task states, particularly under conditions where hidden information was critical to sharpening the predictions. Here we have repeated this experiment, along with computational modeling of the results, in rats with ipsilateral HC lesions. The results show that the HC also shapes the map of our task; however, unlike the OFC, which provides information local to the trial, the HC appears to be necessary for estimating upper-level hidden states based on information that is discontinuous or separated by longer timescales. The results contrast the respective roles of the OFC and HC in cognitive mapping and add to evidence that the dopamine neurons access a rich information set from distributed regions regarding the predictive structure of the environment, potentially enabling this powerful teaching signal to support complex learning and behavior.
Affiliation(s)
- Yuji K Takahashi
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD
- Zhewei Zhang
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD
- Thorsten Kahnt
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD
- Angela J Langdon
- Intramural Research Program, National Institute on Mental Health, Bethesda, MD
- Geoffrey Schoenbaum
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD
15
Sato Matsumoto C, Matsumoto Y, Mizunami M. Roles of octopamine neurons in the vertical lobe of the mushroom body for the execution of a conditioned response in cockroaches. Neurobiol Learn Mem 2023:107778. [PMID: 37257558 DOI: 10.1016/j.nlm.2023.107778] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Revised: 04/20/2023] [Accepted: 05/24/2023] [Indexed: 06/02/2023]
Abstract
Aminergic neurons mediate reward signals in mammals and insects. In crickets, we showed that blockade of synaptic transmission from octopamine neurons (OANs) impairs conditioning of an odor (conditioned stimulus, CS) with water or sucrose (unconditioned stimulus, US) and execution of a conditioned response (CR) to the CS. It has not yet been established, however, whether findings in crickets can be applied to other species of insects. In this study, we investigated the roles of OANs in conditioning of salivation, monitored by activities of salivary neurons, and in execution of the CR in cockroaches (Periplaneta americana). We showed that injection of epinastine (an OA receptor antagonist) into the head hemolymph impaired both conditioning and execution of the CR, in accordance with findings in crickets. Moreover, local injection of epinastine into the vertical lobes of the mushroom body (MB), the center for associative learning and control of the CR, impaired execution of the CR, whereas injection of epinastine into the calyces of the MB or the antennal lobes (primary olfactory centers) did not. We propose that OANs in the MB vertical lobes play critical roles in the execution of the CR in cockroaches. This is analogous to the fact that midbrain dopamine neurons govern execution of learned actions in mammals.
Affiliation(s)
- Yukihisa Matsumoto
- Tokyo Medical and Dental University, Department of Biology, Ichikawa, Japan
16
Takahashi YK, Stalnaker TA, Mueller LE, Harootonian SK, Langdon AJ, Schoenbaum G. Dopaminergic prediction errors in the ventral tegmental area reflect a multithreaded predictive model. Nat Neurosci 2023; 26:830-839. [PMID: 37081296 PMCID: PMC10646487 DOI: 10.1038/s41593-023-01310-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2021] [Accepted: 03/16/2023] [Indexed: 04/22/2023]
Abstract
Dopamine neuron activity is tied to the prediction error in temporal difference reinforcement learning models. These models make significant simplifying assumptions, particularly with regard to the structure of the predictions fed into the dopamine neurons, which consist of a single chain of timepoint states. Although this predictive structure can explain error signals observed in many studies, it cannot cope with settings where subjects might infer multiple independent events and outcomes. In the present study, we recorded dopamine neurons in the ventral tegmental area in such a setting to test the validity of the single-stream assumption. Rats were trained in an odor-based choice task, in which the timing and identity of one of several rewards delivered in each trial changed across trial blocks. This design revealed an error signaling pattern that requires the dopamine neurons to access and update multiple independent predictive streams reflecting the subject's belief about timing and potentially unique identities of expected rewards.
Affiliation(s)
- Yuji K Takahashi
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA.
- Thomas A Stalnaker
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
- Lauren E Mueller
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
- Angela J Langdon
- Intramural Research Program, National Institute of Mental Health, Bethesda, MD, USA.
- Geoffrey Schoenbaum
- Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA.
17
Yamada D, Bushey D, Li F, Hibbard KL, Sammons M, Funke J, Litwin-Kumar A, Hige T, Aso Y. Hierarchical architecture of dopaminergic circuits enables second-order conditioning in Drosophila. eLife 2023; 12:e79042. [PMID: 36692262 PMCID: PMC9937650 DOI: 10.7554/elife.79042] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Accepted: 01/23/2023] [Indexed: 01/25/2023] Open
Abstract
Dopaminergic neurons with distinct projection patterns and physiological properties compose memory subsystems in the brain. However, it is poorly understood whether or how they interact during complex learning. Here, we identify a feedforward circuit formed between dopamine subsystems and show that it is essential for second-order conditioning, an ethologically important form of higher-order associative learning. The Drosophila mushroom body comprises a series of dopaminergic compartments, each of which exhibits distinct memory dynamics. We find that a slow and stable memory compartment can serve as an effective 'teacher' by instructing other faster and transient memory compartments via a single key interneuron, which we identify by connectome analysis and neurotransmitter prediction. This excitatory interneuron acquires enhanced response to reward-predicting odor after first-order conditioning and, upon activation, evokes dopamine release in the 'student' compartments. These hierarchical connections between dopamine subsystems explain distinct properties of first- and second-order memory long known to behavioral psychologists.
Affiliation(s)
- Daichi Yamada
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Daniel Bushey
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
- Feng Li
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
- Karen L Hibbard
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
- Megan Sammons
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
- Jan Funke
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
- Toshihide Hige
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Yoshinori Aso
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
18
Morris LS, Mehta M, Ahn C, Corniquel M, Verma G, Delman B, Hof PR, Jacob Y, Balchandani P, Murrough JW. Ventral tegmental area integrity measured with high-resolution 7-Tesla MRI relates to motivation across depression and anxiety diagnoses. Neuroimage 2022; 264:119704. [PMID: 36349598 PMCID: PMC9801251 DOI: 10.1016/j.neuroimage.2022.119704] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2022] [Revised: 09/25/2022] [Accepted: 10/20/2022] [Indexed: 11/06/2022] Open
Abstract
The ventral tegmental area (VTA) is one of the major sources of dopamine in the brain and has been associated with reward prediction, error-based reward learning, volitional drive and anhedonia. However, precise anatomical investigations of the VTA have been prevented by the use of standard-resolution MRI, reliance on subjective manual tracings, and lack of quantitative measures of dopamine-related signal. Here, we combine ultra-high field 400 µm³ quantitative MRI with dopamine-related signal mapping, and a mixture of machine learning and supervised computational techniques to delineate the VTA in a transdiagnostic sample of subjects with and without depression and anxiety disorders. Subjects also underwent cognitive testing to measure intrinsic and extrinsic motivational tone. Fifty-one subjects were scanned in total, including healthy control (HC) and mood/anxiety (MA) disorder subjects. MA subjects had significantly larger VTA volumes compared to HC but significantly lower signal intensity within VTA compared to HC, indicating reduced structural integrity of the dopaminergic VTA. Interestingly, while VTA integrity did not significantly correlate with self-reported depression or anxiety symptoms, it was correlated with an objective cognitive measure of extrinsic motivation, whereby lower VTA integrity was associated with lower motivation. This is the first study to demonstrate a computational pipeline for detecting and delineating the VTA in human subjects with 400 µm³ resolution. We highlight the use of objective transdiagnostic measures of cognitive function that link neural integrity to behavior across clinical and non-clinical groups.
Affiliation(s)
- Laurel S Morris
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, USA; BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, USA.
- Marishka Mehta
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, USA
- Christopher Ahn
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, USA
- Morgan Corniquel
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, USA
- Gaurav Verma
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, USA; Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, USA
- Bradley Delman
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, USA; Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, USA
- Patrick R Hof
- Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, USA
- Yael Jacob
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, USA; BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, USA; Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, USA
- Priti Balchandani
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, USA; Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, USA
- James W Murrough
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, USA; Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, USA
19
Brain-inspired meta-reinforcement learning cognitive control in conflictual inhibition decision-making task for artificial agents. Neural Netw 2022; 154:283-302. [DOI: 10.1016/j.neunet.2022.06.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2021] [Revised: 06/09/2022] [Accepted: 06/16/2022] [Indexed: 11/21/2022]
20
Islam KUS, Meli N, Blaess S. The Development of the Mesoprefrontal Dopaminergic System in Health and Disease. Front Neural Circuits 2021; 15:746582. [PMID: 34712123 PMCID: PMC8546303 DOI: 10.3389/fncir.2021.746582] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2021] [Accepted: 09/10/2021] [Indexed: 12/18/2022] Open
Abstract
Midbrain dopaminergic neurons located in the substantia nigra and the ventral tegmental area are the main source of dopamine in the brain. They send out projections to a variety of forebrain structures, including dorsal striatum, nucleus accumbens, and prefrontal cortex (PFC), establishing the nigrostriatal, mesolimbic, and mesoprefrontal pathways, respectively. The dopaminergic input to the PFC is essential for the performance of higher cognitive functions such as working memory, attention, planning, and decision making. The gradual maturation of these cognitive skills during postnatal development correlates with the maturation of PFC local circuits, which undergo a lengthy functional remodeling process during the neonatal and adolescent stages. During this period, the mesoprefrontal dopaminergic innervation also matures: the fibers are rather sparse at prenatal stages and slowly increase in density during postnatal development to finally reach a stable pattern in early adulthood. Despite the prominent role of dopamine in the regulation of PFC function, relatively little is known about how the dopaminergic innervation is established in the PFC, whether and how it influences the maturation of local circuits and how exactly it facilitates cognitive functions in the PFC. In this review, we provide an overview of the development of the mesoprefrontal dopaminergic system in rodents and primates and discuss the role of altered dopaminergic signaling in neuropsychiatric and neurodevelopmental disorders.
Affiliation(s)
- K Ushna S Islam
- Neurodevelopmental Genetics, Institute of Reconstructive Neurobiology, Medical Faculty, University of Bonn, Bonn, Germany
- Norisa Meli
- Neurodevelopmental Genetics, Institute of Reconstructive Neurobiology, Medical Faculty, University of Bonn, Bonn, Germany; Institute of Neuropathology, Section for Translational Epilepsy Research, Medical Faculty, University of Bonn, Bonn, Germany
- Sandra Blaess
- Neurodevelopmental Genetics, Institute of Reconstructive Neurobiology, Medical Faculty, University of Bonn, Bonn, Germany
21
Tanaka S, Taylor JE, Sakagami M. The effect of effort on reward prediction error signals in midbrain dopamine neurons. Curr Opin Behav Sci 2021. [DOI: 10.1016/j.cobeha.2021.07.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
22
Miller DR, Guenther DT, Maurer AP, Hansen CA, Zalesky A, Khoshbouei H. Dopamine Transporter Is a Master Regulator of Dopaminergic Neural Network Connectivity. J Neurosci 2021; 41:5453-5470. [PMID: 33980544 PMCID: PMC8221606 DOI: 10.1523/jneurosci.0223-21.2021] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Revised: 04/19/2021] [Accepted: 05/01/2021] [Indexed: 12/13/2022] Open
Abstract
Dopaminergic neurons of the substantia nigra pars compacta (SNC) and ventral tegmental area (VTA) exhibit spontaneous firing activity. The dopaminergic neurons in these regions have been shown to exhibit differential sensitivity to neuronal loss and psychostimulants targeting dopamine transporter. However, it remains unclear whether these regional differences scale beyond individual neuronal activity to regional neuronal networks. Here, we used live-cell calcium imaging to show that network connectivity greatly differs between SNC and VTA regions, with a higher incidence of hub-like neurons in the VTA. Specifically, the frequency of hub-like neurons was significantly lower in SNC than in the adjacent VTA, consistent with the interpretation of a lower network resilience to SNC neuronal loss. We tested this hypothesis in DAT-cre/loxP-GCaMP6f mice of either sex by suppressing the activity of an individual dopaminergic neuron, through whole-cell patch clamp electrophysiology, in either SNC or VTA networks. Neuronal loss in the SNC increased network clustering, whereas the larger number of hub neurons in the VTA overcompensated by decreasing network clustering in the VTA. We further show that network properties are regulatable via a dopamine transporter-dependent but not a D2 receptor-dependent mechanism. Our results demonstrate novel regulatory mechanisms of functional network topology in dopaminergic brain regions. SIGNIFICANCE STATEMENT: In this work, we begin to untangle the differences in complex network properties between the substantia nigra pars compacta (SNC) and VTA that may underlie differential sensitivity between regions. The methods and analysis employed provide a springboard for investigations of network topology in multiple deep brain structures and disorders.
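The network clustering metric central to these results can be made concrete with a small sketch. This is a generic graph-theoretic illustration on hypothetical toy adjacency matrices, not the authors' calcium-imaging analysis pipeline.

```python
# Hedged sketch: mean local clustering coefficient of an undirected graph,
# the kind of topology metric tracked before and after neuronal loss.
# The toy graphs below are hypothetical, not recorded neural networks.
import numpy as np

def mean_clustering(A):
    """Mean local clustering coefficient for a 0/1 symmetric adjacency
    matrix with zero diagonal."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    triangles = np.diag(A @ A @ A) / 2.0  # closed triangles through each node
    possible = deg * (deg - 1) / 2.0      # possible neighbour pairs per node
    local = np.divide(triangles, possible,
                      out=np.zeros_like(possible), where=possible > 0)
    return local.mean()

# A triangle is fully clustered; a star (one hub, unconnected leaves) is not:
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
star = np.array([[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]])
print(mean_clustering(triangle))  # 1.0
print(mean_clustering(star))      # 0.0
```

The star graph illustrates why hub-like neurons matter for resilience: its connectivity hinges entirely on one node, so removing the hub disconnects the network even though clustering is unchanged.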
Affiliation(s)
- Douglas R Miller
- Department of Neuroscience, University of Florida, Gainesville, Florida
- Dylan T Guenther
- Department of Neuroscience, University of Florida, Gainesville, Florida
- Andrew P Maurer
- Department of Neuroscience, University of Florida, Gainesville, Florida
- Carissa A Hansen
- Department of Neuroscience, University of Florida, Gainesville, Florida
- Andrew Zalesky
- Melbourne Neuropsychiatry Centre, The University of Melbourne and Melbourne Health, Melbourne, Victoria 3010, Australia
- Department of Biomedical Engineering, Melbourne School of Engineering, The University of Melbourne, Melbourne, Victoria 3010, Australia
23
Xu HA, Modirshanechi A, Lehmann MP, Gerstner W, Herzog MH. Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making. PLoS Comput Biol 2021; 17:e1009070. [PMID: 34081705 PMCID: PMC8205159 DOI: 10.1371/journal.pcbi.1009070] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Revised: 06/15/2021] [Accepted: 05/12/2021] [Indexed: 11/19/2022] Open
Abstract
Classic reinforcement learning (RL) theories cannot explain human behavior in the absence of external reward or when the environment changes. Here, we employ a deep sequential decision-making paradigm with sparse reward and abrupt environmental changes. To explain the behavior of human participants in these environments, we show that RL theories need to include surprise and novelty, each with a distinct role. While novelty drives exploration before the first encounter of a reward, surprise increases the rate of learning of a world-model as well as of model-free action-values. Even though the world-model is available for model-based RL, we find that human decisions are dominated by model-free action choices. The world-model is only marginally used for planning, but it is important to detect surprising events. Our theory predicts human action choices with high probability and allows us to dissociate surprise, novelty, and reward in EEG signals.
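The distinct roles the abstract assigns to novelty (driving exploration before the first reward) and surprise (speeding learning after abrupt changes) can be sketched in a toy model-free learner. Everything here, including the class name, parameter values, and the use of |RPE| as a crude surprise signal, is a hypothetical simplification, not the authors' model.

```python
# Toy sketch of the two distinct roles discussed above: novelty as an
# intrinsic exploration bonus, surprise as a learning-rate modulator.
# All names and parameter values are hypothetical illustrations.
import numpy as np

class NoveltySurpriseQLearner:
    def __init__(self, n_states, n_actions, base_alpha=0.1, gamma=0.9,
                 novelty_weight=1.0, surprise_gain=0.5):
        self.Q = np.zeros((n_states, n_actions))
        self.visits = np.zeros(n_states)
        self.base_alpha = base_alpha
        self.gamma = gamma
        self.novelty_weight = novelty_weight
        self.surprise_gain = surprise_gain

    def novelty(self, s):
        # Novelty decays with familiarity: high for rarely visited states.
        return 1.0 / np.sqrt(1.0 + self.visits[s])

    def update(self, s, a, r, s_next):
        self.visits[s] += 1
        # Novelty of the reached state acts as an intrinsic reward,
        # driving exploration before any external reward is encountered.
        intrinsic = self.novelty_weight * self.novelty(s_next)
        target = r + intrinsic + self.gamma * self.Q[s_next].max()
        delta = target - self.Q[s, a]
        # Surprise (crudely, |RPE|) transiently raises the learning rate,
        # helping the agent re-adapt after abrupt environmental changes.
        alpha = min(1.0, self.base_alpha * (1 + self.surprise_gain * abs(delta)))
        self.Q[s, a] += alpha * delta

agent = NoveltySurpriseQLearner(n_states=3, n_actions=2)
agent.update(0, 1, 1.0, 1)  # a rewarding, still-novel transition
```

The separation matters: the novelty bonus changes *what* the agent values (adding intrinsic reward), whereas surprise changes *how fast* values move, which is the dissociation the study tests against human behavior and EEG.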
Affiliation(s)
- He A. Xu
- Laboratory of Psychophysics, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Alireza Modirshanechi
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Marco P. Lehmann
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Wulfram Gerstner
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Michael H. Herzog
- Laboratory of Psychophysics, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland