1
Cone I, Clopath C, Shouval HZ. Learning to express reward prediction error-like dopaminergic activity requires plastic representations of time. Nat Commun 2024; 15:5856. [PMID: 38997276] [DOI: 10.1038/s41467-024-50205-3]
Abstract
The dominant theoretical framework to account for reinforcement learning in the brain is temporal difference (TD) learning, whereby certain units signal reward prediction errors (RPE). The TD algorithm has traditionally been mapped onto the dopaminergic system, as the firing properties of dopamine neurons can resemble RPEs. However, certain predictions of TD learning are inconsistent with experimental results, and previous implementations of the algorithm have made unscalable assumptions regarding stimulus-specific fixed temporal bases. We propose an alternative framework to describe dopamine signaling in the brain, FLEX (Flexibly Learned Errors in Expected Reward). In FLEX, dopamine release is similar, but not identical, to RPE, leading to predictions that contrast with those of TD. While FLEX itself is a general theoretical framework, we describe a specific, biophysically plausible implementation, the results of which are consistent with a preponderance of both existing and reanalyzed experimental data.
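The "stimulus-specific fixed temporal basis" criticized here can be made concrete with a small TD(0) simulation using a complete serial compound representation. This is an illustrative sketch of the standard textbook setup, not the paper's FLEX implementation, and all parameters are invented:

```python
import numpy as np

T = 20                         # time steps per trial
cue_t, rew_t = 2, 15           # cue onset and reward delivery times
alpha, gamma = 0.1, 0.98       # learning rate and discount factor

# Fixed temporal basis ("complete serial compound"): one dedicated basis
# unit per post-cue delay -- the unscalable, stimulus-specific assumption.
x = np.zeros((T, T))
for t in range(cue_t, T):
    x[t, t - cue_t] = 1.0

r = np.zeros(T)
r[rew_t] = 1.0                 # single reward per trial

w = np.zeros(T)                # weights over the temporal basis
for trial in range(500):
    V = x @ w                  # value estimate at each time step
    for t in range(T - 1):
        rpe = r[t] + gamma * V[t + 1] - V[t]   # TD error, the RPE-like signal
        w += alpha * rpe * x[t]
        V = x @ w              # refresh values after the online update

V = x @ w
rpe_trace = np.array([r[t] + gamma * V[t + 1] - V[t] for t in range(T - 1)])
# After training, the phasic error is near zero at reward time and appears
# at the transition into the cue: the classic RPE signature.
```

The unscalable part is that `x` dedicates one basis unit to every stimulus-delay combination, which is exactly what a plastic representation of time would avoid.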
Affiliation(s)
- Ian Cone
- Department of Bioengineering, Imperial College London, London, UK
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX, USA
- Applied Physics Program, Rice University, Houston, TX, USA
- Claudia Clopath
- Department of Bioengineering, Imperial College London, London, UK
- Harel Z Shouval
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX, USA
- Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
2
Lee RS, Sagiv Y, Engelhard B, Witten IB, Daw ND. A feature-specific prediction error model explains dopaminergic heterogeneity. Nat Neurosci 2024:10.1038/s41593-024-01689-1. [PMID: 38961229] [DOI: 10.1038/s41593-024-01689-1]
Abstract
The hypothesis that midbrain dopamine (DA) neurons broadcast a reward prediction error (RPE) is among the great successes of computational neuroscience. However, recent results contradict a core aspect of this theory: specifically, that the neurons convey a scalar, homogeneous signal. While the predominant family of extensions to the RPE model replicates the classic model in multiple parallel circuits, we argue that these models are ill-suited to explain reports of heterogeneity in task variable encoding across DA neurons. Instead, we introduce a complementary 'feature-specific RPE' model, positing that individual ventral tegmental area DA neurons report RPEs for different aspects of an animal's moment-to-moment situation. Further, we show how our framework can be extended to explain patterns of heterogeneity in action responses reported among substantia nigra pars compacta DA neurons. This theory reconciles new observations of DA heterogeneity with classic ideas about RPE coding while also providing a new perspective on how the brain performs reinforcement learning in high-dimensional environments.
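A toy reading of the feature-specific idea can be sketched as a vector of errors gated by which features are currently present, so that individual "neurons" respond heterogeneously even though they derive from a shared prediction. This is not the authors' fitted model; the feature values and names below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 3
alpha = 0.2
w = np.zeros(n_features)             # per-feature value weights
true_w = np.array([1.0, 0.5, 0.0])   # hypothetical true feature values

def feature_rpes(x, r, w):
    """Per-'neuron' errors: the shared prediction error gated by each
    feature's presence, so units report errors for different aspects
    of the current situation."""
    delta = r - w @ x                # shared scalar prediction error
    return delta * x

for _ in range(200):
    x = rng.integers(0, 2, n_features).astype(float)  # features present this trial
    r = true_w @ x                                    # outcome built from features
    w += alpha * feature_rpes(x, r, w)                # each weight learns from its own error

# A unit whose feature is absent stays silent; active units share the error.
example_errors = feature_rpes(np.array([1.0, 0.0, 1.0]), 0.5, w)
```

The point of the sketch is the heterogeneity: for the same trial, different units report different error signals depending on the task variables they track.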
Affiliation(s)
- Rachel S Lee
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Yotam Sagiv
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Ben Engelhard
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Nathaniel D Daw
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Department of Psychology, Princeton University, Princeton, NJ, USA
3
Schütt HH, Kim D, Ma WJ. Reward prediction error neurons implement an efficient code for reward. Nat Neurosci 2024; 27:1333-1339. [PMID: 38898182] [DOI: 10.1038/s41593-024-01671-x]
Abstract
We use efficient coding principles borrowed from sensory neuroscience to derive the optimal neural population to encode a reward distribution. We show that the responses of dopaminergic reward prediction error neurons in mouse and macaque are similar to those of the efficient code in the following ways: the neurons have a broad distribution of midpoints covering the reward distribution; neurons with higher thresholds have higher gains, more convex tuning functions and lower slopes; and their slope is higher when the reward distribution is narrower. Furthermore, we derive learning rules that converge to the efficient code. The learning rule for the position of the neuron on the reward axis closely resembles distributional reinforcement learning. Thus, reward prediction error neuron responses may be optimized to broadcast an efficient reward signal, forming a connection between efficient coding and reinforcement learning, two of the most successful theories in computational neuroscience.
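The learning rule said to resemble distributional reinforcement learning can be sketched with a quantile-style update, in which each "neuron" nudges its midpoint on the reward axis up or down asymmetrically. This is a generic stand-in, not the paper's actual derivation; the quantile levels and reward distribution are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
rewards = rng.normal(5.0, 2.0, 20000)   # stand-in reward distribution

taus = np.array([0.1, 0.5, 0.9])        # target quantile level for each "neuron"
theta = np.zeros(3)                     # each neuron's midpoint/threshold
alpha = 0.02

for r in rewards:
    # Asymmetric update: upward moves scaled by tau, downward by (1 - tau),
    # so each midpoint settles at its quantile of the reward distribution.
    theta += alpha * np.where(r > theta, taus, -(1 - taus))

# Midpoints end up spread across the reward distribution, a broad
# distribution of thresholds as described in the abstract.
```

For a normal(5, 2) distribution the three midpoints settle near 2.4, 5.0, and 7.6, i.e., the 10th, 50th, and 90th percentiles.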
Affiliation(s)
- Heiko H Schütt
- Center for Neural Science and Department of Psychology, New York University, New York, NY, USA
- Department of Behavioural and Cognitive Sciences, Université du Luxembourg, Esch-Belval, Luxembourg
- Dongjae Kim
- Center for Neural Science and Department of Psychology, New York University, New York, NY, USA
- Department of AI-Based Convergence, Dankook University, Yongin, Republic of Korea
- Wei Ji Ma
- Center for Neural Science and Department of Psychology, New York University, New York, NY, USA
4
Augustat N, Endres D, Mueller EM. Uncertainty of treatment efficacy moderates placebo effects on reinforcement learning. Sci Rep 2024; 14:14421. [PMID: 38909105] [PMCID: PMC11193823] [DOI: 10.1038/s41598-024-64240-z]
Abstract
The placebo-reward hypothesis postulates that positive effects of treatment expectations on health (i.e., placebo effects) and reward processing share common neural underpinnings. Moreover, experiments in humans and animals indicate that reward uncertainty increases striatal dopamine, which is presumably involved in placebo responses and reward learning. Therefore, treatment uncertainty, analogously to reward uncertainty, may affect updating from rewards after placebo treatment. Here, we address whether different degrees of uncertainty regarding the efficacy of a sham treatment affect reward sensitivity. In an online between-subjects experiment with N = 141 participants, we systematically varied the provided efficacy instructions before participants first received a sham treatment that consisted of listening to binaural beats and then performed a probabilistic reinforcement learning task. We fitted a Q-learning model including two different learning rates for positive (gain) and negative (loss) reward prediction errors and an inverse gain parameter to behavioral decision data in the reinforcement learning task. Our results yielded an inverted-U relationship between the provided treatment efficacy probability and the learning rate for gains, such that higher levels of treatment uncertainty, rather than of expected net efficacy, affect presumably dopamine-related reward learning. These findings support the placebo-reward hypothesis and suggest harnessing uncertainty in placebo treatment for recovering reward learning capabilities.
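The model class described, Q-learning with separate learning rates for gain and loss prediction errors plus a choice-stochasticity parameter, can be sketched as below. This is a hedged stand-in, not the authors' fitted model: `beta` here is a softmax inverse temperature playing the role of the paper's inverse gain parameter, and all parameter values and reward probabilities are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha_gain, alpha_loss = 0.4, 0.1    # separate learning rates for +/- RPEs
beta = 3.0                           # choice determinism (illustrative)
p_reward = np.array([0.8, 0.2])      # two-option probabilistic reward task
Q = np.zeros(2)

n_best, n_trials = 0, 300
for _ in range(n_trials):
    p_choose = np.exp(beta * Q) / np.exp(beta * Q).sum()  # softmax policy
    c = rng.choice(2, p=p_choose)
    r = float(rng.random() < p_reward[c])
    delta = r - Q[c]                                      # reward prediction error
    Q[c] += (alpha_gain if delta > 0 else alpha_loss) * delta
    n_best += int(c == 0)

frac_best = n_best / n_trials        # preference for the richer option
```

In model fitting, `alpha_gain`, `alpha_loss`, and the stochasticity parameter would be estimated per participant from the observed choice sequence rather than fixed as here.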
Affiliation(s)
- Nick Augustat
- Department of Psychology, University of Marburg, Marburg, Germany
- Dominik Endres
- Department of Psychology, University of Marburg, Marburg, Germany
- Erik M Mueller
- Department of Psychology, University of Marburg, Marburg, Germany
5
Shen B, Wilson J, Nguyen D, Glimcher PW, Louie K. Origins of noise in both improving and degrading decision making. bioRxiv [Preprint] 2024:2024.03.26.586597. [PMID: 38915616] [PMCID: PMC11195060] [DOI: 10.1101/2024.03.26.586597]
Abstract
Noise is a fundamental problem for information processing in neural systems. In decision-making, noise is assumed to have a primary role in errors and stochastic choice behavior. However, little is known about how noise arising from different sources contributes to value coding and choice behaviors, especially when it interacts with neural computation. Here we examine how noise arising early versus late in the choice process differentially impacts context-dependent choice behavior. We found in model simulations that early and late noise predict opposing context effects: under early noise, contextual information enhances choice accuracy, while under late noise, context degrades it. Furthermore, we verified these opposing predictions in experimental human choice behavior. Manipulating early and late noise, by inducing uncertainty in option values and controlling time pressure, produced dissociable positive and negative context effects. These findings reconcile controversial experimental findings in the literature reporting either context-driven impairments or improvements in choice performance, suggesting a unified mechanism for context-dependent choice. More broadly, these findings highlight how different sources of noise can interact with neural computations to differentially modulate behavior.
Significance
The current study addresses the role of noise origin in decision-making, reconciling controversies around how decision-making is impacted by context. We demonstrate that different types of noise, arising either early during evaluation or late during option comparison, lead to distinct results: with early noise, context enhances choice accuracy, while with late noise, context impairs it. Understanding these dynamics offers potential strategies for improving decision-making in noisy environments and refining existing neural computation models. Overall, our findings advance our understanding of how neural systems handle noise in essential cognitive tasks, suggest a beneficial role for contextual modulation under certain conditions, and highlight the profound implications of noise structure in decision-making.
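One of the two predictions, that context degrades accuracy under late noise, can be reproduced in a minimal simulation. This is a toy divisive-normalization model with invented values, not the authors' actual model, and it does not attempt to reproduce the early-noise result:

```python
import numpy as np

rng = np.random.default_rng(3)

def choice_accuracy(v1, v2, v3, early_sd=0.0, late_sd=0.0, n=20000):
    """Fraction of simulated trials choosing the truly better option (v1 > v2);
    v3 is a contextual third value entering the normalization pool."""
    vals = np.array([v1, v2, v3]) + rng.normal(0, early_sd, (n, 3))  # early noise
    norm = vals[:, :2] / (1.0 + vals.sum(axis=1, keepdims=True))     # divisive normalization
    noisy = norm + rng.normal(0, late_sd, (n, 2))                    # late noise
    return float((noisy[:, 0] > noisy[:, 1]).mean())

# A larger context value shrinks the normalized difference between the two
# targets, so fixed late noise corrupts the comparison more.
acc_low_context = choice_accuracy(10.0, 8.0, 0.0, late_sd=0.05)
acc_high_context = choice_accuracy(10.0, 8.0, 20.0, late_sd=0.05)
```

The early-noise improvement reported in the paper depends on how the normalization stage interacts with input noise, which this stripped-down sketch does not capture.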
6
Schultz W. A dopamine mechanism for reward maximization. Proc Natl Acad Sci U S A 2024; 121:e2316658121. [PMID: 38717856] [PMCID: PMC11098095] [DOI: 10.1073/pnas.2316658121]
Abstract
Individual survival and evolutionary selection require biological organisms to maximize reward. Economic choice theories define the necessary and sufficient conditions, and neuronal signals of decision variables provide mechanistic explanations. Reinforcement learning (RL) formalisms use predictions, actions, and policies to maximize reward. Midbrain dopamine neurons code reward prediction errors (RPE) of subjective reward value suitable for RL. Electrical and optogenetic self-stimulation experiments demonstrate that monkeys and rodents repeat behaviors that result in dopamine excitation. Dopamine excitations reflect positive RPEs that increase reward predictions via RL; against increasing predictions, obtaining similar dopamine RPE signals again requires better rewards than before. The positive RPEs drive predictions higher again and thus advance a recursive reward-RPE-prediction iteration toward better and better rewards. Agents also avoid dopamine inhibitions that lower reward prediction via RL, which allows smaller rewards than before to elicit positive dopamine RPE signals and resume the iteration toward better rewards. In this way, dopamine RPE signals serve as a causal mechanism that attracts agents via RL to the best rewards. The mechanism improves daily life and benefits evolutionary selection but may also induce restlessness and greed.
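The core of the recursive reward-RPE-prediction iteration reduces to a few lines: positive RPEs ratchet the prediction upward, so the same reward eventually stops producing dopamine excitation and only a better reward restores it. Parameters are illustrative:

```python
alpha = 0.5            # illustrative learning rate
prediction = 0.0
reward = 4.0           # a repeatedly obtained reward
rpes = []

for _ in range(20):
    rpe = reward - prediction      # dopamine-like prediction error
    rpes.append(rpe)
    prediction += alpha * rpe      # positive RPEs raise the prediction

# Once the prediction matches the reward, the same reward no longer yields
# a positive RPE; reproducing the original RPE now requires a better reward.
reward_needed_now = prediction + rpes[0]
```

Here the first RPE is 4.0, the last is essentially zero, and a reward of about 8.0 would now be needed to elicit the original dopamine response, which is the ratchet toward better and better rewards described above.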
Affiliation(s)
- Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
7
Imtiaz Z, Kato A, Kopell BH, Qasim SE, Davis AN, Martinez LN, Heflin M, Kulkarni K, Morsi A, Gu X, Saez I. Human Substantia Nigra Neurons Encode Reward Expectations. bioRxiv [Preprint] 2024:2024.05.10.593406. [PMID: 38766086] [PMCID: PMC11100806] [DOI: 10.1101/2024.05.10.593406]
Abstract
Dopamine (DA) signals originating from substantia nigra (SN) neurons are centrally involved in the regulation of motor and reward processing. DA signals behaviorally relevant events where reward outcomes differ from expectations (reward prediction errors, RPEs). RPEs play a crucial role in learning optimal courses of action and in determining response vigor when an agent expects rewards. Nevertheless, how reward expectations, crucial for RPE calculations, are conveyed to and represented in the dopaminergic system is not fully understood, especially in the human brain where the activity of DA neurons is difficult to study. One possibility, suggested by evidence from animal models, is that DA neurons explicitly encode reward expectations. Alternatively, they may receive RPE information directly from upstream brain regions. To address whether SN neuron activity directly reflects reward expectation information, we directly examined the encoding of reward expectation signals in human putative DA neurons by performing single-unit recordings from the SN of patients undergoing neurosurgery. Patients played a two-armed bandit decision-making task in which they attempted to maximize reward. We show that neuronal firing rates (FR) of putative DA neurons during the reward expectation period explicitly encode reward expectations. First, activity in these neurons was modulated by previous trial outcomes, such that FR were greater after positive outcomes than after neutral or negative outcome trials. Second, this increase in FR was associated with shorter reaction times, consistent with an invigorating effect of DA neuron activity during expectation. These results suggest that human DA neurons explicitly encode reward expectations, providing a neurophysiological substrate for a signal critical for reward learning.
Affiliation(s)
- Zarghona Imtiaz
- Nash Family Department of Neuroscience and the Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ayaka Kato
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Brian H. Kopell
- Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Salman E. Qasim
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Arianna Neal Davis
- Nash Family Department of Neuroscience and the Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lizbeth Nunez Martinez
- Nash Family Department of Neuroscience and the Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Matt Heflin
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Kaustubh Kulkarni
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Amr Morsi
- Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Xiaosi Gu
- Nash Family Department of Neuroscience and the Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ignacio Saez
- Nash Family Department of Neuroscience and the Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
8
Burwell SCV, Yan H, Lim SSX, Shields BC, Tadross MR. Reward perseveration is shaped by GABA-A-mediated dopamine pauses. bioRxiv [Preprint] 2024:2024.05.09.593320. [PMID: 38766037] [PMCID: PMC11100816] [DOI: 10.1101/2024.05.09.593320]
Abstract
Extinction learning is an essential form of cognitive flexibility, which enables obsolete reward associations to be discarded. Its downregulation can lead to perseveration, a symptom seen in several neuropsychiatric disorders. This balance is regulated by dopamine from VTA DA (ventral tegmental area dopamine) neurons, which in turn are largely controlled by GABA (gamma-aminobutyric acid) synapses. However, the causal relationship of these circuit elements to extinction and perseveration remains incompletely understood. Here, we employ an innovative drug-targeting technology, DART (drug acutely restricted by tethering), to selectively block GABA-A receptors on VTA DA neurons as mice engage in Pavlovian learning. DART eliminated GABA-A-mediated pauses, brief decrements in VTA DA activity canonically thought to drive extinction learning. However, contrary to the hypothesis that blocking VTA DA pauses should eliminate extinction learning, we observed the opposite: accelerated extinction learning. Specifically, DART eliminated the naturally occurring perseveration seen in half of control mice. We saw no impact on Pavlovian conditioning, nor on other aspects of VTA DA neural firing. These findings challenge canonical theories, recasting GABA-A-mediated VTA DA pauses from presumed facilitators of extinction to drivers of perseveration. More broadly, this study showcases the merits of targeted synaptic pharmacology, while hinting at circuit interventions for pathological perseveration.
9
Davidson AM, Hige T. Roles of feedback and feed-forward networks of dopamine subsystems: insights from Drosophila studies. Learn Mem 2024; 31:a053807. [PMID: 38862171] [PMCID: PMC11199952] [DOI: 10.1101/lm.053807.123]
Abstract
Across animal species, dopamine-operated memory systems comprise anatomically segregated, functionally diverse subsystems. Although individual subsystems could operate independently to support distinct types of memory, the logical interplay between subsystems is expected to enable more complex memory processing by allowing existing memory to influence future learning. Recent comprehensive ultrastructural analysis of the Drosophila mushroom body revealed intricate networks interconnecting the dopamine subsystems-the mushroom body compartments. Here, we review the functions of some of these connections that are beginning to be understood. Memory consolidation is mediated by two different forms of network: A recurrent feedback loop within a compartment maintains sustained dopamine activity required for consolidation, whereas feed-forward connections across compartments allow short-term memory formation in one compartment to open the gate for long-term memory formation in another compartment. Extinction and reversal of aversive memory rely on a similar feed-forward circuit motif that signals omission of punishment as a reward, which triggers plasticity that counteracts the original aversive memory trace. Finally, indirect feed-forward connections from a long-term memory compartment to short-term memory compartments mediate higher-order conditioning. Collectively, these emerging studies indicate that feedback control and hierarchical connectivity allow the dopamine subsystems to work cooperatively to support diverse and complex forms of learning.
Affiliation(s)
- Andrew M Davidson
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Toshihide Hige
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
10
Groos D, Helmchen F. The lateral habenula: A hub for value-guided behavior. Cell Rep 2024; 43:113968. [PMID: 38522071] [DOI: 10.1016/j.celrep.2024.113968]
Abstract
The habenula is an evolutionarily highly conserved diencephalic brain region divided into two major parts, medial and lateral. Over the past two decades, studies of the lateral habenula (LHb), in particular, have identified key functions in value-guided behavior in health and disease. In this review, we focus on recent insights into LHb connectivity and its functional relevance for different types of aversive and appetitive value-guided behavior. First, we give an overview of the anatomical organization of the LHb and its main cellular composition. Next, we elaborate on how distinct LHb neuronal subpopulations encode aversive and appetitive stimuli and on their involvement in more complex decision-making processes. Finally, we scrutinize the afferent and efferent connections of the LHb and discuss their functional implications for LHb-dependent behavior. A deepened understanding of distinct LHb circuit components will substantially contribute to our knowledge of value-guided behavior.
Affiliation(s)
- Dominik Groos
- Laboratory of Neural Circuit Dynamics, Brain Research Institute, University of Zurich, Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich, Zurich, Switzerland
- Fritjof Helmchen
- Laboratory of Neural Circuit Dynamics, Brain Research Institute, University of Zurich, Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich, Zurich, Switzerland; University Research Priority Program (URPP), Adaptive Brain Circuits in Development and Learning, University of Zurich, Zurich, Switzerland
11
Avvisati R, Kaufmann AK, Young CJ, Portlock GE, Cancemi S, Costa RP, Magill PJ, Dodson PD. Distributional coding of associative learning in discrete populations of midbrain dopamine neurons. Cell Rep 2024; 43:114080. [PMID: 38581677] [PMCID: PMC7616095] [DOI: 10.1016/j.celrep.2024.114080]
Abstract
Midbrain dopamine neurons are thought to play key roles in learning by conveying the difference between expected and actual outcomes. Recent evidence suggests diversity in dopamine signaling, yet it remains poorly understood how heterogeneous signals might be organized to facilitate the role of downstream circuits mediating distinct aspects of behavior. Here, we investigated the organizational logic of dopaminergic signaling by recording and labeling individual midbrain dopamine neurons during associative behavior. Our findings show that reward information and behavioral parameters are not only heterogeneously encoded but also differentially distributed across populations of dopamine neurons. Retrograde tracing and fiber photometry suggest that populations of dopamine neurons projecting to different striatal regions convey distinct signals. These data, supported by computational modeling, indicate that such distributional coding can maximize dynamic range and tailor dopamine signals to facilitate specialized roles of different striatal regions.
Affiliation(s)
- Riccardo Avvisati
- School of Physiology, Pharmacology, and Neuroscience, University of Bristol, Bristol BS8 1TD, UK; Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
- Anna-Kristin Kaufmann
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
- Callum J Young
- School of Physiology, Pharmacology, and Neuroscience, University of Bristol, Bristol BS8 1TD, UK; Computational Neuroscience Unit, Department of Computer Science, SCEEM, Faculty of Engineering, University of Bristol, Bristol BS8 1UB, UK
- Gabriella E Portlock
- School of Physiology, Pharmacology, and Neuroscience, University of Bristol, Bristol BS8 1TD, UK
- Sophie Cancemi
- School of Physiology, Pharmacology, and Neuroscience, University of Bristol, Bristol BS8 1TD, UK
- Rui Ponte Costa
- Computational Neuroscience Unit, Department of Computer Science, SCEEM, Faculty of Engineering, University of Bristol, Bristol BS8 1UB, UK
- Peter J Magill
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
- Paul D Dodson
- School of Physiology, Pharmacology, and Neuroscience, University of Bristol, Bristol BS8 1TD, UK; Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
12
Cowan RL, Davis T, Kundu B, Rahimpour S, Rolston JD, Smith EH. More widespread and rigid neuronal representation of reward expectation underlies impulsive choices. bioRxiv [Preprint] 2024:2024.04.11.588637. [PMID: 38645037] [PMCID: PMC11030340] [DOI: 10.1101/2024.04.11.588637]
Abstract
Impulsive choices prioritize smaller, more immediate rewards over larger, delayed, or potentially uncertain rewards. Impulsive choices are a critical aspect of substance use disorders and maladaptive decision-making across the lifespan. Here, we sought to understand the neuronal underpinnings of expected reward and risk estimation on a trial-by-trial basis during impulsive choices. To do so, we acquired electrical recordings from the human brain while participants carried out a risky decision-making task designed to measure choice impulsivity. Behaviorally, we found a reward-accuracy tradeoff, whereby more impulsive choosers were more accurate at the task, opting for a more immediate reward while compromising overall task performance. We then examined how neuronal populations across frontal, temporal, and limbic brain regions parametrically encoded reinforcement learning model variables, namely reward and risk expectation and surprise, across trials. We found more widespread representations of reward value expectation and prediction error in more impulsive choosers, whereas less impulsive choosers preferentially represented risk expectation. A regional analysis of reward and risk encoding highlighted the anterior cingulate cortex for value expectation, the anterior insula for risk expectation and surprise, and distinct regional encoding between impulsivity groups. Beyond describing trial-by-trial population neuronal representations of reward and risk variables, these results suggest impaired inhibitory control and model-free learning underpinnings of impulsive choice. These findings shed light on neural processes underlying reinforcement learning and decision-making in uncertain environments and how these processes may function in psychiatric disorders.
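The trial-by-trial model variables named here (reward expectation, risk expectation, and their "surprise" terms) can be sketched with simple delta rules: a running estimate of expected reward updated by the reward prediction error, and a running estimate of risk updated by the squared error. This is a generic stand-in with invented parameters, not the study's fitted model:

```python
import numpy as np

rng = np.random.default_rng(4)
alpha_v, alpha_h = 0.1, 0.1
V, H = 0.0, 1.0                      # expected reward and expected risk (variance)

for _ in range(2000):
    r = float(rng.choice([0.0, 10.0]))   # risky 50/50 outcome: mean 5, variance 25
    delta = r - V                        # reward prediction error (value surprise)
    V += alpha_v * delta
    risk_pe = delta ** 2 - H             # risk prediction error (risk surprise)
    H += alpha_h * risk_pe
```

In the regression analyses described above, each of these four quantities (V, delta, H, risk_pe) would serve as a per-trial regressor against neuronal firing.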
Affiliation(s)
- Rhiannon L Cowan
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA
- Tyler Davis
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA
- Bornali Kundu
- Department of Neurosurgery, University of Missouri, Columbia, MO 65212, USA
- Shervin Rahimpour
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA
- John D Rolston
- Department of Neurosurgery, Brigham & Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Elliot H Smith
- Department of Neurosurgery, University of Utah, Salt Lake City, UT 84132, USA
13
Drzewiecki CM, Fox AS. Understanding the heterogeneity of anxiety using a translational neuroscience approach. Cogn Affect Behav Neurosci 2024; 24:228-245. [PMID: 38356013] [PMCID: PMC11039504] [DOI: 10.3758/s13415-024-01162-3]
Abstract
Anxiety disorders affect millions of people worldwide and present a challenge in neuroscience research because of their substantial heterogeneity in clinical presentation. While a great deal of progress has been made in understanding the neurobiology of fear and anxiety, these insights have not led to effective treatments. Understanding the relationship between phenotypic heterogeneity and the underlying biology is a critical first step in solving this problem. We show that translation, reverse translation, and computational modeling can contribute to a refined, cross-species understanding of fear and anxiety as well as anxiety disorders. More specifically, we outline how animal models can be leveraged to develop testable hypotheses in humans by using targeted, cross-species approaches and ethologically informed behavioral paradigms. We discuss reverse translational approaches that can guide and prioritize animal research in nontraditional research species. Finally, we advocate for the use of computational models to harmonize cross-species and cross-methodology research into anxiety. Together, this translational neuroscience approach will help to bridge the widening gap between how we currently conceptualize and diagnose anxiety disorders, as well as aid in the discovery of better treatments for these conditions.
Affiliation(s)
- Carly M Drzewiecki
- California National Primate Research Center, University of California, Davis, CA, USA.
| | - Andrew S Fox
- California National Primate Research Center, University of California, Davis, CA, USA.
- Department of Psychology, University of California, Davis, CA, USA.
14
Mohebi A, Wei W, Pelattini L, Kim K, Berke JD. Dopamine transients follow a striatal gradient of reward time horizons. Nat Neurosci 2024; 27:737-746. [PMID: 38321294 PMCID: PMC11001583 DOI: 10.1038/s41593-023-01566-3] [Received: 05/12/2022] [Accepted: 12/21/2023] [Indexed: 02/08/2024]
Abstract
Animals make predictions to guide their behavior and update those predictions through experience. Transient increases in dopamine (DA) are thought to be critical signals for updating predictions. However, it is unclear how this mechanism handles a wide range of behavioral timescales-from seconds or less (for example, if singing a song) to potentially hours or more (for example, if hunting for food). Here we report that DA transients in distinct rat striatal subregions convey prediction errors based on distinct time horizons. DA dynamics systematically accelerated from ventral to dorsomedial to dorsolateral striatum, in the tempo of spontaneous fluctuations, the temporal integration of prior rewards and the discounting of future rewards. This spectrum of timescales for evaluative computations can help achieve efficient learning and adaptive motivation for a broad range of behaviors.
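The ventral-to-dorsolateral gradient of time horizons corresponds naturally to the discount factor of reinforcement learning. A minimal sketch of that correspondence (the function name and parameter values are illustrative, not from the paper):

```python
def discounted_value(rewards, gamma):
    """Exponentially discounted value of a sequence of future rewards.

    A gamma near 1 weights distant rewards almost fully (a long time
    horizon, as reported for ventral striatum); a small gamma discounts
    them steeply (a short horizon, as for dorsolateral striatum).
    """
    return sum(r * gamma**t for t, r in enumerate(rewards))

# a single reward delivered 10 time steps in the future
rewards = [0.0] * 10 + [1.0]
long_horizon = discounted_value(rewards, gamma=0.95)  # 0.95**10, ~0.60
short_horizon = discounted_value(rewards, gamma=0.5)  # 0.5**10, ~0.001
```

The same delayed reward is worth far more under the slow (ventral-like) discount than under the fast (dorsolateral-like) one.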
Affiliation(s)
- Ali Mohebi
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Wei Wei
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Lilian Pelattini
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Kyoungjun Kim
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Joshua D Berke
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA.
- Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, CA, USA.
- Neuroscience Graduate Program, University of California San Francisco, San Francisco, CA, USA.
- Kavli Institute for Fundamental Neuroscience, University of California San Francisco, San Francisco, CA, USA.
- Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA.
15
Wang Y, Lak A, Manohar SG, Bogacz R. Dopamine encoding of novelty facilitates efficient uncertainty-driven exploration. PLoS Comput Biol 2024; 20:e1011516. [PMID: 38626219 PMCID: PMC11051659 DOI: 10.1371/journal.pcbi.1011516] [Received: 09/15/2023] [Revised: 04/26/2024] [Accepted: 03/23/2024] [Indexed: 04/18/2024]
Abstract
When facing an unfamiliar environment, animals need to explore to gain new knowledge about which actions provide reward, but also put the newly acquired knowledge to use as quickly as possible. Optimal reinforcement learning strategies should therefore assess the uncertainties of these action-reward associations and utilise them to inform decision making. We propose a novel model whereby direct and indirect striatal pathways act together to estimate both the mean and variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, facilitating effective uncertainty-driven exploration. We utilised electrophysiological recording data to verify our model of the basal ganglia, and we fitted exploration strategies derived from the neural model to data from behavioural experiments. We also compared the performance of directed exploration strategies inspired by our basal ganglia model with other exploration algorithms, including classic variants of the upper confidence bound (UCB) strategy, in simulation. The exploration strategies inspired by the basal ganglia model achieve overall superior performance in simulation, and we found qualitatively similar results when fitting the model to behavioural data compared with fitting more idealised normative models with less implementation-level detail. Overall, our results suggest that transient dopamine levels in the basal ganglia that encode novelty could contribute to an uncertainty representation which efficiently drives exploration in reinforcement learning.
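For context, the classic UCB baseline mentioned above adds an uncertainty bonus to each action's estimated mean reward, so rarely tried actions are preferentially explored. A minimal two-armed-bandit sketch (constants, seed, and variable names are illustrative, not the authors' code):

```python
import numpy as np

def ucb_choice(means, counts, t, c=2.0):
    """UCB1-style rule: pick the action maximising mean + bonus.

    Untried actions (counts == 0) get an infinite bonus, so each is
    sampled at least once before the bonus starts shrinking.
    """
    bonus = np.where(counts > 0,
                     c * np.sqrt(np.log(t + 1) / np.maximum(counts, 1)),
                     np.inf)
    return int(np.argmax(means + bonus))

rng = np.random.default_rng(0)
true_p = [0.3, 0.7]                       # two-armed Bernoulli bandit
means, counts = np.zeros(2), np.zeros(2)
for t in range(500):
    a = ucb_choice(means, counts, t)
    r = float(rng.random() < true_p[a])
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]  # incremental mean estimate
# the better arm (index 1) ends up sampled far more often
```

The bonus term shrinks as an arm's count grows, which is exactly the uncertainty signal the paper proposes the basal ganglia could represent via reward variance and dopaminergic novelty.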
Affiliation(s)
- Yuhao Wang
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
| | - Armin Lak
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
| | - Sanjay G. Manohar
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
| | - Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
16
Kehrer P, Brigman JL, Cavanagh JF. Depth recordings of the mouse homologue of the Reward Positivity. Cogn Affect Behav Neurosci 2024; 24:292-301. [PMID: 37853299 DOI: 10.3758/s13415-023-01134-z] [Accepted: 09/28/2023] [Indexed: 10/20/2023]
Abstract
We recently advanced a rodent homologue for the reward-specific, event-related potential component observed in humans known as the Reward Positivity. We sought to determine the cortical source of this signal in mice to further test the nature of this homology. While similar reward-related cortical signals have been identified in rats, these recordings were all performed in cingulate gyrus. Given the value-dependent nature of this event, we hypothesized that more ventral prelimbic and infralimbic areas also contribute important variance to this signal. Depth probes assessed local field activity in 29 mice (15 males) while they completed multiple sessions of a probabilistic reinforcement learning task. Using a priori regions of interest, we demonstrated that the depth of recording in the cortical midline significantly correlated with the size of reward-evoked delta band spectral activity as well as the single trial correlation between delta power and reward prediction error. These findings provide important verification of the validity of this translational biomarker of reward responsiveness, learning, and valuation.
Affiliation(s)
- Penelope Kehrer
- Psychology Department, University of New Mexico, Logan Hall, MSC03 2220, Albuquerque, NM 87131, USA
- Department of Neurosciences, University of New Mexico School of Medicine, Albuquerque, NM, USA
| | - Jonathan L Brigman
- Department of Neurosciences, University of New Mexico School of Medicine, Albuquerque, NM, USA
| | - James F Cavanagh
- Psychology Department, University of New Mexico, Logan Hall, MSC03 2220, Albuquerque, NM 87131, USA.
17
Negm A, Ma X, Aggidis G. Deep reinforcement learning challenges and opportunities for urban water systems. Water Res 2024; 253:121145. [PMID: 38330870 DOI: 10.1016/j.watres.2024.121145] [Received: 07/20/2023] [Revised: 01/09/2024] [Accepted: 01/14/2024] [Indexed: 02/10/2024]
Abstract
The efficient and sustainable supply and transport of water is a key component of any functioning civilisation, making urban water systems (UWSs) crucial to the wellbeing of the customers they serve. However, managing water is not a simple task. Whether through ageing infrastructure, transient flows, air cavities, or low pressures, water can be lost to many of the issues that face UWSs. The complexity of these networks grows with rapid urbanisation and climate change, leaving water companies and regulatory bodies in need of new solutions. It is therefore no surprise that many researchers are innovating within the water industry to ensure that the future of our water is safe. Deep reinforcement learning (DRL) has the potential to tackle previously intractable complexities, as it relies on deep neural networks for function approximation and representation. This technology has transformed many fields through its impressive results and can likewise revolutionise UWSs. In this article, we explain the background and milestones of DRL using a novel taxonomy of DRL algorithms. This is followed by a novel review of DRL applications in UWSs, focusing on water distribution networks and stormwater systems. The review concludes with critical insights on how DRL can benefit different aspects of urban water systems.
Affiliation(s)
- Ahmed Negm
- Lancaster University Energy Group, School of Engineering, Lancaster LA1 4YW, UK
| | - Xiandong Ma
- Lancaster University Energy Group, School of Engineering, Lancaster LA1 4YW, UK
| | - George Aggidis
- Lancaster University Energy Group, School of Engineering, Lancaster LA1 4YW, UK.
18
Amo R, Uchida N, Watabe-Uchida M. Glutamate inputs send prediction error of reward, but not negative value of aversive stimuli, to dopamine neurons. Neuron 2024; 112:1001-1019.e6. [PMID: 38278147 PMCID: PMC10957320 DOI: 10.1016/j.neuron.2023.12.019] [Received: 03/26/2023] [Revised: 11/10/2023] [Accepted: 12/21/2023] [Indexed: 01/28/2024]
Abstract
Midbrain dopamine neurons are thought to signal reward prediction errors (RPEs), but the mechanisms underlying RPE computation, particularly the contributions of different neurotransmitters, remain poorly understood. Here, we used a genetically encoded glutamate sensor to examine the pattern of glutamate inputs to dopamine neurons in mice. We found that glutamate inputs exhibit virtually all of the characteristics of RPE rather than conveying a specific component of RPE computation, such as reward or expectation. Notably, whereas glutamate inputs were transiently inhibited by reward omission, they were excited by aversive stimuli. Opioid analgesics altered dopamine negative responses to aversive stimuli into more positive responses, whereas excitatory responses of glutamate inputs remained unchanged. Our findings uncover previously unknown synaptic mechanisms underlying RPE computations; dopamine responses are shaped by both synergistic and competitive interactions between glutamatergic and GABAergic inputs to dopamine neurons depending on valences, with competitive interactions playing a role in responses to aversive stimuli.
Affiliation(s)
- Ryunosuke Amo
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| | - Naoshige Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| | - Mitsuko Watabe-Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA.
19
Lin S, Fan CY, Wang HR, Li XF, Zeng JL, Lan PX, Li HX, Zhang B, Hu C, Xu J, Luo JH. Frontostriatal circuit dysfunction leads to cognitive inflexibility in neuroligin-3 R451C knockin mice. Mol Psychiatry 2024:10.1038/s41380-024-02505-9. [PMID: 38459194 DOI: 10.1038/s41380-024-02505-9] [Received: 06/25/2023] [Revised: 02/24/2024] [Accepted: 02/28/2024] [Indexed: 03/10/2024]
Abstract
Cognitive and behavioral rigidity are observed in various psychiatric diseases, including autism spectrum disorder (ASD), but the underlying mechanism remains to be elucidated. In this study, we found that the neuroligin-3 (NL3) R451C knockin mouse model of autism (KI mice) exhibited deficits in behavioral flexibility in choice selection tasks. Single-unit recording of medium spiny neuron (MSN) activity in the nucleus accumbens (NAc) revealed altered encoding of decision-related cues and impaired updating of choice anticipation in KI mice. Additionally, fiber photometry demonstrated significant disruption in dynamic mesolimbic dopamine (DA) signaling for reward prediction errors (RPEs), along with reduced activity in medial prefrontal cortex (mPFC) neurons projecting to the NAc in KI mice. Interestingly, NL3 re-expression in the mPFC, but not in the NAc, rescued the deficit in flexible behaviors and simultaneously restored NAc-MSN encoding, DA dynamics, and mPFC-NAc output in KI mice. Taken together, this study reveals the frontostriatal circuit dysfunction underlying cognitive inflexibility and establishes a critical role for mPFC NL3 deficiency in this impairment in KI mice. These findings provide new insights into the mechanisms of cognitive and behavioral inflexibility and potential intervention strategies.
Affiliation(s)
- Shen Lin
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China.
- Fujian Provincial Institutes of Brain Disorders and Brain Sciences, First Affiliated Hospital, Fujian Medical University, Fuzhou, China.
| | - Cui-Ying Fan
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Hao-Ran Wang
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China
- Nanhu Brain-Computer Interface Institute, Hangzhou, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, Hangzhou, China
| | - Xiao-Fan Li
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Jia-Li Zeng
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Pei-Xuan Lan
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Hui-Xian Li
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China
| | - Bin Zhang
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou, China
| | - Chun Hu
- Institute for Brain Research and Rehabilitation, Key Laboratory of Brain Cognition and Education Sciences of Ministry of Education, South China Normal University, Guangzhou, China
| | - Junyu Xu
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China.
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, Hangzhou, China.
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou, China.
| | - Jian-Hong Luo
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, School of Brain Science and Brain Medicine, Zhejiang University School of Medicine, Hangzhou, China.
- Nanhu Brain-Computer Interface Institute, Hangzhou, China.
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, State Key Laboratory of Brain-machine Intelligence, Zhejiang University, Hangzhou, China.
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou, China.
20
Hong T, Stauffer WR. Anterior cingulate learns reward distribution. Nat Neurosci 2024; 27:391-392. [PMID: 38351324 DOI: 10.1038/s41593-024-01571-0] [Indexed: 03/08/2024]
Affiliation(s)
- Tao Hong
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA
- Program in Neural Computation, Carnegie Mellon University, Pittsburgh, PA, USA
- Center for the Neural Basis of Cognition, University of Pittsburgh and Carnegie Mellon University, Pittsburgh, PA, USA
| | - William R Stauffer
- Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA.
- Center for the Neural Basis of Cognition, University of Pittsburgh and Carnegie Mellon University, Pittsburgh, PA, USA.
21
Polanía R, Burdakov D, Hare TA. Rationality, preferences, and emotions with biological constraints: it all starts from our senses. Trends Cogn Sci 2024; 28:264-277. [PMID: 38341322 DOI: 10.1016/j.tics.2024.01.003] [Received: 12/28/2022] [Revised: 01/10/2024] [Accepted: 01/11/2024] [Indexed: 02/12/2024]
Abstract
Is the role of our sensory systems to represent the physical world as accurately as possible? If so, are our preferences and emotions, often deemed irrational, decoupled from these 'ground-truth' sensory experiences? We show why the answer to both questions is 'no'. Brain function is metabolically costly, and the brain loses some fraction of the information that it encodes and transmits. Therefore, if brains maximize objective functions that increase the fitness of their species, they should adapt to the objective-maximizing rules of the environment at the earliest stages of sensory processing. Consequently, observed 'irrationalities', preferences, and emotions stem from the necessity for our early sensory systems to adapt and process information while considering the metabolic costs and internal states of the organism.
Affiliation(s)
- Rafael Polanía
- Decision Neuroscience Laboratory, Department of Health Sciences and Technology, ETH, Zurich, Zurich, Switzerland.
| | - Denis Burdakov
- Neurobehavioral Dynamics Laboratory, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
| | - Todd A Hare
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
22
Muller TH, Butler JL, Veselic S, Miranda B, Wallis JD, Dayan P, Behrens TEJ, Kurth-Nelson Z, Kennerley SW. Distributional reinforcement learning in prefrontal cortex. Nat Neurosci 2024; 27:403-408. [PMID: 38200183 PMCID: PMC10917656 DOI: 10.1038/s41593-023-01535-w] [Received: 08/03/2022] [Accepted: 11/29/2023] [Indexed: 01/12/2024]
Abstract
The prefrontal cortex is crucial for learning and decision-making. Classic reinforcement learning (RL) theories center on learning the expectation of potential rewarding outcomes and explain a wealth of neural data in the prefrontal cortex. Distributional RL, on the other hand, learns the full distribution of rewarding outcomes and better explains dopamine responses. In the present study, we show that distributional RL also better explains macaque anterior cingulate cortex neuronal responses, suggesting that it is a common mechanism for reward-guided learning.
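The core distributional idea can be illustrated with quantile-style value updates, in which asymmetric learning rates make different predictors converge to different quantiles of the reward distribution rather than to its mean (a toy sketch in the spirit of distributional RL, not the authors' analysis code; names and constants are illustrative):

```python
import numpy as np

def quantile_td_step(q, reward, taus, lr=0.02):
    """Move each predictor up with weight tau and down with weight
    (1 - tau); at equilibrium q[i] sits near the tau[i]-th quantile
    of the sampled reward distribution."""
    up = reward > q
    return q + np.where(up, lr * taus, -lr * (1.0 - taus))

rng = np.random.default_rng(1)
taus = np.array([0.1, 0.9])     # a pessimistic and an optimistic predictor
q = np.full(2, 0.5)
for _ in range(20000):
    q = quantile_td_step(q, rng.choice([0.0, 1.0]), taus)
# with a bimodal 0/1 reward, the predictors fan out toward 0 and 1
```

A classic mean-tracking unit would settle at 0.5 and discard the bimodality; the asymmetric units jointly encode the shape of the distribution, which is the signature sought in the neural data.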
Affiliation(s)
- Timothy H Muller
- Department of Experimental Psychology, University of Oxford, Oxford, UK.
- Department of Clinical and Movement Neurosciences, University College London, London, UK.
| | - James L Butler
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Department of Clinical and Movement Neurosciences, University College London, London, UK
| | - Sebastijan Veselic
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Department of Clinical and Movement Neurosciences, University College London, London, UK
- Wellcome Trust Centre for Human Neuroimaging, University College London, London, UK
| | - Bruno Miranda
- Department of Clinical and Movement Neurosciences, University College London, London, UK
- Institute of Physiology and Institute of Molecular Medicine, Lisbon School of Medicine, University of Lisbon, Lisbon, Portugal
| | - Joni D Wallis
- Department of Psychology and Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
| | - Peter Dayan
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- University of Tübingen, Tübingen, Germany
| | - Timothy E J Behrens
- Wellcome Trust Centre for Human Neuroimaging, University College London, London, UK
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford, UK
- Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK
| | - Zeb Kurth-Nelson
- Google DeepMind, London, UK.
- Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, UK.
| | - Steven W Kennerley
- Department of Experimental Psychology, University of Oxford, Oxford, UK.
- Department of Clinical and Movement Neurosciences, University College London, London, UK.
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, John Radcliffe Hospital, Oxford, UK.
23
Šabanović M, Lazari A, Blanco-Pozo M, Tisca C, Tachrount M, Martins-Bach AB, Lerch JP, Walton ME, Bannerman DM. Lasting dynamic effects of the psychedelic 2,5-dimethoxy-4-iodoamphetamine ((±)-DOI) on cognitive flexibility. Mol Psychiatry 2024:10.1038/s41380-024-02439-2. [PMID: 38321122 DOI: 10.1038/s41380-024-02439-2] [Received: 07/05/2023] [Revised: 01/15/2024] [Accepted: 01/17/2024] [Indexed: 02/08/2024]
Abstract
Psychedelic drugs can aid fast and lasting remission from various neuropsychiatric disorders, though the underlying mechanisms remain unclear. Preclinical studies suggest serotonergic psychedelics enhance neuronal plasticity, but whether neuroplastic changes can also be seen at cognitive and behavioural levels is unexplored. Here we show that a single dose of the psychedelic 2,5-dimethoxy-4-iodoamphetamine ((±)-DOI) affects structural brain plasticity and cognitive flexibility in young adult mice beyond the acute drug experience. Using ex vivo magnetic resonance imaging, we show increased volumes of several sensory and association areas one day after systemic administration of 2 mg/kg (±)-DOI. We then demonstrate lasting effects of (±)-DOI on cognitive flexibility in a two-step probabilistic reversal learning task, where 2 mg/kg (±)-DOI improved the rate of adaptation to a novel reversal in task structure occurring one week post-treatment. Strikingly, (±)-DOI-treated mice started learning from reward omissions, a unique strategy not typically seen in mice in this task, suggesting heightened sensitivity to previously overlooked cues. Crucially, further experiments revealed that (±)-DOI's effects on cognitive flexibility were contingent on the timing between drug treatment and the novel reversal, as well as on the nature of the intervening experience. (±)-DOI's facilitation of both cognitive adaptation and novel thinking strategies may contribute to the clinical benefits of psychedelic-assisted therapy, particularly in cases of perseverative behaviours and the resistance to change seen in depression, anxiety, or addiction. Furthermore, our findings highlight the crucial role of time-dependent neuroplasticity and the influence of experiential factors in shaping the therapeutic potential of psychedelic interventions for impaired cognitive flexibility.
Affiliation(s)
- Merima Šabanović
- Department of Experimental Psychology, University of Oxford, OX1 3SR, Oxford, UK.
- Department of Psychiatry, Weill Cornell Medicine, New York, NY, 10065, USA.
| | - Alberto Lazari
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, OX3 9DU, Oxford, UK
| | - Marta Blanco-Pozo
- Department of Experimental Psychology, University of Oxford, OX1 3SR, Oxford, UK
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, OX3 9DU, Oxford, UK
- Department of Biology, Stanford University, Stanford, CA, 94305, USA
| | - Cristiana Tisca
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, OX3 9DU, Oxford, UK
| | - Mohamed Tachrount
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, OX3 9DU, Oxford, UK
| | - Aurea B Martins-Bach
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, OX3 9DU, Oxford, UK
| | - Jason P Lerch
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, OX3 9DU, Oxford, UK
| | - Mark E Walton
- Department of Experimental Psychology, University of Oxford, OX1 3SR, Oxford, UK.
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, OX3 9DU, Oxford, UK.
| | - David M Bannerman
- Department of Experimental Psychology, University of Oxford, OX1 3SR, Oxford, UK.
24
Amo R. Prediction error in dopamine neurons during associative learning. Neurosci Res 2024; 199:12-20. [PMID: 37451506 DOI: 10.1016/j.neures.2023.07.003] [Received: 04/03/2023] [Revised: 06/18/2023] [Accepted: 07/07/2023] [Indexed: 07/18/2023]
Abstract
Dopamine neurons have long been thought to facilitate learning by broadcasting reward prediction error (RPE), a teaching signal used in machine learning, but more recent work has advanced alternative models of dopamine's computational role. Here, I revisit this critical issue and review new experimental evidence that tightens the link between dopamine activity and RPE. First, I introduce the recent observation of a gradual backward shift of dopamine activity that had eluded researchers for over a decade. I also discuss several other findings, such as dopamine ramping, that were initially interpreted to conflict with RPE but later found to be consistent with it. These findings improve our understanding of neural computation in dopamine neurons.
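The gradual backward shift discussed above falls out of standard temporal-difference learning. A minimal tabular sketch (trial structure and constants are illustrative, not from the review):

```python
import numpy as np

def run_td(n_trials=300, n_steps=10, alpha=0.1):
    """Tabular TD(0) on a cue -> delay -> reward trial.

    The RPE delta(t) = r(t) + V(t+1) - V(t) first peaks at the reward
    (the last time step); as value propagates backward over trials,
    the peak migrates toward earlier, reward-predictive time steps.
    """
    V = np.zeros(n_steps + 1)               # V[n_steps] is terminal
    deltas = np.zeros((n_trials, n_steps))
    for trial in range(n_trials):
        for t in range(n_steps):
            r = 1.0 if t == n_steps - 1 else 0.0
            delta = r + V[t + 1] - V[t]     # reward prediction error
            V[t] += alpha * delta
            deltas[trial, t] = delta
    return deltas

deltas = run_td()
peaks = deltas.argmax(axis=1)  # time step of the largest RPE per trial
# peaks start at the reward step and move earlier as training proceeds
```

Because value updates propagate one step backward per trial, the RPE bump passes through intermediate delay time steps, mirroring the gradual (rather than instantaneous) shift reported experimentally.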
Affiliation(s)
- Ryunosuke Amo
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA.
25
Yuan B, Zhang J, Lyu A, Wu J, Wang Z, Yang M, Liu K, Mou M, Cui P. Emergence and causality in complex systems: a survey of causal emergence and related quantitative studies. Entropy (Basel) 2024; 26:108. [PMID: 38392363 PMCID: PMC10887681 DOI: 10.3390/e26020108] [Received: 11/07/2023] [Revised: 01/16/2024] [Accepted: 01/18/2024] [Indexed: 02/24/2024]
Abstract
Emergence and causality are two fundamental concepts for understanding complex systems, and they are interconnected. On one hand, emergence refers to the phenomenon where macroscopic properties cannot be solely attributed to the properties of individual components. On the other hand, causality can exhibit emergence, meaning that new causal laws may arise as we increase the level of abstraction. Causal emergence (CE) theory aims to bridge these two concepts and even employs measures of causality to quantify emergence. This paper provides a comprehensive review of recent advancements in quantitative theories and applications of CE. It focuses on two primary challenges: quantifying CE and identifying it from data. The latter task requires the integration of machine learning and neural network techniques, establishing a significant link between causal emergence and machine learning. We highlight two problem categories: CE with machine learning and CE for machine learning, both of which emphasize the crucial role of effective information (EI) as a measure of causal emergence. The final section of this review explores potential applications and provides insights into future perspectives.
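For concreteness, the effective information of a discrete system is typically computed as the mutual information between a maximum-entropy (uniform) intervention on the current state and the resulting next state. A small sketch for a transition probability matrix (the function name is illustrative, not from the survey):

```python
import numpy as np

def effective_information(tpm):
    """EI in bits: average KL divergence between each row's effect
    distribution and the mean effect distribution obtained under a
    uniform do() intervention on the current state."""
    tpm = np.asarray(tpm, dtype=float)
    n = tpm.shape[0]
    avg_effect = tpm.mean(axis=0)   # effect distribution under uniform do()
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.log2(tpm / avg_effect)
        kl_terms = np.where(tpm > 0, tpm * ratio, 0.0)  # 0*log0 := 0
    return kl_terms.sum() / n

ei_det = effective_information([[0, 1], [1, 0]])            # deterministic: 1.0 bit
ei_noise = effective_information([[0.5, 0.5], [0.5, 0.5]])  # maximally noisy: 0.0
```

Deterministic, fully distinguishable dynamics maximise EI, while noise or degeneracy reduces it; causal emergence is then the gain in EI when the same system is described at a coarser scale.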
Affiliation(s)
- Bing Yuan
- Swarma Research, Beijing 100085, China
| | - Jiang Zhang
- Swarma Research, Beijing 100085, China
- School of Systems Sciences, Beijing Normal University, Beijing 100875, China
| | - Aobo Lyu
- Department of Electrical and Systems Engineering, Washington University, St. Louis, MO 63130, USA
| | - Jiayun Wu
- Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
| | - Zhipeng Wang
- School of Systems Sciences, Beijing Normal University, Beijing 100875, China
| | - Mingzhe Yang
- School of Systems Sciences, Beijing Normal University, Beijing 100875, China
| | - Kaiwei Liu
- School of Systems Sciences, Beijing Normal University, Beijing 100875, China
| | - Muyun Mou
- School of Systems Sciences, Beijing Normal University, Beijing 100875, China
| | - Peng Cui
- Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
26
Tolooshams B, Matias S, Wu H, Temereanca S, Uchida N, Murthy VN, Masset P, Ba D. Interpretable deep learning for deconvolutional analysis of neural signals. bioRxiv 2024:2024.01.05.574379. [PMID: 38260512 PMCID: PMC10802267 DOI: 10.1101/2024.01.05.574379] [Indexed: 01/24/2024]
Abstract
The widespread adoption of deep learning to build models that capture the dynamics of neural populations is typically based on "black-box" approaches that lack an interpretable link between neural activity and function. Here, we propose to apply algorithm unrolling, a method for interpretable deep learning, to design the architecture of sparse deconvolutional neural networks and obtain a direct interpretation of network weights in relation to stimulus-driven single-neuron activity through a generative model. We characterize our method, referred to as deconvolutional unrolled neural learning (DUNL), and show its versatility by applying it to deconvolve single-trial local signals across multiple brain areas and recording modalities. To exemplify use cases of our decomposition method, we uncover multiplexed salience and reward prediction error signals from midbrain dopamine neurons in an unbiased manner, perform simultaneous event detection and characterization in somatosensory thalamus recordings, and characterize the responses of neurons in the piriform cortex. Our work leverages the advances in interpretable deep learning to gain a mechanistic understanding of neural dynamics.
Affiliation(s)
- Bahareh Tolooshams
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge MA, 02138
- Computing + Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125
| | - Sara Matias
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Hao Wu
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Simona Temereanca
- Carney Institute for Brain Science, Brown University, Providence, RI, 02906
| | - Naoshige Uchida
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Venkatesh N. Murthy
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
| | - Paul Masset
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- Department of Molecular and Cellular Biology, Harvard University, Cambridge MA, 02138
- Department of Psychology, McGill University, Montréal QC, H3A 1G1
| | - Demba Ba
- Center for Brain Science, Harvard University, Cambridge MA, 02138
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge MA, 02138
- Kempner Institute for the Study of Natural & Artificial Intelligence, Harvard University, Cambridge MA, 02138
27
Clarke-Williams CJ, Lopes-Dos-Santos V, Lefèvre L, Brizee D, Causse AA, Rothaermel R, Hartwich K, Perestenko PV, Toth R, McNamara CG, Sharott A, Dupret D. Coordinating brain-distributed network activities in memory resistant to extinction. Cell 2024; 187:409-427.e19. [PMID: 38242086 PMCID: PMC7615560 DOI: 10.1016/j.cell.2023.12.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2022] [Revised: 07/13/2023] [Accepted: 12/13/2023] [Indexed: 01/21/2024]
Abstract
Certain memories resist extinction to continue invigorating maladaptive actions. The robustness of these memories could depend on their widely distributed implementation across populations of neurons in multiple brain regions. However, how dispersed neuronal activities are collectively organized to underpin a persistent memory-guided behavior remains unknown. To investigate this, we simultaneously monitored the prefrontal cortex, nucleus accumbens, amygdala, hippocampus, and ventral tegmental area (VTA) of the mouse brain from initial recall to post-extinction renewal of a memory involving cocaine experience. We uncover a higher-order pattern of short-lived beta-frequency (15-25 Hz) activities that are transiently coordinated across these networks during memory retrieval. The output of a divergent pathway from upstream VTA glutamatergic neurons, paced by a slower (4-Hz) oscillation, actuates this multi-network beta-band coactivation; its closed-loop phase-informed suppression prevents renewal of cocaine-biased behavior. Binding brain-distributed neural activities in this temporally structured manner may constitute an organizational principle of robust memory expression.
Affiliation(s)
- Charlie J Clarke-Williams
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK.
| | - Vítor Lopes-Dos-Santos
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Laura Lefèvre
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Demi Brizee
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Adrien A Causse
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Roman Rothaermel
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Katja Hartwich
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Pavel V Perestenko
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Robert Toth
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Colin G McNamara
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - Andrew Sharott
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK
| | - David Dupret
- Medical Research Council Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 3TH, UK.
28
Xu S, Ren W. Distinct processing of the state prediction error signals in frontal and parietal correlates in learning the environment model. Cereb Cortex 2024; 34:bhad449. [PMID: 38037370 DOI: 10.1093/cercor/bhad449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Accepted: 10/31/2023] [Indexed: 12/02/2023] Open
Abstract
Goal-directed reinforcement learning constructs a model of how the states in the environment are connected and prospectively evaluates action values by simulating experience. State prediction error (SPE) is theorized as a crucial signal for learning the environment model. However, the underlying neural mechanisms remain unclear. Here, using electroencephalography in a two-stage Markov task, we identified two neural correlates of SPEs: an early negative correlate transferring from frontal to central electrodes and a late positive correlate over parietal regions. Furthermore, by investigating the effects of explicit knowledge about the environment model and of rewards in the environment, we found that, for the parietal correlate, rewards enhanced the representation efficiency (regression beta values) of SPEs, whereas explicit knowledge elicited a larger SPE representation (event-related potential activity) for rare transitions. For the frontal and central correlates, in contrast, rewards increased activity in a content-independent way, and explicit knowledge enhanced activity only for common transitions. Our results suggest that the parietal correlate of SPEs supports the explicit learning of state transition structure, whereas the frontal and central correlates may be involved in cognitive control. Our study provides novel evidence for distinct roles of the frontal and parietal cortices in processing SPEs.
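An SPE in this model-based sense is simply the surprise about which successor state occurred, and it drives learning of the transition model. A minimal tabular sketch (not the authors' analysis code; the learning rate `alpha` is an invented illustration):

```python
import numpy as np

def update_transition_model(T, s, a, s_next, alpha=0.3):
    """Model-based update driven by the state prediction error (SPE).
    T[s, a] is the agent's predicted distribution over successor states;
    the SPE is large when an unlikely successor is observed."""
    n_states = T.shape[2]
    spe = 1.0 - T[s, a, s_next]
    T[s, a, s_next] += alpha * spe                 # strengthen the observed transition
    others = [s2 for s2 in range(n_states) if s2 != s_next]
    T[s, a, others] *= (1.0 - alpha)               # renormalize the alternatives
    return spe
```

Because the update adds `alpha * (1 - p)` to the observed entry and scales the rest by `1 - alpha`, each row of `T[s, a]` remains a valid probability distribution, and rare transitions (the abstract's key contrast) produce large SPEs.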
Affiliation(s)
- Shuyuan Xu
- MOE Key Laboratory of Modern Teaching Technology, Shaanxi Normal University, Xi'an, Shaanxi, China
| | - Wei Ren
- MOE Key Laboratory of Modern Teaching Technology, Shaanxi Normal University, Xi'an, Shaanxi, China
- Faculty of Education, Shaanxi Normal University, Xi'an, Shaanxi, China
29
Chan HK, Toyoizumi T. A multi-stage anticipated surprise model with dynamic expectation for economic decision-making. Sci Rep 2024; 14:657. [PMID: 38182692 PMCID: PMC10770108 DOI: 10.1038/s41598-023-50529-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Accepted: 12/20/2023] [Indexed: 01/07/2024] Open
Abstract
Many modeling works aim to explain behaviors that violate classical economic theories. However, these models often fail to fully account for the multi-stage nature of real-life problems and people's tendency to solve complicated problems sequentially. In this work, we propose a descriptive decision-making model for multi-stage problems with perceived post-decision information. In the model, decisions are chosen based on a quantity we call the 'anticipated surprise'. The reference point is determined by the expected value of the possible outcomes, which we assume changes dynamically during the mental simulation of a sequence of events. We illustrate how our formalism can help explain prominent economic paradoxes and gambling behaviors that involve multi-stage or sequential planning. We also discuss how neuroscience findings, such as prediction error signals and introspective neuronal replay, as well as psychological theories such as affective forecasting, relate to the features of our model. This provides hints for future experiments to investigate the role of these entities in decision-making.
Affiliation(s)
- Ho Ka Chan
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Wako, Japan.
| | - Taro Toyoizumi
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Wako, Japan.
- Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan.
30
Lowet AS, Zheng Q, Meng M, Matias S, Drugowitsch J, Uchida N. An opponent striatal circuit for distributional reinforcement learning. bioRxiv 2024:2024.01.02.573966. [PMID: 38260354 PMCID: PMC10802299 DOI: 10.1101/2024.01.02.573966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/24/2024]
Abstract
Machine learning research has achieved large performance gains on a wide range of tasks by expanding the learning target from mean rewards to entire probability distributions of rewards - an approach known as distributional reinforcement learning (RL) [1]. The mesolimbic dopamine system is thought to underlie RL in the mammalian brain by updating a representation of mean value in the striatum [2,3], but little is known about whether, where, and how neurons in this circuit encode information about higher-order moments of reward distributions [4]. To fill this gap, we used high-density probes (Neuropixels) to acutely record striatal activity from well-trained, water-restricted mice performing a classical conditioning task in which reward mean, reward variance, and stimulus identity were independently manipulated. In contrast to traditional RL accounts, we found robust evidence for abstract encoding of variance in the striatum. Remarkably, chronic ablation of dopamine inputs disorganized these distributional representations in the striatum without interfering with mean value coding. Two-photon calcium imaging and optogenetics revealed that the two major classes of striatal medium spiny neurons - D1 and D2 MSNs - contributed to this code by preferentially encoding the right and left tails of the reward distribution, respectively. We synthesize these findings into a new model of the striatum and mesolimbic dopamine that harnesses the opponency between D1 and D2 MSNs [5-15] to reap the computational benefits of distributional RL.
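The distributional-RL idea of opponent populations preferentially encoding the right and left tails of the reward distribution can be illustrated with asymmetric learning rates, expectile-style. This is a generic sketch with invented rates and reward statistics, not the study's model:

```python
import numpy as np

def asymmetric_update(v, r, alpha_pos, alpha_neg):
    """Scale positive and negative prediction errors by different learning rates.
    alpha_pos > alpha_neg drags the estimate toward the distribution's right
    tail (optimistic, D1-like); the reverse is pessimistic (D2-like)."""
    delta = r - v
    return v + (alpha_pos if delta > 0 else alpha_neg) * delta

rng = np.random.default_rng(0)
rewards = rng.choice([0.0, 1.0], size=5000)   # bimodal reward distribution, mean 0.5

v_d1, v_d2 = 0.5, 0.5
hist_d1, hist_d2 = [], []
for r in rewards:
    v_d1 = asymmetric_update(v_d1, r, 0.2, 0.05)  # tracks the upper tail
    v_d2 = asymmetric_update(v_d2, r, 0.05, 0.2)  # tracks the lower tail
    hist_d1.append(v_d1)
    hist_d2.append(v_d2)
```

With these rates the two estimators settle near the 0.8 and 0.2 expectiles of the reward distribution - its right and left tails - rather than its mean, so together they carry variance information that a single mean-value estimator discards.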
Affiliation(s)
- Adam S Lowet
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Program in Neuroscience, Harvard University, Boston, MA, USA
| | - Qiao Zheng
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
| | - Melissa Meng
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
| | - Sara Matias
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
| | - Jan Drugowitsch
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
| | - Naoshige Uchida
- Center for Brain Science, Harvard University, Cambridge, MA, USA
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
31
Hoy CW, Quiroga-Martinez DR, Sandoval E, King-Stephens D, Laxer KD, Weber P, Lin JJ, Knight RT. Asymmetric coding of reward prediction errors in human insula and dorsomedial prefrontal cortex. Nat Commun 2023; 14:8520. [PMID: 38129440 PMCID: PMC10739882 DOI: 10.1038/s41467-023-44248-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2022] [Accepted: 12/05/2023] [Indexed: 12/23/2023] Open
Abstract
The signed value and unsigned salience of reward prediction errors (RPEs) are critical to understanding reinforcement learning (RL) and cognitive control. Dorsomedial prefrontal cortex (dMPFC) and insula (INS) are key regions for integrating reward and surprise information, but conflicting evidence for both signed and unsigned activity has led to multiple proposals for the nature of RPE representations in these brain areas. Recently developed RL models allow neurons to respond differently to positive and negative RPEs. Here, we use intracranially recorded high frequency activity (HFA) to test whether this flexible asymmetric coding strategy captures RPE coding diversity in human INS and dMPFC. At the region level, we found a bias towards positive RPEs in both areas which paralleled behavioral adaptation. At the local level, we found spatially interleaved neural populations responding to unsigned RPE salience and valence-specific positive and negative RPEs. Furthermore, directional connectivity estimates revealed a leading role of INS in communicating positive and unsigned RPEs to dMPFC. These findings support asymmetric coding across distinct but intermingled neural populations as a core principle of RPE processing and inform theories of the role of dMPFC and INS in RL and cognitive control.
Affiliation(s)
- Colin W Hoy
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA.
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA.
| | - David R Quiroga-Martinez
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- Center for Music in the Brain, Aarhus University & The Royal Academy of Music, Aarhus, Denmark
| | - Eduardo Sandoval
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
| | - David King-Stephens
- Department of Neurology and Neurosurgery, California Pacific Medical Center, San Francisco, CA, USA
- Department of Neurology, Yale School of Medicine, New Haven, CT, USA
| | - Kenneth D Laxer
- Department of Neurology and Neurosurgery, California Pacific Medical Center, San Francisco, CA, USA
| | - Peter Weber
- Department of Neurology and Neurosurgery, California Pacific Medical Center, San Francisco, CA, USA
| | - Jack J Lin
- Department of Neurology, University of California, Davis, Davis, CA, USA
- Center for Mind and Brain, University of California, Davis, Davis, CA, USA
| | - Robert T Knight
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- Department of Psychology, University of California, Berkeley, Berkeley, CA, USA
32
Shikano Y, Yagishita S, Tanaka KF, Takata N. Slow-rising and fast-falling dopaminergic dynamics jointly adjust negative prediction error in the ventral striatum. Eur J Neurosci 2023; 58:4502-4522. [PMID: 36843200 DOI: 10.1111/ejn.15945] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2023] [Accepted: 02/22/2023] [Indexed: 02/28/2023]
Abstract
The greater the reward expectation, the larger the brain's physiological response to its violation. Although it is well documented that better-than-expected outcomes are encoded quantitatively via midbrain dopaminergic (DA) activity, whether worse-than-expected outcomes are also expressed quantitatively has received less experimental attention. We show that larger reward expectations upon unexpected reward omissions are associated with a preceding slower rise and a following larger decrease (DA dip) in the DA concentration in the ventral striatum of mice. We set up a lever-press task on a fixed-ratio (FR) schedule requiring five lever presses for a food reward (FR5). The mice occasionally checked the food magazine without a reward before completing the task. The percentage of this premature magazine entry (PME) increased as the number of lever presses approached five, showing that reward expectation rose with increasing proximity to task completion. Fibre photometry of extracellular DA dynamics in the ventral striatum using a fluorescent protein (genetically encoded GPCR-activation-based DA sensor: GRABDA2m) revealed that the slow increase and fast decrease in DA levels around PMEs were correlated with the PME percentage, demonstrating a monotonic relationship between the DA dip amplitude and the degree of expectation. Computational modelling of the lever-press task implementing temporal difference errors and state transitions replicated the observed correlation between PME frequency and DA dip amplitude in the FR5 task. Taken together, these findings indicate that the DA dip amplitude monotonically represents the degree of reward expectation, which may guide behavioural adjustment.
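The temporal-difference account tested here can be sketched with a tabular TD(0) model of the FR5 sequence, in which an unexpected reward omission yields a large negative prediction error - the 'DA dip' - whose size scales with the learned value at the omission point. The state layout, learning rate, and discount factor below are illustrative assumptions, not the authors' fitted model:

```python
import numpy as np

def run_episode(values, rewarded, alpha=0.1, gamma=0.98):
    """One pass through the FR5 sequence with tabular TD(0).
    States 0-4 are the five lever presses; state 5 is the outcome state.
    Returns the TD errors (dopamine-like signals) at each step."""
    rewards = [0.0, 0.0, 0.0, 0.0, 1.0 if rewarded else 0.0]
    deltas = []
    for t in range(5):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        values[t] += alpha * delta
        deltas.append(delta)
    return deltas

values = np.zeros(6)
for _ in range(500):                            # learn the rewarded sequence
    run_episode(values, rewarded=True)
omission = run_episode(values, rewarded=False)  # unexpected reward omission
dip = omission[-1]                              # large negative error: the DA dip
```

After training, the within-sequence prediction errors are near zero and the omission step carries a dip of roughly minus the learned value of the final press state; with smaller expectations (partial training), the dip shrinks accordingly.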
Affiliation(s)
- Yu Shikano
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, Tokyo, Japan
| | - Sho Yagishita
- Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, Tokyo, Japan
| | - Kenji F Tanaka
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
| | - Norio Takata
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
33
Danskin BP, Hattori R, Zhang YE, Babic Z, Aoi M, Komiyama T. Exponential history integration with diverse temporal scales in retrosplenial cortex supports hyperbolic behavior. Sci Adv 2023; 9:eadj4897. [PMID: 38019904 PMCID: PMC10686558 DOI: 10.1126/sciadv.adj4897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/29/2023] [Accepted: 10/27/2023] [Indexed: 12/01/2023]
Abstract
Animals use past experience to guide future choices. The integration of experiences typically follows a hyperbolic, rather than exponential, decay pattern with a heavy tail for distant history. Hyperbolic integration affords sensitivity to both recent environmental dynamics and long-term trends. However, it is unknown how the brain implements hyperbolic integration. We found that mouse behavior in a foraging task showed hyperbolic decay of past experience, but the activity of cortical neurons showed exponential decay. We resolved this apparent mismatch by observing that cortical neurons encode history information with heterogeneous exponential time constants that vary across neurons. A model combining these diverse timescales recreated the heavy-tailed, hyperbolic history integration observed in behavior. In particular, the time constants of retrosplenial cortex (RSC) neurons best matched the behavior, and optogenetic inactivation of RSC uniquely reduced behavioral history dependence. These results indicate that behavior-relevant history information is maintained across multiple timescales in parallel and that RSC is a critical reservoir of information guiding decision-making.
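The key computational point - that averaging exponential traces with heterogeneous time constants yields a heavy-tailed, approximately hyperbolic decay - can be checked in a few lines. The range and number of time constants are an invented illustration, not the recorded distribution:

```python
import numpy as np

t = np.linspace(0.0, 20.0, 201)
taus = np.logspace(-0.5, 1.5, 50)            # ~0.3 to ~32; one time constant per "neuron"
traces = np.array([np.exp(-t / tau) for tau in taus])
mixture = traces.mean(axis=0)                # population readout across timescales
single = np.exp(-t / 1.0)                    # one representative fast neuron
```

At long delays the single fast exponential is essentially zero while the mixture, like a hyperbola, still retains several percent of its initial value - the heavy tail that lets behavior stay sensitive to distant history.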
Affiliation(s)
- Bethanny P. Danskin
- Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
| | - Ryoma Hattori
- Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
| | - Yu E. Zhang
- Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
| | - Zeljana Babic
- Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
| | - Mikio Aoi
- Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
| | - Takaki Komiyama
- Department of Neurobiology, University of California San Diego, La Jolla, CA, USA
- Center for Neural Circuits and Behavior, University of California San Diego, La Jolla, CA, USA
- Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
34
Sands LP, Jiang A, Liebenow B, DiMarco E, Laxton AW, Tatter SB, Montague PR, Kishida KT. Subsecond fluctuations in extracellular dopamine encode reward and punishment prediction errors in humans. Sci Adv 2023; 9:eadi4927. [PMID: 38039368 PMCID: PMC10691773 DOI: 10.1126/sciadv.adi4927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/28/2023] [Accepted: 11/03/2023] [Indexed: 12/03/2023]
Abstract
In the mammalian brain, midbrain dopamine neuron activity is hypothesized to encode reward prediction errors that promote learning and guide behavior by causing rapid changes in dopamine levels in target brain regions. This hypothesis (and alternatives regarding dopamine's role in punishment learning) has limited direct evidence in humans. We report intracranial, subsecond measurements of dopamine release in the human striatum, recorded while volunteers (i.e., patients undergoing deep brain stimulation surgery) performed a probabilistic reward and punishment learning choice task designed to test whether dopamine release encodes only reward prediction errors or whether it may also encode adaptive punishment learning signals. The results demonstrate that extracellular dopamine levels can encode both reward and punishment prediction errors within distinct time intervals via independent valence-specific pathways in the human brain.
Affiliation(s)
- L. Paul Sands
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Angela Jiang
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Brittany Liebenow
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Emily DiMarco
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Adrian W. Laxton
- Department of Neurosurgery, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - Stephen B. Tatter
- Department of Neurosurgery, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
| | - P. Read Montague
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG London, UK
- Fralin Biomedical Research Institute, Virginia Tech, Roanoke, VA 24016, USA
- Department of Physics, Virginia Tech, Blacksburg, VA 24061, USA
| | - Kenneth T. Kishida
- Neuroscience Graduate Program, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Physiology and Pharmacology, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
- Department of Neurosurgery, Wake Forest School of Medicine, Winston-Salem, NC 27101, USA
35
Blaess S, Krabbe S. Cell type specificity for circuit output in the midbrain dopaminergic system. Curr Opin Neurobiol 2023; 83:102811. [PMID: 37972537 DOI: 10.1016/j.conb.2023.102811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2023] [Revised: 09/14/2023] [Accepted: 10/19/2023] [Indexed: 11/19/2023]
Abstract
Midbrain dopaminergic neurons are a relatively small group of neurons in the mammalian brain controlling a wide range of behaviors. In recent years, increasingly sophisticated tracing, imaging, transcriptomic, and machine learning approaches have provided substantial insights into the anatomical, molecular, and functional heterogeneity of dopaminergic neurons. Despite this wealth of new knowledge, it remains unclear whether and how the diverse features defining dopaminergic subclasses converge to delineate functional ensembles within the dopaminergic system. Here, we review recent studies investigating various aspects of dopaminergic heterogeneity and discuss how development, behavior, and disease influence subtype characteristics. We then outline what further approaches could be pursued to gain a more inclusive picture of dopaminergic diversity, which could be crucial to understanding the functional architecture of this system.
Affiliation(s)
- Sandra Blaess
- Neurodevelopmental Genetics, Institute of Reconstructive Neurobiology, Medical Faculty, University of Bonn, 53127 Bonn, Germany.
| | - Sabine Krabbe
- German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany.
36
Pinto SR, Uchida N. Tonic dopamine and biases in value learning linked through a biologically inspired reinforcement learning model. bioRxiv 2023:2023.11.10.566580. [PMID: 38014087 PMCID: PMC10680794 DOI: 10.1101/2023.11.10.566580] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/29/2023]
Abstract
A hallmark of various psychiatric disorders is biased future predictions. Here we examined the mechanisms of biased value learning using reinforcement learning models that incorporate recent findings on synaptic plasticity and opponent circuit mechanisms in the basal ganglia. We show that variations in tonic dopamine can alter the balance between learning from positive and negative reward prediction errors, leading to biased value predictions. This bias arises from the sigmoidal shapes of the dose-occupancy curves and the distinct affinities of D1- and D2-type dopamine receptors: changes in tonic dopamine differentially alter the slopes of the two receptors' dose-occupancy curves, and thus their sensitivities, at baseline dopamine concentrations. We show that this mechanism can explain biased value learning in both mice and humans and may also contribute to symptoms observed in psychiatric disorders. Our model provides a foundation for understanding the basal ganglia circuit and underscores the significance of tonic dopamine in modulating learning processes.
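The proposed mechanism rests on simple receptor biophysics: under one-site binding, the slope of the dose-occupancy curve at the tonic baseline sets each receptor class's sensitivity to phasic transients. A sketch with illustrative (not measured) affinities and baselines:

```python
def occupancy(da, kd):
    """Fraction of receptors bound at dopamine concentration `da`
    (one-site binding; `kd` in the same units as `da`)."""
    return da / (da + kd)

def sensitivity(da, kd):
    """Slope d(occupancy)/d(da): how strongly a phasic dopamine
    transient is felt at this tonic baseline."""
    return kd / (da + kd) ** 2

# Illustrative nM-scale affinities: D2 receptors bind dopamine far more
# tightly than D1 receptors (invented round numbers for the sketch).
KD_D1, KD_D2 = 1000.0, 10.0
LOW_TONIC, HIGH_TONIC = 20.0, 200.0   # two hypothetical tonic baselines
```

Raising tonic dopamine from `LOW_TONIC` to `HIGH_TONIC` saturates the high-affinity D2 receptors, collapsing their slope, while D1 receptors stay on the steep part of their curve - shifting the balance between learning from positive and negative prediction errors.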
Affiliation(s)
- Sandra Romero Pinto
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
- Program in Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA 02115, USA
| | - Naoshige Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| |
Collapse
|
37
|
Shih WY, Yu HY, Lee CC, Chou CC, Chen C, Glimcher PW, Wu SW. Electrophysiological population dynamics reveal context dependencies during decision making in human frontal cortex. Nat Commun 2023; 14:7821. [PMID: 38016973 PMCID: PMC10684521 DOI: 10.1038/s41467-023-42092-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2022] [Accepted: 09/28/2023] [Indexed: 11/30/2023] Open
Abstract
Evidence from monkeys and humans suggests that the orbitofrontal cortex (OFC) encodes the subjective value of options under consideration during choice. Data from non-human primates suggest that these value signals are context-dependent, representing subjective value in a way influenced by the decision makers' recent experience. With electrodes distributed throughout cortical and subcortical structures, human epilepsy patients performed an auction task in which they repeatedly reported the subjective values they placed on snack food items. High-gamma activity in many cortical and subcortical sites, including the OFC, positively correlated with subjective value. Other OFC sites showed signals contextually modulated by the subjective value of previously offered goods - a context dependency predicted by theory but not previously observed in humans. These results suggest that value and value-context signals are simultaneously present but separately represented in human frontal cortical activity.
Affiliation(s)
- Wan-Yu Shih
- Institute of Neuroscience, College of Life Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC.
| | - Hsiang-Yu Yu
- College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
- Department of Epilepsy, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
- Brain Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
| | - Cheng-Chia Lee
- Department of Epilepsy, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
- Brain Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
- Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
| | - Chien-Chen Chou
- College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
- Department of Epilepsy, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
- Brain Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
| | - Chien Chen
- College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
- Department of Epilepsy, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
- Brain Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
| | - Paul W Glimcher
- Neuroscience Institute, NYU Grossman School of Medicine, New York, NY, USA.
- Department of Neuroscience and Physiology, NYU Grossman School of Medicine, New York, NY, USA.
| | - Shih-Wei Wu
- Institute of Neuroscience, College of Life Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC.
- Brain Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC.
| |
|
38
|
Willmore L, Minerva AR, Engelhard B, Murugan M, McMannon B, Oak N, Thiberge SY, Peña CJ, Witten IB. Overlapping representations of food and social stimuli in mouse VTA dopamine neurons. Neuron 2023; 111:3541-3553.e8. [PMID: 37657441 DOI: 10.1016/j.neuron.2023.08.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2022] [Revised: 05/17/2023] [Accepted: 08/03/2023] [Indexed: 09/03/2023]
Abstract
Dopamine neurons of the ventral tegmental area (VTA-DA neurons) respond to food and social stimuli and contribute to both forms of motivation. However, it is unclear whether the same or different VTA-DA neurons encode these different stimuli. To address this question, we performed two-photon calcium imaging in mice presented with food and conspecifics and found statistically significant overlap in the populations responsive to both stimuli. Both hunger and opposite-sex social experience further increased the proportion of neurons that respond to both stimuli, implying that increasing motivation for one stimulus increases overlap. In addition, single-nucleus RNA sequencing revealed significant co-expression of feeding- and social-hormone-related genes in individual VTA-DA neurons. Taken together, our functional and transcriptional data suggest that overlapping VTA-DA populations underlie food and social motivation.
Affiliation(s)
- Lindsay Willmore
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
| | - Adelaide R Minerva
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
| | - Ben Engelhard
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA; Faculty of Medicine, Technion, Haifa 3525433, Israel.
| | - Malavika Murugan
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
| | - Brenna McMannon
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
| | - Nirja Oak
- Faculty of Medicine, Technion, Haifa 3525433, Israel
| | - Stephan Y Thiberge
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA
| | - Catherine J Peña
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA.
| | - Ilana B Witten
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA.
| |
|
39
|
Ferrari-Toniolo S, Schultz W. Reliable population code for subjective economic value from heterogeneous neuronal signals in primate orbitofrontal cortex. Neuron 2023; 111:3683-3696.e7. [PMID: 37678250 DOI: 10.1016/j.neuron.2023.08.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2022] [Revised: 03/31/2023] [Accepted: 08/08/2023] [Indexed: 09/09/2023]
Abstract
Behavior-related neuronal signals often vary between neurons, which might reflect the unreliability of individual neurons or a truly heterogeneous code. This notion may also apply to economic ("value-based") choices and the underlying reward signals. Reward value is subjective and can be described by a nonlinearly weighted magnitude (utility) and probability. Defining subjective values relies on the continuity axiom, whose testing involves structured variations of a wide range of reward magnitudes and probabilities. Axiom compliance demonstrates understanding of the stimuli and the meaningful character of choices. Using these tests, we investigated the encoding of subjective economic value by neurons in a key economic-decision structure of the monkey brain, the orbitofrontal cortex (OFC). We found that individual neurons carry heterogeneous neuronal value signals that largely fail to match the animal's choices. However, neuronal population signals matched the animal's choices well, suggesting accurate subjective economic value encoding by a heterogeneous population of unreliable neurons.
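The headline result, unreliable single neurons but a reliable population, is easy to reproduce in a toy simulation (our own sketch with invented gains and noise levels, not the paper's analysis):

```python
import numpy as np

rng = np.random.default_rng(1)

# Each "neuron" encodes subjective value with an idiosyncratic gain, offset,
# and large private noise; the population mean still tracks value closely.
true_values = np.linspace(0.0, 1.0, 50)          # 50 hypothetical offers
n_neurons = 200
gains = rng.normal(1.0, 0.5, n_neurons)
offsets = rng.normal(0.0, 0.5, n_neurons)
noise = rng.normal(0.0, 1.0, (n_neurons, true_values.size))
responses = gains[:, None] * true_values[None, :] + offsets[:, None] + noise

# Correlation with value: each neuron alone vs. the population average.
single_r = np.array([np.corrcoef(responses[i], true_values)[0, 1]
                     for i in range(n_neurons)])
pop_r = np.corrcoef(responses.mean(axis=0), true_values)[0, 1]
```

Averaging cancels the private noise (its standard deviation shrinks as 1/sqrt(N)), so the population correlation approaches the signal's ceiling even though typical single-neuron correlations stay modest.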
Affiliation(s)
- Simone Ferrari-Toniolo
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK.
| | - Wolfram Schultz
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
| |
|
40
|
Masset P, Tano P, Kim HR, Malik AN, Pouget A, Uchida N. Multi-timescale reinforcement learning in the brain. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.12.566754. [PMID: 38014166 PMCID: PMC10680596 DOI: 10.1101/2023.11.12.566754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/29/2023]
Abstract
To thrive in complex environments, animals and artificial agents must learn to act adaptively to maximize fitness and rewards. Such adaptive behavior can be learned through reinforcement learning [1], a class of algorithms that has been successful at training artificial agents [2-6] and at characterizing the firing of dopamine neurons in the midbrain [7-9]. In classical reinforcement learning, agents discount future rewards exponentially according to a single time scale, controlled by the discount factor. Here, we explore the presence of multiple timescales in biological reinforcement learning. We first show that reinforcement learning agents that learn at a multitude of timescales gain distinct computational benefits. Next, we report that dopamine neurons in mice performing two behavioral tasks encode reward prediction error with a diversity of discount time constants. Our model explains the heterogeneity of temporal discounting in both cue-evoked transient responses and slower timescale fluctuations known as dopamine ramps. Crucially, the measured discount factor of individual neurons is correlated across the two tasks, suggesting that it is a cell-specific property. Together, our results provide a new paradigm to understand functional heterogeneity in dopamine neurons, a mechanistic basis for the empirical observation that humans and animals use non-exponential discounts in many situations [10-14], and open new avenues for the design of more efficient reinforcement learning algorithms.
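The core idea, a population of learners sharing one reward stream but discounting it with different factors, can be sketched with tabular TD(0) (an illustrative toy, not the authors' model; names and parameters here are invented):

```python
import numpy as np

def td_values_multi_gamma(rewards, gammas, alpha=0.1, n_passes=200):
    """Tabular TD(0) over a fixed episode, one value table per discount
    factor; V[g, t] estimates future reward discounted at rate gammas[g]."""
    T = len(rewards)
    V = np.zeros((len(gammas), T + 1))  # index T is the terminal state (V = 0)
    for g, gamma in enumerate(gammas):
        for _ in range(n_passes):
            for t in range(T):
                delta = rewards[t] + gamma * V[g, t + 1] - V[g, t]  # TD error (RPE)
                V[g, t] += alpha * delta
    return V[:, :T]

# Single reward at the end of a 5-step episode: the value at t = 0 converges
# to gamma**4, so the population spans a spectrum of temporal horizons.
rewards = [0.0, 0.0, 0.0, 0.0, 1.0]
gammas = [0.5, 0.9, 0.99]
V = td_values_multi_gamma(rewards, gammas)
```

Reading out V[:, 0] across the population gives, in effect, a bank of exponential filters over future reward, which is one way to motivate the computational benefits the abstract describes.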
Affiliation(s)
- Paul Masset
- Department of Molecular and Cellular Biology, Harvard University, USA
- Center for Brain Science, Harvard University, USA
| | - Pablo Tano
- Department of Basic Neuroscience, University of Geneva, Switzerland
| | - HyungGoo R. Kim
- Department of Molecular and Cellular Biology, Harvard University, USA
- Center for Brain Science, Harvard University, USA
- Department of Biomedical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Center for Neuroscience Imaging Research, Institute for Basic Science (IBS), Suwon 16419, Republic of Korea
| | - Athar N. Malik
- Department of Molecular and Cellular Biology, Harvard University, USA
- Center for Brain Science, Harvard University, USA
- Department of Neurosurgery, Warren Alpert Medical School of Brown University, USA
- Norman Prince Neurosciences Institute, Rhode Island Hospital, USA
| | - Alexandre Pouget
- Department of Basic Neuroscience, University of Geneva, Switzerland
| | - Naoshige Uchida
- Department of Molecular and Cellular Biology, Harvard University, USA
- Center for Brain Science, Harvard University, USA
| |
|
41
|
Amo R, Uchida N, Watabe-Uchida M. Glutamate inputs send prediction error of reward but not negative value of aversive stimuli to dopamine neurons. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.09.566472. [PMID: 37986868 PMCID: PMC10659341 DOI: 10.1101/2023.11.09.566472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/22/2023]
Abstract
Midbrain dopamine neurons are thought to signal reward prediction errors (RPEs) but the mechanisms underlying RPE computation, particularly contributions of different neurotransmitters, remain poorly understood. Here we used a genetically-encoded glutamate sensor to examine the pattern of glutamate inputs to dopamine neurons. We found that glutamate inputs exhibit virtually all of the characteristics of RPE, rather than conveying a specific component of RPE computation such as reward or expectation. Notably, while glutamate inputs were transiently inhibited by reward omission, they were excited by aversive stimuli. Opioid analgesics altered dopamine negative responses to aversive stimuli toward more positive responses, while excitatory responses of glutamate inputs remained unchanged. Our findings uncover previously unknown synaptic mechanisms underlying RPE computations; dopamine responses are shaped by both synergistic and competitive interactions between glutamatergic and GABAergic inputs to dopamine neurons depending on valences, with competitive interactions playing a role in responses to aversive stimuli.
Affiliation(s)
- Ryunosuke Amo
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| | - Naoshige Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| | - Mitsuko Watabe-Uchida
- Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| |
|
42
|
Stetsenko A, Koos T. Neuronal implementation of the temporal difference learning algorithm in the midbrain dopaminergic system. Proc Natl Acad Sci U S A 2023; 120:e2309015120. [PMID: 37903252 PMCID: PMC10636325 DOI: 10.1073/pnas.2309015120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Accepted: 09/29/2023] [Indexed: 11/01/2023] Open
Abstract
The temporal difference learning (TDL) algorithm has been essential to conceptualizing the role of dopamine in reinforcement learning (RL). Despite its theoretical importance, it remains unknown whether a neuronal implementation of this algorithm exists in the brain. Here, we provide an interpretation of the recently described signaling properties of ventral tegmental area (VTA) GABAergic neurons and show that a circuit of these neurons implements the TDL algorithm. Specifically, we identified the neuronal mechanisms of three key components of the TDL model: a sustained state-value signal encoded by an afferent input to the VTA; a temporal differentiation circuit formed by two types of VTA GABAergic neurons, the combined output of which computes momentary reward prediction (RP) as the derivative of the state value; and the computation of reward prediction errors (RPEs) in dopamine neurons utilizing the output of the differentiation circuit. Using computational methods, we also show that this mechanism is optimally adapted to the biophysics of RPE signaling in dopamine neurons, mechanistically links the emergence of conditioned reinforcement to RP, and can naturally account for the temporal discounting of reinforcement. Elucidating the implementation of the TDL algorithm may further the investigation of RL in biological and artificial systems.
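The three components can be caricatured in a few lines (a sketch under our own simplifying assumptions, not the authors' circuit model): a sustained state-value trace, a differentiation stage built from a delayed copy of that trace, and a dopamine-like unit summing reward with the derivative:

```python
import numpy as np

def rpe_from_differentiation(state_value, reward, dt=1.0, delay=1):
    """RPE = r(t) + dV/dt, with the derivative approximated by subtracting
    a delayed copy of the sustained value signal (the role played by the
    two GABAergic populations in the circuit described above)."""
    delayed = np.concatenate([np.zeros(delay), state_value[:-delay]])
    reward_prediction = (state_value - delayed) / (delay * dt)  # ~ dV/dt
    return reward + reward_prediction

T = 10
value = np.linspace(0.0, 1.0, T)          # value ramps up toward reward time
reward = np.zeros(T)
reward[-1] = 1.0                          # reward delivered at the last step
delta = rpe_from_differentiation(value, reward)
```

This omits temporal discounting for brevity; a fuller continuous-time TD error also carries a term proportional to V(t) itself.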
Affiliation(s)
- Anya Stetsenko
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102
| | - Tibor Koos
- Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ 07102
| |
|
43
|
Kalhan S, Garrido MI, Hester R, Redish AD. Reward prediction-errors weighted by cue salience produces addictive behaviours in simulations, with asymmetrical learning and steeper delay discounting. Neural Netw 2023; 168:631-650. [PMID: 37844522 DOI: 10.1016/j.neunet.2023.09.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2023] [Revised: 07/23/2023] [Accepted: 09/19/2023] [Indexed: 10/18/2023]
Abstract
Dysfunction in learning and motivational systems is thought to contribute to addictive behaviours. Previous models have suggested that dopaminergic roles in learning and motivation could produce addictive behaviours through pharmacological manipulations that provide excess dopaminergic signalling towards these learning and motivational systems. Redish (2004) suggested a role based on dopaminergic signals of value prediction error, while Zhang et al. (2009) suggested a role based on dopaminergic signals of motivation. However, both models have significant limitations: they do not explain the reduced sensitivity to drug-related costs/negative consequences, the increased impulsivity generally found in people with a substance use disorder, craving behaviours, or non-pharmacological dependence, all of which are key hallmarks of addictive behaviours. Here, we propose a novel mathematical definition of salience that combines aspects of dopamine's role in both learning and motivation within the reinforcement learning framework. Using a single parameter regime, we simulated addictive behaviours that the Redish (2004) and Zhang et al. (2009) models also produce, but we went further in simulating the downweighting of drug-related negative prediction errors, steeper delay discounting of drug rewards, craving behaviours, and aspects of behavioural/non-pharmacological addictions. The current salience model builds on our recently proposed conceptual theory that salience modulates internal-representation updating and may contribute to addictive behaviours by producing misaligned internal representations (Kalhan et al., 2021). Critically, our current mathematical model of salience argues that the seemingly disparate learning and motivational aspects of dopaminergic functioning may interact through a salience mechanism that modulates internal-representation updating.
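The gist of the proposal can be illustrated with a toy update rule (our own minimal sketch; the parameter names, values, and the specific form of the down-weighting are invented for illustration, not taken from the paper):

```python
def salience_weighted_update(value, reward, salience, alpha=0.1,
                             neg_downweight=0.25):
    """One value update in which the prediction error is scaled by cue
    salience, and negative errors are additionally down-weighted
    (a caricature of reduced sensitivity to negative outcomes)."""
    delta = reward - value                 # simple (Rescorla-Wagner) error
    if delta < 0:
        delta *= neg_downweight
    return value + alpha * salience * delta

# A high-salience (e.g. drug-related) cue inflates value faster during
# reinforcement and deflates it more slowly during extinction.
v_drug, v_food = 0.0, 0.0
for r in [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]:
    v_drug = salience_weighted_update(v_drug, r, salience=2.0)
    v_food = salience_weighted_update(v_food, r, salience=1.0)
```

The salience multiplier and the error asymmetry together reproduce, in caricature, the persistence of drug-cue value that the full model simulates.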
Affiliation(s)
- Shivam Kalhan
- University of Melbourne, School of Psychological Sciences, Melbourne, Victoria, Australia.
| | - Marta I Garrido
- University of Melbourne, School of Psychological Sciences, Melbourne, Victoria, Australia; Graeme Clark Institute for Biomedical Engineering, Melbourne, Victoria, Australia
| | - Robert Hester
- University of Melbourne, School of Psychological Sciences, Melbourne, Victoria, Australia
| | - A David Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
| |
|
44
|
Walker EY, Pohl S, Denison RN, Barack DL, Lee J, Block N, Ma WJ, Meyniel F. Studying the neural representations of uncertainty. Nat Neurosci 2023; 26:1857-1867. [PMID: 37814025 DOI: 10.1038/s41593-023-01444-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2022] [Accepted: 08/30/2023] [Indexed: 10/11/2023]
Abstract
The study of the brain's representations of uncertainty is a central topic in neuroscience. Unlike most quantities of which the neural representation is studied, uncertainty is a property of an observer's beliefs about the world, which poses specific methodological challenges. We analyze how the literature on the neural representations of uncertainty addresses those challenges and distinguish between 'code-driven' and 'correlational' approaches. Code-driven approaches make assumptions about the neural code for representing world states and the associated uncertainty. By contrast, correlational approaches search for relationships between uncertainty and neural activity without constraints on the neural representation of the world state that this uncertainty accompanies. To compare these two approaches, we apply several criteria for neural representations: sensitivity, specificity, invariance and functionality. Our analysis reveals that the two approaches lead to different but complementary findings, shaping new research questions and guiding future experiments.
Affiliation(s)
- Edgar Y Walker
- Department of Physiology and Biophysics, Computational Neuroscience Center, University of Washington, Seattle, WA, USA
| | - Stephan Pohl
- Department of Philosophy, New York University, New York, NY, USA
| | - Rachel N Denison
- Department of Psychological & Brain Sciences, Boston University, Boston, MA, USA
| | - David L Barack
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- Department of Philosophy, University of Pennsylvania, Philadelphia, PA, USA
| | - Jennifer Lee
- Center for Neural Science, New York University, New York, NY, USA
| | - Ned Block
- Department of Philosophy, New York University, New York, NY, USA
| | - Wei Ji Ma
- Center for Neural Science, New York University, New York, NY, USA
- Department of Psychology, New York University, New York, NY, USA
| | - Florent Meyniel
- Cognitive Neuroimaging Unit, INSERM, CEA, CNRS, Université Paris-Saclay, NeuroSpin center, Gif-sur-Yvette, France.
| |
|
45
|
Chen L, Liang X, Feng Y, Zhang L, Yang J, Liu Z. Online Intention Recognition With Incomplete Information Based on a Weighted Contrastive Predictive Coding Model in Wargame. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:7515-7528. [PMID: 35108210 DOI: 10.1109/tnnls.2022.3144171] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
The incomplete and imperfect nature of battlefield information challenges the efficiency, stability, and reliability of traditional intention recognition methods. To address this problem, we propose a deep learning architecture for online intention recognition with incomplete information in wargames (W-CPCLSTM), consisting of a contrastive predictive coding (CPC) model, a variable-length long short-term memory (LSTM) model, and an attention weight allocator. First, based on the typical characteristics of intelligence data, a CPC model is designed to capture global structure from limited battlefield information. Then, a variable-length LSTM model is employed to classify the learned representations into predefined intention categories. Next, a weighted allocation of training attention between the CPC and LSTM models is introduced to stabilize the model. Finally, performance evaluation and application analysis of the proposed model for the online intention recognition task were carried out on four different degrees of detection information and on an ideal, fully observed situation in a wargame. We also explored the effect of different lengths of intelligence data on recognition performance and gave application examples of the proposed model on a wargame platform. The simulation results demonstrate that our method not only improves recognition stability but also improves recognition accuracy by 7%-11%, 3%-7%, 3%-13%, and 3%-7%, and recognition speed by 6-32×, 4-18×, 13-*×, and 1-6×, compared with the traditional LSTM, classical FCN, OctConv, and OctFCN models, respectively, making it a promising reference tool for command decision-making.
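For readers unfamiliar with CPC, the contrastive objective at its core can be written compactly (a generic InfoNCE sketch, not the paper's W-CPCLSTM architecture; all names here are illustrative):

```python
import numpy as np

def info_nce(context, future, negatives, temperature=1.0):
    """Generic InfoNCE loss: score the true future representation against
    negative samples by dot product with the context vector; the loss is
    the cross-entropy of picking the positive (placed in slot 0)."""
    candidates = np.vstack([future[None, :], negatives])    # (1 + K, d)
    logits = candidates @ context / temperature
    logits -= logits.max()                                  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

context = np.array([1.0, 0.0])            # encoder summary of the past
future = np.array([1.0, 0.0])             # true future representation
negatives = np.array([[0.0, 1.0],         # distractor futures
                      [-1.0, 0.0]])
loss = info_nce(context, future, negatives)
```

A loss below log(1 + K) means the model beats chance at identifying the true future among K negatives.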
|
46
|
Azcorra M, Gaertner Z, Davidson C, He Q, Kim H, Nagappan S, Hayes CK, Ramakrishnan C, Fenno L, Kim YS, Deisseroth K, Longnecker R, Awatramani R, Dombeck DA. Unique functional responses differentially map onto genetic subtypes of dopamine neurons. Nat Neurosci 2023; 26:1762-1774. [PMID: 37537242 PMCID: PMC10545540 DOI: 10.1038/s41593-023-01401-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Accepted: 07/05/2023] [Indexed: 08/05/2023]
Abstract
Dopamine neurons are characterized by their response to unexpected rewards, but they also fire during movement and aversive stimuli. Dopamine neuron diversity has been observed based on molecular expression profiles; however, whether different functions map onto such genetic subtypes remains unclear. In this study, we established that three genetic dopamine neuron subtypes within the substantia nigra pars compacta, characterized by the expression of Slc17a6 (Vglut2), Calb1 and Anxa1, each have a unique set of responses to rewards, aversive stimuli and accelerations and decelerations, and these signaling patterns are highly correlated between somas and axons within subtypes. Remarkably, reward responses were almost entirely absent in the Anxa1+ subtype, which instead displayed acceleration-correlated signaling. Our findings establish a connection between functional and genetic dopamine neuron subtypes and demonstrate that molecular expression patterns can serve as a common framework to dissect dopaminergic functions.
Affiliation(s)
- Maite Azcorra
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
- Department of Neurology, Northwestern University, Chicago, IL, USA
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
| | - Zachary Gaertner
- Department of Neurology, Northwestern University, Chicago, IL, USA
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
| | - Connor Davidson
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
| | - Qianzi He
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
| | - Hailey Kim
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
| | - Shivathmihai Nagappan
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
| | - Cooper K Hayes
- Department of Microbiology and Immunology, Northwestern University, Chicago, IL, USA
| | - Charu Ramakrishnan
- Department of Bioengineering, Stanford University School of Medicine, Stanford, CA, USA
| | - Lief Fenno
- Department of Bioengineering, Stanford University School of Medicine, Stanford, CA, USA
- Departments of Neuroscience & Psychiatry, The University of Texas at Austin, Austin, TX, USA
| | - Yoon Seok Kim
- Department of Bioengineering, Stanford University School of Medicine, Stanford, CA, USA
| | - Karl Deisseroth
- Department of Bioengineering, Stanford University School of Medicine, Stanford, CA, USA
| | - Richard Longnecker
- Department of Microbiology and Immunology, Northwestern University, Chicago, IL, USA
| | - Rajeshwar Awatramani
- Department of Neurology, Northwestern University, Chicago, IL, USA.
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA.
| | - Daniel A Dombeck
- Department of Neurobiology, Northwestern University, Evanston, IL, USA.
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA.
| |
|
47
|
Cone I, Clopath C, Shouval HZ. Learning to Express Reward Prediction Error-like Dopaminergic Activity Requires Plastic Representations of Time. RESEARCH SQUARE 2023:rs.3.rs-3289985. [PMID: 37790466 PMCID: PMC10543312 DOI: 10.21203/rs.3.rs-3289985/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/05/2023]
Abstract
The dominant theoretical framework to account for reinforcement learning in the brain is temporal difference (TD) reinforcement learning. The TD framework predicts that some neuronal elements should represent the reward prediction error (RPE), meaning they signal the difference between the expected future rewards and the actual rewards. The prominence of the TD theory arises from the observation that firing properties of dopaminergic neurons in the ventral tegmental area appear similar to those of RPE model-neurons in TD learning. Previous implementations of TD learning assume a fixed temporal basis for each stimulus that might eventually predict a reward. Here we show that such a fixed temporal basis is implausible and that certain predictions of TD learning are inconsistent with experiments. We propose instead an alternative theoretical framework, termed FLEX (Flexibly Learned Errors in Expected Reward). In FLEX, feature-specific representations of time are learned, allowing neural representations of stimuli to adjust their timing and relation to rewards in an online manner. In FLEX, dopamine acts as an instructive signal that helps build temporal models of the environment. FLEX is a general theoretical framework that has many possible biophysical implementations. To show that FLEX is feasible, we present a specific, biophysically plausible model that implements its principles. We show that this implementation can account for various reinforcement learning paradigms, and that its results and predictions are consistent with a preponderance of both existing and reanalyzed experimental data.
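The "fixed temporal basis" assumption that FLEX argues against is easy to state in code: in the classical tapped-delay-line (complete serial compound) TD model, cue onset deterministically launches one basis element per subsequent time step. The sketch below is our own illustration with invented parameters, not the authors' code:

```python
import numpy as np

def td_fixed_basis(cue_time, reward_time, T, n_trials=300,
                   alpha=0.1, gamma=0.98):
    """Tapped-delay-line ('complete serial compound') TD learning: cue
    onset launches a fixed chain of basis elements, one per time step,
    and each element carries its own learned value weight."""
    w = np.zeros(T)                              # one weight per delay element
    for _ in range(n_trials):
        x_prev, v_prev = np.zeros(T), 0.0
        for t in range(cue_time, T):
            x = np.zeros(T)
            x[t - cue_time] = 1.0                # fixed, stimulus-locked basis
            v = w @ x                            # state value at time t
            r = 1.0 if t == reward_time else 0.0
            delta = r + gamma * v - v_prev       # TD error
            w += alpha * delta * x_prev          # credit the preceding state
            x_prev, v_prev = x, v
        w += alpha * (0.0 - v_prev) * x_prev     # episode ends, terminal V = 0

    return w

w = td_fixed_basis(cue_time=2, reward_time=7, T=10)
```

Because the basis is fixed and stimulus-locked, every candidate cue needs its own delay line spanning all possible cue-reward intervals, which is the scalability problem the abstract refers to.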
Affiliation(s)
- Ian Cone
- Department of Bioengineering, Imperial College London, London, United Kingdom
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX
- Applied Physics Program, Rice University, Houston, TX
| | - Claudia Clopath
- Department of Bioengineering, Imperial College London, London, United Kingdom
| | - Harel Z Shouval
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX
- Department of Electrical and Computer Engineering, Rice University, Houston, TX
| |
|
48
|
Llobera J, Charbonnier C. Physics-based character animation and human motor control. Phys Life Rev 2023; 46:190-219. [PMID: 37480729 DOI: 10.1016/j.plrev.2023.06.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Accepted: 06/25/2023] [Indexed: 07/24/2023]
Abstract
Motor neuroscience and physics-based character animation (PBCA) approach human and humanoid control from different perspectives. The primary goal of PBCA is to control the movement of a ragdoll (humanoid or animal) applying forces and torques within a physical simulation. The primary goal of motor neuroscience is to understand the contribution of different parts of the nervous system to generate coordinated movements. We review the functional principles and the functional anatomy of human motor control and the main strategies used in PBCA. We then explore common research points by discussing the functional anatomy and ongoing debates in motor neuroscience from the perspective of PBCA. We also suggest there are several benefits to be found in studying sensorimotor integration and human-character coordination through closer collaboration between these two fields.
Affiliation(s)
- Joan Llobera
- Artanim Foundation, 40, chemin du Grand-Puits, 1217 Meyrin - Geneva, Switzerland.
| | - Caecilia Charbonnier
- Artanim Foundation, 40, chemin du Grand-Puits, 1217 Meyrin - Geneva, Switzerland
| |
|
49
|
Wärnberg E, Kumar A. Feasibility of dopamine as a vector-valued feedback signal in the basal ganglia. Proc Natl Acad Sci U S A 2023; 120:e2221994120. [PMID: 37527344 PMCID: PMC10410740 DOI: 10.1073/pnas.2221994120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2022] [Accepted: 06/08/2023] [Indexed: 08/03/2023] Open
Abstract
It is well established that midbrain dopaminergic neurons support reinforcement learning (RL) in the basal ganglia by transmitting a reward prediction error (RPE) to the striatum. In particular, different computational models and experiments have shown that a striatum-wide RPE signal can support RL over a small discrete set of actions (e.g., go/no-go, choose left/right). However, there is accumulating evidence that the basal ganglia function not as a selector between predefined actions but rather as a dynamical system with graded, continuous outputs. To reconcile this view with RL, there is a need to explain how dopamine could support learning of continuous outputs rather than discrete action values. Inspired by recent observations that, besides RPE, the firing rates of midbrain dopaminergic neurons correlate with motor and cognitive variables, we propose a model in which the dopamine signal in the striatum carries a vector-valued error feedback signal (a loss gradient) instead of a homogeneous scalar error (a loss). We implement a local, "three-factor" corticostriatal plasticity rule involving the presynaptic firing rate, a postsynaptic factor, and the unique dopamine concentration perceived by each striatal neuron. With this learning rule, we show that such a vector-valued feedback signal results in an increased capacity to learn a multidimensional series of real-valued outputs. Crucially, we demonstrate that this plasticity rule does not require precise nigrostriatal synapses but remains compatible with experimental observations of random placement of varicosities and diffuse volume transmission of dopamine.
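The proposed rule can be sketched as follows (a minimal illustration under our own assumptions: a linear striatal readout, a constant postsynaptic factor, and a dopamine vector equal to the loss gradient; parameter values are invented):

```python
import numpy as np

def three_factor_update(W, pre_rate, post_factor, dopamine_vec, lr=0.01):
    """Local three-factor update: each synapse changes as the product of
    its presynaptic rate, a postsynaptic factor, and the dopamine
    concentration perceived by its own postsynaptic neuron."""
    return W + lr * np.outer(dopamine_vec * post_factor, pre_rate)

rng = np.random.default_rng(0)
W = np.zeros((2, 5))                        # 2 striatal outputs, 5 cortical inputs
target = rng.normal(size=(2, 5))            # hypothetical desired linear mapping
for _ in range(2000):
    x = rng.normal(size=5)                  # presynaptic cortical rates
    error = target @ x - W @ x              # vector-valued feedback (loss gradient)
    W = three_factor_update(W, x, post_factor=1.0, dopamine_vec=error)
```

With a scalar dopamine signal, the same error would be broadcast to every neuron, so per-output gradient information would be lost; the per-neuron dopamine component is what lets this sketch learn a multidimensional mapping.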
Affiliation(s)
- Emil Wärnberg
- Department of Neuroscience, Karolinska Institutet, 171 77 Stockholm, Sweden
- Division of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 114 28 Stockholm, Sweden
| | - Arvind Kumar
- Division of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 114 28 Stockholm, Sweden
| |
|
50
|
Grogans SE, Bliss-Moreau E, Buss KA, Clark LA, Fox AS, Keltner D, Cowen AS, Kim JJ, Kragel PA, MacLeod C, Mobbs D, Naragon-Gainey K, Fullana MA, Shackman AJ. The nature and neurobiology of fear and anxiety: State of the science and opportunities for accelerating discovery. Neurosci Biobehav Rev 2023; 151:105237. [PMID: 37209932 PMCID: PMC10330657 DOI: 10.1016/j.neubiorev.2023.105237] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Revised: 05/11/2023] [Accepted: 05/13/2023] [Indexed: 05/22/2023]
Abstract
Fear and anxiety play a central role in mammalian life, and there is considerable interest in clarifying their nature, identifying their biological underpinnings, and determining their consequences for health and disease. Here we provide a roundtable discussion on the nature and biological bases of fear- and anxiety-related states, traits, and disorders. The discussants include scientists familiar with a wide variety of populations and a broad spectrum of techniques. The goal of the roundtable was to take stock of the state of the science and provide a roadmap to the next generation of fear and anxiety research. Much of the discussion centered on the key challenges facing the field, the most fruitful avenues for future research, and emerging opportunities for accelerating discovery, with implications for scientists, funders, and other stakeholders. Understanding fear and anxiety is a matter of practical importance. Anxiety disorders are a leading burden on public health, and existing treatments are far from curative, underscoring the urgency of developing a deeper understanding of the factors governing threat-related emotions.
Collapse
Affiliation(s)
- Shannon E Grogans
- Department of Psychology, University of Maryland, College Park, MD 20742, USA
| | - Eliza Bliss-Moreau
- Department of Psychology, University of California, Davis, CA 95616, USA; California National Primate Research Center, University of California, Davis, CA 95616, USA
| | - Kristin A Buss
- Department of Psychology, The Pennsylvania State University, University Park, PA 16802, USA
| | - Lee Anna Clark
- Department of Psychology, University of Notre Dame, Notre Dame, IN 46556, USA
| | - Andrew S Fox
- Department of Psychology, University of California, Davis, CA 95616, USA; California National Primate Research Center, University of California, Davis, CA 95616, USA
| | - Dacher Keltner
- Department of Psychology, University of California, Berkeley, Berkeley, CA 94720, USA
| | | | - Jeansok J Kim
- Department of Psychology, University of Washington, Seattle, WA 98195, USA
| | - Philip A Kragel
- Department of Psychology, Emory University, Atlanta, GA 30322, USA
| | - Colin MacLeod
- Centre for the Advancement of Research on Emotion, School of Psychological Science, The University of Western Australia, Perth, WA 6009, Australia
| | - Dean Mobbs
- Department of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA 91125, USA; Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA 91125, USA
| | - Kristin Naragon-Gainey
- School of Psychological Science, University of Western Australia, Perth, WA 6009, Australia
| | - Miquel A Fullana
- Adult Psychiatry and Psychology Department, Institute of Neurosciences, Hospital Clinic, Barcelona, Spain; Imaging of Mood, and Anxiety-Related Disorders Group, Institut d'Investigacions Biomèdiques August Pi i Sunyer, CIBERSAM, University of Barcelona, Barcelona, Spain
| | - Alexander J Shackman
- Department of Psychology, University of Maryland, College Park, MD 20742, USA; Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD 20742, USA; Maryland Neuroimaging Center, University of Maryland, College Park, MD 20742, USA.
| |
Collapse
|