1
Sidorenko N, Chung HK, Grueschow M, Quednow BB, Hayward-Könnecke H, Jetter A, Tobler PN. Acetylcholine and noradrenaline enhance foraging optimality in humans. Proc Natl Acad Sci U S A 2023; 120:e2305596120. [PMID: 37639601 PMCID: PMC10483619 DOI: 10.1073/pnas.2305596120] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/12/2023] [Accepted: 07/26/2023] [Indexed: 08/31/2023]
Abstract
Foraging theory prescribes when optimal foragers should leave the current option for more rewarding alternatives. Actual foragers often exploit options longer than prescribed by the theory, but it is unclear how this foraging suboptimality arises. We investigated whether the upregulation of cholinergic, noradrenergic, and dopaminergic systems increases foraging optimality. In a double-blind, between-subject design, participants (N = 160) received placebo, the nicotinic acetylcholine receptor agonist nicotine, a noradrenaline reuptake inhibitor reboxetine, or a preferential dopamine reuptake inhibitor methylphenidate, and played the role of a farmer who collected milk from patches with different yield. Across all groups, participants on average overharvested. While methylphenidate had no effects on this bias, nicotine, and to some extent also reboxetine, significantly reduced deviation from foraging optimality, which resulted in better performance compared to placebo. Concurring with amplified goal-directedness and excluding heuristic explanations, nicotine independently also improved trial initiation and time perception. Our findings elucidate the neurochemical basis of behavioral flexibility and decision optimality and open unique perspectives on psychiatric disorders affecting these functions.
Affiliation(s)
- Nick Sidorenko
  - Department of Economics, Laboratory for Social and Neural Systems Research, University of Zurich, Zurich 8006, Switzerland
  - Department of Economics, Zurich Center for Neuroeconomics, University of Zurich, Zurich 8006, Switzerland
- Hui-Kuan Chung
  - Department of Economics, Laboratory for Social and Neural Systems Research, University of Zurich, Zurich 8006, Switzerland
  - Department of Economics, Zurich Center for Neuroeconomics, University of Zurich, Zurich 8006, Switzerland
- Marcus Grueschow
  - Department of Economics, Laboratory for Social and Neural Systems Research, University of Zurich, Zurich 8006, Switzerland
  - Department of Economics, Zurich Center for Neuroeconomics, University of Zurich, Zurich 8006, Switzerland
- Boris B. Quednow
  - Experimental and Clinical Pharmacopsychology, Department of Psychiatry, Psychotherapy and Psychosomatics, Psychiatric University Hospital Zurich, University of Zurich, Zurich 8008, Switzerland
  - Neuroscience Center Zurich, ETH Zurich and University of Zurich, Zurich 8057, Switzerland
- Helen Hayward-Könnecke
  - Department of Neurology, Section of Neuroimmunology and Multiple Sclerosis Research, University Hospital Zurich, Zurich 8091, Switzerland
- Alexander Jetter
  - National Poisons Information Centre, Tox Info Suisse, Associated Institute of the University of Zurich, Zurich 8032, Switzerland
- Philippe N. Tobler
  - Department of Economics, Laboratory for Social and Neural Systems Research, University of Zurich, Zurich 8006, Switzerland
  - Department of Economics, Zurich Center for Neuroeconomics, University of Zurich, Zurich 8006, Switzerland
  - Neuroscience Center Zurich, ETH Zurich and University of Zurich, Zurich 8057, Switzerland
2
Sinclair AH, Wang YC, Adcock RA. Instructed motivational states bias reinforcement learning and memory formation. Proc Natl Acad Sci U S A 2023; 120:e2304881120. [PMID: 37490530 PMCID: PMC10401012 DOI: 10.1073/pnas.2304881120] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/06/2023] [Accepted: 06/19/2023] [Indexed: 07/27/2023]
Abstract
Motivation influences goals, decisions, and memory formation. Imperative motivation links urgent goals to actions, narrowing the focus of attention and memory. Conversely, interrogative motivation integrates goals over time and space, supporting rich memory encoding for flexible future use. We manipulated motivational states via cover stories for a reinforcement learning task: The imperative group imagined executing a museum heist, whereas the interrogative group imagined planning a future heist. Participants repeatedly chose among four doors, representing different museum rooms, to sample trial-unique paintings with variable rewards (later converted to bonus payments). The next day, participants performed a surprise memory test. Crucially, only the cover stories differed between the imperative and interrogative groups; the reinforcement learning task was identical, and all participants had the same expectations about how and when bonus payments would be awarded. In an initial sample and a preregistered replication, we demonstrated that imperative motivation increased exploitation during reinforcement learning. Conversely, interrogative motivation increased directed (but not random) exploration, despite the cost to participants' earnings. At test, the interrogative group was more accurate at recognizing paintings and recalling associated values. In the interrogative group, higher value paintings were more likely to be remembered; imperative motivation disrupted this effect of reward modulating memory. Overall, we demonstrate that a prelearning motivational manipulation can bias learning and memory, bearing implications for education, behavior change, clinical interventions, and communication.
Affiliation(s)
- Alyssa H. Sinclair
  - Department of Psychology & Neuroscience, Duke University, Durham, NC 27710
- Yuxi C. Wang
  - Department of Psychology & Neuroscience, Duke University, Durham, NC 27710
- R. Alison Adcock
  - Department of Psychology & Neuroscience, Duke University, Durham, NC 27710
  - Department of Psychiatry & Behavioral Sciences, Duke University, Durham, NC 27710
3
Chernyshev BV, Pultsina KI, Tretyakova VD, Miasnikova AS, Prokofyev AO, Kozunova GL, Stroganova TA. Losses resulting from deliberate exploration trigger beta oscillations in frontal cortex. Front Neurosci 2023; 17:1152926. [PMID: 37250414 PMCID: PMC10211346 DOI: 10.3389/fnins.2023.1152926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Accepted: 04/24/2023] [Indexed: 05/31/2023]
Abstract
We examined the neural signature of directed exploration by contrasting MEG beta (16-30 Hz) power changes between disadvantageous and advantageous choices in the two-choice probabilistic reward task. We analyzed the choices made after the participants had learned the probabilistic contingency between choices and their outcomes, i.e., acquired the inner model of choice values. Therefore, rare disadvantageous choices might serve explorative, environment-probing purposes. The study brought two main findings. Firstly, decision making leading to disadvantageous choices took more time and evidenced greater large-scale suppression of beta oscillations than its advantageous alternative. Additional neural resources recruited during disadvantageous decisions strongly suggest their deliberately explorative nature. Secondly, the outcomes of disadvantageous and advantageous choices had qualitatively different impacts on feedback-related beta oscillations. After the disadvantageous choices, only losses, but not gains, were followed by late beta synchronization in frontal cortex. Our results are consistent with the role of frontal beta oscillations in the stabilization of neural representations for the selected behavioral rule when an explorative strategy conflicts with value-based behavior. Punishment for explorative choice being congruent with its low value in the reward history is more likely to strengthen, through punishment-related beta oscillations, the representation of exploitative choices consistent with the inner utility model.
Affiliation(s)
- Boris V. Chernyshev
  - Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russia
  - Department of Higher Nervous Activity, Lomonosov Moscow State University, Moscow, Russia
  - Department of Psychology, Higher School of Economics, Moscow, Russia
- Kristina I. Pultsina
  - Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russia
- Vera D. Tretyakova
  - Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russia
- Aleksandra S. Miasnikova
  - Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russia
- Andrey O. Prokofyev
  - Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russia
- Galina L. Kozunova
  - Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russia
- Tatiana A. Stroganova
  - Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russia
4
Pupil dilation and response slowing distinguish deliberate explorative choices in the probabilistic learning task. Cognitive, Affective, & Behavioral Neuroscience 2022; 22:1108-1129. [PMID: 35359274 PMCID: PMC9458574 DOI: 10.3758/s13415-022-00996-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 03/07/2022] [Indexed: 12/22/2022]
Abstract
This study examined whether pupil size and response time would distinguish directed exploration from random exploration and exploitation. Eighty-nine participants performed the two-choice probabilistic learning task while their pupil size and response time were continuously recorded. Using LMM analysis, we estimated differences in the pupil size and response time between the advantageous and disadvantageous choices as a function of learning success, i.e., whether or not a participant had learned the probabilistic contingency between choices and their outcomes. We proposed that before a true value of each choice became known to a decision-maker, both advantageous and disadvantageous choices represented a random exploration of the two options with an equally uncertain outcome, whereas the same choices after learning manifested exploitation and directed exploration strategies, respectively. We found that disadvantageous choices were associated with increases both in response time and pupil size, but only after the participants had learned the choice-reward contingencies. For the pupil size, this effect was strongly amplified for those disadvantageous choices that immediately followed gains as compared to losses in the preceding choice. Pupil size modulations were evident during the behavioral choice rather than during the pretrial baseline. These findings suggest that occasional disadvantageous choices, which violate the acquired internal utility model, represent directed exploration. This exploratory strategy shifts choice priorities in favor of information seeking and its autonomic and behavioral concomitants are mainly driven by the conflict between the behavioral plan of the intended exploratory choice and its strong alternative, which has already proven to be more rewarding.
5
Dijkstra FM, Zuiker RGJA, Siebenga PS, Leigh-Pemberton RA, Sun L, Manthis JD, de Kam ML, Lin R, von Moltke LL, Rezendes D, van Gerven JMA. Pharmacological profile of ALKS 7119, an investigational compound evaluated for the treatment of neuropsychiatric disorders, in healthy volunteers. Br J Clin Pharmacol 2022; 88:2909-2925. [PMID: 35014069 PMCID: PMC9302689 DOI: 10.1111/bcp.15229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2021] [Revised: 12/31/2021] [Accepted: 01/03/2022] [Indexed: 11/29/2022]
Abstract
AIM: ALKS 7119 is a novel compound with in vitro affinity highest for the SERT, and for μ receptor, α1A-adrenoceptor, α1B-adrenoceptor, NMDA receptor and sigma non-opioid intracellular receptor 1. This first-in-human study evaluated safety and PK/PD effects of single ascending doses (SAD) of ALKS 7119 in healthy males and compared effects with neurotransmitter modulators with partially overlapping targets.
METHODS: In 10 cohorts (n = 10 subjects each), PK, safety and PD (NeuroCart tests, measuring neurophysiologic effects [pupillometry, pharmaco-EEG (pEEG)], visuomotor coordination, alertness, [sustained] attention [saccadic peak velocity, adaptive tracking], subjective drug effects [VAS Bowdle and VAS Bond and Lader] and postural stability [body sway]) were evaluated. Neuroendocrine effects (cortisol, prolactin, growth hormone) were measured. Data were analysed over the 12-hour post-dose period using mixed-effects model for repeated measure (MMRM) with baseline as covariate.
RESULTS: ALKS 7119 demonstrated linear PK and was generally well tolerated. QTcF interval increases of 30-60 ms compared to baseline were observed with ALKS 7119 doses of ≥50 mg without related adverse events. Significant increases in left and right pupil/iris ratio were observed at dose levels ≥50 mg (estimate of difference [95% CI], p-value) (0.04 [0.01; 0.07], P < 0.01) and (0.06 [0.03; 0.09], P = 0.01), respectively. From dose levels ≥50 mg significant increases (% change) of serum cortisol (51.7 [8.4; 112.3], P = 0.02) and prolactin (77.9 [34.2; 135.8], P < 0.01) were observed.
CONCLUSION: In line with ALKS 7119's in vitro pharmacological profile, the clinical profile observed in this study is most comparable to SERT inhibition.
Affiliation(s)
- Francis M Dijkstra
  - Centre for Human Drug Research (CHDR), Leiden, The Netherlands
  - Leiden University Medical Center, Leiden, The Netherlands
- Rob G J A Zuiker
  - Centre for Human Drug Research (CHDR), Leiden, The Netherlands
  - Leiden University Medical Center, Leiden, The Netherlands
- Pieter S Siebenga
  - Centre for Human Drug Research (CHDR), Leiden, The Netherlands
- Lei Sun
  - Alkermes, Inc., Waltham, MA, USA
- Marieke L de Kam
  - Centre for Human Drug Research (CHDR), Leiden, The Netherlands
- Joop M A van Gerven
  - Centre for Human Drug Research (CHDR), Leiden, The Netherlands
  - Leiden University Medical Center, Leiden, The Netherlands
6
Wilson RC, Bonawitz E, Costa VD, Ebitz RB. Balancing exploration and exploitation with information and randomization. Curr Opin Behav Sci 2021; 38:49-56. [PMID: 33184605 PMCID: PMC7654823 DOI: 10.1016/j.cobeha.2020.10.001] [Citation(s) in RCA: 78] [Impact Index Per Article: 26.0] [Indexed: 12/28/2022]
Abstract
Explore-exploit decisions require us to trade off the benefits of exploring unknown options to learn more about them, with exploiting known options, for immediate reward. Such decisions are ubiquitous in nature, but from a computational perspective, they are notoriously hard. There is therefore much interest in how humans and animals make these decisions and recently there has been an explosion of research in this area. Here we provide a biased and incomplete snapshot of this field focusing on the major finding that many organisms use two distinct strategies to solve the explore-exploit dilemma: a bias for information ('directed exploration') and the randomization of choice ('random exploration'). We review evidence for the existence of these strategies, their computational properties, their neural implementations, as well as how directed and random exploration vary over the lifespan. We conclude by highlighting open questions in this field that are ripe to both explore and exploit.
Affiliation(s)
- Robert C. Wilson
  - Department of Psychology, University of Arizona, Tucson, AZ, USA
  - Cognitive Science Program, University of Arizona, Tucson, AZ, USA
  - Evelyn F. McKnight Brain Institute, University of Arizona, Tucson, AZ, USA
- Vincent D. Costa
  - Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
- R. Becket Ebitz
  - Department of Neuroscience, University of Montréal, Montréal, Québec, Canada
7
Dubois M, Habicht J, Michely J, Moran R, Dolan RJ, Hauser TU. Human complex exploration strategies are enriched by noradrenaline-modulated heuristics. eLife 2021; 10:e59907. [PMID: 33393461 PMCID: PMC7815309 DOI: 10.7554/elife.59907] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 06/12/2020] [Accepted: 01/03/2021] [Indexed: 01/15/2023]
Abstract
An exploration-exploitation trade-off, the arbitration between sampling a lesser-known against a known rich option, is thought to be solved using computationally demanding exploration algorithms. Given known limitations in human cognitive resources, we hypothesised the presence of additional cheaper strategies. We examined for such heuristics in choice behaviour where we show this involves a value-free random exploration, that ignores all prior knowledge, and a novelty exploration that targets novel options alone. In a double-blind, placebo-controlled drug study, assessing contributions of dopamine (400 mg amisulpride) and noradrenaline (40 mg propranolol), we show that value-free random exploration is attenuated under the influence of propranolol, but not under amisulpride. Our findings demonstrate that humans deploy distinct computationally cheap exploration strategies and that value-free random exploration is under noradrenergic control.
Affiliation(s)
- Magda Dubois
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
  - Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Johanna Habicht
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
  - Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Jochen Michely
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
  - Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
  - Department of Psychiatry and Psychotherapy, Charité – Universitätsmedizin Berlin, Berlin, Germany
- Rani Moran
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
  - Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Ray J Dolan
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
  - Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
- Tobias U Hauser
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
  - Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom
8
Pupil-Linked Arousal Responds to Unconscious Surprisal. J Neurosci 2019; 39:5369-5376. [PMID: 31061089 DOI: 10.1523/jneurosci.3010-18.2019] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 11/29/2018] [Revised: 04/26/2019] [Accepted: 04/27/2019] [Indexed: 12/14/2022]
Abstract
Pupil size under constant illumination reflects brain arousal state, and dilates in response to novel information, or surprisal. Whether this response can be observed regardless of conscious perception is still unknown. In the present study, male and female adult humans performed an implicit learning task across a series of three experiments. We measured pupil and brain-evoked potentials to stimuli that violated transition statistics but were not relevant to the task. We found that pupil size dilated following these surprising events, in the absence of awareness of transition statistics, and only when attention was allocated to the stimulus. These pupil responses correlated with central potentials, evoking an anterior cingulate origin. Arousal response to surprisal outside the scope of conscious perception points to the fundamental relationship between arousal and information processing and indicates that pupil size can be used to track the progression of implicit learning.
SIGNIFICANCE STATEMENT: Pupil size dilates following increase in mental effort, surprise, or more generally global arousal. However, whether this response arises as a conscious response or reflects a more fundamental mechanism outside the scrutiny of awareness is still unknown. Here, we demonstrate that unexpected changes in the environment, even when processed unconsciously and without being relevant to the task, lead to an increase in arousal levels as reflected by the pupillary response. Further, we show that the concurrent electrophysiological response shares similarities with mismatch negativity, suggesting the involvement of anterior cingulate cortex. All in all, our results establish novel insights about the mechanisms driving global arousal levels, and they provide new possibilities for reliably measuring unconscious processes.
9
Graf H, Wiegers M, Metzger CD, Walter M, Abler B. Differential Noradrenergic Modulation of Monetary Reward and Visual Erotic Stimulus Processing. Front Psychiatry 2018; 9:346. [PMID: 30108528 PMCID: PMC6079271 DOI: 10.3389/fpsyt.2018.00346] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/26/2017] [Accepted: 07/10/2018] [Indexed: 12/17/2022]
Abstract
We recently investigated the effects of the noradrenergic antidepressant reboxetine and the antipsychotic amisulpride compared to placebo on neural correlates of primary reinforcers by visual erotic stimulation in healthy subjects. Whereas amisulpride left subjective sexual functions and corresponding neural activations unimpaired, attenuated neural activations were observed under reboxetine within the nucleus accumbens (Nacc) along with diminished behavioral sexual functioning. However, a global dampening of the reward system under reboxetine seemed counterintuitive considering the complementary role of the noradrenergic to the dopamine system in reward-related learning mediated by prediction error processing. We therefore investigated the sample of 17 healthy males with a mean age of 23.8 years again by functional magnetic resonance imaging (fMRI), to explore the noradrenergic effects on neural reward prediction error signaling. Participants took reboxetine (4 mg/d), amisulpride (200 mg/d), and placebo each for 7 days within a randomized, double-blind, within-subject cross-over design. During fMRI, we used an established monetary incentive task to assess neural reward expectation and prediction error signals within the bilateral Nacc using an independent anatomical mask for a region of interest (ROI) analysis. Activations within the same ROI were also assessed for the erotic picture paradigm. We confirmed our previous results from the whole brain analysis for the selected ROI by significant (p < 0.05 FWE-corrected) attenuated activations within the Nacc during visual sexual stimulation under reboxetine compared to placebo. However, activations in the Nacc concerning prediction error processing and monetary reward expectation were unimpaired under reboxetine compared to placebo, along with unimpaired reaction times in the reward task. For both tasks, neural activations and behavioral processing were not altered by amisulpride compared to placebo.
The observed attenuated neural activations within the Nacc during visual erotic stimulation along with unimpaired neural prediction error and monetary reward expectation processing provide evidence for a differential modulation of the neural reward system by the noradrenergic agent reboxetine depending on the presence of primary reinforcers such as erotic stimuli in contrast to secondary such as monetary rewards.
Affiliation(s)
- Heiko Graf
  - Department of Psychiatry and Psychotherapy III, Ulm University, Ulm, Germany
- Maike Wiegers
  - Department of Psychiatry and Psychotherapy III, Ulm University, Ulm, Germany
- Coraline D Metzger
  - Department of Psychiatry, Otto von Guericke University, Magdeburg, Germany
  - Institute of Cognitive Neurology and Dementia Research, Otto von Guericke University, Magdeburg, Germany
  - German Center for Neurodegenerative Diseases, Bonn, Germany
- Martin Walter
  - Department of Psychiatry, Eberhard Karls University, Tuebingen, Germany
- Birgit Abler
  - Department of Psychiatry and Psychotherapy III, Ulm University, Ulm, Germany
10
Addicott MA, Pearson JM, Sweitzer MM, Barack DL, Platt ML. A Primer on Foraging and the Explore/Exploit Trade-Off for Psychiatry Research. Neuropsychopharmacology 2017; 42:1931-1939. [PMID: 28553839 PMCID: PMC5561336 DOI: 10.1038/npp.2017.108] [Citation(s) in RCA: 99] [Impact Index Per Article: 14.1] [Received: 11/23/2016] [Revised: 05/19/2017] [Accepted: 05/24/2017] [Indexed: 12/18/2022]
Abstract
Foraging is a fundamental behavior, and many types of animals appear to have solved foraging problems using a shared set of mechanisms. Perhaps the most common foraging problem is the choice between exploiting a familiar option for a known reward and exploring unfamiliar options for unknown rewards, the so-called explore/exploit trade-off. This trade-off has been studied extensively in behavioral ecology and computational neuroscience, but is relatively new to the field of psychiatry. Explore/exploit paradigms can offer psychiatry research a new approach to studying motivation, outcome valuation, and effort-related processes, which are disrupted in many mental and emotional disorders. In addition, the explore/exploit trade-off encompasses elements of risk-taking and impulsivity (common behaviors in psychiatric disorders) and provides a novel framework for understanding these behaviors within an ecological context. Here we explain relevant concepts and some common paradigms used to measure explore/exploit decisions in the laboratory, review clinically relevant research on the neurobiology and neuroanatomy of explore/exploit decision making, and discuss how computational psychiatry can benefit from foraging theory.
Affiliation(s)
- M A Addicott
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
- J M Pearson
  - Center for Cognitive Neuroscience, Duke University, Durham, NC, USA
- M M Sweitzer
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
- D L Barack
  - Department of Philosophy and Neuroscience, Columbia University, New York, NY, USA
- M L Platt
  - Departments of Psychology, Neuroscience, and Marketing, University of Pennsylvania, Philadelphia, PA, USA
11
Warren CM, Wilson RC, van der Wee NJ, Giltay EJ, van Noorden MS, Cohen JD, Nieuwenhuis S. The effect of atomoxetine on random and directed exploration in humans. PLoS One 2017; 12:e0176034. [PMID: 28445519 PMCID: PMC5405969 DOI: 10.1371/journal.pone.0176034] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 11/03/2016] [Accepted: 04/04/2017] [Indexed: 11/19/2022]
Abstract
The adaptive regulation of the trade-off between pursuing a known reward (exploitation) and sampling lesser-known options in search of something better (exploration) is critical for optimal performance. Theory and recent empirical work suggest that humans use at least two strategies for solving this dilemma: a directed strategy in which choices are explicitly biased toward information seeking, and a random strategy in which decision noise leads to exploration by chance. Here we examined the hypothesis that random exploration is governed by the neuromodulatory locus coeruleus-norepinephrine system. We administered atomoxetine, a norepinephrine transporter blocker that increases extracellular levels of norepinephrine throughout the cortex, to 22 healthy human participants in a double-blind crossover design. We examined the effect of treatment on performance in a gambling task designed to produce distinct measures of directed exploration and random exploration. In line with our hypothesis we found an effect of atomoxetine on random, but not directed exploration. However, contrary to expectation, atomoxetine reduced rather than increased random exploration. We offer three potential explanations of our findings, involving the non-linear relationship between tonic NE and cognitive performance, the interaction of atomoxetine with other neuromodulators, and the possibility that atomoxetine affected phasic norepinephrine activity more so than tonic norepinephrine activity.
Affiliation(s)
- Christopher M. Warren
  - Institute of Psychology, Leiden University, Leiden, Netherlands
  - Leiden Institute for Brain and Cognition, Leiden University, Leiden, Netherlands
- Robert C. Wilson
  - Department of Psychology and Cognitive Science Program, University of Arizona, Tucson, Arizona, United States of America
- Nic J. van der Wee
  - Leiden Institute for Brain and Cognition, Leiden University, Leiden, Netherlands
  - Department of Psychiatry, Leiden University Medical Center, Leiden, Netherlands
- Eric J. Giltay
  - Department of Psychiatry, Leiden University Medical Center, Leiden, Netherlands
- Jonathan D. Cohen
  - Department of Psychology, Princeton University, Princeton, New Jersey, United States of America
  - Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, United States of America
- Sander Nieuwenhuis
  - Institute of Psychology, Leiden University, Leiden, Netherlands
  - Leiden Institute for Brain and Cognition, Leiden University, Leiden, Netherlands
12
Marshall L, Mathys C, Ruge D, de Berker AO, Dayan P, Stephan KE, Bestmann S. Pharmacological Fingerprints of Contextual Uncertainty. PLoS Biol 2016; 14:e1002575. [PMID: 27846219 PMCID: PMC5113004 DOI: 10.1371/journal.pbio.1002575] [Citation(s) in RCA: 66] [Impact Index Per Article: 8.3] [Received: 04/11/2016] [Accepted: 10/05/2016] [Indexed: 11/19/2022]
Abstract
Successful interaction with the environment requires flexible updating of our beliefs about the world. By estimating the likelihood of future events, it is possible to prepare appropriate actions in advance and execute fast, accurate motor responses. According to theoretical proposals, agents track the variability arising from changing environments by computing various forms of uncertainty. Several neuromodulators have been linked to uncertainty signalling, but comprehensive empirical characterisation of their relative contributions to perceptual belief updating, and to the selection of motor responses, is lacking. Here we assess the roles of noradrenaline, acetylcholine, and dopamine within a single, unified computational framework of uncertainty. Using pharmacological interventions in a sample of 128 healthy human volunteers and a hierarchical Bayesian learning model, we characterise the influences of noradrenergic, cholinergic, and dopaminergic receptor antagonism on individual computations of uncertainty during a probabilistic serial reaction time task. We propose that noradrenaline influences learning of uncertain events arising from unexpected changes in the environment. In contrast, acetylcholine balances attribution of uncertainty to chance fluctuations within an environmental context, defined by a stable set of probabilistic associations, or to gross environmental violations following a contextual switch. Dopamine supports the use of uncertainty representations to engender fast, adaptive responses. Pharmacological interventions and hierarchical Bayesian modelling pinpoint the roles of noradrenaline, acetylcholine, and dopamine in computing different forms of uncertainty and in sensitizing actions to our beliefs about uncertainty.
Interacting with dynamic and ever-changing environments requires frequent updating of our beliefs about the world. By learning the relationships that link events in the current environmental context, it is possible to prepare and execute fast, accurate responses to those events that are predictable. However, the world’s complex dynamics give rise to uncertainty about the relationships that exist between events and uncertainty about how these relationships might change over time. Several neuromodulators have been proposed to signal these different forms of uncertainty, but their relative contributions to updating beliefs and modulating responses have remained elusive. Here we combine a probabilistic reaction time task, pharmacological interventions, and a hierarchical Bayesian learning model to identify the roles of noradrenaline, acetylcholine, and dopamine in individual computations of uncertainty. We propose that noradrenaline modulates learning about the instability of the relationships that link environmental events. Acetylcholine balances the attribution of uncertainty to unexpected events occurring within an environmental context or to gross violations of our expectations following a context change. In contrast, dopamine sensitises our actions to our beliefs about uncertainty.
Affiliation(s)
- Louise Marshall
- Sobell Department of Motor Neuroscience and Movement Disorders, Institute of Neurology, University College London, London, United Kingdom
- Christoph Mathys
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Diane Ruge
- Sobell Department of Motor Neuroscience and Movement Disorders, Institute of Neurology, University College London, London, United Kingdom
- Department of Psychology and Neurosciences, Leibniz Research Centre for Working Environment and Human Factors, Technical University Dortmund, Dortmund, Germany
- Archy O. de Berker
- Sobell Department of Motor Neuroscience and Movement Disorders, Institute of Neurology, University College London, London, United Kingdom
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Peter Dayan
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Klaas E. Stephan
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, United Kingdom
- Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich & ETH Zurich, Zurich, Switzerland
- Max Planck Institute for Metabolism Research, Cologne, Germany
- Sven Bestmann
- Sobell Department of Motor Neuroscience and Movement Disorders, Institute of Neurology, University College London, London, United Kingdom
|
13
|
Jepma M, Murphy PR, Nassar MR, Rangel-Gomez M, Meeter M, Nieuwenhuis S. Catecholaminergic Regulation of Learning Rate in a Dynamic Environment. PLoS Comput Biol 2016; 12:e1005171. [PMID: 27792728 PMCID: PMC5085041 DOI: 10.1371/journal.pcbi.1005171] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 03/29/2016] [Accepted: 09/27/2016] [Indexed: 12/15/2022] Open
Abstract
Adaptive behavior in a changing world requires flexibly adapting one's rate of learning to the rate of environmental change. Recent studies have examined the computational mechanisms by which various environmental factors determine the impact of new outcomes on existing beliefs (i.e., the 'learning rate'). However, the brain mechanisms, and in particular the neuromodulators, involved in this process are still largely unknown. The brain-wide neurophysiological effects of the catecholamines norepinephrine and dopamine on stimulus-evoked cortical responses suggest that the catecholamine systems are well positioned to regulate learning about environmental change, but more direct evidence for a role of this system is scant. Here, we report evidence from a study employing pharmacology, scalp electrophysiology and computational modeling (N = 32) that suggests an important role for catecholamines in learning rate regulation. We found that the P3 component of the EEG (an electrophysiological index of outcome-evoked phasic catecholamine release in the cortex) predicted learning rate, and formally mediated the effect of prediction-error magnitude on learning rate. P3 amplitude also mediated the effects of two computational variables (capturing the unexpectedness of an outcome and the uncertainty of a preexisting belief) on learning rate. Furthermore, a pharmacological manipulation of catecholamine activity affected learning rate following unanticipated task changes, in a way that depended on participants' baseline learning rate. Our findings provide converging evidence for a causal role of the human catecholamine systems in learning-rate regulation as a function of environmental change.
Affiliation(s)
- Marieke Jepma
- Cognitive Psychology Unit, Institute of Psychology, Leiden University; and Leiden Institute for Brain and Cognition, Leiden University, Leiden, the Netherlands
- Peter R. Murphy
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Matthew R. Nassar
- Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, RI, United States of America
- Mauricio Rangel-Gomez
- Department of Psychology, University of California, Berkeley, United States of America
- Martijn Meeter
- Department of Education Sciences, Vrije Universiteit Amsterdam, The Netherlands
- Sander Nieuwenhuis
- Cognitive Psychology Unit, Institute of Psychology, Leiden University; and Leiden Institute for Brain and Cognition, Leiden University, Leiden, the Netherlands
|
14
|
|
15
|
Abstract
Mind wandering is a ubiquitous phenomenon in everyday life. In the cognitive neurosciences, mind wandering has been associated with several distinct neural processes, most notably increased activity in the default mode network (DMN), suppressed activity within the anti-correlated (task-positive) network (ACN), and changes in neuromodulation. By using an integrative multimodal approach combining machine-learning techniques with modeling of latent cognitive processes, we show that mind wandering in humans is characterized by inefficiencies in executive control (task-monitoring) processes. This failure is predicted by a single-trial signature of (co)activations in the DMN, ACN, and neuromodulation, and accompanied by a decreased rate of evidence accumulation and response thresholds in the cognitive model.
|
16
|
Frijda NH, Ridderinkhof KR, Rietveld E. Impulsive action: emotional impulses and their control. Front Psychol 2014; 5:518. [PMID: 24917835 PMCID: PMC4040919 DOI: 10.3389/fpsyg.2014.00518] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 11/05/2013] [Accepted: 05/11/2014] [Indexed: 01/26/2023] Open
Abstract
This paper presents a novel theoretical view on impulsive action, integrating thus far separate perspectives on non-reflective action, motivation, emotion regulation, and impulse control. We frame impulsive action in terms of directedness of the individual organism toward, away, or against other givens - toward future states and away from one's present state. First, appraisal of a perceived or thought-of event or object on occasion, rapidly and without premonition or conscious deliberation, triggers a motive to modify one's relation to that event or object. Situational specifics of the event as perceived and appraised motivate and guide selection of readiness for a particular kind of purposive action. Second, perception of complex situations can give rise to multiple appraisals, multiple motives, and multiple simultaneous changes in action readiness. Multiple states of action readiness may interact in generating action, by reinforcing or attenuating each other, thereby yielding impulse control. We show how emotion control can itself result from a motive state or state of action readiness. Our view links impulsive action mechanistically to states of action readiness, which is the central feature of what distinguishes one kind of emotion from another. It thus provides a novel theoretical perspective to the somewhat fragmented literature on impulsive action.
Affiliation(s)
- Nico H. Frijda
- Department of Psychology, University of Amsterdam, Netherlands
- K. Richard Ridderinkhof
- Department of Psychology, University of Amsterdam, Netherlands
- Amsterdam Brain and Cognition, University of Amsterdam, Netherlands
- Erik Rietveld
- Amsterdam Brain and Cognition, University of Amsterdam, Netherlands
- Department of Psychiatry, Academic Medical Center, Amsterdam, Netherlands
|
17
|
Addicott MA, Pearson JM, Wilson J, Platt ML, McClernon FJ. Smoking and the bandit: a preliminary study of smoker and nonsmoker differences in exploratory behavior measured with a multiarmed bandit task. Exp Clin Psychopharmacol 2013; 21:66-73. [PMID: 23245198 PMCID: PMC4028629 DOI: 10.1037/a0030843] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
Abstract
Advantageous decision-making is an adaptive trade-off between exploring alternatives and exploiting the most rewarding option. This trade-off may be related to maladaptive decision-making associated with nicotine dependence; however, explore/exploit behavior has not been previously investigated in the context of addiction. The explore/exploit trade-off is captured by the multiarmed bandit task, in which different arms of a slot machine are chosen to discover the relative payoffs. The goal of this study was to preliminarily investigate whether smokers differ from nonsmokers in their degree of exploratory behavior. Smokers (n = 18) and nonsmokers (n = 17) completed a 6-armed bandit task as well as self-report measures of behavior and personality traits. Smokers were found to exhibit less exploratory behavior (i.e., made fewer switches between slot machine arms) than nonsmokers within the first 300 trials of the bandit task. The overall proportion of exploratory choices negatively correlated with self-reported measures of delay aversion and nonplanning impulsivity. These preliminary results suggest that smokers make fewer initial exploratory choices on the bandit task. The bandit task is a promising measure that could provide valuable insights into how nicotine use and dependence is associated with explore/exploit decision-making.
Affiliation(s)
- Merideth A. Addicott
- Department of Psychiatry and Behavioral Research, Duke University and Duke University School of Medicine, Durham NC USA 27710, Duke-UNC Brain Imaging and Analysis Center, Duke University and Duke University School of Medicine, Durham NC USA 27710
- John M. Pearson
- Department of Neurobiology, Duke University and Duke University School of Medicine, Durham NC USA 27710, Center for Cognitive Neuroscience, Duke University and Duke University School of Medicine, Durham NC USA 27710
- Jessica Wilson
- Center for Cognitive Neuroscience, Duke University and Duke University School of Medicine, Durham NC USA 27710, Department of Psychology and Neuroscience, Duke University and Duke University School of Medicine, Durham NC USA 27710
- Michael L. Platt
- Department of Neurobiology, Duke University and Duke University School of Medicine, Durham NC USA 27710, Center for Cognitive Neuroscience, Duke University and Duke University School of Medicine, Durham NC USA 27710, Department of Evolutionary Anthropology, Duke University and Duke University School of Medicine, Durham NC USA 27710
- F. Joseph McClernon
- Department of Psychiatry and Behavioral Research, Duke University and Duke University School of Medicine, Durham NC USA 27710, Duke-UNC Brain Imaging and Analysis Center, Duke University and Duke University School of Medicine, Durham NC USA 27710, Durham Veterans Affairs Medical Center, Duke University and Duke University School of Medicine, Durham NC USA 27710
|
18
|
Blinking predicts enhanced cognitive control. Cogn Affect Behav Neurosci 2012; 13:346-54. [DOI: 10.3758/s13415-012-0138-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 11/08/2022]
|
19
|
Fröber K, Dreisbach G. How positive affect modulates proactive control: reduced usage of informative cues under positive affect with low arousal. Front Psychol 2012; 3:265. [PMID: 22866047 PMCID: PMC3406411 DOI: 10.3389/fpsyg.2012.00265] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 05/25/2012] [Accepted: 07/09/2012] [Indexed: 11/28/2022] Open
Abstract
An example of proactive control is the usage of informative cues to prepare for an upcoming task. Here the authors will present data from a series of three experiments, showing that positive affect along with low arousal reduces proactive control in the form of a reduced reliance on informative cues. In three affect groups, neutral or positive affective picture stimuli with low and high arousal preceded every trial. In Experiments 1 and 2, using a simple response cueing paradigm with informative cues (66% cue validity), a reduced cue validity effect (CVE) was found under positive affect with low arousal. To test the robustness of the effect and to see whether reactive control is also modulated by positive affect, Experiment 3 used a cued task switching paradigm with predictive cues (75% cue validity). As expected, a reduced CVE was again found specifically in the positive affect condition with low arousal, but only for task repetitions. Furthermore, there was no difference in switch costs between affect groups (with and without task cues). Taken together, the reduced CVE indicates that positive affect with low arousal reduces proactive control, while comparable switch costs suggest that there is no influence of positive affect on reactive control.
Affiliation(s)
- Kerstin Fröber
- Department of Psychology, University of Regensburg, Regensburg, Germany
|
20
|
Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron 2012; 73:595-607. [PMID: 22325209 DOI: 10.1016/j.neuron.2011.12.025] [Citation(s) in RCA: 161] [Impact Index Per Article: 13.4] [Accepted: 12/07/2011] [Indexed: 11/21/2022]
Abstract
How do individuals decide to act based on a rewarding status quo versus an unexplored choice that might yield a better outcome? Recent evidence suggests that individuals may strategically explore as a function of the relative uncertainty about the expected value of options. However, the neural mechanisms supporting uncertainty-driven exploration remain underspecified. The present fMRI study scanned a reinforcement learning task in which participants stop a rotating clock hand in order to win points. Reward schedules were such that expected value could increase, decrease, or remain constant with respect to time. We fit several mathematical models to subject behavior to generate trial-by-trial estimates of exploration as a function of relative uncertainty. These estimates were used to analyze our fMRI data. Results indicate that rostrolateral prefrontal cortex tracks trial-by-trial changes in relative uncertainty, and this pattern distinguished individuals who rely on relative uncertainty for their exploratory decisions versus those who do not.
|