1
Lamba A, Frank MJ, FeldmanHall O. Keeping an eye out for change: Anxiety disrupts adaptive resolution of policy uncertainty. Biol Psychiatry Cogn Neurosci Neuroimaging 2024:S2451-9022(24)00203-9. [PMID: 39069235 DOI: 10.1016/j.bpsc.2024.07.015] [Received: 03/09/2024] [Revised: 07/17/2024] [Accepted: 07/17/2024] [Indexed: 07/30/2024]
Abstract
BACKGROUND: Human learning unfolds under uncertainty. Uncertainty is heterogeneous, with different forms exerting distinct influences on learning. While one can be uncertain about what to do to maximize rewarding outcomes, known as policy uncertainty, one can also be uncertain about general world knowledge, known as epistemic uncertainty. In complex and naturalistic environments such as the social world, adaptive learning may hinge on striking a balance between attending to and resolving each type of uncertainty. Prior work illustrates that people with anxiety (those with increased threat and uncertainty sensitivity) learn less from aversive outcomes, particularly as outcomes become more uncertain. How does a learner adaptively trade off between attending to these distinct sources of uncertainty to successfully learn about their social environment? METHODS: We developed a novel eye-tracking method to capture highly granular estimates of policy and epistemic uncertainty based on gaze patterns and pupil diameter (a physiological estimate of arousal). RESULTS: These empirically derived uncertainty measures reveal that humans (N = 94) flexibly switch between resolving policy and epistemic uncertainty to adaptively learn about which individuals can be trusted and which should be avoided. However, those with increased anxiety (N = 49) do not flexibly switch between resolving policy and epistemic uncertainty, and instead express less uncertainty overall. CONCLUSIONS: Combining modeling and eye-tracking techniques, we show that altered learning in people with anxiety emerges from an insensitivity to policy uncertainty and rigid choice policies, leading to maladaptive behaviors with untrustworthy people.
Affiliation(s)
- Amrita Lamba
- Department of Cognitive & Psychological Sciences, Brown University, Providence, RI; Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA
- Michael J Frank
- Department of Cognitive & Psychological Sciences, Brown University, Providence, RI; Carney Institute of Brain Sciences, Brown University, Providence, RI
- Oriel FeldmanHall
- Department of Cognitive & Psychological Sciences, Brown University, Providence, RI; Carney Institute of Brain Sciences, Brown University, Providence, RI
2
Higashi H. Dynamics of visual attention in exploration and exploitation for reward-guided adjustment tasks. Conscious Cogn 2024; 123:103724. [PMID: 38996747 DOI: 10.1016/j.concog.2024.103724] [Received: 03/05/2024] [Revised: 06/24/2024] [Accepted: 06/26/2024] [Indexed: 07/14/2024]
Abstract
The learning process encompasses exploration and exploitation phases. While reinforcement learning models have revealed functional and neuroscientific distinctions between these phases, knowledge regarding how they affect visual attention while observing the external environment is limited. This study sought to elucidate the interplay between these learning phases and visual attention allocation using visual adjustment tasks combined with a two-armed bandit problem tailored to detect serial effects only when attention is dispersed across both arms. Per our findings, human participants exhibited a distinct serial effect only during the exploration phase, suggesting enhanced attention to the visual stimulus associated with the non-target arm. Remarkably, although rewards did not motivate attention dispersion in our task, during the exploration phase, individuals engaged in active observation and searched for targets to observe. This behavior highlights a unique information-seeking process in exploration that is distinct from exploitation.
Affiliation(s)
- Hiroshi Higashi
- Graduate School of Engineering, Osaka University, Suita, Osaka, Japan
3
Hou G, Li R, Tian M, Ding J, Zhang X, Yang B, Chen C, Huang R, Yin Y. Improving Efficiency: Automatic Intelligent Weighing System as a Replacement for Manual Pig Weighing. Animals (Basel) 2024; 14:1614. [PMID: 38891661 PMCID: PMC11171250 DOI: 10.3390/ani14111614] [Received: 04/30/2024] [Revised: 05/27/2024] [Accepted: 05/27/2024] [Indexed: 06/21/2024] [Open access]
Abstract
To verify the accuracy of the automatic intelligent weighing system (AIWS), we weighed 106 pen growing-finishing pigs using both the manual and AIWS methods. Accuracy was evaluated based on the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). In the growth experiment, manual weighing was conducted every two weeks and AIWS-predicted weight data were recorded daily, followed by fitting of the growth curves. The results showed that the MAE, MAPE, and RMSE values for 60 to 120 kg pigs were 3.48 kg, 3.71%, and 4.43 kg, respectively. The correlation coefficient r between the AIWS and manual methods was 0.9410, and R² was 0.8854; the correlation was highly significant (p < 0.001). In growth curve fitting, the AIWS method had lower AIC and BIC values than the manual method. The Logistic model fitted to the AIWS data was the best-fit model. The age and body weight at the inflection point of the best-fit model were 164.46 d and 93.45 kg, respectively. The maximum growth rate was 831.66 g/d. In summary, AIWS can accurately predict pigs' body weights in actual production and gives a better fit of the growth curves of growing-finishing pigs. This study suggests that it is feasible for AIWS to replace manual weighing to measure the weight of 50 to 120 kg live pigs in large-scale farming.
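The three accuracy measures named in this abstract can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function name `accuracy_metrics` and the sample weights are hypothetical and are not data from the study.

```python
import math


def accuracy_metrics(manual_kg, predicted_kg):
    """Return (MAE in kg, MAPE in percent, RMSE in kg).

    manual_kg is treated as the reference weight, predicted_kg as the
    system's estimate. Both names are assumptions for this sketch.
    """
    if len(manual_kg) != len(predicted_kg) or not manual_kg:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - m for m, p in zip(manual_kg, predicted_kg)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides each absolute error by the reference weight.
    mape = 100.0 * sum(abs(e) / m for e, m in zip(errors, manual_kg)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mae, mape, rmse


# Hypothetical weights in kg, chosen only to make the arithmetic easy to follow.
manual = [80.0, 100.0, 120.0]
predicted = [84.0, 97.0, 120.0]
mae, mape, rmse = accuracy_metrics(manual, predicted)
print(f"MAE={mae:.2f} kg, MAPE={mape:.2f}%, RMSE={rmse:.2f} kg")
```

Because RMSE squares each error before averaging, it is never smaller than MAE, which matches the ordering of the reported values (4.43 kg versus 3.48 kg).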
Affiliation(s)
- Gaifeng Hou
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, National Engineering Laboratory for Poultry Breeding Pollution Control and Resource Technology, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China; (G.H.); (R.L.); (M.T.); (J.D.)
- Rui Li
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, National Engineering Laboratory for Poultry Breeding Pollution Control and Resource Technology, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China; (G.H.); (R.L.); (M.T.); (J.D.)
- Mingzhou Tian
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, National Engineering Laboratory for Poultry Breeding Pollution Control and Resource Technology, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China; (G.H.); (R.L.); (M.T.); (J.D.)
- Jing Ding
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, National Engineering Laboratory for Poultry Breeding Pollution Control and Resource Technology, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China; (G.H.); (R.L.); (M.T.); (J.D.)
- Xingfu Zhang
- College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China;
- Beijing Focused Loong Technology Co., Ltd., Beijing 100086, China
- Bin Yang
- Key Laboratory of Visual Perception and Artificial Intelligence of Hunan Province, College of Electrical and Information Engineering, Hunan University, Changsha 410082, China;
- Chunyu Chen
- College of Information and Communication, Harbin Engineering University, Harbin 150001, China;
- Ruilin Huang
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, National Engineering Laboratory for Poultry Breeding Pollution Control and Resource Technology, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China; (G.H.); (R.L.); (M.T.); (J.D.)
- Yulong Yin
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, National Engineering Laboratory for Poultry Breeding Pollution Control and Resource Technology, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China; (G.H.); (R.L.); (M.T.); (J.D.)
4
Katayama R, Shiraki R, Ishii S, Yoshida W. Belief inference for hierarchical hidden states in spatial navigation. Commun Biol 2024; 7:614. [PMID: 38773301 PMCID: PMC11109253 DOI: 10.1038/s42003-024-06316-0] [Received: 10/22/2023] [Accepted: 05/10/2024] [Indexed: 05/23/2024] [Open access]
Abstract
Uncertainty abounds in the real world, and in environments with multiple layers of unobservable hidden states, decision-making requires resolving uncertainties based on mutual inference. Focusing on a spatial navigation problem, we develop a Tiger maze task that involves simultaneously inferring the local hidden state and the global hidden state from probabilistically uncertain observations. We adopt a Bayesian computational approach by proposing a hierarchical inference model. Applying this to human task behaviour, alongside functional magnetic resonance brain imaging, allows us to separate the neural correlates associated with reinforcement and reassessment of belief in hidden states. The imaging results also suggest that different layers of uncertainty differentially involve the basal ganglia and dorsomedial prefrontal cortex, and that the regions responsible are organised along the rostral axis of these areas according to the type of inference and the level of abstraction of the hidden state, i.e. higher-order state inference involves more anterior parts.
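As a rough illustration of belief inference over a hidden state from uncertain observations, the sketch below applies Bayes' rule to a single two-valued hidden state. It is a single-level simplification and not the authors' hierarchical model; the state labels, the cue accuracy of 0.85, and the function name `update_belief` are all assumptions made for this example.

```python
def update_belief(prior, likelihoods):
    """Apply Bayes' rule: posterior is proportional to prior times P(obs | state)."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under every state")
    return [u / total for u in unnormalized]


# Two hypothetical hidden states: index 0 = "left", index 1 = "right".
CUE_ACCURACY = 0.85  # assumed probability that a cue points to the true state
belief = [0.5, 0.5]  # uniform prior

for cue in ["left", "left", "right"]:
    likelihoods = [
        CUE_ACCURACY if cue == "left" else 1 - CUE_ACCURACY,   # P(cue | left)
        CUE_ACCURACY if cue == "right" else 1 - CUE_ACCURACY,  # P(cue | right)
    ]
    belief = update_belief(belief, likelihoods)

print(belief)
```

With a symmetric cue, one "left" and one "right" observation cancel, so after the three cues the belief equals what a single "left" cue would give. A hierarchical model would run a similar update at each level, with the higher-level belief shaping the lower-level likelihoods.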
Affiliation(s)
- Risa Katayama
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan
- Department of AI-Brain Integration, Advanced Telecommunications Research Institute International, Kyoto, 619-0288, Japan
- Ryo Shiraki
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan
- Shin Ishii
- Graduate School of Informatics, Kyoto University, Kyoto, 606-8501, Japan
- Neural Information Analysis Laboratories, Advanced Telecommunications Research Institute International, Kyoto, 619-0288, Japan
- International Research Center for Neurointelligence, the University of Tokyo, Tokyo, 113-0033, Japan
- Wako Yoshida
- Department of Neural Computation for Decision-Making, Advanced Telecommunications Research Institute International, Kyoto, 619-0288, Japan
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
5
Shintaki R, Tanaka D, Suzuki S, Yoshimoto T, Sadato N, Chikazoe J, Jimura K. Continuous decision to wait for a future reward is guided by fronto-hippocampal anticipatory dynamics. Cereb Cortex 2024; 34:bhae217. [PMID: 38798003 DOI: 10.1093/cercor/bhae217] [Received: 12/17/2023] [Revised: 05/02/2024] [Accepted: 05/08/2024] [Indexed: 05/29/2024] [Open access]
Abstract
Deciding whether to wait for a future reward is crucial for surviving in an uncertain world. While seeking rewards, agents anticipate a reward in the present environment and constantly face a trade-off between staying in their environment or leaving it. It remains unclear, however, how humans make continuous decisions in such situations. Here, we show that anticipatory activity in the anterior prefrontal cortex, ventrolateral prefrontal cortex, and hippocampus underpins continuous stay-leave decision-making. Participants awaited real liquid rewards available after tens of seconds, and their continuous decision was tracked by dynamic brain activity associated with the anticipation of a reward. Participants stopped waiting more frequently and sooner after they experienced longer delays and received smaller rewards. When the dynamic anticipatory brain activity was enhanced in the anterior prefrontal cortex, participants remained in their current environment, but when this activity diminished, they left the environment. Moreover, while experiencing a delayed reward in a novel environment, the ventrolateral prefrontal cortex and hippocampus showed anticipatory activity. Finally, the activity in the anterior prefrontal cortex and ventrolateral prefrontal cortex was enhanced in participants adopting a leave strategy, whereas those remaining stationary showed enhanced hippocampal activity. Our results suggest that fronto-hippocampal anticipatory dynamics underlie continuous decision-making while anticipating a future reward.
Affiliation(s)
- Reiko Shintaki
- Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Daiki Tanaka
- Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Shinsuke Suzuki
- Centre for Brain, Mind and Markets, The University of Melbourne, Grattan Street, Parkville, Victoria, 3010, Australia
- Faculty of Social Data Science and HIAS Brain Research Center, Hitotsubashi University, 2-1 Naka, Kunitachi, 186-8601, Japan
- Takaaki Yoshimoto
- Research Organization of Science and Technology, Ritsumeikan University, 1-1-1, Nojihigashi, Kusatsu, 525-8577, Japan
- Section of Brain Function Information, Supportive Center for Brain Research, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan
- Norihiro Sadato
- Research Organization of Science and Technology, Ritsumeikan University, 1-1-1, Nojihigashi, Kusatsu, 525-8577, Japan
- Section of Brain Function Information, Supportive Center for Brain Research, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan
- Junichi Chikazoe
- Section of Brain Function Information, Supportive Center for Brain Research, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan
- Araya, Inc., 1-11 Kanda Sakuma-cho, Chiyoda, Tokyo, 101-0025, Japan
- Koji Jimura
- Department of Informatics, Gunma University, 4-2 Aramaki-machi, Maebashi, 371-8510, Japan
6
Hagan KE, Aimufua I, Haynos AF, Walsh BT. The explore/exploit trade-off: An ecologically valid and translational framework that can advance mechanistic understanding of eating disorders. Int J Eat Disord 2024; 57:1102-1108. [PMID: 38385592 DOI: 10.1002/eat.24173] [Received: 11/30/2023] [Revised: 01/26/2024] [Accepted: 02/08/2024] [Indexed: 02/23/2024]
Abstract
The explore/exploit trade-off is a decision-making process that is conserved across species and balances exploring unfamiliar choices of unknown value with choosing familiar options of known value to maximize reward. This framework is rooted in behavioral ecology and has traditionally been used to study maladaptive versus adaptive non-human animal foraging behavior. Researchers have begun to recognize the potential utility of understanding human decision-making and psychopathology through the explore/exploit trade-off. In this article, we propose that the explore/exploit trade-off holds promise for advancing our mechanistic understanding of decision-making processes that confer vulnerability for and maintain eating pathology due to its neurodevelopmental bases, conservation across species, and ability to be mathematically modeled. We present a model for how suboptimal explore/exploit decision-making can promote disordered eating and present recommendations for future research applying this framework to eating pathology. Taken together, the explore/exploit trade-off provides a translational framework for expanding etiologic and maintenance models of eating pathology, given developmental changes in explore/exploit decision-making that coincide in time with the emergence of eating pathology and evidence of biased explore/exploit decision-making in psychopathology. Additionally, understanding explore/exploit decision-making in eating disorders may improve knowledge of their underlying pathophysiology, informing targeted clinical interventions such as neuromodulation and pharmacotherapy. PUBLIC SIGNIFICANCE STATEMENT: The explore/exploit trade-off is a cross-species decision-making process whereby organisms choose between exploiting a known option with a known reward and sampling unfamiliar options. We hypothesize that imbalanced explore/exploit decision-making can promote disordered eating and present preliminary data. We propose that the explore/exploit trade-off has significant potential to advance understanding of the neurocognitive and neurodevelopmental mechanisms of eating pathology, which could ultimately guide revisions of etiologic models and inform novel interventions.
Affiliation(s)
- Kelsey E Hagan
- Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia, USA
- Institute for Women's Health, Virginia Commonwealth University, Richmond, Virginia, USA
- Ivieosa Aimufua
- Department of Psychiatry, New York State Psychiatric Institute, Columbia University Irving Medical Center, New York, New York, USA
- Ann F Haynos
- Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia, USA
- Department of Psychology, Virginia Commonwealth University, Richmond, Virginia, USA
- Department of Psychiatry and Behavioral Sciences, University of Minnesota, Minneapolis, Minnesota, USA
- B Timothy Walsh
- Department of Psychiatry, New York State Psychiatric Institute, Columbia University Irving Medical Center, New York, New York, USA
7
Gilmour W, Mackenzie G, Feile M, Tayler-Grint L, Suveges S, Macfarlane JA, Macleod AD, Marshall V, Grunwald IQ, Steele JD, Gilbertson T. Impaired value-based decision-making in Parkinson's disease apathy. Brain 2024; 147:1362-1376. [PMID: 38305691 PMCID: PMC10994558 DOI: 10.1093/brain/awae025] [Received: 07/28/2023] [Revised: 12/07/2023] [Accepted: 01/13/2024] [Indexed: 02/03/2024] [Open access]
Abstract
Apathy is a common and disabling complication of Parkinson's disease characterized by reduced goal-directed behaviour. Several studies have reported dysfunction within prefrontal cortical regions and projections from brainstem nuclei whose neuromodulators include dopamine, serotonin and noradrenaline. Work in animal and human neuroscience has confirmed contributions of these neuromodulators to aspects of motivated decision-making. Specifically, these neuromodulators have overlapping contributions to encoding the value of decisions, and influence whether to explore alternative courses of action or persist in an existing strategy to achieve a rewarding goal. Building upon this work, we hypothesized that apathy in Parkinson's disease should be associated with an impairment in value-based learning. Using a four-armed restless bandit reinforcement learning task, we studied decision-making in 75 volunteers: 53 patients with Parkinson's disease, with and without clinical apathy, and 22 age-matched healthy control subjects. Patients with apathy exhibited impaired ability to choose the highest value bandit. Task performance predicted an individual patient's apathy severity measured using the Lille Apathy Rating Scale (R = -0.46, P < 0.001). Computational modelling of the patients' choices confirmed the apathy group made decisions that were indifferent to the learnt value of the options, consistent with previous reports of reward insensitivity. Further analysis demonstrated a shift away from exploiting the highest value option and a reduction in perseveration, which also correlated with apathy scores (R = -0.5, P < 0.001). We went on to acquire functional MRI in 59 volunteers: 19 patients with apathy, 20 patients without apathy and 20 age-matched controls performing the Restless Bandit Task. Analysis of the functional MRI signal at the point of reward feedback confirmed diminished signal within ventromedial prefrontal cortex in Parkinson's disease, which was more marked in apathy, but not predictive of individual apathy severity. Using a model-based categorization of choice type, decisions to explore lower value bandits in the apathy group activated prefrontal cortex to a similar degree to the age-matched controls. In contrast, Parkinson's patients without apathy demonstrated significantly increased activation across a distributed thalamo-cortical network. Enhanced activity in the thalamus predicted individual apathy severity across both patient groups and exhibited functional connectivity with dorsal anterior cingulate cortex and anterior insula. Given that task performance in patients without apathy was no different to the age-matched control subjects, we interpret the recruitment of this network as a possible compensatory mechanism that counteracts the symptomatic manifestation of apathy in Parkinson's disease.
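One common way to express "indifference to learnt value" in bandit models is a softmax choice rule with a low inverse temperature. The sketch below illustrates that general idea only; the abstract does not state which choice rule the authors fitted, and the function name `softmax_choice_probs`, the parameter `beta`, and the four option values are assumptions for this example.

```python
import math


def softmax_choice_probs(values, beta):
    """Map learnt option values to choice probabilities.

    beta is the inverse temperature: large beta favours the highest-value
    option, beta = 0 ignores value and yields uniform choice.
    """
    highest = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - highest)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical learnt values for a four-armed bandit.
values = [0.2, 0.4, 0.6, 0.8]
value_sensitive = softmax_choice_probs(values, beta=10.0)
value_indifferent = softmax_choice_probs(values, beta=0.0)

print([round(p, 3) for p in value_sensitive])
print([round(p, 3) for p in value_indifferent])
```

Under this rule a value-indifferent chooser picks each of the four arms equally often regardless of payoff, whereas a value-sensitive chooser concentrates its choices on the best arm.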
Affiliation(s)
- William Gilmour
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Department of Neurology, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
- Graeme Mackenzie
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Department of Neurology, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
- Mathias Feile
- Rehabilitation Psychiatry, Murray Royal Hospital, Perth PH2 7BH, UK
- Szabolcs Suveges
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Jennifer A Macfarlane
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Medical Physics, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
- SINAPSE, University of Glasgow, Imaging Centre of Excellence, Level 2, Queen Elizabeth University Hospital, Glasgow G51 4TF, Scotland, UK
- Angus D Macleod
- Institute of Applied Health Sciences, School of Medicine, University of Aberdeen, Foresterhill, Aberdeen AB24 2ZD, UK
- Department of Neurology, Aberdeen Royal Infirmary, Foresterhill, Aberdeen AB24 2ZD, UK
- Vicky Marshall
- Institute of Neurological Sciences, Queen Elizabeth University Hospital, Glasgow G51 4TF, UK
- Iris Q Grunwald
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- J Douglas Steele
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Tom Gilbertson
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Department of Neurology, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
8
Aberg KC, Paz R. The neurobehavioral correlates of exploration without learning: Trading off value for explicit, prospective, and variable information gains. Cell Rep 2024; 43:113880. [PMID: 38416639 DOI: 10.1016/j.celrep.2024.113880] [Received: 07/13/2023] [Revised: 01/10/2024] [Accepted: 02/13/2024] [Indexed: 03/01/2024] [Open access]
Abstract
Exploration is typically motivated by gaining information, with previous research showing that potential information gains drive a "directed" type of exploration. Yet, this research usually studies exploration in the context of learning paradigms and does not directly manipulate multiple levels of information gain. Here, we present a task that isolates learning from decision-making and controls the magnitude of prospective information gains. As predicted, participants explore more with larger future information gains. Both value gains and information gains, at a trial-by-trial level, engage the ventromedial prefrontal cortex (vmPFC), the ventral striatum (VStr), the amygdala, the dorsal anterior cingulate cortex (dACC), and the anterior insula (aINS). Moreover, individual sensitivities to value gains and information gains modulate the vmPFC, dACC, and aINS, but the amygdala and VStr are modulated only by individual sensitivities to information gains. Overall, we identify the neural circuitry of information-based exploration and its relationship with inter-individual exploration biases.
Affiliation(s)
- Kristoffer C Aberg
- Department of Brain Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
- Rony Paz
- Department of Brain Sciences, Weizmann Institute of Science, Rehovot 76100, Israel
9
Sazhin D, Dachs A, Smith DV. Meta-Analysis Reveals That Explore-Exploit Decisions are Dissociable by Activation in the Dorsal Lateral Prefrontal Cortex and the Anterior Cingulate Cortex. bioRxiv 2024:2023.10.21.563317. [PMID: 37961286 PMCID: PMC10634720 DOI: 10.1101/2023.10.21.563317] [Indexed: 11/15/2023]
Abstract
Explore-exploit research has challenges in generalizability due to a limited theoretical basis of exploration and exploitation. Neuroimaging can help address this issue by identifying whether explore-exploit decisions use an opponent processing system. Thus, we conducted a coordinate-based meta-analysis (N=23 studies) where we found activation in the dorsal lateral prefrontal cortex (dlPFC) and anterior cingulate cortex (ACC) during exploration versus exploitation, providing some evidence for opponent processing. However, the conjunction of explore-exploit decisions was associated with activation in the dorsal anterior cingulate cortex, dorsal medial prefrontal cortex, and anterior insula, suggesting that these brain regions do not engage in opponent processing. Further, exploratory analyses revealed heterogeneity in brain responses between task types during both exploration and exploitation. Coupled with results suggesting that activation in exploration and exploitation decisions is generally more similar than it is different, these findings suggest that significant challenges remain in characterizing explore-exploit decision making. Nonetheless, dlPFC and ACC activation differentiate explore and exploit decisions, and identifying these responses can help in targeted interventions aimed at manipulating these decisions.
10
Wyatt LE, Hewan PA, Hogeveen J, Spreng RN, Turner GR. Exploration versus exploitation decisions in the human brain: A systematic review of functional neuroimaging and neuropsychological studies. Neuropsychologia 2024; 192:108740. [PMID: 38036246 DOI: 10.1016/j.neuropsychologia.2023.108740] [Received: 01/28/2023] [Revised: 10/15/2023] [Accepted: 11/21/2023] [Indexed: 12/02/2023]
Abstract
Thoughts and actions are often driven by a decision to either explore new avenues with unknown outcomes, or to exploit known options with predictable outcomes. Yet, the neural mechanisms underlying this exploration-exploitation trade-off in humans remain poorly understood. This is attributable to variability in the operationalization of exploration and exploitation as psychological constructs, as well as the heterogeneity of experimental protocols and paradigms used to study these choice behaviours. To address this gap, here we present a comprehensive review of the literature to investigate the neural basis of explore-exploit decision-making in humans. We first conducted a systematic review of functional magnetic resonance imaging (fMRI) studies of exploration- versus exploitation-based decision-making in healthy adult humans during foraging, reinforcement learning, and information search. Eleven fMRI studies met the inclusion criteria for this review. Adopting a network neuroscience framework, synthesis of the findings across these studies revealed that exploration-based choice was associated with the engagement of attentional, control, and salience networks. In contrast, exploitation-based choice was associated with engagement of default network brain regions. We interpret these results in the context of a network architecture that supports the flexible switching between externally and internally directed cognitive processes, necessary for adaptive, goal-directed behaviour. To further investigate potential neural mechanisms underlying the exploration-exploitation trade-off we next surveyed studies involving neurodevelopmental, neuropsychological, and neuropsychiatric disorders, as well as lifespan development, and neurodegenerative diseases. We observed striking differences in patterns of explore-exploit decision-making across these populations, again suggesting that these two decision-making modes are supported by independent neural circuits. Taken together, our review highlights the need for precision-mapping of the neural circuitry and behavioural correlates associated with exploration and exploitation in humans. Characterizing exploration versus exploitation decision-making biases may offer a novel, trans-diagnostic approach to assessment, surveillance, and intervention for cognitive decline and dysfunction in normal development and clinical populations.
Affiliation(s)
- Lindsay E Wyatt
- Department of Psychology, York University, Toronto, ON, Canada
| | - Patrick A Hewan
- Department of Psychology, York University, Toronto, ON, Canada
| | - Jeremy Hogeveen
- Department of Psychology, The University of New Mexico, Albuquerque, NM, USA
| | - R Nathan Spreng
- Montréal Neurological Institute, Department of Neurology and Neurosurgery, McGill University, Montréal, QC, H3A 2B4, Canada; Department of Psychology, McGill University, Montréal, QC, Canada; Department of Psychiatry, McGill University, Montréal, QC, Canada; McConnell Brain Imaging Centre, Montréal Neurological Institute, McGill University, Montréal, QC, Canada.
| | - Gary R Turner
- Department of Psychology, York University, Toronto, ON, Canada.
| |
|
11
|
Hemmatian B, Varshney LR, Pi F, Barbey AK. The utilitarian brain: Moving beyond the Free Energy Principle. Cortex 2024; 170:69-79. [PMID: 38135613 DOI: 10.1016/j.cortex.2023.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/28/2023] [Accepted: 11/28/2023] [Indexed: 12/24/2023]
Abstract
The Free Energy Principle (FEP) is a normative computational framework for iterative reduction of prediction error and uncertainty through perception-intervention cycles that has been presented as a potential unifying theory of all brain functions (Friston, 2006). Any theory hoping to unify the brain sciences must be able to explain the mechanisms of decision-making, an important cognitive faculty, without the addition of independent, irreducible notions. This challenge has been accepted by several proponents of the FEP (Friston, 2010; Gershman, 2019). We evaluate attempts to reduce decision-making to the FEP, using Lucas' (2005) meta-theory of the brain's contextual constraints as a guidepost. We find reductive variants of the FEP for decision-making unable to explain behavior in certain types of diagnostic, predictive, and multi-armed bandit tasks. We trace the shortcomings to the core theory's lack of an adequate notion of subjective preference or "utility", a concept central to decision-making and grounded in the brain's biological reality. We argue that any attempts to fully reduce utility to the FEP would require unrealistic assumptions, making the principle an unlikely candidate for unifying brain science. We suggest that researchers instead attempt to identify contexts in which either informational or independent reward constraints predominate, delimiting the FEP's area of applicability. To encourage this type of research, we propose a two-factor formal framework that can subsume any FEP model and allows experimenters to compare the contributions of informational versus reward constraints to behavior.
Affiliation(s)
- Babak Hemmatian
- Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, USA
| | - Lav R Varshney
- Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, USA; Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, USA
| | - Frederick Pi
- Department of Cognitive Science, University of California San Diego, USA
| | - Aron K Barbey
- Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, USA; Center for Brain, Biology and Behavior, University of Nebraska Lincoln, USA.
| |
|
12
|
Witkowski PP, Geng JJ. Prefrontal Cortex Codes Representations of Target Identity and Feature Uncertainty. J Neurosci 2023; 43:8769-8776. [PMID: 37875376 PMCID: PMC10727173 DOI: 10.1523/jneurosci.1117-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 09/04/2023] [Accepted: 10/07/2023] [Indexed: 10/26/2023] Open
Abstract
Many objects in the real world have features that vary over time, creating uncertainty in how they will look in the future. This uncertainty makes statistical knowledge about the likelihood of features critical to attention-demanding processes such as visual search. However, little is known about how the uncertainty of visual features is integrated into predictions about search targets in the brain. In the current study, we test the idea that regions of prefrontal cortex code statistical knowledge about search targets before the onset of search. Across 20 human participants (13 female; 7 male), we observe target identity in the multivariate pattern and uncertainty in the overall activation of dorsolateral prefrontal cortex (DLPFC) and inferior frontal junction (IFJ) in advance of the search display. This indicates that the target identity (mean) and uncertainty (variance) of the target distribution are coded independently within the same regions. Furthermore, once the search display appears the univariate IFJ signal scaled with the distance of the actual target from the expected mean, but more so when expected variability was low. These results inform neural theories of attention by showing how the prefrontal cortex represents both the identity and expected variability of features in service of top-down attentional control. SIGNIFICANCE STATEMENT Theories of attention and working memory posit that when we engage in complex cognitive tasks our performance is determined by how precisely we remember task-relevant information. However, in the real world the properties of objects change over time, creating uncertainty about many aspects of the task. There is currently a gap in our understanding of how neural systems represent this uncertainty and combine it with target identity information in anticipation of attention-demanding cognitive tasks. In this study, we show that the prefrontal cortex represents identity and uncertainty as unique codes before task onset. These results advance theories of attention by showing that the prefrontal cortex codes both target identity and uncertainty to implement top-down attentional control.
Affiliation(s)
- Phillip P Witkowski
- Center for Mind and Brain, University of California, Davis, Davis, California 95618
- Department of Psychology, University of California, Davis, Davis, California 95618
| | - Joy J Geng
- Center for Mind and Brain, University of California, Davis, Davis, California 95618
- Department of Psychology, University of California, Davis, Davis, California 95618
| |
|
13
|
Walker EY, Pohl S, Denison RN, Barack DL, Lee J, Block N, Ma WJ, Meyniel F. Studying the neural representations of uncertainty. Nat Neurosci 2023; 26:1857-1867. [PMID: 37814025 DOI: 10.1038/s41593-023-01444-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/08/2022] [Accepted: 08/30/2023] [Indexed: 10/11/2023]
Abstract
The study of the brain's representations of uncertainty is a central topic in neuroscience. Unlike most quantities of which the neural representation is studied, uncertainty is a property of an observer's beliefs about the world, which poses specific methodological challenges. We analyze how the literature on the neural representations of uncertainty addresses those challenges and distinguish between 'code-driven' and 'correlational' approaches. Code-driven approaches make assumptions about the neural code for representing world states and the associated uncertainty. By contrast, correlational approaches search for relationships between uncertainty and neural activity without constraints on the neural representation of the world state that this uncertainty accompanies. To compare these two approaches, we apply several criteria for neural representations: sensitivity, specificity, invariance and functionality. Our analysis reveals that the two approaches lead to different but complementary findings, shaping new research questions and guiding future experiments.
Affiliation(s)
- Edgar Y Walker
- Department of Physiology and Biophysics, Computational Neuroscience Center, University of Washington, Seattle, WA, USA
| | - Stephan Pohl
- Department of Philosophy, New York University, New York, NY, USA
| | - Rachel N Denison
- Department of Psychological & Brain Sciences, Boston University, Boston, MA, USA
| | - David L Barack
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- Department of Philosophy, University of Pennsylvania, Philadelphia, PA, USA
| | - Jennifer Lee
- Center for Neural Science, New York University, New York, NY, USA
| | - Ned Block
- Department of Philosophy, New York University, New York, NY, USA
| | - Wei Ji Ma
- Center for Neural Science, New York University, New York, NY, USA
- Department of Psychology, New York University, New York, NY, USA
| | - Florent Meyniel
- Cognitive Neuroimaging Unit, INSERM, CEA, CNRS, Université Paris-Saclay, NeuroSpin center, Gif-sur-Yvette, France.
| |
|
14
|
Lloyd A, Viding E, McKay R, Furl N. Understanding patch foraging strategies across development. Trends Cogn Sci 2023; 27:1085-1098. [PMID: 37500422 DOI: 10.1016/j.tics.2023.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 07/05/2023] [Accepted: 07/06/2023] [Indexed: 07/29/2023]
Abstract
Patch foraging is a near-ubiquitous behaviour across the animal kingdom and characterises many decision-making domains encountered by humans. We review how a disposition to explore in adolescence may reflect the evolutionary conditions under which hunter-gatherers foraged for resources. We propose that neurocomputational mechanisms responsible for reward processing, learning, and cognitive control facilitate the transition from exploratory strategies in adolescence to exploitative strategies in adulthood - where individuals capitalise on known resources. This developmental transition may be disrupted by psychopathology, as there is emerging evidence of biases in explore/exploit choices in mental health problems. Explore/exploit choices may be an informative marker for mental health across development and future research should consider this feature of decision-making as a target for clinical intervention.
Affiliation(s)
- Alex Lloyd
- Clinical, Educational, and Health Psychology, Psychology and Language Sciences, University College London, 26 Bedford Way, London, WC1H 0AP, UK.
| | - Essi Viding
- Clinical, Educational, and Health Psychology, Psychology and Language Sciences, University College London, 26 Bedford Way, London, WC1H 0AP, UK
| | - Ryan McKay
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham, TW20 0EX, UK
| | - Nicholas Furl
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham, TW20 0EX, UK
| |
|
15
|
Daumas L, Zory R, Junquera-Badilla I, Ferrandez M, Ettore E, Robert P, Sacco G, Manera V, Ramanoël S. How does apathy impact exploration-exploitation decision-making in older patients with neurocognitive disorders? NPJ AGING 2023; 9:25. [PMID: 37903801 PMCID: PMC10616174 DOI: 10.1038/s41514-023-00121-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 09/14/2023] [Indexed: 11/01/2023]
Abstract
Apathy is a pervasive clinical syndrome in neurocognitive disorders, characterized by a quantitative reduction in goal-directed behaviors. The brain structures involved in the physiopathology of apathy have also been connected to the brain structures involved in probabilistic reward learning in the exploration-exploitation dilemma. This dilemma involves the challenge of selecting between a familiar option with a more predictable outcome, and another option whose outcome is uncertain and may yield potentially greater rewards compared to the known option. The aim of this study was to combine experimental procedures and computational modeling to examine whether, in older adults with mild neurocognitive disorders, apathy affects performance in the exploration-exploitation dilemma. Using a four-armed bandit reinforcement-learning task, we showed that apathetic older adults explored more and performed worse than non-apathetic subjects. Moreover, the mental flexibility assessed by the Trail-making test-B was negatively associated with the percentage of exploration. These results suggest that apathy is characterized by an increased explorative behavior and inefficient decision-making, possibly due to weak mental flexibility to switch toward the exploitation of the more rewarding options. Apathetic participants also took longer to make a choice and failed more often to respond in the allotted time, which could reflect the difficulties in action initiation and selection. In conclusion, the present results suggest that apathy in participants with neurocognitive disorders is associated with specific disturbances in the exploration-exploitation trade-off and shed light on the disturbances in reward processing in patients with apathy.
Affiliation(s)
- Lyne Daumas
- Université Côte d'Azur, LAMHESS, Nice, France.
- Université Côte d'Azur, CoBTeK, Nice, France.
| | - Raphaël Zory
- Université Côte d'Azur, LAMHESS, Nice, France
- Institut Universitaire de France, Paris, France
| | | | - Marion Ferrandez
- Université Côte d'Azur, CoBTeK, Nice, France
- Université Côte d'Azur, Centre Hospitalier Universitaire de Nice, service Clinique Gériatrique de Soins Ambulatoires, Centre Mémoire de Ressources et de Recherche, Nice, France
| | - Eric Ettore
- Université Côte d'Azur, CoBTeK, Nice, France
- Université Côte d'Azur, Centre Hospitalier Universitaire de Nice, service Clinique Gériatrique de Soins Ambulatoires, Centre Mémoire de Ressources et de Recherche, Nice, France
- Association Innovation Alzheimer, Nice, France
| | - Philippe Robert
- Université Côte d'Azur, CoBTeK, Nice, France
- Association Innovation Alzheimer, Nice, France
| | - Guillaume Sacco
- Université Côte d'Azur, CoBTeK, Nice, France
- Université Côte d'Azur, Centre Hospitalier Universitaire de Nice, service Clinique Gériatrique de Soins Ambulatoires, Centre Mémoire de Ressources et de Recherche, Nice, France
- Association Innovation Alzheimer, Nice, France
- Univ Angers, Université de Nantes, LPPL, SFR CONFLUENCES, 49000, Angers, France
| | - Valeria Manera
- Université Côte d'Azur, CoBTeK, Nice, France
- Association Innovation Alzheimer, Nice, France
| | - Stephen Ramanoël
- Université Côte d'Azur, LAMHESS, Nice, France
- Sorbonne Université, INSERM, CNRS, Institut de la Vision, 17 rue Moreau, 75012, Paris, France
| |
|
16
|
Topel S, Ma I, Sleutels J, van Steenbergen H, de Bruijn ERA, van Duijvenvoorde ACK. Expecting the unexpected: a review of learning under uncertainty across development. COGNITIVE, AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2023:10.3758/s13415-023-01098-0. [PMID: 37237092 PMCID: PMC10390612 DOI: 10.3758/s13415-023-01098-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 03/28/2023] [Indexed: 05/28/2023]
Abstract
Many of our decisions take place under uncertainty. To successfully navigate the environment, individuals need to estimate the degree of uncertainty and adapt their behaviors accordingly by learning from experiences. However, uncertainty is a broad construct and distinct types of uncertainty may differentially influence our learning. We provide a semi-systematic review to illustrate cognitive and neurobiological processes involved in learning under two types of uncertainty: learning in environments with stochastic outcomes, and with volatile outcomes. We specifically reviewed studies (N = 26 studies) that included an adolescent population, because adolescence is a period in life characterized by heightened exploration and learning, as well as heightened uncertainty due to experiencing many new, often social, environments. Until now, reviews have not comprehensively compared learning under distinct types of uncertainties in this age range. Our main findings show that although the overall developmental patterns were mixed, most studies indicate that learning from stochastic outcomes, as indicated by increased accuracy in performance, improved with age. We also found that adolescents tended to have an advantage compared with adults and children when learning from volatile outcomes. We discuss potential mechanisms explaining these age-related differences and conclude by outlining future research directions.
Affiliation(s)
- Selin Topel
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333, AK, Leiden, The Netherlands.
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands.
| | - Ili Ma
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333, AK, Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| | - Jan Sleutels
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333, AK, Leiden, The Netherlands
- Leiden University, Institute for Philosophy, Leiden, The Netherlands
| | - Henk van Steenbergen
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333, AK, Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| | - Ellen R A de Bruijn
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333, AK, Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| | - Anna C K van Duijvenvoorde
- Leiden University, Institute of Psychology, Wassenaarseweg 52, 2333, AK, Leiden, The Netherlands
- Leiden Institute for Brain and Cognition, Leiden, The Netherlands
| |
|
17
|
Tomov MS, Tsividis PA, Pouncy T, Tenenbaum JB, Gershman SJ. The neural architecture of theory-based reinforcement learning. Neuron 2023; 111:1331-1344.e8. [PMID: 36898374 PMCID: PMC10200004 DOI: 10.1016/j.neuron.2023.01.023] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/14/2022] [Revised: 11/06/2022] [Accepted: 01/27/2023] [Indexed: 03/11/2023]
Abstract
Humans learn internal models of the world that support planning and generalization in complex environments. Yet it remains unclear how such internal models are represented and learned in the brain. We approach this question using theory-based reinforcement learning, a strong form of model-based reinforcement learning in which the model is a kind of intuitive theory. We analyzed fMRI data from human participants learning to play Atari-style games. We found evidence of theory representations in prefrontal cortex and of theory updating in prefrontal cortex, occipital cortex, and fusiform gyrus. Theory updates coincided with transient strengthening of theory representations. Effective connectivity during theory updating suggests that information flows from prefrontal theory-coding regions to posterior theory-updating regions. Together, our results are consistent with a neural architecture in which top-down theory representations originating in prefrontal regions shape sensory predictions in visual areas, where factored theory prediction errors are computed and trigger bottom-up updates of the theory.
Affiliation(s)
- Momchil S Tomov
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Motional AD, Inc., Boston, MA 02210, USA.
| | - Pedro A Tsividis
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Thomas Pouncy
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
| | - Joshua B Tenenbaum
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA; Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
|
18
|
Speers LJ, Bilkey DK. Maladaptive explore/exploit trade-offs in schizophrenia. Trends Neurosci 2023; 46:341-354. [PMID: 36878821 DOI: 10.1016/j.tins.2023.02.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/09/2022] [Revised: 01/30/2023] [Accepted: 02/08/2023] [Indexed: 03/07/2023]
Abstract
Schizophrenia is a complex disorder that remains poorly understood, particularly at the systems level. In this opinion article we argue that the explore/exploit trade-off concept provides a holistic and ecologically valid framework to resolve some of the apparent paradoxes that have emerged within schizophrenia research. We review recent evidence suggesting that fundamental explore/exploit behaviors may be maladaptive in schizophrenia during physical, visual, and cognitive foraging. We also describe how theories from the broader optimal foraging literature, such as the marginal value theorem (MVT), could provide valuable insight into how aberrant processing of reward, context, and cost/effort evaluations interact to produce maladaptive responses.
Affiliation(s)
- Lucinda J Speers
- Department of Psychology, University of Otago, Dunedin 9016, New Zealand
| | - David K Bilkey
- Department of Psychology, University of Otago, Dunedin 9016, New Zealand.
| |
|
19
|
Bounmy T, Eger E, Meyniel F. A characterization of the neural representation of confidence during probabilistic learning. Neuroimage 2023; 268:119849. [PMID: 36640947 DOI: 10.1016/j.neuroimage.2022.119849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 12/09/2022] [Accepted: 12/29/2022] [Indexed: 01/13/2023] Open
Abstract
Learning in a stochastic and changing environment is a difficult task. Models of learning typically postulate that observations that deviate from the learned predictions are surprising and used to update those predictions. Bayesian accounts further posit the existence of a confidence-weighting mechanism: learning should be modulated by the confidence level that accompanies those predictions. However, the neural bases of this confidence are much less well understood than those of surprise. Here, we used a dynamic probability learning task and high-field MRI to identify putative cortical regions involved in the representation of confidence about predictions during human learning. We devised a stringent test based on the conjunction of four criteria. We localized several regions in parietal and frontal cortices whose activity is sensitive to the confidence of an ideal observer, specifically so with respect to potential confounds (surprise and predictability), and in a way that is invariant to which item is predicted. We also tested for functionality in two ways. First, we localized regions whose activity patterns at the subject level showed an effect of both confidence and surprise in qualitative agreement with the confidence-weighting principle. Second, we found neural representations of ideal confidence that also accounted for subjective confidence. Taken together, those results identify a set of cortical regions potentially implicated in the confidence-weighting of learning.
Affiliation(s)
- Tiffany Bounmy
- Cognitive Neuroimaging Unit, CEA DRF/Joliot, INSERM, Université Paris-Saclay, NeuroSpin Center, Gif-sur-Yvette, France; Université de Paris, Paris, France.
| | - Evelyn Eger
- Cognitive Neuroimaging Unit, CEA DRF/Joliot, INSERM, Université Paris-Saclay, NeuroSpin Center, Gif-sur-Yvette, France
| | - Florent Meyniel
- Cognitive Neuroimaging Unit, CEA DRF/Joliot, INSERM, Université Paris-Saclay, NeuroSpin Center, Gif-sur-Yvette, France.
| |
|
20
|
Conceptualisation of Uncertainty in Decision Neuroscience Research: Do We Really Know What Types of Uncertainties The Measured Neural Correlates Relate To? Integr Psychol Behav Sci 2023; 57:88-116. [PMID: 35943682 DOI: 10.1007/s12124-022-09719-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/14/2022] [Indexed: 01/13/2023]
Abstract
In the article "What are neural correlates neural correlates of?" published in the journal BioSocieties, Gabriel Abend points out that neuroscientists cannot avoid philosophical questions concerning the conceptualization and operationalization of social-psychological phenomena they deal with at the physiological level. In this article, we build on Abend's thesis and, through a systematic literature review of decision neuroscience studies, test it with the example of the social-psychological phenomenon of uncertainty in decision making. In this paper, we provide an overview of studies that appropriately attempt to conceptualise uncertainty, and then use these studies to analyse papers looking for neural correlates of uncertainty. Based on a systematic review of studies, we investigate what types of uncertainty authors in the field of decision neuroscience address and define, what criteria they use to distinguish between these types, what problems are associated with their conceptualization, and whether the neural correlates of different types of uncertainty can be accurately identified. The paper concludes that, particularly in the economic context, a collaboration between the natural and social sciences works well, and neuroscience studies use economic conceptualizations of uncertainty that are further developed by sophisticated decision tasks. However, the paper also highlights problematic aspects that obscure the understanding of the phenomena under study. These include the lack of criteria for distinguishing between different types of phenomena, the unclear use of the general concept of uncertainty, and the confusion of phenomena or their erroneous synonymous use.
|
21
|
de A Marcelino AL, Gray O, Al-Fatly B, Gilmour W, Douglas Steele J, Kühn AA, Gilbertson T. Pallidal neuromodulation of the explore/exploit trade-off in decision-making. eLife 2023; 12:79642. [PMID: 36727860 PMCID: PMC9940911 DOI: 10.7554/elife.79642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Accepted: 02/01/2023] [Indexed: 02/03/2023] Open
Abstract
Every decision that we make involves a conflict between exploiting our current knowledge of an action's value or exploring alternative courses of action that might lead to a better, or worse, outcome. The sub-cortical nuclei that make up the basal ganglia have been proposed as a neural circuit that may contribute to resolving this explore-exploit 'dilemma'. To test this hypothesis, we examined the effects of neuromodulating the basal ganglia's output nucleus, the globus pallidus interna, in patients who had undergone deep brain stimulation (DBS) for isolated dystonia. Neuromodulation enhanced the number of exploratory choices to the lower value option in a two-armed bandit probabilistic reversal-learning task. Enhanced exploration was explained by a reduction in the rate of evidence accumulation (drift rate) in a reinforcement learning drift diffusion model. We estimated the functional connectivity profile between the stimulating DBS electrode and the rest of the brain using a normative functional connectome derived from healthy controls. Variation in the extent of neuromodulation-induced exploration between patients was associated with functional connectivity from the stimulation electrode site to a distributed brain functional network. We conclude that the basal ganglia's output nucleus, the globus pallidus interna, can adaptively modify decision choice when faced with the dilemma to explore or exploit.
Affiliation(s)
- Ana Luisa de A Marcelino
- Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité Campus Mitte, Berlin, Germany
- Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Core Facility Genomics, Berlin, Germany
| | - Owen Gray
- Division of Imaging Science and Technology, Medical School, University of Dundee, Dundee, United Kingdom
| | - Bassam Al-Fatly
- Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité Campus Mitte, Berlin, Germany
| | - William Gilmour
- Division of Imaging Science and Technology, Medical School, University of Dundee, Dundee, United Kingdom
| | - J Douglas Steele
- Division of Imaging Science and Technology, Medical School, University of Dundee, Dundee, United Kingdom
| | - Andrea A Kühn
- Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité Campus Mitte, Berlin, Germany
- Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Core Facility Genomics, Berlin, Germany
- Berlin School of Mind and Brain, Charité - University Medicine Berlin, Berlin, Germany
- NeuroCure, Charité - University Medicine Berlin, Berlin, Germany
- DZNE, German Centre for Degenerative Diseases, Berlin, Germany
| | - Tom Gilbertson
- Division of Imaging Science and Technology, Medical School, University of Dundee, Dundee, United Kingdom
- Department of Neurology, Ninewells Hospital & Medical School, Dundee, United Kingdom
| |
|
22
|
Trait somatic anxiety is associated with reduced directed exploration and underestimation of uncertainty. Nat Hum Behav 2023; 7:102-113. [PMID: 36192493 DOI: 10.1038/s41562-022-01455-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/22/2021] [Accepted: 08/26/2022] [Indexed: 02/01/2023]
Abstract
Anxiety has been related to decreased physical exploration, but past findings on the interaction between anxiety and exploration during decision making were inconclusive. Here we examined how latent factors of trait anxiety relate to different exploration strategies when facing volatility-induced uncertainty. Across two studies (total N = 985), we demonstrated that people used a hybrid of directed, random and undirected exploration strategies, which were respectively sensitive to relative uncertainty, total uncertainty and value difference. Trait somatic anxiety, that is, the propensity to experience physical symptoms of anxiety, was inversely correlated with directed exploration and undirected exploration, manifesting as a lesser likelihood for choosing the uncertain option and reducing choice stochasticity regardless of uncertainty. Somatic anxiety is also associated with underestimation of relative uncertainty. Together, these results reveal the selective role of trait somatic anxiety in modulating both uncertainty-driven and value-driven exploration strategies.
23
Kamat A, Makled B, Norfleet J, Schwaitzberg SD, Intes X, De S, Dutta A. Directed information flow during laparoscopic surgical skill acquisition dissociated skill level and medical simulation technology. NPJ SCIENCE OF LEARNING 2022; 7:19. [PMID: 36008451 PMCID: PMC9411170 DOI: 10.1038/s41539-022-00138-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/04/2021] [Accepted: 08/04/2022] [Indexed: 05/11/2023]
Abstract
The virtual reality (VR) simulator has emerged as a laparoscopic surgical skill training tool that needs validation using brain-behavior analysis. Therefore, the relationship between brain networks and skilled behavior was evaluated using functional near-infrared spectroscopy (fNIRS) from seven experienced right-handed surgeons and six right-handed medical students during the performance of Fundamentals of Laparoscopic Surgery (FLS) pattern of cutting tasks in a physical and a VR simulator. Multiple regression and path analysis (MRPA) found that the FLS performance score was statistically significantly related to the interregional directed functional connectivity from the right prefrontal cortex to the supplementary motor area with F(2, 114) = 9, p < 0.001, and R² = 0.136. Additionally, a two-way multivariate analysis of variance (MANOVA) found a statistically significant effect of the simulator technology on the interregional directed functional connectivity from the right prefrontal cortex to the left primary motor cortex (F(1, 15) = 6.002, p = 0.027; partial η² = 0.286) that can be related to differential right-lateralized executive control of attention. Then, MRPA found that the coefficient of variation (CoV) of the FLS performance score was statistically significantly associated with the CoV of the interregionally directed functional connectivity from the right primary motor cortex to the left primary motor cortex and the left primary motor cortex to the left prefrontal cortex with F(2, 22) = 3.912, p = 0.035, and R² = 0.262. This highlighted the importance of the efference copy information from the motor cortices to the prefrontal cortex for postulated left-lateralized perceptual decision-making to reduce behavioral variability.
Affiliation(s)
- Anil Kamat
  - Center for Modeling, Simulation and Imaging in Medicine, Rensselaer Polytechnic Institute, Troy, NY, USA
- Basiel Makled
  - US Army Futures Command, Combat Capabilities Development Command Soldier Center STTC, Orlando, FL, USA
- Jack Norfleet
  - US Army Futures Command, Combat Capabilities Development Command Soldier Center STTC, Orlando, FL, USA
- Xavier Intes
  - Center for Modeling, Simulation and Imaging in Medicine, Rensselaer Polytechnic Institute, Troy, NY, USA
  - Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Suvranu De
  - Center for Modeling, Simulation and Imaging in Medicine, Rensselaer Polytechnic Institute, Troy, NY, USA
  - Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Anirban Dutta
  - Neuroengineering and Informatics for Rehabilitation Laboratory, Department of Biomedical Engineering, University at Buffalo, Buffalo, NY, USA
24
Dennison JB, Sazhin D, Smith DV. Decision neuroscience and neuroeconomics: Recent progress and ongoing challenges. WILEY INTERDISCIPLINARY REVIEWS. COGNITIVE SCIENCE 2022; 13:e1589. [PMID: 35137549 PMCID: PMC9124684 DOI: 10.1002/wcs.1589] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/05/2020] [Revised: 11/28/2021] [Accepted: 12/21/2021] [Indexed: 01/10/2023]
Abstract
In the past decade, decision neuroscience and neuroeconomics have developed many new insights in the study of decision making. This review provides an overarching update on how the field has advanced in this time period. Although our initial review a decade ago outlined several theoretical, conceptual, methodological, empirical, and practical challenges, there has only been limited progress in resolving these challenges. We summarize significant trends in decision neuroscience through the lens of the challenges outlined for the field and review examples where the field has had significant, direct, and applicable impacts across economics and psychology. First, we review progress on topics including reward learning, explore-exploit decisions, risk and ambiguity, intertemporal choice, and valuation. Next, we assess the impacts of emotion, social rewards, and social context on decision making. Then, we follow up with how individual differences impact choices and new exciting developments in the prediction and neuroforecasting of future decisions. Finally, we consider how trends in decision-neuroscience research reflect progress toward resolving past challenges, discuss new and exciting applications of recent research, and identify new challenges for the field. This article is categorized under: Psychology > Reasoning and Decision Making; Psychology > Emotion and Motivation.
Affiliation(s)
- Jeffrey B Dennison
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
- Daniel Sazhin
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
- David V Smith
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
25
Cogliati Dezza I, Cleeremans A, Alexander WH. Independent and interacting value systems for reward and information in the human brain. eLife 2022; 11:66358. [PMID: 35416151 PMCID: PMC9064296 DOI: 10.7554/elife.66358] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/08/2021] [Accepted: 03/29/2022] [Indexed: 11/17/2022]
Abstract
Theories of prefrontal cortex (PFC) as optimizing reward value have been widely deployed to explain its activity in a diverse range of contexts, with substantial empirical support in neuroeconomics and decision neuroscience. Similar neural circuits, however, have also been associated with information processing. By using computational modeling, model-based functional magnetic resonance imaging analysis, and a novel experimental paradigm, we aim at establishing whether a dedicated and independent value system for information exists in the human PFC. We identify two regions in the human PFC that independently encode reward and information. Our results provide empirical evidence for PFC as an optimizer of independent information and reward signals during decision-making under realistic scenarios, with potential implications for the interpretation of PFC activity in both healthy and clinical populations.
Affiliation(s)
- Irene Cogliati Dezza
  - Department of Experimental Psychology, University College London, London, United Kingdom
- Axel Cleeremans
  - Center for Research in Cognition and Neurosciences, Université Libre de Bruxelles, Brussels, Belgium
- William H Alexander
  - Center for Complex Systems and Brain Sciences, Florida Atlantic University, Boca Raton, United States
26
Poli F, Meyer M, Mars RB, Hunnius S. Contributions of expected learning progress and perceptual novelty to curiosity-driven exploration. Cognition 2022; 225:105119. [PMID: 35421742 PMCID: PMC9194910 DOI: 10.1016/j.cognition.2022.105119] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/04/2021] [Revised: 03/31/2022] [Accepted: 04/01/2022] [Indexed: 11/30/2022]
Abstract
Exploration is curiosity-driven when it relies on the intrinsic motivation to know rather than on extrinsic rewards. Recent evidence shows that artificial agents perform better on a variety of tasks when their learning is curiosity-driven, and humans often engage in curiosity-driven learning when sampling information from the environment. However, which mechanisms underlie curiosity is still unclear. Here, we let participants freely explore different unknown environments that contained learnable sequences of events with varying degrees of noise and volatility. A hierarchical reinforcement learning model captured how participants were learning in these different kinds of unknown environments, and it also tracked the errors they expected to make and the learning opportunities they were planning to seek. With this computational approach, we show that participants' exploratory behavior is guided by learning progress and perceptual novelty. Moreover, we demonstrate an overall tendency of participants to avoid extreme forms of uncertainty. These findings elucidate the cognitive mechanisms that underlie curiosity-driven exploration of unknown environments. Implications of this novel way of quantifying curiosity within a reinforcement learning framework are discussed.
Affiliation(s)
- Francesco Poli
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Marlene Meyer
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
- Rogier B Mars
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
  - Wellcome Centre for Integrative Neuroimaging, Centre for Functional MRI of the Brain (FMRIB), Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
- Sabine Hunnius
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
27
A neural and behavioral trade-off between value and uncertainty underlies exploratory decisions in normative anxiety. Mol Psychiatry 2022; 27:1573-1587. [PMID: 34725456 DOI: 10.1038/s41380-021-01363-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/23/2021] [Revised: 10/10/2021] [Accepted: 10/14/2021] [Indexed: 11/08/2022]
Abstract
Exploration reduces uncertainty about the environment and improves the quality of future decisions, but at the cost of provisional uncertain and suboptimal outcomes. Although anxiety promotes intolerance to uncertainty, it remains unclear whether and by which mechanisms anxiety relates to exploratory decision-making. We use a dynamic three-armed-bandit task and find that higher trait-anxiety is associated with increased exploration, which in turn harms overall performance. We identify two distinct behavioral sources: first, decisions made by anxious individuals are guided toward reduction of uncertainty; and second, decisions are less guided by immediate value gains. These findings are similar in both loss and gain domains, and further demonstrate that an affective trait relates to exploration and results in an inverse-U-shaped relationship between anxiety and overall performance. Additional imaging data (fMRI) suggests that normative anxiety correlates negatively with the representation of expected-value in the dorsal-anterior-cingulate-cortex, and in contrast, positively with the representation of uncertainty in the anterior-insula. We conclude that a trade-off between value-gains and uncertainty-reduction entails maladaptive decision-making in individuals with higher normal-range anxiety.
28
Unni A, Trende A, Pauley C, Weber L, Biebl B, Kacianka S, Lüdtke A, Bengler K, Pretschner A, Fränzle M, Rieger JW. Investigating Differences in Behavior and Brain in Human-Human and Human-Autonomous Vehicle Interactions in Time-Critical Situations. FRONTIERS IN NEUROERGONOMICS 2022; 3:836518. [PMID: 38235443 PMCID: PMC10790869 DOI: 10.3389/fnrgo.2022.836518] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/15/2021] [Accepted: 02/01/2022] [Indexed: 01/19/2024]
Abstract
Some studies provide evidence that humans could actively exploit the alleged technological advantages of autonomous vehicles (AVs). This implies that humans may interact differently with AVs than with human-driven vehicles (HVs), knowing that AVs are programmed to be risk-averse. Hence, it is important to investigate how humans interact with AVs in complex traffic situations. Here, we investigated whether participants would value interactions with AVs differently compared to HVs, and whether these differences can be characterized at the behavioral and brain levels. While recording whole-head brain activity using fNIRS, we presented participants with a cover story that they were driving under time pressure through urban traffic in the presence of other HVs and AVs. Moreover, the AVs were programmed defensively to avoid collisions and had faster braking reaction times than HVs. Participants would receive a monetary reward if they managed to finish the driving block within a given time limit without risky driving maneuvers. During the drive, participants were repeatedly confronted with left-lane turning situations at unsignalized intersections. They had to stop and find a gap to turn in front of an oncoming stream of vehicles consisting of HVs and AVs. While the behavioral results did not show any significant difference between the safety margin used during the turning maneuvers with respect to AVs or HVs, participants tended to be more certain in their decision-making process while turning in front of AVs, as reflected by the smaller variance in gap size acceptance compared to HVs. Importantly, using a multivariate logistic regression approach, we were able to predict whether the participants decided to turn in front of HVs or AVs from whole-head fNIRS in the decision-making phase for every participant (mean accuracy = 67.2%, SD = 5%). Channel-wise univariate fNIRS analysis revealed increased brain activation differences for turning in front of AVs compared to HVs in brain areas that represent the valuation of actions taken during decision-making. The insights provided here may be useful for the development of control systems to assess interactions in future mixed traffic environments involving AVs and HVs.
Affiliation(s)
- Anirudh Unni
  - Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Alexander Trende
  - OFFIS Institute for Information Technology, Division of Transportation Research, Oldenburg, Germany
- Claire Pauley
  - Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Lars Weber
  - OFFIS Institute for Information Technology, Division of Transportation Research, Oldenburg, Germany
- Bianca Biebl
  - Chair of Ergonomics, Technical University of Munich, Garching, Germany
- Severin Kacianka
  - Chair of Software and Systems Engineering, Technical University of Munich, Garching, Germany
- Andreas Lüdtke
  - OFFIS Institute for Information Technology, Division of Transportation Research, Oldenburg, Germany
- Klaus Bengler
  - Chair of Ergonomics, Technical University of Munich, Garching, Germany
- Alexander Pretschner
  - Chair of Software and Systems Engineering, Technical University of Munich, Garching, Germany
- Martin Fränzle
  - OFFIS Institute for Information Technology, Division of Transportation Research, Oldenburg, Germany
  - Department of Computer Science, University of Oldenburg, Oldenburg, Germany
- Jochem W. Rieger
  - Department of Psychology, University of Oldenburg, Oldenburg, Germany
29
Kamiya T, Takahashi T. Softsatisficing: Risk-sensitive softmax action selection. Biosystems 2022; 213:104633. [DOI: 10.1016/j.biosystems.2022.104633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Revised: 01/26/2022] [Accepted: 01/27/2022] [Indexed: 12/01/2022]
30
Womelsdorf T, Watson MR, Tiesinga P. Learning at Variable Attentional Load Requires Cooperation of Working Memory, Meta-learning, and Attention-augmented Reinforcement Learning. J Cogn Neurosci 2021; 34:79-107. [PMID: 34813644 PMCID: PMC9830786 DOI: 10.1162/jocn_a_01780] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/12/2023]
Abstract
Flexible learning of changing reward contingencies can be realized with different strategies. A fast learning strategy involves using working memory of recently rewarded objects to guide choices. A slower learning strategy uses prediction errors to gradually update value expectations to improve choices. How the fast and slow strategies work together in scenarios with real-world stimulus complexity is not well known. Here, we aim to disentangle their relative contributions in rhesus monkeys while they learned the relevance of object features at variable attentional load. We found that learning behavior across six monkeys is consistently best predicted with a model combining (i) fast working memory and (ii) slower reinforcement learning from differently weighted positive and negative prediction errors as well as (iii) selective suppression of nonchosen feature values and (iv) a meta-learning mechanism that enhances exploration rates based on a memory trace of recent errors. The optimal model parameter settings suggest that these mechanisms cooperate differently at low and high attentional loads. Whereas working memory was essential for efficient learning at lower attentional loads, enhanced weighting of negative prediction errors and meta-learning were essential for efficient learning at higher attentional loads. Together, these findings pinpoint a canonical set of learning mechanisms and suggest how they may cooperate when subjects flexibly adjust to environments with variable real-world attentional demands.
Affiliation(s)
- Thilo Womelsdorf
  - Department of Psychology, Vanderbilt University, Nashville, TN 37240
- Marcus R. Watson
  - School of Kinesiology and Health Science, Centre for Vision Research, York University, 4700 Keele Street, Toronto, Ontario M6J 1P3, Canada
- Paul Tiesinga
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen 6525 EN, Netherlands
31
Foucault C, Meyniel F. Gated recurrence enables simple and accurate sequence prediction in stochastic, changing, and structured environments. eLife 2021; 10:71801. [PMID: 34854377 PMCID: PMC8735865 DOI: 10.7554/elife.71801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2021] [Accepted: 12/01/2021] [Indexed: 11/13/2022]
Abstract
From decision making to perception to language, predicting what is coming next is crucial. It is also challenging in stochastic, changing, and structured environments; yet the brain makes accurate predictions in many situations. What computational architecture could enable this feat? Bayesian inference makes optimal predictions but is prohibitively difficult to compute. Here, we show that a specific recurrent neural network architecture enables simple and accurate solutions in several environments. This architecture relies on three mechanisms: gating, lateral connections, and recurrent weight training. Like the optimal solution and the human brain, such networks develop internal representations of their changing environment (including estimates of the environment’s latent variables and the precision of these estimates), leverage multiple levels of latent structure, and adapt their effective learning rate to changes without changing their connection weights. Being ubiquitous in the brain, gated recurrence could therefore serve as a generic building block to predict in real-life environments.
Affiliation(s)
- Cédric Foucault
  - INSERM, CEA, Université Paris-Saclay, Gif sur Yvette, France
32
Spreng RN, Turner GR. From exploration to exploitation: a shifting mental mode in late life development. Trends Cogn Sci 2021; 25:1058-1071. [PMID: 34593321 PMCID: PMC8844884 DOI: 10.1016/j.tics.2021.09.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/16/2021] [Revised: 08/30/2021] [Accepted: 09/01/2021] [Indexed: 12/31/2022]
Abstract
Changes in cognition, affect, and brain function combine to promote a shift in the nature of mentation in older adulthood, favoring exploitation of prior knowledge over exploratory search as the starting point for thought and action. Age-related exploitation biases result from the accumulation of prior knowledge, reduced cognitive control, and a shift toward affective goals. These are accompanied by changes in cortical networks, as well as attention and reward circuits. By incorporating these factors into a unified account, the exploration-to-exploitation shift offers an integrative model of cognitive, affective, and brain aging. Here, we review evidence for this model, identify determinants and consequences, and survey the challenges and opportunities posed by an exploitation-biased mental mode in later life.
Affiliation(s)
- R Nathan Spreng
  - Laboratory of Brain and Cognition, Montreal Neurological Institute, Department of Neurology and Neurosurgery, McGill University, Montreal, QC H3A 2B4, Canada
  - McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montreal, QC H3A 2B4, Canada
  - Departments of Psychiatry and Psychology, McGill University, Montreal, QC H3A 0G4, Canada
- Gary R Turner
  - Department of Psychology, York University, Toronto, ON M3J 1P3, Canada
33
Dissociable mechanisms of information sampling in prefrontal cortex and the dopaminergic system. Curr Opin Behav Sci 2021. [DOI: 10.1016/j.cobeha.2021.04.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
34
Zhen S, Yaple ZA, Eickhoff SB, Yu R. To learn or to gain: neural signatures of exploration in human decision-making. Brain Struct Funct 2021; 227:63-76. [PMID: 34596757 DOI: 10.1007/s00429-021-02389-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/19/2020] [Accepted: 09/19/2021] [Indexed: 11/26/2022]
Abstract
Individuals not only take actions to obtain immediate rewards but also to gain more information to guide future choices. An ideal exploration-exploitation balance is crucial for maximizing reward over the long run. However, the neural signatures of exploration in humans remain unclear. Using quantitative meta-analyses of functional magnetic resonance imaging experiments on exploratory behaviors, we sought to identify the concordant activity pertaining to exploration over a range of experiments. The results revealed that exploration activates concordant brain activity associated with risk (e.g., dorsal medial prefrontal cortex and anterior insula), cognitive control (e.g., dorsolateral prefrontal cortex and inferior frontal gyrus), and motor processing (e.g., premotor cortex). These stereotaxic maps of exploration may indicate that exploration is highly linked to risk processing, but is also specifically associated with regions involved in executive control processes. Although this explanation should be treated as exploratory, these findings support theories positing an important role for the prefrontal-insular-motor cortical network in exploration.
Affiliation(s)
- Shanshan Zhen
  - Department of Management, Hong Kong Baptist University, Hong Kong, China
- Zachary A Yaple
  - Department of Psychology, Faculty of Health, York University, Toronto, ON, Canada
- Simon B Eickhoff
  - Medical Faculty, Institute of Systems Neuroscience, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  - Institute of Neuroscience and Medicine, Brain and Behaviour (INM-7), Research Centre Jülich, Jülich, Germany
- Rongjun Yu
  - Department of Management, Hong Kong Baptist University, Hong Kong, China
35
Dezza IC, Noel X, Cleeremans A, Yu AJ. Distinct motivations to seek out information in healthy individuals and problem gamblers. Transl Psychiatry 2021; 11:408. [PMID: 34312367 PMCID: PMC8313706 DOI: 10.1038/s41398-021-01523-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/04/2021] [Revised: 06/04/2021] [Accepted: 06/28/2021] [Indexed: 02/07/2023]
Abstract
As massive amounts of information are becoming available to people, understanding the mechanisms underlying information-seeking is more pertinent today than ever. In this study, we investigate the underlying motivations to seek out information in healthy and addicted individuals. We developed a novel decision-making task and a novel computational model that allow dissociating the relative contribution of two motivating factors to seek out information: a desire for novelty and a general desire for knowledge. To investigate whether and how the motivations to seek out information vary between healthy and addicted individuals, in addition to healthy controls we included a sample of individuals with gambling disorder, a form of addiction without the confound of substance consumption and characterized by compulsive gambling. Our results indicate that healthy subjects and problem gamblers adopt distinct information-seeking "modes". Healthy information-seeking behavior was mostly motivated by a desire for novelty. Problem gamblers, on the contrary, displayed reduced novelty-seeking and an increased desire for accumulating knowledge compared to healthy controls. Our findings not only shed new light on the motivations driving healthy and addicted individuals to seek out information, but they also have important implications for the treatment and diagnosis of behavioral addiction.
Affiliation(s)
- Irene Cogliati Dezza
  - Centre for Research in Cognition and Neurosciences, ULB Neuroscience Institute, Université Libre de Bruxelles, Bruxelles, Belgium
  - Department of Experimental Psychology, Faculty of Brain Sciences, University College London, London, UK
  - The Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Xavier Noel
  - Faculty of Medicine, Université Libre de Bruxelles, Bruxelles, Belgium
- Axel Cleeremans
  - Centre for Research in Cognition and Neurosciences, ULB Neuroscience Institute, Université Libre de Bruxelles, Bruxelles, Belgium
- Angela J. Yu
  - Department of Cognitive Science, University of California San Diego, San Diego, USA
36
Gilbertson T, Steele D. Tonic dopamine, uncertainty and basal ganglia action selection. Neuroscience 2021; 466:109-124. [PMID: 34015370 DOI: 10.1016/j.neuroscience.2021.05.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/18/2020] [Revised: 05/04/2021] [Accepted: 05/08/2021] [Indexed: 11/29/2022]
Abstract
To make optimal decisions in uncertain circumstances, flexible adaptation of behaviour is required: exploring alternatives when the best choice is unknown, and exploiting what is known when that is best. Using a computational model of the basal ganglia, we propose that switches between exploratory and exploitative decisions are mediated by the interaction between tonic dopamine and cortical input to the basal ganglia. We show that a biologically detailed action selection circuit model, endowed with dopamine-dependent striatal plasticity, can optimally solve the explore-exploit problem, estimating the true underlying state of a noisy Gaussian diffusion process. Critical to the model's performance was a fluctuating level of tonic dopamine which increased under conditions of uncertainty. With an optimal range of tonic dopamine, explore-exploit decisions were mediated by the effects of tonic dopamine on the precision of the model action selection mechanism. Under conditions of uncertain reward pay-out, the model's reduced selectivity allowed disinhibition of multiple alternative actions to be explored at random. Conversely, when uncertainty about reward pay-out was low, enhanced selectivity of the action selection circuit facilitated exploitation of the high-value choice. Model performance was at the level of a Kalman filter, which provides an optimal solution for the task. These simulations support the idea that this subcortical neural circuit may have evolved to facilitate decision making in non-stationary reward environments. The model generates several experimental predictions with relevance to abnormal decision making in neuropsychiatric and neurological disease.
Affiliation(s)
- Tom Gilbertson
  - Department of Neurology, Level 6, South Block, Ninewells Hospital & Medical School, Dundee DD2 4BF, UK
  - Division of Imaging Science and Technology, Medical School, University of Dundee, DD2 4BF, UK
- Douglas Steele
  - Division of Imaging Science and Technology, Medical School, University of Dundee, DD2 4BF, UK
37
Wilson RC, Bonawitz E, Costa VD, Ebitz RB. Balancing exploration and exploitation with information and randomization. Curr Opin Behav Sci 2021; 38:49-56. [PMID: 33184605 PMCID: PMC7654823 DOI: 10.1016/j.cobeha.2020.10.001] [Citation(s) in RCA: 77] [Impact Index Per Article: 25.7] [Indexed: 12/28/2022]
Abstract
Explore-exploit decisions require us to trade off the benefits of exploring unknown options to learn more about them, with exploiting known options, for immediate reward. Such decisions are ubiquitous in nature, but from a computational perspective, they are notoriously hard. There is therefore much interest in how humans and animals make these decisions and recently there has been an explosion of research in this area. Here we provide a biased and incomplete snapshot of this field focusing on the major finding that many organisms use two distinct strategies to solve the explore-exploit dilemma: a bias for information ('directed exploration') and the randomization of choice ('random exploration'). We review evidence for the existence of these strategies, their computational properties, their neural implementations, as well as how directed and random exploration vary over the lifespan. We conclude by highlighting open questions in this field that are ripe to both explore and exploit.
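As a rough illustration of the two strategies named in this abstract, the sketch below treats directed exploration as an uncertainty bonus added to each option's value and random exploration as a softmax temperature. This is a common textbook formulation, not the model from the cited review; the function name `choice_probabilities` and the parameters `info_bonus` and `temperature` are hypothetical.

```python
import math

def choice_probabilities(values, uncertainties, info_bonus=0.0, temperature=1.0):
    """Softmax choice rule over utility = value + info_bonus * uncertainty.

    info_bonus  > 0 biases choice toward uncertain options (directed exploration).
    temperature > 0 scales decision noise; higher values flatten the
    probabilities (random exploration). Both names are illustrative assumptions.
    """
    utilities = [v + info_bonus * u for v, u in zip(values, uncertainties)]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Option 0 has the higher estimated value; option 1 is the more uncertain one.
values = [1.0, 0.8]
uncertainties = [0.1, 1.0]

exploit = choice_probabilities(values, uncertainties, info_bonus=0.0, temperature=0.1)
directed = choice_probabilities(values, uncertainties, info_bonus=0.5, temperature=0.1)
noisy = choice_probabilities(values, uncertainties, info_bonus=0.0, temperature=5.0)

print("exploit :", exploit)
print("directed:", directed)
print("noisy   :", noisy)
```

With these made-up numbers, the information bonus shifts preference toward the uncertain option, whereas the high temperature pushes both probabilities toward one half regardless of uncertainty.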
Collapse
Affiliation(s)
- Robert C. Wilson
- Department of Psychology, University of Arizona, Tucson AZ USA
- Cognitive Science Program, University of Arizona, Tucson AZ USA
- Evelyn F. McKnight Brain Institute, University of Arizona, Tucson AZ USA
- Vincent D. Costa
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland OR USA
- R. Becket Ebitz
- Department of Neuroscience, University of Montréal, Montréal, Québec, Canada
|
38
|
Feng SF, Wang S, Zarnescu S, Wilson RC. The dynamics of explore-exploit decisions reveal a signal-to-noise mechanism for random exploration. Sci Rep 2021; 11:3077. [PMID: 33542333 PMCID: PMC7862437 DOI: 10.1038/s41598-021-82530-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/13/2020] [Accepted: 12/16/2020] [Indexed: 12/29/2022]
Abstract
Growing evidence suggests that behavioral variability plays a critical role in how humans manage the tradeoff between exploration and exploitation. In these decisions a little variability can help us to overcome the desire to exploit known rewards by encouraging us to randomly explore something else. Here we investigate how such 'random exploration' could be controlled using a drift-diffusion model of the explore-exploit choice. In this model, variability is controlled by either the signal-to-noise ratio with which reward is encoded (the 'drift rate'), or the amount of information required before a decision is made (the 'threshold'). By fitting this model to behavior, we find that while, statistically, both drift and threshold change when people randomly explore, numerically, the change in drift rate has by far the largest effect. This suggests that random exploration is primarily driven by changes in the signal-to-noise ratio with which reward information is represented in the brain.
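To illustrate how drift rate and threshold can each control choice variability, the sketch below uses the standard closed-form choice probability of a simple drift-diffusion process. It assumes symmetric bounds at plus and minus `threshold`, an unbiased starting point, and a constant drift; these assumptions, and the function name `p_choose_upper`, are illustrative and not taken from the paper's fitted model.

```python
import math


def p_choose_upper(drift, threshold, noise_sd=1.0):
    """Probability of hitting the upper bound before the lower bound.

    Assumes bounds at +threshold and -threshold, a start at 0, and
    constant drift with Gaussian noise of standard deviation noise_sd.
    """
    # The closed form is a logistic function of 2 * drift * threshold / noise^2.
    k = 2.0 * drift * threshold / noise_sd ** 2
    # tanh form of the logistic avoids overflow for large |k|.
    return 0.5 * (1.0 + math.tanh(k / 2.0))
```

Under these assumptions, lowering either the drift rate or the threshold moves the choice probability toward 0.5, which is to say toward more random choices. The two parameters are separable only when response times are also modelled, since they have different effects on decision time.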
Affiliation(s)
- Samuel F Feng
- Department of Mathematics, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Khalifa University Centre for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, UAE
- Siyu Wang
- Department of Psychology, University of Arizona, Tucson, AZ, USA
- Sylvia Zarnescu
- Department of Psychology, University of Arizona, Tucson, AZ, USA
- Robert C Wilson
- Department of Psychology, University of Arizona, Tucson, AZ, USA
- Cognitive Science Program, University of Arizona, Tucson, AZ, USA
|
39
|
Tessereau C, O’Dea R, Coombes S, Bast T. Reinforcement learning approaches to hippocampus-dependent flexible spatial navigation. Brain Neurosci Adv 2021; 5:2398212820975634. [PMID: 33954259 PMCID: PMC8042550 DOI: 10.1177/2398212820975634] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/31/2020] [Accepted: 10/21/2020] [Indexed: 11/17/2022]
Abstract
Humans and non-human animals show great flexibility in spatial navigation, including the ability to return to specific locations based on as few as one single experience. To study spatial navigation in the laboratory, watermaze tasks, in which rats have to find a hidden platform in a pool of cloudy water surrounded by spatial cues, have long been used. Analogous tasks have been developed for human participants using virtual environments. Spatial learning in the watermaze is facilitated by the hippocampus. In particular, rapid, one-trial, allocentric place learning, as measured in the delayed-matching-to-place variant of the watermaze task, which requires rodents to learn repeatedly new locations in a familiar environment, is hippocampal dependent. In this article, we review some computational principles, embedded within a reinforcement learning framework, that utilise hippocampal spatial representations for navigation in watermaze tasks. We consider which key elements underlie their efficacy, and discuss their limitations in accounting for hippocampus-dependent navigation, both in terms of behavioural performance (i.e. how well do they reproduce behavioural measures of rapid place learning) and neurobiological realism (i.e. how well do they map to neurobiological substrates involved in rapid place learning). We discuss how an actor-critic architecture, enabling simultaneous assessment of the value of the current location and of the optimal direction to follow, can reproduce one-trial place learning performance as shown on watermaze and virtual delayed-matching-to-place tasks by rats and humans, respectively, if complemented with map-like place representations. The contribution of actor-critic mechanisms to delayed-matching-to-place performance is consistent with neurobiological findings implicating the striatum and hippocampo-striatal interaction in delayed-matching-to-place performance, given that the striatum has been associated with actor-critic mechanisms. 
Moreover, we illustrate that hierarchical computations embedded within an actor-critic architecture may help to account for aspects of flexible spatial navigation. The hierarchical reinforcement learning approach separates trajectory control via a temporal-difference error from goal selection via a goal prediction error and may account for flexible, trial-specific, navigation to familiar goal locations, as required in some arm-maze place memory tasks, although it does not capture one-trial learning of new goal locations, as observed in open field, including watermaze and virtual, delayed-matching-to-place tasks. Future models of one-shot learning of new goal locations, as observed on delayed-matching-to-place tasks, should incorporate hippocampal plasticity mechanisms that integrate new goal information with allocentric place representation, as such mechanisms are supported by substantial empirical evidence.
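As a rough illustration of the actor-critic mechanism this abstract refers to, the sketch below shows a single tabular update in which one temporal-difference error trains both the critic (state values) and the actor (action preferences). It is a generic textbook form, not the reviewed models; the state and action labels, learning rates, and discount factor are hypothetical.

```python
def actor_critic_update(values, prefs, state, action, reward, next_state,
                        alpha_critic=0.1, alpha_actor=0.1, gamma=0.9):
    """Apply one tabular actor-critic update in place and return the TD error.

    values -- dict mapping state -> estimated value (the critic)
    prefs  -- dict mapping (state, action) -> preference (the actor)
    """
    # Temporal-difference error: outcome plus discounted next value,
    # minus what the critic expected for the current state.
    delta = reward + gamma * values.get(next_state, 0.0) - values.get(state, 0.0)
    # Critic: move the value of the current state toward the new estimate.
    values[state] = values.get(state, 0.0) + alpha_critic * delta
    # Actor: strengthen (or weaken) the action taken, using the same error.
    key = (state, action)
    prefs[key] = prefs.get(key, 0.0) + alpha_actor * delta
    return delta
```

For example, with `values = {"a": 0.0, "goal": 1.0}` and an empty `prefs`, a step from `"a"` to `"goal"` with zero reward yields a positive error, so both the value of `"a"` and the preference for the chosen action increase.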
Affiliation(s)
- Charline Tessereau
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
- School of Psychology, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
- Reuben O’Dea
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
- Stephen Coombes
- School of Mathematical Sciences, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
- Tobias Bast
- School of Psychology, University of Nottingham, Nottingham, UK
- Neuroscience@Nottingham
|
40
|
|