1. Gershman SJ, Lak A. Policy complexity suppresses dopamine responses. bioRxiv [Preprint] 2024:2024.09.15.613150. PMID: 39345642; PMCID: PMC11429712; DOI: 10.1101/2024.09.15.613150.
Abstract
Limits on information processing capacity impose limits on task performance. We show that animals achieve performance on a perceptual decision task that is near-optimal given their capacity limits, as measured by policy complexity (the mutual information between states and actions). This behavioral profile could be achieved by reinforcement learning with a penalty on high complexity policies, realized through modulation of dopaminergic learning signals. In support of this hypothesis, we find that policy complexity suppresses midbrain dopamine responses to reward outcomes, thereby reducing behavioral sensitivity to these outcomes. Our results suggest that policy compression shapes basic mechanisms of reinforcement learning in the brain.
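Policy complexity, the key quantity in this abstract, is the mutual information I(S;A) between states and actions. As an illustrative sketch (not the authors' code; function and variable names are mine), it can be computed for a discrete policy as follows:

```python
import numpy as np

def policy_complexity(p_state, policy):
    """Policy complexity I(S;A) in bits: the mutual information between
    states and actions induced by a stochastic policy P(a|s)."""
    p_state = np.asarray(p_state, float)
    policy = np.asarray(policy, float)
    marginal = p_state @ policy                # P(a) = sum_s P(s) P(a|s)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = p_state[:, None] * policy * np.log2(policy / marginal)
    return float(np.nansum(np.where(policy > 0, terms, 0.0)))

uniform_states = np.array([0.5, 0.5])
deterministic = np.array([[1.0, 0.0],          # a different action in each state:
                          [0.0, 1.0]])         # maximally state-dependent
state_blind = np.array([[0.5, 0.5],            # same action distribution everywhere:
                        [0.5, 0.5]])           # zero complexity
```

A fully state-dependent policy over two equiprobable states carries 1 bit, while a state-blind policy carries 0 bits, which is the sense in which a capacity limit penalizes state-dependent responding.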
2. Paunov A, L’Hôtellier M, Guo D, He Z, Yu A, Meyniel F. Multiple and subject-specific roles of uncertainty in reward-guided decision-making. bioRxiv [Preprint] 2024:2024.03.27.587016. PMID: 38585958; PMCID: PMC10996615; DOI: 10.1101/2024.03.27.587016.
Abstract
Decision-making in noisy, changing, and partially observable environments entails a basic tradeoff between immediate reward and longer-term information gain, known as the exploration-exploitation dilemma. Computationally, an effective way to balance this tradeoff is to leverage uncertainty to guide exploration. Yet empirical findings in humans are mixed, ranging from uncertainty-seeking to indifference and avoidance. In a novel bandit task that better captures uncertainty-driven behavior, we find multiple roles for uncertainty in human choices. First, stable and psychologically meaningful individual differences in uncertainty preferences range from seeking to avoidance, which can manifest as null group-level effects. Second, uncertainty modulates the use of two basic decision heuristics that imperfectly exploit immediate rewards, a repetition bias and a win-stay-lose-shift rule, with heuristic choices favored under higher uncertainty. These results, highlighting the rich and varied structure of reward-based choice, are a step toward understanding its functional basis and its dysfunction in psychopathology.
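The win-stay-lose-shift heuristic named in the abstract has a standard minimal form; the sketch below is a generic textbook version, not the authors' implementation:

```python
import random

def wsls_choice(prev_choice, prev_reward, n_arms=2, rng=random):
    """Win-stay-lose-shift: repeat a rewarded choice, otherwise switch arms."""
    if prev_choice is None:                       # first trial: choose at random
        return rng.randrange(n_arms)
    if prev_reward:
        return prev_choice                        # win -> stay
    # lose -> shift to a different arm
    return rng.choice([a for a in range(n_arms) if a != prev_choice])
```

A repetition bias, the other heuristic discussed, would instead favor `prev_choice` regardless of the outcome; mixing such rules with a value-based policy is one way to model the uncertainty-dependent heuristic use reported here.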
Affiliation(s)
- Alexander Paunov
- INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Center, CEA Paris-Saclay, Gif-sur-Yvette, France; Université de Paris, Paris, France
- Institut de Neuromodulation, GHU Paris, Psychiatrie et Neurosciences, Centre Hospitalier Sainte-Anne, Pôle Hospitalo-Universitaire 15, Université Paris Cité, Paris, France
- Maëva L’Hôtellier
- INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Center, CEA Paris-Saclay, Gif-sur-Yvette, France; Université de Paris, Paris, France
- Dalin Guo
- Department of Cognitive Science, University of California San Diego, San Diego, CA, USA
- Zoe He
- Department of Cognitive Science, University of California San Diego, San Diego, CA, USA
- Angela Yu
- Department of Cognitive Science, University of California San Diego, San Diego, CA, USA
- Centre for Cognitive Science & Hessian AI Center, Technical University of Darmstadt, Darmstadt, Germany
- Florent Meyniel
- INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Center, CEA Paris-Saclay, Gif-sur-Yvette, France; Université de Paris, Paris, France
- Institut de Neuromodulation, GHU Paris, Psychiatrie et Neurosciences, Centre Hospitalier Sainte-Anne, Pôle Hospitalo-Universitaire 15, Université Paris Cité, Paris, France
3. Fang Z, Zhao M, Xu T, Li Y, Xie H, Quan P, Geng H, Zhang RY. Individuals with anxiety and depression use atypical decision strategies in an uncertain world. eLife 2024; 13:RP93887. PMID: 39255007; PMCID: PMC11386953; DOI: 10.7554/elife.93887.
Abstract
Previous studies on reinforcement learning have identified three prominent phenomena: (1) individuals with anxiety or depression exhibit a reduced learning rate compared to healthy subjects; (2) learning rates may increase or decrease in environments with rapidly changing (i.e. volatile) or stable feedback conditions, a phenomenon termed learning rate adaptation; and (3) reduced learning rate adaptation is associated with several psychiatric disorders. In other words, multiple learning rate parameters are needed to account for behavioral differences across participant populations and volatility contexts in this flexible learning rate (FLR) model. Here, we propose an alternative explanation, suggesting that behavioral variation across participant populations and volatile contexts arises from the use of mixed decision strategies. To test this hypothesis, we constructed a mixture-of-strategies (MOS) model and used it to analyze the behaviors of 54 healthy controls and 32 patients with anxiety and depression in volatile reversal learning tasks. Compared to the FLR model, the MOS model can reproduce the three classic phenomena by using a single set of strategy preference parameters without introducing any learning rate differences. In addition, the MOS model can successfully account for several novel behavioral patterns that cannot be explained by the FLR model. Preferences for different strategies also predict individual variations in symptom severity. These findings underscore the importance of considering mixed strategy use in human learning and decision-making and suggest atypical strategy preference as a potential mechanism for learning deficits in psychiatric disorders.
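The mixture-of-strategies idea can be illustrated with a toy blend of strategy-specific choice distributions; the strategy set and weights below are hypothetical stand-ins, not the authors' MOS parameterization:

```python
import numpy as np

def mos_choice_probs(p_reward, magnitude, prev_choice, w):
    """Blend strategy-specific choice distributions with preference weights.

    w = (w_eu, w_mag, w_habit): preferences for an expected-utility strategy,
    a magnitude-only strategy, and a habitual repeat-last-choice strategy;
    the weights should sum to 1.
    """
    eu = p_reward * magnitude                     # expected utility per option
    p_eu = eu / eu.sum()                          # expected-utility strategy
    p_mag = magnitude / magnitude.sum()           # magnitude-only strategy
    p_hab = np.zeros_like(p_eu)                   # habit strategy:
    p_hab[prev_choice] = 1.0                      # repeat the previous choice
    return w[0] * p_eu + w[1] * p_mag + w[2] * p_hab
```

Shifts in the weights `w` across individuals or volatility contexts can then mimic apparent learning-rate differences without any learning-rate parameter changing, which is the alternative explanation the abstract proposes.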
Affiliation(s)
- Zeming Fang
- Shanghai Mental Health Center, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- School of Psychology, Shanghai Jiao Tong University, Shanghai, China
- Meihua Zhao
- School of Psychology, South China Normal University, Guangzhou, China
- Ting Xu
- The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China
- Yuhang Li
- Centre for Cognitive and Brain Sciences, Institute of Collaborative Innovation, University of Macau, Macau, China
- Hanbo Xie
- Department of Psychology, University of Arizona, Tucson, United States
- Peng Quan
- School of Humanities and Management, Guangdong Medical University, Dongguan, China
- Haiyang Geng
- Tianqiao and Chrissy Chen Institute for Translational Research, Shanghai, China
- Ru-Yuan Zhang
- Shanghai Mental Health Center, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- School of Psychology, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
4. Arumugam D, Ho MK, Goodman ND, Van Roy B. Bayesian Reinforcement Learning With Limited Cognitive Load. Open Mind (Camb) 2024; 8:395-438. PMID: 38665544; PMCID: PMC11045037; DOI: 10.1162/opmi_a_00132.
Abstract
All biological and artificial agents must act given limits on their ability to acquire and process information. As such, a general theory of adaptive behavior should be able to account for the complex interactions between an agent's learning history, decisions, and capacity constraints. Recent work in computer science has begun to clarify the principles that shape these dynamics by bridging ideas from reinforcement learning, Bayesian decision-making, and rate-distortion theory. This body of work provides an account of capacity-limited Bayesian reinforcement learning, a unifying normative framework for modeling the effect of processing constraints on learning and action selection. Here, we provide an accessible review of recent algorithms and theoretical results in this setting, paying special attention to how these ideas can be applied to studying questions in the cognitive and behavioral sciences.
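A core computational object in the rate-distortion strand of this literature is the capacity-limited policy, typically computed with Blahut-Arimoto-style alternating updates. A minimal sketch under assumed inputs (the reward matrix, state distribution, and β are illustrative, and this is one generic formulation rather than the review's algorithms):

```python
import numpy as np

def blahut_arimoto_policy(p_s, R, beta, n_iter=200):
    """Alternating updates for a reward/complexity tradeoff.

    Iterates pi(a|s) proportional to P(a) * exp(beta * R[s, a]) with
    P(a) = sum_s p_s[s] * pi(a|s), converging toward the policy that
    maximizes E[R] - (1/beta) * I(S;A).
    """
    n_s, n_a = R.shape
    p_a = np.full(n_a, 1.0 / n_a)                    # initial action marginal
    for _ in range(n_iter):
        logits = beta * R + np.log(p_a)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        policy = np.exp(logits)
        policy /= policy.sum(axis=1, keepdims=True)
        p_a = p_s @ policy                           # refit the marginal
    return policy

p_s = np.full(2, 0.5)
R = np.eye(2)                    # each state rewards a distinct action
greedy = blahut_arimoto_policy(p_s, R, beta=10.0)   # high capacity
soft = blahut_arimoto_policy(p_s, R, beta=0.01)     # severe capacity limit
```

Large β recovers a nearly deterministic, reward-maximizing policy; small β collapses toward a state-independent (low-complexity) policy.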
Affiliation(s)
- Mark K. Ho
- Center for Data Science, New York University
- Noah D. Goodman
- Department of Computer Science, Stanford University
- Department of Psychology, Stanford University
- Benjamin Van Roy
- Department of Electrical Engineering, Stanford University
- Department of Management Science & Engineering, Stanford University
5. Lai L, Gershman SJ. Human decision making balances reward maximization and policy compression. PLoS Comput Biol 2024; 20:e1012057. PMID: 38669280; PMCID: PMC11078408; DOI: 10.1371/journal.pcbi.1012057.
Abstract
Policy compression is a computational framework that describes how capacity-limited agents trade reward for simpler action policies to reduce cognitive cost. In this study, we present behavioral evidence that humans prefer simpler policies, as predicted by a capacity-limited reinforcement learning model. Across a set of tasks, we find that people exploit structure in the relationships between states, actions, and rewards to "compress" their policies. In particular, compressed policies are systematically biased towards actions with high marginal probability, thereby discarding some state information. This bias is greater when there is redundancy in the reward-maximizing action policy across states, and increases with memory load. These results could not be explained qualitatively or quantitatively by models that did not make use of policy compression under a capacity limit. We also confirmed the prediction that time pressure should further reduce policy complexity and increase action bias, based on the hypothesis that actions are selected via time-dependent decoding of a compressed code. These findings contribute to a deeper understanding of how humans adapt their decision-making strategies under cognitive resource constraints.
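The bias toward actions with high marginal probability follows from compressed policies of the form π(a|s) ∝ P(a)·exp(βQ(s,a)). A toy illustration (the Q-values, marginal, and β are mine, not from the paper) of how redundancy in the optimal policy pulls choices toward the globally common action:

```python
import numpy as np

def compressed_policy(Q, p_a, beta):
    """Compressed-policy rule: pi(a|s) proportional to P(a) * exp(beta * Q)."""
    logits = beta * Q + np.log(p_a)
    z = np.exp(logits - logits.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

# Three states; states 0 and 1 both reward action 0, state 2 rewards action 1,
# so the reward-maximizing policy is redundant across states 0 and 1.
Q = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
p_a = np.array([2 / 3, 1 / 3])       # action marginal under the optimal policy
pi = compressed_policy(Q, p_a, beta=1.0)
```

Under compression, the rare context (state 2) chooses its correct action less reliably than the common contexts choose theirs: the policy discards some state information in favor of the high-marginal-probability action.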
Affiliation(s)
- Lucy Lai
- Program in Neuroscience, Harvard University, Cambridge, Massachusetts, United States of America
- Theoretical Sciences Visiting Program, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, Japan
- Samuel J. Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts, United States of America
6. Webb J, Steffan P, Hayden BY, Lee D, Kemere C, McGinley M. Foraging Under Uncertainty Follows the Marginal Value Theorem with Bayesian Updating of Environment Representations. bioRxiv [Preprint] 2024:2024.03.30.587253. PMID: 38585964; PMCID: PMC10996644; DOI: 10.1101/2024.03.30.587253.
Abstract
Foraging theory has been a remarkably successful approach to understanding the behavior of animals in many contexts. In patch-based foraging contexts, the marginal value theorem (MVT) shows that the optimal strategy is to leave a patch when the marginal rate of return declines to the average for the environment. However, the MVT is only valid in deterministic environments whose statistics are known to the forager; naturalistic environments seldom meet these strict requirements. As a result, the strategies used by foragers in naturalistic environments must be empirically investigated. We developed a novel behavioral task and a corresponding computational framework for studying patch-leaving decisions in head-fixed and freely moving mice. We varied between-patch travel time, as well as within-patch reward depletion rate, both deterministically and stochastically. We found that mice adopt patch residence times in a manner consistent with the MVT and not explainable by simple ethologically motivated heuristic strategies. Critically, behavior was best accounted for by a modified form of the MVT wherein environment representations were updated based on local variations in reward timing, captured by a Bayesian estimator and dynamic prior. Thus, we show that mice can strategically attend to, learn from, and exploit task structure on multiple timescales simultaneously, thereby efficiently foraging in volatile environments. The results provide a foundation for applying the systems neuroscience toolkit in freely moving and head-fixed mice to understand the neural basis of foraging under uncertainty.
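The MVT leave rule, depart when the marginal intake rate falls to the environment's average rate, can be checked numerically for a toy depleting patch (the exponential gain function and all parameter values here are hypothetical, not the task's):

```python
import numpy as np

A, tau, travel = 10.0, 2.0, 4.0          # patch asymptote, depletion rate, travel time

def gain(t):
    """Cumulative reward after residence time t: diminishing returns."""
    return A * (1 - np.exp(-t / tau))

t = np.linspace(0.01, 20, 20000)
rate = gain(t) / (t + travel)            # long-run reward rate for residence t
t_opt = t[np.argmax(rate)]               # optimal patch residence time

# MVT check: the marginal (instantaneous) rate at t_opt should equal the
# maximized long-run average rate.
marginal = (gain(t_opt + 1e-4) - gain(t_opt - 1e-4)) / 2e-4
```

Increasing `travel` raises `t_opt`, the classic MVT prediction that longer travel times produce longer patch residences.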
Affiliation(s)
- James Webb
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, USA
- Paul Steffan
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Benjamin Y. Hayden
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Daeyeol Lee
- The Zanvyl Krieger Mind/Brain Institute, The Solomon H Snyder Department of Neuroscience, Department of Psychological and Brain Sciences, Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Caleb Kemere
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
- Matthew McGinley
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, USA
- Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
7. Colas JT, O’Doherty JP, Grafton ST. Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts. PLoS Comput Biol 2024; 20:e1011950. PMID: 38552190; PMCID: PMC10980507; DOI: 10.1371/journal.pcbi.1011950.
Abstract
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
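A common way to capture action bias and hysteresis alongside learned values is a logistic choice model with an intercept (bias) and choice-history regressors (hysteresis). The sketch below is one generic formulation with made-up coefficients, not the authors' model:

```python
import numpy as np

def p_choose_right(q_diff, prev_choices, beta=3.0, bias=0.5, kappa=(0.8, 0.4)):
    """P(choose right) from values plus action bias and hysteresis.

    q_diff: Q(right) - Q(left) from a reinforcement-learning module.
    prev_choices: recent choices coded +1 (right) / -1 (left), most recent first.
    bias: static preference for 'right'.
    kappa: per-lag hysteresis weights; positive values produce repetition,
           negative values produce alternation.
    """
    drive = beta * q_diff + bias
    drive += sum(k * c for k, c in zip(kappa, prev_choices))
    return 1.0 / (1.0 + np.exp(-drive))
```

Fitting `bias` and `kappa` per participant is one way to expose the individual differences in repetition versus alternation tendencies described here, with negative `kappa` giving the alternation pattern the authors report as more common.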
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- John P. O’Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, United States of America
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, United States of America
- Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, United States of America
8. Valentin S, Kleinegesse S, Bramley NR, Seriès P, Gutmann MU, Lucas CG. Designing optimal behavioral experiments using machine learning. eLife 2024; 13:e86224. PMID: 38261382; PMCID: PMC10805374; DOI: 10.7554/elife.86224.
Abstract
Computational models are powerful tools for understanding human cognition and behavior. They let us express our theories clearly and precisely and offer predictions that can be subtle and often counter-intuitive. However, this same richness and capacity to surprise means that our scientific intuitions and traditional tools are ill-suited to designing experiments that test and compare these models. To avoid these pitfalls and realize the full potential of computational modeling, we require tools to design experiments that provide clear answers about which models explain human behavior and the auxiliary assumptions those models must make. Bayesian optimal experimental design (BOED) formalizes the search for optimal experimental designs by identifying experiments that are expected to yield informative data. In this work, we provide a tutorial on leveraging recent advances in BOED and machine learning to find optimal experiments for any model from which we can simulate data, and show how by-products of this procedure allow for quick and straightforward evaluation of models and their parameters against real experimental data. As a case study, we consider theories of how people balance exploration and exploitation in multi-armed bandit decision-making tasks. We validate the presented approach using simulations and a real-world experiment. Compared to experimental designs commonly used in the literature, our optimal designs more efficiently determine which of a set of models best accounts for individual human behavior, and more efficiently characterize behavior given a preferred model. At the same time, formalizing a scientific question such that it can be adequately addressed with BOED can be challenging, and we discuss several potential caveats and pitfalls that practitioners should be aware of. We provide code to replicate all analyses, as well as tutorial notebooks and pointers for adapting the methodology to different experimental settings.
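At the heart of BOED is scoring each candidate design by its expected information gain: the mutual information between the model indicator and the data the design would produce. For a single binary outcome and a uniform prior over models this can be computed exactly; the two "models" and designs below are hypothetical, and the function name is mine:

```python
import numpy as np

def eig(design_p_y):
    """Expected information gain (bits) about the model from one binary
    observation, assuming a uniform prior over models.

    design_p_y: per-model probability of observing y = 1 under this design.
    """
    p_y1 = np.asarray(design_p_y, float)
    prior = np.full(len(p_y1), 1.0 / len(p_y1))
    mi = 0.0
    for y_prob in (p_y1, 1 - p_y1):              # outcomes y = 1 and y = 0
        marg = prior @ y_prob                    # marginal probability of y
        with np.errstate(divide="ignore", invalid="ignore"):
            term = prior * y_prob * np.log2(y_prob / marg)
        mi += np.nansum(term)
    return mi

design_A = [0.9, 0.1]    # the two models predict very different outcomes
design_B = [0.5, 0.5]    # the two models predict identical outcomes
```

An optimizer would then search the design space for the maximizer of this score; in practice the mutual information is estimated with simulation-based (e.g., neural) bounds rather than computed exactly, which is where the machine-learning advances in this tutorial come in.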
Affiliation(s)
- Simon Valentin
- School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Neil R Bramley
- Department of Psychology, University of Edinburgh, Edinburgh, United Kingdom
- Peggy Seriès
- School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Michael U Gutmann
- School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
9. Futrell R. An Information-Theoretic Account of Availability Effects in Language Production. Top Cogn Sci 2024; 16:38-53. PMID: 38145974; DOI: 10.1111/tops.12716.
Abstract
I present a computational-level model of language production in terms of a combination of information theory and control theory in which words are chosen incrementally in order to maximize communicative value subject to an information-theoretic capacity constraint. The theory generally predicts a tradeoff between ease of production and communicative accuracy. I apply the theory to two cases of apparent availability effects in language production, in which words are selected on the basis of their accessibility to a speaker who has not yet perfectly planned the rest of the utterance. Using corpus data on English relative clause complementizer dropping and experimental data on Mandarin noun classifier choice, I show that the theory reproduces the observed phenomena, providing an alternative account to Uniform Information Density and a promising general model of language production which is tightly linked to emerging theories in computational neuroscience.
Affiliation(s)
- Richard Futrell
- Department of Language Science, University of California, Irvine
10. Futrell R. Information-theoretic principles in incremental language production. Proc Natl Acad Sci U S A 2023; 120:e2220593120. PMID: 37725652; PMCID: PMC10523564; DOI: 10.1073/pnas.2220593120.
Abstract
I apply a recently emerging perspective on the complexity of action selection, the rate-distortion theory of control, to provide a computational-level model of errors and difficulties in human language production, which is grounded in information theory and control theory. Language production is cast as the sequential selection of actions to achieve a communicative goal subject to a capacity constraint on cognitive control. In a series of calculations, simulations, corpus analyses, and comparisons to experimental data, I show that the model directly predicts some of the major known qualitative and quantitative phenomena in language production, including semantic interference and predictability effects in word choice; accessibility-based ("easy-first") production preferences in word order alternations; and the existence and distribution of disfluencies including filled pauses, corrections, and false starts. I connect the rate-distortion view to existing models of human language production, to probabilistic models of semantics and pragmatics, and to proposals for controlled language generation in the machine learning and reinforcement learning literature.
Affiliation(s)
- Richard Futrell
- Department of Language Science, University of California, Irvine, CA 92617
11. Gong T, Gerstenberg T, Mayrhofer R, Bramley NR. Active causal structure learning in continuous time. Cogn Psychol 2023; 140:101542. PMID: 36586246; DOI: 10.1016/j.cogpsych.2022.101542.
Abstract
Research on causal cognition has largely focused on learning and reasoning about contingency data aggregated across discrete observations or experiments. However, this setting represents only the tip of the causal cognition iceberg. A more general problem lurking beneath is that of learning the latent causal structure that connects events and actions as they unfold in continuous time. In this paper, we examine how people actively learn about causal structure in a continuous-time setting, focusing on when and where they intervene and how this shapes their learning. Across two experiments, we find that participants' accuracy depends on both the informativeness and evidential complexity of the data they generate. Moreover, participants' intervention choices strike a balance between maximizing expected information and minimizing inferential complexity. People time and target their interventions to create simple yet informative causal dynamics. We discuss how the continuous-time setting challenges existing computational accounts of active causal learning, and argue that metacognitive awareness of one's inferential limitations plays a critical role for successful learning in the wild.
Affiliation(s)
- Tianwei Gong
- Department of Psychology, University of Edinburgh, United Kingdom
- Ralf Mayrhofer
- Department of Psychology, University of Göttingen, Germany
- Neil R Bramley
- Department of Psychology, University of Edinburgh, United Kingdom
12. Bari BA, Gershman SJ. Undermatching Is a Consequence of Policy Compression. J Neurosci 2023; 43:447-457. PMID: 36639891; PMCID: PMC9864556; DOI: 10.1523/jneurosci.1003-22.2022.
Abstract
The matching law describes the tendency of agents to match the ratio of choices allocated to the ratio of rewards received when choosing among multiple options (Herrnstein, 1961). Perfect matching, however, is infrequently observed. Instead, agents tend to undermatch or bias choices toward the poorer option. Overmatching, or the tendency to bias choices toward the richer option, is rarely observed. Despite the ubiquity of undermatching, it has received an inadequate normative justification. Here, we assume agents not only seek to maximize reward, but also seek to minimize cognitive cost, which we formalize as policy complexity (the mutual information between actions and states of the environment). Policy complexity measures the extent to which the policy of an agent is state dependent. Our theory states that capacity-constrained agents (i.e., agents that must compress their policies to reduce complexity) can only undermatch or perfectly match, but not overmatch, consistent with the empirical evidence. Moreover, using mouse behavioral data (male), we validate a novel prediction about which task conditions exaggerate undermatching. Finally, in patients with Parkinson's disease (male and female), we argue that a reduction in undermatching with higher dopamine levels is consistent with an increased policy complexity.

SIGNIFICANCE STATEMENT: The matching law describes the tendency of agents to match the ratio of choices allocated to different options to the ratio of reward received. For example, if option a yields twice as much reward as option b, matching states that agents will choose option a twice as much. However, agents typically undermatch: they choose the poorer option more frequently than expected. Here, we assume that agents seek to simultaneously maximize reward and minimize the complexity of their action policies. We show that this theory explains when and why undermatching occurs. Neurally, we show that policy complexity, and by extension undermatching, is controlled by tonic dopamine, consistent with other evidence that dopamine plays an important role in cognitive resource allocation.
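Undermatching is conventionally quantified with the generalized matching law, log(c1/c2) = s·log(r1/r2) + log b, where a sensitivity exponent s below 1 indicates undermatching and s = 1 is perfect matching. A minimal numeric illustration (the exponent value is arbitrary, chosen only to show the direction of the effect):

```python
def choice_ratio(reward_ratio, sensitivity):
    """Generalized matching law with unit bias: c1/c2 = (r1/r2) ** s.

    sensitivity s = 1 is perfect matching; 0 < s < 1 is undermatching
    (choices biased toward the poorer option relative to matching).
    """
    return reward_ratio ** sensitivity

rr = 4.0                               # option 1 earns 4x the reward of option 2
perfect = choice_ratio(rr, 1.0)        # choice ratio matches the 4:1 reward ratio
under = choice_ratio(rr, 0.6)          # compressed choice ratio, closer to 1:1
```

The paper's claim can be read in these terms: a capacity limit on policy complexity bounds s at or below 1, so compressed policies can undermatch or match but never overmatch (s > 1).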
Affiliation(s)
- Bilal A Bari
- Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts 02114
- Samuel J Gershman
- Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138
- Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
13. Dubois M, Bowler A, Moses-Payne ME, Habicht J, Moran R, Steinbeis N, Hauser TU. Exploration heuristics decrease during youth. Cogn Affect Behav Neurosci 2022; 22:969-983. PMID: 35589910; PMCID: PMC9458685; DOI: 10.3758/s13415-022-01009-9.
Abstract
Deciding between exploring new avenues and exploiting known choices is central to learning, and this exploration-exploitation trade-off changes during development. Exploration is not a unitary concept, and humans deploy multiple distinct mechanisms, but little is known about their specific emergence during development. Using a previously validated task in adults, changes in exploration mechanisms were investigated between childhood (8-9 y/o, N = 26; 16 females), early (12-13 y/o, N = 38; 21 females), and late adolescence (16-17 y/o, N = 33; 19 females) in ethnically and socially diverse schools from disadvantaged areas. We find an increased usage of a computationally light exploration heuristic in younger groups, effectively accommodating their limited neurocognitive resources. Moreover, this heuristic was associated with self-reported, attention-deficit/hyperactivity disorder symptoms in this population-based sample. This study enriches our mechanistic understanding about how exploration strategies mature during development.
Affiliation(s)
- Magda Dubois
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, WC1B 5EH, London, UK
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG, London, UK
- Aislinn Bowler
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, WC1B 5EH, London, UK
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG, London, UK
- Centre for Brain and Cognitive Development, Birkbeck, University of London, WC1E 7HX, London, UK
- Madeleine E Moses-Payne
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, WC1B 5EH, London, UK
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG, London, UK
- UCL Institute of Cognitive Neuroscience, WC1N 3AZ, London, UK
- Johanna Habicht
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, WC1B 5EH, London, UK
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG, London, UK
- Rani Moran
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, WC1B 5EH, London, UK
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG, London, UK
- Nikolaus Steinbeis
- Division of Psychology and Language Sciences, University College London, WC1H 0AP, London, UK
- Tobias U Hauser
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, WC1B 5EH, London, UK
- Wellcome Centre for Human Neuroimaging, University College London, WC1N 3BG, London, UK
14
|
Beron CC, Neufeld SQ, Linderman SW, Sabatini BL. Mice exhibit stochastic and efficient action switching during probabilistic decision making. Proc Natl Acad Sci U S A 2022; 119:e2113961119. [PMID: 35385355] [PMCID: PMC9169659] [DOI: 10.1073/pnas.2113961119] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/28/2021] [Accepted: 03/03/2022] [Indexed: 12/05/2022]
Abstract
In probabilistic and nonstationary environments, individuals must use internal and external cues to flexibly make decisions that lead to desirable outcomes. To gain insight into the process by which animals choose between actions, we trained mice in a task with time-varying reward probabilities. In our implementation of such a two-armed bandit task, thirsty mice use information about recent action and action–outcome histories to choose between two ports that deliver water probabilistically. Here we comprehensively modeled choice behavior in this task, including the trial-to-trial changes in port selection, i.e., action switching behavior. We find that mouse behavior is, at times, deterministic and, at others, apparently stochastic. The behavior deviates from that of a theoretically optimal agent performing Bayesian inference in a hidden Markov model (HMM). We formulate a set of models based on logistic regression, reinforcement learning, and sticky Bayesian inference that we demonstrate are mathematically equivalent and that accurately describe mouse behavior. The switching behavior of mice in the task is captured in each model by a stochastic action policy, a history-dependent representation of action value, and a tendency to repeat actions despite incoming evidence. The models parsimoniously capture behavior across different environmental conditions by varying the stickiness parameter, and like the mice, they achieve nearly maximal reward rates. These results indicate that mouse behavior reaches near-maximal performance with reduced action switching and can be described by a set of equivalent models with a small number of relatively fixed parameters.
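The sticky Bayesian account the abstract describes can be illustrated with a minimal sketch: a Bayesian observer tracks which port is currently the high-reward port under a hidden Markov world model, and a stickiness bonus biases choice toward repetition. The reward probabilities, hazard rate, and stickiness value below are illustrative assumptions, not the authors' fitted parameters.

```python
# Sketch of a Bayesian hidden-Markov observer for a two-armed bandit with a
# "sticky" choice rule. Parameter values (p_high, p_low, hazard, stickiness)
# are made up for illustration.

def update_belief(belief, choice, reward, p_high=0.8, p_low=0.2, hazard=0.05):
    """belief = P(left port is currently the high-reward port).
    First apply the hidden-state transition (hazard of a block switch),
    then condition on the observed choice/reward outcome via Bayes' rule."""
    prior = belief * (1 - hazard) + (1 - belief) * hazard
    # Likelihood of the outcome under each hidden state.
    if choice == "left":
        lik_left_high = p_high if reward else 1 - p_high
        lik_right_high = p_low if reward else 1 - p_low
    else:
        lik_left_high = p_low if reward else 1 - p_low
        lik_right_high = p_high if reward else 1 - p_high
    post = prior * lik_left_high
    norm = post + (1 - prior) * lik_right_high
    return post / norm

def choose(belief, prev_choice, stickiness=0.1):
    """Pick the port with the higher posterior value, biased toward repeating
    the previous choice by a fixed stickiness bonus."""
    value_left = belief + (stickiness if prev_choice == "left" else 0.0)
    value_right = (1 - belief) + (stickiness if prev_choice == "right" else 0.0)
    return "left" if value_left >= value_right else "right"
```

Repeated rewarded left choices push the belief toward the left port, while the stickiness bonus delays switching until the evidence against the current port is strong, which is the qualitative signature the paper's equivalent models share.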
Affiliation(s)
- Celia C. Beron
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115
- HHMI, Harvard Medical School, Boston, MA 02115
- Shay Q. Neufeld
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115
- HHMI, Harvard Medical School, Boston, MA 02115
- Scott W. Linderman
- Department of Statistics, Stanford University, Stanford, CA 94305
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
- Bernardo L. Sabatini
- Department of Neurobiology, Harvard Medical School, Boston, MA 02115
- HHMI, Harvard Medical School, Boston, MA 02115
|
15
|
Time pressure changes how people explore and respond to uncertainty. Sci Rep 2022; 12:4122. [PMID: 35260717] [PMCID: PMC8904509] [DOI: 10.1038/s41598-022-07901-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/18/2021] [Accepted: 02/28/2022] [Indexed: 12/25/2022]
Abstract
How does time pressure influence exploration and decision-making? We investigated this question with several four-armed bandit tasks manipulating (within subjects) expected reward, uncertainty, and time pressure (limited vs. unlimited). With limited time, people have less opportunity to perform costly computations, thus shifting the cost-benefit balance of different exploration strategies. Through behavioral, reinforcement learning (RL), reaction time (RT), and evidence accumulation analyses, we show that time pressure changes how people explore and respond to uncertainty. Specifically, participants reduced their uncertainty-directed exploration under time pressure, were less value-directed, and repeated choices more often. Our analyses also relate uncertainty to slower responses and dampened evidence accumulation (i.e., lower drift rates), demonstrating a resource-rational shift towards simpler, lower-cost strategies under time pressure. These results shed light on how people adapt their exploration and decision-making strategies to externally imposed cognitive constraints.
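Uncertainty-directed exploration of the kind manipulated here is commonly modeled as an upper-confidence-bound (UCB) rule, where each option's value receives an uncertainty bonus. A minimal sketch, in which shrinking the bonus weight stands in for the effect of time pressure; the arm values and weights are invented for illustration:

```python
# UCB-style choice: pick the arm maximizing mean value plus a weighted
# uncertainty bonus. Reducing uncertainty_weight models the shift away from
# uncertainty-directed exploration under time pressure.

def ucb_choice(means, stds, uncertainty_weight):
    """Return the index of the arm with the highest score
    mean + uncertainty_weight * std."""
    scores = [m + uncertainty_weight * s for m, s in zip(means, stds)]
    return max(range(len(scores)), key=lambda i: scores[i])
```

With a high bonus weight the uncertain arm wins; with the weight driven toward zero (as under time pressure) choice collapses onto the higher-mean arm.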
|
16
|
Ashwood ZC, Roy NA, Stone IR, Urai AE, Churchland AK, Pouget A, Pillow JW. Mice alternate between discrete strategies during perceptual decision-making. Nat Neurosci 2022; 25:201-212. [PMID: 35132235] [PMCID: PMC8890994] [DOI: 10.1038/s41593-021-01007-z] [Citation(s) in RCA: 59] [Impact Index Per Article: 29.5] [Received: 05/10/2021] [Accepted: 12/17/2021] [Indexed: 12/21/2022]
Abstract
Classical models of perceptual decision-making assume that subjects use a single, consistent strategy to form decisions, or that decision-making strategies evolve slowly over time. Here we present new analyses suggesting that this common view is incorrect. We analyzed data from mouse and human decision-making experiments and found that choice behavior relies on an interplay among multiple interleaved strategies. These strategies, characterized by states in a hidden Markov model, persist for tens to hundreds of trials before switching, and often switch multiple times within a session. The identified decision-making strategies were highly consistent across mice and comprised a single 'engaged' state, in which decisions relied heavily on the sensory stimulus, and several biased states in which errors frequently occurred. These results provide a powerful alternate explanation for 'lapses' often observed in rodent behavioral experiments, and suggest that standard measures of performance mask the presence of major changes in strategy across trials.
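The strategy-switching analysis rests on a hidden Markov model whose states emit choices through state-specific observation models (a GLM-HMM). A minimal sketch of the filtering (forward) pass with a toy two-state parameterization, an "engaged" state whose choices track the stimulus and a chance-level "lapse" state; the transition matrix and weights are illustrative, not the paper's fitted values:

```python
import numpy as np

# Forward (filtering) pass of an HMM over decision-making "strategy" states.
# Each state k emits choices via a logistic model: P(choice=1) = sigmoid(w_k * stim).

def forward_posterior(choices, stimuli, trans, weights):
    """Return filtered state probabilities P(state_t | data up to trial t).
    trans[i, j] = P(state j at t+1 | state i at t); weights[k] is state k's
    stimulus weight in its logistic choice model."""
    n_states = trans.shape[0]
    alpha = np.full(n_states, 1.0 / n_states)  # uniform initial state belief
    out = []
    for c, s in zip(choices, stimuli):
        p_right = 1.0 / (1.0 + np.exp(-weights * s))   # per-state emission prob
        lik = np.where(c == 1, p_right, 1.0 - p_right)
        alpha = lik * (trans.T @ alpha)                # predict, then update
        alpha /= alpha.sum()
        out.append(alpha.copy())
    return np.array(out)
```

When the observed choices closely track the stimulus, the filtered posterior concentrates on the engaged state; chance-level stretches pull it toward the lapse state, mirroring the within-session switches the abstract describes.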
Affiliation(s)
- Zoe C Ashwood
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Iris R Stone
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Anne E Urai
- Cognitive Psychology Unit, Leiden University, Leiden, Netherlands
- Anne K Churchland
- David Geffen School of Medicine, The University of California, Los Angeles, Los Angeles, CA, USA
- Alexandre Pouget
- Faculty of Medicine & Department of Basic Neurosciences, University of Geneva, Geneva, Switzerland
- Jonathan W Pillow
- Princeton Neuroscience Institute, Princeton, NJ, USA
- Department of Psychology, Princeton University, Princeton, NJ, USA
|
18
|
Bongiorno C, Zhou Y, Kryven M, Theurel D, Rizzo A, Santi P, Tenenbaum J, Ratti C. Vector-based pedestrian navigation in cities. Nat Comput Sci 2021; 1:678-685. [PMID: 38217198] [DOI: 10.1038/s43588-021-00130-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/17/2021] [Accepted: 08/12/2021] [Indexed: 01/15/2024]
Abstract
How do pedestrians choose their paths within city street networks? Researchers have tried to shed light on this matter through strictly controlled experiments, but an ultimate answer based on real-world mobility data is still lacking. Here, we analyze salient features of human path planning through a statistical analysis of a massive dataset of GPS traces, which reveals that (1) people increasingly deviate from the shortest path when the distance between origin and destination increases and (2) chosen paths are statistically different when origin and destination are swapped. We posit that direction to goal is a main driver of path planning and develop a vector-based navigation model; the resulting trajectories, which we have termed pointiest paths, are a statistically better predictor of human paths than a model based on minimizing distance with stochastic effects. Our findings generalize across two major US cities with different street networks, hinting that vector-based navigation might be a universal property of human path planning.
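The "pointiest path" idea can be sketched as a greedy rule: at each intersection, follow the outgoing edge whose bearing deviates least from the straight-line bearing to the destination. The coordinates in the example are invented for illustration and are not from the paper's street networks:

```python
import math

# Vector-based ("pointiest path") navigation sketch: choose the next street
# segment that points most directly toward the goal.

def bearing(a, b):
    """Angle of the vector from point a to point b, in radians."""
    return math.atan2(b[1] - a[1], b[0] - a[0])

def angular_deviation(a, b, goal):
    """Absolute angle between edge a->b and the direct line a->goal."""
    d = bearing(a, b) - bearing(a, goal)
    return abs(math.atan2(math.sin(d), math.cos(d)))  # wrap into [-pi, pi]

def pointiest_step(node, neighbors, goal):
    """Greedy step: the neighbor minimizing angular deviation toward the goal."""
    return min(neighbors, key=lambda nb: angular_deviation(node, nb, goal))
```

Unlike shortest-path search, this rule uses only local geometry plus the goal direction, which is what makes swapping origin and destination produce different trajectories.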
Affiliation(s)
- Christian Bongiorno
- Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
- Université Paris-Saclay, CentraleSupélec, Mathématiques et Informatique pour la Complexité et les Systèmes, Gif-sur-Yvette, France
- Yulun Zhou
- Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Urban Planning and Design, Faculty of Architecture, The University of Hong Kong, Pokfulam, Hong Kong, China
- Marta Kryven
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- David Theurel
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Alessandro Rizzo
- Dipartimento di Elettronica e Telecomunicazioni, Politecnico di Torino, Torino, Italy
- Office of Innovation, New York University Tandon School of Engineering, Six MetroTech Center, New York, NY, USA
- Paolo Santi
- Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
- Istituto di Informatica e Telematica del CNR, Pisa, Italy
- Joshua Tenenbaum
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Carlo Ratti
- Senseable City Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
|
19
|
Futrell R. An Information-Theoretic Account of Semantic Interference in Word Production. Front Psychol 2021; 12:672408. [PMID: 34135832] [PMCID: PMC8200775] [DOI: 10.3389/fpsyg.2021.672408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2021] [Accepted: 04/27/2021] [Indexed: 11/30/2022]
Abstract
I present a computational-level model of semantic interference effects in online word production within a rate-distortion framework. I consider a bounded-rational agent trying to produce words. The agent's action policy is determined by maximizing accuracy in production subject to computational constraints. These computational constraints are formalized using mutual information. I show that semantic similarity-based interference among words falls out naturally from this setup, and I present a series of simulations showing that the model captures some of the key empirical patterns observed in Stroop and Picture-Word Interference paradigms, including comparisons to human data from previous experiments.
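The mutual-information constraint can be made concrete with a policy-compression sketch in the rate-distortion style: a Blahut-Arimoto-type fixed point yielding pi(a|s) proportional to p(a)·exp(beta·Q(s,a)), where beta trades accuracy against policy complexity I(S;A). This is a generic sketch of the framework, not the paper's word-production model, and the Q-values below are illustrative:

```python
import numpy as np

# Capacity-limited policy sketch: alternate between the softmax policy
# pi(a|s) ∝ p(a) * exp(beta * Q(s,a)) and the implied action marginal p(a).

def compressed_policy(Q, p_states, beta, n_iters=50):
    """Blahut-Arimoto-style fixed point for the compressed policy.
    Q is an (n_states, n_actions) payoff matrix; beta is the inverse
    temperature controlling how much complexity the agent can afford."""
    n_s, n_a = Q.shape
    p_a = np.full(n_a, 1.0 / n_a)
    for _ in range(n_iters):
        logits = np.log(p_a) + beta * Q
        pi = np.exp(logits - logits.max(axis=1, keepdims=True))
        pi /= pi.sum(axis=1, keepdims=True)
        p_a = p_states @ pi  # update the marginal over actions
    return pi

def policy_complexity(pi, p_states):
    """Mutual information I(S;A) in bits between states and actions."""
    p_a = p_states @ pi
    ratio = np.where(pi > 0, pi / p_a, 1.0)
    return float(np.sum(p_states[:, None] * pi * np.log2(ratio)))
```

At beta = 0 the policy ignores the state entirely (zero complexity, maximal interference between similar states); as beta grows, the policy differentiates states at the cost of higher I(S;A), which is the trade-off the account uses to explain interference effects.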
Affiliation(s)
- Richard Futrell
- Department of Language Science, University of California, Irvine, Irvine, CA, United States
|
20
|
Ruel A, Devine S, Eppinger B. Resource-rational approach to meta-control problems across the lifespan. Wiley Interdiscip Rev Cogn Sci 2021; 12:e1556. [DOI: 10.1002/wcs.1556] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2020] [Revised: 12/09/2020] [Accepted: 01/19/2021] [Indexed: 01/26/2023]
Affiliation(s)
- Alexa Ruel
- Department of Psychology, Concordia University, Montreal, Quebec, Canada
- Sean Devine
- Department of Psychology, McGill University, Montreal, Quebec, Canada
- Ben Eppinger
- Department of Psychology, Concordia University, Montreal, Quebec, Canada
- Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
- PERFORM Center, Concordia University, Montreal, Quebec, Canada
|
21
|
Lai L, Gershman SJ. Policy compression: An information bottleneck in action selection. Psychology of Learning and Motivation 2021. [DOI: 10.1016/bs.plm.2021.02.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/27/2022]
|