1. Cecchini G, DePass M, Baspinar E, Andujar M, Ramawat S, Pani P, Ferraina S, Destexhe A, Moreno-Bote R, Cos I. Cognitive mechanisms of learning in sequential decision-making under uncertainty: an experimental and theoretical approach. Front Behav Neurosci 2024; 18:1399394. [PMID: 39188591 PMCID: PMC11346247 DOI: 10.3389/fnbeh.2024.1399394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 07/19/2024] [Indexed: 08/28/2024] Open
Abstract
Learning to make adaptive decisions involves making choices, assessing their consequence, and leveraging this assessment to attain higher rewarding states. Despite vast literature on value-based decision-making, relatively little is known about the cognitive processes underlying decisions in highly uncertain contexts. Real world decisions are rarely accompanied by immediate feedback, explicit rewards, or complete knowledge of the environment. Being able to make informed decisions in such contexts requires significant knowledge about the environment, which can only be gained via exploration. Here we aim at understanding and formalizing the brain mechanisms underlying these processes. To this end, we first designed and performed an experimental task. Human participants had to learn to maximize reward while making sequences of decisions with only basic knowledge of the environment, and in the absence of explicit performance cues. Participants had to rely on their own internal assessment of performance to reveal a covert relationship between their choices and their subsequent consequences to find a strategy leading to the highest cumulative reward. Our results show that the participants' reaction times were longer whenever the decision involved a future consequence, suggesting greater introspection whenever a delayed value had to be considered. The learning time varied significantly across participants. Second, we formalized the neurocognitive processes underlying decision-making within this task, combining mean-field representations of competing neural populations with a reinforcement learning mechanism. This model provided a plausible characterization of the brain dynamics underlying these processes, and reproduced each aspect of the participants' behavior, from their reaction times and choices to their learning rates. 
In summary, both the experimental results and the model provide a principled explanation to how delayed value may be computed and incorporated into the neural dynamics of decision-making, and to how learning occurs in these uncertain scenarios.
Affiliation(s)
- Gloria Cecchini
  - Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Barcelona, Spain
  - Center for Brain and Cognition, DTIC, Universitat Pompeu Fabra, Barcelona, Spain
- Michael DePass
  - Center for Brain and Cognition, DTIC, Universitat Pompeu Fabra, Barcelona, Spain
- Emre Baspinar
  - CNRS, Institute of Neuroscience (NeuroPSI), Paris-Saclay University, Saclay, France
- Marta Andujar
  - Department of Physiology and Pharmacology, Sapienza University of Rome, Rome, Italy
- Surabhi Ramawat
  - Department of Physiology and Pharmacology, Sapienza University of Rome, Rome, Italy
- Pierpaolo Pani
  - Department of Physiology and Pharmacology, Sapienza University of Rome, Rome, Italy
- Stefano Ferraina
  - Department of Physiology and Pharmacology, Sapienza University of Rome, Rome, Italy
- Alain Destexhe
  - CNRS, Institute of Neuroscience (NeuroPSI), Paris-Saclay University, Saclay, France
- Rubén Moreno-Bote
  - Center for Brain and Cognition, DTIC, Universitat Pompeu Fabra, Barcelona, Spain
  - Serra-Hunter Fellow Programme, Barcelona, Spain
- Ignasi Cos
  - Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Barcelona, Spain
  - Serra-Hunter Fellow Programme, Barcelona, Spain
2. Alister M, McKay KT, Sewell DK, Evans NJ. Uncovering the cognitive mechanisms underlying the gaze cueing effect. Q J Exp Psychol (Hove) 2024; 77:803-827. [PMID: 37246917 PMCID: PMC10960327 DOI: 10.1177/17470218231181238] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/31/2022] [Revised: 04/03/2023] [Accepted: 05/16/2023] [Indexed: 05/30/2023]
Abstract
The gaze cueing effect is the tendency for people to respond faster to targets appearing at locations gazed at by others, compared with locations gazed away from by others. The effect is robust, widely studied, and is an influential finding within social cognition. Formal evidence accumulation models provide the dominant theoretical account of the cognitive processes underlying speeded decision-making, but they have rarely been applied to social cognition research. In this study, using a combination of individual-level and hierarchical computational modelling techniques, we applied evidence accumulation models to gaze cueing data (three data sets total, N = 171, 139,001 trials) for the first time to assess the relative capacity that an attentional orienting mechanism and information processing mechanisms have for explaining the gaze cueing effect. We found that most participants were best described by the attentional orienting mechanism, such that response times were slower at gazed away from locations because they had to reorient to the target before they could process the cue. However, we found evidence for individual differences, whereby the models suggested that some gaze cueing effects were driven by a short allocation of information processing resources to the gazed at location, allowing for a brief period where orienting and processing could occur in parallel. There was exceptionally little evidence to suggest any sustained reallocation of information processing resources at either the group or the individual level. We discuss how this individual variability might represent credible individual differences in the cognitive mechanisms that subserve behaviourally observed gaze cueing effects.
Affiliation(s)
- Manikya Alister
  - Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, VIC, Australia
- Kate T McKay
  - School of Psychology, The University of Queensland, Saint Lucia, QLD, Australia
- David K Sewell
  - School of Psychology, The University of Queensland, Saint Lucia, QLD, Australia
- Nathan J Evans
  - School of Psychology, The University of Queensland, Saint Lucia, QLD, Australia
  - Department of Psychology, Ludwig Maximilian University of Munich, Munich, Germany
3. Ko YH, Zhou A, Niessen E, Stahl J, Weiss PH, Hester R, Bode S, Feuerriegel D. Neural correlates of confidence during decision formation in a perceptual judgment task. Cortex 2024; 173:248-262. [PMID: 38432176 DOI: 10.1016/j.cortex.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 12/06/2023] [Accepted: 01/23/2024] [Indexed: 03/05/2024]
Abstract
When we make a decision, we also estimate the probability that our choice is correct or accurate. This probability estimate is termed our degree of decision confidence. Recent work has reported event-related potential (ERP) correlates of confidence both during decision formation (the centro-parietal positivity component; CPP) and after a decision has been made (the error positivity component; Pe). However, there are several measurement confounds that complicate the interpretation of these findings. More recent studies that overcome these issues have so far produced conflicting results. To better characterise the ERP correlates of confidence we presented participants with a comparative brightness judgment task while recording electroencephalography. Participants judged which of two flickering squares (varying in luminance over time) was brighter on average. Participants then gave confidence ratings ranging from "surely incorrect" to "surely correct". To elicit a range of confidence ratings we manipulated both the mean luminance difference between the brighter and darker squares (relative evidence) and the overall luminance of both squares (absolute evidence). We found larger CPP amplitudes in trials with higher confidence ratings. This association was not simply a by-product of differences in relative evidence (which covaries with confidence) across trials. We did not identify postdecisional ERP correlates of confidence, except when they were artificially produced by pre-response ERP baselines. These results provide further evidence for neural correlates of processes that inform confidence judgments during decision formation.
Affiliation(s)
- Yiu Hong Ko
  - Melbourne School of Psychological Sciences, The University of Melbourne, Australia
  - Cognitive Neuroscience, Institute of Neuroscience and Medicine (INM-3), Research Centre Jülich, Germany
  - Department of Psychology, Faculty of Human Sciences, University of Cologne, Germany
- Andong Zhou
  - Melbourne School of Psychological Sciences, The University of Melbourne, Australia
- Eva Niessen
  - Department of Psychology, Faculty of Human Sciences, University of Cologne, Germany
- Jutta Stahl
  - Department of Psychology, Faculty of Human Sciences, University of Cologne, Germany
- Peter H Weiss
  - Cognitive Neuroscience, Institute of Neuroscience and Medicine (INM-3), Research Centre Jülich, Germany
  - Department of Neurology, University Hospital Cologne and Faculty of Medicine, University of Cologne, Germany
- Robert Hester
  - Melbourne School of Psychological Sciences, The University of Melbourne, Australia
- Stefan Bode
  - Melbourne School of Psychological Sciences, The University of Melbourne, Australia
- Daniel Feuerriegel
  - Melbourne School of Psychological Sciences, The University of Melbourne, Australia
4. Murrow M, Holmes WR. PyBEAM: A Bayesian approach to parameter inference for a wide class of binary evidence accumulation models. Behav Res Methods 2024; 56:2636-2656. [PMID: 37550470 DOI: 10.3758/s13428-023-02162-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/03/2023] [Indexed: 08/09/2023]
Abstract
Many decision-making theories are encoded in a class of processes known as evidence accumulation models (EAM). These assume that noisy evidence stochastically accumulates until a set threshold is reached, triggering a decision. One of the most successful and widely used of this class is the Diffusion Decision Model (DDM). The DDM however is limited in scope and does not account for processes such as evidence leakage, changes of evidence, or time varying caution. More complex EAMs can encode a wider array of hypotheses, but are currently limited by computational challenges. In this work, we develop the Python package PyBEAM (Bayesian Evidence Accumulation Models) to fill this gap. Toward this end, we develop a general probabilistic framework for predicting the choice and response time distributions for a general class of binary decision models. In addition, we have heavily computationally optimized this modeling process and integrated it with PyMC, a widely used Python package for Bayesian parameter estimation. This 1) substantially expands the class of EAM models to which Bayesian methods can be applied, 2) reduces the computational time to do so, and 3) lowers the entry fee for working with these models. Here we demonstrate the concepts behind this methodology, its application to parameter recovery for a variety of models, and apply it to a recently published data set to demonstrate its practical use.
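The accumulate-to-threshold idea described in this abstract can be illustrated with a minimal sketch. It is not PyBEAM code and does not use the PyBEAM API. It is a plain-Python Euler-Maruyama simulation of a basic two-boundary diffusion process, and the function name `simulate_ddm_trial`, its parameters, and the symmetric boundaries at plus and minus `threshold` are all assumptions made for illustration.

```python
import random


def simulate_ddm_trial(drift, threshold, non_decision_time,
                       dt=0.001, noise_sd=1.0, max_time=10.0, rng=None):
    """Simulate one trial of a basic two-boundary diffusion process.

    Evidence starts at 0 and is updated in small time steps with a constant
    drift plus Gaussian noise until it crosses +threshold (choice 1) or
    -threshold (choice 0). Returns (choice, response_time); returns
    (None, None) if no boundary is reached within max_time seconds.
    All names here are hypothetical, chosen only for this sketch.
    """
    rng = rng or random.Random()
    evidence = 0.0
    elapsed = 0.0
    step_sd = noise_sd * dt ** 0.5  # noise scales with sqrt(dt)
    while elapsed < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
        if evidence >= threshold:
            return 1, elapsed + non_decision_time
        if evidence <= -threshold:
            return 0, elapsed + non_decision_time
    return None, None


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [simulate_ddm_trial(2.0, 1.0, 0.3, rng=rng) for _ in range(200)]
    finished = [(c, rt) for c, rt in trials if c is not None]
    upper = sum(c for c, _ in finished) / len(finished)
    mean_rt = sum(rt for _, rt in finished) / len(finished)
    print(f"proportion upper-boundary choices: {upper:.2f}")
    print(f"mean response time (s): {mean_rt:.3f}")
```

Extensions such as evidence leakage or time-varying thresholds, which the abstract mentions as beyond the basic model, would change the update rule or the boundary test inside the loop.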
Affiliation(s)
- Matthew Murrow
  - Department of Physics and Astronomy, Vanderbilt University, 6301 Stevenson Science Center, Nashville, TN 37212, USA
- William R Holmes
  - Cognitive Science Program and Department of Mathematics, Indiana University, 1001 E. 10th St., Bloomington, IN 47405, USA
5. Corbett EA, Martinez-Rodriguez LA, Judd C, O'Connell RG, Kelly SP. Multiphasic value biases in fast-paced decisions. eLife 2023; 12:e67711. [PMID: 36779966 PMCID: PMC9925050 DOI: 10.7554/elife.67711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2021] [Accepted: 01/04/2023] [Indexed: 02/11/2023] Open
Abstract
Perceptual decisions are biased toward higher-value options when overall gains can be improved. When stimuli demand immediate reactions, the neurophysiological decision process dynamically evolves through distinct phases of growing anticipation, detection, and discrimination, but how value biases are exerted through these phases remains unknown. Here, by parsing motor preparation dynamics in human electrophysiology, we uncovered a multiphasic pattern of countervailing biases operating in speeded decisions. Anticipatory preparation of higher-value actions began earlier, conferring a 'starting point' advantage at stimulus onset, but the delayed preparation of lower-value actions was steeper, conferring a value-opposed buildup-rate bias. This, in turn, was countered by a transient deflection toward the higher-value action evoked by stimulus detection. A neurally-constrained process model featuring anticipatory urgency, biased detection, and accumulation of growing stimulus-discriminating evidence, successfully captured both behavior and motor preparation dynamics. Thus, an intricate interplay of distinct biasing mechanisms serves to prioritise time-constrained perceptual decisions.
Affiliation(s)
- Elaine A Corbett
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
  - School of Psychology, Trinity College Dublin, Dublin, Ireland
  - School of Electrical and Electronic Engineering and UCD Centre for Biomedical Engineering, University College Dublin, Dublin, Ireland
- L Alexandra Martinez-Rodriguez
  - School of Electrical and Electronic Engineering and UCD Centre for Biomedical Engineering, University College Dublin, Dublin, Ireland
- Cian Judd
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
- Redmond G O'Connell
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
  - School of Psychology, Trinity College Dublin, Dublin, Ireland
- Simon P Kelly
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
  - School of Electrical and Electronic Engineering and UCD Centre for Biomedical Engineering, University College Dublin, Dublin, Ireland
6. Barendregt NW, Gold JI, Josić K, Kilpatrick ZP. Normative decision rules in changing environments. eLife 2022; 11:e79824. [PMID: 36282065 PMCID: PMC9754630 DOI: 10.7554/elife.79824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 10/20/2022] [Indexed: 11/13/2022] Open
Abstract
Models based on normative principles have played a major role in our understanding of how the brain forms decisions. However, these models have typically been derived for simple, stable conditions, and their relevance to decisions formed under more naturalistic, dynamic conditions is unclear. We previously derived a normative decision model in which evidence accumulation is adapted to fluctuations in the evidence-generating process that occur during a single decision (Glaze et al., 2015), but the evolution of commitment rules (e.g. thresholds on the accumulated evidence) under dynamic conditions is not fully understood. Here, we derive a normative model for decisions based on changing contexts, which we define as changes in evidence quality or reward, over the course of a single decision. In these cases, performance (reward rate) is maximized using decision thresholds that respond to and even anticipate these changes, in contrast to the static thresholds used in many decision models. We show that these adaptive thresholds exhibit several distinct temporal motifs that depend on the specific predicted and experienced context changes and that adaptive models perform robustly even when implemented imperfectly (noisily). We further show that decision models with adaptive thresholds outperform those with constant or urgency-gated thresholds in accounting for human response times on a task with time-varying evidence quality and average reward. These results further link normative and neural decision-making while expanding our view of both as dynamic, adaptive processes that update and use expectations to govern both deliberation and commitment.
Affiliation(s)
- Nicholas W Barendregt
  - Department of Applied Mathematics, University of Colorado Boulder, Boulder, United States
- Joshua I Gold
  - Department of Neuroscience, University of Pennsylvania, Philadelphia, United States
- Krešimir Josić
  - Department of Mathematics, University of Houston, Houston, United States
- Zachary P Kilpatrick
  - Department of Applied Mathematics, University of Colorado Boulder, Boulder, United States
7. Fengler A, Bera K, Pedersen ML, Frank MJ. Beyond Drift Diffusion Models: Fitting a Broad Class of Decision and Reinforcement Learning Models with HDDM. J Cogn Neurosci 2022; 34:1780-1805. [PMID: 35939629 DOI: 10.1162/jocn_a_01902] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/04/2022]
Abstract
Computational modeling has become a central aspect of research in the cognitive neurosciences. As the field matures, it is increasingly important to move beyond standard models to quantitatively assess models with richer dynamics that may better reflect underlying cognitive and neural processes. For example, sequential sampling models (SSMs) are a general class of models of decision-making intended to capture processes jointly giving rise to RT distributions and choice data in n-alternative choice paradigms. A number of model variations are of theoretical interest, but empirical data analysis has historically been tied to a small subset for which likelihood functions are analytically tractable. Advances in methods designed for likelihood-free inference have recently made it computationally feasible to consider a much larger spectrum of SSMs. In addition, recent work has motivated the combination of SSMs with reinforcement learning models, which had historically been considered in separate literatures. Here, we provide a significant addition to the widely used HDDM Python toolbox and include a tutorial for how users can easily fit and assess a (user-extensible) wide variety of SSMs and how they can be combined with reinforcement learning models. The extension comes batteries included, including model visualization tools, posterior predictive checks, and ability to link trial-wise neural signals with model parameters via hierarchical Bayesian regression.
8. Boehm U, Cox S, Gantner G, Stevenson R. Efficient numerical approximation of a non-regular Fokker-Planck equation associated with first-passage time distributions. BIT Numer Math 2022; 62:1355-1382. [PMID: 36415672 PMCID: PMC9674775 DOI: 10.1007/s10543-022-00914-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2021] [Accepted: 02/28/2022] [Indexed: 06/16/2023]
Abstract
In neuroscience, the distribution of a decision time is modelled by means of a one-dimensional Fokker-Planck equation with time-dependent boundaries and space-time-dependent drift. Efficient approximation of the solution to this equation is required, e.g., for model evaluation and parameter fitting. However, the prescribed boundary conditions lead to a strong singularity and thus to slow convergence of numerical approximations. In this article we demonstrate that the solution can be related to the solution of a parabolic PDE on a rectangular space-time domain with homogeneous initial and boundary conditions by transformation and subtraction of a known function. We verify that the solution of the new PDE is indeed more regular than the solution of the original PDE and proceed to discretize the new PDE using a space-time minimal residual method. We also demonstrate that the solution depends analytically on the parameters determining the boundaries as well as the drift. This justifies the use of a sparse tensor product interpolation method to approximate the PDE solution for various parameter ranges. The predicted convergence rates of the minimal residual method and that of the interpolation method are supported by numerical simulations.
Affiliation(s)
- Udo Boehm
  - Department of Psychology, University of Amsterdam, PO Box 15906, 1001 NK Amsterdam, The Netherlands
- Sonja Cox
  - Korteweg–de Vries (KdV) Institute for Mathematics, University of Amsterdam, PO Box 94248, 1090 GE Amsterdam, The Netherlands
- Gregor Gantner
  - Institute of Analysis and Scientific Computing, TU Wien, Wiedner Hauptstraße 8-10, 1040 Vienna, Austria
- Rob Stevenson
  - Korteweg–de Vries (KdV) Institute for Mathematics, University of Amsterdam, PO Box 94248, 1090 GE Amsterdam, The Netherlands
9. Think fast! The implications of emphasizing urgency in decision-making. Cognition 2021; 214:104704. [PMID: 33975126 DOI: 10.1016/j.cognition.2021.104704] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/18/2020] [Revised: 03/10/2021] [Accepted: 03/22/2021] [Indexed: 11/23/2022]
Abstract
Evidence accumulation models (EAMs) have become the dominant explanation of how the decision-making process operates, proposing that decisions are the result of a process of evidence accumulation. The primary use of EAMs has been as "measurement tools" of the underlying decision-making process, where researchers apply EAMs to empirical data to estimate participants' task ability (i.e., the "drift rate"), response caution (i.e., the "decision threshold"), and the time taken for other processes (i.e., the "non-decision time"), making EAMs a powerful tool for discriminating between competing psychological theories. Recent studies have brought into question the mapping between the latent parameters of EAMs and the theoretical constructs that they are thought to represent, showing that emphasizing urgent responding - which intuitively should selectively influence decision threshold - may also influence drift rate and/or non-decision time. However, these findings have been mixed, leading to differences in opinion between experts in the field. The current study aims to provide a more conclusive answer to the implications of emphasizing urgent responding, providing a re-analysis of 6 data sets from previous studies using two different EAMs - the diffusion model and the linear ballistic accumulator (LBA) - with state-of-the-art methods for model selection based inference. The findings display clear evidence for a difference in conclusions between the two models, with the diffusion model suggesting that decision threshold and non-decision time decrease when urgency is emphasized, and the LBA suggesting that decision threshold and drift rate decrease when urgency is emphasized. Furthermore, although these models disagree regarding whether non-decision time or drift rate decrease under urgency emphasis, both show clear evidence that emphasizing urgency does not selectively influence decision threshold. 
These findings suggest that researchers should revise their assumptions about certain experimental manipulations, the specification of certain EAMs, or perhaps both.
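For readers unfamiliar with the second model named in this abstract, the following is a minimal sketch of how a linear ballistic accumulator trial is commonly simulated. It is not taken from the study. The function name `simulate_lba_trial`, its parameters, and the convention of redrawing a trial when every sampled rate is non-positive are assumptions made for illustration. Each response option has its own accumulator with a uniformly sampled start point and a normally sampled rate, evidence rises linearly with no within-trial noise, and the first accumulator to reach the threshold determines the choice.

```python
import random


def simulate_lba_trial(mean_drifts, threshold, max_start, non_decision_time,
                       drift_sd=1.0, rng=None):
    """Simulate one trial of a linear ballistic accumulator race.

    mean_drifts holds one mean drift rate per response option. For each
    accumulator, a start point is drawn uniformly from [0, max_start] and a
    rate from Normal(mean, drift_sd). The finishing time is the remaining
    distance to the threshold divided by the rate. Trials where every rate
    is non-positive are redrawn (an assumed convention for this sketch).
    Returns (index_of_winning_option, response_time).
    """
    if threshold < max_start:
        raise ValueError("threshold must be at least max_start")
    rng = rng or random.Random()
    while True:
        finish_times = []
        for mean in mean_drifts:
            start = rng.uniform(0.0, max_start)
            rate = rng.gauss(mean, drift_sd)
            if rate > 0:
                finish_times.append((threshold - start) / rate)
            else:
                finish_times.append(float("inf"))  # never reaches threshold
        fastest = min(finish_times)
        if fastest != float("inf"):
            return finish_times.index(fastest), fastest + non_decision_time


if __name__ == "__main__":
    rng = random.Random(1)
    trials = [simulate_lba_trial([3.0, 1.0], 1.0, 0.5, 0.2, rng=rng)
              for _ in range(500)]
    share = sum(1 for c, _ in trials if c == 0) / len(trials)
    print(f"proportion of choices for option 0: {share:.2f}")
```

In this sketch, lowering `threshold` shortens response times for every accumulator, while lowering the values in `mean_drifts` slows evidence growth. These are the two parameters the abstract reports as decreasing under urgency emphasis in the LBA analysis.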