1
Embrey JR, Li AX, Liew SX, Newell BR. The effect of noninstrumental information on reward learning. Mem Cognit 2024; 52:1210-1227. [PMID: 38393534 PMCID: PMC11315740 DOI: 10.3758/s13421-024-01537-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/08/2024] [Indexed: 02/25/2024]
Abstract
Investigations of information-seeking often highlight people's tendency to forgo financial reward in return for advance information about future outcomes. Most of these experiments use tasks in which reward contingencies are described to participants. The use of such descriptions leaves open the question of whether the opportunity to obtain such noninstrumental information influences people's ability to learn and represent the underlying reward structure of an experimental environment. In two experiments, participants completed a two-armed bandit task with monetary incentives where reward contingencies were learned via trial-by-trial experience. We find, akin to description-based tasks, that participants are willing to forgo financial reward to receive information about a delayed, unchangeable outcome. Crucially, however, there is little evidence this willingness to pay for information is driven by an inaccurate representation of the reward structure: participants' representations approximated the underlying reward structure regardless of the presence of advance noninstrumental information. The results extend previous conclusions regarding the intrinsic value of information to an experience-based domain and highlight challenges of probing participants' memories for experienced rewards.
Affiliation(s)
- Jake R Embrey
  - School of Psychology, UNSW Sydney, Kensington, Australia
- Amy X Li
  - School of Psychology, UNSW Sydney, Kensington, Australia
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
- Shi Xian Liew
  - School of Psychology, UNSW Sydney, Kensington, Australia
  - School of Psychological Sciences, University of Melbourne, Melbourne, Australia
- Ben R Newell
  - School of Psychology, UNSW Sydney, Kensington, Australia
2
Chu J, Tenenbaum JB, Schulz LE. In praise of folly: flexible goals and human cognition. Trends Cogn Sci 2024; 28:628-642. [PMID: 38616478 DOI: 10.1016/j.tics.2024.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 03/13/2024] [Accepted: 03/13/2024] [Indexed: 04/16/2024]
Abstract
Humans often pursue idiosyncratic goals that appear remote from functional ends, including information gain. We suggest that this is valuable because goals (even prima facie foolish or unachievable ones) contain structured information that scaffolds thinking and planning. By evaluating hypotheses and plans with respect to their goals, humans can discover new ideas that go beyond prior knowledge and observable evidence. These hypotheses and plans can be transmitted independently of their original motivations, adapted across generations, and serve as an engine of cultural evolution. Here, we review recent empirical and computational research underlying goal generation and planning and discuss the ways that the flexibility of our motivational system supports cognitive gains for both individuals and societies.
Affiliation(s)
- Junyi Chu
  - Massachusetts Institute of Technology, Cambridge, MA, USA; Harvard University, Cambridge, MA, USA
- Laura E Schulz
  - Massachusetts Institute of Technology, Cambridge, MA, USA
3
Wittek N, Sayin BS, Okur N, Wittek K, Gül N, Oeksuez F, Güntürkün O, Anselme P. Hungry pigeons prefer sooner rare food over later likely food or faster information. Front Psychol 2024; 15:1426434. [PMID: 38979068 PMCID: PMC11229172 DOI: 10.3389/fpsyg.2024.1426434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Accepted: 06/13/2024] [Indexed: 07/10/2024] Open
Abstract
Introduction: Making decisions and investing effort to obtain rewards may depend on various factors, such as the delay to reward, the probability of its occurrence, and the information that can be collected about it. As predicted by various theories, pigeons and other animals indeed mind these factors when deciding. Methods: We now implemented a task in which pigeons were allowed to choose among three options and to peck at the chosen key to improve the conditions of reward delivery. Pecking more at a first color reduced the 12-s delay before food was delivered with a 33.3% chance, pecking more at a second color increased the initial 33.3% chance of food delivery but did not reduce the 12-s delay, and pecking more at a third color reduced the delay before information was provided whether the trial will be rewarded with a 33.3% chance after 12 s. Results: Pigeons' preference (delay vs. probability, delay vs. information, and probability vs. information), as well as their pecking effort for the chosen option, were analyzed. Our results indicate that hungry pigeons preferred to peck for delay reduction but did not work more for that option than for probability increase, which was the most profitable alternative and did not induce more pecking effort. In this task, information was the least preferred and induced the lowest level of effort. Refed pigeons showed no preference for any option but did not drastically reduce the average amounts of effort invested. Discussion: These results are discussed in the context of species-specific ecological conditions that could constrain current foraging theories.
4
McDevitt MA, Pisklak JM, Dunn RM, Spetch ML. Temporal context effects on suboptimal choice. Psychon Bull Rev 2024:10.3758/s13423-024-02519-y. [PMID: 38760618 DOI: 10.3758/s13423-024-02519-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/22/2024] [Indexed: 05/19/2024]
Abstract
Choice can be driven both by rewards and stimuli that signal those rewards. Under certain conditions, pigeons will prefer options that lead to less probable reward when the reward is signaled. A recently quantified model, the Signal for Good News (SiGN) model, assumes that in the context of uncertainty, signals for a reduced delay to reward reinforce choice. The SiGN model provides an excellent fit to previous results from pigeons and the current studies are the first to test a priori quantitative predictions. Pigeons chose between a suboptimal alternative that led to signaled 20% food and an optimal alternative that led to 50% food. The duration of the choice period was manipulated across conditions in two experiments. Pigeons strongly preferred the suboptimal alternative at the shorter durations and strongly preferred the optimal alternative at the longer durations. The results from both experiments fit well with predictions from the SiGN model and show that altering the duration of the choice period has a dramatic effect in that it changes which of the two options pigeons prefer. More generally, these results suggest that the relative value of options is not fixed, but instead depends on the temporal context.
5
González VV, Zhang Y, Ashikyan SA, Rickard A, Yassine I, Romero-Sosa JL, Blaisdell AP, Izquierdo A. A special role for anterior cingulate cortex, but not orbitofrontal cortex or basolateral amygdala, in choices involving information. Cereb Cortex 2024; 34:bhae135. [PMID: 38610085 PMCID: PMC11014886 DOI: 10.1093/cercor/bhae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 02/09/2024] [Accepted: 03/13/2024] [Indexed: 04/14/2024] Open
Abstract
Subjects are often willing to pay a cost for information. In a procedure that promotes paradoxical choices, animals choose between a richer option followed by a cue that is rewarded 50% of the time (No Info) vs. a leaner option followed by one of two cues that signal certain outcomes: one always rewarded (100%) and the other never rewarded, 0% (Info). Since decisions involve comparing the subjective value of options after integrating all their features, preference for information may rely on cortico-amygdalar circuitry. To test this, male and female rats were prepared with bilateral inhibitory Designer Receptors Exclusively Activated by Designer Drugs (DREADDs) in the anterior cingulate cortex, orbitofrontal cortex, basolateral amygdala, or null virus (control). We inhibited these regions after stable preference was acquired. We found that inhibition of the anterior cingulate cortex destabilized choice preference in female rats without affecting latency to choose or response rate to cues. A logistic regression fit revealed that previous choice predicted current choice in all conditions; however, previously rewarded Info trials strongly predicted preference in all conditions except in female rats following anterior cingulate cortex inhibition. The results reveal a causal, sex-dependent role for the anterior cingulate cortex in decisions involving information.
Affiliation(s)
- Valeria V González
  - Department of Psychology, University of California-Los Angeles, 502 Portola Plaza, Los Angeles, CA 90095, United States
- Yifan Zhang
  - Department of Computer Science, University of Southern California, Salvatori Computer Science Center, 941 Bloom Walk, Los Angeles, CA 90089, United States
- Sonya A Ashikyan
  - Department of Psychology, University of California-Los Angeles, 502 Portola Plaza, Los Angeles, CA 90095, United States
- Anne Rickard
  - Department of Psychology, University of California-Los Angeles, 502 Portola Plaza, Los Angeles, CA 90095, United States
- Ibrahim Yassine
  - Department of Psychology, University of California-Los Angeles, 502 Portola Plaza, Los Angeles, CA 90095, United States
- Juan Luis Romero-Sosa
  - Department of Psychology, University of California-Los Angeles, 502 Portola Plaza, Los Angeles, CA 90095, United States
- Aaron P Blaisdell
  - Department of Psychology, University of California-Los Angeles, 502 Portola Plaza, Los Angeles, CA 90095, United States
  - The Brain Research Institute, University of California-Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA 90095, United States
  - Integrative Center for Learning and Memory, University of California-Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA 90095, United States
- Alicia Izquierdo
  - Department of Psychology, University of California-Los Angeles, 502 Portola Plaza, Los Angeles, CA 90095, United States
  - The Brain Research Institute, University of California-Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA 90095, United States
  - Integrative Center for Learning and Memory, University of California-Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA 90095, United States
  - Integrative Center for Addictions, University of California-Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA 90095, United States
6
Macías A, Machado A, Vasconcelos M. On the value of advanced information about delayed rewards. Anim Cogn 2024; 27:10. [PMID: 38429396 PMCID: PMC10907439 DOI: 10.1007/s10071-024-01856-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 07/21/2023] [Accepted: 10/18/2023] [Indexed: 03/03/2024]
Abstract
In a variety of laboratory preparations, several animal species prefer signaled over unsignaled outcomes. Here we examine whether pigeons prefer options that signal the delay to reward over options that do not and how this preference changes with the ratio of the delays. We offered pigeons repeated choices between two alternatives leading to a short or a long delay to reward. For one alternative (informative), the short and long delays were reliably signaled by different stimuli (e.g., SS for short delays, SL for long delays). For the other (non-informative), the delays were not reliably signaled by the stimuli presented (S1 and S2). Across conditions, we varied the durations of the short and long delays, hence their ratio, while keeping the average delay to reward constant. Pigeons preferred the informative over the non-informative option and this preference became stronger as the ratio of the long to the short delay increased. A modified version of the Δ-Σ hypothesis (González et al., J Exp Anal Behav 113(3):591-608, https://doi.org/10.1002/jeab.595, 2020a) incorporating a contrast-like process between the immediacies to reward signaled by each stimulus accounted well for our findings. Functionally, we argue that a preference for signaled delays hinges on the potential instrumental advantage typically conveyed by information.
Affiliation(s)
- Alejandro Macías
  - William James Center for Research, University of Aveiro, Aveiro, Portugal
  - Animal Learning and Behavior Lab, School of Psychology, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal
- Armando Machado
  - William James Center for Research, University of Aveiro, Aveiro, Portugal
- Marco Vasconcelos
  - William James Center for Research, University of Aveiro, Aveiro, Portugal
7
Macías A, González VV, Machado A, Vasconcelos M. Time, uncertainty, and suboptimal choice. Behav Processes 2024; 214:104982. [PMID: 38072037 DOI: 10.1016/j.beproc.2023.104982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 12/01/2023] [Accepted: 12/06/2023] [Indexed: 12/21/2023]
Abstract
Under certain conditions, pigeons prefer information about whether food will be forthcoming at the end of an interval to a higher chance of obtaining the food. In the typical protocol, choosing one option (Informative) is followed by one of two 10-s long terminal-link stimuli: SG always ending in food or SR never ending in food, with SG occurring only 20% of the trials. The other option (Non-informative) is also followed by one of two 10-s long terminal-link stimuli: SB or SY, both ending in food 50% of the trials. Although the Informative option yields food with a lower probability than the Non-informative (0.2 vs. 0.5), pigeons prefer it. To determine whether such preference occurs because SG and SR disambiguate the trial outcome immediately upon choice, we delayed the moment the disambiguation took place in two experiments. In Experiment 1, when the Informative option was chosen, SG always ensued for t seconds of the terminal-link, and then the standard contingencies followed. Experiment 2 was similar, except that SR always ensued for t seconds. Across conditions, t varied from 0 to 10 s. In both experiments, preference for the Informative option decreased with t, but the effect was stronger in Experiment 1. We discuss the implication of these findings for functional and mechanistic models of suboptimal choice.
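As an illustration only (this is not an analysis from the paper), the "disambiguation" described in the abstract above can be quantified with Shannon entropy. The sketch below assumes the food probabilities stated in the abstract (0.2 for the Informative option, 0.5 for the Non-informative option); the function name `binary_entropy` is hypothetical. On the Informative option, SG and SR resolve all of the outcome uncertainty at terminal-link onset, whereas SB and SY resolve none of it.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a food / no-food outcome with P(food) = p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Probabilities taken from the abstract; everything else is illustrative.
uncertainty_informative = binary_entropy(0.2)      # resolved immediately by SG or SR
uncertainty_non_informative = binary_entropy(0.5)  # unresolved until the outcome

print(f"Informative: {uncertainty_informative:.3f} bits resolved at choice")
print(f"Non-informative: {uncertainty_non_informative:.3f} bits remain for 10 s")
```

Under these assumptions, the Informative option removes roughly 0.72 bits of uncertainty at the start of the terminal link, while the Non-informative option leaves a full bit unresolved until the interval ends.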
Affiliation(s)
- Alejandro Macías
  - Department of Education and Psychology, University of Aveiro, Portugal
- Armando Machado
  - William James Center for Research, University of Aveiro, Portugal
8
González VV, Ashikyan SA, Zhang Y, Rickard A, Yassine I, Romero-Sosa JL, Blaisdell AP, Izquierdo A. A special role for anterior cingulate cortex, but not orbitofrontal cortex or basolateral amygdala, in choices involving information. bioRxiv 2023:2023.08.03.551514. [PMID: 37577596 PMCID: PMC10418268 DOI: 10.1101/2023.08.03.551514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Subjects often are willing to pay a cost for information. In a procedure that promotes paradoxical choices, animals choose between a richer option followed by a cue that is rewarded 50% of the time (No-info) vs a leaner option followed by one of two cues that signal certain outcomes: one always rewarded (100%), and the other never rewarded, 0% (Info). Since decisions involve comparing the subjective value of options after integrating all their features, preference for information may rely on cortico-amygdalar circuitry. To test this, male and female rats were prepared with bilateral inhibitory DREADDs in the anterior cingulate cortex (ACC), orbitofrontal cortex (OFC), basolateral amygdala (BLA), or null virus (control). We inhibited these regions after stable preference was acquired. We found that inhibition of ACC destabilized choice preference in female rats without affecting latency to choose or response rate to cues. A logistic regression fit revealed that the previous choice strongly predicted preference in control animals, but not in female rats following ACC inhibition. The results reveal a causal, sex-dependent role for ACC in decisions involving information.
9
González VV, Blaisdell AP. Inhibition and paradoxical choice. Learn Behav 2023; 51:458-467. [PMID: 37145372 PMCID: PMC10716068 DOI: 10.3758/s13420-023-00584-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/21/2023] [Indexed: 05/06/2023]
Abstract
The present study evaluated the role of inhibition in paradoxical choice in pigeons. In a paradoxical choice procedure, pigeons receive a choice between two alternatives. Choosing the "suboptimal" alternative is followed 20% of the time by one cue (the S+) that is always reinforced, and 80% of the time by another cue (S-) that is never reinforced. Thus, this alternative leads to an overall reinforcement rate of 20%. Choosing the "optimal" alternative, however, is followed by one of two cues (S3 or S4), each reinforced 50% of the time. Thus, this alternative leads to an overall reinforcement rate of 50%. González and Blaisdell (2021) reported that development of paradoxical choice was positively correlated to the development of inhibition to the S- (signal that no food will be delivered on that trial) post-choice stimulus. The current experiment tested the hypothesis that inhibition to a post-choice stimulus is causally related to suboptimal preference. Following acquisition of suboptimal preference, pigeons received two manipulations: in one condition one of the cues in the optimal alternative (S4) was extinguished and, in another condition, the S- cue was partially reinforced. When tested on the choice task afterward, both manipulations resulted in a decrement in suboptimal preference. This result is paradoxical given that both manipulations made the suboptimal alternative the richer option. We discuss the implications of our results, arguing that inhibition of a post-choice cue increases attraction to or value of that choice.
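The overall reinforcement rates quoted in the abstract above follow from weighting each cue's reinforcement probability by how often that cue occurs. A minimal sketch of that arithmetic is given here; the function name `overall_rate` is hypothetical, and the equal split between S3 and S4 is an assumption, since the abstract does not state how often each occurs (the result is 50% for any split).

```python
def overall_rate(cues):
    """Overall reinforcement rate for one alternative.

    cues: list of (probability the cue occurs, probability the cue ends in food).
    """
    return sum(p_cue * p_food for p_cue, p_food in cues)

# Suboptimal alternative: S+ on 20% of trials (always reinforced),
# S- on 80% of trials (never reinforced), as stated in the abstract.
suboptimal = [(0.2, 1.0), (0.8, 0.0)]

# Optimal alternative: S3 or S4, each reinforced 50% of the time.
# ASSUMPTION: S3 and S4 are equally likely; the abstract does not say.
optimal = [(0.5, 0.5), (0.5, 0.5)]

print(f"Suboptimal: {overall_rate(suboptimal):.2f}")
print(f"Optimal: {overall_rate(optimal):.2f}")
```

The gap between the two rates is what makes a preference for the first alternative "suboptimal" in terms of food obtained.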
Affiliation(s)
- Valeria V González
  - Department of Psychology, University of California, 1285 Franz Hall, Los Angeles, CA, 90095-1563, USA
- Aaron P Blaisdell
  - Department of Psychology, University of California, 1285 Franz Hall, Los Angeles, CA, 90095-1563, USA
10
MacGillavry T, Spezie G, Fusani L. When less is more: coy display behaviours and the temporal dynamics of animal courtship. Proc Biol Sci 2023; 290:20231684. [PMID: 37788700 PMCID: PMC10547558 DOI: 10.1098/rspb.2023.1684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 09/06/2023] [Indexed: 10/05/2023] Open
Abstract
Sexual selection research has been dominated by the notion that mate choice selects for the most vigorous displays that best reflect the quality of the courter. However, courtship displays are often temporally structured, containing different elements with varying degrees of intensity and conspicuousness. For example, highly intense movements are often coupled with more subtle components such as static postures or hiding displays. Here, we refer to such subtle display traits as 'coy', as they involve the withholding of information about maximal display capabilities. We examine the role of intensity variation within temporally dynamic displays, and discuss three hypotheses for the evolution of coy courtship behaviours. We first review the threat reduction hypothesis, which points to sexual coercion and sexual autonomy as important facets of sexual selection. We then suggest that variation in display magnitude exploits pre-existing perceptual biases for temporal contrast. Lastly, we propose that information withholding may leverage receivers' predispositions for filling gaps in information: the 'curiosity bias'. Overall, our goal is to draw attention to temporal variation in display magnitude, and to advocate possible scenarios for the evolution of courtship traits that regularly occur below performance maxima. Throughout, we highlight novel directions for empirical and theoretical investigations.
Affiliation(s)
- Thomas MacGillavry
  - Konrad Lorenz Institute of Ethology, University of Veterinary Medicine, Vienna, Austria
- Giovanni Spezie
  - Konrad Lorenz Institute of Ethology, University of Veterinary Medicine, Vienna, Austria
- Leonida Fusani
  - Konrad Lorenz Institute of Ethology, University of Veterinary Medicine, Vienna, Austria
  - Department of Behavioural and Cognitive Biology, University of Vienna, Vienna, Austria
11
Matthews JR, Cooper PS, Bode S, Chong TTJ. The availability of non-instrumental information increases risky decision-making. Psychon Bull Rev 2023; 30:1975-1987. [PMID: 37038030 PMCID: PMC10716073 DOI: 10.3758/s13423-023-02279-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/20/2023] [Indexed: 04/12/2023]
Abstract
Contemporary models of decision-making under risk focus on estimating the final value of each alternative course of action. According to such frameworks, information that has no capacity to alter a future payoff (i.e., is "non-instrumental") should have little effect on one's preference for risk. Importantly, however, recent work has shown that information, despite being non-instrumental, may nevertheless exert a striking influence on behavior. Here, we tested whether the opportunity to passively observe the sequence of events following a decision could modulate risky behavior, even if that information could not possibly influence the final result. Across three experiments, 71 individuals chose to accept or reject gambles on a five-window slot machine. If a gamble was accepted, each window was sequentially revealed prior to the outcome being declared. Critically, we informed participants about which windows would subsequently provide veridical information about the gamble outcome, should that gamble be accepted. Our analyses revealed three key findings. First, the opportunity to observe the consequences of one's choice significantly increased the likelihood of gambling, despite that information being entirely non-instrumental. Second, this effect generalized across different stakes. Finally, choices were driven predominantly by the likelihood that information could result in an earlier resolution of uncertainty. These findings demonstrate the importance of anticipatory information to decision-making under risk. More broadly, we provide strong evidence for the utility of non-instrumental information, by demonstrating its capacity to modulate primary economic decisions that should be driven by more motivationally salient variables associated with risk and reward.
Affiliation(s)
- Julian R Matthews
  - Turner Institute for Brain and Mental Health, Monash University, Clayton, Victoria, 3800, Australia
  - RIKEN Center for Brain Science, Wakō-shi, Saitama, 351-0198, Japan
- Patrick S Cooper
  - Turner Institute for Brain and Mental Health, Monash University, Clayton, Victoria, 3800, Australia
  - Melbourne School of Psychological Sciences, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Stefan Bode
  - Melbourne School of Psychological Sciences, The University of Melbourne, Parkville, Victoria, 3010, Australia
- Trevor T-J Chong
  - Turner Institute for Brain and Mental Health, Monash University, Clayton, Victoria, 3800, Australia
  - Department of Neurology, Alfred Health, Melbourne, Victoria, 3004, Australia
  - Department of Clinical Neurosciences, St Vincent's Hospital, Fitzroy, Victoria, 3065, Australia
12
Zentall TR. An Animal Model of Human Gambling Behavior. Current Research in Behavioral Sciences 2023. [DOI: 10.1016/j.crbeha.2023.100101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2023] Open
13
Barack DL, Bakkour A, Shohamy D, Salzman CD. Visuospatial information foraging describes search behavior in learning latent environmental features. Sci Rep 2023; 13:1126. [PMID: 36670132 PMCID: PMC9860038 DOI: 10.1038/s41598-023-27662-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/19/2022] [Accepted: 01/05/2023] [Indexed: 01/22/2023] Open
Abstract
In the real world, making sequences of decisions to achieve goals often depends upon the ability to learn aspects of the environment that are not directly perceptible. Learning these so-called latent features requires seeking information about them. Prior efforts to study latent feature learning often used single decisions, used few features, and failed to distinguish between reward-seeking and information-seeking. To overcome this, we designed a task in which humans and monkeys made a series of choices to search for shapes hidden on a grid. On our task, the effects of reward and information outcomes from uncovering parts of shapes could be disentangled. Members of both species adeptly learned the shapes and preferred to select tiles expected to be informative earlier in trials than previously rewarding ones, searching a part of the grid until their outcomes dropped below the average information outcome, a pattern consistent with foraging behavior. In addition, how quickly humans learned the shapes was predicted by how well their choice sequences matched the foraging pattern, revealing an unexpected connection between foraging and learning. This adaptive search for information may underlie the ability in humans and monkeys to learn latent features to support goal-directed behavior in the long run.
Affiliation(s)
- David L Barack
  - Department of Neuroscience, Columbia University, New York, USA
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, USA
- Akram Bakkour
  - Department of Psychology, University of Chicago, Chicago, USA
- Daphna Shohamy
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, USA
  - Department of Psychology, Columbia University, New York, USA
  - Kavli Institute for Brain Sciences, Columbia University, New York, USA
- C Daniel Salzman
  - Department of Neuroscience, Columbia University, New York, USA
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, USA
  - Kavli Institute for Brain Sciences, Columbia University, New York, USA
  - Department of Psychiatry, Columbia University, New York, USA
  - New York State Psychiatric Institute, New York, USA
14
Ajuwon V, Ojeda A, Murphy RA, Monteiro T, Kacelnik A. Paradoxical choice and the reinforcing value of information. Anim Cogn 2023; 26:623-637. [PMID: 36306041 PMCID: PMC9950180 DOI: 10.1007/s10071-022-01698-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 09/07/2022] [Accepted: 10/01/2022] [Indexed: 11/01/2022]
Abstract
Signals that reduce uncertainty can be valuable because well-informed decision-makers can better align their preferences to opportunities. However, some birds and mammals display an appetite for informative signals that cannot be used to increase returns. We explore the role that reward-predictive stimuli have in fostering such preferences, aiming at distinguishing between two putative underlying mechanisms. The 'information hypothesis' proposes that reducing uncertainty is reinforcing per se, somewhat consistently with the concept of curiosity: a motivation to know in the absence of tractable extrinsic benefits. In contrast, the 'conditioned reinforcement hypothesis', an associative account, proposes asymmetries in secondarily acquired reinforcement: post-choice stimuli announcing forthcoming rewards (S+) reinforce responses more than stimuli signalling no rewards (S-) inhibit responses. In three treatments, rats faced two equally profitable options delivering food probabilistically after a fixed delay. In the informative option (Info), food or no food was signalled immediately after choice, whereas in the non-informative option (NoInfo) outcomes were uncertain until the delay lapsed. Subjects preferred Info when (1) both outcomes were explicitly signalled by salient auditory cues, (2) only forthcoming food delivery was explicitly signalled, and (3) only the absence of forthcoming reward was explicitly signalled. Acquisition was slower in (3), when food was not explicitly signalled, showing that signals for positive outcomes have a greater influence on the development of preference than signals for negative ones. Our results are consistent with an elaborated conditioned reinforcement account, and with the conjecture that both uncertainty reduction and conditioned reinforcement jointly act to generate preference.
Affiliation(s)
- Victor Ajuwon
  - Department of Biology, University of Oxford, Oxford, UK
- Andrés Ojeda
  - Department of Biology, University of Oxford, Oxford, UK
- Robin A. Murphy
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
- Tiago Monteiro
  - Department of Biology, University of Oxford, Oxford, UK
  - Domestication Lab, Department of Interdisciplinary Life Sciences, Konrad Lorenz Institute of Ethology, University of Veterinary Medicine Vienna, Vienna, Austria
- Alex Kacelnik
  - Department of Biology, University of Oxford, Oxford, UK
15
Good news is better than bad news, but bad news is not worse than no news. Learn Behav 2022; 50:482-493. [PMID: 35023021 DOI: 10.3758/s13420-021-00489-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/23/2021] [Indexed: 12/30/2022]
Abstract
Under certain conditions, pigeons will reliably prefer an alternative that leads to a lower probability of food over an alternative that leads to a higher probability of food (i.e., demonstrate suboptimal choice). A critical aspect of the typical procedure is that the alternative associated with less food provides differential stimuli that signal trial outcomes, but the alternative associated with more food does not. Few studies have investigated how partial signaling of an alternative influences preference. In Experiments 1-3, pigeons chose between two alternatives that each led to food 60% of the time with partially signaled trial outcomes. One alternative occasionally provided a stimulus that always preceded food (i.e., "good news") and the other alternative occasionally provided a stimulus that always preceded no food ("bad news"). Experiments 2 and 3 also assessed preference in conditions in which alternatives were either completely unsignaled (provided no differential stimuli) or always led to food. Pigeons consistently preferred the "good news" alternative over the "bad news" alternative and preferred 100% food over the "bad news" alternative. The results from conditions in which pigeons chose between the "bad news" alternative and an unsignaled alternative were inconclusive, but suggestive of a preference for bad news. The results are used to evaluate and distinguish between competing explanations of suboptimal choice.
|
16
|
Pavlovian processes may produce contrast leading to bias and suboptimal choice. Learn Behav 2022; 50:349-359. [DOI: 10.3758/s13420-022-00514-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/22/2022] [Indexed: 11/08/2022]
|
17
|
Overmatching under food uncertainty in foraging pigeons. Behav Processes 2022; 201:104728. [DOI: 10.1016/j.beproc.2022.104728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Revised: 07/07/2022] [Accepted: 08/01/2022] [Indexed: 11/20/2022]
|
18
|
Abstract
The influence of single option or forced-exposure (FE) trials was studied in the suboptimal choice task. Pigeons chose between an optimal alternative that led to food half of the time and a suboptimal alternative that led to food 20% of the time. Choice of the suboptimal alternative was compared across groups of subjects that received different numbers of FE trials during training. In Experiment 1, subjects received 100% FE trials, 67% FE trials, or only choice trials. Pigeons in the two groups that had FE trials developed extreme preference for the signaled suboptimal alternative over the unsignaled optimal alternative, while pigeons that had no FE trials showed pronounced individual differences. Experiment 2 compared 10% and 90% FE trials. When neither alternative signaled trial outcomes, both groups of subjects strongly preferred the optimal alternative. When the suboptimal alternative provided differential signals, the subjects in the 90% FE group developed strong preference for the suboptimal alternative and subjects in the 10% FE group maintained preference for the optimal alternative. The results of both experiments demonstrate that FE trials can have substantial effects on the development of preference in the suboptimal choice task.
|
19
|
Goh AXA, Bennett D, Bode S, Chong TTJ. Neurocomputational mechanisms underlying the subjective value of information. Commun Biol 2021; 4:1346. [PMID: 34903804 PMCID: PMC8669024 DOI: 10.1038/s42003-021-02850-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/22/2021] [Accepted: 11/04/2021] [Indexed: 11/09/2022] [Open Access]
Abstract
Humans have a striking desire to actively seek new information, even when it is devoid of any instrumental utility. However, the mechanisms that drive individuals' subjective preference for information remain unclear. Here, we used fMRI to examine the processing of subjective information value, by having participants decide how much effort they were willing to trade off for non-instrumental information. We showed that choices were best described by a model that accounted for: (1) the variability in individuals' estimates of uncertainty, (2) their desire to reduce that uncertainty, and (3) their subjective preference for positively valenced information. Model-based analyses revealed the anterior cingulate as a key node that encodes the subjective value of information across multiple stages of decision-making, including when information was prospectively valued and when the outcome was definitively delivered. These findings emphasise the multidimensionality of information value, and reveal the neurocomputational mechanisms underlying the variability in individuals' desire to physically pursue informative outcomes.
Affiliation(s)
- Ariel X-A Goh
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, VIC, 3800, Australia
- School of Psychological Sciences, Monash University, Melbourne, VIC, 3800, Australia
| | - Daniel Bennett
- Department of Psychiatry, Monash University, Melbourne, VIC, 3800, Australia
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08540, USA
| | - Stefan Bode
- Melbourne School of Psychological Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
| | - Trevor T-J Chong
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, VIC, 3800, Australia.
- School of Psychological Sciences, Monash University, Melbourne, VIC, 3800, Australia.
- Department of Neurology, Alfred Health, Melbourne, VIC, 3004, Australia.
- Department of Clinical Neurosciences, St Vincent's Hospital, Melbourne, VIC, 3065, Australia.
|
20
|
Pirrone A, Reina A, Stafford T, Marshall JAR, Gobet F. Magnitude-sensitivity: rethinking decision-making. Trends Cogn Sci 2021; 26:66-80. [PMID: 34750080 DOI: 10.1016/j.tics.2021.10.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/14/2021] [Revised: 10/05/2021] [Accepted: 10/05/2021] [Indexed: 11/25/2022]
Abstract
Magnitude-sensitivity refers to the result that performance in decision-making, across domains and organisms, is affected by the total value of the possible alternatives. This simple result offers a window into fundamental issues in decision-making and has led to a reconsideration of ecological decision-making, prominent computational models of decision-making, and optimal decision-making. Moreover, magnitude-sensitivity has inspired the design of new robotic systems that exploit natural solutions and apply optimal decision-making policies. In this article, we review the key theoretical and empirical results about magnitude-sensitivity and highlight the importance that this phenomenon has for the understanding of decision-making. Furthermore, we discuss open questions and ideas for future research.
Affiliation(s)
- Angelo Pirrone
- Centre for Philosophy of Natural and Social Science, London School of Economics and Political Science, London, UK.
| | - Andreagiovanni Reina
- Institute for Interdisciplinary Studies on Artificial Intelligence (IRIDIA), Université Libre de Bruxelles, Brussels, Belgium
| | - Tom Stafford
- Department of Psychology, University of Sheffield, Sheffield, UK
| | | | - Fernand Gobet
- Centre for Philosophy of Natural and Social Science, London School of Economics and Political Science, London, UK
|
21
|
Richman SK, Barker JL, Baek M, Papaj DR, Irwin RE, Bronstein JL. The Sensory and Cognitive Ecology of Nectar Robbing. Front Ecol Evol 2021. [DOI: 10.3389/fevo.2021.698137] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] [Open Access]
Abstract
Animals foraging from flowers must assess their environment and make critical decisions about which patches, plants, and flowers to exploit to obtain limiting resources. The cognitive ecology of plant-pollinator interactions explores not only the complex nature of pollinator foraging behavior and decision making, but also how cognition shapes pollination and plant fitness. Floral visitors sometimes depart from what we think of as typical pollinator behavior and instead exploit floral resources by robbing nectar (bypassing the floral opening and instead consuming nectar through holes or perforations made in floral tissue). The impacts of nectar robbing on plant fitness are well-studied; however, there is considerably less understanding, from the animal’s perspective, about the cognitive processes underlying nectar robbing. Examining nectar robbing from the standpoint of animal cognition is important for understanding the evolution of this behavior and its ecological and evolutionary consequences. In this review, we draw on central concepts of foraging ecology and animal cognition to consider nectar robbing behavior either when individuals use robbing as their only foraging strategy or when they switch between robbing and legitimate foraging. We discuss sensory and cognitive biases, learning, and the role of a variable environment in making decisions about robbing vs. foraging legitimately. We also discuss ways in which an understanding of the cognitive processes involved in nectar robbing can address questions about how plant-robber interactions affect patterns of natural selection and floral evolution. We conclude by highlighting future research directions on the sensory and cognitive ecology of nectar robbing.
|
22
|
Effort-motivated behavior resolves paradoxes in appetitive conditioning. Behav Processes 2021; 193:104525. [PMID: 34601051 DOI: 10.1016/j.beproc.2021.104525] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/17/2021] [Revised: 09/22/2021] [Accepted: 09/28/2021] [Indexed: 11/23/2022]
Abstract
Motivated behavior has long been studied by psychologists, ethologists, and neuroscientists. To date, many scientists agree with the view that cue and reward attraction is the product of a dopamine-dependent unconscious process called incentive salience or "wanting". This process allows the influence of multiple factors such as hunger and odors on motivational attraction. In some cases, however, the resulting motivated behavior differs from what the incentive salience hypothesis would predict. I argue that seeking behavior under reward uncertainty illustrates this situation: Organisms do not just "want" (appetite-based attraction) cues that are inconsistent or associated with reward occasionally, they "hope" that those cues will consistently predict reward procurement in the ongoing trial. Said otherwise, they become motivated to invest time and energy to find consistent cue-reward associations despite no guarantee of success (effort-based attraction). A multi-test comparison of performance between individuals trained under uncertainty and certainty reveals behavioral paradoxes suggesting that the concept of incentive salience cannot fully account for responding to inconsistent cues. A mathematical model explains how appetite-based and effort-based attractions might combine their effects.
|
23
|
López-Tolsa GE, Orduña V. The role of contingency discriminability in suboptimal choice. Behav Processes 2021; 193:104511. [PMID: 34562512 DOI: 10.1016/j.beproc.2021.104511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/07/2021] [Revised: 09/11/2021] [Accepted: 09/20/2021] [Indexed: 11/27/2022]
Abstract
Suboptimal choice consists of a preference for an alternative with a lower probability of reinforcement (suboptimal alternative) over another with a higher probability of reinforcement (optimal alternative) when the former has discriminative stimuli that signal in which trials a reinforcer will be delivered and in which trials it will not. Discriminating the contingencies of reinforcement associated with the stimuli of the suboptimal alternative is necessary to produce suboptimal choice, but the impact of different degrees of discriminability has not been systematically studied. The discriminability of the contingencies of reinforcement depends on the difference in the probability of reinforcement of the two stimuli; higher differences yield higher discriminability. Pigeons were exposed to a procedure that presented a choice between two alternatives, each associated with two stimuli. The contingency discriminability of the suboptimal alternative was manipulated across conditions, while the contingency discriminability of the optimal alternative was absent in all conditions. The overall probability of reinforcement of each alternative remained the same throughout the experiment (p = .2 and p = .5 for the suboptimal and optimal alternatives, respectively). The preference for the suboptimal alternative increased as its discriminability increased. There was a positive correlation between discrimination index and preference for the suboptimal alternative. These results highlight the importance of contingency discriminability to generate suboptimal choice.
Affiliation(s)
- Gabriela E López-Tolsa
- Facultad de Psicología, Universidad Nacional Autónoma de México, Ciudad de México 04510, México
| | - Vladimir Orduña
- Facultad de Psicología, Universidad Nacional Autónoma de México, Ciudad de México 04510, México.
|
24
|
Moran R, Dayan P, Dolan RJ. Efficiency and prioritization of inference-based credit assignment. Curr Biol 2021; 31:2747-2756.e6. [PMID: 33887181 PMCID: PMC8279739 DOI: 10.1016/j.cub.2021.03.091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2020] [Revised: 02/11/2021] [Accepted: 03/29/2021] [Indexed: 11/16/2022]
Abstract
Organisms adapt to their environments by learning to approach states that predict rewards and avoid states associated with punishments. Knowledge about the affective value of states often relies on credit assignment (CA), whereby state values are updated on the basis of reward feedback. Remarkably, humans assign credit to states that are not observed but are instead inferred based on a cognitive map that represents structural knowledge of an environment. A pertinent example is authors attempting to infer the identity of anonymous reviewers to assign them credit or blame and, on this basis, inform future referee recommendations. Although inference is cognitively costly, it is unknown how it influences CA or how it is apportioned between hidden and observable states (for example, both anonymous and revealed reviewers). We addressed these questions in a task that provided choices between lotteries where each led to a unique pair of occasionally rewarding outcome states. On some trials, both states were observable (rendering inference nugatory), whereas on others, the identity of one of the states was concealed. Importantly, by exploiting knowledge of choice-state associations, subjects could infer the identity of this hidden state. We show that having to perform inference reduces state-value updates. Strikingly, and in violation of normative theories, this reduction in CA was selective for the observed outcome alone. These findings have implications for the operation of putative cognitive maps.
Affiliation(s)
- Rani Moran
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, 10-12 Russell Square, London WC1B 5EH, UK; Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK.
| | - Peter Dayan
- Max Planck Institute for Biological Cybernetics, Max Planck-Ring 8, 72076 Tübingen, Germany; University of Tübingen, 72074 Tübingen, Germany
| | - Raymond J Dolan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, 10-12 Russell Square, London WC1B 5EH, UK; Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK
|
25
|
Abstract
We rely on gaze to guide subsequent steps during walking, more so when the terrain ahead is more uncertain. New research shows that the increased visual exploration during walking as the terrain becomes more uncertain reflects our preference for accuracy over effort in step choice.
Affiliation(s)
- Shruthi Sukumar
- Department of Computer Science, University of Colorado Boulder, 1111 Engineering Drive, Boulder, CO 80309, USA.
| | - Alaa A Ahmed
- Department of Mechanical Engineering, University of Colorado Boulder, 1111 Engineering Drive, Boulder, CO 80309, USA
|
26
|
Vandaele Y, Lenoir M, Vouillac-Mendoza C, Guillem K, Ahmed SH. Probing the decision-making mechanisms underlying choice between drug and nondrug rewards in rats. eLife 2021; 10:e64993. [PMID: 33900196 PMCID: PMC8075577 DOI: 10.7554/elife.64993] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/18/2020] [Accepted: 04/17/2021] [Indexed: 12/02/2022] [Open Access]
Abstract
Delineating the decision-making mechanisms underlying choice between drug and nondrug rewards remains a challenge. This study adopts an original approach to probe these mechanisms by comparing response latencies during sampling versus choice trials. While lengthening of latencies during choice is predicted in a deliberative choice model (DCM), the race-like response competition mechanism postulated by the sequential choice model (SCM) predicts a shortening of latencies during choice compared to sampling. Here, we tested these predictions by conducting a retrospective analysis of cocaine-versus-saccharin choice experiments conducted in our laboratory. We found that rats engage deliberative decision-making mechanisms after limited training, but adopt an SCM-like response selection mechanism after more extended training, while their behavior is presumably habitual. Thus, the DCM and SCM may not be general models of choice, as initially formulated, but could be dynamically engaged to control choice behavior across early and extended training.
Affiliation(s)
- Youna Vandaele
- Lausanne University Hospital, Department of Psychiatry, Prilly, Switzerland
| | - Magalie Lenoir
- Université de Bordeaux, Institut des Maladies Neurodégénératives, Bordeaux, France
- CNRS, Institut des Maladies Neurodégénératives, Bordeaux, France
| | - Caroline Vouillac-Mendoza
- Université de Bordeaux, Institut des Maladies Neurodégénératives, Bordeaux, France
- CNRS, Institut des Maladies Neurodégénératives, Bordeaux, France
| | - Karine Guillem
- Université de Bordeaux, Institut des Maladies Neurodégénératives, Bordeaux, France
- CNRS, Institut des Maladies Neurodégénératives, Bordeaux, France
| | - Serge H Ahmed
- Université de Bordeaux, Institut des Maladies Neurodégénératives, Bordeaux, France
- CNRS, Institut des Maladies Neurodégénératives, Bordeaux, France
|
27
|
Jiwa M, Cooper PS, Chong TTJ, Bode S. Choosing increases the value of non-instrumental information. Sci Rep 2021; 11:8780. [PMID: 33888764 PMCID: PMC8062497 DOI: 10.1038/s41598-021-88031-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/12/2021] [Accepted: 04/07/2021] [Indexed: 11/22/2022] [Open Access]
Abstract
Curiosity pervades all aspects of human behaviour and decision-making. Recent research indicates that the value of information is determined by its propensity to reduce uncertainty, and the hedonic value of the outcomes it predicts. Previous findings also indicate a preference for options that are freely chosen, compared to equivalently valued alternatives that are externally assigned. Here, we asked whether the value of information also varies as a function of self- or externally-imposed choices. Participants rated their preference for information that followed either a self-chosen decision, or an externally imposed condition. Our results showed that choosing a lottery significantly increased the subjective value of information about the outcome. Computational modelling indicated that this change in information-seeking behaviour was not due to changes in the subjective probability of winning, but instead reflected an independent effect of choosing on the value of resolving uncertainty. These results demonstrate that agency over a prospect is an important source of information value.
Affiliation(s)
- Matthew Jiwa
- School of Psychological Sciences, University of Melbourne, Melbourne, 3010, Australia.
| | - Patrick S Cooper
- School of Psychological Sciences, University of Melbourne, Melbourne, 3010, Australia
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, 3800, Australia
| | - Trevor T-J Chong
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, 3800, Australia
- Alfred Health, Department of Neurology, Melbourne, 3004, Australia
- Department of Clinical Neurosciences, St Vincent's Hospital, Melbourne, 3065, Australia
| | - Stefan Bode
- School of Psychological Sciences, University of Melbourne, Melbourne, 3010, Australia
|
28
|
Domínguez-Zamora FJ, Marigold DS. Motives driving gaze and walking decisions. Curr Biol 2021; 31:1632-1642.e4. [PMID: 33600769 DOI: 10.1016/j.cub.2021.01.069] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/13/2020] [Revised: 10/01/2020] [Accepted: 01/20/2021] [Indexed: 01/23/2023]
Abstract
To navigate complex environments, people must decide how to direct gaze to acquire relevant information and decide where, when, and how to move the body. Recent work supports the idea that gaze may be directed to reduce task-relevant environmental uncertainty and to ensure movement accuracy based on the cost (or effort) to move the body and maintain balance. During walking, these two factors may compete for gaze allocation and explain how we make decisions about where to step. Using a forced-choice walking paradigm, where we manipulated the visual uncertainty (simulating uncertain terrain characteristics) and motor cost associated with specific step-target choices, we examined the motives driving gaze and step decisions. We characterized each individual's distinct gaze behavior based on their sensitivity to changes in visual uncertainty, which predicted step-choice behavior when foot-placement accuracy was important to the task. We show that individuals who tended to look at both target choices as visual uncertainty increased prioritized stepping onto the more certain location after looking at it longer, even at the expense of increased motor cost. In contrast, individuals who tended to look at only one of the target choices as visual uncertainty increased preferred to step on the target that minimized motor cost. Overall, we demonstrate that how a person explores the environment with their eyes dictates where they step. These gaze and step decisions may relate to the value a person assigns to information gain, being certain of their actions, and conserving energy.
Affiliation(s)
- F Javier Domínguez-Zamora
- Department of Biomedical Physiology and Kinesiology, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
| | - Daniel S Marigold
- Department of Biomedical Physiology and Kinesiology, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada.
|
29
|
González VV, Macías A, Machado A, Vasconcelos M. Testing the Δ-∑ hypothesis in the suboptimal choice task: Same delta with different probabilities of reinforcement. J Exp Anal Behav 2021; 114:233-247. [PMID: 33460139 DOI: 10.1002/jeab.621] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/10/2020] [Revised: 06/11/2020] [Accepted: 07/06/2020] [Indexed: 11/08/2022]
Abstract
In a concurrent-chain procedure, pigeons choose between 2 initial-link stimuli; one is followed by terminal link stimuli that signal reliably whether food will be delivered after a delay; the other is followed by terminal link stimuli that do not signal whether food will be delivered after the delay. Pigeons prefer the former alternative even when it yields a lower overall probability of food. Recently, we proposed the Delta-Sigma (∆-∑) hypothesis to explain the effect: Preference depends on the difference (∆) between the reinforcement probabilities associated with the terminal link stimuli, and the overall probability of reinforcement (∑) associated with the alternative. The hypothesis predicts that, for constant ∑, animals should prefer alternatives with greater ∆ values regardless of the specific probabilities of reinforcement that determine ∆. In 2 experiments, we tested this prediction by comparing a ∆ = .5 against a ∆ = 0 alternative, with the former obtained with different pairs of reinforcement probabilities across conditions. The results supported the hypothesis when the 2 probabilities defining ∆ were significantly greater than 0, but not when one of them was close to 0. The results challenge our theoretical accounts of suboptimal choice and the variables considered to determine pigeons' preference.
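As a rough illustration of the two quantities named in this abstract, the sketch below computes Δ and ∑ for an alternative with two terminal-link stimuli. The class name `Alternative`, the function `predicted_preference`, the assumption that ∑ is the presentation-weighted average of the two stimulus probabilities, and all numeric values are hypothetical and not taken from the paper; it only encodes the stated prediction that, for constant ∑, the alternative with the greater Δ should be preferred.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alternative:
    """Hypothetical description of one alternative with two terminal-link stimuli."""
    p_stimulus_a: float    # probability that terminal-link stimulus A is shown
    p_food_given_a: float  # reinforcement probability signalled by stimulus A
    p_food_given_b: float  # reinforcement probability signalled by stimulus B

    @property
    def delta(self) -> float:
        # Delta: difference between the two terminal-link reinforcement probabilities.
        return abs(self.p_food_given_a - self.p_food_given_b)

    @property
    def sigma(self) -> float:
        # Sigma: overall reinforcement probability of the alternative
        # (assumed here to be the presentation-weighted average).
        p_stimulus_b = 1.0 - self.p_stimulus_a
        return (self.p_stimulus_a * self.p_food_given_a
                + p_stimulus_b * self.p_food_given_b)


def predicted_preference(x: Alternative, y: Alternative) -> Optional[Alternative]:
    """Return the alternative with greater Delta, given equal Sigma (None if tied)."""
    if not math.isclose(x.sigma, y.sigma):
        raise ValueError("the prediction sketched here only covers equal Sigma")
    if math.isclose(x.delta, y.delta):
        return None
    return x if x.delta > y.delta else y


# Illustrative values only: both alternatives pay off half of the time overall.
informative = Alternative(p_stimulus_a=0.5, p_food_given_a=0.75, p_food_given_b=0.25)
uninformative = Alternative(p_stimulus_a=0.5, p_food_given_a=0.5, p_food_given_b=0.5)
preferred = predicted_preference(informative, uninformative)
```

With these illustrative numbers both alternatives have ∑ = .5, the first has Δ = .5 and the second Δ = 0, so the sketch returns the first alternative.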
Affiliation(s)
| | - Alejandro Macías
- Department of Education and Psychology, University of Aveiro, Portugal
| | - Armando Machado
- Department of Education and Psychology, University of Aveiro, Portugal; William James Center for Research, University of Aveiro, Portugal
| | - Marco Vasconcelos
- Department of Education and Psychology, University of Aveiro, Portugal; William James Center for Research, University of Aveiro, Portugal
|
30
|
Gorzelańczyk EJ, Walecki P, Błaszczyszyn M, Laskowska E, Kawala-Sterniuk A. Evaluation of Risk Behavior in Gambling Addicted and Opioid Addicted Individuals. Front Neurosci 2021; 14:597524. [PMID: 33488346 PMCID: PMC7817611 DOI: 10.3389/fnins.2020.597524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2020] [Accepted: 11/20/2020] [Indexed: 11/13/2022] [Open Access]
Abstract
Evidence suggests that both opioid-addicted and gambling-addicted individuals show higher levels of risky behavior than healthy people. It has been shown that the administration of substitution drugs can reduce cravings for opioids and the risky decisions made by individuals addicted to opioids. Although the neurobiological foundations of addiction are thought to be similar, risk behaviors in individuals addicted to opioids may differ in detail from those of individuals addicted to gambling. The aim of this work was to compare the level of risk behavior in individuals addicted to opioids with that of individuals addicted to gambling, using the Iowa Gambling Task (IGT). The score and response time during the task were measured. On the basis of the whole IGT, individuals addicted to gambling made riskier decisions than healthy individuals from the control group but less risky decisions than individuals addicted to opioids before administration of methadone, with no statistically significant difference after administration of methadone. This is in line with growing evidence that methadone administration is strongly associated with a significant decrease in risky behavior.
Affiliation(s)
- Edward J. Gorzelańczyk
- Department of Theoretical Basis of Bio-Medical Sciences and Medical Informatics, Nicolaus Copernicus University – Collegium Medicum, Bydgoszcz, Poland
- Institute of Philosophy, Kazimierz Wielki University, Bydgoszcz, Poland
- Babinski Specialist Psychiatric Healthcare Center, Outpatient Addiction Treatment, Lodz, Poland
- The Society for the Substitution Treatment of Addiction “Medically Assisted Recovery”, Bydgoszcz, Poland
| | - Piotr Walecki
- Department of Bioinformatics and Telemedicine, Jagiellonian University – Collegium Medicum, Krakow, Poland
| | - Monika Błaszczyszyn
- Faculty of Physical Education and Physiotherapy, Opole University of Technology, Opole, Poland
| | - Ewa Laskowska
- Faculty of Medicine, Nicolaus Copernicus University – Collegium Medicum, Bydgoszcz, Poland
| | - Aleksandra Kawala-Sterniuk
- Faculty of Electrical Engineering, Automatic Control and Informatics, Opole University of Technology, Opole, Poland
|
31
|
Sanabria F, Bell MC. Failure to find a distance effect in pigeon choice: Manipulating amount and delay of reinforcement. J Exp Anal Behav 2020; 114:276-290. [PMID: 33034054 DOI: 10.1002/jeab.627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2020] [Revised: 07/19/2020] [Accepted: 08/26/2020] [Indexed: 11/08/2022]
Abstract
The choice behavior of primates, including humans, displays a distance effect: Latency to choose between alternatives appears to increase with smaller differences in value. There is, so far, no demonstration of this effect in birds. Tests of distance effects in birds have been conducted in binary choice situations with a dominant alternative, where one alternative is superior to the other in all aspects that meaningfully contribute to value (e.g., provides access to the same reinforcer, but with a shorter delay). The present study considers the possibility that including dominant alternatives in choice tests precludes distance effects. Four pigeons were presented with binary choices between alternatives that varied in amount and delay. Some choices had a dominant alternative (smaller-sooner or larger-later vs. smaller-later) and some did not (smaller-sooner vs. larger-later). Across phases, only the delay to the smaller-sooner reinforcer varied. Distance effects were expected to be expressed as longer latencies as choice between smaller-sooner and larger-later reinforcers approached indifference. Despite the sensitivity of choice to differences in amount and delay, no distance effect was observed. Alternative explanations for the failure to find a distance effect in pigeon choice, including the Sequential Choice Model (SCM), are discussed.
|
32
|
Understanding exploration in humans and machines by formalizing the function of curiosity. Curr Opin Behav Sci 2020. [DOI: 10.1016/j.cobeha.2020.07.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/17/2022]
|
33
|
FitzGibbon L, Lau JKL, Murayama K. The seductive lure of curiosity: information as a motivationally salient reward. Curr Opin Behav Sci 2020. [DOI: 10.1016/j.cobeha.2020.05.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
|
34
|
Monosov IE. How Outcome Uncertainty Mediates Attention, Learning, and Decision-Making. Trends Neurosci 2020; 43:795-809. [PMID: 32736849 PMCID: PMC8153236 DOI: 10.1016/j.tins.2020.06.009] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 04/06/2020] [Revised: 06/16/2020] [Accepted: 06/24/2020] [Indexed: 01/24/2023]
Abstract
Animals and humans evolved sophisticated nervous systems that endowed them with the ability to form internal-models or beliefs and make predictions about the future to survive and flourish in a world in which future outcomes are often uncertain. Crucial to this capacity is the ability to adjust behavioral and learning policies in response to the level of uncertainty. Until recently, the neuronal mechanisms that could underlie such uncertainty-guided control have been largely unknown. In this review, I discuss newly discovered neuronal circuits in primates that represent uncertainty about future rewards and propose how they guide information-seeking, attention, decision-making, and learning to help us survive in an uncertain world. Lastly, I discuss the possible relevance of these findings to learning in artificial systems.
Affiliation(s)
- Ilya E Monosov
- Department of Neuroscience and Neurosurgery, Washington University School of Medicine in St. Louis, MO, USA; Department of Biomedical Engineering, Washington University School of Medicine in St. Louis, MO, USA; Washington University Pain Center, Washington University School of Medicine in St. Louis, MO, USA.
|
35
|
Abstract
Animals will favor a risky option when a stimulus signaling reward bridges the choice and the outcome. The present experiments investigated signal-induced risky choices and reward-outcome expectations in rhesus and capuchin monkeys. Risky choice was assessed by preference for a large-probabilistic reward over a modest-certain reward. Outcome expectancy was assessed by providing a truncation response that shortened the delay period. In Experiment 1, both species generally favored the risky option over the safe option when the outcomes were signaled, and generally shortened the delays except when a signaled-loss stimulus was presented. The use of the delay-truncation response suggested that the monkeys were sensitive to the information conveyed by the stimulus. Experiments 2 and 3 were designed to investigate whether capuchin monkeys used the delay-truncation response strategically, reflecting explicit decision-making, or as a conditioned response to reward stimuli. A perceptual judgment task was included, and the selective use of the delay-truncation response on unsignaled correct trials may suggest the involvement of metacognitive processes. The capuchin monkeys generally truncated the delays except under conditions where reward would not be expected (risky-loss or incorrect-judgment). When the outcomes were unsignaled during the delay, some capuchin monkeys were less likely to truncate the delay following an incorrect task response. Overall, the monkeys (1) made more risky choices when the outcomes were signaled, consistent with gambling-like behavior, and (2) selectively truncated the unsignaled delays when rewards could be anticipated (even when metacognitive-like awareness guided anticipation), suggesting that delay-truncation responses reflect explicit outcome expectancy.
Affiliation(s)
- Travis R Smith
- Department of Psychological Sciences, Kansas State University, 492 Bluemont Hall, 1114 Mid-Campus Dr North, Manhattan, KS, 66506-5302, USA
- Michael J Beran
- Language Research Center and Department of Psychology, Georgia State University, Atlanta, GA, USA
|
36
|
Macías A, González VV, Machado A, Vasconcelos M. The functional equivalence of two variants of the suboptimal choice task: choice proportion and response latency as measures of value. Anim Cogn 2020; 24:85-98. [PMID: 32772333 DOI: 10.1007/s10071-020-01418-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/26/2020] [Revised: 07/28/2020] [Accepted: 07/30/2020] [Indexed: 10/23/2022]
Abstract
In the suboptimal-choice task, birds systematically choose the leaner but informative option (suboptimal) over the richer but non-informative option (optimal). The task has two variations. In the standard task, the optimal option includes two terminal link stimuli. In the original task, it includes a single terminal link stimulus. Two models, the temporal information account (Cunningham and Shahan, J Exp Psychol Anim Learn Cogn 44:1-22, 2018) and the ∆-∑ hypothesis (González et al., J Exp Anal Behav 113:591-608, 2020), presuppose that these procedures are equivalent, but no formal comparison is available. Here we test whether or not these procedures are functionally equivalent. One group of pigeons was trained with the standard procedure, another group with the original procedure, and a third group was trained with a hybrid of the other two (i.e., the two options were the optimal links of the standard and original procedures). Our findings indicate that the number of terminal link stimuli in the optimal option is inconsequential vis-à-vis choice. Moreover, our findings also indicate that latencies to respond are a sensitive metric of value and choice. As predicted by the Sequential Choice Model, we were able to predict simultaneous choices from the latencies of sequential choices and observed a substantial shortening of latencies during simultaneous choices.
Affiliation(s)
- Alejandro Macías
- Animal Learning and Behavior Lab, School of Psychology, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal; Department of Education and Psychology, University of Aveiro, Aveiro, Portugal
- Valeria V González
- Animal Learning and Behavior Lab, School of Psychology, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal
- Armando Machado
- Department of Education and Psychology, University of Aveiro, Aveiro, Portugal; William James Center for Research, University of Aveiro, Aveiro, Portugal
- Marco Vasconcelos
- Department of Education and Psychology, University of Aveiro, Aveiro, Portugal; William James Center for Research, University of Aveiro, Aveiro, Portugal
|
37
|
González-Torres R, Flores J, Orduña V. Suboptimal choice by pigeons is eliminated when key-pecking behavior is replaced by treadle-pressing. Behav Processes 2020; 178:104157. [PMID: 32497555 DOI: 10.1016/j.beproc.2020.104157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2020] [Revised: 05/29/2020] [Accepted: 05/29/2020] [Indexed: 11/25/2022]
Abstract
In the study of suboptimal choice, a reliable result is that pigeons strongly prefer an alternative that signals whether or not a reinforcer will be delivered over another alternative without that information, even if the first provides a lower probability of reinforcement. In that research, key pecking has been the operant response and illuminated keys the discriminative stimuli. In the present study we modified both of these aspects of the procedure in order to analyze the generality of suboptimal preferences of pigeons and to investigate the effect of changes in the incentive salience of the discriminative stimuli. To accomplish this, we presented pigeons with a choice situation with the same parameters of reinforcement as in previous research, but with treadle pressing as the choice response and ambient lights as discriminative stimuli. Under these conditions, most of the pigeons showed optimal behavior and a high degree of discrimination of the stimuli associated with the discriminative alternative. A control condition with key pecking as the choice response and keylights as discriminative stimuli showed that the same pigeons turned out to be suboptimal, a result that rules out the possibility that the optimality found in the main condition was a consequence of a particular characteristic of our sample of subjects or of our procedure. We discuss the influence that the attribution of incentive salience to the discriminative stimuli has on suboptimal choice in both pigeons and rats.
Affiliation(s)
- Julio Flores
- Facultad de Psicología, Universidad Nacional Autónoma de México, México, DF, 04510, Mexico
- Vladimir Orduña
- Facultad de Psicología, Universidad Nacional Autónoma de México, México, DF, 04510, Mexico
|
38
|
González VV, Macías A, Machado A, Vasconcelos M. The Δ-∑ hypothesis: How contrast and reinforcement rate combine to generate suboptimal choice. J Exp Anal Behav 2020; 113:591-608. [PMID: 32237091 DOI: 10.1002/jeab.595] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/15/2019] [Revised: 02/27/2020] [Accepted: 02/27/2020] [Indexed: 11/06/2022]
Abstract
When given a choice between two alternatives, each offering food after the same delay with different but signaled probabilities, pigeons often prefer the low probability alternative. This preference is surprising because pigeons fail to maximize the rate of food intake; they exhibit a suboptimal preference. We advance a new explanation, the Δ-∑ hypothesis, in which the difference in probability of reinforcement within terminal links (Δ) and the overall reinforcement probability rate of each alternative (∑) are the key variables responsible for such suboptimal preference. We tested the Δ-∑ hypothesis in two experiments. In Experiment 1, we manipulated the Δs while maintaining constant all other parameters of the task, in particular the ∑s. We predicted a preference for the alternative with the larger Δ. In Experiment 2, we examined the effect of the overall reinforcement probabilities, the ∑s, while maintaining constant all other parameters of the task, in particular the Δs. We predicted a preference for the larger ∑. The results of both experiments support the Δ-∑ hypothesis.
Affiliation(s)
- Alejandro Macías
- Department of Education and Psychology, University of Aveiro, Portugal
- Armando Machado
- Department of Education and Psychology, University of Aveiro, Portugal; William James Center for Research, University of Aveiro, Portugal
- Marco Vasconcelos
- Department of Education and Psychology, University of Aveiro, Portugal; William James Center for Research, University of Aveiro, Portugal
|
39
|
Lau JKL, Ozono H, Kuratomi K, Komiya A, Murayama K. Shared striatal activity in decisions to satisfy curiosity and hunger at the risk of electric shocks. Nat Hum Behav 2020; 4:531-543. [PMID: 32231281 DOI: 10.1038/s41562-020-0848-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 11/19/2018] [Accepted: 03/02/2020] [Indexed: 12/14/2022]
Abstract
Curiosity is often portrayed as a desirable feature of human faculty. However, curiosity may come at a cost that sometimes puts people in harmful situations. Here, using a set of behavioural and neuroimaging experiments with stimuli that strongly trigger curiosity (for example, magic tricks), we examine the psychological and neural mechanisms underlying the motivational effect of curiosity. We consistently demonstrate that across different samples, people are indeed willing to gamble, subjecting themselves to electric shocks to satisfy their curiosity for trivial knowledge that carries no apparent instrumental value. Also, this influence of curiosity shares common neural mechanisms with that of hunger for food. In particular, we show that acceptance (compared to rejection) of curiosity-driven or incentive-driven gambles is accompanied by enhanced activity in the ventral striatum when curiosity or hunger was elicited, which extends into the dorsal striatum when participants made a decision.
Affiliation(s)
- Johnny King L Lau
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Hiroki Ozono
- Faculty of Law, Economics and Humanities, Kagoshima University, Kagoshima, Japan
- Kei Kuratomi
- Faculty of Psychology, Aichi Shukutoku University, Nagakute, Japan
- Asuka Komiya
- Graduate School of Integrated Arts and Sciences, Hiroshima University, Higashi-Hiroshima, Japan
- Kou Murayama
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK; Research Institute, Kochi University of Technology, Kochi, Japan
|
40
|
The incentive salience of the stimuli biases rats’ preferences in the “suboptimal choice” procedure. Behav Processes 2020; 172:104057. [DOI: 10.1016/j.beproc.2020.104057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2019] [Revised: 01/13/2020] [Accepted: 01/14/2020] [Indexed: 11/24/2022]
|
41
|
Abstract
Many studies have shown that pigeons will sometimes behave suboptimally by choosing an option that provides food less frequently over one that provides food more frequently. The critical factor in driving suboptimal behavior in these procedures is that the delayed outcomes are differentially signaled on the suboptimal alternative, but not the optimal alternative. Although this procedure is frequently cited as potentially analogous to human gambling, there is little empirical data to evaluate this assertion. The present study tested both pigeon (Experiment 1) and human (Experiment 2) subjects with a suboptimal choice task. Subjects chose between a suboptimal alternative that provided a large reinforcer 20% of the time and an optimal alternative that always provided a small reinforcer. Stimuli presented during the delays signaled the outcomes on the suboptimal alternative in some conditions. When outcomes were signaled, pigeons chose the suboptimal alternative more frequently than did humans. When the outcomes were not signaled, pigeons' choices became more optimal, but humans' choices did not. Humans' suboptimal choice was unrelated to performance on a probability discounting task. Overall, these findings suggest that although both pigeons and humans can choose suboptimally, more research is needed in order to determine whether non-human performance on this task can serve as a model for human gambling.
|
42
|
Kobayashi K, Ravaioli S, Baranès A, Woodford M, Gottlieb J. Diverse motives for human curiosity. Nat Hum Behav 2019; 3:587-595. [PMID: 30988479 DOI: 10.1038/s41562-019-0589-3] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 04/02/2018] [Accepted: 03/12/2019] [Indexed: 12/29/2022]
Abstract
Curiosity, our desire to know, is a fundamental drive in human behaviour, but its mechanisms are poorly understood. A classical question concerns the motives for curiosity: what drives individuals to become curious about some but not other sources of information? [1] Here we show that curiosity about probabilistic events depends on multiple aspects of the distribution of these events. Participants (n = 257) performed a task in which they could demand advance information about only one of two randomly selected monetary prizes that contributed to their income. Individuals differed markedly in the extent to which they requested information as a function of the ex ante uncertainty or ex ante value of an individual prize. This heterogeneity was not captured by theoretical models describing curiosity as a desire to learn about the total rewards of a situation [2,3]. Instead, it could be explained by an extended model that allowed for attribute-specific anticipatory utility (the savouring of individual components of the eventual reward) and postulated that this utility increased nonlinearly with the certainty of receiving the reward. Parameter values fitting individual choices were consistent for information about gains or losses, suggesting that attribute-specific anticipatory utility captures fundamental heterogeneity in the determinants of curiosity.
Affiliation(s)
- Kenji Kobayashi
- The Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Silvio Ravaioli
- Sant'Anna School of Advanced Studies, Pisa, Italy; Department of Economics, Columbia University, New York, NY, USA
- Adrien Baranès
- Department of Neuroscience, Columbia University, New York, NY, USA
- Jacqueline Gottlieb
- The Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Department of Neuroscience, Columbia University, New York, NY, USA; The Kavli Institute for Brain Science, Columbia University, New York, NY, USA
|
43
|
Monkeys are curious about counterfactual outcomes. Cognition 2019; 189:1-10. [PMID: 30889493 DOI: 10.1016/j.cognition.2019.03.009] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 09/17/2018] [Revised: 03/11/2019] [Accepted: 03/13/2019] [Indexed: 11/22/2022]
Abstract
Many non-human animals show exploratory behaviors. It remains unclear whether any possess human-like curiosity. We previously proposed three criteria for applying the term curiosity to animal behavior: (1) the subject is willing to sacrifice reward to obtain information, (2) the information provides no immediate instrumental or strategic benefit, and (3) the amount the subject is willing to pay depends systematically on the amount of information available. In previous work on information-seeking in animals, information generally predicts upcoming rewards, and animals' decisions may therefore be a byproduct of reinforcement processes. Here we get around this potential confound by taking advantage of macaques' ability to reason counterfactually (that is, about outcomes that could have occurred had the subject chosen differently). Specifically, macaques sacrificed fluid reward to obtain information about counterfactual outcomes. Moreover, their willingness to pay scaled with the information (Shannon entropy) offered by the counterfactual option. These results demonstrate the existence of human-like curiosity in non-human primates according to our criteria, which circumvent several confounds associated with less stringent criteria.
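As an illustration of the quantity named in this abstract, the sketch below computes the Shannon entropy (in bits) of a discrete outcome distribution. The function name `shannon_entropy` and the example probabilities are hypothetical and are not taken from the study; the point is only that a more uncertain option carries more information to be resolved.

```python
import math


def shannon_entropy(probs):
    """Return the Shannon entropy, in bits, of a discrete distribution.

    `probs` is a sequence of outcome probabilities that must sum to 1.
    Outcomes with probability 0 contribute nothing to the entropy.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(-p * math.log2(p) for p in probs if p > 0)


# Hypothetical counterfactual options (not the values used in the study):
certain_option = shannon_entropy([1.0])        # outcome already known
coin_flip_option = shannon_entropy([0.5, 0.5])  # two equally likely outcomes
four_way_option = shannon_entropy([0.25] * 4)   # four equally likely outcomes
```

Under these assumed distributions, entropy rises from the certain option to the four-way option, which is the ordering along which willingness to pay would be expected to scale if it tracks available information.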
|
44
|
Moran R, Keramati M, Dayan P, Dolan RJ. Retrospective model-based inference guides model-free credit assignment. Nat Commun 2019; 10:750. [PMID: 30765718 PMCID: PMC6375980 DOI: 10.1038/s41467-019-08662-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/03/2018] [Accepted: 01/17/2019] [Indexed: 11/09/2022] [Open Access]
Abstract
An extensive reinforcement learning literature shows that organisms assign credit efficiently, even under conditions of state uncertainty. However, little is known about credit-assignment when state uncertainty is subsequently resolved. Here, we address this problem within the framework of an interaction between model-free (MF) and model-based (MB) control systems. We present and support experimentally a theory of MB retrospective-inference. Within this framework, a MB system resolves uncertainty that prevailed when actions were taken thus guiding an MF credit-assignment. Using a task in which there was initial uncertainty about the lotteries that were chosen, we found that when participants' momentary uncertainty about which lottery had generated an outcome was resolved by provision of subsequent information, participants preferentially assigned credit within a MF system to the lottery they retrospectively inferred was responsible for this outcome. These findings extend our knowledge about the range of MB functions and the scope of system interactions.
Affiliation(s)
- Rani Moran
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, 10-12 Russell Square, London, WC1B 5EH, UK; Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, United Kingdom
- Mehdi Keramati
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, 10-12 Russell Square, London, WC1B 5EH, UK; Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, United Kingdom; Department of Psychology, City, University of London, London, EC1R 0JD, UK
- Peter Dayan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, 10-12 Russell Square, London, WC1B 5EH, UK; Gatsby Computational Neuroscience Unit, University College London, London, W1T 4JG, UK; Max Planck Institute for Biological Cybernetics, Max-Planck-Ring 8, 72076, Tuebingen, Germany
- Raymond J Dolan
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, 10-12 Russell Square, London, WC1B 5EH, UK; Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, United Kingdom
|
45
|
Orduña V, Alba R. Rats' optimal choice behavior in a gambling-like task. Behav Processes 2019; 162:104-111. [PMID: 30742885 DOI: 10.1016/j.beproc.2019.02.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/05/2018] [Revised: 02/02/2019] [Accepted: 02/07/2019] [Indexed: 02/07/2023]
Abstract
Among the different procedures that model gambling behavior in non-human animals, the "suboptimal choice procedure" has been extensively employed for analyzing the impact of environmental cues on choice behavior. It has been repeatedly demonstrated that pigeons prefer an alternative that infrequently presents a stimulus signaling a larger amount of reinforcement over another alternative that always presents a stimulus associated with a smaller amount of reinforcement, even though the net rate of reinforcement is lower in the former. In the present study, we tested rats in the magnitude version of the suboptimal choice procedure. Eight rats were given a choice between two alternatives: a) one in which a stimulus predicting the delivery of ten pellets was presented with probability (p) = 0.2 and a stimulus predicting zero pellets was presented with p = 0.8, and b) one in which either of two stimuli predicted the delivery of three pellets with p = 1.0. Contrary to the consistent and robust suboptimal behavior of pigeons, rats preferred the optimal alternative. This effect occurred despite the high index of discrimination of the stimuli associated with the different outcomes shown by the rats. The relevance of this result to the development of animal models of gambling behavior is discussed.
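The reinforcement rates implied by the two alternatives in this abstract can be checked with a short calculation. The sketch below is illustrative only: the function name `expected_pellets` is hypothetical, and it assumes that each choice leads to exactly one of the listed outcomes with the stated probability.

```python
def expected_pellets(outcomes):
    """Expected pellets per choice for a list of (probability, pellets) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * pellets for p, pellets in outcomes)


# Alternative (a): ten pellets with p = 0.2, zero pellets with p = 0.8.
suboptimal = expected_pellets([(0.2, 10), (0.8, 0)])

# Alternative (b): three pellets with p = 1.0.
optimal = expected_pellets([(1.0, 3)])

print(f"suboptimal: {suboptimal:.1f} pellets per choice")
print(f"optimal:    {optimal:.1f} pellets per choice")
```

Under these assumptions, alternative (a) yields an average of 2 pellets per choice and alternative (b) yields 3, which is why (a) is labeled suboptimal despite its larger single payoff.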
Affiliation(s)
- Vladimir Orduña
- Facultad de Psicología, Universidad Nacional Autónoma de México, México D.F., 04510, Mexico
- Rodrigo Alba
- Facultad de Psicología, Universidad Nacional Autónoma de México, México D.F., 04510, Mexico
|
46
|
Zentall TR, Smith AP, Beckmann J. Differences in rats and pigeons suboptimal choice may depend on where those stimuli are in their behavior system. Behav Processes 2019; 159:37-41. [DOI: 10.1016/j.beproc.2018.11.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/28/2018] [Revised: 11/26/2018] [Accepted: 11/30/2018] [Indexed: 12/11/2022]
|
47
|
Rodriguez Cabrero JAM, Zhu JQ, Ludvig EA. Costly curiosity: People pay a price to resolve an uncertain gamble early. Behav Processes 2019; 160:20-25. [PMID: 30648613 DOI: 10.1016/j.beproc.2018.12.015] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 09/22/2018] [Revised: 12/01/2018] [Accepted: 12/14/2018] [Indexed: 10/27/2022]
Abstract
Humans are inherently curious creatures, continuously seeking out information about future outcomes. Such advance information is often valuable, potentially allowing people to select better courses of action. In non-human animals, this drive for information can be so strong that they forego food or water to find out a few seconds earlier whether an uncertain option will provide a reward. Here, we assess whether people will exhibit a similar sub-optimal preference for advance information. Participants played a card-flipping task where they were probabilistically rewarded based on the pattern of 3 cards that were revealed after a 5-s delay. During this delay, participants could instead pay a cost to find out the next card's identity immediately. This choice to find out early did not influence the eventual outcome. Participants preferred to find out early about 80% of the time when the information was free; they were even willing to incur an expense to get advance information about the eventual outcome. The expected magnitude of the outcome, however, had little impact on the likelihood of finding out early. These results suggest that humans, like animals, value non-instrumental information and will pay a price for such information, independent of its utility.
|
48
|
Schwartenbeck P, Passecker J, Hauser TU, FitzGerald THB, Kronbichler M, Friston KJ. Computational mechanisms of curiosity and goal-directed exploration. eLife 2019; 8:41703. [PMID: 31074743 PMCID: PMC6510535 DOI: 10.7554/elife.41703] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 09/04/2018] [Accepted: 04/17/2019] [Indexed: 01/27/2023] [Open Access]
Abstract
Successful behaviour depends on the right balance between maximising reward and soliciting information about the world. Here, we show how different types of information-gain emerge when casting behaviour as surprise minimisation. We present two distinct mechanisms for goal-directed exploration that express separable profiles of active sampling to reduce uncertainty. 'Hidden state' exploration motivates agents to sample unambiguous observations to accurately infer the (hidden) state of the world. Conversely, 'model parameter' exploration compels agents to sample outcomes associated with high uncertainty if they are informative for their representation of the task structure. We illustrate the emergence of these types of information-gain, termed active inference and active learning, and show how these forms of exploration induce distinct patterns of 'Bayes-optimal' behaviour. Our findings provide a computational framework for understanding how distinct levels of uncertainty systematically affect the exploration-exploitation trade-off in decision-making.
Affiliation(s)
- Philipp Schwartenbeck
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom; Centre for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria; Neuroscience Institute, Christian-Doppler-Klinik, Paracelsus Medical University Salzburg, Salzburg, Austria; Oxford Centre for Functional MRI of the Brain, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
- Johannes Passecker
- Department for Cognitive Neurobiology, Center for Brain Research, Medical University Vienna, Vienna, Austria; Mortimer B. Zuckerman Mind Brain and Behavior Institute, New York, United States
- Tobias U Hauser
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom; Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
- Thomas HB FitzGerald
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom; Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, United Kingdom; Department of Psychology, University of East Anglia, Norwich, United Kingdom
- Martin Kronbichler
- Centre for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria; Neuroscience Institute, Christian-Doppler-Klinik, Paracelsus Medical University Salzburg, Salzburg, Austria
- Karl J Friston
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
|
49
|
Pisklak JM, McDevitt MA, Dunn RM, Spetch ML. Frequency and value both matter in the suboptimal choice procedure. J Exp Anal Behav 2018; 111:1-11. [PMID: 30569554 DOI: 10.1002/jeab.490] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/28/2018] [Accepted: 12/05/2018] [Indexed: 11/09/2022]
Abstract
Pigeons chose between two options on a concurrent-chains task with a single response requirement in the initial link. The suboptimal option ended with food 20% of the time whereas the optimal option ended with food 80% of the time. During a Sig-Both condition, terminal-link stimuli on both options signaled whether or not food would occur. During a Sig-Sub condition, terminal-link stimuli on the suboptimal option provided differential signals, but stimuli on the optimal option did not differentially signal the food and no food outcomes. Initial-link choices revealed a clear preference for the optimal option in the Sig-Both condition, but preference shifted toward suboptimality in the Sig-Sub condition. These findings show that pigeon suboptimal choice is not singularly driven by signal value, as has been suggested, but also by reinforcer frequency.
|
50
|
Zentall TR, Andrews DM, Case JP. Contrast between what is expected and what occurs increases pigeon’s suboptimal choice. Anim Cogn 2018; 22:81-87. [DOI: 10.1007/s10071-018-1223-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/26/2018] [Revised: 10/17/2018] [Accepted: 11/08/2018] [Indexed: 11/24/2022]
|