1. Zid M, Laurie VJ, Levine-Champagne A, Shourkeshti A, Harrell D, Herman AB, Ebitz RB. Humans forage for reward in reinforcement learning tasks. bioRxiv 2024:2024.07.08.602539. [PMID: 39026817 PMCID: PMC11257465 DOI: 10.1101/2024.07.08.602539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
How do we make good decisions in uncertain environments? In psychology and neuroscience, the classic answer is that we calculate the value of each option and then compare the values to choose the most rewarding, modulo some exploratory noise. An ethologist, conversely, would argue that we commit to one option until its value drops below a threshold, at which point we start exploring other options. In order to determine which view better describes human decision-making, we developed a novel, foraging-inspired sequential decision-making model and used it to ask whether humans compare to threshold ("Forage") or compare alternatives ("Reinforcement-Learn" [RL]). We found that the foraging model was a better fit for participant behavior, better predicted the participants' tendency to repeat choices, and predicted the existence of held-out participants with a pattern of choice that was almost impossible under RL. Together, these results suggest that humans use foraging computations, rather than RL, even in classic reinforcement learning tasks.
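The two decision rules contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the function names, the softmax form of "compare alternatives", the random switch rule, and the parameters `beta`, `threshold`, and `alpha` are all assumptions chosen for illustration.

```python
import math
import random


def rl_choice(values, beta, rng):
    """Compare alternatives: sample an option from a softmax over all values.

    `beta` (assumed inverse temperature) controls the exploratory noise.
    """
    weights = [math.exp(beta * v) for v in values]
    r = rng.random() * sum(weights)
    cumulative = 0.0
    for option, w in enumerate(weights):
        cumulative += w
        if r < cumulative:
            return option
    return len(values) - 1


def forage_choice(current, current_value, threshold, n_options, rng):
    """Compare to threshold: stay with the current option while its value is
    at or above `threshold`; otherwise switch to a random other option."""
    if current_value >= threshold:
        return current
    others = [o for o in range(n_options) if o != current]
    return rng.choice(others)


def update(value, reward, alpha):
    """Delta-rule value update (assumed here for both rules)."""
    return value + alpha * (reward - value)
```

Under these assumptions, the foraging rule only ever consults the value of the option currently being exploited, whereas the RL rule needs a value for every option on every trial.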
Affiliation(s)
- Meriam Zid
  - Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
- Veldon-James Laurie
  - Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
- Akram Shourkeshti
  - Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
- Dameon Harrell
  - Department of Psychiatry, University of Minnesota, Minneapolis, MN, 55455, USA
- Alexander B. Herman
  - Department of Psychiatry, University of Minnesota, Minneapolis, MN, 55455, USA
- R. Becket Ebitz
  - Department of Neuroscience, University of Montreal, Montreal, QC, H3T 1J4, Canada
2. Shintaki R, Tanaka D, Suzuki S, Yoshimoto T, Sadato N, Chikazoe J, Jimura K. Continuous decision to wait for a future reward is guided by fronto-hippocampal anticipatory dynamics. Cereb Cortex 2024; 34:bhae217. [PMID: 38798003 DOI: 10.1093/cercor/bhae217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 05/02/2024] [Accepted: 05/08/2024] [Indexed: 05/29/2024] [Open Access]
Abstract
Deciding whether to wait for a future reward is crucial for surviving in an uncertain world. While seeking rewards, agents anticipate a reward in the present environment and constantly face a trade-off between staying in their environment or leaving it. It remains unclear, however, how humans make continuous decisions in such situations. Here, we show that anticipatory activity in the anterior prefrontal cortex, ventrolateral prefrontal cortex, and hippocampus underpins continuous stay-leave decision-making. Participants awaited real liquid rewards available after tens of seconds, and their continuous decision was tracked by dynamic brain activity associated with the anticipation of a reward. Participants stopped waiting more frequently and sooner after they experienced longer delays and received smaller rewards. When the dynamic anticipatory brain activity was enhanced in the anterior prefrontal cortex, participants remained in their current environment, but when this activity diminished, they left the environment. Moreover, while experiencing a delayed reward in a novel environment, the ventrolateral prefrontal cortex and hippocampus showed anticipatory activity. Finally, the activity in the anterior prefrontal cortex and ventrolateral prefrontal cortex was enhanced in participants adopting a leave strategy, whereas those remaining stationary showed enhanced hippocampal activity. Our results suggest that fronto-hippocampal anticipatory dynamics underlie continuous decision-making while anticipating a future reward.
Affiliation(s)
- Reiko Shintaki
  - Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Daiki Tanaka
  - Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
- Shinsuke Suzuki
  - Centre for Brain, Mind and Markets, The University of Melbourne, Grattan Street, Parkville, Victoria, 3010, Australia
  - Faculty of Social Data Science and HIAS Brain Research Center, Hitotsubashi University, 2-1 Naka, Kunitachi, 186-8601, Japan
- Takaaki Yoshimoto
  - Research Organization of Science and Technology, Ritsumeikan University, 1-1-1, Nojihigashi, Kusatsu, 525-8577, Japan
  - Section of Brain Function Information, Supportive Center for Brain Research, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan
- Norihiro Sadato
  - Research Organization of Science and Technology, Ritsumeikan University, 1-1-1, Nojihigashi, Kusatsu, 525-8577, Japan
  - Section of Brain Function Information, Supportive Center for Brain Research, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan
- Junichi Chikazoe
  - Section of Brain Function Information, Supportive Center for Brain Research, National Institute for Physiological Sciences, 38 Nishigonaka, Myodaiji, Okazaki, 444-8585, Japan
  - Araya, Inc., 1-11 Kanda Sakuma-cho, Chiyoda, Tokyo, 101-0025, Japan
- Koji Jimura
  - Department of Informatics, Gunma University, 4-2 Aramaki-machi, Maebashi, 371-8510, Japan
3. Alejandro RJ, Holroyd CB. Hierarchical control over foraging behavior by anterior cingulate cortex. Neurosci Biobehav Rev 2024; 160:105623. [PMID: 38490499 DOI: 10.1016/j.neubiorev.2024.105623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 02/14/2024] [Accepted: 03/13/2024] [Indexed: 03/17/2024]
Abstract
Foraging is a natural behavior that involves making sequential decisions to maximize rewards while minimizing the costs incurred when doing so. The prevalence of foraging across species suggests that a common brain computation underlies its implementation. Although anterior cingulate cortex is believed to contribute to foraging behavior, its specific role has been contentious, with predominant theories arguing either that it encodes environmental value or choice difficulty. Additionally, recent attempts to characterize foraging have taken place within the reinforcement learning framework, with increasingly complex models scaling with task complexity. Here we review reinforcement learning foraging models, highlighting the hierarchical structure of many foraging problems. We extend this literature by proposing that ACC guides foraging according to principles of model-based hierarchical reinforcement learning. This idea holds that ACC function is organized hierarchically along a rostral-caudal gradient, with rostral structures monitoring the status and completion of high-level task goals (like finding food), and midcingulate structures overseeing the execution of task options (subgoals, like harvesting fruit) and lower-level actions (such as grabbing an apple).
Affiliation(s)
- Clay B Holroyd
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
4. Sukumar S, Shadmehr R, Ahmed AA. Effects of reward and effort history on decision making and movement vigor during foraging. J Neurophysiol 2024; 131:638-651. [PMID: 38056423 PMCID: PMC11305639 DOI: 10.1152/jn.00092.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 12/01/2023] [Accepted: 12/04/2023] [Indexed: 12/08/2023] [Open Access]
Abstract
During foraging, animals explore a site and harvest reward and then abandon that site and travel to the next opportunity. One aspect of this behavior involves decision making, and the other involves movement control. These two aspects of behavior may be linked via an underlying desire to maximize a single normative utility: the sum of all rewards acquired, minus all efforts expended, divided by time. According to this theory, the history of rewards, and not just its immediate availability, should dictate how long one should stay and harvest reward and how vigorously one should travel to the next opportunity. We tested this theory in a series of experiments in which humans used their hand to harvest tokens at a reward patch and then used their arm to reach toward another patch. After a history of high rewards, the subjects not only shortened their harvest duration but also moved more vigorously toward the next reward opportunity. In contrast, after a history of high effort they lengthened their harvest duration but reduced their movement vigor, reaching more slowly to the next reward site. Thus, a history of high reward or low effort biased decisions by promoting early abandonment of the reward site and biased movements by promoting vigor.
NEW & NOTEWORTHY Much of life is spent foraging. Whereas previous work has focused on the decision regarding time spent harvesting from a reward patch, here we test the idea that both decision making and movement control are tuned to optimize the net rate of reward in an environment. Our results show that movement patterns reflect not just immediate expectations but also past experiences in the environment, providing fundamental insight into the factors governing volitional control of arm movements.
Affiliation(s)
- Shruthi Sukumar
  - Department of Computer Science, University of Colorado Boulder, Boulder, Colorado, United States
- Reza Shadmehr
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States
- Alaa A Ahmed
  - Department of Mechanical Engineering, University of Colorado Boulder, Boulder, Colorado, United States
5. Webb J, Steffan P, Hayden BY, Lee D, Kemere C, McGinley M. Foraging Under Uncertainty Follows the Marginal Value Theorem with Bayesian Updating of Environment Representations. bioRxiv 2024:2024.03.30.587253. [PMID: 38585964 PMCID: PMC10996644 DOI: 10.1101/2024.03.30.587253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Foraging theory has been a remarkably successful approach to understanding the behavior of animals in many contexts. In patch-based foraging contexts, the marginal value theorem (MVT) shows that the optimal strategy is to leave a patch when the marginal rate of return declines to the average for the environment. However, the MVT is only valid in deterministic environments whose statistics are known to the forager; naturalistic environments seldom meet these strict requirements. As a result, the strategies used by foragers in naturalistic environments must be empirically investigated. We developed a novel behavioral task and a corresponding computational framework for studying patch-leaving decisions in head-fixed and freely moving mice. We varied between-patch travel time, as well as within-patch reward depletion rate, both deterministically and stochastically. We found that mice adopt patch residence times in a manner consistent with the MVT and not explainable by simple ethologically motivated heuristic strategies. Critically, behavior was best accounted for by a modified form of the MVT wherein environment representations were updated based on local variations in reward timing, captured by a Bayesian estimator and dynamic prior. Thus, we show that mice can strategically attend to, learn from, and exploit task structure on multiple timescales simultaneously, thereby efficiently foraging in volatile environments. The results provide a foundation for applying the systems neuroscience toolkit in freely moving and head-fixed mice to understand the neural basis of foraging under uncertainty.
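The MVT leaving rule stated in this abstract (leave when the marginal rate of return falls to the environment's average rate) can be made concrete with a small Python sketch. The exponentially depleting patch, the parameter names `r0`, `tau`, and `travel`, and the bisection solver are assumptions for illustration only; they are not taken from the task or model in the paper.

```python
import math


def gain(t, r0, tau):
    """Cumulative reward after t seconds in a patch whose reward rate
    decays as r0 * exp(-t / tau) (assumed depletion curve)."""
    return r0 * tau * (1.0 - math.exp(-t / tau))


def marginal_rate(t, r0, tau):
    """Instantaneous (marginal) reward rate at time t in the patch."""
    return r0 * math.exp(-t / tau)


def mvt_leave_time(r0, tau, travel, hi=1e3, tol=1e-9):
    """Residence time at which the marginal rate equals the average rate
    gain(t) / (t + travel), found by bisection. Requires travel > 0."""
    def excess(t):
        return marginal_rate(t, r0, tau) - gain(t, r0, tau) / (t + travel)

    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:  # marginal rate still above average: keep harvesting
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With these assumptions, calling `mvt_leave_time(1.0, 5.0, 2.0)` and `mvt_leave_time(1.0, 5.0, 10.0)` shows the classic prediction that longer travel times lead to longer patch residence.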
Affiliation(s)
- James Webb
  - Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
  - Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, USA
- Paul Steffan
  - Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Benjamin Y. Hayden
  - Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Daeyeol Lee
  - The Zanvyl Krieger Mind/Brain Institute, The Solomon H Snyder Department of Neuroscience, Department of Psychological and Brain Sciences, Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD, USA
- Caleb Kemere
  - Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
- Matthew McGinley
  - Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
  - Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, USA
  - Department of Electrical and Computer Engineering, Rice University, Houston, TX, USA
6. Nwakama CA, Durand-de Cuttoli R, Oketokoun ZM, Brown SO, Haller JE, Méndez A, Farshbaf MJ, Cho YZ, Ahmed S, Leng S, Ables JL, Sweis BM. Diabetes alters neuroeconomically dissociable forms of mental accounting. bioRxiv 2024:2024.01.04.574210. [PMID: 38260368 PMCID: PMC10802482 DOI: 10.1101/2024.01.04.574210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Those with diabetes mellitus are at high risk of developing psychiatric disorders, yet the link between hyperglycemia and alterations in motivated behavior has not been explored in detail. We characterized value-based decision-making behavior of a streptozocin-induced diabetic mouse model on a naturalistic neuroeconomic foraging paradigm called Restaurant Row. Mice made self-paced choices while on a limited time budget, accepting or rejecting reward offers as a function of cost (delays cued by tone pitch) and subjective value (flavors), tested daily in a closed-economy system across months. We found that streptozocin-treated mice disproportionately undervalued less-preferred flavors and inverted their meal-consumption patterns, shifting toward a more costly strategy that overprioritized high-value rewards. We discovered that these foraging behaviors were driven by impairments in multiple decision-making systems, including the ability to deliberate when engaged in conflict and to cache the value of the passage of time in the form of sunk costs. Surprisingly, diabetes-induced changes in behavior depended not only on the type of choice being made but also on the salience of reward scarcity in the environment. These findings suggest that complex relationships between glycemic regulation and dissociable valuation algorithms, which underlie unique cognitive heuristics and sensitivity to opportunity costs, can disrupt fundamentally distinct computational processes and could give rise to psychiatric vulnerabilities.
7. Barack DL, Parodi F, Ludwig V, Platt ML. Information gathering explains decision dynamics during human and monkey reward foraging. bioRxiv 2023:2023.10.14.562362. [PMID: 37905132 PMCID: PMC10614769 DOI: 10.1101/2023.10.14.562362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Foraging in humans and other animals requires a delicate balance between exploitation of current resources and exploration for new ones. The tendency to overharvest (lingering too long in depleting patches) is a routine behavioral deviation from predictions of optimal foraging theories. To characterize the computational mechanisms driving these deviations, we modeled foraging behavior using a virtual patch-leaving task with human participants and validated our findings in an analogous foraging task in two monkeys. Both humans and monkeys overharvested and stayed longer in patches with longer travel times compared to shorter ones. Critically, patch residence times in both species declined over the course of sessions, enhancing reward rates in humans. These decisions were best explained by a logistic transformation that integrated both current rewards and information about declining rewards. This parsimonious model demystifies both the occurrence and dynamics of overharvesting, highlighting the role of information gathering in foraging. Our findings provide insight into computational mechanisms shaped by ubiquitous foraging dilemmas, underscoring how behavioral modeling can reveal underlying motivations of seemingly irrational decisions.
8. Bukwich M, Campbell MG, Zoltowski D, Kingsbury L, Tomov MS, Stern J, Kim HR, Drugowitsch J, Linderman SW, Uchida N. Competitive integration of time and reward explains value-sensitive foraging decisions and frontal cortex ramping dynamics. bioRxiv 2023:2023.09.05.556267. [PMID: 37732217 PMCID: PMC10508756 DOI: 10.1101/2023.09.05.556267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
The ability to make advantageous decisions is critical for animals to ensure their survival. Patch foraging is a natural decision-making process in which animals decide when to leave a patch of depleting resources to search for a new one. To study the algorithmic and neural basis of patch foraging behavior in a controlled laboratory setting, we developed a virtual foraging task for head-fixed mice. Mouse behavior could be explained by ramp-to-threshold models integrating time and rewards antagonistically. Accurate behavioral modeling required inclusion of a slowly varying "patience" variable, which modulated sensitivity to time. To investigate the neural basis of this decision-making process, we performed dense electrophysiological recordings with Neuropixels probes broadly throughout frontal cortex and underlying subcortical areas. We found that decision variables from the reward integrator model were represented in neural activity, most robustly in frontal cortical areas. Regression modeling followed by unsupervised clustering identified a subset of neurons with ramping activity. These neurons' firing rates ramped up gradually in single trials over long time scales (up to tens of seconds), were inhibited by rewards, and were better described as being generated by a continuous ramp rather than a discrete stepping process. Together, these results identify reward integration via a continuous ramping process in frontal cortex as a likely candidate for the mechanism by which the mammalian brain solves patch foraging problems.
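A ramp-to-threshold rule of the kind described in this abstract, in which elapsed time pushes a decision variable up and each reward pushes it back down, can be sketched in a few lines of Python. This is an illustrative toy and not the authors' fitted model: the names `slope`, `reward_kick`, and `threshold`, the fixed time step, and the linear ramp are all assumptions, and the slowly varying "patience" variable is not modeled.

```python
def leave_time(reward_times, slope, reward_kick, threshold, dt=0.01, t_max=100.0):
    """Simulate one patch visit and return the time at which the decision
    variable first reaches `threshold`, or None if it never does by `t_max`.

    The variable ramps up at `slope` per second and drops by `reward_kick`
    at each time listed in `reward_times` (seconds since patch entry).
    """
    pending = sorted(reward_times)
    next_reward = 0
    x = 0.0
    t = 0.0
    while t < t_max:
        t += dt
        x += slope * dt  # time pushes toward leaving
        while next_reward < len(pending) and pending[next_reward] <= t:
            x -= reward_kick  # each reward pushes toward staying
            next_reward += 1
        if x >= threshold:
            return t
    return None
```

Under these assumptions, a visit with no rewards ends after roughly `threshold / slope` seconds, and every reward delays leaving by roughly `reward_kick / slope` seconds.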
Affiliation(s)
- Michael Bukwich
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138
  - Center for Brain Science, Harvard University, Cambridge, MA, 02138
  - Current address: Sainsbury Wellcome Centre, University College London, London, W1T 4JG, UK
- Malcolm G Campbell
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138
  - Center for Brain Science, Harvard University, Cambridge, MA, 02138
- David Zoltowski
  - Department of Statistics, Stanford University, Stanford, CA, 94305
  - Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, 94305
- Lyle Kingsbury
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138
  - Center for Brain Science, Harvard University, Cambridge, MA, 02138
- Momchil S Tomov
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138
  - Center for Brain Science, Harvard University, Cambridge, MA, 02138
  - Current address: Motional AD LLC, Boston, MA, 02210
- Joshua Stern
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138
  - Center for Brain Science, Harvard University, Cambridge, MA, 02138
- HyungGoo R Kim
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138
  - Center for Brain Science, Harvard University, Cambridge, MA, 02138
  - Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon 16419, Republic of Korea
  - Department of Biomedical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Jan Drugowitsch
  - Department of Neurobiology, Harvard Medical School, Boston, MA, 02115
- Scott W Linderman
  - Department of Statistics, Stanford University, Stanford, CA, 94305
  - Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, 94305
- Naoshige Uchida
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138
  - Center for Brain Science, Harvard University, Cambridge, MA, 02138
9. Gancarz AM, Mitchell SH, George AM, Martin CD, Turk MC, Bool HM, Aktar F, Kwarteng F, Palmer AA, Meyer PJ, Richards JB, Dietz DM, Ishiwari K. Reward maximization assessed using a sequential patch depletion task in a large sample of heterogeneous stock rats. Sci Rep 2023; 13:7027. [PMID: 37120610 PMCID: PMC10148848 DOI: 10.1038/s41598-023-34179-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 04/25/2023] [Indexed: 05/01/2023] [Open Access]
Abstract
Choice behavior requires animals to evaluate both short- and long-term advantages and disadvantages of all potential alternatives. Impulsive choice is traditionally measured in laboratory tasks by utilizing delay discounting (DD), a paradigm that offers a choice between a smaller immediate reward, or a larger more delayed reward. This study tested a large sample of Heterogeneous Stock (HS) male (n = 896) and female (n = 898) rats, part of a larger genetic study, to investigate whether measures of reward maximization overlapped with traditional models of delay discounting via the patch depletion model using a Sequential Patch Depletion procedure. In this task, rats were offered a concurrent choice between two water "patches" and could elect to "stay" in the current patch or "leave" for an alternative patch. Staying in the current patch resulted in decreasing subsequent reward magnitudes, whereas the choice to leave a patch was followed by a delay and a resetting to the maximum reward magnitude. Based on the delay in a given session, different visit durations were necessary to obtain the maximum number of rewards. Visit duration may be analogous to an indifference point in traditional DD tasks. Males and females did not significantly differ on traditional measures of DD (e.g. delay gradient; AUC). When examining measures of patch utilization, females made fewer patch changes at all delays and spent more time in the patch before leaving for the alternative patch compared to males. Consistent with this, there was some evidence that females deviated from reward maximization more than males. However, when controlling for body weight, females had a higher normalized rate of reinforcement than males. Measures of reward maximization were only weakly associated with traditional DD measures and may represent distinctive underlying processes. 
Taken together, female performance differed from male performance with regard to reward maximization in ways that were not observed using traditional measures of DD, suggesting that the patch depletion model was more sensitive to modest sex differences than traditional DD measures in a large sample of HS rats.
Affiliation(s)
- Amy M Gancarz
  - Department of Psychology, California State University, Bakersfield, Bakersfield, CA, 93311, USA
- Suzanne H Mitchell
  - Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, 97239, USA
  - Department of Psychiatry, Oregon Health & Science University, Portland, OR, 97239, USA
  - Oregon Institute for Occupational Health Sciences, Oregon Health & Science University, Portland, OR, 97239, USA
- Anthony M George
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
- Connor D Martin
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
- Marisa C Turk
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
- Heather M Bool
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
- Fahmida Aktar
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
- Francis Kwarteng
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
- Abraham A Palmer
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA
  - Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Paul J Meyer
  - Department of Psychology, University at Buffalo, Buffalo, NY, 14260, USA
- Jerry B Richards
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
- David M Dietz
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
- Keita Ishiwari
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY, 14203, USA
  - Department of Pharmacology and Toxicology, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, 14203, USA
10. Harhen NC, Bornstein AM. Overharvesting in human patch foraging reflects rational structure learning and adaptive planning. Proc Natl Acad Sci U S A 2023; 120:e2216524120. [PMID: 36961923 PMCID: PMC10068834 DOI: 10.1073/pnas.2216524120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 02/11/2023] [Indexed: 03/26/2023] [Open Access]
Abstract
Patch foraging presents a sequential decision-making problem widely studied across organisms: stay with a current option or leave it in search of a better alternative? Behavioral ecology has identified an optimal strategy for these decisions, but, across species, foragers systematically deviate from it, staying too long with an option or "overharvesting" relative to this optimum. Despite the ubiquity of this behavior, the mechanism underlying it remains unclear and an object of extensive investigation. Here, we address this gap by approaching foraging as both a decision-making and learning problem. Specifically, we propose a model in which foragers 1) rationally infer the structure of their environment and 2) use their uncertainty over the inferred structure representation to adaptively discount future rewards. We find that overharvesting can emerge from this rational statistical inference and uncertainty adaptation process. In a patch-leaving task, we show that human participants adapt their foraging to the richness and dynamics of the environment in ways consistent with our model. These findings suggest that definitions of optimal foraging could be extended by considering how foragers reduce and adapt to uncertainty over representations of their environment.
Affiliation(s)
- Nora C. Harhen
  - Department of Cognitive Sciences, University of California, Irvine, CA 92697
- Aaron M. Bornstein
  - Department of Cognitive Sciences, University of California, Irvine, CA 92697
  - Center for the Neurobiology of Learning and Memory, University of California, Irvine, CA 92697
11. Yáñez N, Bouzas A, Segura A. Effects of the Response Requirement on Rats’ Choice between Probabilistic Reinforcers. Psychological Record 2023. [DOI: 10.1007/s40732-023-00542-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
12. Levenstein D, Alvarez VA, Amarasingham A, Azab H, Chen ZS, Gerkin RC, Hasenstaub A, Iyer R, Jolivet RB, Marzen S, Monaco JD, Prinz AA, Quraishi S, Santamaria F, Shivkumar S, Singh MF, Traub R, Nadim F, Rotstein HG, Redish AD. On the Role of Theory and Modeling in Neuroscience. J Neurosci 2023; 43:1074-1088. [PMID: 36796842 PMCID: PMC9962842 DOI: 10.1523/jneurosci.1179-22.2022] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 06/16/2022] [Revised: 12/14/2022] [Accepted: 12/18/2022] [Indexed: 02/18/2023] [Open Access]
Abstract
In recent years, the field of neuroscience has gone through rapid experimental advances and a significant increase in the use of quantitative and computational methods. This growth has created a need for clearer analyses of the theory and modeling approaches used in the field. This issue is particularly complex in neuroscience because the field studies phenomena that cross a wide range of scales and often require consideration at varying degrees of abstraction, from precise biophysical interactions to the computations they implement. We argue that a pragmatic perspective of science, in which descriptive, mechanistic, and normative models and theories each play a distinct role in defining and bridging levels of abstraction, will facilitate neuroscientific practice. This analysis leads to methodological suggestions, including selecting a level of abstraction that is appropriate for a given problem, identifying transfer functions to connect models and data, and the use of models themselves as a form of experiment.
Collapse
Affiliation(s)
- Daniel Levenstein
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
| | - Veronica A Alvarez
- Laboratory on Neurobiology of Compulsive Behaviors, National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, Bethesda, Maryland 20892
| | - Asohan Amarasingham
- Departments of Mathematics and Biology, City College and the Graduate Center, City University of New York, New York, New York 10032
| | - Habiba Azab
- Department of Neuroscience, Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, Minnesota 55455
| | - Zhe S Chen
- Department of Psychiatry, Neuroscience & Physiology, New York University School of Medicine, New York, New York, 10016
| | - Richard C Gerkin
- School of Life Sciences, Arizona State University, Tempe, Arizona 85281
| | - Andrea Hasenstaub
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, California 94115
| | | | - Renaud B Jolivet
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, The Netherlands
| | - Sarah Marzen
- W. M. Keck Science Department, Pitzer, Scripps, and Claremont McKenna Colleges, Claremont, California 91711
| | - Joseph D Monaco
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland 21218
| | - Astrid A Prinz
- Department of Biology, Emory University, Atlanta, Georgia 30322
| | - Salma Quraishi
- Neuroscience, Developmental and Regenerative Biology Department, University of Texas at San Antonio, San Antonio, Texas 78249
| | - Fidel Santamaria
- Neuroscience, Developmental and Regenerative Biology Department, University of Texas at San Antonio, San Antonio, Texas 78249
| | - Sabyasachi Shivkumar
- Brain and Cognitive Sciences, University of Rochester, Rochester, New York 14627
| | - Matthew F Singh
- Department of Psychological & Brain Sciences, Department of Electrical & Systems Engineering, Washington University in St. Louis, St. Louis, Missouri 63112
| | - Roger Traub
- IBM T.J. Watson Research Center, AI Foundations, Yorktown Heights, New York 10598
| | - Farzan Nadim
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, California 94115
| | - Horacio G Rotstein
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, California 94115
| | - A David Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota 55455
|
13
|
Gancarz AM, Mitchell SH, George AM, Martin CD, Turk MC, Bool HM, Aktar F, Kwarteng F, Palmer AA, Meyer PJ, Richards JB, Dietz DM, Isiwari K. Reward Maximization Assessed Using a Sequential Patch Depletion Task in a Large Sample of Heterogeneous Stock Rats. RESEARCH SQUARE 2023:rs.3.rs-2525080. [PMID: 36778344 PMCID: PMC9915773 DOI: 10.21203/rs.3.rs-2525080/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Choice behavior requires animals to evaluate both short- and long-term advantages and disadvantages of all potential alternatives. Impulsive choice is traditionally measured in laboratory tasks by utilizing delay discounting (DD), a paradigm that offers a choice between a smaller immediate reward, or a larger more delayed reward. This study tested a large sample of Heterogeneous Stock (HS) male (n = 896) and female (n = 898) rats, part of a larger genetic study, to investigate whether measures of reward maximization overlapped with traditional models of delay discounting via the patch depletion model using a Sequential Patch Depletion procedure. In this task, rats were offered a concurrent choice between two water "patches" and could elect to "stay" in the current patch or "leave" for an alternative patch. Staying in the current patch resulted in decreasing subsequent reward magnitudes, whereas the choice to leave a patch was followed by a delay and a resetting to the maximum reward magnitude. Based on the delay in a given session, different visit durations were necessary to obtain the maximum number of rewards. Visit duration may be analogous to an indifference point in traditional DD tasks. While differences in traditional DD measures (e.g., delay gradient) have been detected between males and females, these effects were small and inconsistent. However, when examining measures of reward maximization, females made fewer patch changes at all delays and spent more time in the patch before leaving for the alternative patch compared to males. This pattern of choice resulted in males having a higher rate of reinforcement than females. Consistent with this, there was some evidence that females deviated from the optimal more, leading to less reward. Measures of reward maximization were only weakly associated with traditional DD measures and may represent distinctive underlying processes. 
Taken together, females' performance differed from males' with regard to reward maximization in ways that were not observed using traditional measures of DD, suggesting that the patch depletion model was more sensitive to modest sex differences than traditional DD measures in a large sample of HS rats.
|
14
|
Lind EB, Sweis BM, Asp AJ, Esguerra M, Silvis KA, Redish AD, Thomas MJ. A quadruple dissociation of reward-related behaviour in mice across excitatory inputs to the nucleus accumbens shell. Commun Biol 2023; 6:119. [PMID: 36717646 PMCID: PMC9886947 DOI: 10.1038/s42003-023-04429-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/25/2021] [Accepted: 01/05/2023] [Indexed: 02/01/2023] Open
Abstract
The nucleus accumbens shell (NAcSh) is critically important for reward valuations, yet it remains unclear how valuation information is integrated in this region to drive behaviour during reinforcement learning. Using an optogenetic spatial self-stimulation task in mice, here we show that contingent activation of different excitatory inputs to the NAcSh changes the expression of different reward-related behaviours. Our data indicate that medial prefrontal inputs support place preference via repeated actions, ventral hippocampal inputs consistently promote place preferences, basolateral amygdala inputs produce modest place preferences but as a byproduct of increased sensitivity to time investments, and paraventricular inputs reduce place preferences yet do not produce full avoidance behaviour. These findings suggest that each excitatory input provides distinct information to the NAcSh, and we propose that this reflects the reinforcement of different credit assignment functions. Our finding of a quadruple dissociation of NAcSh input-specific behaviours provides insights into how types of information carried by distinct inputs to the NAcSh could be integrated to help drive reinforcement learning and situationally appropriate behavioural responses.
Affiliation(s)
- Erin B Lind
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
- Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
| | - Brian M Sweis
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
- Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
- Department of Psychiatry, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY, 10029, USA
| | - Anders J Asp
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
- Rehabilitation Medicine Research Center, Department of Physical Medicine and Rehabilitation, Mayo Clinic, 200 First St SW, Rochester, MN, 55905, USA
| | - Manuel Esguerra
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
- Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
| | - Keelia A Silvis
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
- Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
| | - A David Redish
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA
- Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA
| | - Mark J Thomas
- Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455, USA.
- Medical Discovery Team on Addiction, University of Minnesota, 3-432 McGuire Translational Research Facility, 2001 6th St SE, Minneapolis, MN, 55455, USA.
|
15
|
Redish AD, Abram SV, Cunningham PJ, Duin AA, Durand-de Cuttoli R, Kazinka R, Kocharian A, MacDonald AW, Schmidt B, Schmitzer-Torbert N, Thomas MJ, Sweis BM. Sunk cost sensitivity during change-of-mind decisions is informed by both the spent and remaining costs. Commun Biol 2022; 5:1337. [PMID: 36474069 PMCID: PMC9726928 DOI: 10.1038/s42003-022-04235-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/25/2022] [Accepted: 11/08/2022] [Indexed: 12/12/2022] Open
Abstract
Sunk cost sensitivity describes escalating decision commitment with increased spent resources. On neuroeconomic foraging tasks, mice, rats, and humans show similar escalations from sunk costs while quitting an ongoing countdown to reward. In a new analysis taken across computationally parallel foraging tasks across species and laboratories, we find that these behaviors primarily occur on choices that are economically inconsistent with the subject's other choices, and that they reflect not only the time spent, but also the time remaining, suggesting that these are change-of-mind re-evaluation processes. Using a recently proposed change-of-mind drift-diffusion model, we find that the sunk cost sensitivity in this model arises from decision-processes that directly take into account the time spent (costs sunk). Applying these new insights to experimental data, we find that sensitivity to sunk costs during re-evaluation decisions depends on the information provided to the subject about the time spent and the time remaining.
Affiliation(s)
- A. David Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455 USA
| | - Samantha V. Abram
- San Francisco Veterans Affairs Medical Center, San Francisco, CA 94121 USA
| | - Paul J. Cunningham
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455 USA
| | - Anneke A. Duin
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455 USA
- Present Address: Epic Systems, 1979 Milky Way, Verona, WI 53593 USA
| | - Romain Durand-de Cuttoli
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA
| | - Rebecca Kazinka
- Department of Psychiatry and Behavioral Sciences, University of Minnesota, Minneapolis, MN 55454 USA
| | - Adrina Kocharian
- Graduate Program in Neuroscience and Medical Scientist Training Program, University of Minnesota, Minneapolis, MN 55455 USA
| | - Angus W. MacDonald
- Department of Psychology, University of Minnesota, Minneapolis, MN 55455 USA
| | - Brandy Schmidt
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455 USA
| | - Neil Schmitzer-Torbert
- Department of Psychology, Wabash College, Crawfordsville, IN 47933 USA
| | - Mark J. Thomas
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455 USA
| | - Brian M. Sweis
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA
|
16
|
Durand-de Cuttoli R, Martínez-Rivera FJ, Li L, Minier-Toribio A, Holt LM, Cathomas F, Yasmin F, Elhassa SO, Shaikh JF, Ahmed S, Russo SJ, Nestler EJ, Sweis BM. Distinct forms of regret linked to resilience versus susceptibility to stress are regulated by region-specific CREB function in mice. SCIENCE ADVANCES 2022; 8:eadd5579. [PMID: 36260683 PMCID: PMC9581472 DOI: 10.1126/sciadv.add5579] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/10/2022] [Revised: 06/23/2022] [Accepted: 08/30/2022] [Indexed: 05/31/2023]
Abstract
Regret describes recognizing alternative actions could have led to better outcomes. It remains unclear whether regret derives from generalized mistake appraisal or instead comprises dissociable, action-specific processes. Using a neuroeconomic task, we found that mice were sensitive to fundamentally distinct types of regret following exposure to chronic social defeat stress or manipulations of CREB, a transcription factor implicated in stress action. Bias to make compensatory decisions after rejecting high-value offers (regret type I) was unique to stress-susceptible mice. Bias following the converse operation, accepting low-value offers (regret type II), was enhanced in stress-resilient mice and absent in stress-susceptible mice. CREB function in either the prefrontal cortex or nucleus accumbens was required to suppress regret type I but bidirectionally regulated regret type II. We provide insight into how maladaptive stress response traits relate to distinct forms of counterfactual thinking, which could steer therapy for mood disorders, such as depression, toward circuit-specific computations through a careful description of decision narrative.
Affiliation(s)
- Romain Durand-de Cuttoli
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Freddyson J. Martínez-Rivera
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Long Li
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Angélica Minier-Toribio
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Leanne M. Holt
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Flurin Cathomas
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Farzana Yasmin
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Salma O. Elhassa
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Jasmine F. Shaikh
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Sanjana Ahmed
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Scott J. Russo
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Eric J. Nestler
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Brian M. Sweis
- Nash Family Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
|
17
|
Kane GA, James MH, Shenhav A, Daw ND, Cohen JD, Aston-Jones G. Rat Anterior Cingulate Cortex Continuously Signals Decision Variables in a Patch Foraging Task. J Neurosci 2022; 42:5730-5744. [PMID: 35688627 PMCID: PMC9302469 DOI: 10.1523/jneurosci.1940-21.2022] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/26/2021] [Revised: 04/24/2022] [Accepted: 04/28/2022] [Indexed: 01/22/2023] Open
Abstract
In patch foraging tasks, animals must decide whether to remain with a depleting resource or to leave it in search of a potentially better source of reward. In such tasks, animals consistently follow the general predictions of optimal foraging theory (the marginal value theorem; MVT): to leave a patch when the reward rate in the current patch depletes to the average reward rate across patches. Prior studies implicate an important role for the anterior cingulate cortex (ACC) in foraging decisions based on MVT: within single trials, ACC activity increases immediately preceding foraging decisions, and across trials, these dynamics are modulated as the value of staying in the patch depletes to the average reward rate. Here, we test whether these activity patterns reflect dynamic encoding of decision-variables and whether these signals are directly involved in decision-making. We developed a leaky accumulator model based on the MVT that generates estimates of decision variables within and across trials, and tested model predictions against ACC activity recorded from male rats performing a patch foraging task. Model-predicted changes in MVT decision variables closely matched rat ACC activity. Next, we pharmacologically inactivated ACC in male rats to test the contribution of these signals to decision-making. ACC inactivation had a profound effect on rats' foraging decisions and response times (RTs) yet rats still followed the MVT decision rule. These findings indicate that the ACC encodes foraging-related variables for reasons unrelated to patch-leaving decisions.
SIGNIFICANCE STATEMENT The ability to make adaptive patch-foraging decisions, to remain with a depleting resource or search for better alternatives, is critical to animal well-being. Previous studies have found that anterior cingulate cortex (ACC) activity is modulated at different points in the foraging decision process, raising questions about whether the ACC guides ongoing decisions or serves a more general purpose of regulating cognitive control. To investigate the function of the ACC in foraging, the present study developed a dynamic model of behavior and neural activity, and tested model predictions using recordings and inactivation of ACC. Findings revealed that ACC continuously signals decision variables but that these signals are more likely used to monitor and regulate ongoing processes than to guide foraging decisions.
Affiliation(s)
- Gary A Kane
- Department of Psychology and Neuroscience Institute, Princeton University, Princeton, New Jersey 08544
- Center for Systems Neuroscience, Boston University, Boston, Massachusetts 02155
| | - Morgan H James
- Department of Psychiatry, Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey 08854
- Brain Health Institute, Rutgers University, Piscataway, New Jersey 08854
| | - Amitai Shenhav
- Department of Cognitive, Linguistic, & Psychological Sciences and Carney Institute for Brain Science, Brown University, Providence, Rhode Island 02912
| | - Nathaniel D Daw
- Department of Psychology and Neuroscience Institute, Princeton University, Princeton, New Jersey 08544
| | - Jonathan D Cohen
- Department of Psychology and Neuroscience Institute, Princeton University, Princeton, New Jersey 08544
| | - Gary Aston-Jones
- Brain Health Institute, Rutgers University, Piscataway, New Jersey 08854
|
18
|
Toro-Serey C, Kane GA, McGuire JT. Choices favoring cognitive effort in a foraging environment decrease when multiple forms of effort and delay are interleaved. COGNITIVE, AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2022; 22:509-532. [PMID: 34850362 DOI: 10.3758/s13415-021-00972-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 11/09/2021] [Indexed: 06/13/2023]
Abstract
Cognitive and physical effort are typically regarded as costly, but demands for effort also seemingly boost the appeal of prospects under certain conditions. One contextual factor that might influence choices for or against effort is the mix of different types of demand a decision maker encounters in a given environment. In two foraging experiments, participants encountered prospective rewards that required equally long intervals of cognitive effort, physical effort, or unfilled delay. Monetary offers varied per trial, and the two experiments differed in whether the type of effort or delay cost was the same on every trial, or varied across trials. When each participant faced only one type of cost, cognitive effort persistently produced the highest acceptance rate compared to trials with an equivalent period of either physical effort or unfilled delay. We theorized that if cognitive effort were intrinsically rewarding, we would observe the same pattern of preferences when participants foraged for varying cost types in addition to rewards. Contrary to this prediction, in the second experiment, an initially higher acceptance rate for cognitive effort trials disappeared over time amid an overall decline in acceptance rates as participants gained experience with all three conditions. Our results indicate that cognitive demands may reduce the discounting effect of delays, but not because decision makers assign intrinsic value to cognitive effort. Rather, the results suggest that a cognitive effort requirement might influence contextual factors such as subjective delay duration estimates, which can be recalibrated if multiple forms of demand are interleaved.
Affiliation(s)
- Claudio Toro-Serey
- Department of Psychological and Brain Sciences, Boston University, Boston, MA, USA.
- McLean Hospital, Harvard Medical School, 115 Mill St., MRC 3, Belmont, MA, 02478, USA.
| | - Gary A Kane
- Department of Psychological and Brain Sciences, Boston University, Boston, MA, USA
- Center for Systems Neuroscience, Boston University, 677 Bacon St., Rm 212, Boston, MA, 02215, USA
| | - Joseph T McGuire
- Department of Psychological and Brain Sciences, Boston University, Boston, MA, USA
- Center for Systems Neuroscience, Boston University, 677 Bacon St., Rm 212, Boston, MA, 02215, USA
|
19
|
The time, the path, its length and strenuousness in maze learning. PSIHOLOGIJA 2022. [DOI: 10.2298/psi210301005k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022] Open
Abstract
Previous findings show that rats in a maze tend to choose the shortest path to reach food. But it is not clear whether this choice is based solely on path length or on some other factors. The aim of this experiment was to investigate which factor dominates behavior in a maze: path (longer and shorter), time (longer and shorter), or effort (more or less strenuous). The experiment involved 40 mice (4 groups) learning a maze with two paths. Each group went through only one of the situations, in which we kept one factor constant on the two paths while the remaining two factors were varied. Only in the fourth situation were all factors equalized. The results show that there is a statistically significant difference in maze path preference among the four situations. Preference between the paths is such that mice always choose paths requiring less effort.
|
20
|
Predictive and motivational factors influencing anticipatory contrast: A comparison of contextual and gustatory predictors in food restricted and free-fed rats. Physiol Behav 2021; 242:113603. [PMID: 34562439 PMCID: PMC8593211 DOI: 10.1016/j.physbeh.2021.113603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2021] [Revised: 08/11/2021] [Accepted: 09/19/2021] [Indexed: 11/23/2022]
Abstract
Using an anticipatory negative contrast (ANC) paradigm, food restricted animals can act selectively in their eating behavior. Contextual and gustatory predictors in a within-subject design are sufficient for anticipatory negative contrast development. Changes in reward palatability may underlie contextually-driven anticipatory negative contrast. An increase in premature port entries to the unavailable sipper – a second measure of ANC – in all groups reveals a direct influence of response competition on ANC development.
In anticipation of palatable food, rats can learn to restrict consumption of a less rewarding food type resulting in an increased consumption of the preferred food when it is made available. This construct is known as anticipatory negative contrast (ANC) and can help elucidate the processes that underlie binge-like behavior as well as self-control in rodent motivation models. In the current investigation we aimed to shed light on the ability of distinct predictors of a preferred food choice to generate contrast effects and the motivational processes that underlie this behavior. Using a novel set of rewarding solutions, we directly compared contextual and gustatory ANC predictors in both food restricted and free-fed Sprague-Dawley rats. Our results indicate that, despite being food restricted, rats are selective in their eating behavior and show strong contextually-driven ANC similar to free-fed animals. These differences mirrored changes in palatability for the less preferred solution across the different sessions as measured by lick microstructure analysis. In contrast to previous research, predictive cues in both food restricted and free-fed rats were sufficient for ANC to develop although flavor-driven ANC did not relate to a corresponding change in lick patterning. These differences in the lick microstructure between context- and flavor-driven ANC indicate that the motivational processes underlying ANC generated by the two predictor types are distinct. Moreover, an increase in premature port entries to the unavailable sipper – a second measure of ANC – in all groups reveals a direct influence of response competition on ANC development.
|
21
|
Khalighinejad N, Garrett N, Priestley L, Lockwood P, Rushworth MFS. A habenula-insular circuit encodes the willingness to act. Nat Commun 2021; 12:6329. [PMID: 34732720 PMCID: PMC8566457 DOI: 10.1038/s41467-021-26569-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/06/2020] [Accepted: 10/07/2021] [Indexed: 11/08/2022] Open
Abstract
The decision that it is worth doing something rather than nothing is a core yet understudied feature of voluntary behaviour. Here we study "willingness to act", the probability of making a response given the context. Human volunteers encountered opportunities to make effortful actions in order to receive rewards, while watching a movie inside a 7 T MRI scanner. Reward and other context features determined willingness-to-act. Activity in the habenula tracked trial-by-trial variation in participants' willingness-to-act. The anterior insula encoded individual environment features that determined this willingness. We identify a multi-layered network in which contextual information is encoded in the anterior insula, converges on the habenula, and is then transmitted to the supplementary motor area, where the decision is made to either act or refrain from acting via the nigrostriatal pathway.
Affiliation(s)
- Nima Khalighinejad
- Wellcome Centre for Integrative Neuroimaging, Department of Experimental Psychology, University of Oxford, Oxford, OX1 3SR, UK.
| | - Neil Garrett
- Wellcome Centre for Integrative Neuroimaging, Department of Experimental Psychology, University of Oxford, Oxford, OX1 3SR, UK
- School of Psychology, University of East Anglia, Norwich, UK
| | - Luke Priestley
- Wellcome Centre for Integrative Neuroimaging, Department of Experimental Psychology, University of Oxford, Oxford, OX1 3SR, UK
| | - Patricia Lockwood
- Wellcome Centre for Integrative Neuroimaging, Department of Experimental Psychology, University of Oxford, Oxford, OX1 3SR, UK
- Centre for Human Brain Health, School of Psychology, University of Birmingham, Birmingham, UK
| | - Matthew F S Rushworth
- Wellcome Centre for Integrative Neuroimaging, Department of Experimental Psychology, University of Oxford, Oxford, OX1 3SR, UK
|
22
|
Choice history effects in mice and humans improve reward harvesting efficiency. PLoS Comput Biol 2021; 17:e1009452. [PMID: 34606493 PMCID: PMC8516315 DOI: 10.1371/journal.pcbi.1009452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2021] [Revised: 10/14/2021] [Accepted: 09/15/2021] [Indexed: 12/04/2022] Open
Abstract
Choice history effects describe how future choices depend on the history of past choices. In experimental tasks this is typically framed as a bias because it often diminishes the experienced reward rates. However, in natural habitats, choices made in the past constrain choices that can be made in the future. For foraging animals, the probability of earning a reward in a given patch depends on the degree to which the animals have exploited the patch in the past. One problem with many experimental tasks that show choice history effects is that such tasks artificially decouple choice history from its consequences on reward availability over time. To circumvent this, we use a variable interval (VI) reward schedule that reinstates a more natural contingency between past choices and future reward availability. By examining the behavior of optimal agents in the VI task we discover that choice history effects observed in animals serve to maximize reward harvesting efficiency. We further distil the function of choice history effects by manipulating first- and second-order statistics of the environment. We find that choice history effects primarily reflect the growth rate of the reward probability of the unchosen option, whereas reward history effects primarily reflect environmental volatility. Based on observed choice history effects in animals, we develop a reinforcement learning model that explicitly incorporates choice history over multiple time scales into the decision process, and we assess its predictive adequacy in accounting for the associated behavior. We show that this new variant, known as the double trace model, has a higher performance in predicting choice data, and shows near optimal reward harvesting efficiency in simulated environments. These results suggest that choice history effects may be adaptive for natural contingencies between consumption and reward availability. This concept lends credence to a normative account of choice history effects that extends beyond its description as a bias.

Animals foraging for food in natural habitats compete to obtain better quality food patches. To achieve this goal, animals can rely on memory and choose the same patches that have provided higher quality of food in the past. However, in natural habitats simply identifying better food patches may not be sufficient to successfully compete with their conspecifics, as food resources can grow over time. Therefore, it makes sense to visit from time to time those patches that were associated with lower food quality in the past. This requires optimally foraging animals to keep in memory not only which food patches provided the best food quality, but also which food patches they visited recently. To see if animals track their history of visits and use it to maximize food harvesting efficiency, we subjected them to experimental conditions that mimicked natural foraging behavior. In our behavioral tasks, we replaced food foraging behavior with a two-choice task that provided rewards to mice and humans. By developing a new computational model and subjecting animals to various behavioral manipulations, we demonstrate that keeping a memory of past visits helps the animals to optimize the efficiency with which they can harvest rewards.
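The coupling between choice history and reward availability under a variable interval schedule can be made concrete with a short sketch. It assumes a discrete-trial schedule in which each option is armed independently with a fixed probability per trial and an armed reward waits until it is collected. The baiting rule, the function name `reward_probability`, and the parameter `p_arm` are illustrative assumptions, not details taken from the paper.

```python
def reward_probability(p_arm: float, trials_since_chosen: int) -> float:
    """Probability that an option holds a reward when it is next chosen.

    Assumption (hypothetical schedule): the option is armed independently
    with probability p_arm on every trial, and an armed reward stays
    available until the option is chosen.
    """
    if not 0.0 <= p_arm <= 1.0:
        raise ValueError("p_arm must lie in [0, 1]")
    if trials_since_chosen < 1:
        raise ValueError("trials_since_chosen must be at least 1")
    # The option is empty only if arming failed on every intervening trial.
    return 1.0 - (1.0 - p_arm) ** trials_since_chosen


# Reward probability of an option left unchosen for 1 to 5 trials.
growth = [reward_probability(0.2, n) for n in range(1, 6)]
```

Under these assumptions the reward probability of the unchosen option rises with every trial it is ignored, which is why an agent that remembers how recently it visited each option can harvest more than one that tracks reward history alone.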

23
Ksander J, Katz DB, Miller P. A model of naturalistic decision making in preference tests. PLoS Comput Biol 2021; 17:e1009012. [PMID: 34555012 PMCID: PMC8491944 DOI: 10.1371/journal.pcbi.1009012] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/19/2021] [Revised: 10/05/2021] [Accepted: 09/10/2021] [Indexed: 11/30/2022]
Abstract
Decisions as to whether to continue with an ongoing activity or to switch to an alternative are a constant in an animal’s natural world, and in particular underlie foraging behavior and performance in food preference tests. Stimuli experienced by the animal both impact the choice and are themselves impacted by the choice, in a dynamic back and forth. Here, we present model neural circuits, based on spiking neurons, in which the choice to switch away from ongoing behavior instantiates this back and forth, arising as a state transition in neural activity. We analyze two classes of circuit, which differ in whether state transitions result from a loss of hedonic input from the stimulus (an “entice to stay” model) or from aversive stimulus-input (a “repel to leave” model). In both classes of model, we find that the mean time spent sampling a stimulus decreases with increasing value of the alternative stimulus, a fact that we linked to the inclusion of depressing synapses in our model. The competitive interaction is much greater in “entice to stay” model networks, which has qualitative features of the marginal value theorem, and thereby provides a framework for optimal foraging behavior. We offer suggestions as to how our models could be discriminatively tested through the analysis of electrophysiological and behavioral data.

Many decisions are of the ilk of whether to continue sampling a stimulus or to switch to an alternative, a key feature of foraging behavior. We produce two classes of model for such stay-switch decisions, which differ in how decisions to switch stimuli can arise. In an “entice-to-stay” model, a reduction in the necessary positive stimulus input causes switching decisions. In a “repel-to-leave” model, a rise in aversive stimulus input produces a switch decision. We find that in tasks where the sampling of one stimulus follows another, adaptive biological processes arising from a highly hedonic stimulus can reduce the time spent at the following stimulus, by up to ten-fold in the “entice-to-stay” models. Along with potentially observable behavioral differences that could distinguish the classes of networks, we also found signatures in neural activity, such as oscillation of neural firing rates and a rapid change in rates preceding the time of choice to leave a stimulus. In summary, our model findings lead to testable predictions and suggest a neural circuit-based framework for explaining foraging choices.

Affiliation(s)
- John Ksander
  - Volen National Center for Complex Systems, Brandeis University, Waltham, Massachusetts, United States of America
  - Department of Psychology, Brandeis University, Waltham, Massachusetts, United States of America
- Donald B. Katz
  - Volen National Center for Complex Systems, Brandeis University, Waltham, Massachusetts, United States of America
  - Department of Psychology, Brandeis University, Waltham, Massachusetts, United States of America
- Paul Miller
  - Volen National Center for Complex Systems, Brandeis University, Waltham, Massachusetts, United States of America
  - Department of Biology, Brandeis University, Waltham, Massachusetts, United States of America

24
Delays to Reward Delivery Enhance the Preference for an Initially Less Desirable Option: Role for the Basolateral Amygdala and Retrosplenial Cortex. J Neurosci 2021; 41:7461-7478. [PMID: 34315810 DOI: 10.1523/jneurosci.0438-21.2021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/02/2021] [Revised: 07/12/2021] [Accepted: 07/16/2021] [Indexed: 11/21/2022]
Abstract
Temporal costs influence reward-based decisions. This is commonly studied in temporal discounting tasks that involve choosing between cues signaling an imminent reward option or a delayed reward option. However, it is unclear whether the temporal delay before a reward can alter the value of that option. To address this, we identified the relative preference between different flavored rewards during a free-feeding test using male and female rats. Animals underwent training where either the initial preferred or the initial less preferred reward was delivered noncontingently. By manipulating the intertrial interval during training sessions, we could determine whether temporal delays impact reward preference in a subsequent free-feeding test. Rats maintained their initial preference if the same delays were used across all training sessions. When the initial less preferred option was delivered after short delays (high reward rate) and the initial preferred option was delivered after long delays (low reward rate), rats expectedly increased their preference for the initial less desirable option. However, rats also increased their preference for the initial less desirable option under the opposite training contingencies: delivering the initial less preferred reward after long delays and the initial preferred reward after short delays. These data suggest that sunk temporal costs enhance the preference for a less desirable reward option. Pharmacological and lesion experiments were performed to identify the neural systems responsible for this behavioral phenomenon. Our findings demonstrate the basolateral amygdala and retrosplenial cortex are required for temporal delays to enhance the preference for an initially less desirable reward.

SIGNIFICANCE STATEMENT: The goal of this study was to determine how temporal delays influence reward preference. We demonstrate that delivering an initially less desirable reward after long delays subsequently increases the consumption and preference for that reward. Furthermore, we identified the basolateral amygdala and the retrosplenial cortex as essential nuclei for mediating the change in reward preference elicited by sunk temporal costs.

25
Kilpatrick ZP, Davidson JD, El Hady A. Uncertainty drives deviations in normative foraging decision strategies. J R Soc Interface 2021; 18:20210337. [PMID: 34255987 PMCID: PMC8277480 DOI: 10.1098/rsif.2021.0337] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022]
Abstract
Nearly all animals forage to acquire energy for survival through efficient search and resource harvesting. Patch exploitation is a canonical foraging behaviour, but there is a need for more tractable and understandable mathematical models describing how foragers deal with uncertainty. To provide such a treatment, we develop a normative theory of patch foraging decisions, proposing mechanisms by which foraging behaviours emerge in the face of uncertainty. Our model foragers statistically and sequentially infer patch resource yields using Bayesian updating based on their resource encounter history. A decision to leave a patch is triggered when the certainty of the patch type or the estimated yield of the patch falls below a threshold. The time scale over which uncertainty in resource availability persists strongly impacts behavioural variables like patch residence times and decision rules determining patch departures. When patch depletion is slow, as in habitat selection, departures are characterized by a reduction of uncertainty, suggesting that the forager resides in a low-yielding patch. Uncertainty leads patch-exploiting foragers to overharvest (underharvest) patches with initially low (high) resource yields in comparison with predictions of the marginal value theorem. These results extend optimal foraging theory and motivate a variety of behavioural experiments investigating patch foraging behaviour.
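The inference-and-threshold rule described in this abstract can be illustrated with a minimal sketch. It assumes only two patch types with known per-step encounter probabilities and a departure rule on the posterior probability that the patch is the high-yield type. The Bernoulli observation model, the function names, and every parameter value are hypothetical choices for illustration and are not the authors' formulation.

```python
def update_belief(p_high: float, encountered: bool,
                  q_high: float = 0.3, q_low: float = 0.05) -> float:
    """One Bayesian update of P(patch is high-yield).

    q_high and q_low are assumed per-step probabilities of encountering a
    resource in a high-yield and a low-yield patch (hypothetical values).
    """
    like_high = q_high if encountered else 1.0 - q_high
    like_low = q_low if encountered else 1.0 - q_low
    numerator = like_high * p_high
    return numerator / (numerator + like_low * (1.0 - p_high))


def residence_time(encounters, prior: float = 0.5,
                   threshold: float = 0.2) -> int:
    """Number of steps spent in the patch before departure.

    The forager leaves at the first step where its belief that the patch is
    high-yield falls below the threshold. If that never happens, the length
    of the encounter sequence is returned.
    """
    p_high = prior
    for step, encountered in enumerate(encounters, start=1):
        p_high = update_belief(p_high, encountered)
        if p_high < threshold:
            return step
    return len(encounters)


# A patch that never pays out is abandoned sooner than one that pays once.
empty_patch = residence_time([False] * 30)
one_early_find = residence_time([True] + [False] * 29)
```

In this toy version a single early encounter keeps the forager in the patch for longer, which gives some intuition for how a noisy encounter history can shift residence times away from a fixed-rate departure rule.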

Affiliation(s)
- Zachary P Kilpatrick
  - Department of Applied Mathematics, University of Colorado, Boulder, CO 80309, USA
  - Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, CO, USA
- Jacob D Davidson
  - Department of Collective Behaviour, Max Planck Institute of Animal Behavior, 78464 Konstanz, Germany
  - Department of Biology, University of Konstanz, 78464 Konstanz, Germany
  - Centre for the Advanced Study of Collective Behaviour, University of Konstanz, 78464 Konstanz, Germany
- Ahmed El Hady
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08540, USA

26
Kazinka R, MacDonald AW, Redish AD. Sensitivity to Sunk Costs Depends on Attention to the Delay. Front Psychol 2021; 12:604843. [PMID: 33692720 PMCID: PMC7937795 DOI: 10.3389/fpsyg.2021.604843] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/10/2020] [Accepted: 01/27/2021] [Indexed: 11/24/2022]
Abstract
In the WebSurf task, humans forage for videos paying costs in terms of wait times on a time-limited task. A variant of the task in which demands during the wait time were manipulated revealed the role of attention in susceptibility to sunk costs. Consistent with parallel tasks in rodents, previous studies have found that humans (undergraduates measured in lab) preferred shorter delays, but waited longer for more preferred videos, suggesting that they were treating the delays economically. In an Amazon Mechanical Turk (mTurk) sample, we replicated these predicted economic behaviors for a majority of participants. In the lab, participants showed susceptibility to sunk costs in this task, basing their decisions in part on time they have already waited, which we also observed in the subset of the mTurk sample that behaved economically. In another version of the task, we added an attention check to the wait phase of the delay. While that attention check further increased the proportion of subjects with predicted economic behaviors, it also removed the susceptibility to sunk costs. These findings have important implications for understanding how cognitive processes, such as the deployment of attention, are key to driving re-evaluation and susceptibility to sunk costs.

Affiliation(s)
- Rebecca Kazinka
  - Graduate Program in Clinical Science and Psychopathology Research, University of Minnesota, Minneapolis, MN, United States
- Angus W. MacDonald
  - Psychology Department, University of Minnesota, Minneapolis, MN, United States
- A. David Redish
  - Neuroscience Department, University of Minnesota, Minneapolis, MN, United States

27
Schneider NA, Ballintyn B, Katz D, Lisman J, Pi HJ. Parametric shift from rational to irrational decisions in mice. Sci Rep 2021; 11:480. [PMID: 33436782 PMCID: PMC7803778 DOI: 10.1038/s41598-020-79949-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/26/2020] [Accepted: 12/08/2020] [Indexed: 11/09/2022]
Abstract
In the classical view of economic choices, subjects make rational decisions evaluating the costs and benefits of options in order to maximize their overall income. Nonetheless, subjects often fail to reach optimal outcomes. The overt value of an option drives the direction of decisions, but covert factors such as emotion and sensitivity to sunk cost are thought to drive the observed deviations from optimality. Many questions remain to be answered as to (1) which contexts contribute the most to deviation from an optimal solution; and (2) the extent of these effects. In order to tackle these questions, we devised a decision-making task for mice, in which cost and benefit parameters could be independently and flexibly adjusted and for which a tractable optimal solution was known. Comparing mouse behavior with this optimal solution across parameter settings revealed that the factor most strongly contributing to suboptimal performance was the cost parameter. The quantification of sensitivity to sunk cost, a covert factor implicated in our task design, revealed it as another contributor to reduced optimality. In one condition where the large reward option was particularly unattractive and the small reward cost was low, the sensitivity to sunk cost and the cost-led suboptimality almost vanished. In this regime and this regime only, mice could be viewed as close to rational (here, 'rational' refers to a state in which an animal makes decisions based on objective valuation, not covert factors). Taken together, our results suggest that "rationality" is a task-specific construct even in mice.

Affiliation(s)
- Nathan A Schneider
  - Volen Center for Complex Systems, Neuroscience Program, Department of Biology, Brandeis University, Waltham, MA, 02453, USA
- Benjamin Ballintyn
  - Volen Center for Complex Systems, Neuroscience Program, Department of Biology, Brandeis University, Waltham, MA, 02453, USA
- Donald Katz
  - Volen Center for Complex Systems, Neuroscience Program, Department of Psychology, Brandeis University, Waltham, MA, 02453, USA
- John Lisman
  - Volen Center for Complex Systems, Neuroscience Program, Department of Biology, Brandeis University, Waltham, MA, 02453, USA
- Hyun-Jae Pi
  - Volen Center for Complex Systems, Neuroscience Program, Department of Biology, Brandeis University, Waltham, MA, 02453, USA

28
Neural signatures underlying deliberation in human foraging decisions. COGNITIVE AFFECTIVE & BEHAVIORAL NEUROSCIENCE 2020; 19:1492-1508. [PMID: 31209734 DOI: 10.3758/s13415-019-00733-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022]
Abstract
Humans have a remarkable capacity to mentally project themselves far ahead in time. This ability, which entails the mental simulation of events, is thought to be fundamental to deliberative decision making, as it allows us to search through and evaluate possible choices. Many decisions that humans make are foraging decisions, in which one must decide whether an available offer is worth taking, when compared to unknown future possibilities (i.e., the background). Using a translational decision-making paradigm designed to reveal decision preferences in rats, we found that humans engaged in deliberation when making foraging decisions. A key feature of this task is that preferences (and thus, value) are revealed as a function of serial choices. Like rats, humans also took longer to respond when faced with difficult decisions near their preference boundary, which was associated with prefrontal and hippocampal activation, exemplifying cross-species parallels in deliberation. Furthermore, we found that voxels within the visual cortices encoded neural representations of the available possibilities specifically following regret-inducing experiences, in which the subject had previously rejected a good offer only to encounter a low-valued offer on the subsequent trial.

29
Biased belief updating and suboptimal choice in foraging decisions. Nat Commun 2020; 11:3417. [PMID: 32647271 PMCID: PMC7347922 DOI: 10.1038/s41467-020-16964-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/02/2019] [Accepted: 05/27/2020] [Indexed: 11/08/2022]
Abstract
Deciding which options to engage, and which to forego, requires developing accurate beliefs about the overall distribution of prospects. Here we adapt a classic prey selection task from foraging theory to examine how individuals keep track of an environment’s reward rate and adjust choices in response to its fluctuations. Preference shifts were most pronounced when the environment improved compared to when it deteriorated. This is best explained by a trial-by-trial learning model in which participants estimate the reward rate with upward vs. downward changes controlled by separate learning rates. A failure to adjust expectations sufficiently when an environment becomes worse leads to suboptimal choices: options that are valuable given the environmental conditions are rejected in the false expectation that better options will materialize. These findings offer a previously unappreciated parallel in the serial choice setting of observations of asymmetric updating and resulting biased (often overoptimistic) estimates in other domains.

In some types of decision-making, people must accept or forego an option without knowing what prospects might later be available. Here, the authors reveal how a key bias, asymmetric learning from negative versus positive outcomes, emerges in this type of decision.
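A trial-by-trial learner with separate learning rates for upward and downward changes can be sketched as a delta rule. This is a minimal illustration under stated assumptions: the function names, the learning-rate values, and the toy outcome sequences are hypothetical and are not the fitted model from the paper.

```python
def update_rate(estimate: float, outcome: float,
                alpha_up: float, alpha_down: float) -> float:
    """Delta-rule update of a reward-rate estimate.

    A positive prediction error is scaled by alpha_up and a negative one by
    alpha_down, so the estimate can rise and fall at different speeds.
    """
    error = outcome - estimate
    alpha = alpha_up if error > 0 else alpha_down
    return estimate + alpha * error


def track(outcomes, start: float,
          alpha_up: float = 0.3, alpha_down: float = 0.1) -> float:
    """Final estimate after a sequence of outcomes (hypothetical rates)."""
    estimate = start
    for outcome in outcomes:
        estimate = update_rate(estimate, outcome, alpha_up, alpha_down)
    return estimate


# Environment improves from 0.2 to 0.8, or deteriorates from 0.8 to 0.2.
after_improvement = track([0.8] * 5, start=0.2)
after_deterioration = track([0.2] * 5, start=0.8)
```

With `alpha_up` larger than `alpha_down`, the estimate closes most of the gap after an improvement but stays well above the true rate after a deterioration, which is the kind of lingering overoptimism that would lead an agent to reject options that are worth taking in the poorer environment.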

30
Kane GA, Bornstein AM, Shenhav A, Wilson RC, Daw ND, Cohen JD. Rats exhibit similar biases in foraging and intertemporal choice tasks. eLife 2019; 8:48429. [PMID: 31532391 PMCID: PMC6794087 DOI: 10.7554/elife.48429] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 05/14/2019] [Accepted: 09/17/2019] [Indexed: 12/05/2022]
Abstract
Animals, including humans, consistently exhibit myopia in two different contexts: foraging, in which they harvest locally beyond what is predicted by optimal foraging theory, and intertemporal choice, in which they exhibit a preference for immediate vs. delayed rewards beyond what is predicted by rational (exponential) discounting. Despite the similarity in behavior between these two contexts, previous efforts to reconcile these observations in terms of a consistent pattern of time preferences have failed. Here, via extensive behavioral testing and quantitative modeling, we show that rats exhibit similar time preferences in both contexts: they prefer immediate vs. delayed rewards and they are sensitive to opportunity costs of delays to future decisions. Further, a quasi-hyperbolic discounting model, a form of hyperbolic discounting with separate components for short- and long-term rewards, explains individual rats’ time preferences across both contexts, providing evidence for a common mechanism for myopic behavior in foraging and intertemporal choice.

Often decisions have to be made on whether to stick with a resource or leave it behind to search for a better alternative. Should you book that hotel room or continue looking at others? Is it time to start searching for a new job, or even for a new partner? Animals face similar 'stick or twist' decisions when foraging for food. Knowing how to maximize the amount of food you obtain is key to survival. Studies have shown that most animals tend to stick with a food source for a little too long, a phenomenon known as 'overharvesting'. To find out why, Kane et al. designed carefully controlled experiments to compare foraging behavior in rats to another form of decision-making, known as intertemporal choice. The latter involves choosing between a small reward now versus a larger reward later. Given this choice, most rats opt to receive a smaller reward now rather than wait for the larger reward. This suggests that rats value rewards available in the future less than rewards they can get immediately. Kane et al. showed that this preference for short-term rewards can also explain why rats overharvest in foraging scenarios. By leaving one food source to go in search of another, rats must put up with a delay before they can access the new food supply. This delay, due to the time required to travel and search, reduces the value of the future reward. As a result, rats are more likely to stick with their current food source, even though leaving it would yield a greater reward in the long run. These findings in rats raise important questions about the mechanisms that lead to biases in thinking, and how factors like changes in the environment or specific disease states can influence these biases.
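The present bias that quasi-hyperbolic discounting produces can be shown with a short sketch. It uses the common discrete-time beta-delta form, in which every delayed reward is scaled by an extra factor `beta` on top of exponential discounting. This parameterization, the function name, and the numbers are assumptions for illustration and may differ from the model fitted in the paper.

```python
def quasi_hyperbolic_value(reward: float, delay: int,
                           beta: float = 0.7, delta: float = 0.95) -> float:
    """Present value of a reward under beta-delta discounting.

    Immediate rewards (delay == 0) are not discounted. Any delayed reward is
    scaled by beta once and by delta for every period of delay. The default
    beta and delta are hypothetical values.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    if delay == 0:
        return reward
    return beta * (delta ** delay) * reward


# Choice between a small reward now and a larger reward two periods later.
small_now = quasi_hyperbolic_value(5.0, 0)
large_later = quasi_hyperbolic_value(7.0, 2)

# The same pair of rewards, both pushed ten periods into the future.
small_far = quasi_hyperbolic_value(5.0, 10)
large_far = quasi_hyperbolic_value(7.0, 12)
```

With these values the small immediate reward is preferred in the first pair, while the larger reward is preferred once both options are delayed. A purely exponential discounter would rank the two pairs the same way, so this reversal is the signature of the extra short-term component.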

Affiliation(s)
- Gary A Kane
  - Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, United States
  - Rowland Institute at Harvard, Harvard University, Cambridge, United States
- Aaron M Bornstein
  - Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, United States
  - Department of Cognitive Sciences, Center for the Neurobiology of Learning and Memory, University of California, Irvine, Irvine, United States
- Amitai Shenhav
  - Department of Cognitive, Linguistic and Psychological Sciences, Carney Institute for Brain Science, Brown University, Providence, United States
- Robert C Wilson
  - Department of Psychology, Cognitive Science Program, University of Arizona, Tucson, United States
- Nathaniel D Daw
  - Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Jonathan D Cohen
  - Department of Psychology, Princeton Neuroscience Institute, Princeton University, Princeton, United States

31
Langdon AJ, Song M, Niv Y. Uncovering the 'state': Tracing the hidden state representations that structure learning and decision-making. Behav Processes 2019; 167:103891. [PMID: 31381985 DOI: 10.1016/j.beproc.2019.103891] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/30/2019] [Revised: 05/23/2019] [Accepted: 06/21/2019] [Indexed: 02/02/2023]
Abstract
We review the abstract concept of a 'state' - an internal representation posited by reinforcement learning theories to be used by an agent, whether animal, human or artificial, to summarize the features of the external and internal environment that are relevant for future behavior on a particular task. Armed with this summary representation, an agent can make decisions and perform actions to interact effectively with the world. Here, we review recent findings from the neurobiological and behavioral literature to ask: 'what is a state?' with respect to the internal representations that organize learning and decision making across a range of tasks. We find that state representations include information beyond a straightforward summary of the immediate cues in the environment, providing timing or contextual information from the recent or more distant past, which allows these additional factors to influence decision making and other goal-directed behaviors in complex and perhaps unexpected ways.

Affiliation(s)
- Angela J Langdon
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, 08544, United States
- Mingyu Song
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, 08544, United States
- Yael Niv
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, 08544, United States

32
Davidson JD, El Hady A. Foraging as an evidence accumulation process. PLoS Comput Biol 2019; 15:e1007060. [PMID: 31339878 PMCID: PMC6682163 DOI: 10.1371/journal.pcbi.1007060] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 10/22/2018] [Revised: 08/05/2019] [Accepted: 04/30/2019] [Indexed: 11/21/2022]
Abstract
The patch-leaving problem is a canonical foraging task, in which a forager must decide to leave a current resource in search for another. Theoretical work has derived optimal strategies for when to leave a patch, and experiments have tested for conditions where animals do or do not follow an optimal strategy. Nevertheless, models of patch-leaving decisions do not consider the imperfect and noisy sampling process through which an animal gathers information, and how this process is constrained by neurobiological mechanisms. In this theoretical study, we formulate an evidence accumulation model of patch-leaving decisions where the animal averages over noisy measurements to estimate the state of the current patch and the overall environment. We solve the model for conditions where foraging decisions are optimal and equivalent to the marginal value theorem, and perform simulations to analyze deviations from optimal when these conditions are not met. By adjusting the drift rate and decision threshold, the model can represent different “strategies”, for example an incremental, decremental, or counting strategy. These strategies yield identical decisions in the limiting case but differ in how patch residence times adapt when the foraging environment is uncertain. To describe sub-optimal decisions, we introduce an energy-dependent marginal utility function that predicts longer than optimal patch residence times when food is plentiful. Our model provides a quantitative connection between ecological models of foraging behavior and evidence accumulation models of decision making. Moreover, it provides a theoretical framework for potential experiments which seek to identify neural circuits underlying patch-leaving decisions.

Foraging is a ubiquitous animal behavior, performed by organisms as different as worms, birds, rats, and humans. Although the behavior has been extensively studied, it is not known how the brain processes information obtained during foraging activity to make subsequent foraging decisions. We form an evidence accumulation model of foraging decisions that describes the process through which an animal gathers information and uses it to make foraging decisions. By building on studies of the neural decision mechanisms within systems neuroscience, this model connects the foraging decision process with ecological models of patch-leaving decisions, such as the marginal value theorem. The model suggests the existence of different foraging strategies, which optimize for different environmental conditions and their potential implementation by neural decision making circuits. The model also shows how state-dependence, such as satiation level, can affect evidence accumulation to lead to sub-optimal foraging decisions. Our model provides a framework for future experimental studies which seek to elucidate how neural decision making mechanisms have been shaped by evolutionary forces in an animal’s surrounding environment.

Affiliation(s)
- Jacob D Davidson
  - Department Collective Behavior, Max Planck Institute for Animal Behavior, Konstanz, Germany
  - Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Konstanz, Germany
  - Department of Biology, University of Konstanz, Konstanz, Germany
- Ahmed El Hady
  - Princeton Neuroscience Institute, Princeton, New Jersey, United States of America
  - Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America

33
Schmidt B, Duin AA, Redish AD. Disrupting the medial prefrontal cortex alters hippocampal sequences during deliberative decision making. J Neurophysiol 2019; 121:1981-2000. [PMID: 30892976 PMCID: PMC6620703 DOI: 10.1152/jn.00793.2018] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 11/23/2018] [Revised: 01/25/2019] [Accepted: 03/15/2019] [Indexed: 01/10/2023]
Abstract
Current theories of deliberative decision making suggest that deliberative decisions arise from imagined simulations that require interactions between the prefrontal cortex and hippocampus. In rodent navigation experiments, hippocampal theta sequences advance from the location of the rat ahead to the subsequent goal. To examine the role of the medial prefrontal cortex (mPFC) on the hippocampus, we disrupted the mPFC with DREADDs (designer receptors exclusively activated by designer drugs). Using the Restaurant Row foraging task, we found that mPFC disruption resulted in decreased vicarious trial and error behavior, reduced the number of theta sequences, and impaired theta sequences in hippocampus. mPFC disruption led to larger changes in the initiation of the hippocampal theta sequences that represent the current location of the rat rather than to the later portions that represent the future outcomes. These data suggest that the mPFC likely provides an important component to the initiation of deliberative sequences and provides support for an episodic-future thinking, working memory interpretation of deliberation.

NEW & NOTEWORTHY: The medial prefrontal cortex (mPFC) and hippocampus interact during deliberative decision making. Disruption of the mPFC impaired hippocampal processes, including the local and nonlocal representations of space along each theta cycle and the initiation of hippocampal theta sequences, while sparing place cell firing characteristics and phase precession. mPFC disruption reduced the deliberative behavioral process vicarious trial and error and improved economic behaviors on this task.

Affiliation(s)
- Brandy Schmidt
  - Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota
- Anneke A Duin
  - Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota
- A David Redish
  - Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota

34
Abram SV, Redish AD, MacDonald AW. Learning From Loss After Risk: Dissociating Reward Pursuit and Reward Valuation in a Naturalistic Foraging Task. Front Psychiatry 2019; 10:359. [PMID: 31231252 PMCID: PMC6561235 DOI: 10.3389/fpsyt.2019.00359] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/24/2019] [Accepted: 05/08/2019] [Indexed: 11/13/2022]
Abstract
A fundamental feature of addiction is continued use despite high-cost losses. One possible driver of this feature is a dissociation between reward pursuit and reward valuation. To test for this dissociation, we employed a foraging paradigm with real-time delays and video rewards. Subjects made stay/skip choices on risky and non-risky offers; risky losses were operationalized as receipt of the longer delay after accepting a risky deal. We found that reward likability following risky losses predicted reward pursuit (i.e., subsequent choices), while there was no effect on reward valuation or reward pursuit in the absence of such losses. Individuals with high trait externalizing, who may be vulnerable to addiction, showed a dissociation between these phenomena: they liked videos more after risky losses but showed no decrease in choosing to stay on subsequent risky offers. This suggests that the inability to learn from mistakes is a potential component of risk for addiction.
Affiliation(s)
- Samantha V. Abram
- Department of Psychology, University of Minnesota Twin Cities, Minneapolis, MN, United States
- Sierra Pacific Mental Illness Research Education and Clinical Centers, San Francisco VA Medical Center, and the University of California, San Francisco, San Francisco, CA, United States
| | - A. David Redish
- Department of Neuroscience, University of Minnesota Twin Cities, Minneapolis, MN, United States
| | - Angus W. MacDonald
- Department of Psychology, University of Minnesota Twin Cities, Minneapolis, MN, United States
- Department of Psychiatry, University of Minnesota Twin Cities, Minneapolis, MN, United States
| |
|
35
|
Wolf A, Ounjai K, Takahashi M, Kobayashi S, Matsuda T, Lauwereyns J. Evaluative Processing of Food Images: Longer Viewing for Indecisive Preference Formation. Front Psychol 2019; 10:608. [PMID: 30949106 PMCID: PMC6435591 DOI: 10.3389/fpsyg.2019.00608] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/21/2018] [Accepted: 03/05/2019] [Indexed: 11/13/2022] Open
Abstract
The well-known gaze cascade hypothesis proposes that as people look longer at an item, they tend to show an increased preference for it. However, using single food images as stimuli, we recently obtained results that clearly deviated from the general proposal that the gaze both expresses and influences preference formation. Instead, the pattern of data depended on the self-determination of exposure duration as well as the type of evaluation task. In order to disambiguate how the type of evaluation determines the relationship between viewing and liking we conducted the present follow-up study, with a fixed response set size as opposed to the varying set sizes in our previous study. In non-exclusive evaluation tasks, subjects were asked how much they liked individual food images. The recorded response was a number from 1 to 3. In exclusive evaluation tasks, subjects were asked for each individual food image to give one of three response options toward a limited selection: include it, exclude it, or defer the judgment. When subjects were able to determine the exposure duration, both the non-exclusive and exclusive evaluations produced inverted U-shaped trends such that the polar ends of the evaluation (the positive and negative extremes) were associated with relatively short viewing times, whereas the middle category had the longest viewing times. Thus, the data once again provided firm evidence against the notion that longer viewing facilitates preference formation. Moreover, the fact that non-exclusive and exclusive evaluation produced similar inverted U-shaped patterns suggests that the response set size is the critical factor that accounts for the observations here versus in our previous study. When keeping the response set size constant, with an equal opportunity to observe inverted U-shaped patterns, the findings are suggestive of a role for the level of decisiveness in determining the length of viewing time. 
For items that can be categorically identified as positive or negative, the evaluations are soon completed, with relatively brief viewing times. The prolonged visual inspection for the middle category may reflect doubt or uncertainty during the evaluative processing, possibly with an increased effort of information integration before reaching a conclusion.
Affiliation(s)
- Alexandra Wolf
- Graduate School of Systems Life Sciences, Kyushu University, Fukuoka, Japan
| | - Kajornvut Ounjai
- Graduate School of Systems Life Sciences, Kyushu University, Fukuoka, Japan
| | | | | | | | - Johan Lauwereyns
- Graduate School of Systems Life Sciences, Kyushu University, Fukuoka, Japan.,Brain Science Institute, Tamagawa University, Tokyo, Japan.,Faculty of Arts and Science, Kyushu University, Fukuoka, Japan
| |
|
36
|
Lukinova E, Wang Y, Lehrer SF, Erlich JC. Time preferences are reliable across time-horizons and verbal versus experiential tasks. eLife 2019; 8:e39656. [PMID: 30719974 PMCID: PMC6363390 DOI: 10.7554/elife.39656] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/29/2018] [Accepted: 01/16/2019] [Indexed: 12/15/2022] Open
Abstract
Individual differences in delay-discounting correlate with important real world outcomes, for example education, income, drug use, and criminality. As such, delay-discounting has been extensively studied by economists, psychologists and neuroscientists to reveal its behavioral and biological mechanisms in both human and non-human animal models. However, two major methodological differences hinder comparing results across species. Human studies present long time-horizon options verbally, whereas animal studies employ experiential cues and short delays. To bridge these divides, we developed a novel language-free experiential task inspired by animal decision-making studies. We found that the ranks of subjects' time-preferences were reliable across both verbal/experiential and second/day differences. Yet, discount factors scaled dramatically across the tasks, indicating a strong effect of temporal context. Taken together, this indicates that individuals have a stable, but context-dependent, time-preference that can be reliably assessed using different methods, providing a foundation to bridge studies of time-preferences across species. Editorial note This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
Affiliation(s)
- Evgeniya Lukinova
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
- NYU Shanghai, Shanghai, China
| | - Yuyue Wang
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
- NYU Shanghai, Shanghai, China
| | - Steven F Lehrer
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
- NYU Shanghai, Shanghai, China
- School of Policy Studies and Department of Economics, Queen’s University, Kingston, Canada
- The National Bureau of Economic Research, Cambridge, United States
| | - Jeffrey C Erlich
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
- NYU Shanghai, Shanghai, China
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), East China Normal University, Shanghai, China
| |
|
37
|
Sweis BM, Abram SV, Schmidt BJ, Seeland KD, MacDonald AW, Thomas MJ, Redish AD. Sensitivity to "sunk costs" in mice, rats, and humans. Science 2018; 361:178-181. [PMID: 30002252 PMCID: PMC6377599 DOI: 10.1126/science.aar8644] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 12/26/2017] [Accepted: 05/29/2018] [Indexed: 11/30/2022]
Abstract
Sunk costs are irrecoverable investments that should not influence decisions, because decisions should be made on the basis of expected future consequences. Both human and nonhuman animals can show sensitivity to sunk costs, but reports from across species are inconsistent. In a temporal context, a sensitivity to sunk costs arises when an individual resists ending an activity, even if it seems unproductive, because of the time already invested. In two parallel foraging tasks that we designed, we found that mice, rats, and humans show similar sensitivities to sunk costs in their decision-making. Unexpectedly, sensitivity to time invested accrued only after an initial decision had been made. These findings suggest that sensitivity to temporal sunk costs lies in a vulnerability distinct from deliberation processes and that this distinction is present across species.
Affiliation(s)
- Brian M Sweis
- Graduate Program in Neuroscience and Medical Scientist Training Program, University of Minnesota, Minneapolis, MN 55455, USA.,Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
| | - Samantha V Abram
- Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA
| | - Brandy J Schmidt
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
| | - Kelsey D Seeland
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
| | - Angus W MacDonald
- Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA
| | - Mark J Thomas
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA.,Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA
| | - A David Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA.
| |
|
38
|
Sweis BM, Thomas MJ, Redish AD. Beyond simple tests of value: measuring addiction as a heterogeneous disease of computation-specific valuation processes. Learn Mem 2018; 25:501-512. [PMID: 30115772 PMCID: PMC6097760 DOI: 10.1101/lm.047795.118] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 05/07/2018] [Accepted: 07/06/2018] [Indexed: 12/13/2022]
Abstract
Addiction is considered to be a neurobiological disorder of learning and memory because addiction is capable of producing lasting changes in the brain. Recovering addicts chronically struggle with making poor decisions that ultimately lead to relapse, suggesting a view of addiction also as a neurobiological disorder of decision-making information processing. How the brain makes decisions depends on how decision-making processes access information stored as memories in the brain. Advancements in circuit-dissection tools and recent theories in neuroeconomics suggest that neurally dissociable valuation processes access distinct memories differently, and thus are uniquely susceptible as the brain changes during addiction. If addiction is to be considered a neurobiological disorder of memory, and thus decision-making, the heterogeneity with which information is both stored and processed must be taken into account in addiction studies. Addiction etiology can vary widely from person to person. We propose that addiction is not a single disease, nor simply a disorder of learning and memory, but rather a collection of symptoms of heterogeneous neurobiological diseases of distinct circuit-computation-specific decision-making processes.
Affiliation(s)
- Brian M Sweis
- Graduate Program in Neuroscience and Medical Scientist Training Program, University of Minnesota, Minneapolis, Minnesota 55455, USA.,Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota 55455, USA
| | - Mark J Thomas
- Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota 55455, USA.,Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
| | - A David Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota 55455, USA
| |
|
39
|
Wahab M, Panlilio LV, Solinas M. An improved within-session self-adjusting delay discounting procedure for the study of choice impulsivity in rats. Psychopharmacology (Berl) 2018; 235:2123-2135. [PMID: 29713789 DOI: 10.1007/s00213-018-4911-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/07/2017] [Accepted: 04/17/2018] [Indexed: 02/06/2023]
Abstract
RATIONALE: Delay-discounting procedures involving choice between small immediate rewards and large delayed rewards are used to study impulsivity in rodents. Improving existing procedures may provide new insights into the neurobiological mechanisms underlying decision-making processes. OBJECTIVES: To develop a novel delay-discounting procedure that adjusts the delay value within individual sessions based on the rat's most recent choices. METHODS: Compared to a previously developed procedure, we required a more consistent demonstration of preference: five consecutive choices of the large or small reward, rather than two, a criterion that is more likely to reflect deliberate choice by the animal. In addition, delays were changed in steps of 5 s (rather than 1 s), because 5-s increments should be more easily discriminated and may produce a more distinct effect on choice. We characterized the procedure behaviorally by manipulating the duration of the session and the consecutive choice criterion, and we investigated the stability of the behavior upon interruption of training. We also characterized the procedure pharmacologically by investigating the effects of dopaminergic compounds. RESULTS: Our procedures yielded two complementary measures of delay discounting: (1) the percentage of choices of the delay option and (2) the mean adjusting delay, an index of the delay that animals choose more frequently. We found that our procedure rapidly establishes a baseline of choice behavior that remains stable over time and is highly sensitive to manipulations of the dopaminergic system. CONCLUSIONS: This procedure may provide a useful tool for investigating the neurobiology of inter-temporal choice and decision-making.
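The adjusting rule described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract gives only the criterion (five consecutive identical choices) and the step size (5 s). The direction of adjustment (a run of large-reward choices lengthens the delay, a run of small-reward choices shortens it), the starting delay, the upper bound, and every name in the code are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AdjustingDelayProcedure:
    """Hypothetical sketch of a within-session adjusting-delay rule."""
    delay_s: float = 0.0        # assumed starting delay (not in the abstract)
    step_s: float = 5.0         # 5-s step, as stated in the abstract
    criterion: int = 5          # five consecutive choices, as stated
    max_delay_s: float = 60.0   # assumed ceiling (not in the abstract)
    _last: Optional[str] = field(default=None, repr=False)
    _run: int = field(default=0, repr=False)

    def record_choice(self, choice: str) -> float:
        """Register one choice ("large" or "small"); return the current delay."""
        if choice not in ("large", "small"):
            raise ValueError("choice must be 'large' or 'small'")
        # Track the length of the current run of identical choices.
        if choice == self._last:
            self._run += 1
        else:
            self._last, self._run = choice, 1
        # Adjust the delay only when the run reaches the criterion.
        if self._run >= self.criterion:
            # Assumed direction: preference for the large reward raises the delay.
            delta = self.step_s if choice == "large" else -self.step_s
            self.delay_s = min(self.max_delay_s, max(0.0, self.delay_s + delta))
            self._run = 0  # a fresh run is needed for the next adjustment
        return self.delay_s


proc = AdjustingDelayProcedure()
for choice in ["large"] * 5 + ["small"] * 2 + ["large"] * 5:
    proc.record_choice(choice)
print(proc.delay_s)
```

Under these assumptions, an interrupted streak never moves the delay, which is the property the stricter five-choice criterion is meant to provide.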
Affiliation(s)
- Mejda Wahab
- INSERM, U1084, Laboratoire de Neurosciences Expérimentales et Cliniques, Université de Poitiers, Poitiers, France
| | - Leigh V Panlilio
- Real-World Assessment, Prediction and Treatment Unit, Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA
| | - Marcello Solinas
- INSERM, U1084, Laboratoire de Neurosciences Expérimentales et Cliniques, Université de Poitiers, Poitiers, France.
| |
|
40
|
Sweis BM, Redish AD, Thomas MJ. Prolonged abstinence from cocaine or morphine disrupts separable valuations during decision conflict. Nat Commun 2018; 9:2521. [PMID: 29955073 PMCID: PMC6023899 DOI: 10.1038/s41467-018-04967-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 02/19/2018] [Accepted: 06/01/2018] [Indexed: 02/08/2023] Open
Abstract
Neuroeconomic theories propose changes in decision making drive relapse in recovering drug addicts, resulting in continued drug use despite stated wishes not to. Such conflict is thought to arise from multiple valuation systems dependent on separable neural components, yet many neurobiology of addiction studies employ only simple tests of value. Here, we tested in mice how prolonged abstinence from different drugs affects behavior in a neuroeconomic foraging task that reveals multiple tests of value. Abstinence from repeated cocaine and morphine disrupts separable decision-making processes. Cocaine alters deliberation-like behavior prior to choosing a preferred though economically unfavorable offer, while morphine disrupts re-evaluations after rapid initial decisions. These findings suggest that different drugs have long-lasting effects precipitating distinct decision-making vulnerabilities. Our approach can guide future refinement of decision-making behavioral paradigms and highlights how grossly similar behavioral maladaptations may mask multiple underlying, parallel, and dissociable processes that treatments for addiction could potentially target. Neuroeconomic theories suggest that conflict during decision, such as exhibited by relapsing drug addicts who continue drug use despite stated wishes not to, might arise from separable processes in decision making. Here the authors test mice in a foraging task designed to separate these processes and find that mice show alterations in separable components of decision conflict following abstinence from cocaine versus morphine.
Affiliation(s)
- Brian M Sweis
- Graduate Program in Neuroscience & Medical Scientist Training Program, University of Minnesota, Minneapolis, MN, 55455, USA.,Department of Neuroscience, University of Minnesota, Minneapolis, MN, 55455, USA
| | - A David Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, MN, 55455, USA.
| | - Mark J Thomas
- Department of Neuroscience, University of Minnesota, Minneapolis, MN, 55455, USA.,Department of Psychology, University of Minnesota, Minneapolis, MN, 55455, USA.
| |
|
41
|
Altering gain of the infralimbic-to-accumbens shell circuit alters economically dissociable decision-making algorithms. Proc Natl Acad Sci U S A 2018; 115:E6347-E6355. [PMID: 29915034 DOI: 10.1073/pnas.1803084115] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 12/22/2022] Open
Abstract
The nucleus accumbens shell (NAcSh) is involved in reward valuation. Excitatory projections from infralimbic cortex (IL) to NAcSh undergo synaptic remodeling in rodent models of addiction and enable the extinction of disadvantageous behaviors. However, how the strength of synaptic transmission of the IL-NAcSh circuit affects decision-making information processing and reward valuation remains unknown, particularly because these processes can conflict within a given trial and particularly given recent data suggesting that decisions arise from separable information-processing algorithms. The approach of many neuromodulation studies is to disrupt information flow during on-going behaviors; however, this limits the interpretation of endogenous encoding of computational processes. Furthermore, many studies are limited by the use of simple behavioral tests of value which are unable to dissociate neurally distinct decision-making algorithms. We optogenetically altered the strength of synaptic transmission between glutamatergic IL-NAcSh projections in mice trained on a neuroeconomic task capable of separating multiple valuation processes. We found that induction of long-term depression in these synapses produced lasting changes in foraging processes without disrupting deliberative processes. Mice displayed inflated reevaluations to stay when deciding whether to abandon continued reward-seeking investments but displayed no changes during initial commitment decisions. We also developed an ensemble-level measure of circuit-specific plasticity that revealed individual differences in foraging valuation tendencies. Our results demonstrate that alterations in projection-specific synaptic strength between the IL and the NAcSh are capable of augmenting self-control economic valuations within a particular decision-making modality and suggest that the valuation mechanisms for these multiple decision-making modalities arise from different circuits.
|
42
|
Wolf A, Ounjai K, Takahashi M, Kobayashi S, Matsuda T, Lauwereyns J. Evaluative Processing of Food Images: A Conditional Role for Viewing in Preference Formation. Front Psychol 2018; 9:936. [PMID: 29942273 PMCID: PMC6004500 DOI: 10.3389/fpsyg.2018.00936] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/13/2018] [Accepted: 05/22/2018] [Indexed: 11/25/2022] Open
Abstract
Previous research suggested a role of gaze in preference formation, not merely as an expression of preference, but also as a causal influence. According to the gaze cascade hypothesis, the longer subjects look at an item, the more likely they are to develop a preference for it. However, to date the connection between viewing and liking has been investigated predominately with self-paced viewing conditions in which the subjects were required to select certain items from simultaneously presented stimuli on the basis of perceived visual attractiveness. Such conditions might promote a default, but non-mandatory connection between viewing and liking. To explore whether the connection is separable, we examined the evaluative processing of single naturalistic food images in a 2 × 2 design, conducted completely within subjects, in which we varied both the type of exposure (self-paced versus time-controlled) and the type of evaluation (non-exclusive versus exclusive). In the self-paced exclusive evaluation, longer viewing was associated with a higher likelihood of a positive evaluation. However, in the self-paced non-exclusive evaluation, the trend reversed such that longer viewing durations were associated with lesser ratings. Furthermore, in the time-controlled tasks, both with non-exclusive and exclusive evaluation, there was no significant relationship between the viewing duration and the evaluation. The overall pattern of results was consistent for viewing times measured in terms of exposure duration (i.e., the duration of stimulus presentation on the screen) and in terms of actual gaze duration (i.e., the amount of time the subject effectively gazed at the stimulus on the screen). The data indicated that viewing does not intrinsically lead to a higher evaluation when evaluating single food images; instead, the relationship between viewing duration and evaluation depends on the type of task. 
We suggest that self-determination of exposure duration may be a prerequisite for any influence from viewing time on evaluative processing, regardless of whether the influence is facilitative. Moreover, the purported facilitative link between viewing and liking appears to be limited to exclusive evaluation, when only a restricted number of items can be included in a chosen set.
Affiliation(s)
- Alexandra Wolf
- Graduate School of Systems Life Sciences, Kyushu University, Fukuoka, Japan
| | - Kajornvut Ounjai
- Graduate School of Systems Life Sciences, Kyushu University, Fukuoka, Japan
| | | | | | | | - Johan Lauwereyns
- Graduate School of Systems Life Sciences, Kyushu University, Fukuoka, Japan.,Brain Science Institute, Tamagawa University, Tokyo, Japan.,Faculty of Arts and Science, Kyushu University, Fukuoka, Japan
| |
|
43
|
Lottem E, Banerjee D, Vertechi P, Sarra D, Lohuis MO, Mainen ZF. Activation of serotonin neurons promotes active persistence in a probabilistic foraging task. Nat Commun 2018. [PMID: 29520000 PMCID: PMC5843608 DOI: 10.1038/s41467-018-03438-y] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Indexed: 12/16/2022] Open
Abstract
The neuromodulator serotonin (5-HT) has been implicated in a variety of functions that involve patience or impulse control. Many of these effects are consistent with a long-standing theory that 5-HT promotes behavioral inhibition, a motivational bias favoring passive over active behaviors. To further test this idea, we studied the impact of 5-HT in a probabilistic foraging task, in which mice must learn the statistics of the environment and infer when to leave a depleted foraging site for the next. Critically, mice were required to actively nose-poke in order to exploit a given site. We show that optogenetic activation of 5-HT neurons in the dorsal raphe nucleus increases the willingness of mice to actively attempt to exploit a reward site before giving up. These results indicate that behavioral inhibition is not an adequate description of 5-HT function and suggest that a unified account must be based on a higher-order function.
Affiliation(s)
- Eran Lottem
- Champalimaud Research, Champalimaud Centre for the Unknown, 1400-038, Lisbon, Portugal
| | - Dhruba Banerjee
- School of Medicine, University of California, Irvine, CA, 92697-3950, USA
| | - Pietro Vertechi
- Champalimaud Research, Champalimaud Centre for the Unknown, 1400-038, Lisbon, Portugal
| | - Dario Sarra
- Champalimaud Research, Champalimaud Centre for the Unknown, 1400-038, Lisbon, Portugal
| | - Matthijs Oude Lohuis
- Swammerdam Institute for Life Sciences, Center for Neuroscience, Faculty of Science, University of Amsterdam, 1098XH, Amsterdam, The Netherlands
| | - Zachary F Mainen
- Champalimaud Research, Champalimaud Centre for the Unknown, 1400-038, Lisbon, Portugal.
| |
|
44
|
Beeler JA, Mourra D. To Do or Not to Do: Dopamine, Affordability and the Economics of Opportunity. Front Integr Neurosci 2018; 12:6. [PMID: 29487508 PMCID: PMC5816947 DOI: 10.3389/fnint.2018.00006] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/08/2017] [Accepted: 01/26/2018] [Indexed: 12/21/2022] Open
Abstract
Five years ago, we introduced the thrift hypothesis of dopamine (DA), suggesting that the primary role of DA in adaptive behavior is regulating behavioral energy expenditure to match the prevailing economic conditions of the environment. Here we elaborate that hypothesis with several new ideas. First, we introduce the concept of affordability, suggesting that costs must necessarily be evaluated with respect to the availability of resources to the organism, which computes a value not only for the potential reward opportunity, but also the value of resources expended. Placing both costs and benefits within the context of the larger economy in which the animal is functioning requires consideration of the different timescales against which to compute resource availability, or average reward rate. Appropriate windows of computation for tracking resources require corresponding neural substrates that operate on these different timescales. In discussing temporal patterns of DA signaling, we focus on a neglected form of DA plasticity and adaptation, changes in the physical substrate of the DA system itself, such as up- and down-regulation of receptors or release probability. We argue that changes in the DA substrate itself fundamentally alter its computational function, which we propose mediates adaptations to longer temporal horizons and economic conditions. In developing our hypothesis, we focus on DA D2 receptors (D2R), arguing that D2R implements a form of “cost control” in response to the environmental economy, serving as the “brain’s comptroller”. We propose that the balance between the direct and indirect pathway, regulated by relative expression of D1 and D2 DA receptors, implements affordability. Finally, as we review data, we discuss limitations in current approaches that impede fully investigating the proposed hypothesis and highlight alternative, semi-naturalistic strategies that are more conducive to neuroeconomic investigations of the role of DA in adaptive behavior.
Affiliation(s)
- Jeff A Beeler
- Department of Psychology, Queens College, City University of New York, New York, NY, United States.,CUNY Neuroscience Consortium, The Graduate Center, City University of New York, New York, NY, United States
| | - Devry Mourra
- Department of Psychology, Queens College, City University of New York, New York, NY, United States.,CUNY Neuroscience Consortium, The Graduate Center, City University of New York, New York, NY, United States
| |
|
45
|
Juavinett AL, Erlich JC, Churchland AK. Decision-making behaviors: weighing ethology, complexity, and sensorimotor compatibility. Curr Opin Neurobiol 2017; 49:42-50. [PMID: 29179005 DOI: 10.1016/j.conb.2017.11.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 09/02/2017] [Revised: 10/31/2017] [Accepted: 11/01/2017] [Indexed: 01/15/2023]
Abstract
Rodent decision-making research aims to uncover the neural circuitry underlying the ability to evaluate alternatives and select appropriate actions. Designing behavioral paradigms that provide a solid foundation to ask questions about decision-making computations and mechanisms is a difficult and often underestimated challenge. Here, we propose three dimensions on which we can consider rodent decision-making tasks: ethological validity, task complexity, and stimulus-response compatibility. We review recent research through this lens, and provide practical guidance for researchers in the decision-making field.
Affiliation(s)
| | - Jeffrey C Erlich
- NYU-ECNU Institute of Brain and Cognitive Science, New York University Shanghai, Shanghai, China
| | - Anne K Churchland
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, United States.
| |
|
46
|
Kolling N, Akam T. (Reinforcement?) Learning to forage optimally. Curr Opin Neurobiol 2017; 46:162-169. [PMID: 28918312 DOI: 10.1016/j.conb.2017.08.008] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 05/16/2017] [Revised: 08/06/2017] [Accepted: 08/17/2017] [Indexed: 11/24/2022]
Abstract
Foraging effectively is critical to the survival of all animals and this imperative is thought to have profoundly shaped brain evolution. Decisions made by foraging animals often approximate optimal strategies, but the learning and decision mechanisms generating these choices remain poorly understood. Recent work with laboratory foraging tasks in humans suggests their behaviour is poorly explained by model-free reinforcement learning, with simple heuristic strategies better describing behaviour in some tasks, and in others evidence of prospective prediction of the future state of the environment. We suggest that model-based average reward reinforcement learning may provide a common framework for understanding these apparently divergent foraging strategies.
Affiliation(s)
- Nils Kolling
- Department of Experimental Psychology, University of Oxford, United Kingdom
| | - Thomas Akam
- Department of Experimental Psychology, University of Oxford, United Kingdom; Champalimaud Neuroscience Program, Champalimaud Center for the Unknown, Portugal.
| |
|
47
|
Social resource foraging is guided by the principles of the Marginal Value Theorem. Sci Rep 2017; 7:11274. [PMID: 28900299 PMCID: PMC5596022 DOI: 10.1038/s41598-017-11763-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/03/2017] [Accepted: 08/30/2017] [Indexed: 12/02/2022] Open
Abstract
Optimality principles guide how animals adapt to changing environments. During foraging for nonsocial resources such as food and water, species across taxa obey a strategy that maximizes resource harvest rate. However, it remains unknown whether foraging for social resources also obeys such a strategic principle. We investigated how primates forage for social information conveyed by conspecific facial expressions using the framework of optimal foraging theory. We found that the canonical principle of Marginal Value Theorem (MVT) also applies to social resources. Consistent with MVT, rhesus macaques (Macaca mulatta) spent more time foraging for social information when alternative sources of information were farther away compared to when they were closer by. A comparison of four models of patch-leaving behavior confirmed that the MVT framework provided the best fit to the observed foraging behavior. This analysis further demonstrated that patch-leaving decisions were not driven simply by the declining value of the images in the patch, but instead were dependent upon both the instantaneous social value intake rate and current time in the patch.
48
Solomon RB, Conover K, Shizgal P. Valuation of opportunity costs by rats working for rewarding electrical brain stimulation. PLoS One 2017; 12:e0182120. [PMID: 28841663 PMCID: PMC5571941 DOI: 10.1371/journal.pone.0182120] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/09/2017] [Accepted: 07/12/2017] [Indexed: 11/29/2022] Open
Abstract
Pursuit of one goal typically precludes simultaneous pursuit of another. Thus, each exclusive activity entails an “opportunity cost”: the forgone benefits from the next-best activity eschewed. The present experiment estimates, in laboratory rats, the function that maps objective opportunity costs into subjective ones. In an operant chamber, rewarding electrical brain stimulation was delivered when the cumulative time a lever had been depressed reached a criterion duration. The value of the activities forgone during this duration is the opportunity cost of the electrical reward. We determined which of four functions best describes how objective opportunity costs, expressed as the required duration of lever depression, are translated into their subjective equivalents. The simplest account is the identity function, which equates subjective and objective opportunity costs. A variant of this function, called the “sigmoidal-slope function,” converges on the identity function at longer durations but deviates from it at shorter durations. The sigmoidal-slope function has the form of a hockey stick. The flat “blade” denotes a range over which opportunity costs are subjectively equivalent; these durations are too short to allow substitution of more beneficial activities. The blade extends into an upward-curving portion over which costs become discriminable and finally into the straight “handle,” over which objective and subjective costs match. The two remaining functions are based on hyperbolic and exponential temporal discounting, respectively. The results are best described by the sigmoidal-slope function. That this is so suggests that different principles of intertemporal choice are involved in the evaluation of time spent working for a reward or waiting for its delivery. The subjective opportunity-cost function plays a key role in the evaluation and selection of goals. An accurate description of its form and parameters is essential to successful modeling and prediction of instrumental performance and reward-related decision making.
Affiliation(s)
- Rebecca Brana Solomon
- Centre for Studies in Behavioural Neurobiology / Groupe de recherche en neurobiologie comportementale, Department of Psychology, Concordia University, Montréal, Québec, Canada
- Kent Conover
- Centre for Studies in Behavioural Neurobiology / Groupe de recherche en neurobiologie comportementale, Department of Psychology, Concordia University, Montréal, Québec, Canada
- Peter Shizgal
- Centre for Studies in Behavioural Neurobiology / Groupe de recherche en neurobiologie comportementale, Department of Psychology, Concordia University, Montréal, Québec, Canada
49
Chronic and Acute Stress Promote Overexploitation in Serial Decision Making. J Neurosci 2017; 37:5681-5689. [PMID: 28483979 DOI: 10.1523/jneurosci.3618-16.2017] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 11/22/2016] [Revised: 03/16/2017] [Accepted: 04/10/2017] [Indexed: 11/21/2022] Open
Abstract
Many decisions that humans make resemble foraging problems in which a currently available, known option must be weighed against an unknown alternative option. In such foraging decisions, the quality of the overall environment can be used as a proxy for estimating the value of future unknown options against which current prospects are compared. We hypothesized that such foraging-like decisions would be characteristically sensitive to stress, a physiological response that tracks biologically relevant changes in environmental context. Specifically, we hypothesized that stress would lead to more exploitative foraging behavior. To test this, we investigated how acute and chronic stress, as measured by changes in cortisol in response to an acute stress manipulation and subjective scores on a questionnaire assessing recent chronic stress, relate to performance in a virtual sequential foraging task. We found that both types of stress bias human decision makers toward overexploiting current options relative to an optimal policy. These findings suggest a possible computational role of stress in decision making in which stress biases judgments of environmental quality.
SIGNIFICANCE STATEMENT: Many of the most biologically relevant decisions that we make are foraging-like decisions about whether to stay with a current option or search the environment for a potentially better one. In the current study, we found that both acute physiological and chronic subjective stress are associated with greater overexploitation or staying at current options for longer than is optimal. These results suggest a domain-general way in which stress might bias foraging decisions through changing one's appraisal of the overall quality of the environment. These novel findings not only have implications for understanding how this important class of foraging decisions might be biologically implemented, but also for understanding the computational role of stress in behavior and cognition more broadly.
50
Horvath G, Liszli P, Kekesi G, Büki A, Benedek G. Characterization of exploratory activity and learning ability of healthy and “schizophrenia-like” rats in a square corridor system (AMBITUS). Physiol Behav 2017; 169:155-164. [DOI: 10.1016/j.physbeh.2016.11.039] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/05/2016] [Revised: 11/30/2016] [Accepted: 11/30/2016] [Indexed: 12/28/2022]