1. Ohta H, Nozawa T, Nakano T, Morimoto Y, Ishizuka T. Nonlinear age-related differences in probabilistic learning in mice: A 5-armed bandit task study. Neurobiol Aging 2024; 142:8-16. PMID: 39029360. DOI: 10.1016/j.neurobiolaging.2024.06.004.
Abstract
This study explores the impact of aging on reinforcement learning in mice, focusing on changes in learning rates and behavioral strategies. A 5-armed bandit task (5-ABT) and a computational Q-learning model were used to evaluate the positive and negative learning rates and the inverse temperature across three age groups (3, 12, and 18 months). Results showed a significant decline in the negative learning rate of 18-month-old mice, which was not observed for the positive learning rate. This suggests that older mice maintain the ability to learn from successful experiences while decreasing the ability to learn from negative outcomes. We also observed a significant age-dependent variation in inverse temperature, reflecting a shift in action selection policy. Middle-aged mice (12 months) exhibited higher inverse temperature, indicating a higher reliance on previous rewarding experiences and reduced exploratory behaviors, when compared to both younger and older mice. This study provides new insights into aging research by demonstrating that there are age-related differences in specific components of reinforcement learning, which exhibit a non-linear pattern.
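The model class used here, Q-learning with separate learning rates for positive and negative outcomes and a softmax policy governed by an inverse temperature, can be sketched as follows. This is a minimal illustration with parameter names and values of our own choosing, not the authors' implementation.

```python
import numpy as np

def softmax_policy(q, beta):
    """Choice probabilities over arms; beta is the inverse temperature.
    Higher beta concentrates choices on high-valued arms; lower beta explores."""
    z = beta * (q - q.max())          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def asymmetric_update(q, arm, reward, alpha_pos, alpha_neg):
    """Update the chosen arm's value with a learning rate that depends on
    the sign of the reward prediction error (RPE)."""
    rpe = reward - q[arm]
    alpha = alpha_pos if rpe >= 0 else alpha_neg
    q = q.copy()
    q[arm] += alpha * rpe
    return q
```

In this formulation, a reduced alpha_neg (as reported for the 18-month-old mice) slows unlearning of arms that have stopped paying out, while a raised beta (as in the 12-month group) concentrates choices on previously rewarded arms and reduces exploration.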
Affiliation(s)
- Hiroyuki Ohta
  - Department of Pharmacology, National Defense Medical College, 3-2 Namiki, Tokorozawa, Saitama 359-8513, Japan
- Takashi Nozawa
  - Mejiro University, 4-31-1 Naka-Ochiai, Shinjuku, Tokyo 161-8539, Japan
- Takashi Nakano
  - Department of Computational Biology, School of Medicine, Fujita Health University, 1-98 Dengakugakubo, Kutsukake, Toyoake, Aichi 470-1192, Japan
  - International Center for Brain Science (ICBS), Fujita Health University, 1-98 Dengakugakubo, Kutsukake, Toyoake, Aichi 470-1192, Japan
- Yuji Morimoto
  - Department of Physiology, National Defense Medical College, 3-2 Namiki, Tokorozawa, Saitama 359-8513, Japan
- Toshiaki Ishizuka
  - Department of Pharmacology, National Defense Medical College, 3-2 Namiki, Tokorozawa, Saitama 359-8513, Japan
2. Gong L, Pasqualetti F, Papouin T, Ching S. Astrocytes as a mechanism for contextually-guided network dynamics and function. PLoS Comput Biol 2024; 20:e1012186. PMID: 38820533. PMCID: PMC11168681. DOI: 10.1371/journal.pcbi.1012186.
Abstract
Astrocytes are a ubiquitous and enigmatic type of non-neuronal cell and are found in the brain of all vertebrates. While traditionally viewed as being supportive of neurons, it is increasingly recognized that astrocytes play a more direct and active role in brain function and neural computation. On account of their sensitivity to a host of physiological covariates and ability to modulate neuronal activity and connectivity on slower time scales, astrocytes may be particularly well poised to modulate the dynamics of neural circuits in functionally salient ways. In the current paper, we seek to capture these features via actionable abstractions within computational models of neuron-astrocyte interaction. Specifically, we engage how nested feedback loops of neuron-astrocyte interaction, acting over separated time-scales, may endow astrocytes with the capability to enable learning in context-dependent settings, where fluctuations in task parameters may occur much more slowly than within-task requirements. We pose a general model of neuron-synapse-astrocyte interaction and use formal analysis to characterize how astrocytic modulation may constitute a form of meta-plasticity, altering the ways in which synapses and neurons adapt as a function of time. We then embed this model in a bandit-based reinforcement learning task environment, and show how the presence of time-scale separated astrocytic modulation enables learning over multiple fluctuating contexts. Indeed, these networks learn far more reliably compared to dynamically homogeneous networks and conventional non-network-based bandit algorithms. Our results fuel the notion that neuron-astrocyte interactions in the brain benefit learning over different time-scales and the conveyance of task-relevant contextual information onto circuit dynamics.
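The time-scale separation at the heart of this account can be illustrated with a pair of leaky integrators driven by the same signal: a fast one standing in for neuronal/synaptic adaptation and a slow one standing in for astrocytic modulation. This is only a toy sketch of the separation-of-time-scales idea, not the paper's neuron-synapse-astrocyte model; the names and rates are our assumptions.

```python
import numpy as np

def two_timescale_traces(signal, alpha_fast=0.5, alpha_slow=0.02):
    """Integrate one input stream at two rates. The fast trace tracks
    within-task fluctuations; the slow trace averages over them and so
    tracks the slowly drifting context."""
    fast, slow = 0.0, 0.0
    fast_hist, slow_hist = [], []
    for x in signal:
        fast += alpha_fast * (x - fast)   # fast "neuronal" variable
        slow += alpha_slow * (x - slow)   # slow "astrocytic" variable
        fast_hist.append(fast)
        slow_hist.append(slow)
    return np.array(fast_hist), np.array(slow_hist)
```

Feeding the slow trace back as a gain or learning-rate modulator on the fast learner is one way such meta-plasticity could let a single network re-specialize as contexts fluctuate more slowly than within-task demands.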
Affiliation(s)
- Lulu Gong
  - Department of Electrical and Systems Engineering, Washington University, St. Louis, Missouri, United States of America
- Fabio Pasqualetti
  - Department of Mechanical Engineering, University of California, Riverside, California, United States of America
- Thomas Papouin
  - Department of Neuroscience, Washington University School of Medicine, St. Louis, Missouri, United States of America
- ShiNung Ching
  - Department of Electrical and Systems Engineering, Washington University, St. Louis, Missouri, United States of America
3. Sazhin D, Dachs A, Smith DV. Meta-Analysis Reveals That Explore-Exploit Decisions are Dissociable by Activation in the Dorsal Lateral Prefrontal Cortex and the Anterior Cingulate Cortex. bioRxiv 2024:2023.10.21.563317. PMID: 37961286. PMCID: PMC10634720. DOI: 10.1101/2023.10.21.563317.
Abstract
Explore-exploit research faces challenges in generalizability because of the limited theoretical grounding of exploration and exploitation. Neuroimaging can help address this issue by identifying whether explore-exploit decisions rely on an opponent processing system. We therefore conducted a coordinate-based meta-analysis (N=23 studies) and found activation in the dorsal lateral prefrontal cortex and anterior cingulate cortex during exploration versus exploitation, providing some evidence for opponent processing. However, the conjunction of explore-exploit decisions was associated with activation in the dorsal anterior cingulate cortex, dorsal medial prefrontal cortex, and anterior insula, suggesting that these brain regions do not engage in opponent processing. Further, exploratory analyses revealed heterogeneity in brain responses between task types during exploration and exploitation, respectively. Coupled with results suggesting that activation during exploration and exploitation is generally more similar than it is different, these findings indicate that significant challenges remain in characterizing explore-exploit decision making. Nonetheless, dlPFC and ACC activation differentiates explore and exploit decisions, and identifying these responses can inform targeted interventions aimed at manipulating these decisions.
4. Jin F, Yang L, Yang L, Li J, Li M, Shang Z. Dynamics Learning Rate Bias in Pigeons: Insights from Reinforcement Learning and Neural Correlates. Animals (Basel) 2024; 14:489. PMID: 38338131. PMCID: PMC10854969. DOI: 10.3390/ani14030489.
Abstract
Research in reinforcement learning indicates that animals respond differently to positive and negative reward prediction errors, which can be calculated by assuming learning rate bias. Many studies have shown that humans and other animals have learning rate bias during learning, but it is unclear whether and how the bias changes throughout the entire learning process. Here, we recorded the behavior data and the local field potentials (LFPs) in the striatum of five pigeons performing a probabilistic learning task. Reinforcement learning models with and without learning rate biases were used to dynamically fit the pigeons' choice behavior and estimate the option values. Furthermore, the correlation between the striatal LFPs power and the model-estimated option values was explored. We found that the pigeons' learning rate bias shifted from negative to positive during the learning process, and striatal gamma-band (31-80 Hz) power correlated with the option values modulated by dynamic learning rate bias. In conclusion, our results support the hypothesis that pigeons employ a dynamic learning strategy in the learning process from both behavioral and neural aspects, providing valuable insights into reinforcement learning mechanisms of non-human animals.
Collapse
Affiliation(s)
- Fuli Jin
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (F.J.); (L.Y.); (L.Y.); (J.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Lifang Yang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (F.J.); (L.Y.); (L.Y.); (J.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Long Yang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (F.J.); (L.Y.); (L.Y.); (J.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Jiajia Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (F.J.); (L.Y.); (L.Y.); (J.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Mengmeng Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (F.J.); (L.Y.); (L.Y.); (J.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Zhigang Shang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (F.J.); (L.Y.); (L.Y.); (J.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Institute of Medical Engineering Technology and Data Mining, Zhengzhou University, Zhengzhou 450001, China
| |
Collapse
|
5
|
Ihara K, Shikano Y, Kato S, Yagishita S, Tanaka KF, Takata N. A reinforcement learning model with choice traces for a progressive ratio schedule. Front Behav Neurosci 2024; 17:1302842. [PMID: 38268795 PMCID: PMC10806202 DOI: 10.3389/fnbeh.2023.1302842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2023] [Accepted: 12/13/2023] [Indexed: 01/26/2024] Open
Abstract
The progressive ratio (PR) lever-press task serves as a benchmark for assessing goal-oriented motivation. However, a well-recognized limitation of the PR task is that only a single data point, known as the breakpoint, is obtained from an entire session as a barometer of motivation. Because the breakpoint is defined as the final ratio of responses achieved in a PR session, variations in choice behavior during the PR task cannot be captured. We addressed this limitation by constructing four reinforcement learning models: a simple Q-learning model, an asymmetric model with two learning rates, a perseverance model with choice traces, and a perseverance model without learning. These models incorporated three behavioral choices: reinforced and non-reinforced lever presses and void magazine nosepokes, because we noticed that male mice performed frequent magazine nosepokes during PR tasks. The best model was the perseverance model, which predicted a gradual reduction in amplitudes of reward prediction errors (RPEs) upon void magazine nosepokes. We confirmed the prediction experimentally with fiber photometry of extracellular dopamine (DA) dynamics in the ventral striatum of male mice using a fluorescent protein (genetically encoded GPCR activation-based DA sensor: GRABDA2m). We verified application of the model by acute intraperitoneal injection of low-dose methamphetamine (METH) before a PR task, which increased the frequency of magazine nosepokes during the PR session without changing the breakpoint. The perseverance model captured behavioral modulation as a result of increased initial action values, which are customarily set to zero and disregarded in reinforcement learning analysis. Our findings suggest that the perseverance model reveals the effects of psychoactive drugs on choice behaviors during PR tasks.
Collapse
Affiliation(s)
- Keiko Ihara
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
| | - Yu Shikano
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
- Department of Biology, Stanford University, Stanford, CA, United States
| | - Sae Kato
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
| | - Sho Yagishita
- Center for Disease Biology and Integrative Medicine, Faculty of Medicine, The University of Tokyo, Tokyo, Japan
| | - Kenji F. Tanaka
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
| | - Norio Takata
- Division of Brain Sciences, Institute for Advanced Medical Research, Keio University School of Medicine, Tokyo, Japan
| |
Collapse
|
6
|
Chierchia G, Soukupová M, Kilford EJ, Griffin C, Leung J, Palminteri S, Blakemore SJ. Confirmatory reinforcement learning changes with age during adolescence. Dev Sci 2023; 26:e13330. [PMID: 36194156 PMCID: PMC7615280 DOI: 10.1111/desc.13330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2021] [Revised: 07/26/2022] [Accepted: 09/20/2022] [Indexed: 11/26/2022]
Abstract
Understanding how learning changes during human development has been one of the long-standing objectives of developmental science. Recently, advances in computational biology have demonstrated that humans display a bias when learning to navigate novel environments through rewards and punishments: they learn more from outcomes that confirm their expectations than from outcomes that disconfirm them. Here, we ask whether confirmatory learning is stable across development, or whether it might be attenuated in developmental stages in which exploration is beneficial, such as in adolescence. In a reinforcement learning (RL) task, 77 participants aged 11-32 years (four men, mean age = 16.26) attempted to maximize monetary rewards by repeatedly sampling different pairs of novel options, which varied in their reward/punishment probabilities. Mixed-effect models showed an age-related increase in accuracy as long as learning contingencies remained stable across trials, but less so when they reversed halfway through the trials. Age was also associated with a greater tendency to stay with an option that had just delivered a reward, more than to switch away from an option that had just delivered a punishment. At the computational level, a confirmation model provided increasingly better fit with age. This model showed that age differences are captured by decreases in noise or exploration, rather than in the magnitude of the confirmation bias. These findings provide new insights into how learning changes during development and could help better tailor learning environments to people of different ages. RESEARCH HIGHLIGHTS: Reinforcement learning shows age-related improvement during adolescence, but more in stable learning environments compared with volatile learning environments. People tend to stay with an option after a win more than they shift from an option after a loss, and this asymmetry increases with age during adolescence. 
Computationally, these changes are captured by a developing confirmatory learning style, in which people learn more from outcomes that confirm rather than disconfirm their choices. Age-related differences in confirmatory learning are explained by decreases in stochasticity, rather than changes in the magnitude of the confirmation bias.
Collapse
Affiliation(s)
- Gabriele Chierchia
- Department of Psychology, University of Cambridge, UK
- Institute of Cognitive Neuroscience, University College London, UK
| | | | - Emma J. Kilford
- Institute of Cognitive Neuroscience, University College London, UK
- Department of Clinical, Educational and Health Psychology, University College London, UK
| | - Cait Griffin
- Institute of Cognitive Neuroscience, University College London, UK
| | - Jovita Leung
- Institute of Cognitive Neuroscience, University College London, UK
| | - Stefano Palminteri
- Institute of Cognitive Neuroscience, University College London, UK
- Department of Cognitive Science, École Normale Supérieure, FR
- Institute of Cognitive Neuroscience, HSE, Moscow, Federation of Russia
| | - Sarah-Jayne Blakemore
- Department of Psychology, University of Cambridge, UK
- Institute of Cognitive Neuroscience, University College London, UK
| |
Collapse
|
7
|
Doya K, Friston K, Sugiyama M, Tenenbaum J. Neural Networks special issue on Artificial Intelligence and Brain Science. Neural Netw 2022; 155:328-329. [PMID: 36099665 DOI: 10.1016/j.neunet.2022.08.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Kenji Doya
- Okinawa Institute of Science and Technology Graduate University, Japan.
| | | | | | - Josh Tenenbaum
- Massachusetts Institute of Technology, United States of America
| |
Collapse
|
8
|
Palminteri S, Lebreton M. The computational roots of positivity and confirmation biases in reinforcement learning. Trends Cogn Sci 2022; 26:607-621. [PMID: 35662490 DOI: 10.1016/j.tics.2022.04.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2021] [Revised: 04/13/2022] [Accepted: 04/18/2022] [Indexed: 12/16/2022]
Abstract
Humans do not integrate new information objectively: outcomes carrying a positive affective value and evidence confirming one's own prior belief are overweighed. Until recently, theoretical and empirical accounts of the positivity and confirmation biases assumed them to be specific to 'high-level' belief updates. We present evidence against this account. Learning rates in reinforcement learning (RL) tasks, estimated across different contexts and species, generally present the same characteristic asymmetry, suggesting that belief and value updating processes share key computational principles and distortions. This bias generates over-optimistic expectations about the probability of making the right choices and, consequently, generates over-optimistic reward expectations. We discuss the normative and neurobiological roots of these RL biases and their position within the greater picture of behavioral decision-making theories.
Collapse
Affiliation(s)
- Stefano Palminteri
- Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et Recherche Médicale, Paris, France; Département d'Études Cognitives, Ecole Normale Supérieure, Paris, France; Université de Recherche Paris Sciences et Lettres, Paris, France.
| | - Maël Lebreton
- Paris School of Economics, Paris, France; LabNIC, Department of Fundamental Neurosciences, University of Geneva, Geneva, Switzerland; Swiss Center for Affective Science, Geneva, Switzerland.
| |
Collapse
|