1
Chalmers E, Luczak A. A bio-inspired reinforcement learning model that accounts for fast adaptation after punishment. Neurobiol Learn Mem 2024; 215:107974. [PMID: 39209018 DOI: 10.1016/j.nlm.2024.107974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 08/14/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024]
Abstract
Humans and animals can quickly learn a new strategy when a previously rewarding strategy is punished. It is difficult to model this with reinforcement learning methods, because they tend to perseverate on previously learned strategies, a hallmark of impaired response to punishment. Past work has addressed this by augmenting conventional reinforcement learning equations with ad hoc parameters or parallel learning systems. This produces reinforcement learning models that account for reversal learning, but are more abstract, complex, and somewhat detached from neural substrates. Here we use a different approach: we generalize a recently discovered neuron-level learning rule, on the assumption that it captures a basic principle of learning that may occur at the whole-brain level. Surprisingly, this gives a new reinforcement learning rule that accounts for adaptation and lose-shift behavior, and uses only the same parameters as conventional reinforcement learning equations. In the new rule, the normal reward prediction errors that drive reinforcement learning are scaled by the likelihood the agent assigns to the action that triggered a reward or punishment. The new rule demonstrates quick adaptation in card sorting and variable Iowa gambling tasks, and also exhibits a human-like paradox-of-choice effect. It will be useful for experimental researchers modeling learning and behavior.
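The idea of scaling the reward prediction error by the action's likelihood can be illustrated with a minimal sketch. The abstract does not give the equation, so everything below is an assumption made for illustration: the tabular value list `q`, the softmax policy, the function names, and the exact placement of the scaling factor are hypothetical and should not be read as the authors' published rule.

```python
import math


def softmax(values, temperature=1.0):
    """Turn action values into choice probabilities (assumed policy)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def conventional_update(q, action, reward, alpha=0.1):
    """Standard delta rule: the value moves by alpha times the prediction error."""
    delta = reward - q[action]
    q[action] += alpha * delta
    return delta


def likelihood_scaled_update(q, action, reward, alpha=0.1, temperature=1.0):
    """Hypothetical variant: the prediction error is additionally multiplied
    by the probability the agent assigned to the chosen action."""
    probs = softmax(q, temperature)  # probabilities before the update
    delta = reward - q[action]
    q[action] += alpha * probs[action] * delta
    return delta


# Punishing a strongly preferred action changes its value more than
# punishing an unlikely action with the same prediction error.
preferred = [1.0, 0.0]
unlikely = [0.0, 1.0]
likelihood_scaled_update(preferred, action=0, reward=-1.0, alpha=0.5)
likelihood_scaled_update(unlikely, action=0, reward=-2.0, alpha=0.5)
```

In this sketch both calls see a prediction error of -2, but the preferred action starts with a choice probability of about 0.73 and the unlikely one with about 0.27, so the preferred action's value changes by the larger amount. No extra parameters are introduced beyond the learning rate and softmax temperature.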
Affiliation(s)
- Eric Chalmers
- Department of Mathematics and Computing, Mount Royal University, 4825 Mt Royal Gate SW, Calgary, AB T3E 6K6, Canada.
- Artur Luczak
- Canadian Center for Behavioral Neuroscience, University of Lethbridge, 4401 University Dr W, Lethbridge, AB T1K 3M4, Canada.
2
Yang L, Jin F, Yang L, Li J, Li Z, Li M, Shang Z. The Hippocampus in Pigeons Contributes to the Model-Based Valuation and the Relationship between Temporal Context States. Animals (Basel) 2024; 14:431. [PMID: 38338074 PMCID: PMC10854895 DOI: 10.3390/ani14030431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 01/25/2024] [Accepted: 01/25/2024] [Indexed: 02/12/2024] Open
Abstract
Model-based decision-making guides an organism's behavior by representing the relationships between different states. Previous studies have shown that the mammalian hippocampus (Hp) plays a key role in learning the structure of relationships among experiences. However, the hippocampal neural mechanisms of model-based learning in birds have rarely been reported. Here, we trained six pigeons to perform a two-step task and explored whether their Hp contributes to model-based learning. Behavioral performance and hippocampal multi-channel local field potentials (LFPs) were recorded during the task. We estimated the subjective values using a reinforcement learning model dynamically fitted to the pigeons' choice behavior. The results show that the model-based learner can capture the behavioral choices of pigeons well throughout the learning process. Neural analysis indicated that high-frequency (12-100 Hz) power in Hp represented the temporal context states. Moreover, dynamic correlation and decoding results provided further support for the high-frequency dependence of model-based valuations. In addition, we observed a significant increase in hippocampal neural similarity in the low-frequency band (1-12 Hz) for common temporal context states after learning. Overall, our findings suggest that pigeons use model-based inferences to learn multi-step tasks, and that multiple LFP frequency bands collaboratively contribute to model-based learning. Specifically, the high-frequency (12-100 Hz) oscillations represent model-based valuations, while the low-frequency (1-12 Hz) neural similarity is influenced by the relationship between temporal context states. These results contribute to our understanding of the neural mechanisms underlying model-based learning and broaden the scope of hippocampal contributions to avian behavior.
Affiliation(s)
- Lifang Yang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Fuli Jin
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Long Yang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Jiajia Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Zhihui Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Institute of Medical Engineering Technology and Data Mining, Zhengzhou University, Zhengzhou 450001, China
- Mengmeng Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Zhigang Shang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Institute of Medical Engineering Technology and Data Mining, Zhengzhou University, Zhengzhou 450001, China