Ahilan S, Solomon RB, Breton YA, Conover K, Niyogi RK, Shizgal P, Dayan P. Learning to use past evidence in a sophisticated world model.
PLoS Comput Biol 2019;
15:e1007093. [PMID:
31233559 PMCID:
PMC6611652 DOI:
10.1371/journal.pcbi.1007093]
[Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2018] [Revised: 07/05/2019] [Accepted: 05/09/2019] [Indexed: 12/02/2022] Open
Abstract
Humans and other animals are able to discover underlying statistical structure in their environments and exploit it to achieve efficient and effective performance. However, such structure is often difficult to learn and use because it is obscure, involving long-range temporal dependencies. Here, we analysed behavioural data from an extended experiment with rats, showing that the subjects learned the underlying statistical structure, albeit suffering at times from immediate inferential imperfections as to their current state within it. We accounted for their behaviour using a Hidden Markov Model, in which recent observations are integrated with evidence from the past. We found that over the course of training, subjects came to track their progress through the task more accurately, a change that our model largely attributed to improved integration of past evidence. This learning reflected the structure of the task, decreasing reliance on recent observations, which were potentially misleading.
Humans and other animals possess the remarkable ability to find and exploit patterns and structures in their experience of a complex and varied world. However, such structures are often temporally extended and latent or hidden, being only partially correlated with immediate observations of the world. This makes it essential to integrate current and historical information, and creates a challenging statistical and computational problem. Here, we examine the behaviour of rats facing a version of this challenge posed by a brain-stimulation reward task. We find that subjects learned the general structure of the task, but struggled when immediate observations were misleading. We captured this behaviour with a model in which subjects integrated evidence from recent observations together with evidence from the past. The subjects’ performance improved markedly over successive sessions, allowing them to overcome misleading observations. According to our model, this was made possible by more effective usage of past evidence to better determine the true state of the world.
Collapse