1
Hackel LM, Kalkstein DA, Mende-Siedlecki P. Simplifying social learning. Trends Cogn Sci 2024; 28:428-440. [PMID: 38331595 DOI: 10.1016/j.tics.2024.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 01/16/2024] [Accepted: 01/17/2024] [Indexed: 02/10/2024]
Abstract
Social learning is complex, but people often seem to navigate social environments with ease. This ability creates a puzzle for traditional accounts of reinforcement learning (RL) that assume people negotiate a tradeoff between easy-but-simple behavior (model-free learning) and complex-but-difficult behavior (e.g., model-based learning). We offer a theoretical framework for resolving this puzzle: although social environments are complex, people have social expertise that helps them behave flexibly with low cognitive cost. Specifically, by using familiar concepts instead of focusing on novel details, people can turn hard learning problems into simpler ones. This ability highlights social learning as a prototype for studying cognitive simplicity in the face of environmental complexity and identifies a role for conceptual knowledge in everyday reward learning.
Affiliation(s)
- Leor M Hackel
  - University of Southern California, Los Angeles, CA 90089, USA
2
Barack DL, Bakkour A, Shohamy D, Salzman CD. Visuospatial information foraging describes search behavior in learning latent environmental features. Sci Rep 2023; 13:1126. [PMID: 36670132 PMCID: PMC9860038 DOI: 10.1038/s41598-023-27662-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/19/2022] [Accepted: 01/05/2023] [Indexed: 01/22/2023] Open
Abstract
In the real world, making sequences of decisions to achieve goals often depends upon the ability to learn aspects of the environment that are not directly perceptible. Learning these so-called latent features requires seeking information about them. Prior efforts to study latent feature learning often used single decisions, used few features, and failed to distinguish between reward-seeking and information-seeking. To overcome this, we designed a task in which humans and monkeys made a series of choices to search for shapes hidden on a grid. In our task, the effects of reward and information outcomes from uncovering parts of shapes could be disentangled. Members of both species adeptly learned the shapes and preferred to select tiles expected to be informative earlier in trials than previously rewarding ones, searching a part of the grid until their outcomes dropped below the average information outcome, a pattern consistent with foraging behavior. In addition, how quickly humans learned the shapes was predicted by how well their choice sequences matched the foraging pattern, revealing an unexpected connection between foraging and learning. This adaptive search for information may underlie the ability in humans and monkeys to learn latent features to support goal-directed behavior in the long run.
Affiliation(s)
- David L Barack
  - Department of Neuroscience, Columbia University, New York, USA
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, USA
- Akram Bakkour
  - Department of Psychology, University of Chicago, Chicago, USA
- Daphna Shohamy
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, USA
  - Department of Psychology, Columbia University, New York, USA
  - Kavli Institute for Brain Sciences, Columbia University, New York, USA
- C Daniel Salzman
  - Department of Neuroscience, Columbia University, New York, USA
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, USA
  - Kavli Institute for Brain Sciences, Columbia University, New York, USA
  - Department of Psychiatry, Columbia University, New York, USA
  - New York State Psychiatric Institute, New York, USA
3
Gatti D, Marelli M, Vecchi T, Rinaldi L. Spatial Representations Without Spatial Computations. Psychol Sci 2022; 33:1947-1958. [PMID: 36201754 DOI: 10.1177/09567976221094863] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022] Open
Abstract
Cognitive maps are assumed to be fundamentally spatial and grounded only in perceptual processes, as supported by the discovery of functionally dedicated cell types in the human brain, which tile the environment in a maplike fashion. Challenging this view, we demonstrate that spatial representations, such as large-scale geographical maps, can also be retrieved with high confidence from natural language through cognitively plausible artificial-intelligence models on the basis of nonspatial associative-learning mechanisms. More critically, we show that linguistic information accounts for the specific distortions observed in tasks when college-age adults have to judge the geographical positions of cities, even when these positions are estimated on real maps. These findings indicate that language experience can encode and reproduce cognitive maps without the need for a dedicated spatial-representation system, thus suggesting that the formation of these maps is the result of a strict interplay between spatial- and nonspatial-learning principles.
Affiliation(s)
- Daniele Gatti
  - Department of Brain and Behavioral Sciences, University of Pavia
- Marco Marelli
  - Department of Psychology, University of Milano-Bicocca
  - NeuroMI, Milan Center for Neuroscience, Milano, Italy
- Tomaso Vecchi
  - Department of Brain and Behavioral Sciences, University of Pavia
  - Cognitive Psychology Unit, IRCCS Mondino Foundation, Pavia, Italy
- Luca Rinaldi
  - Department of Brain and Behavioral Sciences, University of Pavia
  - Cognitive Psychology Unit, IRCCS Mondino Foundation, Pavia, Italy
4
Son JY, Bhandari A, FeldmanHall O. Cognitive maps of social features enable flexible inference in social networks. Proc Natl Acad Sci U S A 2021; 118:e2021699118. [PMID: 34518372 PMCID: PMC8488581 DOI: 10.1073/pnas.2021699118] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Accepted: 08/01/2021] [Indexed: 11/18/2022] Open
Abstract
In order to navigate a complex web of relationships, an individual must learn and represent the connections between people in a social network. However, the sheer size and complexity of the social world makes it impossible to acquire firsthand knowledge of all relations within a network, suggesting that people must make inferences about unobserved relationships to fill in the gaps. Across three studies (n = 328), we show that people can encode information about social features (e.g., hobbies, clubs) and subsequently deploy this knowledge to infer the existence of unobserved friendships in the network. Using computational models, we test various feature-based mechanisms that could support such inferences. We find that people's ability to successfully generalize depends on two representational strategies: a simple but inflexible similarity heuristic that leverages homophily, and a complex but flexible cognitive map that encodes the statistical relationships between social features and friendships. Together, our studies reveal that people can build cognitive maps encoding arbitrary patterns of latent relations in many abstract feature spaces, allowing social networks to be represented in a flexible format. Moreover, these findings shed light on open questions across disciplines about how people learn and represent social networks and may have implications for generating more human-like link prediction in machine learning algorithms.
Affiliation(s)
- Jae-Young Son
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912
- Apoorva Bhandari
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912
- Oriel FeldmanHall
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912
  - Carney Institute for Brain Sciences, Brown University, Providence, RI 02912
5
Margraf L, Krause D, Weigelt M. Valence-dependent Neural Correlates of Augmented Feedback Processing in Extensive Motor Sequence Learning - Part I: Practice-related Changes of Feedback Processing. Neuroscience 2021; 486:4-19. [PMID: 33945843 DOI: 10.1016/j.neuroscience.2021.04.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/05/2020] [Revised: 04/17/2021] [Accepted: 04/19/2021] [Indexed: 11/15/2022]
Abstract
Several event-related potentials (ERPs) are associated with the processing of valence-dependent augmented feedback during the practice of motor tasks. In this study, 38 students learned a sequential arm-movement task with 192 trials in each of five practice sessions (960 practice trials in total), to examine practice-related changes in neural feedback processing. Electroencephalogram (EEG) was recorded in the first and last practice session. An adaptive bandwidth for movement accuracy led to equal amounts of positive and negative feedback. A frontally located negative deflection in the time window of the feedback-related negativity (FRN) was more negative for negative feedback and might reflect reward prediction errors in reinforcement learning. This negativity increased after extensive practice, which might indicate that smaller errors are harder to identify in the later phase. The late fronto-central positivity (LFCP) was more positive for negative feedback and is assumed to be associated with supervised learning and behavioral adaptations based on feedback with higher complexity. No practice-related changes of the LFCP were observed, which suggests that complex feedback is processed independently of the practice phase. The P300 displayed a more positive activation for positive feedback, which might be interpreted as the higher significance of positive feedback for the updating of internal models in this setting. A valence-independent increase of the P300 amplitude after practice might reflect an improved ability to update the internal representation based on feedback information. These results demonstrate that valence-dependent neural feedback processing changes with extensive practice of a novel motor task. Dissociating changes in latencies of different components support the assumption that they are related to distinct mechanisms of feedback-dependent learning.
Affiliation(s)
- Linda Margraf
  - Psychology and Movement Science, Department of Sport and Health, Paderborn University, Germany
- Daniel Krause
  - Psychology and Movement Science, Department of Sport and Health, Paderborn University, Germany
- Matthias Weigelt
  - Psychology and Movement Science, Department of Sport and Health, Paderborn University, Germany
6
Eckstein MK, Collins AGE. Computational evidence for hierarchically structured reinforcement learning in humans. Proc Natl Acad Sci U S A 2020; 117:29381-29389. [PMID: 33229518 PMCID: PMC7703642 DOI: 10.1073/pnas.1912330117] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 12/16/2022] Open
Abstract
Humans have the fascinating ability to achieve goals in a complex and constantly changing world, still surpassing modern machine-learning algorithms in terms of flexibility and learning speed. It is generally accepted that a crucial factor for this ability is the use of abstract, hierarchical representations, which employ structure in the environment to guide learning and decision making. Nevertheless, how we create and use these hierarchical representations is poorly understood. This study presents evidence that human behavior can be characterized as hierarchical reinforcement learning (RL). We designed an experiment to test specific predictions of hierarchical RL using a series of subtasks in the realm of context-based learning and observed several behavioral markers of hierarchical RL, such as asymmetric switch costs between changes in higher-level versus lower-level features, faster learning in higher-valued compared to lower-valued contexts, and preference for higher-valued compared to lower-valued contexts. We replicated these results across three independent samples. We simulated three models (a classic RL, a hierarchical RL, and a hierarchical Bayesian model) and compared their behavior to human results. While the flat RL model captured some aspects of participants' sensitivity to outcome values, and the hierarchical Bayesian model captured some markers of transfer, only hierarchical RL accounted for all patterns observed in human behavior. This work shows that hierarchical RL, a biologically inspired and computationally simple algorithm, can capture human behavior in complex, hierarchical environments and opens the avenue for future research in this field.
Affiliation(s)
- Maria K Eckstein
  - Department of Psychology, University of California, Berkeley, CA 94704
- Anne G E Collins
  - Department of Psychology, University of California, Berkeley, CA 94704
7
Zhou D, Lydon-Staley DM, Zurn P, Bassett DS. The growth and form of knowledge networks by kinesthetic curiosity. Curr Opin Behav Sci 2020; 35:125-134. [PMID: 34355045 PMCID: PMC8330694 DOI: 10.1016/j.cobeha.2020.09.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
Throughout life, we might seek a calling, companions, skills, entertainment, truth, self-knowledge, beauty, and edification. The practice of curiosity can be viewed as an extended and open-ended search for valuable information with hidden identity and location in a complex space of interconnected information. Despite its importance, curiosity has been challenging to computationally model because the practice of curiosity often flourishes without specific goals, external reward, or immediate feedback. Here, we show how network science, statistical physics, and philosophy can be integrated into an approach that coheres with and expands the psychological taxonomies of specific-diversive and perceptual-epistemic curiosity. Using this interdisciplinary approach, we distill functional modes of curious information seeking as searching movements in information space. The kinesthetic model of curiosity offers a vibrant counterpart to the deliberative predictions of model-based reinforcement learning. In doing so, this model unearths new computational opportunities for identifying what makes curiosity curious.
Affiliation(s)
- Dale Zhou
  - Neuroscience Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- David M. Lydon-Staley
  - Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania
  - Annenberg School for Communication, University of Pennsylvania
  - Leonard Davis Institute of Health Economics, University of Pennsylvania
- Perry Zurn
  - Department of Philosophy & Religion, American University, Washington, D.C.
- Danielle S. Bassett
  - Department of Bioengineering, School of Engineering and Applied Sciences, University of Pennsylvania
  - Department of Physics & Astronomy, College of Arts and Sciences, University of Pennsylvania
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania
  - Department of Neurology, Perelman School of Medicine, University of Pennsylvania
  - Department of Electrical & Systems Engineering, School of Engineering and Applied Sciences, University of Pennsylvania
  - Santa Fe Institute, Santa Fe, NM 87501, USA
8
Momennejad I. Learning Structures: Predictive Representations, Replay, and Generalization. Curr Opin Behav Sci 2020; 32:155-166. [DOI: 10.1016/j.cobeha.2020.02.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/26/2022]
9
Franklin NT, Frank MJ. Generalizing to generalize: Humans flexibly switch between compositional and conjunctive structures during reinforcement learning. PLoS Comput Biol 2020; 16:e1007720. [PMID: 32282795 PMCID: PMC7179934 DOI: 10.1371/journal.pcbi.1007720] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/25/2019] [Revised: 04/23/2020] [Accepted: 02/11/2020] [Indexed: 12/02/2022] Open
Abstract
Humans routinely face novel environments in which they have to generalize in order to act adaptively. However, doing so involves the non-trivial challenge of deciding which aspects of a task domain to generalize. While it is sometimes appropriate to simply re-use a learned behavior, often adaptive generalization entails recombining distinct components of knowledge acquired across multiple contexts. Theoretical work has suggested a computational trade-off in which it can be more or less useful to learn and generalize aspects of task structure jointly or compositionally, depending on previous task statistics, but it is unknown whether humans modulate their generalization strategy accordingly. Here we develop a series of navigation tasks that separately manipulate the statistics of goal values ("what to do") and state transitions ("how to do it") across contexts and assess whether human subjects generalize these task components separately or conjunctively. We find that human generalization is sensitive to the statistics of the previously experienced task domain, favoring compositional or conjunctive generalization when the task statistics are indicative of such structures, and a mixture of the two when they are more ambiguous. These results support a normative "meta-generalization" account and suggest that people not only generalize previous task components but also generalize the statistical structure most likely to support generalization.
Affiliation(s)
- Nicholas T. Franklin
  - Department of Psychology, Harvard University, Cambridge, Massachusetts, United States of America
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, Rhode Island, United States of America
- Michael J. Frank
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, Rhode Island, United States of America
  - Carney Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
10
Colino FL, Heath M, Hassall CD, Krigolson OE. Electroencephalographic evidence for a reinforcement learning advantage during motor skill acquisition. Biol Psychol 2020; 151:107849. [DOI: 10.1016/j.biopsycho.2020.107849] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/04/2019] [Revised: 01/12/2020] [Accepted: 01/19/2020] [Indexed: 01/12/2023]
11
Schulz E, Franklin NT, Gershman SJ. Finding structure in multi-armed bandits. Cogn Psychol 2020; 119:101261. [PMID: 32059133 DOI: 10.1016/j.cogpsych.2019.101261] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/29/2018] [Revised: 11/10/2019] [Accepted: 12/02/2019] [Indexed: 12/24/2022]
Abstract
How do humans search for rewards? This question is commonly studied using multi-armed bandit tasks, which require participants to trade off exploration and exploitation. Standard multi-armed bandits assume that each option has an independent reward distribution. However, learning about options independently is unrealistic, since in the real world options often share an underlying structure. We study a class of structured bandit tasks, which we use to probe how generalization guides exploration. In a structured multi-armed bandit, options have a correlation structure dictated by a latent function. We focus on bandits in which rewards are linear functions of an option's spatial position. Across 5 experiments, we find evidence that participants utilize functional structure to guide their exploration, and also exhibit a learning-to-learn effect across rounds, becoming progressively faster at identifying the latent function. Our experiments rule out several heuristic explanations and show that the same findings obtain with non-linear functions. Comparing several models of learning and decision making, we find that the best model of human behavior in our tasks combines three computational mechanisms: (1) function learning, (2) clustering of reward distributions across rounds, and (3) uncertainty-guided exploration. Our results suggest that human reinforcement learning can utilize latent structure in sophisticated ways to improve efficiency.
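A structured bandit of the kind this abstract describes, where reward is a linear function of an option's position, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' task or model: the number of arms, slope, intercept, noise level, and the least-squares learner with greedy choice are all hypothetical choices made for the example. It covers function learning only and leaves out the clustering and uncertainty-guided exploration mechanisms the abstract mentions.

```python
import random


def make_linear_bandit(slope=5.0, intercept=10.0, noise_sd=1.0, seed=0):
    """Return a pull(arm) function whose reward is linear in arm position.

    All parameter values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)

    def pull(arm):
        return intercept + slope * arm + rng.gauss(0.0, noise_sd)

    return pull


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All observations at one position: no slope information yet.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def run_round(pull, n_arms=8, n_trials=10):
    """Sample both ends of the arm range, then choose greedily by the fitted line."""
    xs, ys = [], []
    for arm in (0, n_arms - 1):
        xs.append(arm)
        ys.append(pull(arm))
    for _ in range(n_trials - 2):
        slope, intercept = fit_line(xs, ys)
        # Generalize: predict every arm's reward, including arms never tried.
        arm = max(range(n_arms), key=lambda a: intercept + slope * a)
        xs.append(arm)
        ys.append(pull(arm))
    return xs, ys


if __name__ == "__main__":
    choices, rewards = run_round(make_linear_bandit())
    print(choices)
```

Because the learner fits one function across all positions, two observations are enough to rank every arm, which is the sense in which shared structure makes learning faster than treating arms independently.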
12
Bejjani C, Egner T. Spontaneous Task Structure Formation Results in a Cost to Incidental Memory of Task Stimuli. Front Psychol 2019; 10:2833. [PMID: 31920866 PMCID: PMC6929588 DOI: 10.3389/fpsyg.2019.02833] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/09/2019] [Accepted: 12/02/2019] [Indexed: 11/24/2022] Open
Abstract
Humans are characterized by their ability to leverage rules for classifying and linking stimuli to context-appropriate actions. Previous studies have shown that when humans learn stimulus-response associations for two-dimensional stimuli, they implicitly form and generalize hierarchical rule structures (task-sets). However, the cognitive processes underlying structure formation are poorly understood. Across four experiments, we manipulated how trial-unique images mapped onto responses to bias spontaneous task-set formation and investigated structure learning through the lens of incidental stimulus encoding. Participants performed a learning task designed to either promote task-set formation (by “motor-clustering” possible stimulus-action rules), or to discourage it (by using arbitrary category-response mappings). We adjudicated between two hypotheses: Structure learning may promote attention to task stimuli, thus resulting in better subsequent memory. Alternatively, building task-sets might impose cognitive demands (for instance, on working memory) that divert attention away from stimulus encoding. While the clustering manipulation affected task-set formation, there were also substantial individual differences. Importantly, structure learning incurred a cost: spontaneous task-set formation was associated with diminished stimulus encoding. Thus, spontaneous hierarchical task-set formation appears to involve cognitive demands that divert attention away from encoding of task stimuli during structure learning.
Affiliation(s)
- Christina Bejjani
  - Department of Psychology and Neuroscience, Duke University, Durham, NC, United States
  - Center for Cognitive Neuroscience, Duke University, Durham, NC, United States
- Tobias Egner
  - Department of Psychology and Neuroscience, Duke University, Durham, NC, United States
  - Center for Cognitive Neuroscience, Duke University, Durham, NC, United States
13
14
Deterministic response strategies in a trial-and-error learning task. PLoS Comput Biol 2018; 14:e1006621. [PMID: 30496285 PMCID: PMC6289466 DOI: 10.1371/journal.pcbi.1006621] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 02/13/2018] [Revised: 12/11/2018] [Accepted: 11/02/2018] [Indexed: 01/22/2023] Open
Abstract
Trial-and-error learning is a universal strategy for establishing which actions are beneficial or harmful in new environments. However, learning stimulus-response associations solely via trial-and-error is often suboptimal, as in many settings dependencies among stimuli and responses can be exploited to increase learning efficiency. Previous studies have shown that in settings featuring such dependencies, humans typically engage high-level cognitive processes and employ advanced learning strategies to improve their learning efficiency. Here we analyze in detail the initial learning phase of a sample of human subjects (N = 85) performing a trial-and-error learning task with deterministic feedback and hidden stimulus-response dependencies. Using computational modeling, we find that the standard Q-learning model cannot sufficiently explain human learning strategies in this setting. Instead, newly introduced deterministic response models, which are theoretically optimal and transform stimulus sequences unambiguously into response sequences, provide the best explanation for 50.6% of the subjects. Most of the remaining subjects either show a tendency towards generic optimal learning (21.2%) or at least partially exploit stimulus-response dependencies (22.3%), while a few subjects (5.9%) show no clear preference for any of the employed models. After the initial learning phase, asymptotic learning performance during the subsequent practice phase is best explained by the standard Q-learning model. Our results show that human learning strategies in the presented trial-and-error learning task go beyond merely associating stimuli and responses via incremental reinforcement. Specifically during initial learning, high-level cognitive processes support sophisticated learning strategies that increase learning efficiency while keeping memory demands and computational efforts bounded. The good asymptotic fit of the Q-learning model indicates that these cognitive processes are successively replaced by the formation of stimulus-response associations over the course of learning.
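For readers unfamiliar with the baseline this abstract refers to, incremental stimulus-response learning is commonly written as a delta-rule update, Q(s, a) ← Q(s, a) + α (r − Q(s, a)), paired with softmax action selection. The sketch below shows that generic form only; it is not the paper's fitted model, and the task size, stimulus-response mapping, learning rate `alpha`, and inverse temperature `beta` are hypothetical values chosen for illustration.

```python
import math
import random


def softmax_choice(q_values, beta, rng):
    """Sample an action index with probability proportional to exp(beta * Q)."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - top)) for q in q_values]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for action, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return action
    return len(weights) - 1


def q_update(q, stimulus, action, reward, alpha):
    """Delta rule: move Q(s, a) toward the obtained reward by a fraction alpha."""
    q[stimulus][action] += alpha * (reward - q[stimulus][action])


def simulate(n_stimuli=4, n_actions=4, n_trials=400, alpha=0.3, beta=5.0, seed=1):
    """Run a one-step stimulus-response task with deterministic 0/1 feedback."""
    rng = random.Random(seed)
    # Hypothetical mapping: each stimulus has exactly one correct response.
    correct = [s % n_actions for s in range(n_stimuli)]
    q = [[0.0] * n_actions for _ in range(n_stimuli)]
    outcomes = []
    for _ in range(n_trials):
        stimulus = rng.randrange(n_stimuli)
        action = softmax_choice(q[stimulus], beta, rng)
        reward = 1.0 if action == correct[stimulus] else 0.0
        q_update(q, stimulus, action, reward, alpha)
        outcomes.append(reward)
    return q, outcomes


if __name__ == "__main__":
    q_table, outcomes = simulate()
    print(sum(outcomes[:100]), sum(outcomes[-100:]))
```

Note that this learner treats every stimulus independently, so it cannot exploit dependencies among stimuli and responses, which is the limitation the abstract contrasts with the deterministic response models.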
15
Flesch T, Balaguer J, Dekker R, Nili H, Summerfield C. Comparing continual task learning in minds and machines. Proc Natl Acad Sci U S A 2018; 115:E10313-E10322. [PMID: 30322916 PMCID: PMC6217400 DOI: 10.1073/pnas.1800755115] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Indexed: 01/14/2023] Open
Abstract
Humans can learn to perform multiple tasks in succession over the lifespan ("continual" learning), whereas current machine learning systems fail. Here, we investigated the cognitive mechanisms that permit successful continual learning in humans and harnessed our behavioral findings for neural network design. Humans categorized naturalistic images of trees according to one of two orthogonal task rules that were learned by trial and error. Training regimes that focused on individual rules for prolonged periods (blocked training) improved human performance on a later test involving randomly interleaved rules, compared with control regimes that trained in an interleaved fashion. Analysis of human error patterns suggested that blocked training encouraged humans to form a "factorized" representation that optimally segregated the tasks, especially for those individuals with a strong prior bias to represent the stimulus space in a well-structured way. By contrast, standard supervised deep neural networks trained on the same tasks suffered catastrophic forgetting under blocked training, due to representational interference in the deeper layers. However, augmenting deep networks with an unsupervised generative model that allowed them to first learn a good embedding of the stimulus space (similar to that observed in humans) reduced catastrophic forgetting under blocked training. Building artificial agents that first learn a model of the world may be one promising route to solving continual task performance in artificial intelligence research.
Affiliation(s)
- Timo Flesch
  - Department of Experimental Psychology, University of Oxford, OX2 6BW Oxford, United Kingdom
- Jan Balaguer
  - Department of Experimental Psychology, University of Oxford, OX2 6BW Oxford, United Kingdom
  - DeepMind, EC4A 3TW London, United Kingdom
- Ronald Dekker
  - Department of Experimental Psychology, University of Oxford, OX2 6BW Oxford, United Kingdom
- Hamed Nili
  - Department of Experimental Psychology, University of Oxford, OX2 6BW Oxford, United Kingdom
- Christopher Summerfield
  - Department of Experimental Psychology, University of Oxford, OX2 6BW Oxford, United Kingdom
  - DeepMind, EC4A 3TW London, United Kingdom
16
Frontal Cortex and the Hierarchical Control of Behavior. Trends Cogn Sci 2017; 22:170-188. [PMID: 29229206 DOI: 10.1016/j.tics.2017.11.005] [Citation(s) in RCA: 314] [Impact Index Per Article: 44.9] [Received: 08/01/2017] [Revised: 11/03/2017] [Accepted: 11/16/2017] [Indexed: 12/23/2022]
Abstract
The frontal lobes are important for cognitive control, yet their functional organization remains controversial. An influential class of theory proposes that the frontal lobes are organized along their rostrocaudal axis to support hierarchical cognitive control. Here, we take an updated look at the literature on hierarchical control, with particular focus on the functional organization of lateral frontal cortex. Our review of the evidence supports neither a unitary model of lateral frontal function nor a unidimensional abstraction gradient. Rather, separate frontal networks interact via local and global hierarchical structure to support diverse task demands.
17
Alexander WH, Brown JW, Collins AGE, Hayden BY, Vassena E. Prefrontal Cortex in Control: Broadening the Scope to Identify Mechanisms. J Cogn Neurosci 2017; 30:1061-1065. [PMID: 28562208 DOI: 10.1162/jocn_a_01154] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]
Abstract
Sometime in the past two decades, neuroimaging and behavioral research converged on pFC as an important locus of cognitive control and decision-making, and that seems to be the last thing anyone has agreed on since. Every year sees an increase in the number of roles and functions attributed to distinct subregions within pFC, roles that may explain behavior and neural activity in one context but might fail to generalize across the many behaviors in which each region is implicated. Emblematic of this ongoing proliferation of functions is dorsal ACC (dACC). Novel tasks that activate dACC are followed by novel interpretations of dACC function, and each new interpretation adds to the number of functionally specific processes contained within the region. This state of affairs, a recurrent and persistent behavior followed by an illusory and transient relief, can be likened to behavioral pathology. In Journal of Cognitive Neuroscience, 29:10 we collect contributed articles that seek to move the conversation beyond specific functions of subregions of pFC, focusing instead on general roles that support pFC involvement in a wide variety of behaviors and across a variety of experimental paradigms.