1. Webb TW, Frankland SM, Altabaa A, Segert S, Krishnamurthy K, Campbell D, Russin J, Giallanza T, O'Reilly R, Lafferty J, Cohen JD. The relational bottleneck as an inductive bias for efficient abstraction. Trends Cogn Sci 2024; 28:829-843. PMID: 38729852. DOI: 10.1016/j.tics.2024.04.001.
Abstract
A central challenge for cognitive science is to explain how abstract concepts are acquired from limited experience. This has often been framed in terms of a dichotomy between connectionist and symbolic cognitive models. Here, we highlight a recently emerging line of work that suggests a novel reconciliation of these approaches, by exploiting an inductive bias that we term the relational bottleneck. In that approach, neural networks are constrained via their architecture to focus on relations between perceptual inputs, rather than the attributes of individual inputs. We review a family of models that employ this approach to induce abstractions in a data-efficient manner, emphasizing their potential as candidate models for the acquisition of abstract concepts in the human mind and brain.
2. Rao RPN. A sensory-motor theory of the neocortex. Nat Neurosci 2024; 27:1221-1235. PMID: 38937581. DOI: 10.1038/s41593-024-01673-9.
Abstract
Recent neurophysiological and neuroanatomical studies suggest a close interaction between sensory and motor processes across the neocortex. Here, I propose that the neocortex implements active predictive coding (APC): each cortical area estimates both latent sensory states and actions (including potentially abstract actions internal to the cortex), and the cortex as a whole predicts the consequences of actions at multiple hierarchical levels. Feedback from higher areas modulates the dynamics of state and action networks in lower areas. I show how the same APC architecture can explain (1) how we recognize an object and its parts using eye movements, (2) why perception seems stable despite eye movements, (3) how we learn compositional representations, for example, part-whole hierarchies, (4) how complex actions can be planned using simpler actions, and (5) how we form episodic memories of sensory-motor experiences and learn abstract concepts such as a family tree. I postulate a mapping of the APC model to the laminar architecture of the cortex and suggest possible roles for cortico-cortical and cortico-subcortical pathways.
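As a rough illustration of the error-correction loop at the heart of such schemes, here is a minimal one-dimensional sketch (not the paper's model; the linear `predict` dynamics, the learning rate, and all values are our own illustrative assumptions): a latent state predicts the sensory consequence of an action, and the prediction error corrects the estimate.

```python
# Minimal one-level sketch of an active predictive-coding loop: a latent
# state predicts the sensory consequence of an action, and the
# prediction error corrects (and advances) the state estimate.
def predict(state, action):
    # Hypothetical linear dynamics: acting adds `action` to the state.
    return state + action

def apc_step(state, action, observed, lr=0.5):
    prior = predict(state, action)   # predicted next input
    error = observed - prior         # sensory prediction error
    return prior + lr * error        # error-corrected state estimate

# Track an input that increases by 1 per step (the chosen action).
state = 0.0
for t in range(20):
    observed = float(t + 1)
    state = apc_step(state, 1.0, observed)
# With a correct action model, errors vanish and `state` tracks the
# input exactly (state == 20.0 after the loop).
```

Stacking such loops hierarchically, with higher levels predicting more abstract actions and coarser timescales, gives the flavor of the full multi-level architecture.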
Affiliation(s)
- Rajesh P N Rao: Center for Neurotechnology and Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
3. Alamia A, VanRullen R. A traveling waves perspective on temporal binding. J Cogn Neurosci 2024; 36:721-729. PMID: 37172133. DOI: 10.1162/jocn_a_02004.
Abstract
Brain oscillations are involved in many cognitive processes, and several studies have investigated their role in cognition. In particular, the phase of certain oscillations has been related to temporal binding and integration processes, with some authors arguing that perception could be an inherently rhythmic process. However, previous research on oscillations mostly overlooked their spatial component: how oscillations propagate through the brain as traveling waves, with systematic phase delays between brain regions. Here, we argue that interpreting oscillations as traveling waves is a useful paradigm shift to understand their role in temporal binding and address controversial results. After a brief definition of traveling waves, we propose an original view on temporal integration that considers this new perspective. We first focus on cortical dynamics, then speculate about the role of thalamic nuclei in modulating the waves, and on the possible consequences for rhythmic temporal binding. In conclusion, we highlight the importance of considering oscillations as traveling waves when investigating their role in cognitive functions.
Affiliation(s)
- Andrea Alamia: CNRS Centre de Recherche Cerveau et Cognition (CERCO, UMR 5549), Toulouse, France
- Rufin VanRullen: CNRS Centre de Recherche Cerveau et Cognition (CERCO, UMR 5549), Toulouse, France
4. Heitmeier M, Chuang YY, Baayen RH. How trial-to-trial learning shapes mappings in the mental lexicon: Modelling lexical decision with linear discriminative learning. Cogn Psychol 2023; 146:101598. PMID: 37716109. PMCID: PMC10589761. DOI: 10.1016/j.cogpsych.2023.101598.
Abstract
Trial-to-trial effects have been found in a number of studies, indicating that processing a stimulus influences responses in subsequent trials. A special case is priming effects, which have been modelled successfully with error-driven learning (Marsolek, 2008), implying that participants learn continuously during experiments. This study investigates whether trial-to-trial learning can be detected in an unprimed lexical decision experiment. We used the Discriminative Lexicon Model (DLM; Baayen et al., 2019), a model of the mental lexicon with meaning representations from distributional semantics, which models error-driven incremental learning with the Widrow-Hoff rule. We used data from the British Lexicon Project (BLP; Keuleers et al., 2012) and simulated the lexical decision experiment with the DLM on a trial-by-trial basis for each subject individually. Reaction times were then predicted with Generalized Additive Models (GAMs), using measures derived from the DLM simulations as predictors. We extracted measures from two simulations per subject (one with learning updates between trials and one without) and used them as input to two GAMs. Learning-based models showed better model fit than the non-learning ones for the majority of subjects. Our measures also provide insights into lexical processing and individual differences. This demonstrates the potential of the DLM to model behavioural data and leads to the conclusion that trial-to-trial learning can indeed be detected in unprimed lexical decision. Our results support the possibility that our lexical knowledge is subject to continuous change.
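The Widrow-Hoff rule used for the incremental updates is the classic delta rule: after each trial, a linear form-to-meaning mapping is nudged in proportion to its prediction error. A minimal sketch (the toy cue and outcome vectors are illustrative, not BLP data):

```python
# Widrow-Hoff (delta-rule) update for a linear mapping W from cue
# vectors to outcome vectors: each trial nudges W toward reducing
# the prediction error for that trial's cue.
def widrow_hoff(W, cue, outcome, lr=0.1):
    # predicted outcome vector for this cue
    pred = [sum(W[i][j] * cue[j] for j in range(len(cue)))
            for i in range(len(outcome))]
    # delta rule: dW[i][j] = lr * (target_i - pred_i) * cue_j
    for i in range(len(outcome)):
        err = outcome[i] - pred[i]
        for j in range(len(cue)):
            W[i][j] += lr * err * cue[j]
    return W

# Toy lexicon: two orthogonal form cues mapped to one-hot "meanings".
W = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(100):
    W = widrow_hoff(W, [1.0, 0.0], [1.0, 0.0])  # trial with word A
    W = widrow_hoff(W, [0.0, 1.0], [0.0, 1.0])  # trial with word B
# After repeated trials W approximates the identity mapping, and
# learning slows as errors shrink.
```

Because the update depends on the current error, each trial leaves a small trace in the mapping, which is exactly what makes trial-to-trial effects detectable.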
5. Chapman GW, Hasselmo ME. Predictive learning by a burst-dependent learning rule. Neurobiol Learn Mem 2023; 205:107826. PMID: 37696414. DOI: 10.1016/j.nlm.2023.107826.
Abstract
Humans and other animals are able to quickly generalize the latent dynamics of spatiotemporal sequences, often from a minimal number of previous experiences. Additionally, internal representations of external stimuli must remain stable, even in the presence of sensory noise, in order to be useful for informing behavior. In contrast, typical machine learning approaches require many thousands of samples, generalize poorly to unseen examples, or fail completely to predict at long timescales. Here, we propose a novel neural network module which incorporates hierarchy and recurrent feedback terms, constituting a simplified model of neocortical microcircuits. This microcircuit predicts spatiotemporal trajectories at the input layer using a temporal error minimization algorithm. We show that this module is able to predict further into the future, and with higher accuracy, than traditional models. Investigating this model, we find that successive predictive models learn representations which are increasingly removed from the raw sensory space, namely successive temporal derivatives of the positional information. Next, we introduce a spiking neural network model which implements the rate model using a recently proposed biological learning rule based on dual-compartment neurons. We show that this network performs well on the same tasks as the mean-field models, developing intrinsic dynamics that follow the dynamics of the external stimulus while coordinating transmission of higher-order dynamics. Taken as a whole, these findings suggest that hierarchical temporal abstraction of sequences, rather than feed-forward reconstruction, may be responsible for the ability of neural systems to quickly adapt to novel situations.
Affiliation(s)
- G William Chapman: Center for Systems Neuroscience, Boston University, Boston, MA, USA
6. Zheng Y, Liu XL, Nishiyama S, Ranganath C, O'Reilly RC. Correcting the Hebbian mistake: Toward a fully error-driven hippocampus. PLoS Comput Biol 2022; 18:e1010589. PMID: 36219613. PMCID: PMC9586412. DOI: 10.1371/journal.pcbi.1010589.
Abstract
The hippocampus plays a critical role in the rapid learning of new episodic memories. Many computational models propose that the hippocampus is an autoassociator that relies on Hebbian learning (i.e., "cells that fire together, wire together"). However, Hebbian learning is computationally suboptimal, as it does not learn in a way that is driven toward, and limited by, the objective of achieving effective retrieval; as a result, it produces more interference and a lower overall capacity. Our previous computational models have utilized a powerful, biologically plausible form of error-driven learning in hippocampal CA1 and entorhinal cortex (EC) (functioning as a sparse autoencoder) by contrasting local activity states at different phases in the theta cycle. Based on specific neural data and a recent abstract computational model, we propose a new model called Theremin (Total Hippocampal ERror MINimization) that extends error-driven learning to area CA3, the mnemonic heart of the hippocampal system. In the model, CA3 responds to the EC monosynaptic input prior to the EC disynaptic input through the dentate gyrus (DG), giving rise to a temporal difference between these two activation states that drives error-driven learning in the EC→CA3 and CA3↔CA3 projections. In effect, DG serves as a teacher to CA3, correcting its patterns into more pattern-separated ones and thereby reducing interference. Results showed that Theremin, compared with our original Hebbian-based model, has significantly increased capacity and learning speed. The model makes several novel predictions that can be tested in future studies.
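The contrast between Hebbian and error-driven learning can be made concrete in a few lines. In this minimal sketch (all vectors, targets, and rates are illustrative, not the model's actual dynamics), the error-driven update changes weights only by the difference between an earlier (predicted) and a later (target-like) activation phase, so learning stops once retrieval succeeds, whereas the Hebbian update keeps growing:

```python
# Contrast Hebbian ("fire together, wire together") learning with
# error-driven learning, where the weight change is the *difference*
# between a later (target-like) phase and an earlier (prediction)
# phase of postsynaptic activity.
def hebbian(w, pre, post, lr=0.1):
    return [wi + lr * post * p for wi, p in zip(w, pre)]

def error_driven(w, pre, post_early, post_late, lr=0.1):
    # Update only by the phase mismatch: once the early (predicted)
    # state matches the late (target) state, learning stops, which
    # limits interference with other stored patterns.
    return [wi + lr * (post_late - post_early) * p for wi, p in zip(w, pre)]

w_hebb = [0.0, 0.0]
w_err = [0.0, 0.0]
pre = [1.0, 1.0]
for _ in range(50):
    post_early = sum(wi * p for wi, p in zip(w_err, pre))  # prediction
    w_err = error_driven(w_err, pre, post_early, 1.0)      # target = 1.0
    w_hebb = hebbian(w_hebb, pre, 1.0)

# Error-driven weights converge (their summed drive reaches the
# target of 1.0 and stays there); Hebbian weights grow without bound.
```

This self-limiting property is the intuition behind the capacity and interference advantages the model reports.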
Affiliation(s)
- Yicong Zheng: Department of Psychology and Center for Neuroscience, University of California, Davis, CA, USA
- Xiaonan L. Liu: Department of Psychology, The Chinese University of Hong Kong, Hong Kong, China
- Satoru Nishiyama: Graduate School of Education, Kyoto University, Kyoto, Japan; Japan Society for the Promotion of Science, Tokyo, Japan
- Charan Ranganath: Department of Psychology and Center for Neuroscience, University of California, Davis, CA, USA
- Randall C. O'Reilly: Department of Psychology, Center for Neuroscience, and Department of Computer Science, University of California, Davis, CA, USA
7. Wang MB, Halassa MM. Thalamocortical contribution to flexible learning in neural systems. Netw Neurosci 2022; 6:980-997. PMID: 36875011. PMCID: PMC9976647. DOI: 10.1162/netn_a_00235.
Abstract
Animal brains evolved to optimize behavior in dynamic environments, flexibly selecting actions that maximize future rewards in different contexts. A large body of experimental work indicates that such optimization changes the wiring of neural circuits, appropriately mapping environmental input onto behavioral outputs. A major unsolved scientific question is how optimal wiring adjustments, which must target the connections responsible for rewards, can be accomplished when the relation between sensory inputs, action taken, and environmental context with rewards is ambiguous. The credit assignment problem can be categorized into context-independent structural credit assignment and context-dependent continual learning. In this perspective, we survey prior approaches to these two problems and advance the notion that the brain's specialized neural architectures provide efficient solutions. Within this framework, the thalamus with its cortical and basal ganglia interactions serves as a systems-level solution to credit assignment. Specifically, we propose that thalamocortical interaction is the locus of meta-learning where the thalamus provides cortical control functions that parametrize the cortical activity association space. By selecting among these control functions, the basal ganglia hierarchically guide thalamocortical plasticity across two timescales to enable meta-learning. The faster timescale establishes contextual associations to enable behavioral flexibility, while the slower one enables generalization to new contexts.
Affiliation(s)
- Mien Brabeeba Wang: Department of Brain and Cognitive Sciences and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Michael M. Halassa: Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
8. Lin CHS, Garrido MI. Towards a cross-level understanding of Bayesian inference in the brain. Neurosci Biobehav Rev 2022; 137:104649. PMID: 35395333. DOI: 10.1016/j.neubiorev.2022.104649.
Abstract
Perception emerges from unconscious probabilistic inference, which guides behaviour in our ubiquitously uncertain environment. Bayesian decision theory is a prominent computational model that describes how people make rational decisions using noisy and ambiguous sensory observations. However, critical questions have been raised about the validity of the Bayesian framework in explaining the mental process of inference. First, some natural behaviours deviate from the Bayesian optimum. Second, the neural mechanisms that support Bayesian computations in the brain are yet to be understood. Taking Marr's cross-level approach, we review the recent progress made in addressing these challenges. We first review studies that combined behavioural paradigms and modelling approaches to explain both optimal and suboptimal behaviours. Next, we evaluate the theoretical advances and the current evidence for ecologically feasible algorithms and neural implementations in the brain, which may enable probabilistic inference. We argue that this cross-level approach is necessary for the worthwhile pursuit of uncovering mechanistic accounts of human behaviour.
Affiliation(s)
- Chin-Hsuan Sophie Lin: Melbourne School of Psychological Sciences, The University of Melbourne, Australia; Australian Research Council for Integrative Brain Function, Australia
- Marta I Garrido: Melbourne School of Psychological Sciences, The University of Melbourne, Australia; Australian Research Council for Integrative Brain Function, Australia
9. O'Reilly RC, Ranganath C, Russin JL. The structure of systematicity in the brain. Curr Dir Psychol Sci 2022; 31:124-130. PMID: 35785023. PMCID: PMC9246245. DOI: 10.1177/09637214211049233.
Abstract
A hallmark of human intelligence is the ability to adapt to new situations, by applying learned rules to new content (systematicity) and thereby enabling an open-ended number of inferences and actions (generativity). Here, we propose that the human brain accomplishes these feats through pathways in the parietal cortex that encode the abstract structure of space, events, and tasks, and pathways in the temporal cortex that encode information about specific people, places, and things (content). Recent neural network models show how the separation of structure and content might emerge through a combination of architectural biases and learning, and these networks show dramatic improvements in the ability to capture systematic, generative behavior. We close by considering how the hippocampal formation may form integrative memories that enable rapid learning of new structure and content representations.
Affiliation(s)
- Charan Ranganath: Department of Psychology and Center for Neuroscience, University of California, Davis
- Jacob L. Russin: Department of Psychology and Center for Neuroscience, University of California, Davis
10. Cortes N, Abbas Farishta R, Ladret HJ, Casanova C. Corticothalamic projections gate alpha rhythms in the pulvinar. Front Cell Neurosci 2021; 15:787170. PMID: 34938163. PMCID: PMC8685293. DOI: 10.3389/fncel.2021.787170.
Abstract
Two types of corticothalamic (CT) terminals reach the pulvinar nucleus of the thalamus, and their distribution varies with the hierarchical level of the cortical area they originate from. While type 2 terminals are more abundant at lower hierarchical levels, terminals from higher cortical areas mostly exhibit type 1 axons. The two terminal types also evoke different excitatory postsynaptic potential profiles, with facilitation for type 1 and depression for type 2. Because the pulvinar is involved in oscillatory regulation between cortical areas, this raises fundamental questions about the role of these terminal types in neuronal communication throughout the cortical hierarchy. Our theoretical results support the idea that the co-action of the two terminal types produces different oscillatory rhythms in pulvinar neurons. More precisely, terminal types 1 and 2 produce alpha-band oscillations within a specific range of connectivity weights. This oscillatory activity is generated by an unstable transition of the balanced network's properties, found between the quiescent state and the stable asynchronous spiking state. Although CT projections from areas 17 and 21a are arranged in the model according to the empirical proportions of terminal types 1 and 2, the actions of these two cortical connections are antagonistic: when area 17 generates low-band oscillatory activity, cortical area 21a shifts pulvinar responses to stable asynchronous spiking, and vice versa when area 17 produces an asynchronous state. To further investigate such oscillatory effects through corticothalamo-cortical (transthalamic) projections, we created a feedforward network of two cortical areas, 17 and 21a, with CT connections to a pulvinar-like network with two cortico-recipient compartments. With this model, the transthalamic pathway propagates alpha waves from the pulvinar to area 21a. This oscillatory transfer ceases when reciprocal connections from area 21a reach the pulvinar, closing the CT loop. Taken together, the results of our model suggest that the pulvinar shows bi-stable spiking activity, oscillatory or regular asynchronous spiking, whose responses are gated by the differential activation of cortico-pulvinar projections from lower- and higher-order areas such as areas 17 and 21a.
Affiliation(s)
- Nelson Cortes: Laboratoire des Neurosciences de la Vision, École d'optométrie, Université de Montréal, Montreal, QC, Canada
- Reza Abbas Farishta: Laboratoire des Neurosciences de la Vision, École d'optométrie, Université de Montréal, Montreal, QC, Canada
- Hugo J. Ladret: Laboratoire des Neurosciences de la Vision, École d'optométrie, Université de Montréal, Montreal, QC, Canada; Institut de Neurosciences de la Timone, UMR 7289, CNRS and Aix-Marseille Université, Marseille, France
- Christian Casanova: Laboratoire des Neurosciences de la Vision, École d'optométrie, Université de Montréal, Montreal, QC, Canada
11. Foucault C, Meyniel F. Gated recurrence enables simple and accurate sequence prediction in stochastic, changing, and structured environments. eLife 2021; 10:e71801. PMID: 34854377. PMCID: PMC8735865. DOI: 10.7554/eLife.71801.
Abstract
From decision making to perception to language, predicting what is coming next is crucial. It is also challenging in stochastic, changing, and structured environments; yet the brain makes accurate predictions in many situations. What computational architecture could enable this feat? Bayesian inference makes optimal predictions but is prohibitively difficult to compute. Here, we show that a specific recurrent neural network architecture enables simple and accurate solutions in several environments. This architecture relies on three mechanisms: gating, lateral connections, and recurrent weight training. Like the optimal solution and the human brain, such networks develop internal representations of their changing environment (including estimates of the environment’s latent variables and the precision of these estimates), leverage multiple levels of latent structure, and adapt their effective learning rate to changes without changing their connection weights. Being ubiquitous in the brain, gated recurrence could therefore serve as a generic building block to predict in real-life environments.
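The gating mechanism underpinning this adaptive effective learning rate can be sketched with a single scalar gated unit (the gate value here is hand-set rather than learned, and the toy change-point sequence is ours): the gate interpolates between keeping the previous estimate and following the new observation.

```python
# A single gated recurrent unit over scalars: the gate g in [0, 1]
# sets how much a new observation overwrites the running estimate,
# acting as an effective learning rate.
def gated_step(h, x, g):
    return (1.0 - g) * h + g * x

def run(observations, g):
    h = 0.0
    for x in observations:
        h = gated_step(h, x, g)
    return h

# Environment with a change point: the mean jumps from 1.0 to 5.0.
after_change = [1.0] * 10 + [5.0] * 10

low_gate = run(after_change, 0.1)    # slow integration
high_gate = run(after_change, 0.9)   # fast integration
# A high gate tracks the post-change mean (5.0) far more closely than
# a low gate, mimicking an increased learning rate after a change.
```

In the trained networks the gate is itself computed from the input history, which is how the effective learning rate can rise after a change point without any change to the connection weights.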
Affiliation(s)
- Cédric Foucault: INSERM, CEA, Université Paris-Saclay, Gif sur Yvette, France