1
Akbarinia A. Exploring the categorical nature of colour perception: Insights from artificial networks. Neural Netw 2025; 181:106758. [PMID: 39368278 DOI: 10.1016/j.neunet.2024.106758] [Received: 06/03/2024] [Revised: 08/26/2024] [Accepted: 09/23/2024] [Indexed: 10/07/2024]
Abstract
The electromagnetic spectrum of light from a rainbow is a continuous signal, yet we perceive it vividly in several distinct colour categories. The origins and underlying mechanisms of this phenomenon remain partly unexplained. We investigate categorical colour perception in artificial neural networks (ANNs) using the odd-one-out paradigm. In the first experiment, we compared unimodal vision networks (e.g., ImageNet object recognition) to multimodal vision-language models (e.g., CLIP text-image matching). Our results show that vision networks predict a significant portion of human data (approximately 80%), while vision-language models account for the remaining unexplained data, even in non-linguistic experiments. These findings suggest that categorical colour perception is a language-independent representation, though it is partly shaped by linguistic colour terms during its development. In the second experiment, we explored how the visual task influences the colour categories of an ANN by examining twenty-four Taskonomy networks. Our results indicate that human-like colour categories are task-dependent, predominantly emerging in semantic and 3D tasks, with a notable absence in low-level tasks. To explain this difference, we analysed kernel responses before the winner-takes-all stage, observing that networks with mismatching colour categories may still align in underlying continuous representations. Our findings quantify the dual influence of visual signals and linguistic factors in categorical colour perception and demonstrate the task-dependent nature of this phenomenon, suggesting that categorical colour perception emerges to facilitate certain visual tasks.
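The odd-one-out paradigm mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify how network features are read out, so the cosine-similarity rule, the function name `odd_one_out`, and the toy three-dimensional "colour features" below are all assumptions made for illustration, not the author's method.

```python
import numpy as np

def odd_one_out(features):
    """Return the index of the item least similar, on average, to the others.

    `features` is an (n_items, n_dims) array of hypothetical feature vectors;
    similarity is cosine similarity (an assumed choice).
    """
    x = np.asarray(features, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-length rows
    sim = x @ x.T                                     # pairwise cosine similarity
    np.fill_diagonal(sim, 0.0)                        # ignore self-similarity
    mean_sim = sim.sum(axis=1) / (len(x) - 1)         # average similarity to the rest
    return int(np.argmin(mean_sim))

# Two reddish items and one blue item (synthetic values).
trial = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
choice = odd_one_out(trial)  # index of the blue item
```

In a categorical-perception setting, the two items chosen as "same" would be read as sharing a colour category.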
Affiliation(s)
- Arash Akbarinia
- Department of Experimental Psychology, University of Giessen, Germany.
2
Han Z, Sereno AB. Exploring neural architectures for simultaneously recognizing multiple visual attributes. Sci Rep 2024; 14:30036. [PMID: 39627268 PMCID: PMC11615371 DOI: 10.1038/s41598-024-80679-6] [Received: 04/30/2024] [Accepted: 11/21/2024] [Indexed: 12/06/2024] Open
Abstract
Much experimental evidence in neuroscience has suggested a division of higher visual processing into a ventral pathway specialized for object recognition and a dorsal pathway specialized for spatial recognition. Previous computational studies have suggested that neural networks with two segregated pathways (branches) have better performance in visual recognition tasks than neural networks with a single pathway (branch). One previously proposed possibility is that two pathways increase the learning efficiency of a network by allowing separate networks to process information about different visual attributes separately. However, most of these previous studies were limited, considering recognition of only two visual attributes, identity and location, simultaneously with a restricted number of classes in each attribute. We investigate whether it is always advantageous to use two-pathway networks when recognizing other visual attributes as well as examine whether the advantage of using two-pathway networks would be different when there are a different number of classes in each attribute. We find that it is always advantageous to use segregated pathways to process different visual attributes separately, with this advantage increasing with a greater number of classes. Thus, using a computational approach, we demonstrate that it is computationally advantageous to have separate pathways if the amount of variations of a given visual attribute is high or that attribute needs to be finely discriminated. Hence, when the size of the computer vision model is limited, designing a segregated pathway (branch) for a given visual attribute should only be used when it is computationally advantageous to do so.
Affiliation(s)
- Zhixian Han
- Department of Psychological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
- Anne B Sereno
- Department of Psychological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, 47907, USA.
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA.
3
Conwell C, Prince JS, Kay KN, Alvarez GA, Konkle T. A large-scale examination of inductive biases shaping high-level visual representation in brains and machines. Nat Commun 2024; 15:9383. [PMID: 39477923 PMCID: PMC11526138 DOI: 10.1038/s41467-024-53147-y] [Received: 09/07/2023] [Accepted: 10/01/2024] [Indexed: 11/02/2024] Open
Abstract
The rapid release of high-performing computer vision models offers new potential to study the impact of different inductive biases on the emergent brain alignment of learned representations. Here, we perform controlled comparisons among a curated set of 224 diverse models to test the impact of specific model properties on visual brain predictivity, a process requiring over 1.8 billion regressions and 50.3 thousand representational similarity analyses. We find that models with qualitatively different architectures (e.g. CNNs versus Transformers) and task objectives (e.g. purely visual contrastive learning versus vision-language alignment) achieve near equivalent brain predictivity, when other factors are held constant. Instead, variation across visual training diets yields the largest, most consistent effect on brain predictivity. Many models achieve similarly high brain predictivity, despite clear variation in their underlying representations, suggesting that standard methods used to link models to brains may be too flexible. Broadly, these findings challenge common assumptions about the factors underlying emergent brain alignment, and outline how we can leverage controlled model comparison to probe the common computational principles underlying biological and artificial visual systems.
Affiliation(s)
- Colin Conwell
- Department of Psychology, Harvard University, Cambridge, MA, USA.
- Jacob S Prince
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Kendrick N Kay
- Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, MN, USA
- George A Alvarez
- Department of Psychology, Harvard University, Cambridge, MA, USA
- Talia Konkle
- Department of Psychology, Harvard University, Cambridge, MA, USA.
- Center for Brain Science, Harvard University, Cambridge, MA, USA.
- Kempner Institute for Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA.
4
Lin B, Kriegeskorte N. The topology and geometry of neural representations. Proc Natl Acad Sci U S A 2024; 121:e2317881121. [PMID: 39374397 PMCID: PMC11494346 DOI: 10.1073/pnas.2317881121] [Received: 10/13/2023] [Accepted: 07/24/2024] [Indexed: 10/09/2024] Open
Abstract
A central question for neuroscience is how to characterize brain representations of perceptual and cognitive content. An ideal characterization should distinguish different functional regions with robustness to noise and idiosyncrasies of individual brains that do not correspond to computational differences. Previous studies have characterized brain representations by their representational geometry, which is defined by the representational dissimilarity matrix (RDM), a summary statistic that abstracts from the roles of individual neurons (or response channels) and characterizes the discriminability of stimuli. Here, we explore a further step of abstraction: from the geometry to the topology of brain representations. We propose topological representational similarity analysis, an extension of representational similarity analysis that uses a family of geotopological summary statistics that generalizes the RDM to characterize the topology while de-emphasizing the geometry. We evaluate this family of statistics in terms of the sensitivity and specificity for model selection using both simulations and functional MRI (fMRI) data. In the simulations, the ground truth is a data-generating layer representation in a neural network model and the models are the same and other layers in different model instances (trained from different random seeds). In fMRI, the ground truth is a visual area and the models are the same and other areas measured in different subjects. Results show that topology-sensitive characterizations of population codes are robust to noise and interindividual variability and maintain excellent sensitivity to the unique representational signatures of different neural network layers and brain regions.
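As a point of reference for the RDM described in this abstract, the following is a minimal sketch of a standard representational similarity analysis step, not the topological extension the authors propose. It assumes correlation distance (1 minus Pearson r) as the dissimilarity measure and uses synthetic response patterns; the function names are hypothetical.

```python
import numpy as np

def rdm(responses):
    """Correlation-distance RDM for a (n_stimuli, n_channels) response array."""
    return 1.0 - np.corrcoef(responses)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, i.e. each stimulus pair once."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def rdm_similarity(rdm_a, rdm_b):
    """Pearson correlation between the pairwise dissimilarities of two RDMs."""
    return float(np.corrcoef(upper_triangle(rdm_a), upper_triangle(rdm_b))[0, 1])

rng = np.random.default_rng(0)
patterns = rng.normal(size=(6, 50))                        # 6 stimuli x 50 channels (synthetic)
noisy = patterns + 0.1 * rng.normal(size=patterns.shape)   # same code plus measurement noise

rdm_clean = rdm(patterns)
rdm_noisy = rdm(noisy)
agreement = rdm_similarity(rdm_clean, rdm_noisy)
```

Because the RDM depends only on pairwise dissimilarities between stimuli, two systems with different channels can be compared without matching individual neurons or voxels.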
Affiliation(s)
- Baihan Lin
- Department of Artificial Intelligence and Human Health, Hasso Plattner Institute for Digital Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Department of Psychiatry, Center for Computational Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Department of Neuroscience, Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027
- Nikolaus Kriegeskorte
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027
- Department of Psychology, Columbia University, New York, NY 10027
- Department of Neuroscience, Columbia University, New York, NY 10027
5
Faghel-Soubeyrand S, Richoz AR, Waeber D, Woodhams J, Caldara R, Gosselin F, Charest I. Neural computations in prosopagnosia. Cereb Cortex 2024; 34:bhae211. [PMID: 38795358 PMCID: PMC11127037 DOI: 10.1093/cercor/bhae211] [Received: 12/27/2022] [Revised: 04/30/2024] [Accepted: 05/03/2024] [Indexed: 05/27/2024] Open
Abstract
We report an investigation of the neural processes involved in the processing of faces and objects of brain-lesioned patient PS, a well-documented case of pure acquired prosopagnosia. We gathered a substantial dataset of high-density electrophysiological recordings from both PS and neurotypicals. Using representational similarity analysis, we produced time-resolved brain representations in a format that facilitates direct comparisons across time points, different individuals, and computational models. To understand how the lesions in PS's ventral stream affect the temporal evolution of her brain representations, we computed the temporal generalization of her brain representations. We uncovered that PS's early brain representations exhibit an unusual similarity to later representations, implying an excessive generalization of early visual patterns. To reveal the underlying computational deficits, we correlated PS's brain representations with those of deep neural networks (DNN). We found that the computations underlying PS's brain activity bore a closer resemblance to early layers of a visual DNN than those of controls. However, the brain representations in neurotypicals became more akin to those of the later layers of the model compared to PS. We confirmed PS's deficits in high-level brain representations by demonstrating that her brain representations exhibited less similarity with those of a DNN of semantics.
Affiliation(s)
- Simon Faghel-Soubeyrand
- Département de psychologie, Université de Montréal, 90 av. Vincent D’indy, Montreal, H2V 2S9, Canada
- Department of Experimental Psychology, University of Oxford, Anna Watts Building, Woodstock Rd, Oxford OX2 6GG
- Anne-Raphaelle Richoz
- Département de psychologie, Université de Fribourg, RM 01 bu. C-3.117, Rue P.A. de Faucigny 2, 1700 Fribourg, Switzerland
- Delphine Waeber
- Département de psychologie, Université de Fribourg, RM 01 bu. C-3.117, Rue P.A. de Faucigny 2, 1700 Fribourg, Switzerland
- Jessica Woodhams
- School of Psychology, University of Birmingham, Hills Building, Edgbaston Park Rd, Birmingham B15 2TT, UK
- Roberto Caldara
- Département de psychologie, Université de Fribourg, RM 01 bu. C-3.117, Rue P.A. de Faucigny 2, 1700 Fribourg, Switzerland
- Frédéric Gosselin
- Département de psychologie, Université de Montréal, 90 av. Vincent D’indy, Montreal, H2V 2S9, Canada
- Ian Charest
- Département de psychologie, Université de Montréal, 90 av. Vincent D’indy, Montreal, H2V 2S9, Canada
6
Cadena SA, Willeke KF, Restivo K, Denfield G, Sinz FH, Bethge M, Tolias AS, Ecker AS. Diverse task-driven modeling of macaque V4 reveals functional specialization towards semantic tasks. PLoS Comput Biol 2024; 20:e1012056. [PMID: 38781156 PMCID: PMC11115319 DOI: 10.1371/journal.pcbi.1012056] [Received: 07/02/2022] [Accepted: 04/08/2024] [Indexed: 05/25/2024] Open
Abstract
Responses to natural stimuli in area V4, a mid-level area of the visual ventral stream, are well predicted by features from convolutional neural networks (CNNs) trained on image classification. This result has been taken as evidence for the functional role of V4 in object classification. However, we currently do not know if and to what extent V4 plays a role in solving other computational objectives. Here, we investigated normative accounts of V4 (and V1 for comparison) by predicting macaque single-neuron responses to natural images from the representations extracted by 23 CNNs trained on different computer vision tasks including semantic, geometric, 2D, and 3D types of tasks. We found that V4 was best predicted by semantic classification features and exhibited high task selectivity, while the choice of task was less consequential to V1 performance. Consistent with traditional characterizations of V4 function that show its high-dimensional tuning to various 2D and 3D stimulus directions, we found that diverse non-semantic tasks explained aspects of V4 function that are not captured by individual semantic tasks. Nevertheless, jointly considering the features of a pair of semantic classification tasks was sufficient to yield one of our top V4 models, solidifying V4's main functional role in semantic processing and suggesting that V4's selectivity to 2D or 3D stimulus properties found by electrophysiologists can result from semantic functional goals.
Affiliation(s)
- Santiago A. Cadena
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Konstantin F. Willeke
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University Tübingen, Tübingen, Germany
- Kelli Restivo
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- George Denfield
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Fabian H. Sinz
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- International Max Planck Research School for Intelligent Systems, Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics, University Tübingen, Tübingen, Germany
- Matthias Bethge
- Institute for Theoretical Physics and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience, Tübingen, Germany
- Andreas S. Tolias
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America
- Alexander S. Ecker
- Institute of Computer Science and Campus Institute Data Science, University of Göttingen, Göttingen, Germany
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
7
Tomasello R, Carriere M, Pulvermüller F. The impact of early and late blindness on language and verbal working memory: A brain-constrained neural model. Neuropsychologia 2024; 196:108816. [PMID: 38331022 DOI: 10.1016/j.neuropsychologia.2024.108816] [Received: 10/02/2023] [Revised: 01/26/2024] [Accepted: 02/04/2024] [Indexed: 02/10/2024]
Abstract
Neural circuits related to language exhibit a remarkable ability to reorganize and adapt in response to visual deprivation. Particularly, early and late blindness induce distinct neuroplastic changes in the visual cortex, repurposing it for language and semantic processing. Interestingly, these functional changes provoke a unique cognitive advantage - enhanced verbal working memory, particularly in early blindness. Yet, the underlying neuromechanisms and the impact on language and memory-related circuits remain not fully understood. Here, we applied a brain-constrained neural network mimicking the structural and functional features of the frontotemporal-occipital cortices, to model conceptual acquisition in early and late blindness. The results revealed differential expansion of conceptual-related neural circuits into deprived visual areas depending on the timing of visual loss, which is most prominent in early blindness. This neural recruitment is fundamentally governed by the biological principles of neural circuit expansion and the absence of uncorrelated sensory input. Critically, the degree of these changes is constrained by the availability of neural matter previously allocated to visual experiences, as in the case of late blindness. Moreover, we shed light on the implication of visual deprivation on the neural underpinnings of verbal working memory, revealing longer reverberatory neural activity in 'blind models' as compared to the sighted ones. These findings provide a better understanding of the interplay between visual deprivations, neuroplasticity, language processing and verbal working memory.
Affiliation(s)
- Rosario Tomasello
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4 Freie Universität Berlin, 14195, Berlin, Germany; Cluster of Excellence 'Matters of Activity. Image Space Material', Humboldt Universität zu Berlin, 10099, Berlin, Germany.
- Maxime Carriere
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4 Freie Universität Berlin, 14195, Berlin, Germany
- Friedemann Pulvermüller
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4 Freie Universität Berlin, 14195, Berlin, Germany; Cluster of Excellence 'Matters of Activity. Image Space Material', Humboldt Universität zu Berlin, 10099, Berlin, Germany; Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10117, Berlin, Germany; Einstein Center for Neurosciences, 10117, Berlin, Germany
8
Noda T, Aschauer DF, Chambers AR, Seiler JPH, Rumpel S. Representational maps in the brain: concepts, approaches, and applications. Front Cell Neurosci 2024; 18:1366200. [PMID: 38584779 PMCID: PMC10995314 DOI: 10.3389/fncel.2024.1366200] [Received: 01/05/2024] [Accepted: 03/08/2024] [Indexed: 04/09/2024] Open
Abstract
Neural systems have evolved to process sensory stimuli in a way that allows for efficient and adaptive behavior in a complex environment. Recent technological advances enable us to investigate sensory processing in animal models by simultaneously recording the activity of large populations of neurons with single-cell resolution, yielding high-dimensional datasets. In this review, we discuss concepts and approaches for assessing the population-level representation of sensory stimuli in the form of a representational map. In such a map, not only are the identities of stimuli distinctly represented, but their relational similarity is also mapped onto the space of neuronal activity. We highlight example studies in which the structure of representational maps in the brain is estimated from recordings in humans as well as animals and compare their methodological approaches. Finally, we integrate these aspects and provide an outlook for how the concept of representational maps could be applied to various fields in basic and clinical neuroscience.
Affiliation(s)
- Takahiro Noda
- Institute of Physiology, Focus Program Translational Neurosciences, University Medical Center, Johannes Gutenberg University-Mainz, Mainz, Germany
- Dominik F. Aschauer
- Institute of Physiology, Focus Program Translational Neurosciences, University Medical Center, Johannes Gutenberg University-Mainz, Mainz, Germany
- Anna R. Chambers
- Department of Otolaryngology – Head and Neck Surgery, Harvard Medical School, Boston, MA, United States
- Eaton Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, MA, United States
- Johannes P.-H. Seiler
- Institute of Physiology, Focus Program Translational Neurosciences, University Medical Center, Johannes Gutenberg University-Mainz, Mainz, Germany
- Simon Rumpel
- Institute of Physiology, Focus Program Translational Neurosciences, University Medical Center, Johannes Gutenberg University-Mainz, Mainz, Germany
9
Dwivedi K, Sadiya S, Balode MP, Roig G, Cichy RM. Visual features are processed before navigational affordances in the human brain. Sci Rep 2024; 14:5573. [PMID: 38448446 PMCID: PMC10917749 DOI: 10.1038/s41598-024-55652-y] [Received: 06/17/2023] [Accepted: 02/26/2024] [Indexed: 03/08/2024] Open
Abstract
To navigate through their immediate environment, humans process scene information rapidly. How does the cascade of neural processing elicited by scene viewing to facilitate navigational planning unfold over time? To investigate, we recorded human brain responses to visual scenes with electroencephalography and related those to computational models that operationalize three aspects of scene processing (2D, 3D, and semantic information), as well as to a behavioral model capturing navigational affordances. We found a temporal processing hierarchy: navigational affordance is processed later than the other scene features (2D, 3D, and semantic) investigated. This reveals the temporal order with which the human brain computes complex scene information and suggests that the brain leverages these pieces of information to plan navigation.
Affiliation(s)
- Kshitij Dwivedi
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany
- Sari Sadiya
- Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany.
- Frankfurt Institute for Advanced Studies (FIAS), Frankfurt, Germany.
- Marta P Balode
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Institute of Neuroinformatics, ETH Zurich and University of Zurich, Zurich, Switzerland
- Gemma Roig
- Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany
- The Hessian Center for Artificial Intelligence (hessian.AI), Darmstadt, Germany
- Radoslaw M Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
10
Loke J, Seijdel N, Snoek L, Sörensen LKA, van de Klundert R, van der Meer M, Quispel E, Cappaert N, Scholte HS. Human Visual Cortex and Deep Convolutional Neural Network Care Deeply about Object Background. J Cogn Neurosci 2024; 36:551-566. [PMID: 38165735 DOI: 10.1162/jocn_a_02098] [Indexed: 01/04/2024]
Abstract
Deep convolutional neural networks (DCNNs) are able to partially predict brain activity during object categorization tasks, but factors contributing to this predictive power are not fully understood. Our study aimed to investigate the factors contributing to the predictive power of DCNNs in object categorization tasks. We compared the activity of four DCNN architectures with EEG recordings obtained from 62 human participants during an object categorization task. Previous physiological studies on object categorization have highlighted the importance of figure-ground segregation, the ability to distinguish objects from their backgrounds. Therefore, we investigated whether figure-ground segregation could explain the predictive power of DCNNs. Using a stimulus set consisting of identical target objects embedded in different backgrounds, we examined the influence of object background versus object category within both EEG and DCNN activity. Crucially, the recombination of naturalistic objects and experimentally controlled backgrounds creates a challenging and naturalistic task, while retaining experimental control. Our results showed that early EEG activity (< 100 msec) and early DCNN layers represent object background rather than object category. We also found that the ability of DCNNs to predict EEG activity is primarily influenced by how both systems process object backgrounds, rather than object categories. We demonstrated the role of figure-ground segregation as a potential prerequisite for recognition of object features, by contrasting the activations of trained and untrained (i.e., random weights) DCNNs. These findings suggest that both human visual cortex and DCNNs prioritize the segregation of object backgrounds and target objects to perform object categorization. Altogether, our study provides new insights into the mechanisms underlying object categorization as we demonstrated that both human visual cortex and DCNNs care deeply about object background.
11
Faghel-Soubeyrand S, Ramon M, Bamps E, Zoia M, Woodhams J, Richoz AR, Caldara R, Gosselin F, Charest I. Decoding face recognition abilities in the human brain. PNAS NEXUS 2024; 3:pgae095. [PMID: 38516275 PMCID: PMC10957238 DOI: 10.1093/pnasnexus/pgae095] [Received: 06/19/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024]
Abstract
Why are some individuals better at recognizing faces? Uncovering the neural mechanisms supporting face recognition ability has proven elusive. To tackle this challenge, we used a multimodal data-driven approach combining neuroimaging, computational modeling, and behavioral tests. We recorded the high-density electroencephalographic brain activity of individuals with extraordinary face recognition abilities (super-recognizers) and typical recognizers in response to diverse visual stimuli. Using multivariate pattern analyses, we decoded face recognition abilities from 1 s of brain activity with up to 80% accuracy. To better understand the mechanisms subtending this decoding, we compared representations in the brains of our participants with those in artificial neural network models of vision and semantics, as well as with those involved in human judgments of shape and meaning similarity. Compared to typical recognizers, we found stronger associations between early brain representations of super-recognizers and midlevel representations of vision models as well as shape similarity judgments. Moreover, we found stronger associations between late brain representations of super-recognizers and representations of the artificial semantic model as well as meaning similarity judgments. Overall, these results indicate that important individual variations in brain processing, including neural computations extending beyond purely visual processes, support differences in face recognition abilities. They provide the first empirical evidence for an association between semantic computations and face recognition abilities. We believe that such multimodal data-driven approaches will likely play a critical role in further revealing the complex nature of idiosyncratic face recognition in the human brain.
Affiliation(s)
- Simon Faghel-Soubeyrand
- Department of Experimental Psychology, University of Oxford, Oxford OX2 6GG, UK
- Département de psychologie, Université de Montréal, Montréal, Québec H2V 2S9, Canada
- Meike Ramon
- Institute of Psychology, University of Lausanne, Lausanne CH-1015, Switzerland
- Eva Bamps
- Center for Contextual Psychiatry, Department of Neurosciences, KU Leuven, Leuven ON5, Belgium
- Matteo Zoia
- Department for Biomedical Research, University of Bern, Bern 3008, Switzerland
- Jessica Woodhams
- Département de psychologie, Université de Montréal, Montréal, Québec H2V 2S9, Canada
- School of Psychology, University of Birmingham, Hills Building, Edgbaston Park Rd, Birmingham B15 2TT, UK
- Roberto Caldara
- Département de psychologie, Université de Fribourg, Fribourg CH-1700, Switzerland
- Frédéric Gosselin
- Département de psychologie, Université de Montréal, Montréal, Québec H2V 2S9, Canada
- Ian Charest
- Département de psychologie, Université de Montréal, Montréal, Québec H2V 2S9, Canada
12
Elmoznino E, Bonner MF. High-performing neural network models of visual cortex benefit from high latent dimensionality. PLoS Comput Biol 2024; 20:e1011792. [PMID: 38198504 PMCID: PMC10805290 DOI: 10.1371/journal.pcbi.1011792] [Received: 07/18/2023] [Revised: 01/23/2024] [Accepted: 12/30/2023] [Indexed: 01/12/2024] Open
Abstract
Geometric descriptions of deep neural networks (DNNs) have the potential to uncover core representational principles of computational models in neuroscience. Here we examined the geometry of DNN models of visual cortex by quantifying the latent dimensionality of their natural image representations. A popular view holds that optimal DNNs compress their representations onto low-dimensional subspaces to achieve invariance and robustness, which suggests that better models of visual cortex should have lower dimensional geometries. Surprisingly, we found a strong trend in the opposite direction: neural networks with high-dimensional image subspaces tended to have better generalization performance when predicting cortical responses to held-out stimuli in both monkey electrophysiology and human fMRI data. Moreover, we found that high dimensionality was associated with better performance when learning new categories of stimuli, suggesting that higher dimensional representations are better suited to generalize beyond their training domains. These findings suggest a general principle whereby high-dimensional geometry confers computational benefits to DNN models of visual cortex.
Affiliation(s)
- Eric Elmoznino
- Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Michael F. Bonner
- Department of Cognitive Science, Johns Hopkins University, Baltimore, Maryland, United States of America
| |
|
13
|
Rothkopf C, Bremmer F, Fiehler K, Dobs K, Triesch J. Models of vision need some action. Behav Brain Sci 2023; 46:e405. [PMID: 38054279 DOI: 10.1017/s0140525x23001577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Bowers et al. focus their criticisms on research that compares behavioral and brain data from the ventral stream with a class of deep neural networks for object recognition. While they are right to identify issues with current benchmarking research programs, they overlook a much more fundamental limitation of this literature: Disregarding the importance of action and interaction for perception.
Affiliation(s)
- Constantin Rothkopf
- Centre for Cognitive Science, Technical University of Darmstadt, Darmstadt, Germany
- Frankfurt Institute for Advanced Studies, Goethe-Universität Frankfurt, Frankfurt am Main, Germany
- Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany
- HMWK-Clusterproject The Adaptive Mind, Hesse, Germany; https://www.theadaptivemind.de/
| | - Frank Bremmer
- Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany
- HMWK-Clusterproject The Adaptive Mind, Hesse, Germany; https://www.theadaptivemind.de/
- Applied Physics and Neurophysics, University of Marburg, Marburg, Germany
| | - Katja Fiehler
- Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany
- HMWK-Clusterproject The Adaptive Mind, Hesse, Germany; https://www.theadaptivemind.de/
- Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany
| | - Katharina Dobs
- Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany
- HMWK-Clusterproject The Adaptive Mind, Hesse, Germany; https://www.theadaptivemind.de/
- Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany
| | - Jochen Triesch
- Frankfurt Institute for Advanced Studies, Goethe-Universität Frankfurt, Frankfurt am Main, Germany
- Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany
- HMWK-Clusterproject The Adaptive Mind, Hesse, Germany; https://www.theadaptivemind.de/
| |
|
14
|
Golan T, Taylor J, Schütt H, Peters B, Sommers RP, Seeliger K, Doerig A, Linton P, Konkle T, van Gerven M, Kording K, Richards B, Kietzmann TC, Lindsay GW, Kriegeskorte N. Deep neural networks are not a single hypothesis but a language for expressing computational hypotheses. Behav Brain Sci 2023; 46:e392. [PMID: 38054329 DOI: 10.1017/s0140525x23001553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
An ideal vision model accounts for behavior and neurophysiology in both naturalistic conditions and designed lab experiments. Unlike psychological theories, artificial neural networks (ANNs) actually perform visual tasks and generate testable predictions for arbitrary inputs. These advantages enable ANNs to engage the entire spectrum of the evidence. Failures of particular models drive progress in a vibrant ANN research program of human vision.
Affiliation(s)
- Tal Golan
- Department of Cognitive and Brain Sciences, Ben-Gurion University of the Negev, Be'er Sheva, Israel
| | - JohnMark Taylor
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA ://linton.vision/
| | - Heiko Schütt
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA ://linton.vision/
- Center for Neural Science, New York University, New York, NY, USA
| | - Benjamin Peters
- School of Psychology & Neuroscience, University of Glasgow, Glasgow, UK
| | - Rowan P Sommers
- Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Katja Seeliger
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Adrien Doerig
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
| | - Paul Linton
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA ://linton.vision/
- Presidential Scholars in Society and Neuroscience, Center for Science and Society, Columbia University, New York, NY, USA
- Italian Academy for Advanced Studies in America, Columbia University, New York, NY, USA
| | - Talia Konkle
- Department of Psychology and Center for Brain Sciences, Harvard University, Cambridge, MA, USA ://konklab.fas.harvard.edu/
| | - Marcel van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands; artcogsys.com
| | - Konrad Kording
- Departments of Bioengineering and Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
| | - Blake Richards
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Mila, Montreal, QC, Canada
- School of Computer Science, McGill University, Montreal, QC, Canada
- Department of Neurology & Neurosurgery, McGill University, Montreal, QC, Canada
- Montreal Neurological Institute, Montreal, QC, Canada
| | - Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
| | - Grace W Lindsay
- Department of Psychology and Center for Data Science, New York University, New York, NY, USA
| | - Nikolaus Kriegeskorte
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA ://linton.vision/
- Departments of Psychology, Neuroscience, and Electrical Engineering, Columbia University, New York, NY, USA
| |
|
15
|
Pulvermüller F. Neurobiological mechanisms for language, symbols and concepts: Clues from brain-constrained deep neural networks. Prog Neurobiol 2023; 230:102511. [PMID: 37482195 PMCID: PMC10518464 DOI: 10.1016/j.pneurobio.2023.102511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/03/2022] [Revised: 05/02/2023] [Accepted: 07/18/2023] [Indexed: 07/25/2023]
Abstract
Neural networks are successfully used to imitate and model cognitive processes. However, to provide clues about the neurobiological mechanisms enabling human cognition, these models need to mimic the structure and function of real brains. Brain-constrained networks differ from classic neural networks by implementing brain similarities at different scales, ranging from the micro- and mesoscopic levels of neuronal function, local neuronal links and circuit interaction to large-scale anatomical structure and between-area connectivity. This review shows how brain-constrained neural networks can be applied to study in silico the formation of mechanisms for symbol and concept processing and to work towards neurobiological explanations of specifically human cognitive abilities. These include verbal working memory and learning of large vocabularies of symbols, semantic binding carried by specific areas of cortex, attention focusing and modulation driven by symbol type, and the acquisition of concrete and abstract concepts partly influenced by symbols. Neuronal assembly activity in the networks is analyzed to deliver putative mechanistic correlates of higher cognitive processes and to develop candidate explanations founded in established neurobiological principles.
Affiliation(s)
- Friedemann Pulvermüller
- Brain Language Laboratory, Department of Philosophy and Humanities, WE4, Freie Universität Berlin, 14195 Berlin, Germany; Berlin School of Mind and Brain, Humboldt Universität zu Berlin, 10099 Berlin, Germany; Einstein Center for Neurosciences Berlin, 10117 Berlin, Germany; Cluster of Excellence 'Matters of Activity', Humboldt Universität zu Berlin, 10099 Berlin, Germany.
| |
|
16
|
Doerig A, Sommers RP, Seeliger K, Richards B, Ismael J, Lindsay GW, Kording KP, Konkle T, van Gerven MAJ, Kriegeskorte N, Kietzmann TC. The neuroconnectionist research programme. Nat Rev Neurosci 2023:10.1038/s41583-023-00705-w. [PMID: 37253949 DOI: 10.1038/s41583-023-00705-w] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Accepted: 04/21/2023] [Indexed: 06/01/2023]
Abstract
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have been not only lauded as the current best models of information processing in the brain but also criticized for failing to account for basic cognitive functions. In this Perspective article, we propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism for brain science. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of a scientific research programme is often not directly falsifiable but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a general research programme centred around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses and deriving novel understanding. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Affiliation(s)
- Adrien Doerig
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany.
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
| | - Rowan P Sommers
- Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Katja Seeliger
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | - Blake Richards
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- School of Computer Science, McGill University, Montréal, QC, Canada
- Mila, Montréal, QC, Canada
- Montréal Neurological Institute, Montréal, QC, Canada
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
| | | | | | - Konrad P Kording
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Bioengineering, Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
| | | | | | | | - Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
| |
|
17
|
Bracci S, Op de Beeck HP. Understanding Human Object Vision: A Picture Is Worth a Thousand Representations. Annu Rev Psychol 2023; 74:113-135. [PMID: 36378917 DOI: 10.1146/annurev-psych-032720-041031] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 11/16/2022]
Abstract
Objects are the core meaningful elements in our visual environment. Classic theories of object vision focus upon object recognition and are elegant and simple. Some of their proposals still stand, yet the simplicity is gone. Recent evolutions in behavioral paradigms, neuroscientific methods, and computational modeling have allowed vision scientists to uncover the complexity of the multidimensional representational space that underlies object vision. We review these findings and propose that the key to understanding this complexity is to relate object vision to the full repertoire of behavioral goals that underlie human behavior, running far beyond object recognition. There might be no such thing as core object recognition, and if it exists, then its importance is more limited than traditionally thought.
Affiliation(s)
- Stefania Bracci
- Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy
| | - Hans P Op de Beeck
- Leuven Brain Institute, Research Unit Brain & Cognition, KU Leuven, Leuven, Belgium
| |
|
18
|
Schyns PG, Snoek L, Daube C. Degrees of algorithmic equivalence between the brain and its DNN models. Trends Cogn Sci 2022; 26:1090-1102. [PMID: 36216674 DOI: 10.1016/j.tics.2022.09.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/31/2022] [Revised: 09/01/2022] [Accepted: 09/02/2022] [Indexed: 11/11/2022]
Abstract
Deep neural networks (DNNs) have become powerful and increasingly ubiquitous tools to model human cognition, and often produce similar behaviors. For example, with their hierarchical, brain-inspired organization of computations, DNNs apparently categorize real-world images in the same way as humans do. Does this imply that their categorization algorithms are also similar? We have framed the question with three embedded degrees that progressively constrain algorithmic similarity evaluations: equivalence of (i) behavioral/brain responses, which is current practice, (ii) the stimulus features that are processed to produce these outcomes, which is more constraining, and (iii) the algorithms that process these shared features, the ultimate goal. To improve DNNs as models of cognition, we develop for each degree an increasingly constrained benchmark that specifies the epistemological conditions for the considered equivalence.
Affiliation(s)
- Philippe G Schyns
- School of Psychology and Neuroscience, University of Glasgow, Glasgow G12 8QB, UK.
| | - Lukas Snoek
- School of Psychology and Neuroscience, University of Glasgow, Glasgow G12 8QB, UK
| | - Christoph Daube
- School of Psychology and Neuroscience, University of Glasgow, Glasgow G12 8QB, UK
| |
|
19
|
Ayzenberg V, Kamps FS, Dilks DD, Lourenco SF. Skeletal representations of shape in the human visual cortex. Neuropsychologia 2022; 164:108092. [PMID: 34801519 PMCID: PMC9840386 DOI: 10.1016/j.neuropsychologia.2021.108092] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/22/2021] [Revised: 11/07/2021] [Accepted: 11/17/2021] [Indexed: 01/17/2023]
Abstract
Shape perception is crucial for object recognition. However, it remains unknown exactly how shape information is represented and used by the visual system. Here, we tested the hypothesis that the visual system represents object shape via a skeletal structure. Using functional magnetic resonance imaging (fMRI) and representational similarity analysis (RSA), we found that a model of skeletal similarity explained significant unique variance in the response profiles of V3 and LO. Moreover, the skeletal model remained predictive in these regions even when controlling for other models of visual similarity that approximate low- to high-level visual features (i.e., Gabor-jet, GIST, HMAX, and AlexNet), and across different surface forms, a manipulation that altered object contours while preserving the underlying skeleton. Together, these findings shed light on shape processing in human vision, as well as the computational properties of V3 and LO. We discuss how these regions may support two putative roles of shape skeletons: namely, perceptual organization and object recognition.
Affiliation(s)
- Vladislav Ayzenberg
- Department of Psychology, Carnegie Mellon University, USA; corresponding author (V. Ayzenberg)
| | - Frederik S. Kamps
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, USA
| | | | - Stella F. Lourenco
- Department of Psychology, Emory University, USA; corresponding author (S.F. Lourenco)
| |
|