1
Chow JK, Palmeri TJ. Manipulating and measuring variation in deep neural network (DNN) representations of objects. Cognition 2024; 252:105920. PMID: 39163818. DOI: 10.1016/j.cognition.2024.105920.
Abstract
We explore how DNNs can be used to develop a computational understanding of individual differences in high-level visual cognition given their ability to generate rich meaningful object representations informed by their architecture, experience, and training protocols. As a first step to quantifying individual differences in DNN representations, we systematically explored the robustness of a variety of representational similarity measures: Representational Similarity Analysis (RSA), Centered Kernel Alignment (CKA), and Projection-Weighted Canonical Correlation Analysis (PWCCA), with an eye to how these measures are used in cognitive science, cognitive neuroscience, and vision science. To manipulate object representations, we next created a large set of models varying in random initial weights and random training image order, training image frequencies, training category frequencies, and model size and architecture, and measured the representational variation caused by each manipulation. We examined both small (All-CNN-C) and commonly used large (VGG and ResNet) DNN architectures. To provide a comparison for the magnitude of representational differences, we established a baseline based on the representational variation caused by image-augmentation techniques used to train those DNNs. We found that variation due to model randomization and model size never exceeded baseline. By contrast, differences in training image frequencies and training category frequencies caused representational variation that exceeded baseline, with training category frequency manipulations exceeding baseline earlier in the networks. These findings provide insights into the magnitude of representational variations that can be expected with a range of manipulations and provide a springboard for further exploration of systematic model variations aimed at modeling individual differences in high-level visual cognition.
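To make one of the similarity measures named above concrete, here is a minimal NumPy sketch of linear CKA; the matrices and their dimensions are invented for illustration and are not the paper's models or data.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two representation
    matrices of shape (n_stimuli, n_features); 1.0 = identical geometry."""
    X = X - X.mean(axis=0)                       # center each feature
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2  # ||X^T Y||_F^2
    return cross / (np.linalg.norm(X.T @ X, "fro") *
                    np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))        # 50 stimuli x 20 units
B = A @ rng.normal(size=(20, 20))    # a random linear re-mixing of A
print(linear_cka(A, A))              # identical representations: 1.0
print(linear_cka(A, B))              # related but transformed: below 1.0
```

Linear CKA is invariant to orthogonal transforms and isotropic scaling, but not to arbitrary linear re-mixing, which is why the second score falls below 1.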
Affiliation(s)
- Jason K Chow
- Department of Psychology, Vanderbilt University, USA.
2
Abassi E, Bognár A, de Gelder B, Giese M, Isik L, Lappe A, Mukovskiy A, Solanas MP, Taubert J, Vogels R. Neural Encoding of Bodies for Primate Social Perception. J Neurosci 2024; 44:e1221242024. PMID: 39358024. PMCID: PMC11450534. DOI: 10.1523/jneurosci.1221-24.2024.
Abstract
Primates, as social beings, have evolved complex brain mechanisms to navigate intricate social environments. This review explores the neural bases of body perception in both human and nonhuman primates, emphasizing the processing of social signals conveyed by body postures, movements, and interactions. Early studies identified selective neural responses to body stimuli in macaques, particularly within and ventral to the superior temporal sulcus (STS). These regions, known as body patches, represent visual features that are present in bodies but do not appear to be semantic body detectors. They provide information about posture and viewpoint of the body. Recent research using dynamic stimuli has expanded the understanding of the body-selective network, highlighting its complexity and the interplay between static and dynamic processing. In humans, body-selective areas such as the extrastriate body area (EBA) and fusiform body area (FBA) have been implicated in the perception of bodies and their interactions. Moreover, studies on social interactions reveal that regions in the human STS are also tuned to the perception of dyadic interactions, suggesting a specialized social lateral pathway. Computational work has developed models of body recognition and social interaction, providing insights into the underlying neural mechanisms. Despite advances, significant gaps remain in understanding the neural mechanisms of body perception and social interaction. Overall, this review underscores the importance of integrating findings across species to comprehensively understand the neural foundations of body perception and the interaction between computational modeling and neural recording.
Affiliation(s)
- Etienne Abassi
- The Neuro, Montreal Neurological Institute-Hospital, McGill University, Montréal, QC H3A 2B4, Canada
- Anna Bognár
- Department of Neuroscience, KU Leuven, Leuven 3000, Belgium
- Leuven Brain Institute, KU Leuven, Leuven 3000, Belgium
- Bea de Gelder
- Cognitive Neuroscience, Maastricht University, Maastricht 6229 EV, Netherlands
- Martin Giese
- Section Computational Sensomotorics, Hertie Institute for Clinical Brain Research & Centre for Integrative Neuroscience, University Clinic Tuebingen, Tuebingen D-72076, Germany
- Leyla Isik
- Cognitive Science, Johns Hopkins University, Baltimore, Maryland 21218
- Alexander Lappe
- Section Computational Sensomotorics, Hertie Institute for Clinical Brain Research & Centre for Integrative Neuroscience, University Clinic Tuebingen, Tuebingen D-72076, Germany
- Albert Mukovskiy
- Section Computational Sensomotorics, Hertie Institute for Clinical Brain Research & Centre for Integrative Neuroscience, University Clinic Tuebingen, Tuebingen D-72076, Germany
- Marta Poyo Solanas
- Cognitive Neuroscience, Maastricht University, Maastricht 6229 EV, Netherlands
- Jessica Taubert
- The School of Psychology, University of Queensland, St Lucia, QLD 4072, Australia
- Rufin Vogels
- Department of Neuroscience, KU Leuven, Leuven 3000, Belgium
- Leuven Brain Institute, KU Leuven, Leuven 3000, Belgium
3
Pacheco-Estefan D, Fellner MC, Kunz L, Zhang H, Reinacher P, Roy C, Brandt A, Schulze-Bonhage A, Yang L, Wang S, Liu J, Xue G, Axmacher N. Maintenance and transformation of representational formats during working memory prioritization. Nat Commun 2024; 15:8234. PMID: 39300141. DOI: 10.1038/s41467-024-52541-w.
Abstract
Visual working memory depends on both material-specific brain areas in the ventral visual stream (VVS) that support the maintenance of stimulus representations and on regions in the prefrontal cortex (PFC) that control these representations. How executive control prioritizes working memory contents and whether this affects their representational formats remains an open question, however. Here, we analyzed intracranial EEG (iEEG) recordings in epilepsy patients with electrodes in VVS and PFC who performed a multi-item working memory task involving a retro-cue. We employed Representational Similarity Analysis (RSA) with various Deep Neural Network (DNN) architectures to investigate the representational format of prioritized VWM content. While recurrent DNN representations matched PFC representations in the beta band (15-29 Hz) following the retro-cue, they corresponded to VVS representations in a lower frequency range (3-14 Hz) towards the end of the maintenance period. Our findings highlight the distinct coding schemes and representational formats of prioritized content in VVS and PFC.
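The RSA logic used here, summarizing each system by a representational dissimilarity matrix (RDM) and correlating the matrices, can be sketched in a few lines; the "DNN" and "neural" matrices below are synthetic stand-ins, not the study's iEEG data.

```python
import numpy as np

def rdm(X):
    """Representational dissimilarity matrix: 1 - correlation between the
    response patterns evoked by every pair of stimuli (rows of X)."""
    return 1.0 - np.corrcoef(X)

def rsa(X, Y):
    """RSA score: correlate the upper triangles of the two RDMs."""
    iu = np.triu_indices(X.shape[0], k=1)
    return np.corrcoef(rdm(X)[iu], rdm(Y)[iu])[0, 1]

rng = np.random.default_rng(5)
dnn = rng.normal(size=(10, 64))            # 10 stimuli x 64 model units
neural = dnn @ rng.normal(size=(64, 32))   # "recordings" sharing dnn's geometry
noise = rng.normal(size=(10, 32))          # unrelated control
print(rsa(dnn, neural), rsa(dnn, noise))   # shared geometry scores higher
```

Because RSA compares similarity structure rather than raw activations, the two systems need not share units or even dimensionality, which is what makes the DNN-to-iEEG comparison possible.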
Affiliation(s)
- Daniel Pacheco-Estefan
- Department of Neuropsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, 44801, Bochum, Germany.
- Marie-Christin Fellner
- Department of Neuropsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, 44801, Bochum, Germany
- Lukas Kunz
- Department of Epileptology, University Hospital Bonn, Bonn, Germany
- Hui Zhang
- Department of Neuropsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, 44801, Bochum, Germany
- Peter Reinacher
- Department of Stereotactic and Functional Neurosurgery, Medical Center - Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Fraunhofer Institute for Laser Technology, Aachen, Germany
- Charlotte Roy
- Epilepsy Center, Medical Center - Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Armin Brandt
- Epilepsy Center, Medical Center - Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Andreas Schulze-Bonhage
- Epilepsy Center, Medical Center - Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Linglin Yang
- Department of Psychiatry, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Shuang Wang
- Department of Neurology, Epilepsy Center, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Jing Liu
- Department of Applied Social Sciences, The Hong Kong Polytechnic University, Hong Kong SAR
- Gui Xue
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, PR China
- Nikolai Axmacher
- Department of Neuropsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, 44801, Bochum, Germany
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875, PR China
4
Ravichandran N, Lansner A, Herman P. Spiking representation learning for associative memories. Front Neurosci 2024; 18:1439414. PMID: 39371606. PMCID: PMC11450452. DOI: 10.3389/fnins.2024.1439414.
Abstract
Networks of interconnected neurons communicating through spiking signals form the bedrock of neural computation. The brain's spiking neural networks have the computational capacity to achieve complex pattern recognition and cognitive functions seemingly effortlessly. However, solving real-world problems with artificial spiking neural networks (SNNs) has proved difficult for a variety of reasons. In particular, scaling SNNs to large networks and to large-scale real-world datasets has been challenging, especially compared with their non-spiking deep learning counterparts. The critical capability required of SNNs is to learn distributed representations from data and use these representations for perceptual, cognitive and memory operations. In this work, we introduce a novel SNN that performs unsupervised representation learning and associative memory operations by leveraging Hebbian synaptic and activity-dependent structural plasticity, with neuron units modelled as Poisson spike generators with sparse firing (~1 Hz mean and ~100 Hz maximum firing rate). Crucially, the architecture of our model derives from the neocortical columnar organization and combines feedforward projections for learning hidden representations with recurrent projections for forming associative memories. We evaluated the model on properties relevant to attractor-based associative memories, such as pattern completion, perceptual rivalry, distortion resistance, and prototype extraction.
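Two of the model's ingredients, Poisson spike generation and a Hebbian weight update, can be caricatured in a few lines. The firing rates mirror the abstract's ~1 Hz mean and ~100 Hz maximum, but the code is an illustrative reduction, not the authors' architecture.

```python
import random

rng = random.Random(42)
DT = 0.001  # 1 ms time bins

def poisson_train(rate_hz, n_bins):
    """Binary spike train: a spike in each bin with probability rate * DT."""
    return [1 if rng.random() < rate_hz * DT else 0 for _ in range(n_bins)]

def hebbian(pre, post, w=0.0, lr=0.05):
    """Hebbian plasticity: grow the weight whenever pre- and postsynaptic
    spikes coincide in the same bin."""
    for p, q in zip(pre, post):
        w += lr * p * q
    return w

n = 10_000                             # 10 s of activity
sparse = poisson_train(1.0, n)         # ~1 Hz background unit
driven = poisson_train(100.0, n)       # ~100 Hz maximally driven unit
w_corr = hebbian(driven, driven)       # perfectly correlated pair
w_rand = hebbian(driven, poisson_train(100.0, n))  # independent pair
print(w_corr, w_rand)                  # coincident firing wins the larger weight
```

The point of the toy: correlated pre/post activity strengthens a synapse far more than chance coincidences do, which is the mechanism the model uses to store associations.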
Affiliation(s)
- Naresh Ravichandran
- Computational Cognitive Brain Science Group, Department of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Anders Lansner
- Computational Cognitive Brain Science Group, Department of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Mathematics, Stockholm University, Stockholm, Sweden
- Pawel Herman
- Computational Cognitive Brain Science Group, Department of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Digital Futures, KTH Royal Institute of Technology, Stockholm, Sweden
- Swedish e-Science Research Centre (SeRC), Stockholm, Sweden
5
Lin R, Naselaris T, Kay K, Wehbe L. Stacked regressions and structured variance partitioning for interpretable brain maps. Neuroimage 2024; 298:120772. PMID: 39117095. DOI: 10.1016/j.neuroimage.2024.120772.
Abstract
Relating brain activity associated with a complex stimulus to different properties of that stimulus is a powerful approach for constructing functional brain maps. However, when stimuli are naturalistic, their properties are often correlated (e.g., visual and semantic features of natural images, or different layers of a convolutional neural network that are used as features of images). Correlated properties can act as confounders for each other and complicate the interpretability of brain maps, and can impact the robustness of statistical estimators. Here, we present an approach for brain mapping based on two proposed methods: stacking different encoding models and structured variance partitioning. Our stacking algorithm combines encoding models that each uses as input a feature space that describes a different stimulus attribute. The algorithm learns to predict the activity of a voxel as a linear combination of the outputs of different encoding models. We show that the resulting combined model can predict held-out brain activity better or at least as well as the individual encoding models. Further, the weights of the linear combination are readily interpretable; they show the importance of each feature space for predicting a voxel. We then build on our stacking models to introduce structured variance partitioning, a new type of variance partitioning that takes into account the known relationships between features. Our approach constrains the size of the hypothesis space and allows us to ask targeted questions about the similarity between feature spaces and brain regions even in the presence of correlations between the feature spaces. We validate our approach in simulation, showcase its brain mapping potential on fMRI data, and release a Python package. Our methods can be useful for researchers interested in aligning brain activity with different layers of a neural network, or with other types of correlated feature spaces.
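The stacking idea, learning a per-voxel linear combination of encoding-model predictions, can be illustrated with synthetic data. This sketch omits the cross-validation and any weight constraints of the actual method; the feature spaces and voxel are invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
f1 = rng.normal(size=n)                      # feature space 1 (e.g. visual)
f2 = rng.normal(size=n)                      # feature space 2 (e.g. semantic)
voxel = 0.9 * f1 + 0.1 * f2 + 0.05 * rng.normal(size=n)

def encode(x, y):
    """One encoding model per feature space: a 1-D least-squares fit."""
    return (x @ y) / (x @ x) * x

preds = np.column_stack([encode(f1, voxel), encode(f2, voxel)])
preds /= preds.std(axis=0)                   # put predictions on a common scale

# Stacking stage: learn per-voxel weights over the model predictions
w, *_ = np.linalg.lstsq(preds, voxel, rcond=None)
print(w)  # the larger weight marks the feature space driving this voxel
```

As the abstract notes, the stacked weights are directly interpretable: here they recover that the first feature space dominates this voxel's response.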
Affiliation(s)
- Ruogu Lin
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
- Thomas Naselaris
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, United States of America; Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States of America
- Kendrick Kay
- Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN 55455, United States of America
- Leila Wehbe
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America.
6
Ramezanpour H, Giverin C, Kar K. Low-cost, portable, easy-to-use kiosks to facilitate home-cage testing of nonhuman primates during vision-based behavioral tasks. J Neurophysiol 2024; 132:666-677. PMID: 39015072. DOI: 10.1152/jn.00397.2023.
Abstract
Nonhuman primates (NHPs), especially rhesus macaques, have significantly contributed to our understanding of the neural computations underlying human vision. Besides the established homologies in the visual brain areas between these species and our ability to probe detailed neural mechanisms in monkeys at multiple scales, NHPs' ability to perform human-like visual behavior makes them an extremely appealing animal model of human vision. Traditionally, such behavioral studies have been conducted in controlled laboratory settings, offering experimenters tight control over variables like luminance, eye movements, and auditory interference. However, in-lab experiments have several constraints, including limited experimental time, the need for dedicated human experimenters, additional lab space requirements, invasive surgeries for headpost implants, and extra time and training for chairing and head restraints. To overcome these limitations, we propose adopting home-cage behavioral training and testing of NHPs, enabling the administration of many vision-based behavioral tasks simultaneously across multiple monkeys with reduced human personnel requirements, no NHP head restraint, and monkeys' unrestricted access to experiments. In this article, we present a portable, low-cost, easy-to-use kiosk system developed to conduct home-cage vision-based behavioral tasks in NHPs. We provide details of its operation and build to enable more open-source development of this technology. Furthermore, we present validation results using behavioral measurements performed in the lab and in NHP home cages, demonstrating the system's reliability and potential to enhance the efficiency and flexibility of NHP behavioral research.

NEW & NOTEWORTHY Training nonhuman primates (NHPs) for vision-based behavioral tasks in a laboratory setting is a time-consuming process and comes with many limitations. To overcome these challenges, we have developed an affordable, open-source, wireless, touchscreen training system that can be placed in the NHPs' housing environment. This system enables NHPs to work at their own pace. It provides a platform to implement continuous behavioral training protocols without major experimenter intervention and eliminates the need for other standard practices like NHP chair training, collar placement, and head restraints. Hence, these kiosks ultimately contribute to animal welfare and therefore better-quality neuroscience in the long run. In addition, NHPs quickly learn complex behavioral tasks using this system, making it a promising tool for wireless electrophysiological research in naturalistic, unrestricted environments to probe the relation between brain and behavior.
Affiliation(s)
- Hamidreza Ramezanpour
- Department of Biology, York University, Toronto, Ontario, Canada
- Centre for Vision Research, York University, Toronto, Ontario, Canada
- Christopher Giverin
- Department of Biology, York University, Toronto, Ontario, Canada
- Vision: Science to Applications (VISTA), York University, Toronto, Ontario, Canada
- Kohitij Kar
- Department of Biology, York University, Toronto, Ontario, Canada
- Centre for Vision Research, York University, Toronto, Ontario, Canada
- Vision: Science to Applications (VISTA), York University, Toronto, Ontario, Canada
- Centre for Integrated and Applied Neuroscience (CIAN), York University, Toronto, Ontario, Canada
7
Rafiei F, Shekhar M, Rahnev D. The neural network RTNet exhibits the signatures of human perceptual decision-making. Nat Hum Behav 2024; 8:1752-1770. PMID: 38997452. DOI: 10.1038/s41562-024-01914-8.
Abstract
Convolutional neural networks show promise as models of biological vision. However, their decision behaviour, including the facts that they are deterministic and use equal numbers of computations for easy and difficult stimuli, differs markedly from human decision-making, thus limiting their applicability as models of human perceptual behaviour. Here we develop a new neural network, RTNet, that generates stochastic decisions and human-like response time (RT) distributions. We further performed comprehensive tests that showed RTNet reproduces all foundational features of human accuracy, RT and confidence and does so better than all current alternatives. To test RTNet's ability to predict human behaviour on novel images, we collected accuracy, RT and confidence data from 60 human participants performing a digit discrimination task. We found that the accuracy, RT and confidence produced by RTNet for individual novel images correlated with the same quantities produced by human participants. Critically, human participants who were more similar to the average human performance were also found to be closer to RTNet's predictions, suggesting that RTNet successfully captured average human behaviour. Overall, RTNet is a promising model of human RTs that exhibits the critical signatures of perceptual decision-making.
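The core behavioral signature, stochastic evidence accumulation that yields both choices and response-time distributions, can be caricatured with a two-accumulator race. This is a generic sketch of the decision principle, not RTNet's actual CNN-based architecture.

```python
import random

rng = random.Random(7)

def trial(drift, threshold=20.0, noise=1.0):
    """Race between two noisy accumulators; returns (correct?, RT in steps).
    Smaller drift = harder stimulus = more samples needed before threshold."""
    a = b = 0.0
    t = 0
    while True:
        t += 1
        a += drift + rng.gauss(0.0, noise)   # evidence for the true option
        b += rng.gauss(0.0, noise)           # evidence for the alternative
        if a >= threshold:
            return True, t
        if b >= threshold:
            return False, t

easy = [trial(0.5)[1] for _ in range(200)]
hard = [trial(0.1)[1] for _ in range(200)]
print(sum(easy) / len(easy), sum(hard) / len(hard))  # hard trials are slower
```

Unlike a deterministic feedforward CNN, this process spends more computation on difficult stimuli and produces trial-to-trial variability, the two properties the abstract identifies as missing from standard networks.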
Affiliation(s)
- Farshad Rafiei
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, USA.
- Medha Shekhar
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, USA
- Dobromir Rahnev
- School of Psychology, Georgia Institute of Technology, Atlanta, GA, USA
8
Cohanpour M, Aly M, Gottlieb J. Neural Representations of Sensory Uncertainty and Confidence Are Associated with Perceptual Curiosity. J Neurosci 2024; 44:e0974232024. PMID: 38969505. PMCID: PMC11326865. DOI: 10.1523/jneurosci.0974-23.2024.
Abstract
Humans are immensely curious and motivated to reduce uncertainty, but little is known about the neural mechanisms that generate curiosity. Curiosity is inversely associated with confidence, suggesting that it is triggered by states of low confidence (subjective uncertainty), but the neural mechanisms of this link have been little investigated. Inspired by studies of sensory uncertainty, we hypothesized that visual areas provide multivariate representations of uncertainty, which are read out by higher-order structures to generate signals of confidence and, ultimately, curiosity. We scanned participants (17 female, 15 male) using fMRI while they performed a new task in which they rated their confidence in identifying distorted images of animals and objects and their curiosity to see the clear image. We measured the activity evoked by each image in the occipitotemporal cortex (OTC) and devised a new metric of "OTC Certainty" indicating the strength of evidence this activity conveys about the animal versus object categories. We show that perceptual curiosity peaked at low confidence and that OTC Certainty correlated negatively with curiosity, establishing a link between curiosity and a multivariate representation of sensory uncertainty. Moreover, univariate (average) activity in two frontal areas, vmPFC and ACC, correlated positively with confidence and negatively with curiosity, and the vmPFC mediated the relationship between OTC Certainty and curiosity. The results reveal novel mechanisms through which uncertainty about an event generates curiosity about that event.
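A toy, template-based certainty metric in the spirit of "OTC Certainty" can be sketched as below; the exact metric in the paper may differ, and the patterns and templates here are invented for illustration.

```python
import math

def corr(u, v):
    """Pearson correlation between two response patterns."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [x - mu for x in u]
    dv = [x - mv for x in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den

def certainty(pattern, animal_tpl, object_tpl):
    """Strength of category evidence in a multivariate pattern: how much
    more it resembles one category template than the other."""
    return abs(corr(pattern, animal_tpl) - corr(pattern, object_tpl))

animal_tpl = [1, 0, 1, 0, 1, 0]           # toy category templates
object_tpl = [0, 1, 0, 1, 0, 1]
clear = [0.9, 0.1, 1.0, 0.0, 0.8, 0.2]    # clearly animal-like pattern
mixed = [1, 0, 0, 1, 1, 0]                # ambiguous (distorted) pattern
print(certainty(clear, animal_tpl, object_tpl))   # high: low uncertainty
print(certainty(mixed, animal_tpl, object_tpl))   # low: high uncertainty
```

Under the paper's hypothesis, a low value of such a readout (high sensory uncertainty) is what downstream regions would translate into low confidence and high curiosity.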
Affiliation(s)
- Michael Cohanpour
- Department of Neuroscience, Columbia University, New York, New York 10025
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York 10025
- Mariam Aly
- Department of Psychology, Columbia University, New York, New York 10025
- Jacqueline Gottlieb
- Department of Neuroscience, Columbia University, New York, New York 10025
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York 10025
- Kavli Institute for Brain Science, Columbia University, New York, New York 10025
9
Rose O, Ponce CR. A concentration of visual cortex-like neurons in prefrontal cortex. Nat Commun 2024; 15:7002. PMID: 39143147. PMCID: PMC11324908. DOI: 10.1038/s41467-024-51441-3.
Abstract
Visual recognition is largely realized through neurons in the ventral stream, though recent studies have suggested that ventrolateral prefrontal cortex (vlPFC) is also important for visual processing. While it is hypothesized that sensory and cognitive processes are integrated in vlPFC neurons, it is not clear how this mechanism benefits vision, or even whether vlPFC neurons have the properties essential for computations in visual cortex implemented via recurrence. Here, we investigated whether vlPFC neurons in two male monkeys had functions comparable to those of visual cortex, including receptive fields, image selectivity, and the capacity to synthesize highly activating stimuli using generative networks. We found that a subset of vlPFC sites shows all of these properties, suggesting that subpopulations of vlPFC neurons encode statistics about the world. Further, these vlPFC sites may be anatomically clustered, consistent with fMRI-identified functional organization. Our findings suggest that stable visual encoding in vlPFC may be a necessary condition for local and brain-wide computations.
Affiliation(s)
- Olivia Rose
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Roy and Diana Vagelos Division of Biology & Biomedical Sciences, Washington University, St. Louis, MO, USA
- Carlos R Ponce
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
10
Zhang J, Huang L, Ma Z, Zhou H. Predicting the temporal-dynamic trajectories of cortical neuronal responses in non-human primates based on deep spiking neural network. Cogn Neurodyn 2024; 18:1977-1988. PMID: 39104695. PMCID: PMC11297849. DOI: 10.1007/s11571-023-09989-1.
Abstract
Deep convolutional neural networks (CNNs) are commonly used as computational models of the primate ventral stream, whereas deep spiking neural networks (SNNs), which incorporate both temporal and spatial spiking information, remain underexplored. We compared the performance of an SNN and a CNN in predicting visual responses to naturalistic stimuli in area V4, inferior temporal cortex (IT), and orbitofrontal cortex (OFC). Prediction accuracies based on the SNN were significantly higher than those of the CNN for both the temporal-dynamic trajectory and the averaged firing rate of visual responses in V4 and IT. The temporal dynamics were captured by the SNN for neurons with diverse temporal profiles and category selectivities, and were captured most sensitively around the time of peak responses in each brain region. Consistently, SNN activities showed significantly stronger correlations with IT, V4 and OFC responses. In the SNN, correlations with neural activities were stronger for late time-step features than for early time-step features. The temporal-dynamic prediction also improved significantly when preceding neural activities were taken into account. Thus, our study demonstrates that SNNs are powerful temporal-dynamic models of cortical responses to complex naturalistic stimuli.
Affiliation(s)
- Jie Zhang
- The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- University of Chinese Academy of Sciences, Beijing, 100049 China
- Peng Cheng Laboratory, Shenzhen, 518000 China
- Liwei Huang
- Peng Cheng Laboratory, Shenzhen, 518000 China
- Peking University, Beijing, 100871 China
- Zhengyu Ma
- Peng Cheng Laboratory, Shenzhen, 518000 China
- Huihui Zhou
- The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055 China
- Peng Cheng Laboratory, Shenzhen, 518000 China
11
Motlagh SC, Joanisse M, Wang B, Mohsenzadeh Y. Unveiling the neural dynamics of conscious perception in rapid object recognition. Neuroimage 2024; 296:120668. PMID: 38848982. DOI: 10.1016/j.neuroimage.2024.120668.
Abstract
Our brain excels at recognizing objects, even when they flash by in a rapid sequence. However, the neural processes that determine whether a target image in a rapid sequence can be recognized remain elusive. We used electroencephalography (EEG) to investigate the temporal dynamics of brain processes that shape perceptual outcomes under these challenging viewing conditions. Using naturalistic images and advanced multivariate pattern analysis (MVPA) techniques, we probed the brain dynamics governing conscious object recognition. Our results show that, although initially similar, the processes for when an object can or cannot be recognized diverge around 180 ms post-appearance, coinciding with feedback neural processes. Decoding analyses indicate that gist perception (partial conscious perception) can occur at ~120 ms through feedforward mechanisms. In contrast, object identification (full conscious perception of the image) is resolved at ~190 ms after target onset, suggesting involvement of recurrent processing. These findings underscore the importance of recurrent neural connections in object recognition and awareness during rapid visual presentations.
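Time-resolved MVPA of the kind described, training a decoder on sensor patterns at each time point and watching accuracy diverge after some latency, can be sketched with synthetic EEG-like data (the data, dimensions, and onset bin below are invented for demonstration).

```python
import numpy as np

rng = np.random.default_rng(3)
n_trials, n_times, n_sensors = 40, 12, 8
onset = 6  # the class signal appears from this time bin onward

labels = np.array([0, 1] * (n_trials // 2))
data = rng.normal(size=(n_trials, n_times, n_sensors))
data[labels == 1, onset:, :] += 1.0        # classes diverge after `onset`

def decode_accuracy(t):
    """Leave-one-out nearest-centroid decoding at one time point."""
    correct = 0
    for i in range(n_trials):
        train = np.delete(np.arange(n_trials), i)
        c0 = data[train][labels[train] == 0, t].mean(axis=0)
        c1 = data[train][labels[train] == 1, t].mean(axis=0)
        pick = int(np.linalg.norm(data[i, t] - c1) <
                   np.linalg.norm(data[i, t] - c0))
        correct += pick == labels[i]
    return correct / n_trials

acc = [decode_accuracy(t) for t in range(n_times)]
print(acc)  # near chance (0.5) before `onset`, well above chance after it
```

The latency at which the decoding curve rises above chance is the quantity interpreted in such studies as the onset of the corresponding neural process.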
Affiliation(s)
- Saba Charmi Motlagh
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Marc Joanisse
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada
- Boyu Wang
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada
- Yalda Mohsenzadeh
- Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada.
12
|
Aldarondo D, Merel J, Marshall JD, Hasenclever L, Klibaite U, Gellis A, Tassa Y, Wayne G, Botvinick M, Ölveczky BP. A virtual rodent predicts the structure of neural activity across behaviours. Nature 2024; 632:594-602. [PMID: 38862024] [DOI: 10.1038/s41586-024-07633-4]
Abstract
Animals have exquisite control of their bodies, allowing them to perform a diverse range of behaviours. How such control is implemented by the brain, however, remains unclear. Advancing our understanding requires models that can relate principles of control to the structure of neural activity in behaving animals. Here, to facilitate this, we built a 'virtual rodent', in which an artificial neural network actuates a biomechanically realistic model of the rat [1] in a physics simulator [2]. We used deep reinforcement learning [3-5] to train the virtual agent to imitate the behaviour of freely moving rats, thus allowing us to compare neural activity recorded in real rats to the network activity of a virtual rodent mimicking their behaviour. We found that neural activity in the sensorimotor striatum and motor cortex was better predicted by the virtual rodent's network activity than by any features of the real rat's movements, consistent with both regions implementing inverse dynamics [6]. Furthermore, the network's latent variability predicted the structure of neural variability across behaviours and afforded robustness in a way consistent with the minimal intervention principle of optimal feedback control [7]. These results demonstrate how physical simulation of biomechanically realistic virtual animals can help interpret the structure of neural activity across behaviour and relate it to theoretical principles of motor control.
Affiliation(s)
- Diego Aldarondo
- Department of Organismic and Evolutionary Biology and Center for Brain Science, Harvard University, Cambridge, MA, USA
- Fauna Robotics, New York, NY, USA
- Josh Merel
- DeepMind, Google, London, UK
- Fauna Robotics, New York, NY, USA
- Jesse D Marshall
- Department of Organismic and Evolutionary Biology and Center for Brain Science, Harvard University, Cambridge, MA, USA
- Reality Labs, Meta, New York, NY, USA
- Ugne Klibaite
- Department of Organismic and Evolutionary Biology and Center for Brain Science, Harvard University, Cambridge, MA, USA
- Amanda Gellis
- Department of Organismic and Evolutionary Biology and Center for Brain Science, Harvard University, Cambridge, MA, USA
- Matthew Botvinick
- DeepMind, Google, London, UK
- Gatsby Computational Neuroscience Unit, University College London, London, UK
- Bence P Ölveczky
- Department of Organismic and Evolutionary Biology and Center for Brain Science, Harvard University, Cambridge, MA, USA
13
Jang G, Kragel PA. Understanding human amygdala function with artificial neural networks. bioRxiv [Preprint] 2024: 2024.07.29.605621. [PMID: 39131372] [PMCID: PMC11312467] [DOI: 10.1101/2024.07.29.605621]
Abstract
The amygdala is a cluster of subcortical nuclei that receives diverse sensory inputs and projects to the cortex, midbrain and other subcortical structures. Numerous accounts of amygdalar contributions to social and emotional behavior have been offered, yet an overarching description of amygdala function remains elusive. Here we adopt a computationally explicit framework that aims to develop a model of amygdala function based on the types of sensory inputs it receives, rather than individual constructs such as threat, arousal, or valence. Characterizing human fMRI signal acquired as participants viewed a full-length film, we developed encoding models that predict both patterns of amygdala activity and self-reported valence evoked by naturalistic images. We use deep image synthesis to generate artificial stimuli that distinctly engage encoding models of amygdala subregions that systematically differ from one another in terms of their low-level visual properties. These findings characterize how the amygdala compresses high-dimensional sensory inputs into low-dimensional representations relevant for behavior.
14
Zhang J, Cao R, Zhu X, Zhou H, Wang S. Distinct attentional profile and functional connectivity of neurons with visual feature coding in the primate brain. bioRxiv [Preprint] 2024: 2024.06.24.600401. [PMID: 38979388] [PMCID: PMC11230157] [DOI: 10.1101/2024.06.24.600401]
Abstract
Visual attention and object recognition are two critical cognitive functions that significantly influence our perception of the world. While these neural processes converge on the temporal cortex, the exact nature of their interactions remains largely unclear. Here, we systematically investigated the interplay between visual attention and object feature coding by training macaques to perform a free-gaze visual search task using natural face and object stimuli. With a large number of units recorded from multiple brain areas, we discovered that units exhibiting visual feature coding displayed a distinct attentional response profile and functional connectivity compared to units not exhibiting feature coding. Attention directed towards search targets enhanced the pattern separation of stimuli across brain areas, and this enhancement was more pronounced for units encoding visual features. Our findings suggest two stages of neural processing, with the early stage primarily focused on processing visual features and the late stage dedicated to processing attention. Importantly, feature coding in the early stage could predict the attentional effect in the late stage. Together, our results suggest an intricate interplay between visual feature and attention coding in the primate brain, which can be attributed to the differential functional connectivity and neural networks engaged in these processes.
15
Miao HY, Tong F. Convolutional neural network models applied to neuronal responses in macaque V1 reveal limited nonlinear processing. J Vis 2024; 24:1. [PMID: 38829629] [PMCID: PMC11156204] [DOI: 10.1167/jov.24.6.1]
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple nonlinearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more nonlinear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven nonlinear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although the predictive accuracy of VGG-19 was somewhat better than that of standard AlexNet, we found that a modified version of AlexNet could match the performance of VGG-19 after only a few nonlinear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for nonlinear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few nonlinear processing stages.
Affiliation(s)
- Hui-Yuan Miao
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Frank Tong
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA
16
Djambazovska S, Zafer A, Ramezanpour H, Kreiman G, Kar K. The Impact of Scene Context on Visual Object Recognition: Comparing Humans, Monkeys, and Computational Models. bioRxiv [Preprint] 2024: 2024.05.27.596127. [PMID: 38854011] [PMCID: PMC11160639] [DOI: 10.1101/2024.05.27.596127]
Abstract
During natural vision, we rarely see objects in isolation but rather embedded in rich and complex contexts. Understanding how the brain recognizes objects in natural scenes by integrating contextual information remains a key challenge. To elucidate neural mechanisms compatible with human visual processing, we need an animal model that behaves similarly to humans, so that inferred neural mechanisms can provide hypotheses relevant to the human brain. Here we assessed whether rhesus macaques could model human context-driven object recognition by quantifying visual object identification abilities across variations in the amount, quality, and congruency of contextual cues. Behavioral metrics revealed strikingly similar context-dependent patterns between humans and monkeys. However, neural responses in the inferior temporal (IT) cortex of monkeys that were never explicitly trained to discriminate objects in context, as well as current artificial neural network models, could only partially explain this cross-species correspondence. The shared behavioral variance unexplained by context-naive neural data or computational models highlights fundamental knowledge gaps. Our findings demonstrate an intriguing alignment of human and monkey visual object processing that defies full explanation by either brain activity in a key visual region or state-of-the-art models.
Affiliation(s)
- Sara Djambazovska
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
- Children’s Hospital, Harvard Medical School, MA, USA
- Anaa Zafer
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
- Hamidreza Ramezanpour
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
- Kohitij Kar
- York University, Department of Biology and Centre for Vision Research, Toronto, Canada
17
Bougou V, Vanhoyland M, Bertrand A, Van Paesschen W, Op De Beeck H, Janssen P, Theys T. Neuronal tuning and population representations of shape and category in human visual cortex. Nat Commun 2024; 15:4608. [PMID: 38816391] [PMCID: PMC11139926] [DOI: 10.1038/s41467-024-49078-3]
Abstract
Object recognition and categorization are essential cognitive processes which engage considerable neural resources in the human ventral visual stream. However, the tuning properties of human ventral stream neurons for object shape and category are virtually unknown. We performed large-scale recordings of spiking activity in human Lateral Occipital Complex in response to stimuli in which the shape dimension was dissociated from the category dimension. Consistent with studies in nonhuman primates, the neuronal representations were primarily shape-based, although we also observed category-like encoding for images of animals. Surprisingly, linear decoders could reliably classify stimulus category even in data sets that were entirely shape-based. In addition, many recording sites showed an interaction between shape and category tuning. These results represent a detailed study on shape and category coding at the neuronal level in the human ventral visual stream, furnishing essential evidence that reconciles human imaging and macaque single-cell studies.
Affiliation(s)
- Vasiliki Bougou
- Research Group of Experimental Neurosurgery and Neuroanatomy, Department of Neurosciences, KU Leuven and the Leuven Brain Institute, Leuven, Belgium
- Laboratory for Neuro- and Psychophysiology, Research Group Neurophysiology, Department of Neurosciences, KU Leuven and the Leuven Brain Institute, Leuven, Belgium
- Michaël Vanhoyland
- Research Group of Experimental Neurosurgery and Neuroanatomy, Department of Neurosciences, KU Leuven and the Leuven Brain Institute, Leuven, Belgium
- Laboratory for Neuro- and Psychophysiology, Research Group Neurophysiology, Department of Neurosciences, KU Leuven and the Leuven Brain Institute, Leuven, Belgium
- Department of Neurosurgery, University Hospitals Leuven, Leuven, Belgium
- Wim Van Paesschen
- Department of Neurology, University Hospitals Leuven, Leuven, Belgium
- Laboratory for Epilepsy Research, KU Leuven, Leuven, Belgium
- Hans Op De Beeck
- Laboratory of Biological Psychology, Department of Neurosciences, KU Leuven, Leuven, Belgium
- Peter Janssen
- Laboratory for Neuro- and Psychophysiology, Research Group Neurophysiology, Department of Neurosciences, KU Leuven and the Leuven Brain Institute, Leuven, Belgium
- Tom Theys
- Research Group of Experimental Neurosurgery and Neuroanatomy, Department of Neurosciences, KU Leuven and the Leuven Brain Institute, Leuven, Belgium
- Department of Neurosurgery, University Hospitals Leuven, Leuven, Belgium
18
Pinheiro-Chagas P, Sava-Segal C, Akkol S, Daitch A, Parvizi J. Spatiotemporal Dynamics of Successive Activations across the Human Brain during Simple Arithmetic Processing. J Neurosci 2024; 44:e2118222024. [PMID: 38485257] [PMCID: PMC11044197] [DOI: 10.1523/jneurosci.2118-22.2024]
Abstract
Previous neuroimaging studies have offered unique insights about the spatial organization of activations and deactivations across the brain; however, these were not powered to explore the exact timing of events at the subsecond scale combined with a precise anatomical source of information at the level of individual brains. As a result, we know little about the order of engagement across different brain regions during a given cognitive task. Using experimental arithmetic tasks as a prototype for human-unique symbolic processing, we recorded directly across 10,076 brain sites in 85 human subjects (52% female) using intracranial electroencephalography. Our data revealed a remarkably distributed change of activity in almost half of the sampled sites. In each activated brain region, we found juxtaposed neuronal populations preferentially responsive to either the target or control conditions, arranged in an anatomically orderly manner. Notably, an orderly successive activation of a set of brain regions, anatomically consistent across subjects, was observed in individual brains. The temporal order of activations across these sites was replicable across subjects and trials. Moreover, the degree of functional connectivity between the sites decreased as a function of temporal distance between regions, suggesting that the information is partially leaked or transformed along the processing chain. Our study complements prior imaging studies by providing hitherto unknown information about the timing of events in the brain during arithmetic processing. Such findings can be a basis for developing mechanistic computational models of human-specific cognitive symbolic systems.
Affiliation(s)
- Pedro Pinheiro-Chagas
- Stanford Human Intracranial Cognitive Electrophysiology Program, Department of Neurology and Neurological Science, Stanford University, Stanford, California 94305
- UCSF Memory and Aging Center, Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, California
- Clara Sava-Segal
- Stanford Human Intracranial Cognitive Electrophysiology Program, Department of Neurology and Neurological Science, Stanford University, Stanford, California 94305
- Serdar Akkol
- Stanford Human Intracranial Cognitive Electrophysiology Program, Department of Neurology and Neurological Science, Stanford University, Stanford, California 94305
- Amy Daitch
- Stanford Human Intracranial Cognitive Electrophysiology Program, Department of Neurology and Neurological Science, Stanford University, Stanford, California 94305
- Josef Parvizi
- Stanford Human Intracranial Cognitive Electrophysiology Program, Department of Neurology and Neurological Science, Stanford University, Stanford, California 94305
19
Lu Z, Wang Y, Golomb JD. Achieving more human brain-like vision via human EEG representational alignment. arXiv [Preprint] 2024: arXiv:2401.17231v2. [PMID: 38351926] [PMCID: PMC10862929]
Abstract
Despite advancements in artificial intelligence, object recognition models still lag behind in emulating visual information processing in human brains. Recent studies have highlighted the potential of using neural data to mimic brain processing; however, these often rely on invasive neural recordings from non-human subjects, leaving a critical gap in understanding human visual perception. Addressing this gap, we present, for the first time, 'Re(presentational)Al(ignment)net', a vision model aligned with human brain activity based on non-invasive EEG, demonstrating a significantly higher similarity to human brain representations. Our innovative image-to-brain multi-layer encoding framework advances human neural alignment by optimizing multiple model layers and enabling the model to efficiently learn and mimic the human brain's visual representational patterns across object categories and different modalities. Our findings suggest that ReAlnet represents a breakthrough in bridging the gap between artificial and human vision, and paves the way for more brain-like artificial intelligence systems.
Affiliation(s)
- Zitong Lu
- Department of Psychology, The Ohio State University
- Yile Wang
- Department of Neuroscience, The University of Texas at Dallas
20
Kay K, Biderman N, Khajeh R, Beiran M, Cueva CJ, Shohamy D, Jensen G, Wei XX, Ferrera VP, Abbott LF. Emergent neural dynamics and geometry for generalization in a transitive inference task. PLoS Comput Biol 2024; 20:e1011954. [PMID: 38662797] [PMCID: PMC11125559] [DOI: 10.1371/journal.pcbi.1011954]
Abstract
Relational cognition, the ability to infer relationships that generalize to novel combinations of objects, is fundamental to human and animal intelligence. Despite this importance, it remains unclear how relational cognition is implemented in the brain due in part to a lack of hypotheses and predictions at the levels of collective neural activity and behavior. Here we discovered, analyzed, and experimentally tested neural networks (NNs) that perform transitive inference (TI), a classic relational task (if A > B and B > C, then A > C). We found NNs that (i) generalized perfectly, despite lacking overt transitive structure prior to training, (ii) generalized when the task required working memory (WM), a capacity thought to be essential to inference in the brain, (iii) emergently expressed behaviors long observed in living subjects, in addition to a novel order-dependent behavior, and (iv) expressed different task solutions yielding alternative behavioral and neural predictions. Further, in a large-scale experiment, we found that human subjects performing WM-based TI showed behavior inconsistent with a class of NNs that characteristically expressed an intuitive task solution. These findings provide neural insights into a classical relational ability, with wider implications for how the brain realizes relational cognition.
Affiliation(s)
- Kenneth Kay
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Center for Theoretical Neuroscience, Columbia University, New York, New York, United States of America
- Grossman Center for the Statistics of Mind, Columbia University, New York, New York, United States of America
- Natalie Biderman
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Department of Psychology, Columbia University, New York, New York, United States of America
- Ramin Khajeh
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Center for Theoretical Neuroscience, Columbia University, New York, New York, United States of America
- Manuel Beiran
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Center for Theoretical Neuroscience, Columbia University, New York, New York, United States of America
- Christopher J. Cueva
- Department of Brain and Cognitive Sciences, MIT, Cambridge, Massachusetts, United States of America
- Daphna Shohamy
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Department of Psychology, Columbia University, New York, New York, United States of America
- The Kavli Institute for Brain Science, Columbia University, New York, New York, United States of America
- Greg Jensen
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Department of Neuroscience, Columbia University Medical Center, New York, New York, United States of America
- Department of Psychology at Reed College, Portland, Oregon, United States of America
- Xue-Xin Wei
- Departments of Neuroscience and Psychology, The University of Texas at Austin, Austin, Texas, United States of America
- Vincent P. Ferrera
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Department of Neuroscience, Columbia University Medical Center, New York, New York, United States of America
- Department of Psychiatry, Columbia University Medical Center, New York, New York, United States of America
- LF Abbott
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, United States of America
- Center for Theoretical Neuroscience, Columbia University, New York, New York, United States of America
- The Kavli Institute for Brain Science, Columbia University, New York, New York, United States of America
- Department of Neuroscience, Columbia University Medical Center, New York, New York, United States of America
21
Goris RLT, Coen-Cagli R, Miller KD, Priebe NJ, Lengyel M. Response sub-additivity and variability quenching in visual cortex. Nat Rev Neurosci 2024; 25:237-252. [PMID: 38374462] [PMCID: PMC11444047] [DOI: 10.1038/s41583-024-00795-0]
Abstract
Sub-additivity and variability are ubiquitous response motifs in the primary visual cortex (V1). Response sub-additivity enables the construction of useful interpretations of the visual environment, whereas response variability indicates the factors that limit the precision with which the brain can do this. There is increasing evidence that experimental manipulations that elicit response sub-additivity often also quench response variability. Here, we provide an overview of these phenomena and suggest that they may have common origins. We discuss empirical findings and recent model-based insights into the functional operations, computational objectives and circuit mechanisms underlying V1 activity. These different modelling approaches all predict that response sub-additivity and variability quenching often co-occur. The phenomenology of these two response motifs, as well as many of the insights obtained about them in V1, generalize to other cortical areas. Thus, the connection between response sub-additivity and variability quenching may be a canonical motif across the cortex.
Affiliation(s)
- Robbe L T Goris
- Center for Perceptual Systems, University of Texas at Austin, Austin, TX, USA
- Ruben Coen-Cagli
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
- Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, USA
- Kenneth D Miller
- Center for Theoretical Neuroscience, Columbia University, New York, NY, USA
- Kavli Institute for Brain Science, Columbia University, New York, NY, USA
- Department of Neuroscience, College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Swartz Program in Theoretical Neuroscience, Columbia University, New York, NY, USA
- Nicholas J Priebe
- Center for Learning and Memory, University of Texas at Austin, Austin, TX, USA
- Máté Lengyel
- Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, UK
- Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary
22
Brady TF, Störmer VS. Comparing memory capacity across stimuli requires maximally dissimilar foils: Using deep convolutional neural networks to understand visual working memory capacity for real-world objects. Mem Cognit 2024; 52:595-609. [PMID: 37973770] [DOI: 10.3758/s13421-023-01485-5]
Abstract
The capacity of visual working and visual long-term memory plays a critical role in theories of cognitive architecture and the relationship between memory and other cognitive systems. Here, we argue that before asking the question of how capacity varies across different stimuli or what the upper bound of capacity is for a given memory system, it is necessary to establish a methodology that allows a fair comparison between distinct stimulus sets and conditions. One of the most important factors determining performance in a memory task is target/foil dissimilarity. We argue that only by maximizing the dissimilarity of the target and foil in each stimulus set can we provide a fair basis for memory comparisons between stimuli. In the current work we focus on a way to pick such foils objectively for complex, meaningful real-world objects by using deep convolutional neural networks, and we validate this using both memory tests and similarity metrics. Using this method, we then provide evidence that there is a greater capacity for real-world objects relative to simple colors in visual working memory; critically, we also show that this difference can be reduced or eliminated when non-comparable foils are used, potentially explaining why previous work has not always found such a difference. Our study thus demonstrates that working memory capacity depends on the type of information that is remembered and that assessing capacity depends critically on foil dissimilarity, especially when comparing memory performance and other cognitive systems across different stimulus sets.
Affiliation(s)
- Timothy F Brady
- Department of Psychology, University of California San Diego, La Jolla, CA, 92093, USA
- Viola S Störmer
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
23
Dwivedi K, Sadiya S, Balode MP, Roig G, Cichy RM. Visual features are processed before navigational affordances in the human brain. Sci Rep 2024; 14:5573. [PMID: 38448446] [PMCID: PMC10917749] [DOI: 10.1038/s41598-024-55652-y]
Abstract
To navigate through their immediate environment, humans process scene information rapidly. How does the cascade of neural processing elicited by scene viewing unfold over time to facilitate navigational planning? To investigate, we recorded human brain responses to visual scenes with electroencephalography and related those to computational models that operationalize three aspects of scene processing (2D, 3D, and semantic information), as well as to a behavioral model capturing navigational affordances. We found a temporal processing hierarchy: navigational affordance is processed later than the other scene features (2D, 3D, and semantic) investigated. This reveals the temporal order with which the human brain computes complex scene information and suggests that the brain leverages these pieces of information to plan navigation.
Affiliation(s)
- Kshitij Dwivedi
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany
- Sari Sadiya
- Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany
- Frankfurt Institute for Advanced Studies (FIAS), Frankfurt, Germany
- Marta P Balode
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
- Institute of Neuroinformatics, ETH Zurich and University of Zurich, Zurich, Switzerland
- Gemma Roig
- Department of Computer Science, Goethe University Frankfurt, Frankfurt, Germany
- The Hessian Center for Artificial Intelligence (hessian.AI), Darmstadt, Germany
- Radoslaw M Cichy
- Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany
24
Jang H, Tong F. Improved modeling of human vision by incorporating robustness to blur in convolutional neural networks. Nat Commun 2024; 15:1989. [PMID: 38443349] [PMCID: PMC10915141] [DOI: 10.1038/s41467-024-45679-0]
Abstract
Whenever a visual scene is cast onto the retina, much of it will appear degraded due to poor resolution in the periphery; moreover, optical defocus can cause blur in central vision. However, the pervasiveness of blurry or degraded input is typically overlooked in the training of convolutional neural networks (CNNs). We hypothesized that the absence of blurry training inputs may cause CNNs to rely excessively on high spatial frequency information for object recognition, thereby causing systematic deviations from biological vision. We evaluated this hypothesis by comparing standard CNNs with CNNs trained on a combination of clear and blurry images. We show that blur-trained CNNs outperform standard CNNs at predicting neural responses to objects across a variety of viewing conditions. Moreover, blur-trained CNNs acquire increased sensitivity to shape information and greater robustness to multiple forms of visual noise, leading to improved correspondence with human perception. Our results provide multi-faceted neurocomputational evidence that blurry visual experiences may be critical for conferring robustness to biological visual systems.
Affiliation(s)
- Hojin Jang: Department of Psychology, Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Frank Tong: Department of Psychology, Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA
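The clear-plus-blurry training diet described above can be mimicked with a simple Gaussian-blur augmentation. A hedged numpy sketch (the sigmas and mixing probability below are illustrative defaults, not the paper's settings):

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur of a 2-D grayscale image (edge padding)."""
    k = gaussian_kernel1d(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def blur_augment(batch, p_blur=0.5, sigmas=(1.0, 2.0, 4.0), rng=None):
    """Return a batch where each image is blurred with probability p_blur,
    giving the network a mixed clear/blurry training diet."""
    rng = rng or np.random.default_rng()
    return np.stack([blur(img, rng.choice(sigmas)) if rng.random() < p_blur else img
                     for img in batch])

rng = np.random.default_rng(0)
img = rng.random((32, 32))
blurred = blur(img, sigma=2.0)
batch = blur_augment(np.stack([img] * 4), p_blur=1.0, rng=np.random.default_rng(1))
```

Blurring suppresses high spatial frequencies, which is exactly the information the paper argues standard CNNs lean on too heavily.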
25
Loke J, Seijdel N, Snoek L, Sörensen LKA, van de Klundert R, van der Meer M, Quispel E, Cappaert N, Scholte HS. Human Visual Cortex and Deep Convolutional Neural Network Care Deeply about Object Background. J Cogn Neurosci 2024; 36:551-566. PMID: 38165735; DOI: 10.1162/jocn_a_02098.
Abstract
Deep convolutional neural networks (DCNNs) are able to partially predict brain activity during object categorization tasks, but factors contributing to this predictive power are not fully understood. Our study aimed to investigate the factors contributing to the predictive power of DCNNs in object categorization tasks. We compared the activity of four DCNN architectures with EEG recordings obtained from 62 human participants during an object categorization task. Previous physiological studies on object categorization have highlighted the importance of figure-ground segregation, the ability to distinguish objects from their backgrounds. Therefore, we investigated whether figure-ground segregation could explain the predictive power of DCNNs. Using a stimulus set consisting of identical target objects embedded in different backgrounds, we examined the influence of object background versus object category within both EEG and DCNN activity. Crucially, the recombination of naturalistic objects and experimentally controlled backgrounds creates a challenging and naturalistic task, while retaining experimental control. Our results showed that early EEG activity (< 100 msec) and early DCNN layers represent object background rather than object category. We also found that the ability of DCNNs to predict EEG activity is primarily influenced by how both systems process object backgrounds, rather than object categories. We demonstrated the role of figure-ground segregation as a potential prerequisite for recognition of object features by contrasting the activations of trained and untrained (i.e., random weights) DCNNs. These findings suggest that both human visual cortex and DCNNs prioritize the segregation of object backgrounds and target objects to perform object categorization. Altogether, our study provides new insights into the mechanisms underlying object categorization as we demonstrated that both human visual cortex and DCNNs care deeply about object background.
26
Wang J, Cao R, Chakravarthula PN, Li X, Wang S. A critical period for developing face recognition. Patterns (N Y) 2024; 5:100895. PMID: 38370121; PMCID: PMC10873156; DOI: 10.1016/j.patter.2023.100895.
Abstract
Face learning has important critical periods during development. However, the computational mechanisms of critical periods remain unknown. Here, we conducted a series of in silico experiments and showed that, similar to humans, deep artificial neural networks exhibited critical periods during which a stimulus deficit could impair the development of face learning. Face learning could only be restored when providing information within the critical period, whereas, outside of the critical period, the model could not incorporate new information anymore. We further provided a full computational account in terms of the learning rate and demonstrated an alternative approach, based on knowledge distillation and attention transfer, to partially recover the model outside of the critical period. We finally showed that model performance and recovery were associated with identity-selective units and with the correspondence to primate visual systems. Our present study not only reveals computational mechanisms underlying face learning but also points to strategies to restore impaired face learning.
Affiliation(s)
- Jinge Wang: Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Runnan Cao: Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA; Department of Radiology, Washington University in St. Louis, St. Louis, MO 63110, USA
- Xin Li: Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA; Department of Computer Science, University at Albany, Albany, NY 12222, USA
- Shuo Wang: Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA; Department of Radiology, Washington University in St. Louis, St. Louis, MO 63110, USA
27
Srivastava S, Wang WY, Eckstein MP. Emergent human-like covert attention in feedforward convolutional neural networks. Curr Biol 2024; 34:579-593.e12. PMID: 38244541; DOI: 10.1016/j.cub.2023.12.058.
Abstract
Covert attention allows the selection of locations or features of the visual scene without moving the eyes. Cues and contexts predictive of a target's location orient covert attention and improve perceptual performance. The performance benefits are widely attributed to theories of covert attention as a limited resource, zoom, spotlight, or weighting of visual information. However, such concepts are difficult to map to neuronal populations. We show that a feedforward convolutional neural network (CNN) trained on images to optimize target detection accuracy and with no explicit incorporation of an attention mechanism, a limited resource, or feedback connections learns to utilize cues and contexts in the three most prominent covert attention tasks (Posner cueing, set size effects in search, and contextual cueing) and predicts the cue/context influences on human accuracy. The CNN's cueing/context effects generalize across network training schemes, to peripheral and central pre-cues, discrimination tasks, and reaction time measures, and critically do not vary with reductions in network resources (size). The CNN shows comparable cueing/context effects to a model that optimally uses image information to make decisions (Bayesian ideal observer) but generalizes these effects to cue instances unseen during training. Together, the findings suggest that human-like behavioral signatures of covert attention in the three landmark paradigms might be an emergent property of task accuracy optimization in neuronal populations without positing limited attentional resources. The findings might explain recent behavioral results showing cueing and context effects across a variety of simple organisms with no neocortex, from archerfish to fruit flies.
Affiliation(s)
- Sudhanshu Srivastava: Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- William Yang Wang: Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- Miguel P Eckstein: Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
28
Peters B, DiCarlo JJ, Gureckis T, Haefner R, Isik L, Tenenbaum J, Konkle T, Naselaris T, Stachenfeld K, Tavares Z, Tsao D, Yildirim I, Kriegeskorte N. How does the primate brain combine generative and discriminative computations in vision? arXiv 2024: arXiv:2401.06005v1. PMID: 38259351; PMCID: PMC10802669.
Abstract
Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted, in roughly equal numbers, in each of the two conceptions, motivated to overcome what might be a false dichotomy between them and to engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.
Affiliation(s)
- Benjamin Peters: Zuckerman Mind Brain Behavior Institute, Columbia University; School of Psychology & Neuroscience, University of Glasgow
- James J DiCarlo: Department of Brain and Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT; NSF Center for Brains, Minds and Machines, MIT; Quest for Intelligence, Schwarzman College of Computing, MIT
- Ralf Haefner: Brain and Cognitive Sciences, University of Rochester; Center for Visual Science, University of Rochester
- Leyla Isik: Department of Cognitive Science, Johns Hopkins University
- Joshua Tenenbaum: Department of Brain and Cognitive Sciences, MIT; NSF Center for Brains, Minds and Machines, MIT; Computer Science and Artificial Intelligence Laboratory, MIT
- Talia Konkle: Department of Psychology, Harvard University; Center for Brain Science, Harvard University; Kempner Institute for Natural and Artificial Intelligence, Harvard University
- Zenna Tavares: Zuckerman Mind Brain Behavior Institute, Columbia University; Data Science Institute, Columbia University
- Doris Tsao: Department of Molecular & Cell Biology, University of California Berkeley; Howard Hughes Medical Institute
- Ilker Yildirim: Department of Psychology, Yale University; Department of Statistics and Data Science, Yale University
- Nikolaus Kriegeskorte: Zuckerman Mind Brain Behavior Institute, Columbia University; Department of Psychology, Columbia University; Department of Neuroscience, Columbia University; Department of Electrical Engineering, Columbia University
29
Raman R, Bognár A, Nejad GG, Taubert N, Giese M, Vogels R. Bodies in motion: Unraveling the distinct roles of motion and shape in dynamic body responses in the temporal cortex. Cell Rep 2023; 42:113438. PMID: 37995183; PMCID: PMC10783614; DOI: 10.1016/j.celrep.2023.113438.
Abstract
The temporal cortex represents social stimuli, including bodies. We examine and compare the contributions of dynamic and static features to the single-unit responses to moving monkey bodies in and between a patch in the anterior dorsal bank of the superior temporal sulcus (dorsal patch [DP]) and patches in the anterior inferotemporal cortex (ventral patch [VP]), using fMRI guidance in macaques. The response to dynamics varies within both regions, being higher in DP. The dynamic body selectivity of VP neurons correlates with static features derived from convolutional neural networks and motion. DP neurons' dynamic body selectivity is not predicted by static features but is dominated by motion. Whereas these data support the dominance of motion in the newly proposed "dynamic social perception" stream, they challenge the traditional view that distinguishes DP and VP processing in terms of motion versus static features, underscoring the role of inferotemporal neurons in representing body dynamics.
Affiliation(s)
- Rajani Raman: Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium; Leuven Brain Institute, KU Leuven, 3000 Leuven, Belgium
- Anna Bognár: Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium; Leuven Brain Institute, KU Leuven, 3000 Leuven, Belgium
- Ghazaleh Ghamkhari Nejad: Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium; Leuven Brain Institute, KU Leuven, 3000 Leuven, Belgium
- Nick Taubert: Hertie Institute for Clinical Brain Research and Center for Integrative Neuroscience, University Clinic Tuebingen, 72074 Tuebingen, Germany
- Martin Giese: Hertie Institute for Clinical Brain Research and Center for Integrative Neuroscience, University Clinic Tuebingen, 72074 Tuebingen, Germany
- Rufin Vogels: Department of Neurosciences, KU Leuven, 3000 Leuven, Belgium; Leuven Brain Institute, KU Leuven, 3000 Leuven, Belgium
30
Shi Y, Bi D, Hesse JK, Lanfranchi FF, Chen S, Tsao DY. Rapid, concerted switching of the neural code in inferotemporal cortex. bioRxiv 2023: 2023.12.06.570341. PMID: 38106108; PMCID: PMC10723419; DOI: 10.1101/2023.12.06.570341.
Abstract
A fundamental paradigm in neuroscience is the concept of neural coding through tuning functions [1]. According to this idea, neurons encode stimuli through fixed mappings of stimulus features to firing rates. Here, we report that the tuning of visual neurons can rapidly and coherently change across a population to attend to a whole and its parts. We set out to investigate a longstanding debate concerning whether inferotemporal (IT) cortex uses a specialized code for representing specific types of objects or whether it uses a general code that applies to any object. We found that face cells in macaque IT cortex initially adopted a general code optimized for face detection. But following a rapid, concerted population event lasting < 20 ms, the neural code transformed into a face-specific one with two striking properties: (i) response gradients to principal detection-related dimensions reversed direction, and (ii) new tuning developed to multiple higher feature space dimensions supporting fine face discrimination. These dynamics were face specific and did not occur in response to objects. Overall, these results show that, for faces, face cells shift from detection to discrimination by switching from an object-general code to a face-specific code. More broadly, our results suggest a novel mechanism for neural representation: concerted, stimulus-dependent switching of the neural code used by a cortical area.
31
Schnell AE, Leemans M, Vinken K, Op de Beeck H. A computationally informed comparison between the strategies of rodents and humans in visual object recognition. eLife 2023; 12:RP87719. PMID: 38079481; PMCID: PMC10712954; DOI: 10.7554/elife.87719.
Abstract
Many species are able to recognize objects, but it has proven difficult to pinpoint and compare how different species solve this task. Recent research suggested combining computational and animal modelling to obtain a more systematic understanding of task complexity and compare strategies between species. In this study, we created a large multidimensional stimulus set and designed a visual discrimination task partially based upon modelling with a convolutional deep neural network (CNN). Experiments included rats (N = 11; 1115 daily sessions in total for all rats together) and humans (N = 45). Each species was able to master the task and generalize to a variety of new images. Nevertheless, rats and humans showed very little convergence in terms of which object pairs were associated with high and low performance, suggesting the use of different strategies. There was an interaction between species and whether stimulus pairs favoured early or late processing in a CNN. A direct comparison with CNN representations and visual feature analyses revealed that rat performance was best captured by late convolutional layers and partially by visual features such as brightness and pixel-level similarity, while human performance related more to the higher-up fully connected layers. These findings highlight the additional value of using a computational approach for the design of object recognition tasks. Overall, this computationally informed investigation of object recognition behaviour reveals a strong discrepancy in strategies between rodent and human vision.
Affiliation(s)
- Maarten Leemans: Department of Brain and Cognition & Leuven Brain Institute, Leuven, Belgium
- Kasper Vinken: Department of Neurobiology, Harvard Medical School, Boston, United States
- Hans Op de Beeck: Department of Brain and Cognition & Leuven Brain Institute, Leuven, Belgium
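The "visual feature analyses" mentioned in this abstract (brightness, pixel-level similarity) are easy to state precisely as per-pair image statistics that can then be regressed against per-pair performance. A sketch with synthetic images and a simulated accuracy profile (everything here is illustrative toy data, not the study's stimuli or rat behavior):

```python
import numpy as np

def pixel_similarity(a, b):
    """Pixel-level similarity: Pearson correlation of the flattened images."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

def brightness_difference(a, b):
    """Absolute difference in mean luminance between two images."""
    return abs(float(a.mean()) - float(b.mean()))

rng = np.random.default_rng(5)
pairs = [(rng.random((16, 16)), rng.random((16, 16))) for _ in range(40)]
sim = np.array([pixel_similarity(a, b) for a, b in pairs])
bright = np.array([brightness_difference(a, b) for a, b in pairs])

# toy observer whose per-pair accuracy falls as the two images become more
# similar at the pixel level; the kind of relationship the study tests
acc = 0.9 - 0.5 * sim + 0.02 * rng.standard_normal(len(pairs))
r = float(np.corrcoef(sim, acc)[0, 1])
```

Correlating such low-level feature vectors with per-pair accuracy is how one can tell whether an observer (rat, human, or CNN layer) is leaning on early visual features rather than object identity.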
32
McMahon E, Isik L. Seeing social interactions. Trends Cogn Sci 2023; 27:1165-1179. PMID: 37805385; PMCID: PMC10841760; DOI: 10.1016/j.tics.2023.09.001.
Abstract
Seeing the interactions between other people is a critical part of our everyday visual experience, but recognizing the social interactions of others is often considered outside the scope of vision and grouped with higher-level social cognition like theory of mind. Recent work, however, has revealed that recognition of social interactions is efficient and automatic, is well modeled by bottom-up computational algorithms, and occurs in visually-selective regions of the brain. We review recent evidence from these three methodologies (behavioral, computational, and neural) that converge to suggest the core of social interaction perception is visual. We propose a computational framework for how this process is carried out in the brain and offer directions for future interdisciplinary investigations of social perception.
Affiliation(s)
- Emalie McMahon: Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
- Leyla Isik: Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
33
von Seth J, Nicholls VI, Tyler LK, Clarke A. Recurrent connectivity supports higher-level visual and semantic object representations in the brain. Commun Biol 2023; 6:1207. PMID: 38012301; PMCID: PMC10682037; DOI: 10.1038/s42003-023-05565-9.
Abstract
Visual object recognition has been traditionally conceptualised as a predominantly feedforward process through the ventral visual pathway. While feedforward artificial neural networks (ANNs) can achieve human-level classification on some image-labelling tasks, it is unclear whether computational models of vision alone can accurately capture the evolving spatiotemporal neural dynamics. Here, we probe these dynamics using a combination of representational similarity and connectivity analyses of fMRI and MEG data recorded during the recognition of familiar, unambiguous objects. Modelling the visual and semantic properties of our stimuli using an artificial neural network as well as a semantic feature model, we find that unique aspects of the neural architecture and connectivity dynamics relate to visual and semantic object properties. Critically, we show that recurrent processing between the anterior and posterior ventral temporal cortex relates to higher-level visual properties prior to semantic object properties, in addition to semantic-related feedback from the frontal lobe to the ventral temporal lobe between 250 and 500 ms after stimulus onset. These results demonstrate the distinct contributions made by semantic object properties in explaining neural activity and connectivity, highlighting them as a core part of object recognition not fully accounted for by current biologically inspired neural networks.
Affiliation(s)
- Jacqueline von Seth: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Lorraine K Tyler: Department of Psychology, University of Cambridge, Cambridge, UK; Cambridge Centre for Ageing and Neuroscience (Cam-CAN), University of Cambridge and MRC Cognition and Brain Sciences Unit, Cambridge, UK
- Alex Clarke: Department of Psychology, University of Cambridge, Cambridge, UK
34
Pinheiro-Chagas P, Sava-Segal C, Akkol S, Daitch A, Parvizi J. Spatiotemporal dynamics of successive activations across the human brain during simple arithmetic processing. bioRxiv 2023: 2023.11.22.568334. PMID: 38045319; PMCID: PMC10690273; DOI: 10.1101/2023.11.22.568334.
Abstract
Previous neuroimaging studies have offered unique insights about the spatial organization of activations and deactivations across the brain; however, they were not powered to explore the exact timing of events at the subsecond scale combined with precise anatomical source information at the level of individual brains. As a result, we know little about the order of engagement across different brain regions during a given cognitive task. Using experimental arithmetic tasks as a prototype for human-unique symbolic processing, we recorded directly across 10,076 brain sites in 85 human subjects (52% female) using intracranial electroencephalography (iEEG). Our data revealed a remarkably distributed change of activity in almost half of the sampled sites. Notably, an orderly successive activation of a set of brain regions, anatomically consistent across subjects, was observed in individual brains. Furthermore, the temporal order of activations across these sites was replicable across subjects and trials. Moreover, the degree of functional connectivity between the sites decreased as a function of temporal distance between regions, suggesting that information is partially leaked or transformed along the processing chain. Furthermore, in each activated region, distinct neuronal populations with opposite activity patterns during target and control conditions were juxtaposed in an anatomically orderly manner. Our study complements prior imaging studies by providing hitherto unknown information about the timing of events in the brain during arithmetic processing. Such findings can be a basis for developing mechanistic computational models of human-specific cognitive symbolic systems.
Significance statement
Our study elucidates the spatiotemporal dynamics and anatomical specificity of brain activations across >10,000 sites during arithmetic tasks, as captured by intracranial EEG. We discovered an orderly, successive activation of brain regions, consistent across individuals, and a decrease in functional connectivity as a function of temporal distance between regions. Our findings provide unprecedented insights into the sequence of cognitive processing and regional interactions, offering a novel perspective for enhancing computational models of cognitive symbolic systems.
35
Jarne C, Laje R. Exploring weight initialization, diversity of solutions, and degradation in recurrent neural networks trained for temporal and decision-making tasks. J Comput Neurosci 2023; 51:407-431. PMID: 37561278; DOI: 10.1007/s10827-023-00857-9.
Abstract
Recurrent Neural Networks (RNNs) are frequently used to model aspects of brain function and structure. In this work, we trained small fully-connected RNNs to perform temporal and flow control tasks with time-varying stimuli. Our results show that different RNNs can solve the same task by converging to different underlying dynamics, and also how performance gracefully degrades as network size is decreased, interval duration is increased, or connectivity damage is induced. For the considered tasks, we explored how robust the network obtained after training is under different task parameterizations. In the process, we developed a framework that can be useful for parameterizing other tasks of interest in computational neuroscience. Our results help quantify different aspects of these models, which are normally used as black boxes and need to be understood in order to model the biological response of cerebral cortex areas.
Affiliation(s)
- Cecilia Jarne: Universidad Nacional de Quilmes, Departamento de Ciencia y Tecnología, Bernal, Buenos Aires, Argentina; CONICET, Buenos Aires, Argentina; Center for Functionally Integrative Neuroscience, Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Rodrigo Laje: Universidad Nacional de Quilmes, Departamento de Ciencia y Tecnología, Bernal, Buenos Aires, Argentina; CONICET, Buenos Aires, Argentina
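The connectivity-damage analysis described in this entry can be illustrated in miniature: zero out a growing fraction of a toy RNN's recurrent weights and measure how far the damaged trajectory drifts from the intact one. A minimal numpy sketch (random untrained weights with illustrative parameters, not the paper's trained models):

```python
import numpy as np

def run_rnn(W, x, steps=50):
    """Iterate h <- tanh(W h + x) from rest and return the state trajectory."""
    h = np.zeros(W.shape[0])
    traj = []
    for _ in range(steps):
        h = np.tanh(W @ h + x)
        traj.append(h.copy())
    return np.array(traj)

def damage(W, frac, rng):
    """Connectivity damage: zero a random fraction of recurrent weights."""
    return W * (rng.random(W.shape) >= frac)

rng = np.random.default_rng(1)
n = 64
W = rng.standard_normal((n, n)) / np.sqrt(n)   # random recurrent weights
x = 0.5 * rng.standard_normal(n)               # constant external input

ref = run_rnn(W, x)
drift = {frac: float(np.linalg.norm(run_rnn(damage(W, frac, rng), x) - ref))
         for frac in (0.0, 0.1, 0.3, 0.5)}
```

Sweeping the damage fraction and plotting the drift (or, in the paper's setting, task performance) against it is what makes the "graceful degradation" claim quantitative.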
36
Karapetian A, Boyanova A, Pandaram M, Obermayer K, Kietzmann TC, Cichy RM. Empirically Identifying and Computationally Modeling the Brain-Behavior Relationship for Human Scene Categorization. J Cogn Neurosci 2023; 35:1879-1897. PMID: 37590093; PMCID: PMC10586810; DOI: 10.1162/jocn_a_02043.
Abstract
Humans effortlessly make quick and accurate perceptual decisions about the nature of their immediate visual environment, such as the category of the scene they face. Previous research has revealed a rich set of cortical representations potentially underlying this feat. However, it remains unknown which of these representations are suitably formatted for decision-making. Here, we approached this question empirically and computationally, using neuroimaging and computational modeling. For the empirical part, we collected EEG data and RTs from human participants during a scene categorization task (natural vs. man-made). We then related the EEG data to behavior using a multivariate extension of signal detection theory. We observed a correlation between neural data and behavior specifically between ∼100 msec and ∼200 msec after stimulus onset, suggesting that the neural scene representations in this time period are suitably formatted for decision-making. For the computational part, we evaluated a recurrent convolutional neural network (RCNN) as a model of brain and behavior. Unifying our previous observations in an image-computable model, the RCNN accurately predicted the neural representations, the behavioral scene categorization data, and the relationship between them. Our results identify and computationally characterize the neural and behavioral correlates of scene categorization in humans.
Collapse
Affiliation(s)
- Agnessa Karapetian
- Freie Universität Berlin, Germany
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Klaus Obermayer
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Technische Universität Berlin, Germany
- Humboldt-Universität zu Berlin, Germany
- Radoslaw M Cichy
- Freie Universität Berlin, Germany
- Charité - Universitätsmedizin Berlin, Einstein Center for Neurosciences Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Humboldt-Universität zu Berlin, Germany
37
Velarde OM, Makse HA, Parra LC. Architecture of the brain's visual system enhances network stability and performance through layers, delays, and feedback. PLoS Comput Biol 2023; 19:e1011078. [PMID: 37948463 PMCID: PMC10664920 DOI: 10.1371/journal.pcbi.1011078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2023] [Revised: 11/22/2023] [Accepted: 10/19/2023] [Indexed: 11/12/2023] Open
Abstract
In the visual system of primates, image information propagates across successive cortical areas, and there is also local feedback within an area and long-range feedback across areas. Recent findings suggest that the resulting temporal dynamics of neural activity are crucial in several vision tasks. In contrast, artificial neural network models of vision are typically feedforward and do not capitalize on the benefits of temporal dynamics, partly due to concerns about stability and computational costs. In this study, we focus on recurrent networks with feedback connections for visual tasks with static input corresponding to a single fixation. We demonstrate mathematically that a network's dynamics can be stabilized by four key features of biological networks: layer-ordered structure, temporal delays between layers, longer-distance feedback across layers, and nonlinear neuronal responses. Conversely, when feedback has a fixed distance, one can omit delays in feedforward connections to achieve more efficient artificial implementations. We also evaluated the effect of feedback connections on object detection and classification performance using standard benchmarks, specifically the COCO and CIFAR10 datasets. Our findings indicate that feedback connections improved the detection of small objects, and classification performance became more robust to noise. We found that performance increased over the course of the temporal dynamics, not unlike what is observed in primate core vision. These results suggest that delays and layered organization are crucial features for stability and performance in both biological and artificial recurrent neural networks.
Affiliation(s)
- Osvaldo Matias Velarde
- Biomedical Engineering Department, The City College of New York, New York, New York, United States of America
- Hernán A. Makse
- Levich Institute and Physics Department, The City College of New York, New York, New York, United States of America
- Lucas C. Parra
- Biomedical Engineering Department, The City College of New York, New York, New York, United States of America
38
Toosi T, Issa EB. Brain-like Flexible Visual Inference by Harnessing Feedback-Feedforward Alignment. ARXIV 2023:arXiv:2310.20599v1. [PMID: 37961740 PMCID: PMC10635293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 11/15/2023]
Abstract
In natural vision, feedback connections support versatile visual inference capabilities such as making sense of occluded or noisy bottom-up sensory information or mediating pure top-down processes such as imagination. However, how the feedback pathway learns to support these capabilities flexibly remains unclear. We propose that top-down effects emerge through alignment between feedforward and feedback pathways, each optimizing its own objectives. To achieve this co-optimization, we introduce Feedback-Feedforward Alignment (FFA), a learning algorithm that leverages feedback and feedforward pathways as mutual credit assignment computational graphs, enabling alignment. In our study, we demonstrate the effectiveness of FFA in co-optimizing classification and reconstruction tasks on the widely used MNIST and CIFAR10 datasets. Notably, the alignment mechanism in FFA endows feedback connections with emergent visual inference functions, including denoising, resolving occlusions, hallucination, and imagination. Moreover, FFA is more biologically plausible in its implementation than traditional back-propagation (BP). By repurposing the computational graph of credit assignment into a goal-driven feedback pathway, FFA alleviates the weight transport problem encountered in BP, enhancing the bio-plausibility of the learning algorithm. Our study presents FFA as a promising proof-of-concept for the mechanisms underlying how feedback connections in the visual cortex support flexible visual functions. This work also contributes to the broader field of visual inference underlying perceptual phenomena and has implications for developing more biologically inspired learning algorithms.
Affiliation(s)
- Tahereh Toosi
- Center for Theoretical Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY
- Elias B. Issa
- Department of Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY
39
Pham TQ, Matsui T, Chikazoe J. Evaluation of the Hierarchical Correspondence between the Human Brain and Artificial Neural Networks: A Review. BIOLOGY 2023; 12:1330. [PMID: 37887040 PMCID: PMC10604784 DOI: 10.3390/biology12101330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Revised: 09/22/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023]
Abstract
Artificial neural networks (ANNs) that are heavily inspired by the human brain now achieve human-level performance across multiple task domains. ANNs have thus drawn attention in neuroscience, raising the possibility of providing a framework for understanding the information encoded in the human brain. However, the correspondence between ANNs and the brain cannot be measured directly. They differ in outputs and substrates, neurons in the brain vastly outnumber their ANN analogs (i.e., nodes), and the key algorithm responsible for most modern ANN training (i.e., backpropagation) is likely absent from the brain. Neuroscientists have thus taken a variety of approaches to examine the similarity between the brain and ANNs at multiple levels of their information hierarchy. This review provides an overview of the currently available approaches and their limitations for evaluating brain-ANN correspondence.
Affiliation(s)
- Teppei Matsui
- Graduate School of Brain Science, Doshisha University, Kyoto 610-0321, Japan
40
Schmid D, Jarvers C, Neumann H. Canonical circuit computations for computer vision. BIOLOGICAL CYBERNETICS 2023; 117:299-329. [PMID: 37306782 PMCID: PMC10600314 DOI: 10.1007/s00422-023-00966-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Accepted: 05/18/2023] [Indexed: 06/13/2023]
Abstract
Advanced computer vision mechanisms have been inspired by neuroscientific findings. However, with the focus on improving benchmark achievements, technical solutions have been shaped by application and engineering constraints. This includes the training of neural networks which led to the development of feature detectors optimally suited to the application domain. However, the limitations of such approaches motivate the need to identify computational principles, or motifs, in biological vision that can enable further foundational advances in machine vision. We propose to utilize structural and functional principles of neural systems that have been largely overlooked. They potentially provide new inspirations for computer vision mechanisms and models. Recurrent feedforward, lateral, and feedback interactions characterize general principles underlying processing in mammals. We derive a formal specification of core computational motifs that utilize these principles. These are combined to define model mechanisms for visual shape and motion processing. We demonstrate how such a framework can be adopted to run on neuromorphic brain-inspired hardware platforms and can be extended to automatically adapt to environment statistics. We argue that the identified principles and their formalization inspire sophisticated computational mechanisms with improved explanatory scope. These and other elaborated, biologically inspired models can be employed to design computer vision solutions for different tasks and to advance learning-based neural network architectures.
Affiliation(s)
- Daniel Schmid
- Institute for Neural Information Processing, Ulm University, James-Franck-Ring, Ulm, 89081 Germany
- Christian Jarvers
- Institute for Neural Information Processing, Ulm University, James-Franck-Ring, Ulm, 89081 Germany
- Heiko Neumann
- Institute for Neural Information Processing, Ulm University, James-Franck-Ring, Ulm, 89081 Germany
41
Li JS, Sarma AA, Sejnowski TJ, Doyle JC. Internal feedback in the cortical perception-action loop enables fast and accurate behavior. Proc Natl Acad Sci U S A 2023; 120:e2300445120. [PMID: 37738297 PMCID: PMC10523540 DOI: 10.1073/pnas.2300445120] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Accepted: 07/18/2023] [Indexed: 09/24/2023] Open
Abstract
Animals move smoothly and reliably in unpredictable environments. Models of sensorimotor control, drawing on control theory, have assumed that sensory information from the environment leads to actions, which then act back on the environment, creating a single, unidirectional perception-action loop. However, the sensorimotor loop contains internal delays in sensory and motor pathways, which can lead to unstable control. We show here that these delays can be compensated for by internal feedback signals that flow backward, from motor toward sensory areas. This internal feedback is ubiquitous in neural sensorimotor systems, and we show how internal feedback compensates for internal delays. This is accomplished by filtering out self-generated and other predictable changes so that unpredicted, actionable information can be rapidly transmitted toward action by the fastest components, effectively compressing the sensory input to more efficiently use feedforward pathways: Tracts of fast, giant neurons necessarily convey less accurate signals than tracts with many smaller neurons, but they are crucial for fast and accurate behavior. We use a mathematically tractable control model to show that internal feedback has an indispensable role in achieving state estimation, localization of function (how different parts of the cortex control different parts of the body), and attention, all of which are crucial for effective sensorimotor control. This control model can explain anatomical, physiological, and behavioral observations, including motor signals in the visual cortex, heterogeneous kinetics of sensory receptors, and the presence of giant cells in the cortex of humans as well as internal feedback patterns and unexplained heterogeneity in neural systems.
Affiliation(s)
- Jing Shuang Li
- Control and Dynamical Systems, Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA 91125
- Anish A. Sarma
- Control and Dynamical Systems, Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA 91125
- School of Medicine, Vanderbilt University, Nashville, TN 37232
- Terrence J. Sejnowski
- Department of Neurobiology, Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037
- Department of Neurobiology, Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093
- John C. Doyle
- Control and Dynamical Systems, Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA 91125
42
Borgomaneri S, Zanon M, Di Luzio P, Cataneo A, Arcara G, Romei V, Tamietto M, Avenanti A. Increasing associative plasticity in temporo-occipital back-projections improves visual perception of emotions. Nat Commun 2023; 14:5720. [PMID: 37737239 PMCID: PMC10517146 DOI: 10.1038/s41467-023-41058-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2021] [Accepted: 08/17/2023] [Indexed: 09/23/2023] Open
Abstract
The posterior superior temporal sulcus (pSTS) is a critical node in a network specialized for perceiving emotional facial expressions that is reciprocally connected with early visual cortices (V1/V2). Current models of perceptual decision-making increasingly assign relevance to recursive processing for visual recognition. However, it is unknown whether inducing plasticity into reentrant connections from pSTS to V1/V2 impacts emotion perception. Using a combination of electrophysiological and neurostimulation methods, we demonstrate that strengthening the connectivity from pSTS to V1/V2 selectively increases the ability to perceive facial expressions associated with emotions. This behavior is associated with increased electrophysiological activity in both these brain regions, particularly in V1/V2, and depends on specific temporal parameters of stimulation that follow Hebbian principles. Therefore, we provide evidence that pSTS-to-V1/V2 back-projections are instrumental to perception of emotion from facial stimuli and functionally malleable via manipulation of associative plasticity.
Affiliation(s)
- Sara Borgomaneri
- Centro studi e ricerche in Neuroscienze Cognitive, Dipartimento di Psicologia "Renzo Canestrari", Alma Mater Studiorum Università di Bologna, Cesena Campus, Cesena, Italy
- Marco Zanon
- Centro studi e ricerche in Neuroscienze Cognitive, Dipartimento di Psicologia "Renzo Canestrari", Alma Mater Studiorum Università di Bologna, Cesena Campus, Cesena, Italy
- Neuroscience Area, International School for Advanced Studies (SISSA), Trieste, Italy
- Paolo Di Luzio
- Centro studi e ricerche in Neuroscienze Cognitive, Dipartimento di Psicologia "Renzo Canestrari", Alma Mater Studiorum Università di Bologna, Cesena Campus, Cesena, Italy
- Antonio Cataneo
- Centro studi e ricerche in Neuroscienze Cognitive, Dipartimento di Psicologia "Renzo Canestrari", Alma Mater Studiorum Università di Bologna, Cesena Campus, Cesena, Italy
- Vincenzo Romei
- Centro studi e ricerche in Neuroscienze Cognitive, Dipartimento di Psicologia "Renzo Canestrari", Alma Mater Studiorum Università di Bologna, Cesena Campus, Cesena, Italy
- Facultad de Lenguas y Educación, Universidad Antonio de Nebrija, Madrid, 28015, Spain
- Marco Tamietto
- Dipartimento di Psicologia, Università degli Studi di Torino, Torino, Italy
- Department of Medical and Clinical Psychology, Tilburg University, Tilburg, The Netherlands
- Alessio Avenanti
- Centro studi e ricerche in Neuroscienze Cognitive, Dipartimento di Psicologia "Renzo Canestrari", Alma Mater Studiorum Università di Bologna, Cesena Campus, Cesena, Italy
- Centro de Investigación en Neuropsicología y Neurociencias Cognitivas, Universidad Católica del Maule, Talca, Chile
43
Pan X, DeForge A, Schwartz O. Generalizing biological surround suppression based on center surround similarity via deep neural network models. PLoS Comput Biol 2023; 19:e1011486. [PMID: 37738258 PMCID: PMC10550176 DOI: 10.1371/journal.pcbi.1011486] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2023] [Revised: 10/04/2023] [Accepted: 09/04/2023] [Indexed: 09/24/2023] Open
Abstract
Sensory perception is dramatically influenced by context. Models of contextual neural surround effects in vision have mostly accounted for Primary Visual Cortex (V1) data, via nonlinear computations such as divisive normalization. However, surround effects are not well understood within a hierarchy, for neurons with more complex stimulus selectivity beyond V1. We utilized feedforward deep convolutional neural networks and developed a gradient-based technique to visualize the most suppressive and excitatory surround. We found that deep neural networks exhibited a key signature of surround effects in V1, highlighting center stimuli that visually stand out from the surround and suppressing responses when the surround stimulus is similar to the center. We found that in some neurons, especially in late layers, when the center stimulus was altered, the most suppressive surround can, surprisingly, follow the change. Through the visualization approach, we generalized previous understanding of surround effects to more complex stimuli, in ways that have not been revealed in visual cortices. In contrast, the suppression based on center-surround similarity was not observed in an untrained network. We identified further successes and mismatches of the feedforward CNNs to the biology. Our results provide a testable hypothesis of surround effects in higher visual cortices, and the visualization approach could be adopted in future biological experimental designs.
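A toy rendering of the key signature above, stronger suppression when the surround resembles the center, is the classic divisive-normalization account; the function below is an illustrative sketch with made-up parameters, not the paper's gradient-based visualization method.

```python
# Toy divisive-normalization sketch of center-surround suppression:
# the center response is divided by a pool that grows with surround
# drive weighted by center-surround similarity. Parameters illustrative.

def normalized_response(center_drive, surround_drive, similarity,
                        sigma=1.0, k=2.0):
    """Center drive divided by a normalization pool; `similarity`
    in [0, 1] scales how strongly the surround enters the pool."""
    pool = sigma + k * similarity * surround_drive
    return center_drive / pool

similar = normalized_response(10.0, 5.0, similarity=1.0)     # matched surround
dissimilar = normalized_response(10.0, 5.0, similarity=0.1)  # mismatched
print(similar < dissimilar)  # True: similar surrounds suppress more
```

With these numbers the matched surround cuts the response to 10/11 while the mismatched surround leaves it at 5.0, reproducing the similarity-dependent suppression the abstract describes.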
Affiliation(s)
- Xu Pan
- Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
- Annie DeForge
- School of Information, University of California, Berkeley, CA, United States of America
- Bentley University, Waltham, MA, United States of America
- Odelia Schwartz
- Department of Computer Science, University of Miami, Coral Gables, FL, United States of America
44
Rowland JM, van der Plas TL, Loidolt M, Lees RM, Keeling J, Dehning J, Akam T, Priesemann V, Packer AM. Propagation of activity through the cortical hierarchy and perception are determined by neural variability. Nat Neurosci 2023; 26:1584-1594. [PMID: 37640911 PMCID: PMC10471496 DOI: 10.1038/s41593-023-01413-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2022] [Accepted: 07/18/2023] [Indexed: 08/31/2023]
Abstract
Brains are composed of anatomically and functionally distinct regions performing specialized tasks, but regions do not operate in isolation. Orchestration of complex behaviors requires communication between brain regions, but how neural dynamics are organized to facilitate reliable transmission is not well understood. Here we studied this process directly by generating neural activity that propagates between brain regions and drives behavior, assessing how neural populations in sensory cortex cooperate to transmit information. We achieved this by imaging two densely interconnected regions, the primary and secondary somatosensory cortex (S1 and S2), in mice while performing two-photon photostimulation of S1 neurons and assigning behavioral salience to the photostimulation. We found that the probability of perception is determined not only by the strength of the photostimulation but also by the variability of S1 neural activity. Therefore, maximizing the signal-to-noise ratio of the stimulus representation in cortex relative to the noise or variability is critical to facilitate activity propagation and perception.
Affiliation(s)
- James M Rowland
- Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Thijs L van der Plas
- Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Matthias Loidolt
- Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Laboratory for Molecular Cell Biology, University College London, London, UK
- Robert M Lees
- Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Science and Technology Facilities Council, Octopus Imaging Facility, Research Complex at Harwell, Harwell Campus, Oxfordshire, UK
- Joshua Keeling
- Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Jonas Dehning
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Thomas Akam
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
- Viola Priesemann
- Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany
- Institute for the Dynamics of Complex Systems, University of Göttingen, Göttingen, Germany
- Adam M Packer
- Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
45
Vacher J, Launay C, Mamassian P, Coen-Cagli R. Measuring uncertainty in human visual segmentation. PLoS Comput Biol 2023; 19:e1011483. [PMID: 37747914 PMCID: PMC10553811 DOI: 10.1371/journal.pcbi.1011483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2023] [Revised: 10/05/2023] [Accepted: 08/31/2023] [Indexed: 09/27/2023] Open
Abstract
Segmenting visual stimuli into distinct groups of features and visual objects is central to visual function. Classical psychophysical methods have helped uncover many rules of human perceptual segmentation, and recent progress in machine learning has produced successful algorithms. Yet, the computational logic of human segmentation remains unclear, partially because we lack well-controlled paradigms to measure perceptual segmentation maps and compare models quantitatively. Here we propose a new, integrated approach: given an image, we measure multiple pixel-based same-different judgments and perform model-based reconstruction of the underlying segmentation map. The reconstruction is robust to several experimental manipulations and captures the variability of individual participants. We demonstrate the validity of the approach on human segmentation of natural images and composite textures. We show that image uncertainty affects measured human variability, and it influences how participants weigh different visual features. Because any putative segmentation algorithm can be inserted to perform the reconstruction, our paradigm affords quantitative tests of theories of perception as well as new benchmarks for segmentation algorithms.
Affiliation(s)
- Jonathan Vacher
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Claire Launay
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Pascal Mamassian
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Ruben Coen-Cagli
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, New York, United States of America
46
Miao HY, Tong F. Convolutional neural network models of neuronal responses in macaque V1 reveal limited non-linear processing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.26.554952. [PMID: 37693397 PMCID: PMC10491131 DOI: 10.1101/2023.08.26.554952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/12/2023]
Abstract
Computational models of the primary visual cortex (V1) have suggested that V1 neurons behave like Gabor filters followed by simple non-linearities. However, recent work employing convolutional neural network (CNN) models has suggested that V1 relies on far more non-linear computations than previously thought. Specifically, unit responses in an intermediate layer of VGG-19 were found to best predict macaque V1 responses to thousands of natural and synthetic images. Here, we evaluated the hypothesis that the poor performance of lower-layer units in VGG-19 might be attributable to their small receptive field size rather than to their lack of complexity per se. We compared VGG-19 with AlexNet, which has much larger receptive fields in its lower layers. Whereas the best-performing layer of VGG-19 occurred after seven non-linear steps, the first convolutional layer of AlexNet best predicted V1 responses. Although VGG-19's predictive accuracy was somewhat better than standard AlexNet, we found that a modified version of AlexNet could match VGG-19's performance after only a few non-linear computations. Control analyses revealed that decreasing the size of the input images caused the best-performing layer of VGG-19 to shift to a lower layer, consistent with the hypothesis that the relationship between image size and receptive field size can strongly affect model performance. We conducted additional analyses using a Gabor pyramid model to test for non-linear contributions of normalization and contrast saturation. Overall, our findings suggest that the feedforward responses of V1 neurons can be well explained by assuming only a few non-linear processing stages.
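The baseline model the abstract contrasts with deep CNN features, a Gabor filter followed by a simple non-linearity, can be sketched as follows; the filter parameters and the half-wave rectification choice are illustrative assumptions, not the authors' fitted model.

```python
import numpy as np

# Illustrative "Gabor filter + simple non-linearity" V1 unit.
def gabor(size=21, sf=0.2, theta=0.0, sigma=4.0, phase=0.0):
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    x_rot = x * np.cos(theta) + y * np.sin(theta)   # rotate to orientation
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * sf * x_rot + phase)

def v1_response(image, filt):
    drive = float(np.sum(image * filt))  # linear filtering
    return max(drive, 0.0)               # half-wave rectification

g = gabor(theta=0.0)
preferred = gabor(theta=0.0)           # stimulus matched to the filter
orthogonal = gabor(theta=np.pi / 2)    # orthogonal orientation
print(v1_response(preferred, g) > v1_response(orthogonal, g))  # True
```

A unit like this is orientation-tuned after just one non-linear step, which is the kind of shallow account the study argues can explain much of the feedforward V1 response.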
Affiliation(s)
- Hui-Yuan Miao
- Department of Psychology, Vanderbilt University, Nashville, TN, 37240, USA
- Frank Tong
- Department of Psychology, Vanderbilt University, Nashville, TN, 37240, USA
- Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, 37240, USA
47
Veerabadran V, Goldman J, Shankar S, Cheung B, Papernot N, Kurakin A, Goodfellow I, Shlens J, Sohl-Dickstein J, Mozer MC, Elsayed GF. Subtle adversarial image manipulations influence both human and machine perception. Nat Commun 2023; 14:4933. [PMID: 37582834 PMCID: PMC10427626 DOI: 10.1038/s41467-023-40499-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2021] [Accepted: 08/01/2023] [Indexed: 08/17/2023] Open
Abstract
Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations: subtle modulations of natural images that result in changes to classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it drives one to ask whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images to which both humans and ANNs are sensitive, rather than by the detailed architecture of the ANN.
Affiliation(s)
- Vijay Veerabadran
- Google, Mountain View, CA, USA
- Department of Cognitive Science, University of California, San Diego, CA, USA
- Shreya Shankar
- Google, Mountain View, CA, USA
- University of California, Berkeley, CA, USA
- Brian Cheung
- Google, Mountain View, CA, USA
- MIT Brain and Cognitive Sciences, Cambridge, MA, USA
48
Dobs K, Yuan J, Martinez J, Kanwisher N. Behavioral signatures of face perception emerge in deep neural networks optimized for face recognition. Proc Natl Acad Sci U S A 2023; 120:e2220642120. [PMID: 37523537 PMCID: PMC10410721 DOI: 10.1073/pnas.2220642120] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Accepted: 06/08/2023] [Indexed: 08/02/2023] Open
Abstract
Human face recognition is highly accurate and exhibits a number of distinctive and well-documented behavioral "signatures" such as the use of a characteristic representational space, the disproportionate performance cost when stimuli are presented upside down, and the drop in accuracy for faces from races the participant is less familiar with. These and other phenomena have long been taken as evidence that face recognition is "special". But why does human face perception exhibit these properties in the first place? Here, we use deep convolutional neural networks (CNNs) to test the hypothesis that all of these signatures of human face perception result from optimization for the task of face recognition. Indeed, as predicted by this hypothesis, these phenomena are all found in CNNs trained on face recognition, but not in CNNs trained on object recognition, even when additionally trained to detect faces while matching the amount of face experience. To test whether these signatures are in principle specific to faces, we optimized a CNN on car discrimination and tested it on upright and inverted car images. As we found for face perception, the car-trained network showed a drop in performance for inverted vs. upright cars. Similarly, CNNs trained on inverted faces produced an inverted face inversion effect. These findings show that the behavioral signatures of human face perception reflect and are well explained as the result of optimization for the task of face recognition, and that the nature of the computations underlying this task may not be so special after all.
Affiliation(s)
- Katharina Dobs
- Department of Psychology, Justus Liebig University Giessen, Giessen 35394, Germany
- Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Marburg 35302, Germany
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Joanne Yuan
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Julio Martinez
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Psychology, Stanford University, Stanford, CA 94305
- Nancy Kanwisher
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
49
Bernáez Timón L, Ekelmans P, Kraynyukova N, Rose T, Busse L, Tchumatchenko T. How to incorporate biological insights into network models and why it matters. J Physiol 2023; 601:3037-3053. [PMID: 36069408 DOI: 10.1113/jp282755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2022] [Accepted: 08/24/2022] [Indexed: 11/08/2022] Open
Abstract
Due to the staggering complexity of the brain and its neural circuitry, neuroscientists rely on the analysis of mathematical models to elucidate its function. From Hodgkin and Huxley's detailed description of the action potential in 1952 to today, new theories and increasing computational power have opened up novel avenues to study how neural circuits implement the computations that underlie behaviour. Computational neuroscientists have developed many models of neural circuits that differ in complexity, biological realism or emergent network properties. With recent advances in experimental techniques for detailed anatomical reconstructions or large-scale activity recordings, rich biological data have become more available. The challenge when building network models is to reflect experimental results, either through a high level of detail or by finding an appropriate level of abstraction. Meanwhile, machine learning has facilitated the development of artificial neural networks, which are trained to perform specific tasks. While they have proven successful at achieving task-oriented behaviour, they are often abstract constructs that differ in many features from the physiology of brain circuits. Thus, it is unclear whether the mechanisms underlying computation in biological circuits can be investigated by analysing artificial networks that accomplish the same function but differ in their mechanisms. Here, we argue that building biologically realistic network models is crucial to establishing causal relationships between neurons, synapses, circuits and behaviour. More specifically, we advocate for network models that consider the connectivity structure and the recorded activity dynamics while evaluating task performance.
Affiliation(s)
- Laura Bernáez Timón
- Institute for Physiological Chemistry, University of Mainz Medical Center, Mainz, Germany
- Pierre Ekelmans
- Frankfurt Institute for Advanced Studies, Frankfurt, Germany
- Nataliya Kraynyukova
- Institute of Experimental Epileptology and Cognition Research, University of Bonn Medical Center, Bonn, Germany
- Tobias Rose
- Institute of Experimental Epileptology and Cognition Research, University of Bonn Medical Center, Bonn, Germany
- Laura Busse
- Division of Neurobiology, Faculty of Biology, LMU Munich, Munich, Germany
- Bernstein Center for Computational Neuroscience, Munich, Germany
- Tatjana Tchumatchenko
- Institute for Physiological Chemistry, University of Mainz Medical Center, Mainz, Germany
- Institute of Experimental Epileptology and Cognition Research, University of Bonn Medical Center, Bonn, Germany
50
Tschantz A, Millidge B, Seth AK, Buckley CL. Hybrid predictive coding: Inferring, fast and slow. PLoS Comput Biol 2023; 19:e1011280. [PMID: 37531366 PMCID: PMC10395865 DOI: 10.1371/journal.pcbi.1011280] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2022] [Accepted: 06/20/2023] [Indexed: 08/04/2023] Open
Abstract
Predictive coding is an influential model of cortical neural activity. It proposes that perceptual beliefs are furnished by sequentially minimising "prediction errors"-the differences between predicted and observed data. Implicit in this proposal is the idea that successful perception requires multiple cycles of neural activity. This is at odds with evidence that several aspects of visual perception-including complex forms of object recognition-arise from an initial "feedforward sweep" that occurs on fast timescales which preclude substantial recurrent activity. Here, we propose that the feedforward sweep can be understood as performing amortized inference (applying a learned function that maps directly from data to beliefs) and recurrent processing can be understood as performing iterative inference (sequentially updating neural activity in order to improve the accuracy of beliefs). We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner by describing both in terms of a dual optimization of a single objective function. We show that the resulting scheme can be implemented in a biologically plausible neural architecture that approximates Bayesian inference utilising local Hebbian update rules. We demonstrate that our hybrid predictive coding model combines the benefits of both amortized and iterative inference-obtaining rapid and computationally cheap perceptual inference for familiar data while maintaining the context-sensitivity, precision, and sample efficiency of iterative inference schemes. Moreover, we show how our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense. Hybrid predictive coding offers a new perspective on the functional relevance of the feedforward and recurrent activity observed during visual perception and offers novel insights into distinct aspects of visual phenomenology.
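The dual scheme summarised above can be caricatured in a linear setting: iterative inference is gradient descent on the prediction error, while an amortized mapping jumps directly from data to a belief that a few iterative steps can then refine. A minimal sketch, assuming a noise-free linear generative model and using the pseudoinverse as a stand-in for a trained amortized (recognition) network (both assumptions of this sketch, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear generative model: data x are predicted from latent beliefs z via x_hat = W z.
W = rng.standard_normal((8, 4))
z_true = rng.standard_normal(4)
x = W @ z_true                      # observed data (noise-free for clarity)

def iterative_inference(x, z0, lr=0.02, steps=5):
    # Gradient descent on the squared prediction error 0.5 * ||x - W z||^2.
    z = z0.copy()
    for _ in range(steps):
        err = x - W @ z             # prediction error
        z += lr * (W.T @ err)       # belief update driven by the error
    return z

# Amortized inference: one feedforward pass from data to beliefs.
# A learned network is stood in for here by the pseudoinverse of W.
amortized = lambda x: np.linalg.pinv(W) @ x

z_iter = iterative_inference(x, z0=np.zeros(4))      # slow: infer from scratch
z_hybrid = iterative_inference(x, z0=amortized(x))   # fast: refine amortized guess

err_iter = np.linalg.norm(x - W @ z_iter)
err_hybrid = np.linalg.norm(x - W @ z_hybrid)
print(err_iter, err_hybrid)  # hybrid reaches near-zero error within the same budget
```

For "familiar" data the amortized pass already lands close to the optimum, so the hybrid scheme spends little iterative computation; for data the amortized map handles poorly, the same iterative updates still drive the belief toward the optimum, which is the adaptive trade-off the abstract describes.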
Affiliation(s)
- Alexander Tschantz
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
- Sussex Centre for Consciousness Science, University of Sussex, Brighton, United Kingdom
- Beren Millidge
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
- Brain Networks Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Anil K. Seth
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- Sussex Centre for Consciousness Science, University of Sussex, Brighton, United Kingdom
- Christopher L. Buckley
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America