1. Lalwani P, Polk T, Garrett DD. Modulation of brain signal variability in visual cortex reflects aging, GABA, and behavior. eLife 2025;14:e83865. PMID: 40243542; PMCID: PMC12005714; DOI: 10.7554/elife.83865.
Abstract
Moment-to-moment neural variability has been shown to scale positively with the complexity of stimulus input. However, the mechanisms underlying the ability to align variability to input complexity are unknown. Using a combination of behavioral methods, computational modeling, fMRI, MR spectroscopy, and pharmacological intervention, we investigated the role of aging and GABA in neural variability during visual processing. We replicated previous findings that participants expressed higher variability when viewing more complex visual stimuli. Additionally, we found that such variability modulation was associated with higher baseline visual GABA levels and was reduced in older adults. When pharmacologically increasing GABA activity, we found that participants with lower baseline GABA levels showed a drug-related increase in variability modulation while participants with higher baseline GABA showed no change or even a reduction, consistent with an inverted-U account. Finally, higher baseline GABA and variability modulation were jointly associated with better visual-discrimination performance. These results suggest that GABA plays an important role in how humans utilize neural variability to adapt to the complexity of the visual world.
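The study's core measure — moment-to-moment BOLD variability and its modulation by stimulus complexity — can be illustrated with a minimal numpy sketch. This is not the authors' pipeline; the block-wise SD computation, detrending choice, and all variable names are illustrative assumptions.

```python
import numpy as np

def sd_bold(timeseries):
    # moment-to-moment variability: SD of the signal over time,
    # after removing a linear drift (an assumed, minimal preprocessing step)
    t = np.arange(len(timeseries))
    fit = np.polyval(np.polyfit(t, timeseries, 1), t)
    return float(np.std(timeseries - fit))

def variability_modulation(complex_blocks, simple_blocks):
    # positive values mean variability scales up with stimulus complexity,
    # i.e. the modulation effect the study links to GABA and age
    return (np.mean([sd_bold(b) for b in complex_blocks])
            - np.mean([sd_bold(b) for b in simple_blocks]))

rng = np.random.default_rng(3)
# toy data: "complex-stimulus" blocks are noisier by construction
complex_blocks = [2.0 * rng.standard_normal(100) for _ in range(8)]
simple_blocks = [1.0 * rng.standard_normal(100) for _ in range(8)]
print(variability_modulation(complex_blocks, simple_blocks) > 0)  # True
```

In the toy data the complex-condition SD is built in, so the modulation score is positive; in the study the analogous per-participant score is what covaries with GABA and age.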
Affiliation(s)
- Poortata Lalwani: Department of Psychology, University of Michigan, Ann Arbor, United States
- Thad Polk: Department of Psychology, University of Michigan, Ann Arbor, United States
- Douglas D Garrett: Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany; Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin, Germany
2. Hansen BC, Greene MR, Lewinsohn HAS, Kris AE, Smyth S, Tang B. Brain-guided convolutional neural networks reveal task-specific representations in scene processing. Sci Rep 2025;15:13025. PMID: 40234494; PMCID: PMC12000445; DOI: 10.1038/s41598-025-96307-w.
Abstract
Scene categorization is the dominant proxy for visual understanding, yet humans can perform a large number of visual tasks within any scene. Consequently, we know little about how different tasks change how a scene is processed and represented, and how its features are ultimately used. Here, we developed a novel brain-guided convolutional neural network (CNN) in which each convolutional layer was separately guided by neural responses taken at different time points while observers performed a pre-cued object detection task or a scene affordance task on the same set of images. We then reconstructed each layer's activation maps via deconvolution to spatially assess how different features were used within each task. The brain-guided CNN made use of image features that human observers identified as being crucial to completing each task, an effect that emerged around 244 ms and persisted until 402 ms. Critically, because the same images were used across the two tasks, the CNN could only succeed if the neural data captured task-relevant differences. Our analyses of the activation maps across layers revealed that the brain's spatiotemporal representation of local image features evolves systematically over time. This underscores how distinct image features emerge at different stages of processing, shaped by the observer's goals and behavioral context.
Affiliation(s)
- Bruce C Hansen: Department of Psychological & Brain Sciences, Neuroscience Program, Colgate University, Hamilton, NY, USA
- Michelle R Greene: Barnard College, Department of Psychology, Columbia University, New York, NY, USA
- Henry A S Lewinsohn: Department of Psychological & Brain Sciences, Neuroscience Program, Colgate University, Hamilton, NY, USA
- Audrey E Kris: Department of Psychological & Brain Sciences, Neuroscience Program, Colgate University, Hamilton, NY, USA
- Sophie Smyth: Department of Psychological & Brain Sciences, Neuroscience Program, Colgate University, Hamilton, NY, USA
- Binghui Tang: Department of Psychological & Brain Sciences, Neuroscience Program, Colgate University, Hamilton, NY, USA
3. Huang S, Howard CM, Bogdan PC, Morales-Torres R, Slayton M, Cabeza R, Davis SW. Trial-level Representational Similarity Analysis. bioRxiv 2025:2025.03.27.645646. PMID: 40236023; PMCID: PMC11996353; DOI: 10.1101/2025.03.27.645646.
Abstract
Neural representation refers to the brain activity that stands in for one's cognitive experience, and in cognitive neuroscience the principal method for studying neural representations is representational similarity analysis (RSA). The classic RSA (cRSA) approach examines the overall quality of representations across numerous items by assessing the correspondence between two representational similarity matrices (RSMs): one based on a theoretical model of stimulus similarity and the other based on similarity in measured neural data. However, because cRSA cannot model representation at the level of individual trials, it is fundamentally limited in its ability to assess subject-, stimulus-, and trial-level variances that all influence representation. Here, we formally introduce trial-level RSA (tRSA), an analytical framework that estimates the strength of neural representation for individual experimental trials and evaluates hypotheses using multi-level models. First, we verified the correspondence between tRSA and cRSA in quantifying the overall representation strength across all trials. Second, we compared the statistical inferences drawn from both approaches using simulated data that reflected a wide range of scenarios. Compared to cRSA, the multi-level framework of tRSA was both more theoretically appropriate and significantly more sensitive to true effects. Third, using real fMRI datasets, we further demonstrated several issues with cRSA to which tRSA was more robust. Finally, we presented novel findings of neural representations that could only be assessed with tRSA and not cRSA. In summary, tRSA proves to be a robust and versatile analytical approach for cognitive neuroscience and beyond.
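The cRSA baseline this abstract describes — correlating a model RSM with a neural RSM — can be sketched in a few lines of numpy. The data and helper names are illustrative; real analyses commonly use the Spearman correlation over off-diagonal RSM entries, as approximated here with a simple rank transform.

```python
import numpy as np

def rank(x):
    # simple rank transform (assumes no ties, true for continuous data)
    return np.argsort(np.argsort(x)).astype(float)

def rsm(patterns):
    # item-by-item representational similarity matrix:
    # correlation between activation patterns across items
    return np.corrcoef(patterns)

def crsa(model_rsm, neural_rsm):
    # classic RSA: Spearman correlation of the lower-triangular entries
    # of the model RSM and the neural RSM
    idx = np.tril_indices_from(model_rsm, k=-1)
    return float(np.corrcoef(rank(model_rsm[idx]), rank(neural_rsm[idx]))[0, 1])

rng = np.random.default_rng(0)
model = rsm(rng.standard_normal((20, 50)))    # hypothetical stimulus model
neural = rsm(rng.standard_normal((20, 100)))  # hypothetical voxel patterns
print(crsa(model, neural))   # near 0 for unrelated toy data
print(crsa(neural, neural))  # ~1 by construction
```

The point of tRSA is that this single summary correlation discards trial-level structure; tRSA instead scores each trial's neural-to-model fit and models those scores hierarchically.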
4. Altavini TS, Chen M, Astorga G, Yan Y, Li W, Freiwald W, Gilbert CD. Expectation-dependent stimulus selectivity in the ventral visual cortical pathway. Proc Natl Acad Sci U S A 2025;122:e2406684122. PMID: 40146852; PMCID: PMC12002251; DOI: 10.1073/pnas.2406684122.
Abstract
The hierarchical view of the ventral object recognition pathway is primarily based on feedforward mechanisms, starting from a fixed basis set of object primitives and ending with a representation of whole objects in the inferotemporal cortex. Here, we provide a different view. Rather than being a fixed "labeled line" for a specific feature, neurons continually change their stimulus selectivities on a moment-to-moment basis, as dictated by top-down influences of object expectation and perceptual task. We also derive the selectivity for stimulus features from an ethologically curated stimulus set, based on a delayed match-to-sample task, which identifies components that are informative for object recognition in addition to full objects; the top-down effects were seen for both informative and uninformative components. Cortical areas responding to these stimuli were identified with functional MRI in order to guide placement of chronically implanted electrode arrays.
Affiliation(s)
- Tiago S. Altavini: Laboratory of Neurobiology, The Rockefeller University, New York, NY 10065
- Minggui Chen: Laboratory of Neurobiology, The Rockefeller University, New York, NY 10065
- Guadalupe Astorga: Laboratory of Neurobiology, The Rockefeller University, New York, NY 10065
- Yin Yan: Beijing Normal University, Beijing 100875, China
- Wu Li: Beijing Normal University, Beijing 100875, China
- Winrich Freiwald: Laboratory of Neurobiology, The Rockefeller University, New York, NY 10065
- Charles D. Gilbert: Laboratory of Neurobiology, The Rockefeller University, New York, NY 10065
5. An NM, Roh H, Kim S, Kim JH, Im M. Machine Learning Techniques for Simulating Human Psychophysical Testing of Low-Resolution Phosphene Face Images in Artificial Vision. Adv Sci (Weinh) 2025;12:e2405789. PMID: 39985243; PMCID: PMC12005743; DOI: 10.1002/advs.202405789.
Abstract
To evaluate the quality of artificial visual percepts generated by emerging methodologies, researchers often rely on labor-intensive and tedious human psychophysical experiments. These experiments necessitate repeated iterations upon any major or minor modification of the hardware or software configuration. Here, the capacity of standard machine learning (ML) models to accurately replicate quaternary match-to-sample tasks, using low-resolution facial images represented by arrays of phosphenes as input stimuli, is investigated. First, the performance of ML models trained to approximate innate human facial recognition abilities is analyzed on a dataset comprising 3600 phosphene images of human faces. Then, owing to time constraints and the potential for subject fatigue, the psychophysical test is limited to presenting only 720 low-resolution phosphene images to 36 human subjects. Notably, the best model adeptly mirrors the behavioral trend of the human subjects, offering precise predictions for 8 out of 9 phosphene quality levels on the overlapping test queries. Finally, human recognition performance for untested phosphene images is predicted, streamlining the process and minimizing the need for additional psychophysical tests. The findings underscore the transformative potential of ML in reshaping the research paradigm of visual prosthetics, facilitating the expedited advancement of prostheses.
Affiliation(s)
- Na Min An: Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea. Present address: Kim Jaechul Graduate School of AI, KAIST, Seoul 02455, Republic of Korea
- Hyeonhee Roh: Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea
- Sein Kim: Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea
- Jae Hun Kim: Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; Sensor System Research Center, Advanced Materials and Systems Research Division, KIST, Seoul 02792, Republic of Korea
- Maesoon Im: Brain Science Institute, Korea Institute of Science and Technology (KIST), Seoul 02792, Republic of Korea; Division of Bio-Medical Science and Technology, University of Science and Technology (UST), Seoul 02792, Republic of Korea; KHU-KIST Department of Converging Science and Technology, Kyung Hee University, Seoul 02447, Republic of Korea
6. Zhu X, Watson DM, Rogers D, Andrews TJ. View-symmetric representations of faces in human and artificial neural networks. Neuropsychologia 2025;207:109061. PMID: 39645227; DOI: 10.1016/j.neuropsychologia.2024.109061.
Abstract
View symmetry has been suggested to be an important intermediate representation between view-specific and view-invariant representations of faces in the human brain. Here, we compared view symmetry in humans and a deep convolutional neural network (DCNN) trained to recognise faces. First, we compared the output of the DCNN to head rotations in yaw (left-right), pitch (up-down) and roll (in-plane rotation). For yaw, an initial view-specific representation was evident in the convolutional layers, but a view-symmetric representation emerged in the fully-connected layers. Consistent with a role in the recognition of faces, we found that view-symmetric responses to yaw were greater for same-identity compared to different-identity faces. In contrast, we did not find a similar transition from view-specific to view-symmetric representations in the DCNN for either pitch or roll. These findings suggest that view symmetry emerges when opposite rotations of the head lead to mirror images. Next, we compared the view-symmetric patterns of response to yaw in the DCNN with corresponding behavioural and neural responses in humans. We found that responses in the fully-connected layers of the DCNN correlated with judgements of perceptual similarity and with the responses of higher visual regions. These findings suggest that view-symmetric representations may be a computationally efficient way to represent faces in humans and artificial neural networks for the recognition of identity.
Affiliation(s)
- Xun Zhu: Department of Psychology, University of York, YO10 4PF, UK
- David M Watson: Department of Psychology, University of York, YO10 4PF, UK
- Daniel Rogers: Department of Psychology, University of York, YO10 4PF, UK
7. Golmohamadian M, Faraji M, Fallah F, Sharifizadeh F, Ebrahimpour R. Flexibility in choosing decision policies in gathering discrete evidence over time. PLoS One 2025;20:e0316320. PMID: 39808606; PMCID: PMC11731777; DOI: 10.1371/journal.pone.0316320.
Abstract
The brain can remarkably adapt its decision-making process to suit a dynamic environment and diverse aims and demands. The brain's flexibility can be classified into three categories: flexibility in choosing solutions, decision policies, and actions. We employ two experiments to explore flexibility in decision policy: a visual object categorization task and an auditory object categorization task. Both tasks required participants to accumulate discrete evidence over time, the only difference being the sensory modality of the stimuli. We aim to investigate how the brain demonstrates flexibility in selecting decision policies in different sensory contexts when the solution and action remain the same. Our results indicate that the brain's decision policy for integrating information is independent of the inter-pulse interval across these two tasks. However, the decision policy based on how the brain ranks the first and second pulses of evidence changes flexibly. We show that the sequence of pulses does not affect choice accuracy in the auditory mode; in the visual mode, however, the first pulse has the greater leverage on decisions. Our research underscores the importance of incorporating diverse contexts to improve our understanding of the brain's flexibility in real-world decision-making.
Affiliation(s)
- Masoumeh Golmohamadian: School of Cognitive Sciences (SCS), Institute for Research in Fundamental Science (IPM), Tehran, Iran
- Mehrbod Faraji: School of Cognitive Sciences (SCS), Institute for Research in Fundamental Science (IPM), Tehran, Iran; Department of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
- Fatemeh Fallah: School of Cognitive Sciences (SCS), Institute for Research in Fundamental Science (IPM), Tehran, Iran
- Fatemeh Sharifizadeh: School of Cognitive Sciences (SCS), Institute for Research in Fundamental Science (IPM), Tehran, Iran
- Reza Ebrahimpour: Center for Cognitive Science, Institute for Convergence Science and Technology (ICST), Sharif University of Technology, Tehran, Iran
8. Li B, Todo Y, Tang Z. Artificial Visual System for Stereo-Orientation Recognition Based on Hubel-Wiesel Model. Biomimetics (Basel) 2025;10:38. PMID: 39851754; PMCID: PMC11762170; DOI: 10.3390/biomimetics10010038.
Abstract
Stereo-orientation selectivity is a fundamental neural mechanism in the brain that plays a crucial role in perception. However, because the recognition of high-dimensional spatial information commonly occurs in higher-order cortex, we still know little about the mechanisms underlying stereo-orientation selectivity and lack a modeling strategy. A classical explanation for the mechanism of two-dimensional orientation selectivity within the primary visual cortex is the Hubel-Wiesel model, a cascading neural connection structure. The local-to-global information-aggregation principle within the Hubel-Wiesel model not only contributed to neurophysiology but also inspired developments in computer vision. In this paper, we provide a clear and efficient conceptual understanding of stereo-orientation selectivity, propose a quantitative explanation for its generation based on the local-to-global information aggregation of the Hubel-Wiesel model, and develop an artificial visual system (AVS) for stereo-orientation recognition. Our approach models depth-selective cells that receive depth information, simple stereo-orientation-selective cells that combine distinct depth inputs to generate various local stereo-orientation selectivities, and complex stereo-orientation-selective cells that integrate the same local information to generate global stereo-orientation selectivity. Simulation results demonstrate that our AVS is effective in stereo-orientation recognition and robust against spatial noise jitter, achieving over 90% overall accuracy on noisy data in orientation recognition tasks and significantly outperforming deep models. In addition, the AVS enhances deep models' performance, robustness, and stability in 3D object recognition tasks. Notably, the AVS improved the overall performance of the TransNeXt model from 73.1% to 97.2% on the 3D-MNIST dataset and from 56.1% to 86.4% on the 3D-Fashion-MNIST dataset. Our explanation of the generation of stereo-orientation selectivity offers a reliable, explainable, and robust approach for extracting spatial features and provides a straightforward modeling method for neural computation research.
Affiliation(s)
- Bin Li: Division of Electrical Engineering and Computer Science, Kanazawa University, Kanazawa-shi 920-1192, Japan
- Yuki Todo: Faculty of Electrical, Information and Communication Engineering, Kanazawa University, Kanazawa-shi 920-1192, Japan
- Zheng Tang: Institute of AI for Industries, Chinese Academy of Sciences, 168 Tianquan Road, Nanjing 211100, China; School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
9. Waschke L, Kamp F, van den Elzen E, Krishna S, Lindenberger U, Rutishauser U, Garrett DD. Single-neuron spiking variability in hippocampus dynamically tracks sensory content during memory formation in humans. Nat Commun 2025;16:236. PMID: 39747026; PMCID: PMC11696175; DOI: 10.1038/s41467-024-55406-4.
Abstract
During memory formation, the hippocampus is presumed to represent the content of stimuli, but how it does so is unknown. Using computational modelling and human single-neuron recordings, we show that the more precisely hippocampal spiking variability tracks the composite features of each individual stimulus, the better those stimuli are later remembered. We propose that moment-to-moment spiking variability may provide a new window into how the hippocampus constructs memories from the building blocks of our sensory world.
Affiliation(s)
- Leonhard Waschke: Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Max Planck Institute for Human Development, Berlin, Germany; Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin, Germany
- Fabian Kamp: Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Max Planck Institute for Human Development, Berlin, Germany; Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin, Germany; Max Planck School of Cognition, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Evi van den Elzen: Tilburg School of Social and Behavioral Sciences, Tilburg University, Tilburg, The Netherlands
- Suresh Krishna: Department of Physiology, McGill University, Montreal, Canada
- Ulman Lindenberger: Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Max Planck Institute for Human Development, Berlin, Germany; Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin, Germany
- Ueli Rutishauser: Department of Neurosurgery, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Department of Neurology, Cedars-Sinai Medical Center, Los Angeles, CA, USA; Division of Biology and Bioengineering, California Institute of Technology, Pasadena, CA, USA; Center for Neural Science and Medicine, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Douglas D Garrett: Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Max Planck Institute for Human Development, Berlin, Germany; Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin, Germany
10. Greene MR, Rohan AM. The brain prioritizes the basic level of object category abstraction. Sci Rep 2025;15:31. PMID: 39747114; PMCID: PMC11695711; DOI: 10.1038/s41598-024-80546-4.
Abstract
The same object can be described at multiple levels of abstraction ("parka", "coat", "clothing"), yet human observers consistently name objects at a mid-level of specificity known as the basic level. Little is known about the temporal dynamics involved in retrieving neural representations that prioritize the basic level, nor about how these dynamics change with evolving task demands. In this study, observers viewed 1080 objects arranged in a three-tier category taxonomy while 64-channel EEG was recorded. Observers performed a categorical one-back task at the basic or subordinate level in different recording sessions. We used time-resolved multiple regression to assess the utility of superordinate-, basic-, and subordinate-level categories across the scalp. We found robust use of basic-level category information starting at about 50 ms after stimulus onset and moving from posterior electrodes (149 ms) through lateral (261 ms) to anterior sites (332 ms). Task differences were not evident in the first 200 ms of processing but were observed between 200 and 300 ms after stimulus presentation. Together, this work demonstrates that object category representations prioritize the basic level and do so relatively early, congruent with results showing that basic-level categorization is an automatic and obligatory process.
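Time-resolved multiple regression of the kind used here fits, at every timepoint, a regression of trial-wise EEG amplitude on category predictors. A minimal numpy sketch, on simulated data and with illustrative variable names (not the authors' code):

```python
import numpy as np

def time_resolved_regression(eeg, design):
    # eeg: (n_trials, n_times) amplitudes at one electrode
    # design: (n_trials, n_predictors), e.g. superordinate/basic/subordinate
    # category codes; returns (n_times, n_predictors) least-squares betas,
    # fit independently at every timepoint
    X = np.column_stack([np.ones(len(design)), design])  # add intercept
    betas, *_ = np.linalg.lstsq(X, eeg, rcond=None)      # (p + 1, n_times)
    return betas[1:].T                                   # drop intercept

rng = np.random.default_rng(2)
n_trials, n_times = 200, 50
design = rng.integers(0, 2, size=(n_trials, 3)).astype(float)
eeg = rng.standard_normal((n_trials, n_times))
eeg[:, 20:30] += 2.0 * design[:, [1]]  # inject a "basic-level" effect
betas = time_resolved_regression(eeg, design)
print(betas.shape)  # (50, 3)
```

Plotting each predictor's beta (or its explained variance) over time is what yields the onset latencies reported in the abstract; the injected effect here appears only in the middle predictor's betas between timepoints 20 and 29.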
Affiliation(s)
- Michelle R Greene: Program in Neuroscience, Bates College, Lewiston, ME, USA; Department of Psychology, Barnard College, Columbia University, 3009 Broadway, New York, NY 10027, USA
- Alyssa Magill Rohan: Program in Neuroscience, Bates College, Lewiston, ME, USA; Boston Children's Hospital, Boston, USA
11. Yu H, Zhao Q. Brain-inspired multisensory integration neural network for cross-modal recognition through spatiotemporal dynamics and deep learning. Cogn Neurodyn 2024;18:3615-3628. PMID: 39712112; PMCID: PMC11655826; DOI: 10.1007/s11571-023-09932-4.
Abstract
The integration and interaction of cross-modal senses in brain neural networks can facilitate high-level cognitive functionalities. In this work, we proposed a bioinspired multisensory integration neural network (MINN) that integrates visual and audio senses for recognizing multimodal information across different sensory modalities. This deep learning-based model incorporates a cascading framework of parallel convolutional neural networks (CNNs) for extracting intrinsic features from visual and audio inputs, and a recurrent neural network (RNN) for multimodal information integration and interaction. The network was trained using synthetic training data generated for digit recognition tasks. It was revealed that the spatial and temporal features extracted from visual and audio inputs by the CNNs were encoded in subspaces orthogonal to each other. During the integration epoch, the network state evolved along quasi-rotation-symmetric trajectories, and a structural manifold with stable attractors formed in the RNN, supporting accurate cross-modal recognition. We further evaluated the robustness of the MINN algorithm with noisy inputs and asynchronous digit inputs. Experimental results demonstrated the superior performance of MINN for flexible integration and accurate recognition of multisensory information with distinct sense properties. The present results provide insights into the computational principles governing multisensory integration and a comprehensive neural network model for brain-inspired intelligence.
Affiliation(s)
- Haitao Yu: School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
- Quanfa Zhao: School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
12. Hiramoto M, Cline HT. Identification of movie encoding neurons enables movie recognition AI. Proc Natl Acad Sci U S A 2024;121:e2412260121. PMID: 39560649; PMCID: PMC11621835; DOI: 10.1073/pnas.2412260121.
Abstract
Natural visual scenes are dominated by spatiotemporal image dynamics, but how the visual system integrates "movie" information over time is unclear. We characterized optic tectal neuronal receptive fields using sparse noise stimuli and reverse correlation analysis. Neurons recognized movies of ~200-600 ms duration with defined start and stop stimuli. Movie durations from start to stop responses were tuned by sensory experience through a hierarchical algorithm. Neurons encoded families of image sequences following trigonometric functions. Spike sequence and information flow suggest that repetitive circuit motifs underlie movie detection. Principles of frog topographic retinotectal plasticity and of cortical simple cells are employed in machine learning networks for static image recognition, suggesting that discovering principles of movie encoding in the brain, such as how image sequences and durations are encoded, may benefit movie recognition technology. We built and trained a machine learning network that mimicked the neural principles of visual system movie encoders. The network, named MovieNet, outperformed current machine learning image recognition networks in classifying natural movie scenes, while reducing data size and the number of steps needed to complete the classification task. This study reveals how movie sequences and time are encoded in the brain and demonstrates that brain-based movie-processing principles enable efficient machine learning.
Affiliation(s)
- Masaki Hiramoto: Department of Neuroscience, Dorris Neuroscience Center, Scripps Research Institute, La Jolla, CA 92037
- Hollis T. Cline: Department of Neuroscience, Dorris Neuroscience Center, Scripps Research Institute, La Jolla, CA 92037
13. Motlagh SC, Joanisse M, Wang B, Mohsenzadeh Y. Unveiling the neural dynamics of conscious perception in rapid object recognition. Neuroimage 2024;296:120668. PMID: 38848982; DOI: 10.1016/j.neuroimage.2024.120668.
Abstract
Our brain excels at recognizing objects, even when they flash by in a rapid sequence. However, the neural processes that determine whether a target image in a rapid sequence can be recognized remain elusive. We used electroencephalography (EEG) to investigate the temporal dynamics of the brain processes that shape perceptual outcomes in these challenging viewing conditions. Using naturalistic images and advanced multivariate pattern analysis (MVPA) techniques, we probed the brain dynamics governing conscious object recognition. Our results show that although initially similar, the processes for when an object can or cannot be recognized diverge around 180 ms post-appearance, coinciding with feedback neural processes. Decoding analyses indicate that gist perception (partial conscious perception) can occur at ~120 ms through feedforward mechanisms. In contrast, object identification (full conscious perception of the image) is resolved at ~190 ms after target onset, suggesting involvement of recurrent processing. These findings underscore the importance of recurrent neural connections in object recognition and awareness during rapid visual presentations.
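The time-resolved MVPA decoding behind results like these asks, at each timepoint, how well trial condition (e.g. recognized vs unrecognized) can be read out from the multichannel EEG pattern. A minimal sketch using cross-validated nearest-class-mean decoding — an illustrative stand-in for the study's actual classifiers, with simulated data and assumed names:

```python
import numpy as np

def timepoint_decoding(data, labels, n_folds=5):
    # Time-resolved MVPA sketch: at each timepoint, cross-validated
    # nearest-class-mean decoding of binary condition labels.
    # data: (n_trials, n_channels, n_times); labels: (n_trials,) in {0, 1}
    n_trials, _, n_times = data.shape
    order = np.random.default_rng(0).permutation(n_trials)
    folds = np.array_split(order, n_folds)
    acc = np.zeros(n_times)
    for t in range(n_times):
        X = data[:, :, t]
        correct = 0
        for test in folds:
            train = np.setdiff1d(order, test)
            m0 = X[train][labels[train] == 0].mean(axis=0)
            m1 = X[train][labels[train] == 1].mean(axis=0)
            pred = (np.linalg.norm(X[test] - m1, axis=1)
                    < np.linalg.norm(X[test] - m0, axis=1))
            correct += np.sum(pred == (labels[test] == 1))
        acc[t] = correct / n_trials
    return acc

rng = np.random.default_rng(5)
labels = rng.integers(0, 2, size=100)
data = rng.standard_normal((100, 20, 30))
data[labels == 1, :, 15:] += 1.0  # class difference emerges at t = 15
acc = timepoint_decoding(data, labels)
print(acc[:15].mean() < acc[15:].mean())  # True
```

The timepoint at which the accuracy curve departs from chance is the quantity interpreted in the abstract as the latency of gist perception or identification.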
Affiliation(s)
- Saba Charmi Motlagh: Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Marc Joanisse: Western Center for Brain and Mind, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada
- Boyu Wang: Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada
- Yalda Mohsenzadeh: Western Center for Brain and Mind, Western University, London, Ontario, Canada; Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; Department of Computer Science, Western University, London, Ontario, Canada
Collapse
14
Quaia C, Krauzlis RJ. Object recognition in primates: what can early visual areas contribute? Front Behav Neurosci 2024; 18:1425496. PMID: 39070778; PMCID: PMC11272660; DOI: 10.3389/fnbeh.2024.1425496. Received 04/29/2024; accepted 07/01/2024.
Abstract
Introduction: If neuroscientists were asked which brain area is responsible for object recognition in primates, most would probably answer infero-temporal (IT) cortex. While IT is likely responsible for fine discriminations, and it is accordingly dominated by foveal visual inputs, there is more to object recognition than fine discrimination. Importantly, foveation of an object of interest usually requires recognizing, with reasonable confidence, its presence in the periphery. Arguably, IT plays a secondary role in such peripheral recognition, and other visual areas might instead be more critical. Methods: To investigate how signals carried by early visual processing areas (such as LGN and V1) could be used for object recognition in the periphery, we focused here on the task of distinguishing faces from non-faces. We tested how sensitive various models were to nuisance parameters, such as changes in scale and orientation of the image, and the type of image background. Results: We found that a model of V1 simple or complex cells could provide quite reliable information, resulting in performance better than 80% in realistic scenarios. An LGN model performed considerably worse. Discussion: Because peripheral recognition is both crucial to enable fine recognition (by bringing an object of interest on the fovea), and probably sufficient to account for a considerable fraction of our daily recognition-guided behavior, we think that the current focus on area IT and foveal processing is too narrow. We propose that rather than a hierarchical system with IT-like properties as its primary aim, object recognition should be seen as a parallel process, with high-accuracy foveal modules operating in parallel with lower-accuracy and faster modules that can operate across the visual field.
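The V1 simple- and complex-cell models evaluated here build on the classic energy model: a complex cell sums the squared outputs of a quadrature (even/odd) pair of Gabor filters, responding to an oriented grating regardless of its phase, whereas a single simple-cell-like filter is phase dependent. The 1-D sketch below illustrates that idea only; the signal, filter parameters, and frequencies are invented and are not the paper's 2-D implementation.

```python
import math

def gabor(sigma, freq, phase, half_width):
    """Gaussian-windowed sinusoid (1-D Gabor receptive field)."""
    return [math.exp(-x * x / (2 * sigma ** 2))
            * math.cos(2 * math.pi * freq * x + phase)
            for x in range(-half_width, half_width + 1)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def complex_cell_energy(patch, sigma=3.0, freq=0.125):
    """Energy model: summed squared outputs of a quadrature Gabor pair."""
    hw = len(patch) // 2
    even = gabor(sigma, freq, 0.0, hw)
    odd = gabor(sigma, freq, math.pi / 2, hw)
    return dot(patch, even) ** 2 + dot(patch, odd) ** 2

def grating(phase, freq=0.125, half_width=10):
    return [math.cos(2 * math.pi * freq * x + phase)
            for x in range(-half_width, half_width + 1)]

# The complex cell's energy barely changes with grating phase...
e0 = complex_cell_energy(grating(0.0))
e1 = complex_cell_energy(grating(1.3))
# ...whereas a single (simple-cell-like) filter output is phase dependent.
even = gabor(3.0, 0.125, 0.0, 10)
s0 = dot(grating(0.0), even)
s1 = dot(grating(math.pi / 2), even)
```

Phase invariance is what makes energy-model outputs a plausible basis for peripheral detection: the response signals "oriented structure at this frequency is present" without depending on exact alignment.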
Affiliations:
- Christian Quaia: Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, United States
15
Quaia C, Krauzlis RJ. Object recognition in primates: What can early visual areas contribute? arXiv 2024: arXiv:2407.04816v1. PMID: 39398202; PMCID: PMC11468158.
Abstract
If neuroscientists were asked which brain area is responsible for object recognition in primates, most would probably answer infero-temporal (IT) cortex. While IT is likely responsible for fine discriminations, and it is accordingly dominated by foveal visual inputs, there is more to object recognition than fine discrimination. Importantly, foveation of an object of interest usually requires recognizing, with reasonable confidence, its presence in the periphery. Arguably, IT plays a secondary role in such peripheral recognition, and other visual areas might instead be more critical. To investigate how signals carried by early visual processing areas (such as LGN and V1) could be used for object recognition in the periphery, we focused here on the task of distinguishing faces from non-faces. We tested how sensitive various models were to nuisance parameters, such as changes in scale and orientation of the image, and the type of image background. We found that a model of V1 simple or complex cells could provide quite reliable information, resulting in performance better than 80% in realistic scenarios. An LGN model performed considerably worse. Because peripheral recognition is both crucial to enable fine recognition (by bringing an object of interest on the fovea), and probably sufficient to account for a considerable fraction of our daily recognition-guided behavior, we think that the current focus on area IT and foveal processing is too narrow. We propose that rather than a hierarchical system with IT-like properties as its primary aim, object recognition should be seen as a parallel process, with high-accuracy foveal modules operating in parallel with lower-accuracy and faster modules that can operate across the visual field.
Affiliations:
- Christian Quaia: Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, USA
- Richard J Krauzlis: Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, USA
16
Roshan SS, Sadeghnejad N, Sharifizadeh F, Ebrahimpour R. A neurocomputational model of decision and confidence in object recognition task. Neural Netw 2024; 175:106318. PMID: 38643618; DOI: 10.1016/j.neunet.2024.106318. Received 10/08/2023; revised 03/16/2024; accepted 04/11/2024.
Abstract
How does the brain process natural visual stimuli to make a decision? Imagine driving through fog. An object looms ahead. What do you do? This decision requires not only identifying the object but also choosing an action based on your decision confidence. In this circumstance, confidence bridges seeing and believing. Our study unveils how the brain processes visual information to make such decisions with an assessment of confidence, using a model inspired by the visual cortex. To model the process computationally, this study uses a spiking neural network inspired by the hierarchy of the mammalian visual cortex to investigate the dynamics of feedforward object recognition and decision making in the brain. The model consists of two modules: a temporal dynamic object representation module and an attractor neural network-based decision-making module. Unlike traditional models, ours captures the evolution of evidence within the visual cortex, mimicking how confidence forms in the brain. This offers a more biologically plausible approach to decision making when encountering real-world stimuli. We conducted experiments using natural stimuli and measured accuracy, reaction time, and confidence. The model's estimated confidence aligns remarkably well with human-reported confidence. Furthermore, the model can simulate the human change-of-mind phenomenon, reflecting the ongoing evaluation of evidence in the brain. These findings also suggest that decision making and confidence encoding share the same neural circuit.
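A minimal caricature of an attractor-style decision module with a confidence readout: two accumulators race to a bound, and confidence is taken as the balance of evidence (the gap between them) at the moment of decision. This is an illustrative race model, not the authors' spiking implementation; every parameter and threshold is invented.

```python
import random

def race_trial(evidence, rng, bound=30.0, noise=1.0):
    """One trial of a two-accumulator race; returns (choice, rt, confidence).

    evidence in (0, 1): momentary input favouring option A over option B.
    Confidence is the accumulator gap when the winner hits the bound.
    """
    a = b = 0.0
    t = 0
    while a < bound and b < bound:
        t += 1
        a += evidence + rng.gauss(0, noise)
        b += (1 - evidence) + rng.gauss(0, noise)
    return ("A" if a >= b else "B", t, abs(a - b))

rng = random.Random(1)
easy = [race_trial(0.8, rng) for _ in range(200)]   # strong evidence for A
hard = [race_trial(0.55, rng) for _ in range(200)]  # weak evidence for A
mean = lambda xs: sum(xs) / len(xs)
acc_easy = mean([c == "A" for c, _, _ in easy])
acc_hard = mean([c == "A" for c, _, _ in hard])
rt_easy = mean([t for _, t, _ in easy])
rt_hard = mean([t for _, t, _ in hard])
conf_easy = mean([q for _, _, q in easy])
conf_hard = mean([q for _, _, q in hard])
```

Even this stripped-down version reproduces the qualitative signatures the abstract describes: easier stimuli yield faster, more accurate, and more confident decisions.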
Affiliations:
- Setareh Sadat Roshan: Department of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran 1956836484, Iran
- Naser Sadeghnejad: School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran 1956836484, Iran
- Fatemeh Sharifizadeh: School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran 1956836484, Iran
- Reza Ebrahimpour: Center for Cognitive Science, Institute for Convergence Science & Technology, Sharif University of Technology, Tehran 14588-89694, Iran
17
Boring MJ, Richardson RM, Ghuman AS. Interacting ventral temporal gradients of timescales and functional connectivity and their relationships to visual behavior. iScience 2024; 27:110003. PMID: 38868193; PMCID: PMC11166696; DOI: 10.1016/j.isci.2024.110003. Received 11/11/2022; revised 04/02/2024; accepted 05/14/2024.
Abstract
Cortical gradients in endogenous and stimulus-evoked neurodynamic timescales, and long-range cortical interactions, provide organizational constraints to the brain and influence neural populations' roles in cognition. It is unclear how these functional gradients interrelate and which influence behavior. Here, intracranial recordings from 4,090 electrode contacts in 35 individuals map gradients of neural timescales and functional connectivity to assess their interactions along category-selective ventral temporal cortex. Endogenous and stimulus-evoked information processing timescales were not significantly correlated with one another suggesting that local neural timescales are context dependent and may arise through distinct neurophysiological mechanisms. Endogenous neural timescales correlated with functional connectivity even after removing the effects of shared anatomical gradients. Neural timescales and functional connectivity correlated with how strongly a population's activity predicted behavior in a simple visual task. These results suggest both interrelated and distinct neurophysiological processes give rise to different functional connectivity and neural timescale gradients, which together influence behavior.
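Neural timescales of the kind mapped above are commonly estimated from how quickly a signal's autocorrelation decays. A minimal sketch, assuming the common first-crossing-of-1/e criterion on AR(1) surrogate signals (not necessarily the estimator used in this paper):

```python
import math
import random

def autocorr(x, lag):
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    cov = sum((x[t] - mu) * (x[t + lag] - mu)
              for t in range(n - lag)) / (n - lag)
    return cov / var

def intrinsic_timescale(x, max_lag=50):
    """First lag (in samples) at which autocorrelation drops below 1/e."""
    for lag in range(1, max_lag):
        if autocorr(x, lag) < 1 / math.e:
            return lag
    return max_lag

def ar1(phi, n, rng):
    """AR(1) surrogate signal with autocorrelation phi**lag."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out

rng = random.Random(7)
fast = ar1(0.5, 20000, rng)   # theoretical tau ~ 1.4 samples
slow = ar1(0.95, 20000, rng)  # theoretical tau ~ 19.5 samples
ts_fast = intrinsic_timescale(fast)
ts_slow = intrinsic_timescale(slow)
```

Applied electrode by electrode, such an estimator yields the spatial gradient of endogenous timescales that the study relates to functional connectivity and behavior.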
Affiliations:
- Matthew J. Boring: Center for Neuroscience at the University of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, University of Pittsburgh and Carnegie Mellon University, Pittsburgh, PA, USA; Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
- R. Mark Richardson: Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA; Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Avniel Singh Ghuman: Center for Neuroscience at the University of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, University of Pittsburgh and Carnegie Mellon University, Pittsburgh, PA, USA; Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA, USA
18
Lee K, Dora S, Mejias JF, Bohte SM, Pennartz CMA. Predictive coding with spiking neurons and feedforward gist signaling. Front Comput Neurosci 2024; 18:1338280. PMID: 38680678; PMCID: PMC11045951; DOI: 10.3389/fncom.2024.1338280. Received 11/14/2023; accepted 03/14/2024.
Abstract
Predictive coding (PC) is an influential theory in neuroscience, which suggests the existence of a cortical architecture that is constantly generating and updating predictive representations of sensory inputs. Owing to its hierarchical and generative nature, PC has inspired many computational models of perception in the literature. However, the biological plausibility of existing models has not been sufficiently explored due to their use of artificial neurons that approximate neural activity with firing rates in the continuous time domain and propagate signals synchronously. Therefore, we developed a spiking neural network for predictive coding (SNN-PC), in which neurons communicate using event-driven and asynchronous spikes. Adopting the hierarchical structure and Hebbian learning algorithms from previous PC neural network models, SNN-PC introduces two novel features: (1) a fast feedforward sweep from the input to higher areas, which generates a spatially reduced and abstract representation of input (i.e., a neural code for the gist of a scene) and provides a neurobiological alternative to an arbitrary choice of priors; and (2) a separation of positive and negative error-computing neurons, which counters the biological implausibility of a bi-directional error neuron with a very high baseline firing rate. After training with the MNIST handwritten digit dataset, SNN-PC developed hierarchical internal representations and was able to reconstruct samples it had not seen during training. SNN-PC suggests biologically plausible mechanisms by which the brain may perform perceptual inference and learning in an unsupervised manner. In addition, it may be used in neuromorphic applications that can utilize its energy-efficient, event-driven, local learning, and parallel information processing nature.
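A rate-based, single-unit caricature of the error-unit split described above: two non-negative "populations" carry the positive and the negative part of the prediction error, and their difference drives the latent estimate until its prediction matches the input. This is a sketch of the generic predictive-coding update only, not the paper's spiking network; the weight, learning rate, and inputs are invented.

```python
def relu(v):
    return max(0.0, v)

def infer(x, w=0.8, lr=0.2, steps=100):
    """Settle latent r until its prediction w*r matches input x.

    e_pos and e_neg mirror the separate positive/negative error
    populations: each is non-negative, and their difference is the
    signed prediction error that updates the latent estimate.
    """
    r = 0.0
    for _ in range(steps):
        pred = w * r
        e_pos = relu(x - pred)   # input exceeds prediction
        e_neg = relu(pred - x)   # prediction exceeds input
        r += lr * w * (e_pos - e_neg)
    return r, w * r

r_pos, pred_pos = infer(1.0)    # prediction converges to the input
r_neg, pred_neg = infer(-0.5)   # works for either error sign
```

Splitting the error this way keeps every firing rate non-negative, which is the biological-plausibility point the abstract makes against a single bi-directional error neuron with a high baseline rate.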
Affiliations:
- Kwangjun Lee: Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Shirin Dora: Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands; Department of Computer Science, School of Science, Loughborough University, Loughborough, United Kingdom
- Jorge F. Mejias: Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
- Sander M. Bohte: Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands; Machine Learning Group, Centre of Mathematics and Computer Science, Amsterdam, Netherlands
- Cyriel M. A. Pennartz: Cognitive and Systems Neuroscience Group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, Amsterdam, Netherlands
19
Campbell A, Tanaka JW. Fast saccades to faces during the feedforward sweep. J Vis 2024; 24:16. PMID: 38630459; PMCID: PMC11037494; DOI: 10.1167/jov.24.4.16. Received 08/10/2022; accepted 09/19/2023.
Abstract
Saccadic choice tasks use eye movements as a response method, typically in a task where observers are asked to saccade as quickly as possible to an image of a prespecified target category. Using this approach, face-selective saccades have been observed within 100 ms post-stimulus. When taking oculomotor processing into account, this suggests that faces can be detected in as little as 70 to 80 ms. It has therefore been suggested that face detection must occur during the initial feedforward sweep, since this latency leaves little time for feedback processing. In the current experiment, we tested this hypothesis using backward masking, a technique shown to primarily disrupt feedback processing while leaving feedforward activation relatively intact. Based on minimum saccadic reaction time (SRT), we found that face detection benefited from ultra-fast, accurate saccades within 110 to 160 ms and that these eye movements are obtainable even under extreme masking conditions that limit perceptual awareness. However, masking did significantly increase the median SRT for faces. In the manual responses, we found remarkable detection accuracy for faces and houses, even when participants indicated having no visual experience of the test images. These results support the view that the saccadic bias to faces is initiated by coarse information used to categorize faces in the feedforward sweep but that, in most cases, additional processing is required to quickly reach the threshold for saccade initiation.
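Minimum saccadic reaction time is conventionally estimated as the earliest latency bin in which correct saccades reliably outnumber errors. The sketch below implements that common recipe with a per-bin binomial test on synthetic latencies; the bin width, test, and data are all assumptions for illustration, not the authors' analysis code.

```python
import random
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

def minimum_srt(latencies, correct, bin_ms=10, alpha=0.05, min_n=5):
    """Lower edge (ms) of the earliest bin where correct saccades
    significantly outnumber error saccades, or None if no bin qualifies."""
    bins = {}
    for lat, ok in zip(latencies, correct):
        b = int(lat // bin_ms) * bin_ms
        n_ok, n_err = bins.get(b, (0, 0))
        bins[b] = (n_ok + ok, n_err + (not ok))
    for b in sorted(bins):
        n_ok, n_err = bins[b]
        n = n_ok + n_err
        if n >= min_n and binom_tail(n_ok, n) < alpha:
            return b
    return None

# Synthetic latencies: exact 50/50 guessing before 110 ms,
# recognition-driven (95% correct) saccades from 110 ms onward.
rng = random.Random(3)
lats, oks, flip = [], [], False
for i in range(400):
    lat = 80.0 + (i % 120)
    if lat < 110:
        flip = not flip
        ok = flip
    else:
        ok = rng.random() < 0.95
    lats.append(lat)
    oks.append(ok)
min_srt = minimum_srt(lats, oks)
```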
Affiliations:
- Alison Campbell: Department of Psychology, University of Victoria, Victoria, BC, Canada (ORCID: https://orcid.org/0000-0001-6891-8609)
- James W Tanaka: Department of Psychology, University of Victoria, Victoria, BC, Canada (ORCID: https://orcid.org/0000-0001-6559-0388)
20
Sadeghnejad N, Ezoji M, Ebrahimpour R, Qodosi M, Zabbah S. A fully spiking coupled model of a deep neural network and a recurrent attractor explains dynamics of decision making in an object recognition task. J Neural Eng 2024; 21:026011. PMID: 38506115; DOI: 10.1088/1741-2552/ad2d30. Received 06/21/2022; accepted 02/26/2024.
Abstract
Objective: Object recognition, and making a choice about the recognized object, is pivotal for most animals. This process in the brain comprises information representation and decision-making steps, both of which take different amounts of time for different objects. While the dynamics of object recognition and decision making are usually ignored in object recognition models, here we propose a fully spiking hierarchical model explaining the process of object recognition from information representation to decision. Approach: Coupling a deep neural network and a recurrent attractor-based decision-making model, and using spike-timing-dependent plasticity learning rules in several convolutional and pooling layers, we propose a model that can reproduce brain behavior during an object recognition task. We also measured human choices and reaction times in a psychophysical object recognition task and used them as a reference to evaluate the model. Main results: The proposed model explains not only the probability of making a correct decision but also the time it takes to make a decision. Importantly, neural firing rates at both the feature representation and decision-making levels mimic the patterns observed in animal studies (the number of spikes (p < 10^-173) and the time of the peak response (p < 10^-31) are significantly modulated by the strength of the stimulus). Moreover, the speed-accuracy trade-off, a well-known characteristic of the decision-making process in the brain, is also observed in the model (changing the decision bound significantly affects the reaction time (p < 10^-59) and accuracy (p < 10^-165)). Significance: We propose a fully spiking deep neural network that can explain the dynamics of making a decision about an object at both the neural and behavioral levels. Results showed a strong and significant correlation (r = 0.57) between the reaction times of the model and of human participants in the psychophysical object recognition task.
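The bound-dependent speed-accuracy trade-off reported above emerges from even the simplest drift-diffusion caricature: raising the decision bound slows responses and raises accuracy. The sketch below is that generic mechanism only, not the authors' spiking model; drift, noise, and bounds are invented.

```python
import random

def ddm_trial(bound, rng, drift=0.1, noise=1.0):
    """One drift-diffusion trial; returns (correct, rt_in_steps)."""
    x, t = 0.0, 0
    while abs(x) < bound:
        t += 1
        x += drift + rng.gauss(0, noise)
    return x > 0, t

rng = random.Random(5)
def run(bound, n=300):
    trials = [ddm_trial(bound, rng) for _ in range(n)]
    accuracy = sum(c for c, _ in trials) / n
    mean_rt = sum(t for _, t in trials) / n
    return accuracy, mean_rt

acc_low, rt_low = run(bound=3.0)     # liberal bound: fast but error-prone
acc_high, rt_high = run(bound=12.0)  # conservative bound: slow but accurate
```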
Affiliations:
- Naser Sadeghnejad: Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
- Mehdi Ezoji: Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
- Reza Ebrahimpour: Center for Cognitive Science, Institute for Convergence Science and Technology (ICST), Sharif University of Technology, Tehran, Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Mohamad Qodosi: Department of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
- Sajjad Zabbah: School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran; Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom; Max Planck UCL Centre for Computational Psychiatry and Aging Research, University College London, London, United Kingdom
21
Pagonabarraga J, Bejr-Kasem H, Martinez-Horta S, Kulisevsky J. Parkinson disease psychosis: from phenomenology to neurobiological mechanisms. Nat Rev Neurol 2024; 20:135-150. PMID: 38225264; DOI: 10.1038/s41582-023-00918-8. Accepted 12/13/2023.
Abstract
Parkinson disease (PD) psychosis (PDP) is a spectrum of illusions, hallucinations and delusions that are associated with PD throughout its disease course. Psychotic phenomena can manifest from the earliest stages of PD and might follow a continuum from minor hallucinations to structured hallucinations and delusions. Initially, PDP was considered to be a complication associated with dopaminergic drug use. However, subsequent research has provided evidence that PDP arises from the progression of brain alterations caused by PD itself, coupled with the use of dopaminergic drugs. The combined dysfunction of attentional control systems, sensory processing, limbic structures, the default mode network and thalamocortical connections provides a conceptual framework to explain how new incoming stimuli are incorrectly categorized, and how aberrant hierarchical predictive processing can produce false percepts that intrude into the stream of consciousness. The past decade has seen the publication of new data on the phenomenology and neurobiological basis of PDP from the initial stages of the disease, as well as the neurotransmitter systems involved in PDP initiation and progression. In this Review, we discuss the latest clinical, neuroimaging and neurochemical evidence that could aid early identification of psychotic phenomena in PD and inform the discovery of new therapeutic targets and strategies.
Affiliations:
- Javier Pagonabarraga: Movement Disorder Unit, Neurology Department, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain; Department of Medicine, Autonomous University of Barcelona, Barcelona, Spain; Sant Pau Biomedical Research Institute (IIB-Sant Pau), Barcelona, Spain; Centro de Investigación en Red - Enfermedades Neurodegenerativas (CIBERNED), Madrid, Spain
- Helena Bejr-Kasem: Movement Disorder Unit, Neurology Department, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain; Department of Medicine, Autonomous University of Barcelona, Barcelona, Spain; Sant Pau Biomedical Research Institute (IIB-Sant Pau), Barcelona, Spain; Centro de Investigación en Red - Enfermedades Neurodegenerativas (CIBERNED), Madrid, Spain
- Saul Martinez-Horta: Movement Disorder Unit, Neurology Department, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain; Department of Medicine, Autonomous University of Barcelona, Barcelona, Spain; Sant Pau Biomedical Research Institute (IIB-Sant Pau), Barcelona, Spain; Centro de Investigación en Red - Enfermedades Neurodegenerativas (CIBERNED), Madrid, Spain
- Jaime Kulisevsky: Movement Disorder Unit, Neurology Department, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain; Department of Medicine, Autonomous University of Barcelona, Barcelona, Spain; Sant Pau Biomedical Research Institute (IIB-Sant Pau), Barcelona, Spain; Centro de Investigación en Red - Enfermedades Neurodegenerativas (CIBERNED), Madrid, Spain
22
Waschke L, Kamp F, van den Elzen E, Krishna S, Lindenberger U, Rutishauser U, Garrett DD. Single-neuron spiking variability in hippocampus dynamically tracks sensory content during memory formation in humans. bioRxiv 2024: 2023.02.23.529684. PMID: 36865320; PMCID: PMC9980052; DOI: 10.1101/2023.02.23.529684.
Abstract
During memory formation, the hippocampus is presumed to represent the content of stimuli, but how it does so is unknown. Using computational modelling and human single-neuron recordings, we show that the more precisely hippocampal spiking variability tracks the composite features of each individual stimulus, the better those stimuli are later remembered. We propose that moment-to-moment spiking variability may provide a new window into how the hippocampus constructs memories from the building blocks of our sensory world.
23
Peters B, DiCarlo JJ, Gureckis T, Haefner R, Isik L, Tenenbaum J, Konkle T, Naselaris T, Stachenfeld K, Tavares Z, Tsao D, Yildirim I, Kriegeskorte N. How does the primate brain combine generative and discriminative computations in vision? arXiv 2024: arXiv:2401.06005v1. PMID: 38259351; PMCID: PMC10802669.
Abstract
Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors of this paper include scientists rooted, in roughly equal numbers, in each of the two conceptions, and are motivated to overcome what might be a false dichotomy between them and to engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.
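The two conceptions can be made concrete on a toy problem. For two equal-variance Gaussian classes, inverting the generative model via Bayes' rule yields exactly the logistic posterior that a discriminative model with the right weights computes directly, one illustration of why the dichotomy can blur. The class names, means, and test point below are invented for the example.

```python
import math

# Generative route: class-conditional Gaussians, inverted via Bayes' rule.
MU = {"face": 1.0, "house": -1.0}   # invented class means
SIGMA = 1.0
PRIOR = {"face": 0.5, "house": 0.5}

def generative_posterior(x):
    """P(class | x) by evaluating each class's likelihood of the data."""
    num = {c: math.exp(-(x - MU[c]) ** 2 / (2 * SIGMA ** 2)) * PRIOR[c]
           for c in MU}
    z = sum(num.values())
    return {c: v / z for c, v in num.items()}

# Discriminative route: map x straight to P(face | x) with one logistic
# unit. For equal-variance Gaussians with equal priors the optimal weights
# are known in closed form, and the two routes coincide exactly.
W = (MU["face"] - MU["house"]) / SIGMA ** 2
B = (MU["house"] ** 2 - MU["face"] ** 2) / (2 * SIGMA ** 2)

def discriminative_posterior(x):
    return 1.0 / (1.0 + math.exp(-(W * x + B)))

p_gen = generative_posterior(0.7)["face"]
p_disc = discriminative_posterior(0.7)
```

The interesting cases discussed in the paper are precisely those where this equivalence breaks: richly structured generative models that no fixed feedforward mapping can invert exactly.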
Affiliations:
- Benjamin Peters: Zuckerman Mind Brain Behavior Institute, Columbia University; School of Psychology & Neuroscience, University of Glasgow
- James J DiCarlo: Department of Brain and Cognitive Sciences, MIT; McGovern Institute for Brain Research, MIT; NSF Center for Brains, Minds and Machines, MIT; Quest for Intelligence, Schwarzman College of Computing, MIT
- Ralf Haefner: Brain and Cognitive Sciences, University of Rochester; Center for Visual Science, University of Rochester
- Leyla Isik: Department of Cognitive Science, Johns Hopkins University
- Joshua Tenenbaum: Department of Brain and Cognitive Sciences, MIT; NSF Center for Brains, Minds and Machines, MIT; Computer Science and Artificial Intelligence Laboratory, MIT
- Talia Konkle: Department of Psychology, Harvard University; Center for Brain Science, Harvard University; Kempner Institute for Natural and Artificial Intelligence, Harvard University
- Zenna Tavares: Zuckerman Mind Brain Behavior Institute, Columbia University; Data Science Institute, Columbia University
- Doris Tsao: Dept of Molecular & Cell Biology, University of California Berkeley; Howard Hughes Medical Institute
- Ilker Yildirim: Department of Psychology, Yale University; Department of Statistics and Data Science, Yale University
- Nikolaus Kriegeskorte: Zuckerman Mind Brain Behavior Institute, Columbia University; Department of Psychology, Columbia University; Department of Neuroscience, Columbia University; Department of Electrical Engineering, Columbia University
24
von Seth J, Nicholls VI, Tyler LK, Clarke A. Recurrent connectivity supports higher-level visual and semantic object representations in the brain. Commun Biol 2023; 6:1207. PMID: 38012301; PMCID: PMC10682037; DOI: 10.1038/s42003-023-05565-9. Received 03/21/2023; accepted 11/09/2023.
Abstract
Visual object recognition has been traditionally conceptualised as a predominantly feedforward process through the ventral visual pathway. While feedforward artificial neural networks (ANNs) can achieve human-level classification on some image-labelling tasks, it is unclear whether computational models of vision alone can accurately capture the evolving spatiotemporal neural dynamics. Here, we probe these dynamics using a combination of representational similarity and connectivity analyses of fMRI and MEG data recorded during the recognition of familiar, unambiguous objects. Modelling the visual and semantic properties of our stimuli using an artificial neural network as well as a semantic feature model, we find that unique aspects of the neural architecture and connectivity dynamics relate to visual and semantic object properties. Critically, we show that recurrent processing between the anterior and posterior ventral temporal cortex relates to higher-level visual properties prior to semantic object properties, in addition to semantic-related feedback from the frontal lobe to the ventral temporal lobe between 250 and 500 ms after stimulus onset. These results demonstrate the distinct contributions made by semantic object properties in explaining neural activity and connectivity, highlighting them as a core part of object recognition not fully accounted for by current biologically inspired neural networks.
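Representational similarity analysis, as used above, compares the dissimilarity structure of two systems rather than their raw responses: build a representational dissimilarity matrix (RDM) per system, then correlate the RDMs. A self-contained sketch with toy condition patterns (invented numbers, not the paper's fMRI/MEG data):

```python
import math

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

def rdm(patterns):
    """Upper triangle of the representational dissimilarity matrix
    (1 - Pearson r between the response patterns of each condition pair)."""
    n = len(patterns)
    return [1 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]

# Four conditions x five measurement channels (toy numbers):
# conditions 0/1 evoke similar patterns, as do 2/3.
neural = [
    [1.0, 0.9, 0.1, 0.0, 0.2],
    [0.9, 1.0, 0.2, 0.1, 0.1],
    [0.1, 0.0, 1.0, 0.8, 0.9],
    [0.0, 0.2, 0.9, 1.0, 0.8],
]
# A "model" that tracks the neural code up to a small distortion...
model = [[x + 0.05 * ((i + j) % 3) for j, x in enumerate(row)]
         for i, row in enumerate(neural)]
# ...and a control with the condition labels scrambled.
scrambled = [neural[2], neural[0], neural[3], neural[1]]

fit_good = pearson(rdm(neural), rdm(model))
fit_bad = pearson(rdm(neural), rdm(scrambled))
```

Because RDMs abstract away the measurement space, the same comparison works between an ANN layer, a semantic feature model, and a brain region, which is what lets the study pit visual against semantic accounts of the same neural data.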
Affiliations:
- Jacqueline von Seth: MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Lorraine K Tyler: Department of Psychology, University of Cambridge, Cambridge, UK; Cambridge Centre for Ageing and Neuroscience (Cam-CAN), University of Cambridge and MRC Cognition and Brain Sciences Unit, Cambridge, UK
- Alex Clarke: Department of Psychology, University of Cambridge, Cambridge, UK
25
Pusch R, Packheiser J, Azizi AH, Sevincik CS, Rose J, Cheng S, Stüttgen MC, Güntürkün O. Working memory performance is tied to stimulus complexity. Commun Biol 2023; 6:1119. PMID: 37923920; PMCID: PMC10624839; DOI: 10.1038/s42003-023-05486-7. Received 07/28/2023; accepted 10/18/2023.
Abstract
Working memory is the cognitive capability to maintain and process information over short periods. Behavioral and computational studies have shown that visual information is associated with working memory performance. However, the underlying neural correlates remain unknown. To identify how visual information affects working memory performance, we conducted behavioral experiments in pigeons (Columba livia) and single-unit recordings in the avian prefrontal analog, the nidopallium caudolaterale (NCL). Complex pictures, featuring luminance, spatial, and color information, were associated with higher working memory performance than uniform gray pictures, in conjunction with distinct neural coding patterns. For complex pictures, we found a multiplexed neuronal code displaying visual and value-related features that switched to a representation of the upcoming choice during a delay period. When processing gray stimuli, NCL neurons did not multiplex and exclusively represented the choice already during stimulus presentation and throughout the delay period. The prolonged representation possibly resulted in a decay of the memory trace, ultimately leading to a decrease in performance. In conclusion, we found that high stimulus complexity is associated with neuronal multiplexing of the working memory representation, possibly allowing a facilitated read-out of the neural code and resulting in enhanced working memory performance.
Collapse
Affiliation(s)
- Roland Pusch
- Department of Biopsychology, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, D-44780, Bochum, Germany.
| | - Julian Packheiser
- Department of Biopsychology, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, D-44780, Bochum, Germany
- Social Brain Lab, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands
| | - Amir Hossein Azizi
- Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Karaj, Iran
| | - Celil Semih Sevincik
- Department of Biopsychology, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, D-44780, Bochum, Germany
| | - Jonas Rose
- Neural Basis of Learning, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, D-44780, Bochum, Germany
| | - Sen Cheng
- Institute for Neural Computation, Faculty of Computer Science, Ruhr University Bochum, Universitätsstraße 150, D-44780, Bochum, Germany
| | - Maik C Stüttgen
- Institute of Pathophysiology, University Medical Center of the Johannes Gutenberg University, Duesbergweg 6, D-55128, Mainz, Germany
| | - Onur Güntürkün
- Department of Biopsychology, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, D-44780, Bochum, Germany
- Research Center One Health Ruhr, Research Alliance Ruhr, Ruhr University Bochum, Bochum, Germany
| |
Collapse
|
26
|
Feather J, Leclerc G, Mądry A, McDermott JH. Model metamers reveal divergent invariances between biological and artificial neural networks. Nat Neurosci 2023; 26:2017-2034. [PMID: 37845543 PMCID: PMC10620097 DOI: 10.1038/s41593-023-01442-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2022] [Accepted: 08/29/2023] [Indexed: 10/18/2023]
Abstract
Deep neural network models of sensory systems are often proposed to learn representational transformations with invariances like those in the brain. To reveal these invariances, we generated 'model metamers', stimuli whose activations within a model stage are matched to those of a natural stimulus. Metamers for state-of-the-art supervised and unsupervised neural network models of vision and audition were often completely unrecognizable to humans when generated from late model stages, suggesting differences between model and human invariances. Targeted model changes improved human recognizability of model metamers but did not eliminate the overall human-model discrepancy. The human recognizability of a model's metamers was well predicted by their recognizability by other models, suggesting that models contain idiosyncratic invariances in addition to those required by the task. Metamer recognizability dissociated from both traditional brain-based benchmarks and adversarial vulnerability, revealing a distinct failure mode of existing sensory models and providing a complementary benchmark for model assessment.
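The metamer-synthesis idea in this abstract (optimize an input until its activations at a chosen model stage match those of a reference stimulus) can be sketched in a few lines. This is a hypothetical toy, not the authors' pipeline: a single linear stage stands in for a deep network stage, and plain gradient descent on the activation-matching loss stands in for their synthesis procedure. Because the stage is compressive (16 units for 32 input dimensions), the synthesized input matches the reference activations almost exactly while remaining far from the reference stimulus, which is the sense in which a metamer can be unrecognizable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy compressive "model stage": a random linear projection. With fewer
# units (16) than input dimensions (32), many inputs share the same
# stage activations, so activation matching underdetermines the input.
W = rng.normal(size=(16, 32)) / np.sqrt(32)

def stage(x):
    return W @ x

# Reference stimulus and its stage activations.
x_ref = rng.normal(size=32)
a_ref = stage(x_ref)

# Metamer synthesis: start from noise and descend the activation-matching
# loss ||stage(x) - a_ref||^2 with a hand-derived gradient.
x = rng.normal(size=32)
lr = 0.3
for _ in range(2000):
    x -= lr * (W.T @ (stage(x) - a_ref))

rel_mismatch = np.linalg.norm(stage(x) - a_ref) / np.linalg.norm(a_ref)
input_dist = np.linalg.norm(x - x_ref)
print(rel_mismatch, input_dist)  # tiny mismatch, large input distance
```

In a deep network the same underdetermination compounds across nonlinear stages, so late-stage metamers can drift arbitrarily far from anything a human would recognize.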
Collapse
Affiliation(s)
- Jenelle Feather
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Center for Computational Neuroscience, Flatiron Institute, Cambridge, MA, USA.
| | - Guillaume Leclerc
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Aleksander Mądry
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
27
|
Kehoe DH, Fallah M. Oculomotor feature discrimination is cortically mediated. Front Syst Neurosci 2023; 17:1251933. [PMID: 37899790 PMCID: PMC10600481 DOI: 10.3389/fnsys.2023.1251933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Accepted: 09/26/2023] [Indexed: 10/31/2023] Open
Abstract
Eye movements are often directed toward stimuli with specific features. Decades of neurophysiological research have determined that this behavior is subserved by a feature-reweighting of the neural activation encoding potential eye movements. Despite the considerable body of research examining feature-based target selection, no comprehensive theoretical account of the feature-reweighting mechanism has yet been proposed. Given that such a theory is fundamental to our understanding of the nature of oculomotor processing, we propose an oculomotor feature-reweighting mechanism here. We first summarize the considerable anatomical and functional evidence suggesting that oculomotor substrates that encode potential eye movements rely on the visual cortices for feature information. Next, we highlight the results from our recent behavioral experiments demonstrating that feature information manifests in the oculomotor system in order of featural complexity, regardless of whether the feature information is task-relevant. Based on the available evidence, we propose an oculomotor feature-reweighting mechanism whereby (1) visual information is projected into the oculomotor system only after a visual representation manifests in the highest stage of the cortical visual processing hierarchy necessary to represent the relevant features and (2) these dynamically recruited cortical module(s) then perform feature discrimination via shifting neural feature representations, while also maintaining parity between the feature representations in cortical and oculomotor substrates by dynamically reweighting oculomotor vectors. Finally, we discuss how our behavioral experiments may extend to other areas of vision science, as well as possible clinical applications.
Collapse
Affiliation(s)
- Devin H. Kehoe
- Department of Psychology, York University, Toronto, ON, Canada
- Centre for Vision Research, York University, Toronto, ON, Canada
- VISTA: Vision Science to Applications, York University, Toronto, ON, Canada
- Canadian Action and Perception Network, Canada
- Département de Neurosciences, Université de Montréal, Montréal, QC, Canada
| | - Mazyar Fallah
- Department of Psychology, York University, Toronto, ON, Canada
- Centre for Vision Research, York University, Toronto, ON, Canada
- Canadian Action and Perception Network, Canada
- College of Biological Science, University of Guelph, Guelph, ON, Canada
| |
Collapse
|
28
|
Frick A, Besson G, Salmon E, Delhaye E. Perirhinal cortex is associated with fine-grained discrimination of conceptually confusable objects in Alzheimer's disease. Neurobiol Aging 2023; 130:1-11. [PMID: 37419076 DOI: 10.1016/j.neurobiolaging.2023.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2022] [Revised: 06/01/2023] [Accepted: 06/03/2023] [Indexed: 07/09/2023]
Abstract
The perirhinal cortex (PrC) stands among the first brain areas to deteriorate in Alzheimer's disease (AD). This study tests to what extent the PrC is involved in representing and discriminating confusable objects based on the conjunction of their perceptual and conceptual features. To this aim, AD patients and control counterparts performed 3 tasks: a naming, a recognition memory, and a conceptual matching task, where we manipulated conceptual and perceptual confusability. A structural MRI of the antero-lateral parahippocampal subregions was obtained for each participant. We found that the sensitivity to conceptual confusability was associated with the left PrC volume in both AD patients and control participants for the recognition memory task, while it was specifically associated with the volume of the left PrC in AD patients for the conceptual matching task. This suggests that a decreased volume of the PrC is related to the ability to disambiguate conceptually confusable items. Therefore, testing recognition memory or conceptual matching of easily conceptually confusable items can provide a potential cognitive marker of PrC atrophy.
Collapse
Affiliation(s)
- Aurélien Frick
- GIGA-CRC In Vivo Imaging, University of Liège, Liège, Belgium; Psychology and Neuroscience of Cognition Research Unit, University of Liège, Liège, Belgium.
| | - Gabriel Besson
- CINEICC, Faculty of Psychology and Educational Sciences, University of Coimbra, Coimbra, Portugal
| | - Eric Salmon
- GIGA-CRC In Vivo Imaging, University of Liège, Liège, Belgium
| | - Emma Delhaye
- GIGA-CRC In Vivo Imaging, University of Liège, Liège, Belgium; Psychology and Neuroscience of Cognition Research Unit, University of Liège, Liège, Belgium
| |
Collapse
|
29
|
Tschantz A, Millidge B, Seth AK, Buckley CL. Hybrid predictive coding: Inferring, fast and slow. PLoS Comput Biol 2023; 19:e1011280. [PMID: 37531366 PMCID: PMC10395865 DOI: 10.1371/journal.pcbi.1011280] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2022] [Accepted: 06/20/2023] [Indexed: 08/04/2023] Open
Abstract
Predictive coding is an influential model of cortical neural activity. It proposes that perceptual beliefs are furnished by sequentially minimising "prediction errors" (the differences between predicted and observed data). Implicit in this proposal is the idea that successful perception requires multiple cycles of neural activity. This is at odds with evidence that several aspects of visual perception, including complex forms of object recognition, arise from an initial "feedforward sweep" that occurs on fast timescales which preclude substantial recurrent activity. Here, we propose that the feedforward sweep can be understood as performing amortized inference (applying a learned function that maps directly from data to beliefs) and recurrent processing can be understood as performing iterative inference (sequentially updating neural activity in order to improve the accuracy of beliefs). We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner by describing both in terms of a dual optimization of a single objective function. We show that the resulting scheme can be implemented in a biologically plausible neural architecture that approximates Bayesian inference utilising local Hebbian update rules. We demonstrate that our hybrid predictive coding model combines the benefits of both amortized and iterative inference: it obtains rapid and computationally cheap perceptual inference for familiar data while maintaining the context-sensitivity, precision, and sample efficiency of iterative inference schemes. Moreover, we show how our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs at minimum computational expense. Hybrid predictive coding offers a new perspective on the functional relevance of the feedforward and recurrent activity observed during visual perception and offers novel insights into distinct aspects of visual phenomenology.
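The division of labor described in this abstract (a fast amortized guess refined by slower iterative prediction-error minimization) can be sketched with a linear generative model. This is an illustrative stand-in, not the paper's network: the learned amortized map is replaced by the pseudoinverse of a deliberately perturbed generative matrix, mimicking an imperfectly trained feedforward sweep.

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear generative model: observations y arise from latent beliefs z via G.
G = rng.normal(size=(20, 5))
z_true = rng.normal(size=5)
y = G @ z_true + 0.01 * rng.normal(size=20)

def pred_err(z):
    return np.linalg.norm(y - G @ z)

def iterate(z, steps, lr=0.01):
    # Iterative inference: descend the squared prediction error step by step.
    for _ in range(steps):
        z = z + lr * (G.T @ (y - G @ z))
    return z

# Amortized inference: one direct map from data to beliefs. The perturbation
# makes the map imperfect, as a trained network would be (an assumption of
# this sketch, not a detail from the paper).
A = np.linalg.pinv(G + 0.3 * rng.normal(size=G.shape))
z_amortized = A @ y

err_none = pred_err(np.zeros(5))                      # no inference at all
err_amortized = pred_err(z_amortized)                 # fast feedforward guess
err_hybrid = pred_err(iterate(z_amortized, steps=5))  # guess + brief refinement
print(err_none, err_amortized, err_hybrid)
```

With a nonlinear generative model the pseudoinverse is unavailable and the amortized map must be learned, but the control logic is the same: iterate only as long as the residual prediction error warrants the extra computation.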
Collapse
Affiliation(s)
- Alexander Tschantz
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
- Sussex Centre for Consciousness Science, University of Sussex, Brighton, United Kingdom
| | - Beren Millidge
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
- Brain Networks Dynamics Unit, University of Oxford, Oxford, United Kingdom
| | - Anil K. Seth
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- Sussex Centre for Consciousness Science, University of Sussex, Brighton, United Kingdom
| | - Christopher L. Buckley
- Sussex AI Group, Department of Informatics, University of Sussex, Brighton, United Kingdom
- VERSES Research Lab, Los Angeles, California, United States of America
| |
Collapse
|
30
|
Penacchio O, Otazu X, Wilkins AJ, Haigh SM. A mechanistic account of visual discomfort. Front Neurosci 2023; 17:1200661. [PMID: 37547142 PMCID: PMC10397803 DOI: 10.3389/fnins.2023.1200661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2023] [Accepted: 06/27/2023] [Indexed: 08/08/2023] Open
Abstract
Much of the neural machinery of the early visual cortex, from the extraction of local orientations to contextual modulations through lateral interactions, is thought to have developed to provide a sparse encoding of contour in natural scenes, allowing the brain to process efficiently most of the visual scenes we are exposed to. Certain visual stimuli, however, cause visual stress, a set of adverse effects ranging from simple discomfort to migraine attacks and, in the extreme, epileptic seizures, all phenomena linked with an excessive metabolic demand. The theory of efficient coding suggests a link between excessive metabolic demand and images that deviate from natural statistics. Yet, the mechanisms linking energy demand and image spatial content in discomfort remain elusive. Here, we used theories of visual coding that link image spatial structure to brain activation to characterize the response to images that observers reported as uncomfortable, using a biologically based neurodynamic model of the early visual cortex with excitatory and inhibitory layers that implement contextual influences. We found three clear markers of aversive images: a larger overall activation in the model, a less sparse response, and a more unbalanced distribution of activity across spatial orientations. When the ratio of excitation over inhibition was increased in the model, a manipulation hypothesised to capture interindividual differences in susceptibility to visual discomfort, the three markers progressively shifted toward values typical of the response to uncomfortable stimuli. Overall, these findings propose a unifying mechanistic explanation for why there are differences between images and between observers, suggesting how visual input and idiosyncratic hyperexcitability give rise to abnormal brain responses that result in visual stress.
Collapse
Affiliation(s)
- Olivier Penacchio
- Department of Computer Science, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
- School of Psychology and Neuroscience, University of St Andrews, St Andrews, United Kingdom
| | - Xavier Otazu
- Department of Computer Science, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
| | - Arnold J. Wilkins
- Department of Psychology, University of Essex, Colchester, United Kingdom
| | - Sarah M. Haigh
- Department of Psychology, University of Nevada Reno, Reno, NV, United States
- Institute for Neuroscience, University of Nevada Reno, Reno, NV, United States
| |
Collapse
|
31
|
Cho H, Fonken YM, Adamek M, Jimenez R, Lin JJ, Schalk G, Knight RT, Brunner P. Unexpected sound omissions are signaled in human posterior superior temporal gyrus: an intracranial study. Cereb Cortex 2023; 33:8837-8848. [PMID: 37280730 PMCID: PMC10350817 DOI: 10.1093/cercor/bhad155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2019] [Revised: 04/11/2023] [Accepted: 04/11/2023] [Indexed: 06/08/2023] Open
Abstract
Context modulates sensory neural activations, enhancing perceptual and behavioral performance and reducing prediction errors. However, when and where these high-level expectations act on sensory processing remains unclear. Here, we isolate the effect of expectation in the absence of any auditory evoked activity by assessing the response to omitted expected sounds. Electrocorticographic signals were recorded directly from subdural electrode grids placed over the superior temporal gyrus (STG). Subjects listened to a predictable sequence of syllables, some of which were infrequently omitted. We found high-frequency band activity (HFA, 70-170 Hz) in response to omissions, which overlapped with a posterior subset of auditory-active electrodes in STG. Heard syllables could be reliably decoded from STG activity, but the identity of the omitted stimulus could not. Both omission- and target-detection responses were also observed in the prefrontal cortex. We propose that the posterior STG is central for implementing predictions in the auditory environment. HFA omission responses in this region appear to index mismatch-signaling or salience-detection processes.
Collapse
Affiliation(s)
- Hohyun Cho
- Department of Neurosurgery, Washington University School of Medicine in Saint Louis, St. Louis, MO 63110, USA
- National Center for Adaptive Neurotechnologies, St. Louis, MO 63110, USA
| | - Yvonne M Fonken
- Department of Psychology and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA
- TNO Human Factors Research Institute, Soesterberg 3769 DE, Netherlands
| | - Markus Adamek
- Department of Neurosurgery, Washington University School of Medicine in Saint Louis, St. Louis, MO 63110, USA
- National Center for Adaptive Neurotechnologies, St. Louis, MO 63110, USA
| | - Richard Jimenez
- Department of Psychology and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Jack J Lin
- Department of Neurology and Center for Mind and Brain, University of California, Davis, Davis, CA 95618, USA
| | - Gerwin Schalk
- Frontier Lab for Applied Neurotechnology, Tianqiao and Chrissy Chen Institute, Shanghai 201203, People’s Republic of China
- Department of Neurosurgery, Fudan University/Huashan Hospital, Shanghai 200031, People’s Republic of China
| | - Robert T Knight
- Department of Psychology and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Peter Brunner
- Department of Neurosurgery, Washington University School of Medicine in Saint Louis, St. Louis, MO 63110, USA
- National Center for Adaptive Neurotechnologies, St. Louis, MO 63110, USA
- Department of Neurology, Albany Medical College, Albany, NY 12208, USA
| |
Collapse
|
32
|
Wang R, Lu X, Jiang Y. Distributed and hierarchical neural encoding of multidimensional biological motion attributes in the human brain. Cereb Cortex 2023; 33:8510-8522. [PMID: 37118887 PMCID: PMC10786095 DOI: 10.1093/cercor/bhad136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2023] [Revised: 03/31/2023] [Accepted: 04/01/2023] [Indexed: 04/30/2023] Open
Abstract
The human visual system can efficiently extract distinct physical, biological, and social attributes (e.g. facing direction, gender, and emotional state) from biological motion (BM), but how these attributes are encoded in the brain remains largely unknown. In the current study, we used functional magnetic resonance imaging to investigate this issue when participants viewed multidimensional BM stimuli. Using multiple regression representational similarity analysis, we identified distributed brain areas, respectively, related to the processing of facing direction, gender, and emotional state conveyed by BM. These brain areas are governed by a hierarchical structure in which the respective neural encoding of facing direction, gender, and emotional state is modulated by each other in descending order. We further revealed that a portion of the brain areas identified in representational similarity analysis was specific to the neural encoding of each attribute and correlated with the corresponding behavioral results. These findings unravel the brain networks for encoding BM attributes in consideration of their interactions, and highlight that the processing of multidimensional BM attributes is recurrently interactive.
Collapse
Affiliation(s)
- Ruidi Wang
- State Key Laboratory of Brain and Cognitive Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China
- Department of Psychology, University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
- Chinese Institute for Brain Research, 26 Science Park Road, Beijing 102206, China
| | - Xiqian Lu
- State Key Laboratory of Brain and Cognitive Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China
- Department of Psychology, University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
- Chinese Institute for Brain Research, 26 Science Park Road, Beijing 102206, China
| | - Yi Jiang
- State Key Laboratory of Brain and Cognitive Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China
- Department of Psychology, University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
- Chinese Institute for Brain Research, 26 Science Park Road, Beijing 102206, China
| |
Collapse
|
33
|
Sörensen LKA, Bohté SM, de Jong D, Slagter HA, Scholte HS. Mechanisms of human dynamic object recognition revealed by sequential deep neural networks. PLoS Comput Biol 2023; 19:e1011169. [PMID: 37294830 DOI: 10.1371/journal.pcbi.1011169] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2022] [Accepted: 05/09/2023] [Indexed: 06/11/2023] Open
Abstract
Humans can quickly recognize objects in a dynamically changing world. This ability is showcased by the fact that observers succeed at recognizing objects in rapidly changing image sequences, at up to 13 ms/image. To date, the mechanisms that govern dynamic object recognition remain poorly understood. Here, we developed deep learning models for dynamic recognition and compared different computational mechanisms, contrasting feedforward and recurrent, single-image and sequential processing as well as different forms of adaptation. We found that only models that integrate images sequentially via lateral recurrence mirrored human performance (N = 36) and were predictive of trial-by-trial responses across image durations (13-80 ms/image). Importantly, models with sequential lateral-recurrent integration also captured how human performance changes as a function of image presentation durations, with models processing images for a few time steps capturing human object recognition at shorter presentation durations and models processing images for more time steps capturing human object recognition at longer presentation durations. Furthermore, augmenting such a recurrent model with adaptation markedly improved dynamic recognition performance and accelerated its representational dynamics, thereby predicting human trial-by-trial responses using fewer processing resources. Together, these findings provide new insights into the mechanisms rendering object recognition so fast and effective in a dynamic visual world.
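The core mechanism this abstract identifies (lateral recurrence that integrates evidence across the frames of a rapid sequence) can be caricatured with a leaky integrator over noisy "frames" of two template stimuli. This is a hypothetical sketch, not the authors' trained deep networks; the templates, noise level, and leak constant are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two object "classes" as template vectors; each frame is a noisy view.
templates = rng.normal(size=(2, 40))

def make_frames(label, n_frames, noise=6.0):
    return templates[label] + noise * rng.normal(size=(n_frames, 40))

def recognize(seq, leak=0.9):
    # Lateral-recurrent integration: the state carries evidence across
    # frames (a leaky sum) rather than classifying each image on its own.
    r = np.zeros(40)
    for frame in seq:
        r = leak * r + frame
    return int(np.argmax(templates @ r))

def accuracy(n_frames, trials=500):
    hits = 0
    for _ in range(trials):
        label = int(rng.integers(2))
        hits += recognize(make_frames(label, n_frames)) == label
    return hits / trials

acc_1 = accuracy(1)   # brief presentation: a single noisy frame
acc_8 = accuracy(8)   # longer presentation: recurrence accumulates evidence
print(acc_1, acc_8)
```

The same qualitative pattern appears in the paper's models: more integration steps help at longer presentation durations, while a single pass suffices only for the shortest ones.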
Collapse
Affiliation(s)
- Lynn K A Sörensen
- Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Amsterdam Brain & Cognition (ABC), University of Amsterdam, Amsterdam, Netherlands
| | - Sander M Bohté
- Machine Learning Group, Centrum Wiskunde & Informatica, Amsterdam, Netherlands
- Swammerdam Institute of Life Sciences (SILS), University of Amsterdam, Amsterdam, Netherlands
- Bernoulli Institute, Rijksuniversiteit Groningen, Groningen, Netherlands
| | - Dorina de Jong
- Istituto Italiano di Tecnologia, Center for Translational Neurophysiology of Speech and Communication, (CTNSC), Ferrara, Italy
- Università di Ferrara, Dipartimento di Scienze Biomediche e Chirurgico Specialistiche, Ferrara, Italy
| | - Heleen A Slagter
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Institute of Brain and Behaviour Amsterdam, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
| | - H Steven Scholte
- Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Amsterdam Brain & Cognition (ABC), University of Amsterdam, Amsterdam, Netherlands
| |
Collapse
|
34
|
Watanabe N, Miyoshi K, Jimura K, Shimane D, Keerativittayayut R, Nakahara K, Takeda M. Multimodal deep neural decoding reveals highly resolved spatiotemporal profile of visual object representation in humans. Neuroimage 2023; 275:120164. [PMID: 37169115 DOI: 10.1016/j.neuroimage.2023.120164] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2022] [Revised: 05/02/2023] [Accepted: 05/09/2023] [Indexed: 05/13/2023] Open
Abstract
Perception and categorization of objects in a visual scene are essential to grasp the surrounding situation. Recently, neural decoding schemes, such as machine learning on functional magnetic resonance imaging (fMRI) data, have been employed to elucidate the underlying neural mechanisms. However, it remains unclear how spatially distributed brain regions temporally represent visual object categories and sub-categories. One promising strategy to address this issue is neural decoding with concurrently obtained neural response data of high spatial and temporal resolution. In this study, we explored the spatial and temporal organization of visual object representations using concurrent fMRI and electroencephalography (EEG), combined with neural decoding using deep neural networks (DNNs). We hypothesized that neural decoding from multimodal neural data with DNNs would show high classification performance in visual object categorization (faces or non-face objects) and sub-categorization within faces and objects. Visualization of the fMRI DNN was more sensitive than the univariate approach and revealed that visual categorization occurred in brain-wide regions. Interestingly, the EEG DNN relied on the earlier phase of neural responses for categorization and the later phase for sub-categorization. Combining the two DNNs improved classification performance for both categorization and sub-categorization compared with the fMRI DNN or EEG DNN alone. These deep learning-based results demonstrate a categorization principle in which visual objects are represented in a spatially organized and coarse-to-fine manner, and they provide strong evidence of the ability of multimodal deep learning to uncover the spatiotemporal neural machinery of sensory processing.
Collapse
Affiliation(s)
- Noriya Watanabe
- Research Center for Brain Communication, Kochi University of Technology, Kami, Kochi, 782-8502, Japan
| | - Kosuke Miyoshi
- Narrative Nights, Inc., Yokohama, Kanagawa, 236-0011, Japan
| | - Koji Jimura
- Research Center for Brain Communication, Kochi University of Technology, Kami, Kochi, 782-8502, Japan; Department of Informatics, Gunma University, Maebashi, Gunma, 371-8510, Japan
| | - Daisuke Shimane
- Research Center for Brain Communication, Kochi University of Technology, Kami, Kochi, 782-8502, Japan
| | - Ruedeerat Keerativittayayut
- Research Center for Brain Communication, Kochi University of Technology, Kami, Kochi, 782-8502, Japan; Chulabhorn Royal Academy, Bangkok, 10210, Thailand
| | - Kiyoshi Nakahara
- Research Center for Brain Communication, Kochi University of Technology, Kami, Kochi, 782-8502, Japan
| | - Masaki Takeda
- Research Center for Brain Communication, Kochi University of Technology, Kami, Kochi, 782-8502, Japan.
| |
Collapse
|
35
|
Wu X, Fuentemilla L. Distinct encoding and post-encoding representational formats contribute to episodic sequence memory formation. Cereb Cortex 2023:7147876. [PMID: 37130823 DOI: 10.1093/cercor/bhad138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2022] [Revised: 03/31/2023] [Accepted: 04/04/2023] [Indexed: 05/04/2023] Open
Abstract
In episodic encoding, an unfolding experience is rapidly transformed into a memory representation that binds separate episodic elements into a memory form to be later recollected. However, it is unclear how brain activity changes over time to accommodate the encoding of incoming information. This study aimed to investigate the dynamics of the representational format that contributed to memory formation of sequential episodes. We combined representational similarity analysis and multivariate decoding approaches on EEG data to compare whether "category-level" or "item-level" representations supported memory formation during the online encoding of a picture triplet sequence and offline, in the period that immediately followed encoding. The findings revealed a gradual integration of category-level representation during the online encoding of the picture sequence and a rapid item-based neural reactivation of the encoded sequence at the episodic offset. However, we found that only memory reinstatement at episodic offset was associated with successful memory retrieval from long-term memory. These results suggest that post-encoding memory reinstatement is crucial for the rapid formation of unique memory for episodes that unfold over time. Overall, the study sheds light on the dynamics of representational format changes that take place during the formation of episodic memories.
Collapse
Affiliation(s)
- Xiongbo Wu
- Department of Cognition, Development and Educational Psychology, University of Barcelona, Pg Vall Hebrón 171, Barcelona 08035, Spain
- Institute of Neurosciences, University of Barcelona, Pg Vall Hebrón 171, Barcelona 08035, Spain
- Department of Psychology, Ludwig-Maximilians-Universität München, Leopoldstraße 13, Munich 80802, Germany
| | - Lluís Fuentemilla
- Department of Cognition, Development and Educational Psychology, University of Barcelona, Pg Vall Hebrón 171, Barcelona 08035, Spain
- Institute of Neurosciences, University of Barcelona, Pg Vall Hebrón 171, Barcelona 08035, Spain
- Cognition and Brain Plasticity Unit, Institute for Biomedical Research of Bellvitge, C/ Feixa Llarga, s/n - Pavelló de Govern - Edifici Modular, 08907, L'Hospitalet de Llobregat, Spain
| |
Collapse
|
36
|
Sadeghnejad N, Ezoji M, Ebrahimpour R, Zabbah S. Resolving the neural mechanism of core object recognition in space and time: A computational approach. Neurosci Res 2023; 190:36-50. [PMID: 36502958 DOI: 10.1016/j.neures.2022.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2022] [Revised: 11/09/2022] [Accepted: 12/01/2022] [Indexed: 12/14/2022]
Abstract
The underlying mechanism of object recognition, a fundamental brain ability, has been investigated in various studies. However, the balance between recognition speed and accuracy is less explored. Most computational models of object recognition cannot explain recognition time and thus focus only on recognition accuracy, for two reasons: the lack of a temporal representation mechanism for sensory processing, and the use of non-biological classifiers for decision-making. Here, we proposed a hierarchical temporal model of object recognition using a spiking deep neural network coupled to a biologically plausible decision-making model to explain both recognition time and accuracy. We showed that the response dynamics of the proposed model can resemble those of the brain. Firstly, in an object recognition task, the model can mimic human and monkey recognition times as well as accuracy. Secondly, the model can replicate the different speed-accuracy trade-off regimes observed in the literature. More importantly, we demonstrated that the temporal representation of different abstraction levels (superordinate, midlevel, and subordinate) in the proposed model matched the brain representation dynamics observed in previous studies. We conclude that the accumulation of spikes generated by a hierarchical feedforward spiking structure to reach a bound can explain not only the dynamics of making a decision, but also the representational dynamics for different abstraction levels.
Collapse
Affiliation(s)
- Naser Sadeghnejad
- Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
| | - Mehdi Ezoji
- Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran.
| | - Reza Ebrahimpour
- Institute for Convergence Science and Technology (ICST), Sharif University of Technology, Tehran, Iran; Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran; School of Cognitive Sciences (SCS), Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
| | - Sajjad Zabbah
- School of Cognitive Sciences (SCS), Institute for Research in Fundamental Sciences (IPM), Tehran, Iran; Wellcome Centre for Human Neuroimaging, University College London, London, UK; Max Planck UCL Centre for Computational Psychiatry and Aging Research, University College London, London, UK
| |
Collapse
|
37
|
Maheshwari J, Choudhary S, Joshi SD, Gandhi TK. Analysing the brain networks corresponding to the facial contrast-chimeras. Perception 2023; 52:371-384. [PMID: 37097905 DOI: 10.1177/03010066231169002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/26/2023]
Abstract
How humans recognise faces and objects effortlessly has become a great point of interest. One approach to understanding the underlying process is to study facial features, in particular the ordinal contrast relations around the eye region, which play a crucial role in face recognition and perception. Recently, graph-theoretic approaches to electroencephalogram (EEG) analysis have been found effective for understanding the underlying processes of the human brain during various tasks. We explored this approach in face recognition and perception to assess the importance of contrast features around the eye region. We studied functional brain networks, formed from EEG responses, corresponding to four types of visual stimuli with varying contrast relationships: positive faces, chimeric faces (photo-negated faces preserving the polarity of contrast relationships around the eyes), photo-negated faces, and eyes only. We observed the variations in the brain networks for each type of stimulus by computing the distribution of graph distances across the brain networks of all subjects. Moreover, our statistical analysis shows that positive and chimeric faces are equally easy to recognise, in contrast to the difficult recognition of negative faces and eyes only.
Collapse
Affiliation(s)
- Jyoti Maheshwari
- Bharti School of Telecommunication Technology and Management, Indian Institute Of Technology Delhi, India
- Indian Institute Of Technology Delhi, India
| | | | | | | |
Collapse
|
38
|
Yargholi E, Op de Beeck H. Category Trumps Shape as an Organizational Principle of Object Space in the Human Occipitotemporal Cortex. J Neurosci 2023; 43:2960-2972. [PMID: 36922027 PMCID: PMC10124953 DOI: 10.1523/jneurosci.2179-22.2023] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2022] [Revised: 02/22/2023] [Accepted: 03/03/2023] [Indexed: 03/17/2023] Open
Abstract
The organizational principles of the object space represented in the human ventral visual cortex are debated. Here we contrast two prominent proposals that, in addition to an organization in terms of animacy, propose either a representation related to aspect ratio (stubby-spiky) or to the distinction between faces and bodies. We designed a critical test that dissociates the latter two categories from aspect ratio and investigated responses from human fMRI (of either sex) and deep neural networks (BigBiGAN). Representational similarity and decoding analyses showed that the object space in the occipitotemporal cortex and BigBiGAN was partially explained by animacy but not by aspect ratio. Data-driven approaches showed clusters for face and body stimuli and animate-inanimate separation in the representational space of occipitotemporal cortex and BigBiGAN, but no arrangement related to aspect ratio. In sum, the findings favor a model in terms of an animacy representation combined with strong selectivity for faces and bodies.
SIGNIFICANCE STATEMENT We contrasted animacy, aspect ratio, and face-body as principal dimensions characterizing object space in the occipitotemporal cortex. This is difficult to test, as faces and bodies typically differ in aspect ratio (faces are mostly stubby and bodies are mostly spiky). To dissociate the face-body distinction from the difference in aspect ratio, we created a new stimulus set in which faces and bodies have a similar and very wide distribution of values along the aspect-ratio shape dimension. Brain imaging (fMRI) with this new stimulus set showed that, in addition to animacy, the object space is mainly organized by the face-body distinction, and selectivity for aspect ratio is minor (despite its wide distribution).
Collapse
Affiliation(s)
- Elahe' Yargholi
- Department of Brain and Cognition, Leuven Brain Institute, Faculty of Psychology & Educational Sciences, KU Leuven, 3000 Leuven, Belgium
| | - Hans Op de Beeck
- Department of Brain and Cognition, Leuven Brain Institute, Faculty of Psychology & Educational Sciences, KU Leuven, 3000 Leuven, Belgium
| |
Collapse
|
39
|
Jérémie JN, Perrinet LU. Ultrafast Image Categorization in Biology and Neural Models. Vision (Basel) 2023; 7:29. [PMID: 37092462 PMCID: PMC10123664 DOI: 10.3390/vision7020029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2022] [Revised: 03/09/2023] [Accepted: 03/15/2023] [Indexed: 03/29/2023] Open
Abstract
Humans are able to categorize images very efficiently, in particular to detect the presence of an animal very quickly. Recently, deep learning algorithms based on convolutional neural networks (CNNs) have achieved higher than human accuracy for a wide range of visual categorization tasks. However, the tasks on which these artificial networks are typically trained and evaluated tend to be highly specialized and do not generalize well, e.g., accuracy drops after image rotation. In this respect, biological visual systems are more flexible and efficient than artificial systems for more general tasks, such as recognizing an animal. To further the comparison between biological and artificial neural networks, we re-trained the standard VGG 16 CNN on two independent tasks that are ecologically relevant to humans: detecting the presence of an animal or an artifact. We show that re-training the network achieves a human-like level of performance, comparable to that reported in psychophysical tasks. In addition, we show that the categorization is better when the outputs of the models are combined. Indeed, animals (e.g., lions) tend to be less present in photographs that contain artifacts (e.g., buildings). Furthermore, these re-trained models were able to reproduce some unexpected behavioral observations from human psychophysics, such as robustness to rotation (e.g., an upside-down or tilted image) or to a grayscale transformation. Finally, we quantified the number of CNN layers required to achieve such performance and showed that good accuracy for ultrafast image categorization can be achieved with only a few layers, challenging the belief that image recognition requires deep sequential analysis of visual objects. We hope to extend this framework to biomimetic deep neural architectures designed for ecological tasks, but also to guide future model-based psychophysical experiments that would deepen our understanding of biological vision.
Collapse
Affiliation(s)
- Jean-Nicolas Jérémie
- Institut de Neurosciences de la Timone (UMR 7289), Aix Marseille University, CNRS, 13005 Marseille, France
| | - Laurent U. Perrinet
- Institut de Neurosciences de la Timone (UMR 7289), Aix Marseille University, CNRS, 13005 Marseille, France
| |
Collapse
|
40
|
Macaques recognize features in synthetic images derived from ventral stream neurons. Proc Natl Acad Sci U S A 2023; 120:e2213034120. [PMID: 36857345 PMCID: PMC10013870 DOI: 10.1073/pnas.2213034120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/02/2023] Open
Abstract
Primates can recognize features in virtually all types of images, an ability that still requires a comprehensive computational explanation. One hypothesis is that visual cortex neurons learn patterns from scenes, objects, and textures, and use these patterns to interpolate incoming visual information. We have used machine learning algorithms to instantiate visual patterns stored by neurons-we call these highly activating images prototypes. Prototypes from inferotemporal (IT) neurons often resemble parts of real-world objects, such as monkey faces and body parts, a similarity established via pretrained neural networks [C. R. Ponce et al., Cell 177, 999-1009.e10 (2019)] and naïve human participants [A. Bardon, W. Xiao, C. R. Ponce, M. S. Livingstone, G. Kreiman, Proc. Natl. Acad. Sci. U.S.A. 119, e2118705119 (2022)]. However, it is not known whether monkeys themselves perceive similarities between neuronal prototypes and real-world objects. Here, we investigated whether monkeys reported similarities between prototypes and real-world objects using a two-alternative forced choice task. We trained the animals to saccade to synthetic images of monkeys, and subsequently tested how they classified prototypes synthesized from IT and primary visual cortex (V1). We found monkeys classified IT prototypes as conspecifics more often than they did random generator images and V1 prototypes, and their choices were partially predicted by convolutional neural networks. Further, we confirmed that monkeys could abstract general shape information from images of real-world objects. Finally, we verified these results with human participants. Our results provide further evidence that prototypes from cortical neurons represent interpretable abstractions from the visual world.
Collapse
|
41
|
Frisby SL, Halai AD, Cox CR, Lambon Ralph MA, Rogers TT. Decoding semantic representations in mind and brain. Trends Cogn Sci 2023; 27:258-281. [PMID: 36631371 DOI: 10.1016/j.tics.2022.12.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2022] [Revised: 12/12/2022] [Accepted: 12/13/2022] [Indexed: 01/11/2023]
Abstract
A key goal for cognitive neuroscience is to understand the neurocognitive systems that support semantic memory. Recent multivariate analyses of neuroimaging data have contributed greatly to this effort, but the rapid development of these novel approaches has made it difficult to track the diversity of findings and to understand how and why they sometimes lead to contradictory conclusions. We address this challenge by reviewing cognitive theories of semantic representation and their neural instantiation. We then consider contemporary approaches to neural decoding and assess which types of representation each can possibly detect. The analysis suggests why the results are heterogeneous and identifies crucial links between cognitive theory, data collection, and analysis that can help to better connect neuroimaging to mechanistic theories of semantic cognition.
Collapse
Affiliation(s)
- Saskia L Frisby
- Medical Research Council (MRC) Cognition and Brain Sciences Unit, Chaucer Road, Cambridge CB2 7EF, UK.
| | - Ajay D Halai
- Medical Research Council (MRC) Cognition and Brain Sciences Unit, Chaucer Road, Cambridge CB2 7EF, UK
| | - Christopher R Cox
- Department of Psychology, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Matthew A Lambon Ralph
- Medical Research Council (MRC) Cognition and Brain Sciences Unit, Chaucer Road, Cambridge CB2 7EF, UK
| | - Timothy T Rogers
- Department of Psychology, University of Wisconsin-Madison, 1202 West Johnson Street, Madison, WI 53706, USA.
| |
Collapse
|
42
|
Stereopsis provides a constant feed to visual shape representation. Vision Res 2023; 204:108175. [PMID: 36571983 DOI: 10.1016/j.visres.2022.108175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Revised: 11/10/2022] [Accepted: 12/05/2022] [Indexed: 12/25/2022]
Abstract
The contribution of stereopsis in human visual shape perception was examined using stimuli with either null, normal, or reversed binocular disparity in an old/new object recognition task. The highest levels of recognition performance were observed with null and normal binocular disparity displays, which did not differ. However, reversed disparity led to significantly worse performance than either of the other display conditions. This indicates that stereopsis provides a continuous input to the mechanisms involved in shape perception.
Collapse
|
43
|
Liu W, Cheng Y, Yuan X, Jiang Y. Looking more masculine among females: Spatial context modulates gender perception of face and biological motion. Br J Psychol 2023; 114:194-208. [PMID: 36302701 DOI: 10.1111/bjop.12605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2022] [Revised: 09/12/2022] [Accepted: 10/11/2022] [Indexed: 01/11/2023]
Abstract
Perception of visual information highly depends on spatial context. For instance, perception of a low-level visual feature, such as orientation, can be shifted away from its surrounding context, exhibiting a simultaneous contrast effect. Although previous studies have demonstrated the adaptation aftereffect of gender, a high-level visual feature, it remains largely unknown whether gender perception can also be shaped by a simultaneously presented context. In the present study, we found that the gender perception of a central face or a point-light walker was repelled away from the gender of its surrounding faces or walkers. A norm-based opponent model of lateral inhibition, which accounts for the adaptation aftereffect of high-level features, can also excellently fit the simultaneous contrast effect. However, unlike the reported contextual effect of low-level features, the simultaneous contrast effect of gender cannot be observed when the centre and the surrounding stimuli are from different categories, or when the surrounding stimuli are suppressed from awareness. These findings on the one hand reveal a resemblance between the simultaneous contrast effect and the adaptation aftereffect of high-level features, and on the other hand highlight the different biological mechanisms underlying the contextual effects of low- and high-level visual features.
Collapse
Affiliation(s)
- Wenjie Liu
- State Key Laboratory of Brain and Cognitive Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China; Chinese Institute for Brain Research, Beijing, China
| | - Yuhui Cheng
- State Key Laboratory of Brain and Cognitive Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China; Chinese Institute for Brain Research, Beijing, China
| | - Xiangyong Yuan
- State Key Laboratory of Brain and Cognitive Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China; Chinese Institute for Brain Research, Beijing, China
| | - Yi Jiang
- State Key Laboratory of Brain and Cognitive Science, CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China; Chinese Institute for Brain Research, Beijing, China
| |
Collapse
|
44
|
Mokari-Mahallati M, Ebrahimpour R, Bagheri N, Karimi-Rouzbahani H. Deeper neural network models better reflect how humans cope with contrast variation in object recognition. Neurosci Res 2023:S0168-0102(23)00007-X. [PMID: 36681154 DOI: 10.1016/j.neures.2023.01.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2022] [Revised: 11/27/2022] [Accepted: 01/17/2023] [Indexed: 01/20/2023]
Abstract
Visual inputs are far from ideal in everyday situations, such as in fog, where the contrast of input stimuli is low. However, human perception remains relatively robust to contrast variations. To provide insights into the underlying mechanisms of contrast invariance, we addressed two questions: Do contrast effects disappear along the visual hierarchy? Do later stages of the visual hierarchy contribute to contrast invariance? We ran a behavioral experiment in which we manipulated the level of stimulus contrast and the involvement of higher-level visual areas through immediate and delayed backward masking of the stimulus. Backward masking led to a significant drop in performance in our visual categorization task, supporting the role of higher-level visual areas in contrast invariance. To obtain mechanistic insights, we ran the same categorization task on three state-of-the-art computational models of human vision, each with a different depth of visual hierarchy. We found contrast effects all along the visual hierarchy, no matter how deep. Moreover, the final layers of deeper hierarchical models, which have been shown to be the best models of the final stages of the visual system, coped with contrast effects more effectively. These results suggest that, while contrast effects reach the final stages of the hierarchy, those stages play a significant role in compensating for contrast variations in the visual system.
Collapse
Affiliation(s)
- Masoumeh Mokari-Mahallati
- Department of Electrical Engineering, Shahid Rajaee Teacher Training University, Tehran, Islamic Republic of Iran
| | - Reza Ebrahimpour
- Center for Cognitive Science, Institute for Convergence Science and Technology (ICST), Sharif University of Technology, Tehran P.O.Box:11155-1639, Islamic Republic of Iran; Department of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Islamic Republic of Iran; School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Islamic Republic of Iran.
| | - Nasour Bagheri
- Department of Electrical Engineering, Shahid Rajaee Teacher Training University, Tehran, Islamic Republic of Iran
| | - Hamid Karimi-Rouzbahani
- MRC Cognition & Brain Sciences Unit, University of Cambridge, UK; Mater Research Institute, Faculty of Medicine, University of Queensland, Australia
| |
Collapse
|
45
|
Grimaldi A, Gruel A, Besnainou C, Jérémie JN, Martinet J, Perrinet LU. Precise Spiking Motifs in Neurobiological and Neuromorphic Data. Brain Sci 2022; 13:68. [PMID: 36672049 PMCID: PMC9856822 DOI: 10.3390/brainsci13010068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Revised: 12/20/2022] [Accepted: 12/23/2022] [Indexed: 12/31/2022] Open
Abstract
Why do neurons communicate through spikes? By definition, spikes are all-or-none neural events which occur at continuous times. In other words, spikes are on one side binary, existing or not without further details, and on the other, can occur at any asynchronous time, without the need for a centralized clock. This stands in stark contrast to the analog representation of values and the discretized timing classically used in digital processing and at the base of modern-day neural networks. As neural systems almost systematically use this so-called event-based representation in the living world, a better understanding of this phenomenon remains a fundamental challenge in neurobiology in order to better interpret the profusion of recorded data. With the growing need for intelligent embedded systems, it also emerges as a new computing paradigm to enable the efficient operation of a new class of sensors and event-based computers, called neuromorphic, which could enable significant gains in computation time and energy consumption-a major societal issue in the era of the digital economy and global warming. In this review paper, we provide evidence from biology, theory and engineering that the precise timing of spikes plays a crucial role in our understanding of the efficiency of neural networks.
Collapse
Affiliation(s)
- Antoine Grimaldi
- INT UMR 7289, Aix Marseille Univ, CNRS, 27 Bd Jean Moulin, 13005 Marseille, France
| | - Amélie Gruel
- SPARKS, Côte d’Azur, CNRS, I3S, 2000 Rte des Lucioles, 06900 Sophia-Antipolis, France
| | - Camille Besnainou
- INT UMR 7289, Aix Marseille Univ, CNRS, 27 Bd Jean Moulin, 13005 Marseille, France
| | - Jean-Nicolas Jérémie
- INT UMR 7289, Aix Marseille Univ, CNRS, 27 Bd Jean Moulin, 13005 Marseille, France
| | - Jean Martinet
- SPARKS, Côte d’Azur, CNRS, I3S, 2000 Rte des Lucioles, 06900 Sophia-Antipolis, France
| | - Laurent U. Perrinet
- INT UMR 7289, Aix Marseille Univ, CNRS, 27 Bd Jean Moulin, 13005 Marseille, France
| |
Collapse
|
46
|
Li H, Xu C, Ma L, Bo H, Zhang D. MODENN: A Shallow Broad Neural Network Model Based on Multi-Order Descartes Expansion. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:9417-9433. [PMID: 34748480 DOI: 10.1109/tpami.2021.3125690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Deep neural networks have achieved great success in almost every field of artificial intelligence. However, several weaknesses stemming from their hierarchical structure keep bothering researchers, particularly when large-scale parallelism, faster learning, better performance, and high reliability are required. Inspired by the parallel and large-scale information-processing structures in the human brain, a shallow broad neural network model is proposed based on a specially designed multi-order Descartes expansion operation. This Descartes expansion acts as an efficient feature-extraction method for the network, improving the separability of the original pattern by transforming the raw data into a high-dimensional feature space, the multi-order Descartes expansion space. As a result, a single-layer perceptron network is able to accomplish the classification task. The multi-order Descartes expansion neural network (MODENN) is thus created by combining the multi-order Descartes expansion operation and the single-layer perceptron, and its capacity is proved equivalent to that of the traditional multi-layer perceptron and deep neural networks. Three kinds of experiments were implemented; the results showed that the proposed MODENN model has great potential in many aspects, including implementability, parallelizability, performance, robustness, and interpretability, indicating that MODENN would be an excellent alternative to mainstream neural networks.
Collapse
|
47
|
Zhang M, Armendariz M, Xiao W, Rose O, Bendtz K, Livingstone M, Ponce C, Kreiman G. Look twice: A generalist computational model predicts return fixations across tasks and species. PLoS Comput Biol 2022; 18:e1010654. [PMID: 36413523 PMCID: PMC9681066 DOI: 10.1371/journal.pcbi.1010654] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2021] [Accepted: 10/13/2022] [Indexed: 11/23/2022] Open
Abstract
Primates constantly explore their surroundings via saccadic eye movements that bring different parts of an image into high resolution. In addition to exploring new regions in the visual field, primates also make frequent return fixations, revisiting previously foveated locations. We systematically studied a total of 44,328 return fixations out of 217,440 fixations. Return fixations were ubiquitous across different behavioral tasks, in monkeys and humans, both when subjects viewed static images and when subjects performed natural behaviors. Return fixation locations were consistent across subjects, tended to occur within short temporal offsets, and typically followed a 180-degree turn in saccadic direction. To understand the origin of return fixations, we propose a proof-of-principle, biologically-inspired and image-computable neural network model. The model combines five key modules: an image feature extractor, bottom-up saliency cues, task-relevant visual features, finite inhibition-of-return, and saccade size constraints. Even though there are no free parameters that are fine-tuned for each specific task, species, or condition, the model produces fixation sequences resembling the universal properties of return fixations. These results provide initial steps towards a mechanistic understanding of the trade-off between rapid foveal recognition and the need to scrutinize previous fixation locations.
Collapse
Affiliation(s)
- Mengmi Zhang
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
| | - Marcelo Armendariz
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
- Laboratory for Neuro- and Psychophysiology, KU Leuven, Leuven, Belgium
| | - Will Xiao
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Olivia Rose
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Katarina Bendtz
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
| | - Margaret Livingstone
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Carlos Ponce
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Gabriel Kreiman
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
| |
Collapse
|
48
|
Favila SE, Kuhl BA, Winawer J. Perception and memory have distinct spatial tuning properties in human visual cortex. Nat Commun 2022; 13:5864. [PMID: 36257949 PMCID: PMC9579130 DOI: 10.1038/s41467-022-33161-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2020] [Accepted: 09/06/2022] [Indexed: 11/12/2022] Open
Abstract
Reactivation of earlier perceptual activity is thought to underlie long-term memory recall. Despite evidence for this view, it is unclear whether mnemonic activity exhibits the same tuning properties as feedforward perceptual activity. Here, we leverage population receptive field models to parameterize fMRI activity in human visual cortex during spatial memory retrieval. Though retinotopic organization is present during both perception and memory, large systematic differences in tuning are also evident. Whereas there is a three-fold decline in spatial precision from early to late visual areas during perception, this pattern is not observed during memory retrieval. This difference cannot be explained by reduced signal-to-noise or poor performance on memory trials. Instead, by simulating top-down activity in a network model of cortex, we demonstrate that this property is well explained by the hierarchical structure of the visual system. Together, modeling and empirical results suggest that computational constraints imposed by visual system architecture limit the fidelity of memory reactivation in sensory cortex.
Collapse
Affiliation(s)
- Serra E Favila
- Department of Psychology, New York University, New York, NY, 10003, USA.
- Department of Psychology, Columbia University, New York, NY, 10027, USA.
| | - Brice A Kuhl
- Department of Psychology, University of Oregon, Eugene, OR, 97403, USA
- Institute of Neuroscience, University of Oregon, Eugene, OR, 97403, USA
| | - Jonathan Winawer
- Department of Psychology, New York University, New York, NY, 10003, USA
- Center for Neural Science, New York University, New York, NY, 10003, USA
| |
Collapse
|
49
|
Shi J, Tripp B, Shea-Brown E, Mihalas S, A. Buice M. MouseNet: A biologically constrained convolutional neural network model for the mouse visual cortex. PLoS Comput Biol 2022; 18:e1010427. [PMID: 36067234 PMCID: PMC9481165 DOI: 10.1371/journal.pcbi.1010427] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2021] [Revised: 09/16/2022] [Accepted: 07/22/2022] [Indexed: 11/19/2022] Open
Abstract
Convolutional neural networks trained on object recognition draw inspiration from the neural architecture of the mammalian visual system and have been used as models of the feedforward computation performed in the primate ventral stream. In contrast to the deep hierarchical organization of primates, the visual system of the mouse has a shallower arrangement. Since mice and primates are both capable of visually guided behavior, this raises questions about the role of architecture in neural computation. In this work, we introduce a novel framework for building a biologically constrained convolutional neural network model of the mouse visual cortex. The architecture and structural parameters of the network are derived from experimental measurements: the 100-micrometer-resolution interareal connectome, estimates of the numbers of neurons in each area and cortical layer, and the statistics of connections between cortical layers. The network is constructed to support detailed task-optimized models of mouse visual cortex, with neural populations that can be compared to specific corresponding populations in the mouse brain. Using a well-studied image-classification task as our working example, we demonstrate the computational capability of this mouse-sized network: despite its relatively small size, MouseNet achieves roughly two-thirds of VGG16's performance level on ImageNet. In combination with the large-scale Allen Brain Observatory Visual Coding dataset, we use representational similarity analysis to quantify the extent to which MouseNet recapitulates the neural representation in mouse visual cortex. Importantly, we provide evidence that optimizing for task performance does not improve similarity to the corresponding biological system beyond a certain point. We also demonstrate that the distributions of some physiological quantities are closer to those observed in the mouse brain after task training. We encourage the use of the MouseNet architecture by making the code freely available.
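Representational similarity analysis, as used above to compare MouseNet to the Allen Brain Observatory recordings, conventionally compares two systems via their representational dissimilarity matrices (RDMs). A minimal sketch of that standard procedure follows; the function names and the choice of correlation distance plus Spearman comparison are common defaults assumed here, not details taken from the paper.

```python
import numpy as np

def rdm(responses):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the response patterns evoked by each pair of stimuli.

    responses: (n_stimuli, n_units) array of trial-averaged responses.
    """
    return 1.0 - np.corrcoef(responses)

def rsa_score(resp_a, resp_b):
    """Spearman correlation between the upper triangles of two RDMs
    (e.g. a model layer vs. a recorded neural population)."""
    iu = np.triu_indices(resp_a.shape[0], k=1)
    va, vb = rdm(resp_a)[iu], rdm(resp_b)[iu]
    # Rank-transform (assumes no ties, as with continuous responses),
    # then take the Pearson correlation of the ranks.
    ra = np.argsort(np.argsort(va)).astype(float)
    rb = np.argsort(np.argsort(vb)).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))
```

Because the score depends only on the geometry of the stimulus-by-stimulus dissimilarities, it allows comparison between a network layer and a neural population with different numbers of units.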
Affiliation(s)
- Jianghong Shi
- Applied Mathematics and Computational Neuroscience Center, University of Washington, Seattle, WA, United States of America
| | - Bryan Tripp
- Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, Ontario, Canada
| | - Eric Shea-Brown
- Applied Mathematics and Computational Neuroscience Center, University of Washington, Seattle, WA, United States of America
- Allen Institute, Seattle, WA, United States of America
| | - Stefan Mihalas
- Applied Mathematics and Computational Neuroscience Center, University of Washington, Seattle, WA, United States of America
- Allen Institute, Seattle, WA, United States of America
| | - Michael A. Buice
- Applied Mathematics and Computational Neuroscience Center, University of Washington, Seattle, WA, United States of America
- Allen Institute, Seattle, WA, United States of America
| |
|
50
|
Yu Q, Song S, Ma C, Wei J, Chen S, Tan KC. Temporal Encoding and Multispike Learning Framework for Efficient Recognition of Visual Patterns. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:3387-3399. [PMID: 33531306 DOI: 10.1109/tnnls.2021.3052804] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Biological systems, built on parallel, spike-based computation, give individuals prompt and reliable responses to different stimuli. Spiking neural networks (SNNs) have thus been developed to emulate this efficiency and to explore principles of spike-based processing. However, designing a biologically plausible and efficient SNN for image classification remains a challenging task. Previous efforts generally fall into two major categories according to the coding scheme employed: rate-based and temporal-based. Rate-based schemes suffer from inefficiency, whereas temporal-based ones typically achieve relatively poor accuracy. It is therefore both intriguing and important to develop an SNN that is efficient and effective at once. In this article, we focus on temporal-based approaches, advancing their accuracy by a large margin while preserving their efficiency. We develop a new temporal-based framework integrated with multispike learning for efficient recognition of visual patterns, and evaluate different encoding and learning approaches under this framework on the MNIST and Fashion-MNIST datasets. Experimental results demonstrate the efficient and effective performance of our temporal-based approaches across a variety of conditions, improving accuracies to levels comparable to rate-based schemes, but with a lighter network structure and far fewer spikes. This article extends advanced multispike learning to the challenging task of image recognition and advances the state of the art in temporal-based approaches. The results are potentially favorable for low-power, high-speed applications in artificial intelligence and may attract more effort toward brain-like computing.
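A common temporal coding scheme of the kind the abstract contrasts with rate coding is latency (time-to-first-spike) encoding: stronger inputs fire earlier, so each input unit emits at most one spike. The sketch below shows that general idea with a linear intensity-to-latency map; the function name, threshold, and time window are illustrative assumptions, not the specific encoding proposed in the paper.

```python
import numpy as np

def latency_encode(image, t_max=100.0, theta=0.01):
    """Encode pixel intensities as single-spike latencies:
    stronger inputs fire earlier; inputs below theta never fire.

    image: array of intensities in [0, 1]
    returns spike times in [0, t_max], with np.inf for silent inputs.
    """
    x = np.asarray(image, dtype=float).ravel()
    times = np.full(x.shape, np.inf)
    active = x >= theta
    # Linear map: intensity 1.0 -> time 0, intensity theta -> near t_max
    times[active] = t_max * (1.0 - x[active])
    return times
```

Since each active input spikes exactly once, the total spike count per image is bounded by the number of pixels, which is one source of the efficiency advantage temporal schemes hold over rate codes that emit many spikes per input.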
|