1. Madison A, Callahan-Flintoft C, Thurman SM, Hoffing RAC, Touryan J, Ries AJ. Fixation-related potentials during a virtual navigation task: The influence of image statistics on early cortical processing. Atten Percept Psychophys 2025. PMID: 39849263. DOI: 10.3758/s13414-024-03002-5.
Abstract
Historically, electrophysiological correlates of scene processing have been studied in experiments using static stimuli presented for discrete durations while participants maintain a fixed eye position. Gaps remain in generalizing these findings to real-world conditions, where eye movements are made to select new visual information and where the environment remains stable but changes with our position and orientation in space, driving dynamic visual stimulation. Co-recording eye movements and electroencephalography (EEG) makes it possible to use fixations as time-locking events in the EEG record under free-viewing conditions, yielding fixation-related potentials (FRPs) that provide a neural snapshot in which to study visual processing under naturalistic conditions. The current experiment explored the influence of low-level image statistics, specifically luminance and a metric of spatial frequency (the slope of the amplitude spectrum), on the early visual components evoked by fixation onsets in a free-viewing visual search and navigation task in a virtual environment. This research combines FRPs with an optimized approach to removing ocular artifacts and with deconvolution modeling to correct for the overlapping neural activity inherent in any free-viewing paradigm. The results suggest that the early visual components of the FRP, namely the lambda response and N1, are sensitive to luminance and spatial frequency around fixation, separate from modulation due to underlying differences in eye-movement characteristics. Together, our results demonstrate the utility of studying the influence of image statistics on FRPs using a deconvolution modeling approach to control for overlapping neural activity and oculomotor covariates.
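The deconvolution logic referenced in this abstract can be illustrated with a minimal regression-ERP sketch: overlapping fixation-locked responses and a fixation-level covariate are encoded in a time-expanded design matrix and recovered jointly by least squares. This is a toy illustration (synthetic events, hypothetical sampling rate and kernels), not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
sfreq = 100                        # Hz (assumed)
n_samples = 60 * sfreq             # one minute of synthetic EEG
win = np.arange(60)                # 0-590 ms estimation window, in samples

# Synthetic fixation onsets with heavy overlap (inter-fixation interval ~150-400 ms)
onsets = np.cumsum(rng.integers(15, 40, size=200))
onsets = onsets[onsets < n_samples - len(win)]
luminance = rng.standard_normal(len(onsets))     # z-scored covariate per fixation

# Ground-truth kernels: a mean (lambda-like) response and a luminance modulation
t = win / sfreq
kernel_mean = np.sin(2 * np.pi * 4 * t) * np.exp(-t / 0.15)
kernel_lum = 0.5 * np.exp(-((t - 0.1) ** 2) / 0.002)

# Time-expanded design matrix: one column per (predictor, lag) pair
X = np.zeros((n_samples, 2 * len(win)))
y = np.zeros(n_samples)
for onset, lum in zip(onsets, luminance):
    y[onset + win] += kernel_mean + lum * kernel_lum   # overlapping activity
    for lag in win:
        X[onset + lag, lag] += 1.0                     # intercept regressors
        X[onset + lag, len(win) + lag] += lum          # luminance regressors
y += 0.5 * rng.standard_normal(n_samples)              # measurement noise

# Least-squares deconvolution recovers both kernels despite the overlap
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
frp_mean, frp_lum = beta[:len(win)], beta[len(win):]
print(np.corrcoef(frp_mean, kernel_mean)[0, 1])        # close to 1
print(np.corrcoef(frp_lum, kernel_lum)[0, 1])
```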
Affiliation(s)
- Anna Madison: U.S. DEVCOM Army Research Laboratory, Humans in Complex Systems, Aberdeen Proving Ground, MD, USA; Warfighter Effectiveness Research Center, Department of Behavioral Sciences & Leadership, 2354 Fairchild Drive, Suite 6, U.S. Air Force Academy, CO 80840, USA
- Chloe Callahan-Flintoft: Warfighter Effectiveness Research Center, Department of Behavioral Sciences & Leadership, 2354 Fairchild Drive, Suite 6, U.S. Air Force Academy, CO 80840, USA
- Steven M Thurman: U.S. DEVCOM Army Research Laboratory, Humans in Complex Systems, Aberdeen Proving Ground, MD, USA
- Russell A Cohen Hoffing: U.S. DEVCOM Army Research Laboratory, Humans in Complex Systems, Aberdeen Proving Ground, MD, USA
- Jonathan Touryan: U.S. DEVCOM Army Research Laboratory, Humans in Complex Systems, Aberdeen Proving Ground, MD, USA
- Anthony J Ries: U.S. DEVCOM Army Research Laboratory, Humans in Complex Systems, Aberdeen Proving Ground, MD, USA; Warfighter Effectiveness Research Center, Department of Behavioral Sciences & Leadership, 2354 Fairchild Drive, Suite 6, U.S. Air Force Academy, CO 80840, USA
2. Badwal MW, Bergmann J, Roth JHR, Doeller CF, Hebart MN. The Scope and Limits of Fine-Grained Image and Category Information in the Ventral Visual Pathway. J Neurosci 2025; 45:e0936242024. PMID: 39505406. PMCID: PMC11735656. DOI: 10.1523/jneurosci.0936-24.2024.
Abstract
Humans can easily abstract incoming visual information into discrete semantic categories. Previous research employing functional MRI (fMRI) in humans has identified cortical organizing principles that allow not only for coarse-scale distinctions such as animate versus inanimate objects but also more fine-grained distinctions at the level of individual objects. This suggests that fMRI carries rather fine-grained information about individual objects. However, most previous work investigating fine-grained category representations either additionally included coarse-scale category comparisons of objects, which confounds fine-grained and coarse-scale distinctions, or only used a single exemplar of each object, which confounds visual and semantic information. To address these challenges, here we used multisession human fMRI (female and male) paired with a broad yet homogeneous stimulus class of 48 terrestrial mammals, with two exemplars per mammal. Multivariate decoding and representational similarity analysis revealed high image-specific reliability in low- and high-level visual regions, indicating stable representational patterns at the image level. In contrast, analyses across exemplars of the same animal yielded only small effects in the lateral occipital complex (LOC), indicating rather subtle category effects in this region. Variance partitioning with a deep neural network and shape model showed that across-exemplar effects in the early visual cortex were largely explained by low-level visual appearance, while representations in LOC appeared to also contain higher category-specific information. These results suggest that representations typically measured with fMRI are dominated by image-specific visual or coarse-grained category information but indicate that commonly employed fMRI protocols may reveal subtle yet reliable distinctions between individual objects.
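The core RSA logic described here, cross-session reliability of representational dissimilarity matrices (RDMs) plus a within- versus between-animal comparison across exemplars, can be sketched as follows; array shapes and the noise level are hypothetical:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_images, n_voxels = 96, 200              # e.g., 48 mammals x 2 exemplars
patterns_s1 = rng.standard_normal((n_images, n_voxels))
patterns_s2 = patterns_s1 + 0.5 * rng.standard_normal((n_images, n_voxels))

# RDM per session: pairwise correlation distance between image patterns
rdm1 = pdist(patterns_s1, metric="correlation")
rdm2 = pdist(patterns_s2, metric="correlation")

# Image-specific reliability: rank correlation of RDMs across sessions
rho, _ = spearmanr(rdm1, rdm2)
print(f"cross-session RDM reliability: rho = {rho:.2f}")

# Across-exemplar category effect: are two exemplars of the same animal
# more similar than exemplars of different animals? (with real data, a
# smaller 'same' mean indicates a category effect; here the data are random)
d = squareform(rdm1)
animal = np.repeat(np.arange(48), 2)
same = d[animal[:, None] == animal[None, :]]
same = same[same > 0]                     # drop the zero diagonal
diff = d[animal[:, None] != animal[None, :]]
print(same.mean(), diff.mean())
```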
Affiliation(s)
- Markus W Badwal: Department of Psychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany; Vision & Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany; Department of Neurosurgery, University of Leipzig Medical Center, Leipzig 04103, Germany
- Johanna Bergmann: Department of Psychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany
- Johannes H R Roth: Vision & Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany; Department of Medicine, Justus Liebig University, Giessen 35390, Germany
- Christian F Doeller: Department of Psychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany; Kavli Institute for Systems Neuroscience, Norwegian University of Science and Technology, Trondheim 7030, Norway
- Martin N Hebart: Vision & Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig 04103, Germany; Department of Medicine, Justus Liebig University, Giessen 35390, Germany; Center for Mind, Brain and Behavior, Universities of Marburg, Giessen, and Darmstadt, Marburg 35032, Germany
3. Allegretti E, D'Innocenzo G, Coco MI. The Visual Integration of Semantic and Spatial Information of Objects in Naturalistic Scenes (VISIONS) database: attentional, conceptual, and perceptual norms. Behav Res Methods 2025; 57:42. PMID: 39753746. DOI: 10.3758/s13428-024-02535-9.
Abstract
The complex interplay between low- and high-level mechanisms governing our visual system can only be fully understood within ecologically valid naturalistic contexts. For this reason, in recent years, substantial efforts have been devoted to equipping the scientific community with datasets of realistic images normed on semantic or spatial features. Here, we introduce VISIONS, an extensive database of 1136 naturalistic scenes normed on a wide range of perceptual and conceptual norms by 185 English speakers across three levels of granularity: isolated object, whole scene, and object-in-scene. Each naturalistic scene contains a critical object systematically manipulated and normed regarding its semantic consistency (e.g., a toothbrush vs. a flashlight in a bathroom) and spatial position (i.e., left, right). Normative data are also available for low-level (i.e., clarity, visual complexity) and high-level (i.e., name agreement, confidence, familiarity, prototypicality, manipulability) features of the critical object and its embedding scene context. Eye-tracking data during a free-viewing task further confirm the experimental validity of our manipulations while theoretically demonstrating that object semantics is acquired in extra-foveal vision and used to guide early overt attention. To our knowledge, VISIONS is the first database to exhaustively cover norms for the integration of objects in scenes while also providing several perceptual and conceptual norms for each of the two taken independently. We expect VISIONS to become an invaluable image dataset to examine and answer timely questions above and beyond vision science, where a diversity of perceptual, attentive, mnemonic, or linguistic processes could be explored as they develop, age, or become neuropathological.
Affiliation(s)
- Elena Allegretti: Department of Psychology, Sapienza, University of Rome, Rome, Italy
- Moreno I Coco: Department of Psychology, Sapienza, University of Rome, Rome, Italy; I.R.C.C.S. Fondazione Santa Lucia, Rome, Italy
4. Koc AN, Urgen BA, Afacan Y. Task-modulated neural responses in scene-selective regions of the human brain. Vision Res 2024; 227:108539. PMID: 39733756. DOI: 10.1016/j.visres.2024.108539.
Abstract
The study of scene perception is crucial to understanding how one interprets and interacts with one's environment, and how the environment impacts various cognitive functions. The literature so far has mainly focused on the impact of low-level and categorical properties of scenes and how they are represented in the scene-selective regions of the brain: the parahippocampal place area (PPA), retrosplenial complex (RSC), and occipital place area (OPA). However, higher-level scene perception and the impact of behavioral goals is a developing research area. Moreover, the selection of stimuli has not been systematic and has mainly focused on outdoor environments. In this fMRI experiment, we adopted multiple behavioral tasks, selected real-life indoor stimuli with a systematic categorization approach, and used various multivariate analysis techniques to explain the neural modulation of scene perception in the scene-selective regions of the human brain. Participants (N = 21) performed categorization and approach-avoidance tasks during fMRI scans while viewing scenes from built-environment categories based on different affordances ((i) access and (ii) circulation elements, (iii) restrooms, and (iv) eating/seating areas). ROI-based classification analysis revealed that the OPA was significantly successful in decoding scene category regardless of the task, and that the task condition affected the category-decoding performance of all the scene-selective regions. Model-based representational similarity analysis (RSA) revealed that the activity patterns in scene-selective regions are best explained by task. These results contribute to the literature by extending the task and stimulus content of scene perception research, and by uncovering the impact of behavioral goals on the scene-selective regions of the brain.
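A minimal sketch of the ROI-based classification analysis described above: decode scene category from (hypothetical) ROI response patterns separately within each task condition, so task effects on decoding can be compared. The data here are random, so accuracy sits at chance:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score, StratifiedKFold

rng = np.random.default_rng(2)
# Hypothetical ROI patterns: trials x voxels, four scene categories, two tasks
X = rng.standard_normal((160, 300))
category = np.tile(np.arange(4), 40)   # access, circulation, restroom, eating/seating
task = np.repeat([0, 1], 80)           # categorization vs approach-avoidance

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Decode category within each task; differences across tasks index task modulation
for label, mask in [("categorization", task == 0), ("approach-avoid", task == 1)]:
    acc = cross_val_score(clf, X[mask], category[mask], cv=cv).mean()
    print(f"{label}: accuracy = {acc:.2f} (chance = 0.25)")
```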
Affiliation(s)
- Aysu Nur Koc: Department of Psychology, Justus Liebig University Giessen, Giessen, Germany; Interdisciplinary Neuroscience Program, Bilkent University, Ankara, Turkey
- Burcu A Urgen: Interdisciplinary Neuroscience Program, Bilkent University, Ankara, Turkey; Department of Psychology, Bilkent University, Ankara, Turkey; Aysel Sabuncu Brain Research Center and National Magnetic Resonance Imaging Center, Bilkent University, Ankara, Turkey
- Yasemin Afacan: Interdisciplinary Neuroscience Program, Bilkent University, Ankara, Turkey; Department of Interior Architecture and Environmental Design, Bilkent University, Ankara, Turkey; Aysel Sabuncu Brain Research Center and National Magnetic Resonance Imaging Center, Bilkent University, Ankara, Turkey
5. Contier O, Baker CI, Hebart MN. Distributed representations of behaviour-derived object dimensions in the human visual system. Nat Hum Behav 2024; 8:2179-2193. PMID: 39251723. PMCID: PMC11576512. DOI: 10.1038/s41562-024-01980-y.
Abstract
Object vision is commonly thought to involve a hierarchy of brain regions processing increasingly complex image features, with high-level visual cortex supporting object recognition and categorization. However, object vision supports diverse behavioural goals, suggesting basic limitations of this category-centric framework. To address these limitations, we mapped a series of dimensions derived from a large-scale analysis of human similarity judgements directly onto the brain. Our results reveal broadly distributed representations of behaviourally relevant information, demonstrating selectivity to a wide variety of novel dimensions while capturing known selectivities for visual features and categories. Behaviour-derived dimensions were superior to categories at predicting brain responses, yielding mixed selectivity in much of visual cortex and sparse selectivity in category-selective clusters. This framework reconciles seemingly disparate findings regarding regional specialization, explaining category selectivity as a special case of sparse response profiles among representational dimensions, and suggesting a more expansive view of visual processing in the human brain.
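The model comparison reported here, behaviour-derived dimensions versus category labels as predictors of voxel responses, can be sketched with cross-validated ridge regression; the dimensionalities and the simulated voxel are assumptions, not the study's actual embedding:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_stim = 400
dims = rng.standard_normal((n_stim, 50))       # hypothetical behaviour-derived dimensions
cats = rng.integers(0, 2, size=(n_stim, 20))   # hypothetical binary category labels
# Simulated voxel that genuinely depends on the dimensions, not the categories
voxel = dims @ rng.standard_normal(50) + 0.8 * rng.standard_normal(n_stim)

ridge = RidgeCV(alphas=np.logspace(-2, 4, 13))
for name, X in [("dimensions", dims), ("categories", cats)]:
    r2 = cross_val_score(ridge, X, voxel, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {r2:.2f}")   # dimensions win by construction
```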
Affiliation(s)
- Oliver Contier: Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Max Planck School of Cognition, Leipzig, Germany
- Chris I Baker: Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA
- Martin N Hebart: Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Medicine, Justus Liebig University Giessen, Giessen, Germany
6. Scrivener CL, Zamboni E, Morland AB, Silson EH. Retinotopy drives the variation in scene responses across visual field map divisions of the occipital place area. J Vis 2024; 24:10. PMID: 39167394. PMCID: PMC11343012. DOI: 10.1167/jov.24.8.10.
Abstract
The occipital place area (OPA) is a scene-selective region on the lateral surface of human occipitotemporal cortex that spatially overlaps multiple visual field maps, as well as portions of cortex that are not currently defined as retinotopic. Here we combined population receptive field modeling and responses to scenes in a representational similarity analysis (RSA) framework to test the prediction that the OPA's visual field map divisions contribute uniquely to the overall pattern of scene selectivity within the OPA. Consistent with this prediction, the patterns of response to a set of complex scenes were heterogeneous between maps. To explain this heterogeneity, we tested the explanatory power of seven candidate models using RSA. These models spanned different scene dimensions (Content, Expanse, Distance), low- and high-level visual features, and navigational affordances. None of the tested models could account for the variation in scene response observed between the OPA's visual field maps. However, the heterogeneity in scene response was correlated with the differences in retinotopic profiles across maps. These data highlight the need to carefully examine the relationship between regions defined as category-selective and the underlying retinotopy, and they suggest that, in the case of the OPA, it may not be appropriate to conceptualize it as a single scene-selective region.
Affiliation(s)
- Elisa Zamboni: Department of Psychology, University of York, York, UK; School of Psychology, University of Nottingham, University Park, Nottingham, UK
- Antony B Morland: Department of Psychology, University of York, York, UK; York Biomedical Research Institute, University of York, York, UK; York Neuroimaging Centre, Department of Psychology, University of York, York, UK
- Edward H Silson: Department of Psychology, University of Edinburgh, Edinburgh, UK
7. Contier O, Baker CI, Hebart MN. Distributed representations of behavior-derived object dimensions in the human visual system. bioRxiv 2024:2023.08.23.553812. Preprint. PMID: 37662312. PMCID: PMC10473665. DOI: 10.1101/2023.08.23.553812.
Abstract
Object vision is commonly thought to involve a hierarchy of brain regions processing increasingly complex image features, with high-level visual cortex supporting object recognition and categorization. However, object vision supports diverse behavioral goals, suggesting basic limitations of this category-centric framework. To address these limitations, we mapped a series of dimensions derived from a large-scale analysis of human similarity judgments directly onto the brain. Our results reveal broadly distributed representations of behaviorally relevant information, demonstrating selectivity to a wide variety of novel dimensions while capturing known selectivities for visual features and categories. Behavior-derived dimensions were superior to categories at predicting brain responses, yielding mixed selectivity in much of visual cortex and sparse selectivity in category-selective clusters. This framework reconciles seemingly disparate findings regarding regional specialization, explaining category selectivity as a special case of sparse response profiles among representational dimensions, and suggesting a more expansive view of visual processing in the human brain.
Affiliation(s)
- O Contier: Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Max Planck School of Cognition, Leipzig, Germany
- C I Baker: Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA
- M N Hebart: Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; Department of Medicine, Justus Liebig University Giessen, Giessen, Germany
8. Li SPD, Shao J, Lu Z, McCloskey M, Park S. A scene with an invisible wall - navigational experience shapes visual scene representation. bioRxiv 2024:2024.07.03.601933. Preprint. PMID: 39005327. PMCID: PMC11244994. DOI: 10.1101/2024.07.03.601933.
Abstract
Human navigation relies heavily on visual information. Although many previous studies have investigated how navigational information is inferred from visual features of scenes, little is understood about the impact of navigational experience on visual scene representation. In this study, we examined how navigational experience influences both the behavioral and neural responses to a visual scene. During training, participants navigated virtual reality (VR) environments in which we manipulated navigational experience while holding the visual properties of scenes constant. Half of the environments allowed free navigation (navigable), while the other half featured an 'invisible wall' preventing participants from continuing forward even though the scene was visually navigable (non-navigable). During testing, participants viewed scene images from the VR environment while completing either a behavioral perceptual identification task (Experiment 1) or an fMRI scan (Experiment 2). Behaviorally, we found that participants judged a scene pair to be significantly more visually different if their prior navigational experience varied, even after accounting for visual similarities between the scene pairs. Neurally, multi-voxel patterns in the parahippocampal place area (PPA) distinguished visual scenes based on prior navigational experience alone. These results suggest that the human visual scene cortex represents information about navigability obtained through prior experience, beyond what is computable from the visual properties of the scene. Taken together, these results suggest that scene representation is modulated by prior navigational experience to help us construct a functionally meaningful visual environment.
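The behavioral analysis described above, testing whether navigational experience predicts judged scene differences after accounting for visual similarity, amounts to a partial correlation; a sketch with synthetic pair-level data:

```python
import numpy as np
from scipy.stats import pearsonr

def residualize(y, x):
    """Remove the linear contribution of x from y."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

rng = np.random.default_rng(4)
n_pairs = 300
visual_sim = rng.random(n_pairs)               # feature similarity per scene pair
experience_differs = rng.integers(0, 2, n_pairs)   # pair differed in navigability?
judged_diff = 1 - visual_sim + 0.3 * experience_differs \
              + 0.2 * rng.standard_normal(n_pairs)

# Partial correlation: does navigational experience predict judged difference
# after regressing out visual similarity between the pair?
r, p = pearsonr(residualize(judged_diff, visual_sim),
                residualize(experience_differs.astype(float), visual_sim))
print(f"partial r = {r:.2f}, p = {p:.3g}")
```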
Affiliation(s)
- Shi Pui Donald Li: Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
- Jiayu Shao: Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, MD, USA
- Zhengang Lu: Department of Psychology, New York University, New York City, NY, USA
- Michael McCloskey: Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
- Soojin Park: Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA; Department of Psychology, Yonsei University, Seoul, Republic of Korea
9. French LA, Tangen JM, Sewell DK. Modelling the impact of single vs. dual presentation on visual discrimination across resolutions. Q J Exp Psychol (Hove) 2024. PMID: 38714527. DOI: 10.1177/17470218241255670.
Abstract
Visual categorisation relies on our ability to extract useful diagnostic information from complex stimuli. To do this, we can utilise both the "high-level" and "low-level" information in a stimulus; however, the extent to which changes in these properties impact the decision-making process is less clear. We manipulated participants' access to high-level category features via graded reductions in image resolution, while exploring the impact of access to additional category features by comparing dual-stimulus with single-stimulus presentation. Results showed that while increasing image resolution consistently resulted in better choice performance, no benefit was found for dual presentation over single presentation, even though responses for dual presentation were slower than for single presentation. Applying the diffusion decision model revealed increases in drift rate as a function of resolution, but no change in drift rate for single versus dual presentation. The increase in response time for dual presentation was instead accounted for by an increase in response caution for dual presentations. These findings suggest that while increasing access to high-level features (via increased resolution) can improve participants' categorisation performance, increasing access to both high- and low-level features (via an additional stimulus) does not.
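The diffusion-decision-model account in this abstract, resolution acting on drift rate and dual presentation acting on response caution (boundary separation), can be illustrated with a simple Euler simulation; all parameter values are illustrative:

```python
import numpy as np

def simulate_ddm(drift, boundary, n_trials=500, dt=0.001, noise=1.0,
                 non_decision=0.3, seed=0):
    """Euler simulation of a drift-diffusion process; returns (rt, correct)."""
    rng = np.random.default_rng(seed)
    rts, correct = np.empty(n_trials), np.empty(n_trials, dtype=bool)
    for i in range(n_trials):
        x, t = boundary / 2, 0.0          # unbiased start between 0 and boundary
        while 0 < x < boundary:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts[i] = t + non_decision
        correct[i] = x >= boundary
    return rts, correct

# Higher resolution -> higher drift; dual presentation -> wider boundary (caution).
# Widening the boundary slows responses without a drift change, reproducing the
# qualitative pattern the authors report.
for label, drift, a in [("low res, single", 1.0, 1.0),
                        ("high res, single", 2.5, 1.0),
                        ("high res, dual", 2.5, 1.4)]:
    rt, acc = simulate_ddm(drift, a)
    print(f"{label}: acc = {acc.mean():.2f}, mean RT = {rt.mean():.2f} s")
```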
Affiliation(s)
- Luke A French: School of Psychology, The University of Queensland, St. Lucia, Queensland, Australia
- Jason M Tangen: School of Psychology, The University of Queensland, St. Lucia, Queensland, Australia
- David K Sewell: School of Psychology, The University of Queensland, St. Lucia, Queensland, Australia
10. Stecher R, Kaiser D. Representations of imaginary scenes and their properties in cortical alpha activity. Sci Rep 2024; 14:12796. PMID: 38834699. DOI: 10.1038/s41598-024-63320-4.
Abstract
Imagining natural scenes enables us to engage with a myriad of simulated environments. How do our brains generate such complex mental images? Recent research suggests that cortical alpha activity carries information about individual objects during visual imagery. However, it remains unclear whether more complex imagined contents, such as natural scenes, are similarly represented in alpha activity. Here, we answer this question by decoding the contents of imagined scenes from rhythmic cortical activity patterns. In an EEG experiment, participants imagined natural scenes based on detailed written descriptions, which conveyed four complementary scene properties: openness, naturalness, clutter level, and brightness. By conducting classification analyses on EEG power patterns across neural frequencies, we were able to decode both individual imagined scenes and their properties from the alpha band, showing that the contents of complex visual images are also represented in alpha rhythms. A cross-classification analysis between alpha power patterns during the imagery task and during a perception task, in which participants were presented with images of the described scenes, showed that scene representations in the alpha band are partly shared between imagery and late stages of perception. This suggests that alpha activity mediates the top-down re-activation of scene-related visual contents during imagery.
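A minimal sketch of the decoding and cross-classification analyses described here: classify scenes from (hypothetical) alpha-band power patterns within imagery, then train on imagery and test on perception to assess shared representations:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_trials, n_channels = 200, 64
scene = rng.integers(0, 4, n_trials)               # four imagined scenes
# Hypothetical alpha-band (8-12 Hz) log power per channel and trial, with a
# small scene-dependent signal injected so the sketch decodes above chance
alpha_imagery = rng.standard_normal((n_trials, n_channels)) + scene[:, None] * 0.2
alpha_percept = rng.standard_normal((n_trials, n_channels)) + scene[:, None] * 0.2

clf = LinearDiscriminantAnalysis()
print("within imagery:",
      cross_val_score(clf, alpha_imagery, scene, cv=5).mean())

# Cross-classification: train on imagery power patterns, test on perception
clf.fit(alpha_imagery, scene)
print("imagery -> perception:", clf.score(alpha_percept, scene))
```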
Affiliation(s)
- Rico Stecher: Mathematical Institute, Department of Mathematics and Computer Science, Physics, Geography, Justus Liebig University Gießen, 35392 Gießen, Germany
- Daniel Kaiser: Mathematical Institute, Department of Mathematics and Computer Science, Physics, Geography, Justus Liebig University Gießen, 35392 Gießen, Germany; Center for Mind, Brain and Behavior (CMBB), Philipps-University Marburg and Justus Liebig University Gießen, 35032 Marburg, Germany
11. Bae AJ, Ferger R, Peña JL. Auditory Competition and Coding of Relative Stimulus Strength across Midbrain Space Maps of Barn Owls. J Neurosci 2024; 44:e2081232024. PMID: 38664010. PMCID: PMC11112643. DOI: 10.1523/jneurosci.2081-23.2024.
Abstract
The natural environment challenges the brain to prioritize the processing of salient stimuli. The barn owl, a sound localization specialist, exhibits a circuit called the midbrain stimulus selection network, dedicated to representing locations of the most salient stimulus in circumstances of concurrent stimuli. Previous competition studies using unimodal (visual) and bimodal (visual and auditory) stimuli have shown that relative strength is encoded in spike response rates. However, open questions remain concerning auditory-auditory competition on coding. To this end, we present diverse auditory competitors (concurrent flat noise and amplitude-modulated noise) and record neural responses of awake barn owls of both sexes in subsequent midbrain space maps, the external nucleus of the inferior colliculus (ICx) and optic tectum (OT). While both ICx and OT exhibit a topographic map of auditory space, OT also integrates visual input and is part of the global-inhibitory midbrain stimulus selection network. Through comparative investigation of these regions, we show that while increasing strength of a competitor sound decreases spike response rates of spatially distant neurons in both regions, relative strength determines spike train synchrony of nearby units only in the OT. Furthermore, changes in synchrony by sound competition in the OT are correlated to gamma range oscillations of local field potentials associated with input from the midbrain stimulus selection network. The results of this investigation suggest that modulations in spiking synchrony between units by gamma oscillations are an emergent coding scheme representing relative strength of concurrent stimuli, which may have relevant implications for downstream readout.
Affiliation(s)
- Andrea J Bae: Dominick P Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
- Roland Ferger: Dominick P Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
- José L Peña: Dominick P Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
12. Kamps FS, Chen EM, Kanwisher N, Saxe R. Representation of navigational affordances and ego-motion in the occipital place area. bioRxiv 2024:2024.04.30.591964. Preprint. PMID: 38746251. PMCID: PMC11092631. DOI: 10.1101/2024.04.30.591964.
Abstract
Humans effortlessly use vision to plan and guide navigation through the local environment, or "scene". A network of three cortical regions responds selectively to visual scene information: the occipital place area (OPA), parahippocampal place area (PPA), and medial place area (MPA). How this network supports visually guided navigation, however, is unclear. Recent evidence suggests that one region in particular, the OPA, supports visual representations for navigation, while the PPA and MPA support other aspects of scene processing. However, most previous studies tested only static scene images, which lack the dynamic experience of navigating through scenes. We used dynamic movie stimuli to test whether the OPA, PPA, and MPA represent two critical kinds of navigationally relevant information: navigational affordances (e.g., can I walk to the left, to the right, or both?) and ego-motion (e.g., am I walking forward or backward? turning left or right?). We found that the OPA is sensitive to both affordances and ego-motion, as well as to the conflict between these cues (e.g., turning toward versus away from an open doorway). These effects were significantly weaker or absent in the PPA and MPA. Responses in the OPA were also dissociable from those in early visual cortex, consistent with the idea that OPA responses are not merely explained by lower-level visual features. OPA responses to affordances and ego-motion were stronger in the contralateral than the ipsilateral visual field, suggesting that the OPA encodes navigationally relevant information within an egocentric reference frame. Taken together, these results support the hypothesis that the OPA contains visual representations that are useful for planning and guiding navigation through scenes.
13. Li S, Xu H, Wang J, Xu R, Liu A, He F, Liu X, Tao D. Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection. IEEE Trans Image Process 2024; 33:2714-2729. PMID: 38557629. DOI: 10.1109/tip.2024.3381771.
Abstract
Billions of people share images from their daily lives on social media every day. However, their biometric information (e.g., fingerprints) can easily be stolen from these images. Because fingerprints act as a lifelong individual biometric password, the threat of fingerprint leakage from social media has created a strong desire to anonymize shared images while maintaining image quality. To guard against fingerprint leakage, adversarial attacks that add imperceptible perturbations to fingerprint images have emerged as a feasible solution. However, existing works of this kind are either weak in black-box transferability or cause the images to have an unnatural appearance. Motivated by the visual perception hierarchy (i.e., high-level perception exploits model-shared semantics that transfer well across models, while low-level perception extracts primitive stimuli that produce high visual sensitivity when a suspicious stimulus is presented), we propose FingerSafe, a hierarchical perceptual protective noise injection framework to address the above-mentioned problems. For black-box transferability, we inject protective noises into the fingerprint orientation field to perturb the model-shared high-level semantics (i.e., fingerprint ridges). Considering visual naturalness, we suppress the low-level local contrast stimulus by regularizing the response of the Lateral Geniculate Nucleus. Our proposed FingerSafe is the first to provide feasible fingerprint protection in both digital (up to 94.12%) and realistic scenarios (Twitter and Facebook, up to 68.75%). Our code can be found at https://github.com/nlsde-safety-team/FingerSafe.
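FingerSafe itself perturbs the fingerprint orientation field and regularizes an LGN-like contrast response; as a generic point of reference, the basic adversarial-perturbation idea the abstract builds on (imperceptible, loss-maximizing noise within an L-infinity budget) looks like this in PyTorch. The model and labels here are placeholders, not the paper's method:

```python
import torch
import torch.nn.functional as F

def iterative_fgsm(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Generic L-inf iterative FGSM: maximize the matcher's loss so the
    perturbed fingerprint image no longer identifies its owner."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # stay in the eps-ball
        x_adv = torch.clamp(x_adv, 0, 1)                # keep a valid image
    return x_adv
```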
14. Wang G, Foxwell MJ, Cichy RM, Pitcher D, Kaiser D. Individual differences in internal models explain idiosyncrasies in scene perception. Cognition 2024; 245:105723. PMID: 38262271. DOI: 10.1016/j.cognition.2024.105723.
Abstract
According to predictive processing theories, vision is facilitated by predictions derived from our internal models of what the world should look like. However, the contents of these models, and how they vary across people, remain unclear. Here, we use drawing as a behavioral readout of the contents of individual participants' internal models. Participants were first asked to draw typical versions of scene categories, as descriptors of their internal models. These drawings were converted into standardized 3D renders, which we used as stimuli in subsequent scene categorization experiments. Across two experiments, participants' scene categorization was more accurate for renders tailored to their own drawings than for renders based on others' drawings or on copies of scene photographs, suggesting that scene perception is determined by a match with idiosyncratic internal models. Using a deep neural network to computationally evaluate similarities between scene renders, we further demonstrate that graded similarity to the render based on a participant's own typical drawings (and thus to their internal model) predicts categorization performance across a range of candidate scenes. Together, our results showcase the potential of a new method for understanding individual differences, one that starts from participants' personal expectations about the structure of real-world scenes.
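The DNN-based similarity analysis described above can be sketched with a pretrained CNN as the feature extractor; the paper does not specify this particular architecture, and the file names below are hypothetical:

```python
import torch
import torchvision.models as models
from PIL import Image

# Penultimate-layer features from a pretrained CNN as a stand-in for the
# similarity metric; the model choice and image paths are assumptions.
weights = models.ResNet50_Weights.IMAGENET1K_V2
net = models.resnet50(weights=weights).eval()
extractor = torch.nn.Sequential(*list(net.children())[:-1])  # drop classifier
preprocess = weights.transforms()

def features(path):
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return extractor(img).flatten()

own = features("render_from_own_drawing.png")      # hypothetical files
candidate = features("candidate_scene.png")
sim = torch.nn.functional.cosine_similarity(own, candidate, dim=0)
print(f"similarity to own internal-model render: {sim:.2f}")
```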
Affiliation(s)
- Gongting Wang: Department of Education and Psychology, Freie Universität Berlin, Germany; Department of Mathematics and Computer Science, Physics, Geography, Justus-Liebig-Universität Gießen, Germany
- Radoslaw M Cichy: Department of Education and Psychology, Freie Universität Berlin, Germany
- Daniel Kaiser: Department of Mathematics and Computer Science, Physics, Geography, Justus-Liebig-Universität Gießen, Germany; Center for Mind, Brain and Behavior (CMBB), Philipps-Universität Marburg and Justus-Liebig-Universität Gießen, Germany
15. Saccone EJ, Tian M, Bedny M. Developing cortex is functionally pluripotent: Evidence from blindness. Dev Cogn Neurosci 2024; 66:101360. PMID: 38394708. PMCID: PMC10899073. DOI: 10.1016/j.dcn.2024.101360.
Abstract
How rigidly does innate architecture constrain the function of developing cortex? What is the contribution of early experience? We review insights into these questions from studies of visual cortex function in people born blind. In blindness, occipital cortices are active during auditory and tactile tasks. What such 'cross-modal' plasticity tells us about cortical flexibility is debated. On the one hand, visual networks of blind people respond to higher cognitive information, such as sentence grammar, suggesting drastic repurposing. On the other, in line with 'metamodal' accounts, sighted and blind populations show shared domain preferences in ventral occipito-temporal cortex (vOTC), suggesting that visual areas switch input modality but perform the same or similar perceptual functions (e.g., face recognition) in blindness. Here we bring these disparate literatures together, reviewing and synthesizing evidence that speaks to whether visual cortices have similar or different functions in blind and sighted people. Together, the evidence suggests that in blindness, visual cortices are incorporated into higher-cognitive (e.g., fronto-parietal) networks, which are a major source of long-range input to the visual system. We propose the connectivity-constrained, experience-dependent account: functional development is constrained by innate anatomical connectivity, experience, and behavioral needs. Infant cortex is pluripotent, and the same anatomical constraints can develop into different functional outcomes.
Affiliation(s)
- Elizabeth J Saccone: Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
- Mengyu Tian: Center for Educational Science and Technology, Beijing Normal University at Zhuhai, China
- Marina Bedny: Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA
16. Yam J, Gong T, Xu H. A stimulus exposure of 50 ms elicits the uncanny valley effect. Heliyon 2024; 10:e27977. PMID: 38533075. PMCID: PMC10963319. DOI: 10.1016/j.heliyon.2024.e27977.
Abstract
The uncanny valley (UV) effect captures the observation that artificial entities with near-human appearances tend to create feelings of eeriness. Researchers have proposed many hypotheses to explain the UV effect, but the visual processing mechanisms of the UV have yet to be fully understood. In the present study, we examined whether the UV effect is as accessible under brief stimulus exposures as under long exposures (Experiment 1). Forty-one participants, aged 21-31, rated each human-robot face, presented for either a brief (50 ms) or long (3 s) duration, on attractiveness, eeriness, and humanness (UV indices) using a 7-point Likert scale. We found that brief and long stimulus exposures generated a similar UV effect, suggesting that the UV effect is accessible in early visual processing. We then examined the effect of exposure duration on the categorisation of visual stimuli in Experiment 2. Thirty-three participants, aged 21-31, categorised faces as either human or robot in a two-alternative forced choice task, and their response accuracy and variance were recorded. We found that brief stimulus exposures generated significantly higher response variation and more errors than the long exposure condition, indicating that participants were more uncertain in categorising faces in the brief exposure condition due to insufficient time. Further comparisons between Experiments 1 and 2 revealed that the eeriest faces were not the hardest to categorise. Overall, these findings indicate (1) that both the UV effect and categorical uncertainty can be elicited through brief stimulus exposure, but (2) that categorical uncertainty is unlikely to cause the UV effect. These findings provide insights into the perception of robotic faces and implications for the design of robots, androids, avatars, and artificial intelligence agents.
Affiliation(s)
- Jodie Yam: Psychology, School of Social Sciences, Nanyang Technological University, Singapore
- Tingchen Gong: Department of Neuroscience, Physiology and Pharmacology, University College London, UK
- Hong Xu: Psychology, School of Social Sciences, Nanyang Technological University, Singapore
17. Jiang C, Chen Z, Wolfe JM. Toward viewing behavior for aerial scene categorization. Cogn Res Princ Implic 2024; 9:17. PMID: 38530617. PMCID: PMC10965882. DOI: 10.1186/s41235-024-00541-1.
Abstract
Previous work has demonstrated similarities and differences between aerial and terrestrial image viewing. Aerial scene categorization, a pivotal visual processing task for gathering geoinformation, heavily depends on rotation-invariant information. Aerial image-centered research has revealed effects of low-level features on performance of various aerial image interpretation tasks. However, there are fewer studies of viewing behavior for aerial scene categorization and of higher-level factors that might influence that categorization. In this paper, experienced subjects' eye movements were recorded while they were asked to categorize aerial scenes. A typical viewing center bias was observed. Eye movement patterns varied among categories. We explored the relationship of nine image statistics to observers' eye movements. Results showed that if the images were less homogeneous, and/or if they contained fewer or no salient diagnostic objects, viewing behavior became more exploratory. Higher- and object-level image statistics were predictive at both the image and scene category levels. Scanpaths were generally organized and small differences in scanpath randomness could be roughly captured by critical object saliency. Participants tended to fixate on critical objects. Image statistics included in this study showed rotational invariance. The results supported our hypothesis that the availability of diagnostic objects strongly influences eye movements in this task. In addition, this study provides supporting evidence for Loschky et al.'s (Journal of Vision, 15(6), 11, 2015) speculation that aerial scenes are categorized on the basis of image parts and individual objects. The findings were discussed in relation to theories of scene perception and their implications for automation development.
Affiliation(s)
- Chenxi Jiang: School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China
- Zhenzhong Chen: School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China; Hubei Luojia Laboratory, Wuhan, Hubei, China
- Jeremy M Wolfe: Harvard Medical School, Boston, MA, USA; Brigham & Women's Hospital, Boston, MA, USA
18. Cobos MI, Melcón M, Rodríguez-San Esteban P, Capilla A, Chica AB. The role of brain oscillations in feature integration. Psychophysiology 2024; 61:e14467. PMID: 37990794. DOI: 10.1111/psyp.14467.
Abstract
Our sensory system is able to build a unified perception of the world, which, although rich, is limited and inaccurate. Sometimes, features from different objects are erroneously combined. At the neural level, the role of the parietal cortex in feature integration is well known. However, the brain dynamics underlying correct and incorrect feature integration are less clear. To explore the temporal dynamics of feature integration, we studied the modulation of different frequency bands in trials in which feature integration was correct or incorrect. Participants responded to the color of a shape target, surrounded by distractors. A calibration procedure ensured that accuracy was around 70% in each participant. To explore the role of expectancy in feature integration, we introduced an unexpected feature to the target in the last blocks of trials. Results demonstrated the contribution of several frequency bands to feature integration. Alpha and beta power was reduced for hits compared to illusions. Moreover, gamma power was overall larger during the experiment for participants who were aware of the unexpected target presented during the last blocks of trials (as compared to unaware participants). These results demonstrate that feature integration is a complex process that can go wrong at different stages of information processing and is influenced by top-down expectancies.
Affiliation(s)
- M I Cobos: Brain, Mind, and Behavior Research Center (CIMCYC), University of Granada (UGR), Granada, Spain; Department of Experimental Psychology, University of Granada (UGR), Granada, Spain
- M Melcón: Department of Biological and Health Psychology, Autonomous University of Madrid (UAM), Madrid, Spain
- P Rodríguez-San Esteban: Brain, Mind, and Behavior Research Center (CIMCYC), University of Granada (UGR), Granada, Spain; Department of Experimental Psychology, University of Granada (UGR), Granada, Spain
- A Capilla: Department of Biological and Health Psychology, Autonomous University of Madrid (UAM), Madrid, Spain
- A B Chica: Brain, Mind, and Behavior Research Center (CIMCYC), University of Granada (UGR), Granada, Spain; Department of Experimental Psychology, University of Granada (UGR), Granada, Spain
19. Peelen MV, Berlot E, de Lange FP. Predictive processing of scenes and objects. Nat Rev Psychol 2024; 3:13-26. PMID: 38989004. PMCID: PMC7616164. DOI: 10.1038/s44159-023-00254-0.
Abstract
Real-world visual input consists of rich scenes that are meaningfully composed of multiple objects which interact in complex, but predictable, ways. Despite this complexity, we recognize scenes, and objects within these scenes, from a brief glance at an image. In this review, we synthesize recent behavioral and neural findings that elucidate the mechanisms underlying this impressive ability. First, we review evidence that visual object and scene processing is partly implemented in parallel, allowing for a rapid initial gist of both objects and scenes concurrently. Next, we discuss recent evidence for bidirectional interactions between object and scene processing, with scene information modulating the visual processing of objects, and object information modulating the visual processing of scenes. Finally, we review evidence that objects also combine with each other to form object constellations, modulating the processing of individual objects within the object pathway. Altogether, these findings can be understood by conceptualizing object and scene perception as the outcome of a joint probabilistic inference, in which "best guesses" about objects act as priors for scene perception and vice versa, in order to concurrently optimize visual inference of objects and scenes.
Affiliation(s)
- Marius V Peelen: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Eva Berlot: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Floris P de Lange: Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
20. Orima T, Motoyoshi I. Spatiotemporal cortical dynamics for visual scene processing as revealed by EEG decoding. Front Neurosci 2023; 17:1167719. PMID: 38027518. PMCID: PMC10646306. DOI: 10.3389/fnins.2023.1167719.
Abstract
The human visual system rapidly recognizes the categories and global properties of complex natural scenes. The present study investigated the spatiotemporal dynamics of neural signals involved in visual scene processing using electroencephalography (EEG) decoding. We recorded visual evoked potentials from 11 human observers for 232 natural scenes, each of which belonged to one of 13 natural scene categories (e.g., a bedroom or open country) and had three global properties (naturalness, openness, and roughness). We trained a deep convolutional classification model of the natural scene categories and global properties using EEGNet. Having confirmed that the model successfully classified natural scene categories and the three global properties, we applied Grad-CAM to the EEGNet model to visualize the EEG channels and time points that contributed to the classification. The analysis showed that EEG signals in the occipital electrodes at short latencies (approximately 80 ms) contributed to the classifications, whereas those in the frontal electrodes at relatively long latencies (approximately 200 ms) contributed to the classification of naturalness and the individual scene category. These results suggest that different global properties are encoded in different cortical areas and with different timings, and that the combination of the EEGNet model and Grad-CAM can be a tool to investigate both the temporal and spatial distribution of natural scene processing in the human brain.
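A compact sketch of the EEGNet-plus-Grad-CAM approach: a simplified EEGNet-like classifier (temporal then spatial convolutions) and a Grad-CAM map over its last feature layer, indicating which time bins carry class evidence. The architecture details and shapes are stand-ins, not the study's exact model:

```python
import torch
import torch.nn as nn

class TinyEEGNet(nn.Module):
    """Simplified EEGNet-like classifier; the real study used EEGNet proper."""
    def __init__(self, n_channels=64, n_classes=13):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, (1, 33), padding=(0, 16)),   # temporal filters
            nn.Conv2d(8, 16, (n_channels, 1)),           # spatial filters
            nn.BatchNorm2d(16), nn.ELU(),
            nn.AdaptiveAvgPool2d((1, 50)))
        self.classifier = nn.Linear(16 * 50, n_classes)
    def forward(self, x):
        a = self.features(x)                 # (batch, 16, 1, 50)
        return self.classifier(a.flatten(1)), a

# Grad-CAM mechanics: weight the feature maps by the pooled gradients of a
# class score, giving a time-resolved map of where the class evidence lies.
# (Untrained weights and random data here; this demonstrates the computation.)
model = TinyEEGNet().eval()
x = torch.randn(1, 1, 64, 500)               # one epoch: 64 channels x 500 samples
logits, acts = model(x)
acts.retain_grad()
logits[0, 3].backward()                      # class of interest
w = acts.grad.mean(dim=(2, 3), keepdim=True)          # pooled gradients
cam = torch.relu((w * acts).sum(1)).squeeze()         # relevance per time bin
print(cam.shape)                                      # 50 time bins
```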
Affiliation(s)
- Taiki Orima: Department of Life Sciences, The University of Tokyo, Tokyo, Japan; Japan Society for the Promotion of Science, Tokyo, Japan
- Isamu Motoyoshi: Department of Life Sciences, The University of Tokyo, Tokyo, Japan
21. Velarde OM, Makse HA, Parra LC. Architecture of the brain's visual system enhances network stability and performance through layers, delays, and feedback. PLoS Comput Biol 2023; 19:e1011078. PMID: 37948463. PMCID: PMC10664920. DOI: 10.1371/journal.pcbi.1011078.
Abstract
In the visual system of primates, image information propagates across successive cortical areas, and there is also local feedback within an area and long-range feedback across areas. Recent findings suggest that the resulting temporal dynamics of neural activity are crucial in several vision tasks. In contrast, artificial neural network models of vision are typically feedforward and do not capitalize on the benefits of temporal dynamics, partly due to concerns about stability and computational costs. In this study, we focus on recurrent networks with feedback connections for visual tasks with static input corresponding to a single fixation. We demonstrate mathematically that a network's dynamics can be stabilized by four key features of biological networks: layer-ordered structure, temporal delays between layers, longer distance feedback across layers, and nonlinear neuronal responses. Conversely, when feedback has a fixed distance, one can omit delays in feedforward connections to achieve more efficient artificial implementations. We also evaluated the effect of feedback connections on object detection and classification performance using standard benchmarks, specifically the COCO and CIFAR10 datasets. Our findings indicate that feedback connections improved the detection of small objects, and classification performance became more robust to noise. We found that performance increased with the temporal dynamics, not unlike what is observed in core vision of primates. These results suggest that delays and layered organization are crucial features for stability and performance in both biological and artificial recurrent neural networks.
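The network motif analyzed in this paper, a layered recurrent system with conduction delays, longer-range feedback, and saturating nonlinearities, can be sketched in a few lines; the weights, sizes, and delay value below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
n_layers, n_units, T, delay = 4, 20, 200, 2   # delay in time steps
W_ff = [rng.standard_normal((n_units, n_units)) / np.sqrt(n_units)
        for _ in range(n_layers - 1)]          # feedforward, layer l -> l+1
W_fb = [0.3 * rng.standard_normal((n_units, n_units)) / np.sqrt(n_units)
        for _ in range(n_layers - 1)]          # feedback, layer l+1 -> l
x = np.zeros((T, n_layers, n_units))
stim = rng.standard_normal(n_units)            # static input (a single fixation)

for t in range(delay, T):
    for l in range(n_layers):
        # Delayed, layer-ordered drive plus delayed longer-range feedback;
        # the saturating nonlinearity is one of the stabilizing features
        # named in the abstract.
        drive = stim if l == 0 else W_ff[l - 1] @ x[t - delay, l - 1]
        fb = W_fb[l] @ x[t - delay, l + 1] if l < n_layers - 1 else 0.0
        x[t, l] = np.tanh(drive + fb)

print(np.abs(x[-1]).max())   # activity stays bounded as the dynamics unfold
```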
Affiliation(s)
- Osvaldo Matias Velarde: Biomedical Engineering Department, The City College of New York, New York, New York, United States of America
- Hernán A. Makse: Levich Institute and Physics Department, The City College of New York, New York, New York, United States of America
- Lucas C. Parra: Biomedical Engineering Department, The City College of New York, New York, New York, United States of America
22. Sendjasni A, Larabi MC. Attention-Aware Patch-Based CNN for Blind 360-Degree Image Quality Assessment. Sensors (Basel) 2023; 23:8676. PMID: 37960376. PMCID: PMC10647793. DOI: 10.3390/s23218676.
Abstract
An attention-aware patch-based deep-learning model for a blind 360-degree image quality assessment (360-IQA) is introduced in this paper. It employs spatial attention mechanisms to focus on spatially significant features, in addition to short skip connections to align them. A long skip connection is adopted to allow features from the earliest layers to be used at the final level. Patches are properly sampled on the sphere to correspond to the viewports displayed to the user using head-mounted displays. The sampling incorporates the relevance of patches by considering (i) the exploration behavior and (ii) a latitude-based selection. An adaptive strategy is applied to improve the pooling of local patch qualities to global image quality. This includes an outlier score rejection step relying on the standard deviation of the obtained scores to consider the agreement, as well as a saliency to weigh them based on their visual significance. Experiments on available 360-IQA databases show that our model outperforms the state of the art in terms of accuracy and generalization ability. This is valid for general deep-learning-based models, multichannel models, and natural scene statistic-based models. Furthermore, when compared to multichannel models, the computational complexity is significantly reduced. Finally, an extensive ablation study gives insights into the efficacy of each component of the proposed model.
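The adaptive pooling step described above, outlier-score rejection by standard deviation followed by saliency weighting, is easy to state directly; the threshold k and the toy scores below are illustrative:

```python
import numpy as np

def pool_patch_scores(scores, saliency, k=1.5):
    """Adaptive pooling of local patch quality scores into a global score:
    reject outlier scores beyond k standard deviations of the mean (the
    agreement step), then weight the remaining patches by visual saliency."""
    scores = np.asarray(scores, dtype=float)
    saliency = np.asarray(saliency, dtype=float)
    keep = np.abs(scores - scores.mean()) <= k * scores.std()
    w = saliency[keep] / saliency[keep].sum()
    return float(np.sum(w * scores[keep]))

patch_scores = [3.8, 4.1, 3.9, 1.2, 4.0]     # one disagreeing (outlier) patch
patch_saliency = [0.9, 0.7, 0.8, 0.5, 0.6]
print(pool_patch_scores(patch_scores, patch_saliency))   # ~3.94, outlier dropped
```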
23. Schmid AC, Barla P, Doerschner K. Material category of visual objects computed from specular image structure. Nat Hum Behav 2023. PMID: 37386108. PMCID: PMC10365995. DOI: 10.1038/s41562-023-01601-0.
Abstract
Recognizing materials and their properties visually is vital for successful interactions with our environment, from avoiding slippery floors to handling fragile objects. Yet there is no simple mapping of retinal image intensities to physical properties. Here, we investigated what image information drives material perception by collecting human psychophysical judgements about complex glossy objects. Variations in specular image structure-produced either by manipulating reflectance properties or visual features directly-caused categorical shifts in material appearance, suggesting that specular reflections provide diagnostic information about a wide range of material classes. Perceived material category appeared to mediate cues for surface gloss, providing evidence against a purely feedforward view of neural processing. Our results suggest that the image structure that triggers our perception of surface gloss plays a direct role in visual categorization, and that the perception and neural processing of stimulus properties should be studied in the context of recognition, not in isolation.
Affiliation(s)
- Alexandra C Schmid
- Department of Psychology, Justus Liebig University Giessen, Giessen, Germany.
- Katja Doerschner
- Department of Psychology, Justus Liebig University Giessen, Giessen, Germany
24
Henderson MM, Tarr MJ, Wehbe L. A Texture Statistics Encoding Model Reveals Hierarchical Feature Selectivity across Human Visual Cortex. J Neurosci 2023; 43:4144-4161. [PMID: 37127366 PMCID: PMC10255092 DOI: 10.1523/jneurosci.1822-22.2023] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 03/21/2023] [Accepted: 03/26/2023] [Indexed: 05/03/2023] Open
Abstract
Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested that a texture statistics model, the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.

SIGNIFICANCE STATEMENT: Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
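A minimal sketch of the voxelwise encoding and variance-partitioning logic described above, using ridge regression on synthetic data; the feature split, regularization strength, and data are illustrative stand-ins, not the study's actual P-S features or fMRI responses.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins: 400 images x 120 texture-statistic features
# (columns 0:40 "lower-order", 40:120 "higher-order"), one voxel's responses
X = rng.standard_normal((400, 120))
y = (X[:, :40] @ rng.standard_normal(40)
     + 0.5 * (X[:, 40:] @ rng.standard_normal(80))
     + rng.standard_normal(400))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def heldout_r2(F_tr, F_te):
    """Fit a ridge encoding model and score it on held-out data."""
    pred = Ridge(alpha=10.0).fit(F_tr, y_tr).predict(F_te)
    return 1 - np.sum((y_te - pred) ** 2) / np.sum((y_te - y_te.mean()) ** 2)

r2_full = heldout_r2(X_tr, X_te)
# Variance partitioning: unique contribution of the higher-order features
# = R2(full model) - R2(model with those features removed)
r2_reduced = heldout_r2(X_tr[:, :40], X_te[:, :40])
print(f"unique variance of higher-order features: {r2_full - r2_reduced:.3f}")
```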
Affiliation(s)
- Margaret M Henderson
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Department of Psychology
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Michael J Tarr
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Department of Psychology
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Leila Wehbe
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Department of Psychology
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
25
Josephs EL, Hebart MN, Konkle T. Dimensions underlying human understanding of the reachable world. Cognition 2023; 234:105368. [PMID: 36641868 PMCID: PMC11562958 DOI: 10.1016/j.cognition.2023.105368] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Revised: 12/20/2022] [Accepted: 01/09/2023] [Indexed: 01/15/2023]
Abstract
Near-scale environments, like work desks, restaurant place settings or lab benches, are the interface of our hand-based interactions with the world. How are our conceptual representations of these environments organized? What properties distinguish among reachspaces, and why? We obtained 1.25 million similarity judgments on 990 reachspace images, and generated a 30-dimensional embedding which accurately predicts these judgments. Examination of the embedding dimensions revealed key properties underlying these judgments, such as reachspace layout, affordance, and visual appearance. Clustering performed over the embedding revealed four distinct interpretable classes of reachspaces, distinguishing among spaces related to food, electronics, analog activities, and storage or display. Finally, we found that reachspace similarity ratings were better predicted by the function of the spaces than their locations, suggesting that reachspaces are largely conceptualized in terms of the actions they support. Altogether, these results reveal the behaviorally-relevant principles that structure our internal representations of reach-relevant environments.
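To make the embedding-based prediction concrete, here is a hedged sketch in which pairwise similarity is modeled as the dot product of non-negative embedding vectors, one common choice for this kind of model. The dimensions, data, and evaluation below are synthetic illustrations, not the study's learned embedding.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)

# Stand-in for the learned embedding: 990 reachspace images x 30 dimensions
embedding = rng.random((990, 30))

def predicted_similarity(i, j):
    # Similarity of two images as the dot product of their
    # non-negative dimension weights
    return embedding[i] @ embedding[j]

# Evaluate against (synthetic) noisy pairwise judgments
pairs = rng.integers(0, 990, size=(1000, 2))
predicted = np.array([predicted_similarity(i, j) for i, j in pairs])
observed = predicted + rng.normal(0, 0.3, 1000)  # toy "human" data
rho, _ = spearmanr(predicted, observed)
print(f"rank correlation with judgments: {rho:.2f}")
```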
Affiliation(s)
- Emilie L Josephs
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA; Psychology Department, Harvard University, Cambridge, USA.
- Martin N Hebart
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
- Talia Konkle
- Psychology Department, Harvard University, Cambridge, USA.
26
Cheng A, Chen Z, Dilks DD. A stimulus-driven approach reveals vertical luminance gradient as a stimulus feature that drives human cortical scene selectivity. Neuroimage 2023; 269:119935. [PMID: 36764369 PMCID: PMC10044493 DOI: 10.1016/j.neuroimage.2023.119935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2022] [Revised: 01/19/2023] [Accepted: 02/07/2023] [Indexed: 02/11/2023] Open
Abstract
Human neuroimaging studies have revealed a dedicated cortical system for visual scene processing. But what is a "scene"? Here, we use a stimulus-driven approach to identify a stimulus feature that selectively drives cortical scene processing. Specifically, using fMRI data from BOLD5000, we examined the images that elicited the greatest response in the cortical scene processing system, and found that there is a common "vertical luminance gradient" (VLG), with the top half of a scene image brighter than the bottom half; moreover, across the entire set of images, VLG systematically increases with the neural response in the scene-selective regions (Study 1). Thus, we hypothesized that VLG is a stimulus feature that selectively engages cortical scene processing, and directly tested the role of VLG in driving cortical scene selectivity using tightly controlled VLG stimuli (Study 2). Consistent with our hypothesis, we found that the scene-selective cortical regions (but not an object-selective region or early visual cortex) responded significantly more to images of VLG over control stimuli with minimal VLG. Interestingly, such selectivity was also found for images with an "inverted" VLG, resembling the luminance gradient in night scenes. Finally, we also tested the behavioral relevance of VLG for visual scene recognition (Study 3); we found that participants even categorized tightly controlled stimuli of both upright and inverted VLG to be a place more than an object, indicating that VLG is also used for behavioral scene recognition. Taken together, these results reveal that VLG is a stimulus feature that selectively engages cortical scene processing, and provide evidence for a recent proposal that visual scenes can be characterized by a set of common and unique visual features.
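The vertical luminance gradient itself is simple to compute: mean luminance of the top half of the image minus that of the bottom half. A minimal sketch (the exact normalization used in the study may differ):

```python
import numpy as np

def vertical_luminance_gradient(img):
    """Mean luminance of the top half minus the bottom half.

    img: 2-D array of luminance values (H x W). Positive values mean
    the top half is brighter, as in typical daytime scenes.
    """
    h = img.shape[0] // 2
    return float(img[:h].mean() - img[h:].mean())

# A toy "scene": bright sky over darker ground
scene = np.vstack([np.full((50, 100), 0.8), np.full((50, 100), 0.3)])
print(vertical_luminance_gradient(scene))  # 0.5
```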
Affiliation(s)
- Annie Cheng
- Department of Psychology, Emory University, Atlanta, GA, USA; Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA
- Zirui Chen
- Department of Psychology, Emory University, Atlanta, GA, USA; Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
- Daniel D Dilks
- Department of Psychology, Emory University, Atlanta, GA, USA.
27
Hebart MN, Contier O, Teichmann L, Rockter AH, Zheng CY, Kidder A, Corriveau A, Vaziri-Pashkam M, Baker CI. THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior. eLife 2023; 12:e82580. [PMID: 36847339 PMCID: PMC10038662 DOI: 10.7554/elife.82580] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Accepted: 02/25/2023] [Indexed: 03/01/2023] Open
Abstract
Understanding object representations requires a broad, comprehensive sampling of the objects in our visual world with dense measurements of brain activity and behavior. Here, we present THINGS-data, a multimodal collection of large-scale neuroimaging and behavioral datasets in humans, comprising densely sampled functional MRI and magnetoencephalographic recordings, as well as 4.70 million similarity judgments in response to thousands of photographic images for up to 1,854 object concepts. THINGS-data is unique in its breadth of richly annotated objects, allowing for testing countless hypotheses at scale while assessing the reproducibility of previous findings. Beyond the unique insights promised by each individual dataset, the multimodality of THINGS-data allows combining datasets for a much broader view into object processing than previously possible. Our analyses demonstrate the high quality of the datasets and provide five examples of hypothesis-driven and data-driven applications. THINGS-data constitutes the core public release of the THINGS initiative (https://things-initiative.org) for bridging the gap between disciplines and the advancement of cognitive neuroscience.
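A small example of the kind of cross-dataset analysis THINGS-data enables: compare a behavioural representational dissimilarity matrix (RDM) with a neural one via rank correlation. All data below are synthetic stand-ins, not the released datasets.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(10)

# Stand-ins: a behavioural embedding and fMRI patterns for 50 concepts
behav = rng.random((50, 20))
fmri = behav @ rng.standard_normal((20, 100)) + rng.standard_normal((50, 100))

# Representational dissimilarity matrices (condensed form) and their
# rank correlation -- a standard way to relate behaviour to brain data
rdm_behav = pdist(behav, metric="correlation")
rdm_fmri = pdist(fmri, metric="correlation")
rho, _ = spearmanr(rdm_behav, rdm_fmri)
print(f"behaviour-brain RSA: rho = {rho:.2f}")
```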
Affiliation(s)
- Martin N Hebart
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Department of Medicine, Justus Liebig University Giessen, Giessen, Germany
- Oliver Contier
- Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Max Planck School of Cognition, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Lina Teichmann
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
- Adam H Rockter
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
- Charles Y Zheng
- Machine Learning Core, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
- Alexis Kidder
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
- Anna Corriveau
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
- Maryam Vaziri-Pashkam
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
- Chris I Baker
- Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, United States
28
Rodríguez-San Esteban P, Chica AB, Paz-Alonso PM. Functional characterization of correct and incorrect feature integration. Cereb Cortex 2023; 33:1440-1451. [PMID: 35510933 DOI: 10.1093/cercor/bhac147] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2021] [Revised: 03/22/2022] [Accepted: 03/24/2022] [Indexed: 11/14/2022] Open
Abstract
Our sensory system constantly receives information from the environment and our own body. Despite our impression to the contrary, we remain largely unaware of this information and often cannot report it correctly. Although perceptual processing does not require conscious effort on the part of the observer, it is often complex, giving rise to errors such as incorrect integration of features (illusory conjunctions). In the present study, we use functional magnetic resonance imaging to study the neural bases of feature integration in a dual task that produced ~30% illusions. A distributed set of regions demonstrated increased activity for correct compared to incorrect (illusory) feature integration, with increased functional coupling between occipital and parietal regions. In contrast, incorrect feature integration (illusions) was associated with increased occipital (V1-V2) responses at early stages, reduced functional connectivity between right occipital regions and the frontal eye field at later stages, and an overall decrease in coactivation between occipital and parietal regions. These results underscore the role of parietal regions in feature integration and highlight the relevance of functional occipito-frontal interactions in perceptual processing.
Affiliation(s)
- Pablo Rodríguez-San Esteban
- Department of Experimental Psychology and Brain, Mind and Behavior Research Center (CIMCYC), Universidad de Granada, Campus de Cartuja S/N, 18071 Granada, Spain
- Ana B Chica
- Department of Experimental Psychology and Brain, Mind and Behavior Research Center (CIMCYC), Universidad de Granada, Campus de Cartuja S/N, 18071 Granada, Spain
- Pedro M Paz-Alonso
- BCBL-Basque Center on Cognition, Brain and Language, Mikeletegi Pasealekua 69, 20009 Donostia, Gipuzkoa, Spain; IKERBASQUE-Basque Foundation for Science, 48013 Bilbo, Bizkaia, Spain
29
The Spatiotemporal Neural Dynamics of Object Recognition for Natural Images and Line Drawings. J Neurosci 2023; 43:484-500. [PMID: 36535769 PMCID: PMC9864561 DOI: 10.1523/jneurosci.1546-22.2022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 11/18/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
Drawings offer a simple and efficient way to communicate meaning. While line drawings capture only coarsely how objects look in reality, we still perceive them as resembling real-world objects. Previous work has shown that this perceived similarity is mirrored by shared neural representations for drawings and natural images, which suggests that similar mechanisms underlie the recognition of both. However, other work has proposed that representations of drawings and natural images become similar only after substantial processing has taken place, suggesting distinct mechanisms. To arbitrate between those alternatives, we measured brain responses resolved in space and time using fMRI and MEG, respectively, while human participants (female and male) viewed images of objects depicted as photographs, line drawings, or sketch-like drawings. Using multivariate decoding, we demonstrate that object category information emerged similarly fast and across overlapping regions in occipital, ventral-temporal, and posterior parietal cortex for all types of depiction, yet with smaller effects at higher levels of visual abstraction. In addition, cross-decoding between depiction types revealed strong generalization of object category information from early processing stages on. Finally, by combining fMRI and MEG data using representational similarity analysis, we found that visual information traversed similar processing stages for all types of depiction, yet with an overall stronger representation for photographs. Together, our results demonstrate broad commonalities in the neural dynamics of object recognition across types of depiction, thus providing clear evidence for shared neural mechanisms underlying recognition of natural object images and abstract drawings.

SIGNIFICANCE STATEMENT: When we see a line drawing, we effortlessly recognize it as an object in the world despite its simple and abstract style. Here we asked to what extent this correspondence in perception is reflected in the brain. To answer this question, we measured how neural processing of objects depicted as photographs and line drawings with varying levels of detail (from natural images to abstract line drawings) evolves over space and time. We find broad commonalities in the spatiotemporal dynamics and the neural representations underlying the perception of photographs and even abstract drawings. These results indicate a shared basic mechanism supporting recognition of drawings and natural images.
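The cross-decoding logic described above can be sketched in a few lines: train a classifier on patterns evoked by one depiction type and test it on another; above-chance transfer indicates a shared representational format. Everything below is synthetic stand-in data, not the study's MEG/fMRI patterns.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Synthetic pattern data: trials x features for two depiction types,
# sharing a common category signal (labels: 0/1)
labels = rng.integers(0, 2, 200)
signal = np.outer(labels - 0.5, rng.standard_normal(64))
photos = signal + rng.standard_normal((200, 64))
drawings = signal + rng.standard_normal((200, 64))

# Cross-decoding: train on photographs, test on line drawings
clf = LogisticRegression(max_iter=1000).fit(photos, labels)
print(f"photo->drawing accuracy: {clf.score(drawings, labels):.2f}")
```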
30
Sztuka IM, Örken A, Sudimac S, Kühn S. The other blue: Role of sky in the perception of nature. Front Psychol 2022; 13:932507. [PMID: 36389494 PMCID: PMC9651055 DOI: 10.3389/fpsyg.2022.932507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2022] [Accepted: 08/31/2022] [Indexed: 12/05/2022] Open
Abstract
Nature is frequently operationalized as greenery or water when estimating the restorativeness of an environment. To deepen our understanding of how naturalness is represented and how it relates to restoration, we conducted an experiment investigating whether the sky is perceived as an element of nature. The main goal was to understand how the composition of the environment guides people's selection of sky as nature in an explicit task, and how the amount of visible sky shapes this relationship. One hundred and five participants completed a novel explicit-judgment task online, using a set of images trimmed from 360-degree high-dynamic-range photographs. The images were classified according to two primary independent variables: type of environment (four levels: Nature, Some Nature, Some Urban, Urban) and horizon level (three levels: Low, Medium, High). Each participant was asked to select, by clicking on the image, what they considered "nature," and to judge each image on five visual-analogue scales: emotional response, aesthetic preference, feeling of familiarity, openness of the space, and naturalness. For analysis, images were segmented into 11 semantic categories (e.g., trees, sky, water), with each pixel assigned one semantic label. Our results show that sky is associated with selections of nature in a specific pattern: the relationship depends on the particular conditions present in the environment (i.e., weather, season of the year) rather than on the type of environment (urban, nature). The availability of sky in an image affected the selection of other nature labels, which were more likely when only a small amount of sky was visible. Furthermore, the amount of sky had a significant, though small, positive association with the naturalness rating of the whole image. Subjective selections of sky predicted naturalness better than selections of trees and water, whereas the objective presence of trees and water was more strongly associated with naturalness than the objective presence of sky. Overall, the results show that, relative to its availability, the sky is considered an element of nature.
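As an illustration of the pixel-level analysis described above, the sketch below computes the fraction of sky pixels from a semantic segmentation map and correlates it with naturalness ratings. The label id, image sizes, and data are invented for the example, not taken from the study.

```python
import numpy as np
from scipy.stats import pearsonr

SKY = 7  # illustrative label id for "sky" in the segmentation maps

def sky_fraction(seg_map):
    """Fraction of pixels labelled as sky in one segmented image."""
    return float(np.mean(seg_map == SKY))

rng = np.random.default_rng(3)
# Synthetic stand-ins: 40 images with 11 semantic labels per pixel,
# plus per-image naturalness ratings loosely tied to sky coverage
seg_maps = [rng.integers(0, 11, (120, 160)) for _ in range(40)]
fractions = np.array([sky_fraction(m) for m in seg_maps])
naturalness = 2.0 * fractions + rng.normal(0, 0.05, 40)

r, p = pearsonr(fractions, naturalness)
print(f"sky fraction vs. naturalness: r = {r:.2f}, p = {p:.3f}")
```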
Affiliation(s)
- Izabela Maria Sztuka
- Lise Meitner Group for Environmental Neuroscience, Max Planck Institute for Human Development, Berlin, Germany
- Max Planck Dahlem Campus of Cognition (MPDCC), Max Planck Institute for Human Development, Berlin, Germany
- Ada Örken
- Lise Meitner Group for Environmental Neuroscience, Max Planck Institute for Human Development, Berlin, Germany
- Sonja Sudimac
- Lise Meitner Group for Environmental Neuroscience, Max Planck Institute for Human Development, Berlin, Germany
- Max Planck Dahlem Campus of Cognition (MPDCC), Max Planck Institute for Human Development, Berlin, Germany
- Max Planck Institute for Human Development, International Max Planck Research School on the Life Course (LIFE), Berlin, Germany
- Simone Kühn
- Lise Meitner Group for Environmental Neuroscience, Max Planck Institute for Human Development, Berlin, Germany
- Max Planck Dahlem Campus of Cognition (MPDCC), Max Planck Institute for Human Development, Berlin, Germany
- Department of Psychiatry and Psychotherapy, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany, and London, UK
31
Park J, Josephs E, Konkle T. Ramp-shaped neural tuning supports graded population-level representation of the object-to-scene continuum. Sci Rep 2022; 12:18081. [PMID: 36302932 PMCID: PMC9613906 DOI: 10.1038/s41598-022-21768-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2022] [Accepted: 09/30/2022] [Indexed: 01/24/2023] Open
Abstract
We can easily perceive the spatial scale depicted in a picture, regardless of whether it is a small space (e.g., a close-up view of a chair) or a much larger space (e.g., an entire classroom). How does the human visual system encode this continuous dimension? Here, we investigated the underlying neural coding of depicted spatial scale by examining the voxel tuning and topographic organization of brain responses. We created naturalistic yet carefully controlled stimuli by constructing virtual indoor environments and rendering a series of snapshots that smoothly sample between a close-up view of the central object and a far-scale view of the full environment (the object-to-scene continuum). Human brain responses to each position were measured using functional magnetic resonance imaging. We did not find evidence for a smooth topographic mapping of the object-to-scene continuum on the cortex. Instead, we observed large swaths of cortex with opposing ramp-shaped profiles, with highest responses to one end of the object-to-scene continuum or the other, and a small region showing weak tuning to intermediate-scale views. However, when we considered the population code of the entire ventral occipito-temporal cortex, we found a smooth, linear representation of the object-to-scene continuum. Together, our results suggest that depicted spatial scale is encoded parametrically in large-scale population codes across the entire ventral occipito-temporal cortex.
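The contrast between ramp-shaped and tuned voxel profiles can be made concrete by fitting both response shapes to a voxel's responses along the continuum and comparing the fits. This is a hedged sketch on synthetic data; the model forms and fitting details are illustrative, not the study's exact analysis.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr

# Positions along the object-to-scene continuum (0 = close-up, 1 = full scene)
x = np.linspace(0, 1, 12)

def ramp(x, a, b):
    return a * x + b

def gaussian(x, amp, mu, sigma, base):
    return amp * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) + base

rng = np.random.default_rng(4)
voxel = 1.5 * x + 0.2 + rng.normal(0, 0.1, x.size)  # a ramp-shaped voxel

for name, f, p0 in [("ramp", ramp, (1, 0)),
                    ("tuned", gaussian, (1, 0.5, 0.2, 0))]:
    params, _ = curve_fit(f, x, voxel, p0=p0, maxfev=10000)
    r, _ = pearsonr(f(x, *params), voxel)
    print(f"{name}: fit r = {r:.2f}")
```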
Affiliation(s)
- Jeongho Park
- Department of Psychology, Harvard University, Cambridge, USA.
- Emilie Josephs
- Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, USA
- Talia Konkle
- Department of Psychology, Harvard University, Cambridge, USA
32
Brain Symmetry in Alpha Band When Watching Cuts in Movies. Symmetry (Basel) 2022. [DOI: 10.3390/sym14101980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
The purpose of this study is to determine whether brain activity is asymmetric between the two hemispheres while watching cuts in movies. We presented videos containing cuts to 36 participants, recorded electrical brain activity through electroencephalography (EEG), and analyzed asymmetry in frontal, somatomotor, temporal, parietal and occipital areas. EEG power and alpha-band (8-13 Hz) asymmetry were analyzed based on 4,032 epochs (112 epochs from videos x 36 participants) in each hemisphere. On average, we found negative asymmetry, indicating greater alpha power in the left hemisphere and greater activity in the right hemisphere in frontal, temporal and occipital areas; the opposite was found in somatomotor and temporal areas. However, given high inter-subject variability, these asymmetries were not significant. Our results suggest that cuts in audiovisual content do not provoke any specific asymmetric alpha-band activity in viewers. We conclude that brain asymmetry when decoding audiovisual content may be related more to narrative content than to formal style.
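A common way to quantify this kind of hemispheric difference is an alpha asymmetry index, ln(right power) - ln(left power), computed from band-limited spectral power. The sketch below is a generic implementation on synthetic signals, not the study's exact pipeline; the sampling rate and band limits are illustrative.

```python
import numpy as np
from scipy.signal import welch

FS = 500  # sampling rate in Hz (illustrative)

def alpha_power(x, fs=FS, band=(8, 13)):
    """Alpha-band power of one EEG epoch via Welch's method."""
    f, psd = welch(x, fs=fs, nperseg=fs)
    mask = (f >= band[0]) & (f <= band[1])
    return np.trapz(psd[mask], f[mask])

def asymmetry_index(left, right):
    """ln(right) - ln(left): negative values indicate greater alpha
    power on the left (hence relatively greater right-hemisphere
    activity, given alpha's inverse relation to activation)."""
    return np.log(alpha_power(right)) - np.log(alpha_power(left))

rng = np.random.default_rng(5)
t = np.arange(FS * 2) / FS
left = 2.0 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(t.size)
right = 1.0 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(t.size)
print(f"AI = {asymmetry_index(left, right):.2f}")  # negative: more alpha on the left
```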
33
Li M, Huang H, Guo B, Meng M. Distinct response properties between the FFA to faces and the PPA to houses. Brain Behav 2022; 12:e2706. [PMID: 35848943 PMCID: PMC9392545 DOI: 10.1002/brb3.2706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/22/2021] [Revised: 05/24/2022] [Accepted: 06/25/2022] [Indexed: 11/09/2022] Open
Abstract
INTRODUCTION: The object recognition system involves both selectivity to specific object categories and invariance to changes in low-level visual features. Mounting neuroimaging evidence supports that brain areas in the ventral temporal cortex, such as the FFA and PPA, respond preferentially to faces and houses, respectively. However, how regions in human ventral temporal cortex are partitioned and functionally organized to respond selectively and invariantly to different object categories remains unclear, as does how response properties change at the intersection of adjacent but distinctively selective regions.

METHOD: We conducted an fMRI study with three-pronged analyses to compare the brain mapping relationships between the FFA to faces and the PPA to houses. Specifically, we examined: (1) the response properties of object selectivity to the preferred category; (2) the response properties of invariance to contrast and to a concurrently presented non-preferred category; and (3) whether there are asymmetrical changes of response properties across the boundary from the FFA to the PPA versus from the PPA to the FFA.

RESULTS: We found that the response properties of the FFA are highly selective and reliably invariant, whereas the responses of the PPA vary with image contrast and with a concurrently presented face. Moreover, the changes in response properties across the boundary between the FFA and PPA are asymmetric: the transition from face-selective to house-selective cortex differs from the transition from house-selective to face-selective cortex.

CONCLUSIONS: These results convergently reveal distinct response properties of the FFA to faces and the PPA to houses, implying that mapping in human ventral temporal cortex combines spatially discrete, domain-specific organization with relatively distributed, domain-general organization.
Affiliation(s)
- Mengjin Li
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
- Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, South China Normal University, Guangzhou, China
- Hong Huang
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
- Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, South China Normal University, Guangzhou, China
- Bingbing Guo
- School of Teacher Education, Nanjing Xiaozhuang University, Nanjing, China
- Ming Meng
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
34
Wang R, Janini D, Konkle T. Mid-level Feature Differences Support Early Animacy and Object Size Distinctions: Evidence from Electroencephalography Decoding. J Cogn Neurosci 2022; 34:1670-1680. [PMID: 35704550 PMCID: PMC9438936 DOI: 10.1162/jocn_a_01883] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Responses to visually presented objects along the cortical surface of the human brain have a large-scale organization reflecting the broad categorical divisions of animacy and object size. Emerging evidence indicates that this topographical organization is supported by differences between objects in mid-level perceptual features. With regard to the timing of neural responses, images of objects quickly evoke neural responses with decodable information about animacy and object size, but are mid-level features sufficient to evoke these rapid neural responses? Or is slower iterative neural processing required to untangle information about animacy and object size from mid-level features, requiring hundreds of milliseconds more processing time? To answer this question, we used EEG to measure human neural responses to images of objects and their texform counterparts: unrecognizable images that preserve some mid-level feature information about texture and coarse form. We found that texform images evoked neural responses with early decodable information about both animacy and real-world size, as early as responses evoked by original images. Furthermore, successful cross-decoding indicates that both texform and original images evoke information about animacy and size through a common underlying neural basis. Broadly, these results indicate that the visual system contains a mid-level feature bank carrying linearly decodable information on animacy and size, which can be rapidly activated without requiring explicit recognition or protracted temporal processing.
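Time-resolved decoding of this kind is typically implemented by training a classifier at each timepoint separately. A minimal sketch on synthetic EEG follows; trial counts, channel counts, and effect size are invented, and the study additionally cross-decoded between texform and original images.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)

# Synthetic EEG: trials x channels x timepoints, with animacy labels 0/1
n_trials, n_ch, n_t = 120, 32, 50
labels = rng.integers(0, 2, n_trials)
eeg = rng.standard_normal((n_trials, n_ch, n_t))
eeg[:, :, 20:] += 0.5 * (labels - 0.5)[:, None, None]  # signal from t=20 on

# Decode animacy separately at every timepoint
acc = np.array([
    cross_val_score(LogisticRegression(max_iter=1000),
                    eeg[:, :, t], labels, cv=5).mean()
    for t in range(n_t)
])
onset = int(np.argmax(acc > 0.6))
print(f"decoding rises above 0.6 at timepoint {onset}")
```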
35
Ramanoël S, Durteste M, Bizeul A, Ozier‐Lafontaine A, Bécu M, Sahel J, Habas C, Arleo A. Selective neural coding of object, feature, and geometry spatial cues in humans. Hum Brain Mapp 2022; 43:5281-5295. [PMID: 35776524 PMCID: PMC9812241 DOI: 10.1002/hbm.26002] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Revised: 06/02/2022] [Accepted: 06/20/2022] [Indexed: 01/15/2023] Open
Abstract
Orienting in space requires the processing of visual spatial cues. The dominant hypothesis about the brain structures mediating the coding of spatial cues stipulates the existence of a hippocampal-dependent system for the representation of geometry and a striatal-dependent system for the representation of landmarks. However, this dual-system hypothesis is based on paradigms that presented spatial cues conveying either conflicting or ambiguous spatial information and that used the term landmark to refer to both discrete three-dimensional objects and wall features. Here, we test the hypothesis of complex activation patterns in the hippocampus and the striatum during visual coding. We also postulate that object-based and feature-based navigation are not equivalent instances of landmark-based navigation. We examined how the neural networks associated with geometry-, object-, and feature-based spatial navigation compared with a control condition in a two-choice behavioral paradigm using fMRI. We showed that the hippocampus was involved in all three types of cue-based navigation, whereas the striatum was more strongly recruited in the presence of geometric cues than object or feature cues. We also found that unique, specific neural signatures were associated with each spatial cue. Object-based navigation elicited a widespread pattern of activity in temporal and occipital regions relative to feature-based navigation. These findings extend the current view of a dual, juxtaposed hippocampal-striatal system for visual spatial coding in humans. They also provide novel insights into the neural networks mediating object versus feature spatial coding, suggesting a need to distinguish these two types of landmarks in the context of human navigation.
Affiliation(s)
- Stephen Ramanoël
- Sorbonne Université, INSERM, CNRS, Institut de la Vision, Paris, France
- Université Côte d'Azur, LAMHESS, Nice, France
- Marion Durteste
- Sorbonne Université, INSERM, CNRS, Institut de la Vision, Paris, France
- Alice Bizeul
- Sorbonne Université, INSERM, CNRS, Institut de la Vision, Paris, France
- Marcia Bécu
- Sorbonne Université, INSERM, CNRS, Institut de la Vision, Paris, France
- José-Alain Sahel
- Sorbonne Université, INSERM, CNRS, Institut de la Vision, Paris, France
- CHNO des Quinze-Vingts, INSERM-DGOS CIC 1423, Paris, France
- Fondation Ophtalmologique Rothschild, Paris, France
- Department of Ophthalmology, The University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Christophe Habas
- CHNO des Quinze-Vingts, INSERM-DGOS CIC 1423, Paris, France
- Université Versailles St Quentin en Yveline, Paris, France
- Angelo Arleo
- Sorbonne Université, INSERM, CNRS, Institut de la Vision, Paris, France
36
Measuring PM2.5 Concentrations from a Single Smartphone Photograph. REMOTE SENSING 2022. [DOI: 10.3390/rs14112572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/10/2022]
Abstract
PM2.5 participates in light scattering, degrading outdoor views, and this forms the basis for estimating PM2.5 from photographs. This paper devises an algorithm that estimates PM2.5 concentrations by extracting visual cues and atmospheric indices from a single photograph. Air quality measurement in complex urban scenes is particularly challenging when only a single atmospheric index or cue is available; combining several allows each to reinforce the others, yielding a more robust estimator. We therefore selected appropriate atmospheric indices across various outdoor scenes to identify reasonable cue combinations for measuring PM2.5. A PM2.5 dataset (PhotoPM-daytime) was built and used to evaluate performance and validate the efficacy of the cue combinations. Furthermore, a city-wide experiment was conducted using photographs crawled from the Internet to demonstrate the applicability of the algorithm to large-area PM2.5 monitoring. Results show that smartphones equipped with the developed method could potentially be used as PM2.5 sensors.
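One widely used single-image atmospheric cue of the sort this abstract alludes to is the dark channel (He et al., 2011): hazy, high-PM2.5 photographs tend to have systematically brighter dark channels. The sketch below computes it; this is a generic illustration, not necessarily one of the paper's selected indices.

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel of an RGB image: the minimum intensity over colour
    channels and a local square patch. Brighter dark channels indicate
    more haze, making this a common single-image atmospheric cue."""
    h, w, _ = img.shape
    min_rgb = img.min(axis=2)            # per-pixel minimum over channels
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    out = np.empty_like(min_rgb)
    for i in range(h):                   # local minimum filter
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

rng = np.random.default_rng(7)
photo = rng.random((60, 80, 3))          # stand-in for a real photograph
print(f"mean dark channel: {dark_channel(photo).mean():.3f}")
```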
37
The spatiotemporal neural dynamics of object location representations in the human brain. Nat Hum Behav 2022; 6:796-811. [PMID: 35210593 PMCID: PMC9225954 DOI: 10.1038/s41562-022-01302-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2021] [Accepted: 01/14/2022] [Indexed: 12/30/2022]
Abstract
To interact with objects in complex environments, we must know what they are and where they are in spite of challenging viewing conditions. Here, we investigated where, how and when representations of object location and category emerge in the human brain when objects appear on cluttered natural scene images using a combination of functional magnetic resonance imaging, electroencephalography and computational models. We found location representations to emerge along the ventral visual stream towards lateral occipital complex, mirrored by gradual emergence in deep neural networks. Time-resolved analysis suggested that computing object location representations involves recurrent processing in high-level visual cortex. Object category representations also emerged gradually along the ventral visual stream, with evidence for recurrent computations. These results resolve the spatiotemporal dynamics of the ventral visual stream that give rise to representations of where and what objects are present in a scene under challenging viewing conditions.
38
Woolnough O, Kadipasaoglu CM, Conner CR, Forseth KJ, Rollo PS, Rollo MJ, Baboyan VG, Tandon N. Dataset of human intracranial recordings during famous landmark identification. Sci Data 2022; 9:28. [PMID: 35102154 PMCID: PMC8803828 DOI: 10.1038/s41597-022-01125-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2021] [Accepted: 11/03/2021] [Indexed: 11/18/2022] Open
Abstract
For most people, recalling information about familiar items in a visual scene is an effortless task, but it is one that depends on coordinated interactions of multiple, distributed neural components. We leveraged the high spatiotemporal resolution of direct intracranial recordings to better delineate the network dynamics underpinning visual scene recognition. We present a dataset of recordings from a large cohort of humans while they identified images of famous landmarks (50 individuals, 52 recording sessions, 6,775 electrodes, 6,541 trials). This dataset contains local field potential recordings derived from subdural and penetrating electrodes covering broad areas of cortex across both hemispheres. We provide this pre-processed data with behavioural metrics (correct/incorrect, response times) and electrode localisation in a population-normalised cortical surface space. This rich dataset will allow further investigation into the spatiotemporal progression of multiple neural processes underlying visual processing, scene recognition and cued memory recall.
Affiliation(s)
- Oscar Woolnough
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
- Cihan M Kadipasaoglu
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
- Christopher R Conner
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America
- Memorial Hermann Hospital, Texas Medical Center, Houston, TX, 77030, United States of America
- Kiefer J Forseth
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
- Patrick S Rollo
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America
- Matthew J Rollo
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America
- Vatche G Baboyan
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America
- Nitin Tandon
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School at UT Health Houston, Houston, TX, 77030, United States of America.
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX, 77030, United States of America.
- Memorial Hermann Hospital, Texas Medical Center, Houston, TX, 77030, United States of America.
39
Harel A, Nador JD, Bonner MF, Epstein RA. Early Electrophysiological Markers of Navigational Affordances in Scenes. J Cogn Neurosci 2021; 34:397-410. [PMID: 35015877 DOI: 10.1162/jocn_a_01810] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Scene perception and spatial navigation are interdependent cognitive functions, and there is increasing evidence that cortical areas that process perceptual scene properties also carry information about the potential for navigation in the environment (navigational affordances). However, the temporal stages by which visual information is transformed into navigationally relevant information are not yet known. We hypothesized that navigational affordances are encoded during perceptual processing and therefore should modulate early visually evoked ERPs, especially the scene-selective P2 component. To test this idea, we recorded ERPs from participants while they passively viewed computer-generated room scenes matched in visual complexity. By simply changing the number of doors (no doors, 1 door, 2 doors, 3 doors), we were able to systematically vary the number of pathways that afford movement in the local environment, while keeping the overall size and shape of the environment constant. We found that rooms with no doors evoked a higher P2 response than rooms with three doors, consistent with prior research reporting higher P2 amplitude to closed relative to open scenes. Moreover, we found P2 amplitude scaled linearly with the number of doors in the scenes. Navigability effects on the ERP waveform were also observed in a multivariate analysis, which showed significant decoding of the number of doors and their location at earlier time windows. Together, our results suggest that navigational affordances are represented in the early stages of scene perception. This complements research showing that the occipital place area automatically encodes the structure of navigable space and strengthens the link between scene perception and navigation.
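The reported linear scaling of P2 with the number of doorways amounts to a simple linear trend across condition means, as in this sketch; the amplitude values are invented for illustration and only the direction of the effect (higher P2 for fewer doors) follows the abstract.

```python
import numpy as np

# Mean P2 amplitude (microvolts) per condition, ordered by the number
# of doors in the scene -- illustrative numbers, not the study's data
doors = np.array([0, 1, 2, 3])
p2 = np.array([4.8, 4.3, 3.9, 3.5])

# A linear fit quantifies the graded scaling reported above
slope, intercept = np.polyfit(doors, p2, 1)
print(f"P2 decreases by {-slope:.2f} uV per added doorway")
```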
40
Raffin E, Witon A, Salamanca-Giron RF, Huxlin KR, Hummel FC. Functional Segregation within the Dorsal Frontoparietal Network: A Multimodal Dynamic Causal Modeling Study. Cereb Cortex 2021; 32:3187-3205. [PMID: 34864941 DOI: 10.1093/cercor/bhab409] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2021] [Revised: 10/12/2021] [Accepted: 10/15/2021] [Indexed: 12/27/2022] Open
Abstract
Discrimination and integration of motion direction requires the interplay of multiple brain areas. Theoretical accounts of perception suggest that stimulus-related (i.e., exogenous) and decision-related (i.e., endogenous) factors affect distributed neuronal processing at different levels of the visual hierarchy. To test these predictions, we measured brain activity of healthy participants during a motion discrimination task, using electroencephalography (EEG) and functional magnetic resonance imaging (fMRI). We independently modeled the impact of exogenous factors (task demand) and endogenous factors (perceptual decision-making) on the activity of the motion discrimination network and applied Dynamic Causal Modeling (DCM) to both modalities. DCM for event-related potentials (DCM-ERP) revealed that task demand impacted the reciprocal connections between the primary visual cortex (V1) and the middle temporal area (V5). With practice, higher visual areas were increasingly involved, as revealed by DCM-fMRI. Perceptual decision-making modulated higher levels (e.g., V5-to-frontal eye fields, FEF) in a manner predictive of performance. Our data suggest that lower levels of the visual network support early, feature-based selection of responses, especially when learning strategies have not yet been implemented. In contrast, perceptual decision-making operates at higher levels of the visual hierarchy by integrating sensory information with the internal state of the subject.
Affiliation(s)
- Estelle Raffin
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, EPFL, Geneva CH-1201, Switzerland
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, Clinique Romande de Readaptation (CRR), EPFL Valais, Sion CH-1950, Switzerland
- Adrien Witon
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, EPFL, Geneva CH-1201, Switzerland
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, Clinique Romande de Readaptation (CRR), EPFL Valais, Sion CH-1950, Switzerland
- Health IT, IT Department, Hôpital du Valais, Sion, Switzerland
- Roberto F Salamanca-Giron
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, EPFL, Geneva CH-1201, Switzerland
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, Clinique Romande de Readaptation (CRR), EPFL Valais, Sion CH-1950, Switzerland
- Krystel R Huxlin
- The Flaum Eye Institute and Center for Visual Science, University of Rochester, Rochester, NY-14642, USA
- Friedhelm C Hummel
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, EPFL, Geneva CH-1201, Switzerland
- Defitech Chair in Clinical Neuroengineering, Center for Neuroprosthetics and Brain Mind Institute, Clinique Romande de Readaptation (CRR), EPFL Valais, Sion CH-1950, Switzerland
- Clinical Neuroscience, University of Geneva Medical School, Geneva CH-1205, Switzerland
41
Groen IIA, Dekker TM, Knapen T, Silson EH. Visuospatial coding as ubiquitous scaffolding for human cognition. Trends Cogn Sci 2021; 26:81-96. [PMID: 34799253 DOI: 10.1016/j.tics.2021.10.011] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2021] [Revised: 10/19/2021] [Accepted: 10/19/2021] [Indexed: 01/28/2023]
Abstract
For more than 100 years we have known that the visual field is mapped onto the surface of visual cortex, imposing an inherently spatial reference frame on visual information processing. Recent studies highlight visuospatial coding not only throughout visual cortex, but also brain areas not typically considered visual. Such widespread access to visuospatial coding raises important questions about its role in wider cognitive functioning. Here, we synthesise these recent developments and propose that visuospatial coding scaffolds human cognition by providing a reference frame through which neural computations interface with environmental statistics and task demands via perception-action loops.
Affiliation(s)
- Iris I A Groen
- Institute for Informatics, University of Amsterdam, Amsterdam, The Netherlands
- Tessa M Dekker
- Institute of Ophthalmology, University College London, London, UK
- Tomas Knapen
- Behavioral and Movement Sciences, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Spinoza Centre for NeuroImaging, Royal Dutch Academy of Sciences, Amsterdam, The Netherlands
- Edward H Silson
- Department of Psychology, School of Philosophy, Psychology & Language Sciences, University of Edinburgh, Edinburgh, UK.
42
Direct comparison of contralateral bias and face/scene selectivity in human occipitotemporal cortex. Brain Struct Funct 2021; 227:1405-1421. [PMID: 34727232 PMCID: PMC9046350 DOI: 10.1007/s00429-021-02411-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2021] [Accepted: 10/08/2021] [Indexed: 10/27/2022]
Abstract
Human visual cortex is organised broadly according to two major principles: retinotopy (the spatial mapping of the retina in cortex) and category-selectivity (preferential responses to specific categories of stimuli). Historically, these principles were considered anatomically separate, with retinotopy restricted to the occipital cortex and category-selectivity emerging in the lateral-occipital and ventral-temporal cortex. However, recent studies show that category-selective regions exhibit systematic retinotopic biases, for example exhibiting stronger activation for stimuli presented in the contra- compared to the ipsilateral visual field. It is unclear, however, whether responses within category-selective regions are more strongly driven by retinotopic location or by category preference, and if there are systematic differences between category-selective regions in the relative strengths of these preferences. Here, we directly compare contralateral and category preferences by measuring fMRI responses to scene and face stimuli presented in the left or right visual field and computing two bias indices: a contralateral bias (response to the contralateral minus ipsilateral visual field) and a face/scene bias (preferred response to scenes compared to faces, or vice versa). We compare these biases within and between scene- and face-selective regions and across the lateral and ventral surfaces of the visual cortex more broadly. We find an interaction between surface and bias: lateral surface regions show a stronger contralateral than face/scene bias, whilst ventral surface regions show the opposite. These effects are robust across and within subjects, and appear to reflect large-scale, smoothly varying gradients. Together, these findings support distinct functional roles for the lateral and ventral visual cortex in terms of the relative importance of the spatial location of stimuli during visual information processing.
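The two indices can be computed directly from condition means. Below is a hedged sketch of one plausible operationalization; the dictionary keys and response values are illustrative, and the paper's exact formula may differ.

```python
def bias_indices(resp):
    """Compute contralateral and face/scene bias indices for one ROI.

    resp: dict of mean responses to 'scene_contra', 'scene_ipsi',
    'face_contra', 'face_ipsi' (illustrative keys).
    """
    contralateral = ((resp["scene_contra"] + resp["face_contra"])
                     - (resp["scene_ipsi"] + resp["face_ipsi"])) / 2
    face_scene = ((resp["scene_contra"] + resp["scene_ipsi"])
                  - (resp["face_contra"] + resp["face_ipsi"])) / 2
    return contralateral, face_scene

# A lateral scene-selective ROI might look like this (toy values):
roi = {"scene_contra": 1.2, "scene_ipsi": 0.5,
       "face_contra": 0.6, "face_ipsi": 0.1}
contra, fs = bias_indices(roi)
print(f"contralateral bias = {contra:.2f}, scene-vs-face bias = {fs:.2f}")
```

On this toy ROI the contralateral bias exceeds the category bias, the pattern the abstract reports for the lateral surface.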
43
Words as Visual Objects: Neural and Behavioral Evidence for High-Level Visual Impairments in Dyslexia. Brain Sci 2021; 11:brainsci11111427. [PMID: 34827427 PMCID: PMC8615820 DOI: 10.3390/brainsci11111427] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2021] [Revised: 10/25/2021] [Accepted: 10/26/2021] [Indexed: 01/23/2023] Open
Abstract
Developmental dyslexia is defined by reading impairments that are disproportionate to intelligence, motivation, and the educational opportunities considered necessary for reading. Its cause has traditionally been considered to be a phonological deficit, where people have difficulties with differentiating the sounds of spoken language. However, reading is a multidimensional skill and relies on various cognitive abilities. These may include high-level vision—the processes that support visual recognition despite innumerable image variations, such as in viewpoint, position, or size. According to our high-level visual dysfunction hypothesis, reading problems of some people with dyslexia can be a salient manifestation of a more general deficit of high-level vision. This paper provides a perspective on how such non-phonological impairments could, in some cases, cause dyslexia. To argue in favor of this hypothesis, we will discuss work on functional neuroimaging, structural imaging, electrophysiology, and behavior that provides evidence for a link between high-level visual impairment and dyslexia.
44
Abstract
During natural vision, our brains are constantly exposed to complex, but regularly structured environments. Real-world scenes are defined by typical part-whole relationships, where the meaning of the whole scene emerges from configurations of localized information present in individual parts of the scene. Such typical part-whole relationships suggest that information from individual scene parts is not processed independently, but that there are mutual influences between the parts and the whole during scene analysis. Here, we review recent research that used a straightforward, but effective approach to study such mutual influences: By dissecting scenes into multiple arbitrary pieces, these studies provide new insights into how the processing of whole scenes is shaped by their constituent parts and, conversely, how the processing of individual parts is determined by their role within the whole scene. We highlight three facets of this research: First, we discuss studies demonstrating that the spatial configuration of multiple scene parts has a profound impact on the neural processing of the whole scene. Second, we review work showing that cortical responses to individual scene parts are shaped by the context in which these parts typically appear within the environment. Third, we discuss studies demonstrating that missing scene parts are interpolated from the surrounding scene context. Bridging these findings, we argue that efficient scene processing relies on an active use of the scene's part-whole structure, where the visual brain matches scene inputs with internal models of what the world should look like.
Affiliation(s)
- Daniel Kaiser
- Justus-Liebig-Universität Gießen, Germany.,Philipps-Universität Marburg, Germany.,University of York, United Kingdom
| | - Radoslaw M Cichy
- Freie Universität Berlin, Germany.,Humboldt-Universität zu Berlin, Germany.,Bernstein Centre for Computational Neuroscience Berlin, Germany
45
Non-invasive neurostimulation modulates processing of spatial frequency information in rapid perception of faces. Atten Percept Psychophys 2021; 84:150-160. [PMID: 34668174 DOI: 10.3758/s13414-021-02384-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/21/2021] [Indexed: 11/08/2022]
Abstract
This study used high-frequency transcranial random noise stimulation (tRNS) to examine how faces filtered to low and high spatial frequencies are processed. Response times were measured in a task in which healthy young adults categorised spatially filtered hybrid faces, presented in foveal and peripheral blocks, while sham or high-frequency random noise stimulation was applied over a lateral occipito-temporal site on the scalp. Both frequentist and Bayesian analyses show that, in contrast to sham, active stimulation significantly reduced response times to peripherally presented low-spatial-frequency information. This finding points to possible plasticity in the targeted regions induced by non-invasive neuromodulation of spatial frequency processing in the rapid perception of faces.
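Hybrid faces of the kind used here are typically built by summing the low-pass content of one face with the high-pass content of another. A generic sketch with Gaussian filtering follows; the cutoff and stimuli are illustrative, not the study's parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_hybrid(face_a, face_b, sigma=6.0):
    """Hybrid stimulus: low spatial frequencies from face_a plus high
    spatial frequencies from face_b (the cutoff sigma is illustrative)."""
    low = gaussian_filter(face_a, sigma)             # LSF content of A
    high = face_b - gaussian_filter(face_b, sigma)   # HSF content of B
    return low + high

rng = np.random.default_rng(8)
face_a = rng.random((128, 128))   # stand-ins for grayscale face images
face_b = rng.random((128, 128))
hybrid = make_hybrid(face_a, face_b)
print(hybrid.shape)
```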
46
Cermeño-Aínsa S. The perception/cognition distinction: Challenging the representational account. Conscious Cogn 2021; 95:103216. [PMID: 34649065 DOI: 10.1016/j.concog.2021.103216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2020] [Revised: 09/22/2021] [Accepted: 09/23/2021] [Indexed: 11/30/2022]
Abstract
A central goal for cognitive science and philosophy of mind is to distinguish between perception and cognition. The representational approach has emerged as a prominent candidate for drawing such a distinction. The idea is that perception and cognition differ in the content and the format in which information is represented: perceptual representations are held to be nonconceptual in content and iconic in format, whereas cognitive representations are conceptual in content and discursive in format. This paper argues against this view. I argue that both perception and cognition can use conceptual and nonconceptual contents and can be carried in iconic and discursive formats. If correct, the representational strategy for distinguishing perception from cognition fails.
Affiliation(s)
- Sergio Cermeño-Aínsa
- Autonomous University of Barcelona, Cognitive Science and Language (CCiL), Edifici B, Campus de la UAB, 08193 Bellaterra (Cerdanyola del Vallès), Spain.
47
Chaisilprungraung T, Park S. "Scene" from inside: The representation of Observer's space in high-level visual cortex. Neuropsychologia 2021; 161:108010. [PMID: 34454940 DOI: 10.1016/j.neuropsychologia.2021.108010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2021] [Revised: 07/30/2021] [Accepted: 08/23/2021] [Indexed: 10/20/2022]
Abstract
Human observers are remarkably adept at perceiving and interacting with visual stimuli around them. Compared to visual stimuli such as objects or faces, scenes are unique in that they provide enclosures for observers: an observer looks at a scene from physically inside it. The current research explored this unique observer-scene relationship by studying the neural representation of scenes' spatial boundaries. Previous studies hypothesized that scenes' boundaries are processed in a set of high-level visual areas. Notably, the parahippocampal place area (PPA) exhibited neural sensitivity to scenes that had closed vs. open spatial boundaries (Kravitz et al., 2011; Park et al., 2011). We asked whether this sensitivity reflected the openness of the landscape (e.g., forest vs. beach) or the openness of the environment immediately surrounding the observer (i.e., whether a scene was viewed from inside vs. outside a room). Across two human fMRI experiments, we found that the PPA, as well as another well-known navigation-processing area, the occipital place area (OPA), processed scenes' boundaries according to the observer's space rather than the landscape. Moreover, we found that the PPA's activation pattern was susceptible to manipulations of mid-level perceptual properties of scenes (e.g., the rectilinear pattern of window frames), while the OPA's response was not. Our results have important implications for research on visual scene processing and suggest an important role of the observer's location in representing spatial boundaries, beyond the low-level visual input of a landscape.
Affiliation(s)
- Soojin Park
- Department of Psychology, Yonsei University, Seoul, South Korea.
48
Hansen BC, Greene MR, Field DJ. Dynamic Electrode-to-Image (DETI) mapping reveals the human brain's spatiotemporal code of visual information. PLoS Comput Biol 2021; 17:e1009456. [PMID: 34570753 PMCID: PMC8496831 DOI: 10.1371/journal.pcbi.1009456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2021] [Revised: 10/07/2021] [Accepted: 09/16/2021] [Indexed: 11/18/2022] Open
Abstract
A number of neuroimaging techniques have been employed to understand how visual information is transformed along the visual pathway. Although each technique has spatial and temporal limitations, each can provide important insights into the visual code. While the BOLD signal of fMRI can be quite informative, the visual code is not static, and its dynamics are obscured by fMRI's poor temporal resolution. In this study, we leveraged the high temporal resolution of EEG to develop an encoding technique based on the distribution of responses generated by a population of real-world scenes. This approach maps neural signals to each pixel within a given image and reveals location-specific transformations of the visual code, providing a spatiotemporal signature for the image at each electrode. Our analyses of the mapping results revealed that scenes undergo a series of nonuniform transformations that prioritize different spatial frequencies at different regions of scenes over time. This mapping technique offers a potential avenue for future studies to explore how dynamic feedforward and recurrent processes inform and refine high-level representations of our visual world.

The visual information that we sample from our environment undergoes a series of neural modifications, with each modification state (or visual code) consisting of a unique distribution of responses across neurons along the visual pathway. However, current noninvasive neuroimaging techniques provide an account of that code that is coarse with respect to time or space. Here, we present dynamic electrode-to-image (DETI) mapping, an analysis technique that capitalizes on the high temporal resolution of EEG to map neural signals to each pixel within a given image, revealing location-specific modifications of the visual code. The DETI technique yields maps of the features associated with the neural signal at each pixel and each time point. DETI mapping shows that real-world scenes undergo a series of nonuniform modifications over both space and time. Specifically, we find that the visual code varies in a location-specific manner, likely reflecting that neural processing prioritizes different features at different image locations over time. DETI mapping therefore offers a potential avenue for future studies to explore how each modification state informs and refines the conceptual meaning of our visual world.
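As a rough illustration of the mapping idea, not the authors' full DETI pipeline, the sketch below relates the EEG amplitude at one electrode and time point to image content at every pixel across a stimulus set; all names, array dimensions, and the simulated data are assumptions.

```python
# Per-pixel correlation between image content and the EEG response at a
# chosen electrode and time point. Illustrative sketch only.
import numpy as np

def electrode_to_image_map(images, eeg, electrode, t):
    """images: (n_images, H, W); eeg: (n_images, n_electrodes, n_times)."""
    amp = eeg[:, electrode, t]                     # one response per image
    X = images.reshape(len(images), -1).astype(float)
    X = (X - X.mean(0)) / (X.std(0) + 1e-9)        # z-score each pixel
    y = (amp - amp.mean()) / (amp.std() + 1e-9)    # z-score the response
    r = X.T @ y / len(y)                           # per-pixel correlation
    return r.reshape(images.shape[1:])             # back into image space

# Random-data example: 200 images of 64 x 64 pixels, 128 electrodes, 100 samples.
rng = np.random.default_rng(0)
images = rng.random((200, 64, 64))
eeg = rng.standard_normal((200, 128, 100))
corr_map = electrode_to_image_map(images, eeg, electrode=70, t=40)
```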
Affiliation(s)
- Bruce C. Hansen
- Colgate University, Department of Psychological & Brain Sciences, Neuroscience Program, Hamilton, New York, United States of America
- Michelle R. Greene
- Bates College, Neuroscience Program, Lewiston, Maine, United States of America
- David J. Field
- Cornell University, Department of Psychology, Ithaca, New York, United States of America
49
Menser T, Baek J, Siahaan J, Kolman JM, Delgado D, Kash B. Validating Visual Stimuli of Nature Images and Identifying the Representative Characteristics. Front Psychol 2021; 12:685815. [PMID: 34566764 PMCID: PMC8460908 DOI: 10.3389/fpsyg.2021.685815] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Accepted: 08/20/2021] [Indexed: 12/22/2022] Open
Abstract
This study fills a void in the literature by both validating images of nature for use in future research experiments and examining which characteristics of these visual stimuli are found to be most representative of nature. We utilized a convenience sample of university students to assess which of 129 nature images best represented nature. Participants (n = 40) viewed one image per question (n = 129) and rated each image on a 5-point Likert scale, with the anchors “best represents nature” (5) and “least represents nature” (1). Average ratings across participants were calculated for each image. Canopies, mountains, bodies of water, and unnatural elements were identified as semantic categories of interest, as were atmospheric perspectives and close-range views. We conducted ordinary least squares (OLS) regression and ordered logistic regression analyses to identify the semantic categories most representative of nature, controlling for the presence or absence of the other semantic categories. The results showed that canopies, bodies of water, and mountains were highly representative of nature, whereas unnatural elements and close-range views were inversely related to representativeness. Understanding the semantic categories most representative of nature is useful for developing nature-centered interventions in behavioral performance research and other neuroimaging modalities. All images are housed in an online repository, and we welcome the use of the final 10 highly representative nature images by other researchers, which will hopefully prompt and expedite future examinations of nature across multiple research formats.
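A minimal sketch of the OLS analysis described above, assuming Python with pandas and statsmodels; the column names and the ratings are fabricated for illustration and merely mimic the direction of the reported effects.

```python
# Regress each image's mean naturalness rating on binary indicators for its
# semantic categories. Data are simulated; only the analysis shape matters.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_images = 129
df = pd.DataFrame({
    "canopy": rng.integers(0, 2, n_images),
    "mountain": rng.integers(0, 2, n_images),
    "water": rng.integers(0, 2, n_images),
    "unnatural": rng.integers(0, 2, n_images),
    "close_range": rng.integers(0, 2, n_images),
})
# Fabricated ratings, directionally consistent with the reported findings.
df["mean_rating"] = (3.0 + 0.5 * df.canopy + 0.4 * df.mountain
                     + 0.4 * df.water - 0.6 * df.unnatural
                     - 0.3 * df.close_range
                     + rng.normal(0, 0.3, n_images)).clip(1, 5)

X = sm.add_constant(df[["canopy", "mountain", "water",
                        "unnatural", "close_range"]])
print(sm.OLS(df["mean_rating"], X).fit().summary())
```

An ordered logistic regression over the raw Likert responses (e.g., via statsmodels' OrderedModel) would use the same design-matrix structure.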
Affiliation(s)
- Terri Menser
- Center for Outcomes Research, Houston Methodist, Houston, TX, United States; Department of Population Health Sciences, Weill Cornell Medical College, New York, NY, United States
- Juha Baek
- Center for Outcomes Research, Houston Methodist, Houston, TX, United States
- Jacob Siahaan
- Center for Outcomes Research, Houston Methodist, Houston, TX, United States
- Jacob M Kolman
- Center for Outcomes Research, Houston Methodist, Houston, TX, United States
- Domenica Delgado
- Center for Innovation, MD Anderson Cancer Center, Houston, TX, United States
- Bita Kash
- Center for Outcomes Research, Houston Methodist, Houston, TX, United States; Department of Health Policy and Management, Texas A&M University, College Station, TX, United States
50
Çelik E, Keles U, Kiremitçi İ, Gallant JL, Çukur T. Cortical networks of dynamic scene category representation in the human brain. Cortex 2021; 143:127-147. [PMID: 34411847 DOI: 10.1016/j.cortex.2021.07.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2020] [Revised: 06/28/2021] [Accepted: 07/14/2021] [Indexed: 10/20/2022]
Abstract
Humans have an impressive ability to rapidly process global information in natural scenes to infer their category. Yet it remains unclear whether and how scene categories observed dynamically in the natural world are represented in cerebral cortex beyond a few canonical scene-selective areas. To address this question, here we examined the representation of dynamic visual scenes by recording whole-brain blood oxygenation level-dependent (BOLD) responses while subjects viewed natural movies. We fit voxelwise encoding models to estimate tuning for scene categories that reflect statistical ensembles of objects and actions in the natural world. We find that this scene-category model explains a significant portion of the response variance broadly across cerebral cortex. Cluster analysis of scene-category tuning profiles across cortex reveals nine spatially segregated networks of brain regions, consistent across subjects. These networks show heterogeneous tuning for a diverse set of dynamic scene categories related to navigation, human activity, social interaction, civilization, natural environment, non-human animals, motion-energy, and texture, suggesting that the organization of scene-category representation is quite complex.
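As a sketch of the general voxelwise encoding workflow, not the authors' exact pipeline, the code below assumes Python with scikit-learn and simulated data; the feature count, ridge penalty, and the nine-cluster choice (matching the reported number of networks) are illustrative.

```python
# Fit a ridge encoding model from scene-category features to each voxel's
# BOLD response, then cluster the fitted tuning profiles. Simulated data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_trs, n_categories, n_voxels = 1000, 20, 500
X = rng.random((n_trs, n_categories))                    # features per TR
true_w = rng.standard_normal((n_categories, n_voxels))
Y = X @ true_w + rng.standard_normal((n_trs, n_voxels))  # simulated BOLD

# scikit-learn fits all voxels jointly via multi-output regression.
tuning = Ridge(alpha=10.0).fit(X, Y).coef_               # (n_voxels, n_categories)

# Group voxels into networks by the similarity of their tuning profiles.
labels = KMeans(n_clusters=9, n_init=10, random_state=0).fit_predict(tuning)
```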
Affiliation(s)
- Emin Çelik
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey.
- Umit Keles
- National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey; Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
- İbrahim Kiremitçi
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey
- Jack L Gallant
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA; Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA; Department of Psychology, University of California, Berkeley, CA, USA
- Tolga Çukur
- Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey; Department of Electrical and Electronics Engineering, Bilkent University, Ankara, Turkey