1
Walper D, Bendixen A, Grimm S, Schubö A, Einhäuser W. Attention deployment in natural scenes: Higher-order scene statistics rather than semantics modulate the N2pc component. J Vis 2024; 24:7. PMID: 38848099; PMCID: PMC11166226; DOI: 10.1167/jov.24.6.7.
Abstract
Which properties of a natural scene affect visual search? We consider the alternative hypotheses that low-level statistics, higher-level statistics, semantics, or layout affect search difficulty in natural scenes. Across three experiments (n = 20 each), we used four different backgrounds that preserve distinct scene properties: (a) natural scenes (all experiments); (b) 1/f noise (pink noise, which preserves only low-level statistics and was used in Experiments 1 and 2); (c) textures that preserve low-level and higher-level statistics but not semantics or layout (Experiments 2 and 3); and (d) inverted (upside-down) scenes that preserve statistics and semantics but not layout (Experiment 2). We included "split scenes" that contained different backgrounds left and right of the midline (Experiment 1, natural/noise; Experiment 3, natural/texture). Participants searched for a Gabor patch that occurred at one of six locations (all experiments). Reaction times were faster for targets on noise and slower on inverted images, compared to natural scenes and textures. The N2pc component of the event-related potential, a marker of attentional selection, had a shorter latency and a higher amplitude for targets in noise than for all other backgrounds. The background contralateral to the target had an effect similar to that on the target side: noise led to faster reactions and shorter N2pc latencies than natural scenes, although we observed no difference in N2pc amplitude. There were no interactions between the target side and the non-target side. Together, this shows that, at least when searching for simple targets without semantic content of their own, natural scenes are more effective distractors than noise, and that this effect results from higher-order statistics rather than from semantics or layout.
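For readers who want to construct comparable backgrounds, below is a minimal sketch (Python/numpy; illustrative, not the authors' stimulus code) of two low-level controls mentioned above: a 1/f (pink) noise image, and a phase-scrambled scene that keeps only the scene's own amplitude statistics while destroying semantics and layout.

```python
import numpy as np

rng = np.random.default_rng(0)

def pink_noise(shape, exponent=1.0):
    """Noise whose amplitude spectrum falls off as 1/f**exponent (pink noise for exponent=1)."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    f = np.sqrt(fx**2 + fy**2)
    f[0, 0] = 1.0                       # avoid division by zero at the DC component
    amplitude = 1.0 / f**exponent
    phase = rng.uniform(-np.pi, np.pi, shape)
    img = np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))
    return (img - img.min()) / (img.max() - img.min())   # rescale to [0, 1]

def phase_scramble(image):
    """Keep the image's own amplitude spectrum, randomize its phase."""
    spectrum = np.fft.fft2(image)
    random_phase = np.exp(1j * rng.uniform(-np.pi, np.pi, image.shape))
    return np.real(np.fft.ifft2(np.abs(spectrum) * random_phase))

scene = rng.random((256, 256))          # stand-in for a grayscale natural scene
noise_bg = pink_noise(scene.shape)
scrambled_bg = phase_scramble(scene)
```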
Affiliation(s)
- Daniel Walper: Physics of Cognition Group, Chemnitz University of Technology, Chemnitz, Germany
- Alexandra Bendixen: Cognitive Systems Lab, Chemnitz University of Technology, Chemnitz, Germany (https://www.tu-chemnitz.de/physik/SFKS/index.html.en)
- Sabine Grimm: Physics of Cognition Group and Cognitive Systems Lab, Chemnitz University of Technology, Chemnitz, Germany
- Anna Schubö: Cognitive Neuroscience of Perception & Action, Philipps University Marburg, Marburg, Germany (https://www.uni-marburg.de/en/fb04/team-schuboe)
- Wolfgang Einhäuser: Physics of Cognition Group, Chemnitz University of Technology, Chemnitz, Germany (https://www.tu-chemnitz.de/physik/PHKP/index.html.en)
2
Margolles P, Elosegi P, Mei N, Soto D. Unconscious Manipulation of Conceptual Representations with Decoded Neurofeedback Impacts Search Behavior. J Neurosci 2024; 44:e1235232023. PMID: 37985180; PMCID: PMC10866193; DOI: 10.1523/jneurosci.1235-23.2023.
Abstract
The necessity of conscious awareness in human learning has been a long-standing topic in psychology and neuroscience. Previous research on non-conscious associative learning is limited by the low signal-to-noise ratio of the subliminal stimulus, and the evidence remains controversial, including failures to replicate. Using functional MRI decoded neurofeedback (DecNef), we guided participants from both sexes to generate neural patterns akin to those observed when visually perceiving real-world entities (e.g., dogs). Importantly, participants remained unaware of the actual content represented by these patterns. We utilized an associative DecNef approach to imbue perceptual meaning (e.g., dogs) into Japanese hiragana characters that held no inherent meaning for our participants, bypassing any conscious link between the characters and the concept of dogs. Despite their lack of awareness regarding the neurofeedback objective, participants successfully learned to activate the target perceptual representations in the bilateral fusiform. The behavioral significance of our training was evaluated in a visual search task. DecNef and control participants searched for dog or scissors targets that were pre-cued by the hiragana used during DecNef training or by a control hiragana. The DecNef hiragana did not prime search for its associated target but, strikingly, participants were impaired at searching for the targeted perceptual category. Hence, conscious awareness may function to support higher-order associative learning. Meanwhile, lower-level forms of re-learning, modification, or plasticity in existing neural representations can occur unconsciously, with behavioral consequences outside the original training context. The work also provides an account of DecNef effects in terms of neural representational drift.
Affiliation(s)
- Pedro Margolles: Basque Center on Cognition, Brain and Language (BCBL), Donostia - San Sebastián, Gipuzkoa 20009, Spain; Universidad del País Vasco/Euskal Herriko Unibertsitatea (UPV/EHU), Leioa, Bizkaia 48940, Spain
- Patxi Elosegi: Basque Center on Cognition, Brain and Language (BCBL), Donostia - San Sebastián, Gipuzkoa 20009, Spain; Universidad del País Vasco/Euskal Herriko Unibertsitatea (UPV/EHU), Leioa, Bizkaia 48940, Spain
- Ning Mei: Basque Center on Cognition, Brain and Language (BCBL), Donostia - San Sebastián, Gipuzkoa 20009, Spain
- David Soto: Basque Center on Cognition, Brain and Language (BCBL), Donostia - San Sebastián, Gipuzkoa 20009, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Bizkaia 48009, Spain
3
Wang L, Chen S, Liu L, Yin X, Shi G, Mo J. Axial super-resolution optical coherence tomography via complex-valued network. Phys Med Biol 2023; 68:235016. PMID: 37922558; DOI: 10.1088/1361-6560/ad0997.
Abstract
Optical coherence tomography (OCT) is a fast and non-invasive optical interferometric imaging technique that can provide high-resolution cross-sectional images of biological tissues. OCT's key strength is its depth-resolving capability, which remains invariant along the imaging depth and is determined by the axial resolution. The axial resolution is inversely proportional to the bandwidth of the OCT light source. Thus, the use of broadband light sources can effectively improve the axial resolution, but it also increases cost. In recent years, real-valued deep learning techniques have been introduced to obtain super-resolution optical imaging. In this study, we proposed a complex-valued super-resolution network (CVSR-Net) to achieve axial super-resolution for OCT by fully utilizing the amplitude and phase of the OCT signal. The method was evaluated on three OCT datasets. The results show that the CVSR-Net outperforms its real-valued counterpart with a better depth-resolving capability. Furthermore, comparisons were made between our network, six prevailing real-valued networks and their complex-valued counterparts. The results demonstrate that the complex-valued networks exhibited better super-resolution performance than their real-valued counterparts and that our proposed CVSR-Net achieved the best performance. In addition, the CVSR-Net was tested on out-of-distribution domain datasets and its super-resolution performance was well maintained as compared to that on source domain datasets, indicating a good generalization capability.
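As a numerical illustration of the bandwidth/resolution trade-off mentioned above, here is a small sketch assuming a Gaussian source spectrum (the standard relation dz = (2 ln 2 / pi) * lambda0^2 / dlambda); the wavelength and bandwidth values are arbitrary examples, not taken from the paper.

```python
import numpy as np

def oct_axial_resolution(center_wavelength_nm, bandwidth_nm):
    """Axial resolution of OCT for a Gaussian source: dz = (2*ln2/pi) * lambda0**2 / dlambda."""
    return (2 * np.log(2) / np.pi) * center_wavelength_nm**2 / bandwidth_nm

# Doubling the source bandwidth halves the axial resolution value (finer depth resolution).
for bw in (50, 100, 200):                          # bandwidth in nm (example values)
    dz_um = oct_axial_resolution(840.0, bw) * 1e-3  # convert nm to micrometers
    print(f"bandwidth {bw:>3d} nm -> axial resolution {dz_um:.2f} um")
```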
Affiliation(s)
- Lingyun Wang: School of Electronics and Information Engineering, Soochow University, Suzhou, People's Republic of China
- Si Chen: School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
- Linbo Liu: School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
- Xue Yin: The First Affiliated Hospital of Soochow University, Suzhou, People's Republic of China
- Guohua Shi: Jiangsu Key Laboratory of Medical Optics, Suzhou Institute of Biomedical Engineering and Technology, Suzhou, People's Republic of China
- Jianhua Mo: School of Electronics and Information Engineering, Soochow University, Suzhou, People's Republic of China
4
Adriano A, Girelli L, Rinaldi L. The ratio effect in visual numerosity comparisons is preserved despite spatial frequency equalisation. Vision Res 2021; 183:41-52. PMID: 33676137; DOI: 10.1016/j.visres.2021.01.011.
Abstract
How non-symbolic numerosity is visually extracted remains a matter of intense debate. Most evidence suggests that numerosity is directly extracted from individual objects, following Weber's law, at least for a moderate numerical range. Alternative accounts propose that, whatever the range, numerosity is indirectly derived from summary texture statistics of the raw image such as spatial frequency (SF). Here, to disentangle these accounts, we tested whether the well-known behavioural signature of numerosity encoding (the ratio effect) is preserved despite the equalisation of the SF content. In Experiment 1, participants had to select the numerically larger of two briefly presented moderate-range numerical sets (i.e., 8-18 dots) carefully matched for SF; the ratio between numerosities was manipulated by levels of increasing difficulty (e.g., 0.66, 0.75, 0.8). In Experiment 2, participants performed the same task, but they were presented with both the original and SF-equalised stimuli. In both experiments, the results clearly showed a ratio dependence of performance: numerosity discrimination became harder and slower as the ratio between numerosities increased. Moreover, this effect was found to be independent of the stimulus type, although the overall performance was better with the original rather than the SF-equalised stimuli (Experiment 2). Taken together, these findings indicate that the power spectrum per se cannot explain the main behavioural signature of Weber-like encoding of numerosities (the ratio effect), at least over the tested numerical range, partially challenging alternative indirect accounts of numerosity processing.
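A common way to equalise SF content across a stimulus set is to give every image the set-average amplitude spectrum while keeping each image's own phase. A minimal sketch under that assumption (numpy; not necessarily the authors' exact procedure):

```python
import numpy as np

def equalise_amplitude_spectra(images):
    """Replace each image's amplitude spectrum with the set average, keeping its own phase."""
    spectra = [np.fft.fft2(img) for img in images]
    mean_amplitude = np.mean([np.abs(s) for s in spectra], axis=0)
    equalised = []
    for s in spectra:
        phase = np.angle(s)
        equalised.append(np.real(np.fft.ifft2(mean_amplitude * np.exp(1j * phase))))
    return equalised

rng = np.random.default_rng(1)
dot_arrays = [(rng.random((128, 128)) > 0.98).astype(float) for _ in range(10)]  # toy dot stimuli
matched = equalise_amplitude_spectra(dot_arrays)
```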
Affiliation(s)
- Andrea Adriano: Department of Psychology, University of Milano-Bicocca, Italy
- Luisa Girelli: Department of Psychology, University of Milano-Bicocca, Italy; NeuroMI, Milan Center for Neuroscience, Milano, Italy
- Luca Rinaldi: Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
5
Hunt C, Meinhardt G. Synergy of spatial frequency and orientation bandwidth in texture segregation. J Vis 2021; 21:5. PMID: 33560290; PMCID: PMC7873498; DOI: 10.1167/jov.21.2.5.
Abstract
Defining target textures by increased bandwidths in spatial frequency and orientation, we observed strong cue combination effects in a combined texture figure detection and discrimination task. Performance for double-cue targets was better than predicted by independent processing of either cue and even better than predicted from linear cue integration. Application of a texture-processing model revealed that the oversummative cue combination effect is captured by calculating a low-level summary statistic (ΔCE_m), which describes the differential contrast energy to target and reference textures, from multiple scales and orientations, and integrating this statistic across channels with a winner-take-all rule. Modeling detection performance using a signal detection theory framework showed that the observers' sensitivity to single-cue and double-cue texture targets, measured in d′ units, could be reproduced with plausible settings for filter and noise parameters. These results challenge models assuming separate channeling of elementary features and their later integration, since oversummative cue combination effects appear as an inherent property of local energy mechanisms, at least for spatial frequency and orientation bandwidth-modulated textures.
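For reference, sensitivity in d′ units as used above is computed from hit and false-alarm rates; a short sketch (numpy/scipy, with illustrative trial counts):

```python
import numpy as np
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate), with a 1/(2N) correction for extreme rates."""
    n_signal = hits + misses
    n_noise = false_alarms + correct_rejections
    hit_rate = np.clip(hits / n_signal, 1 / (2 * n_signal), 1 - 1 / (2 * n_signal))
    fa_rate = np.clip(false_alarms / n_noise, 1 / (2 * n_noise), 1 - 1 / (2 * n_noise))
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

print(d_prime(hits=45, misses=5, false_alarms=10, correct_rejections=40))  # about 2.1
```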
Affiliation(s)
- Cordula Hunt: Department of Psychology, Methods Section, Johannes Gutenberg-Universität, Mainz, Germany
- Günter Meinhardt: Department of Psychology, Methods Section, Johannes Gutenberg-Universität, Mainz, Germany
6
Codispoti M, Micucci A, De Cesarei A. Time will tell: Object categorization and emotional engagement during processing of degraded natural scenes. Psychophysiology 2020; 58:e13704. PMID: 33090526; DOI: 10.1111/psyp.13704.
Abstract
The aim of the present study was to examine the relationship between object categorization in natural scenes and the engagement of cortico-limbic appetitive and defensive systems (emotional engagement) by manipulating both the bottom-up information and the top-down context. Concerning the bottom-up information, we manipulated the computational load by scrambling the phase of the spatial frequency spectrum, and asked participants to classify natural scenes as containing an animal or a person. The role of the top-down context was assessed by comparing an incremental condition, in which pictures were progressively revealed, to a condition in which no probabilistic relationship existed between each stimulus and the following one. In two experiments, the categorization and response to emotional and neutral scenes were similarly modulated by the computational load. The Late Positive Potential (LPP) was affected by the emotional content of the scenes, and by categorization accuracy. When the phase of the spatial frequency spectrum was scrambled by a large amount (>58%), chance categorization resulted, and affective LPP modulation was eliminated. With less degraded scenes, categorization accuracy was higher (.82 in Experiment 1, .86 in Experiment 2) and affective modulation of the LPP was observed at a late window (>800 ms), indicating that it is possible to delay the time of engagement of the motivational systems which are responsible for the LPP affective modulation. The present data strongly support the view that semantic analysis of visual scenes, operationalized here as object categorization, is a necessary condition for emotional engagement at the electrocortical level (LPP).
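The degradation used above can be approximated by mixing each image's original phase spectrum with random phase in a given proportion while leaving the amplitude spectrum untouched. A minimal sketch (numpy; the circular interpolation scheme is an assumption, not the authors' exact algorithm):

```python
import numpy as np

def partial_phase_scramble(image, scramble_fraction, rng=None):
    """Interpolate between the original and a random phase spectrum; amplitude is untouched."""
    rng = np.random.default_rng(0) if rng is None else rng
    spectrum = np.fft.fft2(image)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    random_phase = rng.uniform(-np.pi, np.pi, image.shape)
    # circular interpolation between original and random phase angles
    mixed = np.angle((1 - scramble_fraction) * np.exp(1j * phase)
                     + scramble_fraction * np.exp(1j * random_phase))
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * mixed)))

img = np.random.default_rng(1).random((128, 128))     # stand-in for a grayscale scene
mildly_degraded = partial_phase_scramble(img, 0.30)
heavily_degraded = partial_phase_scramble(img, 0.70)  # beyond the ~58% level at which categorization fell to chance
```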
Affiliation(s)
- Antonia Micucci: Department of Psychology, University of Bologna, Bologna, Italy
7
Pipitone RN, DiMattina C. Object Clusters or Spectral Energy? Assessing the Relative Contributions of Image Phase and Amplitude Spectra to Trypophobia. Front Psychol 2020; 11:1847. PMID: 32793086; PMCID: PMC7393229; DOI: 10.3389/fpsyg.2020.01847.
Abstract
Trypophobia refers to the visual discomfort experienced by some people when viewing clustered patterns (e.g., clusters of holes). Trypophobic images deviate from the 1/f amplitude spectra typically characterizing natural images by containing excess energy at mid-range spatial frequencies. While recent work provides partial support for the idea of excess mid-range spatial frequency energy causing visual discomfort when viewing trypophobic images, a full factorial manipulation of image phase and amplitude spectra has yet to be conducted in order to determine whether the phase spectrum (sinusoidal waveform patterns that comprise image details like edge and texture elements) also plays a role in trypophobic discomfort. Here, we independently manipulated the phase and amplitude spectra of 31 Trypophobic images using a standard Fast Fourier Transform (FFT). Participants rated the four different versions of each image for levels of visual comfort, and completed the Trypophobia Questionnaire (TQ). Images having the original phase spectra intact (with either original or 1/f amplitude) explained the most variance in comfort ratings and were rated lowest in comfort. However, images with the original amplitude spectra but scrambled phase spectra were rated higher in comfort, with a smaller amount of variance in comfort attributed to the amplitude spectrum. Participant TQ scores correlated with comfort ratings only for images having the original phase spectra intact. There was no correlation between TQ scores and comfort levels when participants viewed the original amplitude / phase-scrambled images. Taken together, the present findings show that the phase spectrum of trypophobic images, which determines the pattern of small clusters of objects, plays a much larger role than the amplitude spectrum in determining visual discomfort.
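The factorial manipulation described above comes down to recombining amplitude and phase spectra from different sources via the FFT. A minimal sketch of such a recombination (numpy; illustrative, not the authors' stimulus pipeline):

```python
import numpy as np

def recombine(amplitude_source, phase_source):
    """Build an image from the amplitude spectrum of one image and the phase spectrum of another."""
    amplitude = np.abs(np.fft.fft2(amplitude_source))
    phase = np.angle(np.fft.fft2(phase_source))
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))

rng = np.random.default_rng(0)
original = rng.random((256, 256))   # stand-in for a trypophobic image
other = rng.random((256, 256))      # stand-in for a phase- or amplitude-donor image

orig_amp_orig_phase = recombine(original, original)  # reconstruction of the original
orig_amp_swap_phase = recombine(original, other)     # original amplitude, foreign phase
swap_amp_orig_phase = recombine(other, original)     # foreign amplitude, original phase
```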
Affiliation(s)
- R Nathan Pipitone: Department of Psychology, Florida Gulf Coast University, Fort Myers, FL, United States
- Christopher DiMattina: Department of Psychology, Florida Gulf Coast University, Fort Myers, FL, United States
8
Rhodes LJ, Ríos M, Williams J, Quiñones G, Rao PK, Miskovic V. The role of low-level image features in the affective categorization of rapidly presented scenes. PLoS One 2019; 14:e0215975. PMID: 31042739; PMCID: PMC6494199; DOI: 10.1371/journal.pone.0215975.
Abstract
It remains unclear how the visual system is able to extract affective content from complex scenes even with extremely brief (< 100 millisecond) exposures. One possibility, suggested by findings in machine vision, is that low-level features such as unlocalized, two-dimensional (2-D) Fourier spectra can be diagnostic of scene content. To determine whether Fourier image amplitude carries any information about the affective quality of scenes, we first validated the existence of image category differences through a support vector machine (SVM) model that was able to discriminate our intact aversive and neutral images with ~ 70% accuracy using amplitude-only features as inputs. This model allowed us to confirm that scenes belonging to different affective categories could be mathematically distinguished on the basis of amplitude spectra alone. The next question is whether these same features are also exploited by the human visual system. Subsequently, we tested observers' rapid classification of affective and neutral naturalistic scenes, presented briefly (~33.3 ms) and backward masked with synthetic textures. We tested categorization accuracy across three distinct experimental conditions, using: (i) original images, (ii) images having their amplitude spectra swapped within a single affective image category (e.g., an aversive image whose amplitude spectrum has been swapped with another aversive image) or (iii) images having their amplitude spectra swapped between affective categories (e.g., an aversive image containing the amplitude spectrum of a neutral image). Despite its discriminative potential, the human visual system does not seem to use Fourier amplitude differences as the chief strategy for affectively categorizing scenes at a glance. The contribution of image amplitude to affective categorization is largely dependent on interactions with the phase spectrum, although it is impossible to completely rule out a residual role for unlocalized 2-D amplitude measures.
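A minimal sketch of the amplitude-only classification idea (scikit-learn, on random stand-in images and labels; the radially averaged amplitude spectrum used as the feature vector is an assumption, not necessarily the authors' exact feature set):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def radial_amplitude_profile(image, n_bins=32):
    """Radially average the 2-D Fourier amplitude spectrum into a 1-D feature vector."""
    amp = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    cy, cx = np.array(image.shape) // 2
    y, x = np.indices(image.shape)
    r = np.sqrt((y - cy) ** 2 + (x - cx) ** 2)
    bins = np.linspace(0, r.max(), n_bins + 1)
    return np.array([amp[(r >= lo) & (r < hi)].mean() for lo, hi in zip(bins[:-1], bins[1:])])

rng = np.random.default_rng(0)
images = rng.random((60, 64, 64))       # stand-ins for aversive/neutral scenes
labels = rng.integers(0, 2, size=60)    # stand-in category labels
X = np.array([radial_amplitude_profile(im) for im in images])

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
print(cross_val_score(clf, X, labels, cv=5).mean())   # near 0.5 here, since labels are random by construction
```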
Affiliation(s)
- L. Jack Rhodes: Department of Psychology, State University of New York at Binghamton, Binghamton, New York, United States of America
- Matthew Ríos: Department of Psychology, State University of New York at Binghamton, Binghamton, New York, United States of America
- Jacob Williams: Computer Science and Engineering, University of Nebraska, Lincoln, Nebraska, United States of America
- Gonzalo Quiñones: Department of Psychology, State University of New York at Binghamton, Binghamton, New York, United States of America
- Prahalada K. Rao: Mechanical and Materials Engineering, University of Nebraska, Lincoln, Nebraska, United States of America
- Vladimir Miskovic: Department of Psychology, State University of New York at Binghamton, Binghamton, New York, United States of America
9
Fruend I, Stalker E. Human sensitivity to perturbations constrained by a model of the natural image manifold. J Vis 2018; 18:20. PMID: 30383190; DOI: 10.1167/18.11.20.
Abstract
Humans are remarkably well tuned to the statistical properties of natural images. However, quantitative characterization of processing within the domain of natural images has been difficult because most parametric manipulations of a natural image make that image appear less natural. We used generative adversarial networks (GANs) to constrain parametric manipulations to remain within an approximation of the manifold of natural images. In the first experiment, seven observers decided which one of two synthetic perturbed images matched a synthetic unperturbed comparison image. Observers were significantly more sensitive to perturbations that were constrained to an approximate manifold of natural images than they were to perturbations applied directly in pixel space. Trial-by-trial errors were consistent with the idea that these perturbations disrupt configural aspects of visual structure used in image segmentation. In a second experiment, five observers discriminated paths along the image manifold as recovered by the GAN. Observers were remarkably good at this task, confirming that observers are tuned to fairly detailed properties of an approximate manifold of natural images. We conclude that human tuning to natural images is more general than detecting deviations from natural appearance, and that humans have, to some extent, access to detailed interrelations between natural images.
Affiliation(s)
- Ingo Fruend: Centre for Vision Research and Department of Psychology, York University, Toronto, Ontario, Canada
- Elee Stalker: Department of Psychology, York University, Toronto, Ontario, Canada
10
Fournier J, Müller CM, Schneider I, Laurent G. Spatial Information in a Non-retinotopic Visual Cortex. Neuron 2018; 97:164-180.e7. DOI: 10.1016/j.neuron.2017.11.017.
11
Grootswagers T, Ritchie JB, Wardle SG, Heathcote A, Carlson TA. Asymmetric Compression of Representational Space for Object Animacy Categorization under Degraded Viewing Conditions. J Cogn Neurosci 2017; 29:1995-2010. DOI: 10.1162/jocn_a_01177.
Abstract
Animacy is a robust organizing principle among object category representations in the human brain. Using multivariate pattern analysis methods, it has been shown that distance to the decision boundary of a classifier trained to discriminate neural activation patterns for animate and inanimate objects correlates with observer RTs for the same animacy categorization task [Ritchie, J. B., Tovar, D. A., & Carlson, T. A. Emerging object representations in the visual system predict reaction times for categorization. PLoS Computational Biology, 11, e1004316, 2015; Carlson, T. A., Ritchie, J. B., Kriegeskorte, N., Durvasula, S., & Ma, J. Reaction time for object categorization is predicted by representational distance. Journal of Cognitive Neuroscience, 26, 132–142, 2014]. Using MEG decoding, we tested if the same relationship holds when a stimulus manipulation (degradation) increases task difficulty, which we predicted would systematically decrease the distance of activation patterns from the decision boundary and increase RTs. In addition, we tested whether distance to the classifier boundary correlates with drift rates in the linear ballistic accumulator [Brown, S. D., & Heathcote, A. The simplest complete model of choice response time: Linear ballistic accumulation. Cognitive Psychology, 57, 153–178, 2008]. We found that distance to the classifier boundary correlated with RT, accuracy, and drift rates in an animacy categorization task. Split by animacy, the correlations between brain and behavior were sustained longer over the time course for animate than for inanimate stimuli. Interestingly, when examining the distance to the classifier boundary during the peak correlation between brain and behavior, we found that only degraded versions of animate, but not inanimate, objects had systematically shifted toward the classifier decision boundary as predicted. Our results support an asymmetry in the representation of animate and inanimate object categories in the human brain.
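The brain-behavior link described above can be sketched as: train a linear classifier on activation patterns, take each trial's distance from the decision boundary, and correlate it with reaction time. A toy version on simulated data (scikit-learn/scipy; not the authors' MEG pipeline):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_trials, n_sensors = 200, 30

# simulated sensor patterns for animate (1) versus inanimate (0) trials
labels = rng.integers(0, 2, n_trials)
patterns = rng.normal(size=(n_trials, n_sensors)) + labels[:, None] * 0.8

clf = LinearSVC(dual=False).fit(patterns, labels)
distance = clf.decision_function(patterns)        # signed distance from the decision boundary

# simulate RTs that shorten as the pattern moves away from the boundary
rt = 600 - 40 * np.abs(distance) + rng.normal(0, 30, n_trials)

r, p = pearsonr(np.abs(distance), rt)
print(f"correlation between |distance| and RT: r = {r:.2f}, p = {p:.3g}")
```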
Affiliation(s)
- Tijl Grootswagers: Macquarie University, Australia; ARC Centre of Excellence in Cognition and Its Disorders, Australia; University of Sydney
- Susan G. Wardle: Macquarie University, Australia; ARC Centre of Excellence in Cognition and Its Disorders, Australia
- Thomas A. Carlson: ARC Centre of Excellence in Cognition and Its Disorders, Australia; University of Sydney
12
Ratan Murty NA, Arun SP. Effect of silhouetting and inversion on view invariance in the monkey inferotemporal cortex. J Neurophysiol 2017; 118:353-362. PMID: 28381484; PMCID: PMC5501916; DOI: 10.1152/jn.00008.2017.
Abstract
We effortlessly recognize objects across changes in viewpoint, but we know relatively little about the features that underlie viewpoint invariance in the brain. Here, we set out to characterize how viewpoint invariance in monkey inferior temporal (IT) neurons is influenced by two image manipulations: silhouetting and inversion. Reducing an object into its silhouette removes internal detail, so this would reveal how much viewpoint invariance depends on the external contours. Inverting an object retains but rearranges features, so this would reveal how much viewpoint invariance depends on the arrangement and orientation of features. Our main findings are 1) view invariance is weakened by silhouetting but not by inversion; 2) view invariance was stronger in neurons that generalized across silhouetting and inversion; 3) neuronal responses to natural objects matched early with that of silhouettes and only later to that of inverted objects, indicative of coarse-to-fine processing; and 4) the impact of silhouetting and inversion depended on object structure. Taken together, our results elucidate the underlying features and dynamics of view-invariant object representations in the brain. NEW & NOTEWORTHY: We easily recognize objects across changes in viewpoint, but the underlying features are unknown. Here, we show that view invariance in the monkey inferotemporal cortex is driven mainly by external object contours and is not specialized for object orientation. We also find that the responses to natural objects match with that of their silhouettes early in the response, and with inverted versions later in the response, indicative of a coarse-to-fine processing sequence in the brain.
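The two image manipulations are straightforward to reproduce: silhouetting reduces an object image to a filled binary shape, and inversion flips it in the picture plane. A minimal sketch (numpy; the threshold-based silhouetting is an assumption about how silhouettes can be made, not the authors' exact method):

```python
import numpy as np

def silhouette(image, background_value=1.0):
    """Binary silhouette: every pixel that differs from the background becomes black."""
    mask = ~np.isclose(image, background_value)
    return np.where(mask, 0.0, 1.0)

def invert(image):
    """Upside-down (picture-plane inverted) version of the image."""
    return np.flipud(image)

rng = np.random.default_rng(0)
obj = np.ones((128, 128))                  # white background
obj[40:90, 30:100] = rng.random((50, 70))  # textured "object" region
sil = silhouette(obj)
inv = invert(obj)
```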
Affiliation(s)
- S P Arun: Centre for Neuroscience, Indian Institute of Science, Bangalore, India
13
Abstract
While most typically developing (TD) participants have a coarse-to-fine processing style, people with autism spectrum disorder (ASD) seem to be less globally and more locally biased when processing visual information. The stimulus-specific spatial frequency content might be directly relevant to determine this temporal hierarchy of visual information processing in people with and without ASD. We implemented a semantic priming task in which (in)congruent coarse and/or fine spatial information preceded target categorization. Our results indicated that adolescents with ASD made more categorization errors than TD adolescents and needed more time to process the prime stimuli. Simultaneously, however, our findings argued for a processing advantage in ASD, when the prime stimulus contains detailed spatial information and presentation time permits explicit visual processing.
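Coarse and fine prime content of this kind is typically produced by low-pass and high-pass spatial frequency filtering. A minimal sketch with a Gaussian transfer function in the Fourier domain (numpy; cutoff values are arbitrary examples):

```python
import numpy as np

def sf_filter(image, cutoff_cycles, mode="low"):
    """Gaussian low-pass or high-pass spatial frequency filter; cutoff is in cycles per image."""
    fy = np.fft.fftfreq(image.shape[0]) * image.shape[0]
    fx = np.fft.fftfreq(image.shape[1]) * image.shape[1]
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)   # frequency in cycles per image
    lowpass = np.exp(-(radius ** 2) / (2 * cutoff_cycles ** 2))
    transfer = lowpass if mode == "low" else 1 - lowpass
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

img = np.random.default_rng(0).random((256, 256))            # stand-in for a prime image
coarse_prime = sf_filter(img, cutoff_cycles=8, mode="low")    # global, low-SF content
fine_prime = sf_filter(img, cutoff_cycles=24, mode="high")    # detailed, high-SF content
```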
14
Farivar R, Clavagnier S, Hansen BC, Thompson B, Hess RF. Non-uniform phase sensitivity in spatial frequency maps of the human visual cortex. J Physiol 2017; 595:1351-1363. PMID: 27748961; DOI: 10.1113/jp273206.
Abstract
KEY POINTS: Just as a portrait painting can come from a collection of coarse and fine details, natural vision can be decomposed into coarse and fine components. Previous studies have shown that the early visual areas in the brain represent these components in a map-like fashion. Other studies have shown that these same visual areas can be sensitive to how coarse and fine features line up in space. We found that the brain actually jointly represents both the scale of the feature (fine, medium, or coarse) and the alignment of these features in space. The results suggest that the visual cortex has an optimized representation particularly for the alignment of fine details, which are crucial in understanding the visual scene.
ABSTRACT: Complex natural scenes can be decomposed into their oriented spatial frequency (SF) and phase relationships, both of which are represented locally at the earliest stages of cortical visual processing. The SF preference map in the human cortex, obtained using synthetic stimuli, is orderly and correlates strongly with eccentricity. In addition, early visual areas show sensitivity to the phase information that describes the relationship between SFs and thereby dictates the structure of the image. Taken together, two possibilities arise for the joint representation of SF and phase: either the entirety of the cortical SF map is uniformly sensitive to phase, or a particular set of SFs is selectively phase sensitive, for example, greater phase sensitivity for higher SFs that define fine-scale edges in a complex scene. To test between these two possibilities, we constructed a novel continuous natural scene video whereby phase information was maintained in one SF band but scrambled elsewhere. By shifting the central frequency of the phase-aligned band in time, we mapped the phase-sensitive SF preference of the visual cortex. Using functional magnetic resonance imaging, we found that phase sensitivity in early visual areas is biased toward higher SFs. Compared to a SF map of the same scene obtained using linear-filtered stimuli, a much larger patch of areas V1 and V2 is sensitive to the phase alignment of higher SFs. The results in early areas cannot be explained by attention. Our results suggest non-uniform sensitivity to phase alignment in population-level SF representations, with phase alignment being particularly important for fine-scale edge representations of natural scenes.
Affiliation(s)
- Reza Farivar: McGill Vision Research, McGill University, Quebec, Canada
- Bruce C Hansen: Department of Psychology, Neuroscience Program, Colgate University, NY, USA
- Ben Thompson: Department of Optometry and Visual Science, University of Waterloo, Ontario, Canada
- Robert F Hess: McGill Vision Research, McGill University, Quebec, Canada
15
Stimulus-Driven Population Activity Patterns in Macaque Primary Visual Cortex. PLoS Comput Biol 2016; 12:e1005185. PMID: 27935935; PMCID: PMC5147778; DOI: 10.1371/journal.pcbi.1005185.
Abstract
Dimensionality reduction has been applied in various brain areas to study the activity of populations of neurons. To interpret the outputs of dimensionality reduction, it is important to first understand its outputs for brain areas for which the relationship between the stimulus and neural response is well characterized. Here, we applied principal component analysis (PCA) to trial-averaged neural responses in macaque primary visual cortex (V1) to study two fundamental, population-level questions. First, we characterized how neural complexity relates to stimulus complexity, where complexity is measured using relative comparisons of dimensionality. Second, we assessed the extent to which responses to different stimuli occupy similar dimensions of the population activity space using a novel statistical method. For comparison, we performed the same dimensionality reduction analyses on the activity of a recently-proposed V1 receptive field model and a deep convolutional neural network. Our results show that the dimensionality of the population response changes systematically with alterations in the properties and complexity of the visual stimulus. A central goal in systems neuroscience is to understand how large populations of neurons work together to enable us to sense, to reason, and to act. To go beyond single-neuron and pairwise analyses, recent studies have applied dimensionality reduction methods to neural population activity to reveal tantalizing evidence of neural mechanisms underlying a wide range of brain functions. To aid in interpreting the outputs of dimensionality reduction, it is important to vary the inputs to a brain area and ask whether the outputs of dimensionality reduction change in a sensible manner, which has not yet been shown. In this study, we recorded the activity of tens of neurons in the primary visual cortex (V1) of macaque monkeys while presenting different visual stimuli. We found that the dimensionality of the population activity grows with stimulus complexity, and that the population responses to different stimuli occupy similar dimensions of the population firing rate space, in accordance with the visual stimuli themselves. For comparison, we applied the same analysis methods to the activity of a recently-proposed V1 receptive field model and a deep convolutional neural network. Overall, we found dimensionality reduction to yield interpretable results, providing encouragement for the use of dimensionality reduction in other brain areas.
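The dimensionality comparison described above can be sketched with PCA on trial-averaged responses: count how many principal components are needed to explain a fixed fraction of variance for each stimulus set. A toy example on simulated data (scikit-learn; not the recorded V1 responses):

```python
import numpy as np
from sklearn.decomposition import PCA

def n_components_for_variance(responses, threshold=0.9):
    """Number of PCs needed to explain `threshold` of the variance of a (conditions x neurons) matrix."""
    pca = PCA().fit(responses)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumulative, threshold) + 1)

rng = np.random.default_rng(0)
n_neurons = 50

# "simple" stimulus set: responses driven by few latent factors; "complex" set: many factors
simple = rng.normal(size=(40, 3)) @ rng.normal(size=(3, n_neurons)) + 0.1 * rng.normal(size=(40, n_neurons))
complex_ = rng.normal(size=(40, 20)) @ rng.normal(size=(20, n_neurons)) + 0.1 * rng.normal(size=(40, n_neurons))

print(n_components_for_variance(simple), n_components_for_variance(complex_))  # low versus high dimensionality
```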
16
Vanmarcke S, Calders F, Wagemans J. The Time-Course of Ultrarapid Categorization: The Influence of Scene Congruency and Top-Down Processing. Iperception 2016; 7:2041669516673384. PMID: 27803794; PMCID: PMC5076752; DOI: 10.1177/2041669516673384.
Abstract
Although categorization can take place at different levels of abstraction, classic studies on semantic labeling identified the basic level, for example, dog, as entry point for categorization. Ultrarapid categorization tasks have contradicted these findings, indicating that participants are faster at detecting superordinate-level information, for example, animal, in a complex visual image. We argue that both seemingly contradictive findings can be reconciled within the framework of parallel distributed processing and its successor Leabra (Local, Error-driven and Associative, Biologically Realistic Algorithm). The current study aimed at verifying this prediction in an ultrarapid categorization task with a dynamically changing presentation time (PT) for each briefly presented object, followed by a perceptual mask. Furthermore, we manipulated two defining task variables: level of categorization (basic vs. superordinate categorization) and object presentation mode (object-in-isolation vs. object-in-context). In contradiction with previous ultrarapid categorization research, focusing on reaction time, we used accuracy as our main dependent variable. Results indicated a consistent superordinate processing advantage, coinciding with an overall improvement in performance with longer PT and a significantly more accurate detection of objects in isolation, compared with objects in context, at lower stimulus PT. This contextual disadvantage disappeared when PT increased, indicating that figure-ground separation with recurrent processing is vital for meaningful contextual processing to occur.
17
Vanmarcke S, Wagemans J. Individual differences in spatial frequency processing in scene perception: the influence of autism-related traits. Visual Cognition 2016. DOI: 10.1080/13506285.2016.1199625.
18
Hesslinger VM, Carbon CC. #TheDress: The Role of Illumination Information and Individual Differences in the Psychophysics of Perceiving White-Blue Ambiguities. Iperception 2016; 7:2041669516645592. PMID: 27433328; PMCID: PMC4934678; DOI: 10.1177/2041669516645592.
Abstract
In early 2015, a public debate about a perceptual phenomenon that impressively demonstrated the subjective nature of human perception ran around the globe: the debate about #TheDress, a poorly lit photograph of a lace dress that was perceived as white–gold by some, but as blue–black by others. In the present research (N = 48), we found that the perceptual difference between white–gold perceivers (n1 = 24, 12 women, Mage = 25.4 years) and blue–black perceivers (n2 = 24, 12 women, Mage = 24.3 years) decreased significantly when the illumination information provided by the original digital photo was reduced by means of image scrambling (Experiment 1). This indicates that the illumination information is one potentially important factor contributing to the color ambiguity of #TheDress, possibly by amplification of a slight principal difference in the psychophysics of color perception, which the two observer groups showed for abstract uniformly colored fields displaying a white–blue ambiguity (Experiment 2).
Affiliation(s)
- Vera M Hesslinger: Department of General Psychology and Methodology, University of Bamberg, Bamberg, Germany
- Claus-Christian Carbon: Department of General Psychology and Methodology, University of Bamberg, Bamberg, Germany
19
In the Eye of the Beholder: Rapid Visual Perception of Real-Life Scenes by Young Adults with and Without ASD. J Autism Dev Disord 2016; 46:2635-2652. DOI: 10.1007/s10803-016-2802-9.
20
21
Vanmarcke S, Van Der Hallen R, Evers K, Noens I, Steyaert J, Wagemans J. Ultra-Rapid Categorization of Meaningful Real-Life Scenes in Adults With and Without ASD. J Autism Dev Disord 2015; 46:450-66. DOI: 10.1007/s10803-015-2583-6.
22
MaBouDi H, Shimazaki H, Amari SI, Soltanian-Zadeh H. Representation of higher-order statistical structures in natural scenes via spatial phase distributions. Vision Res 2015; 120:61-73. PMID: 26278166; DOI: 10.1016/j.visres.2015.06.009.
Abstract
Natural scenes contain richer perceptual information in their spatial phase structure than in their amplitudes. Modeling the phase structure of natural scenes may explain higher-order structure inherent to the natural scenes, which is neglected in most classical models of redundancy reduction. Only recently have a few models represented images using a complex form of receptive fields (RFs) and analyzed their complex responses in terms of amplitude and phase. However, these complex representation models often tacitly assume a uniform phase distribution without empirical support. The structure of spatial phase distributions of natural scenes, in the form of relative contributions of paired responses of RFs in quadrature, has not been explored statistically until now. Here, we investigate the spatial phase structure of natural scenes using complex forms of various Gabor-like RFs. To analyze distributions of the spatial phase responses, we constructed a mixture model that accounts for multi-modal circular distributions and derived an EM algorithm for estimating the model parameters. Based on the likelihood, we report the presence of both uniform and structured bimodal phase distributions in natural scenes. The latter bimodal distributions were symmetric, with two peaks separated by about 180°. Thus, the redundancy in natural scenes can be further removed by using the bimodal phase distributions obtained from these RFs in the complex representation models. These results predict that both phase-invariant and phase-sensitive complex cells are required to represent the regularities of natural scenes in visual systems.
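The "paired responses of RFs in quadrature" correspond to filtering an image with an even and an odd Gabor filter and taking the angle of the resulting complex response. A minimal sketch (numpy/scipy; filter parameters are arbitrary examples):

```python
import numpy as np
from scipy.signal import fftconvolve

def complex_gabor(size=31, wavelength=8.0, orientation=0.0, sigma=4.0):
    """Complex Gabor: the real part is the even (cosine) RF, the imaginary part the odd (sine) RF."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(1j * 2 * np.pi * xr / wavelength)

image = np.random.default_rng(0).random((128, 128))      # stand-in for a natural scene patch
response = fftconvolve(image, complex_gabor(), mode="same")

amplitude = np.abs(response)   # local contrast energy at this scale and orientation
phase = np.angle(response)     # local spatial phase, whose distribution the mixture model characterizes
```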
Affiliation(s)
- HaDi MaBouDi: School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Hamid Soltanian-Zadeh: School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran; Control and Intelligent Processing Center of Excellence (CIPCE), School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran; Image Analysis Laboratory, Department of Radiology, Henry Ford Health System, Detroit, MI, United States
23
Edge co-occurrences can account for rapid categorization of natural versus animal images. Sci Rep 2015; 5:11400. PMID: 26096913; PMCID: PMC4476147; DOI: 10.1038/srep11400.
Abstract
Making a judgment about the semantic category of a visual scene, such as whether it contains an animal, is typically assumed to involve high-level associative brain areas. Previous explanations require progressively analyzing the scene hierarchically at increasing levels of abstraction, from edge extraction to mid-level object recognition and then object categorization. Here we show that the statistics of edge co-occurrences alone are sufficient to perform a rough yet robust (translation, scale, and rotation invariant) scene categorization. We first extracted the edges from images using a scale-space analysis coupled with a sparse coding algorithm. We then computed the “association field” for different categories (natural, man-made, or containing an animal) by computing the statistics of edge co-occurrences. These differed strongly, with animal images having more curved configurations. We show that this geometry alone is sufficient for categorization, and that the pattern of errors made by humans is consistent with this procedure. Because these statistics could be measured as early as the primary visual cortex, the results challenge widely held assumptions about the flow of computations in the visual system. The results also suggest new algorithms for image classification and signal processing that exploit correlations between low-level structure and the underlying semantic category.
24
Wallis TSA, Dorr M, Bex PJ. Sensitivity to gaze-contingent contrast increments in naturalistic movies: An exploratory report and model comparison. J Vis 2015; 15:3. PMID: 26057546; DOI: 10.1167/15.8.3.
Abstract
Sensitivity to luminance contrast is a prerequisite for all but the simplest visual systems. To examine contrast increment detection performance in a way that approximates the natural environmental input of the human visual system, we presented contrast increments gaze-contingently within naturalistic video freely viewed by observers. A band-limited contrast increment was applied to a local region of the video relative to the observer's current gaze point, and the observer made a forced-choice response to the location of the target (≈25,000 trials across five observers). We present exploratory analyses showing that performance improved as a function of the magnitude of the increment and depended on the direction of eye movements relative to the target location, the timing of eye movements relative to target presentation, and the spatiotemporal image structure at the target location. Contrast discrimination performance can be modeled by assuming that the underlying contrast response is an accelerating nonlinearity (arising from a nonlinear transducer or gain control). We implemented one such model and examined the posterior over model parameters, estimated using Markov-chain Monte Carlo methods. The parameters were poorly constrained by our data; parameters constrained using strong priors taken from previous research showed poor cross-validated prediction performance. Atheoretical logistic regression models were better constrained and provided similar prediction performance to the nonlinear transducer model. Finally, we explored the properties of an extended logistic regression that incorporates both eye movement and image content features. Models of contrast transduction may be better constrained by incorporating data from both artificial and natural contrast perception settings.
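The "atheoretical logistic regression" comparison above amounts to fitting response correctness as a logistic function of predictors such as the (log) contrast increment. A toy sketch on simulated trials (scikit-learn; not the authors' data or full feature set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials = 2000

log_contrast = rng.uniform(-2.0, 0.0, n_trials)              # log10 of the contrast increment
p_correct = 1 / (1 + np.exp(-(4.0 * (log_contrast + 1.0))))  # hidden "true" psychometric function
correct = rng.random(n_trials) < p_correct

model = LogisticRegression().fit(log_contrast[:, None], correct)
slope = model.coef_[0][0]
midpoint = -model.intercept_[0] / model.coef_[0][0]
print(f"estimated slope {slope:.2f}, 50%-correct point at log contrast {midpoint:.2f}")
```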
25
Vanmarcke S, Wagemans J. Rapid gist perception of meaningful real-life scenes: Exploring individual and gender differences in multiple categorization tasks. Iperception 2015; 6:19-37. PMID: 26034569; PMCID: PMC4441019; DOI: 10.1068/i0682.
Abstract
In everyday life, we are generally able to dynamically understand and adapt to socially (ir)relevant encounters, and to make appropriate decisions about these. All of this requires an impressive ability to directly filter and obtain the most informative aspects of a complex visual scene. Such rapid gist perception can be assessed in multiple ways. In the ultrafast categorization paradigm developed by Simon Thorpe et al. (1996), participants get a clear categorization task in advance and succeed at detecting the target object of interest (animal) almost perfectly (even with 20 ms exposures). Since this pioneering work, follow-up studies consistently reported population-level reaction time differences on different categorization tasks, indicating a superordinate advantage (animal versus dog) and effects of perceptual similarity (animals versus vehicles) and object category size (natural versus animal versus dog). In this study, we replicated and extended these separate findings by using a systematic collection of different categorization tasks (varying in presentation time, task demands, and stimuli) and focusing on individual differences in terms of, e.g., gender and intelligence. In addition to replicating the main findings from the literature, we find subtle, yet consistent gender differences (women faster than men).
Affiliation(s)
- Steven Vanmarcke: Laboratory of Experimental Psychology, University of Leuven (KU Leuven), Leuven, Belgium
- Johan Wagemans: Laboratory of Experimental Psychology, University of Leuven (KU Leuven), Leuven, Belgium
26
Groen II, Ghebreab S, Prins H, Lamme VA, Scholte HS. From image statistics to scene gist: evoked neural activity reveals transition from low-level natural image structure to scene category. J Neurosci 2013; 33:18814-24. PMID: 24285888; PMCID: PMC6618700; DOI: 10.1523/jneurosci.3128-13.2013.
Abstract
The visual system processes natural scenes in a split second. Part of this process is the extraction of "gist," a global first impression. It is unclear, however, how the human visual system computes this information. Here, we show that, when human observers categorize global information in real-world scenes, the brain exhibits strong sensitivity to low-level summary statistics. Subjects rated a specific instance of a global scene property, naturalness, for a large set of natural scenes while EEG was recorded. For each individual scene, we derived two physiologically plausible summary statistics by spatially pooling local contrast filter outputs: contrast energy (CE), indexing contrast strength, and spatial coherence (SC), indexing scene fragmentation. We show that behavioral performance is directly related to these statistics, with naturalness rating being influenced in particular by SC. At the neural level, both statistics parametrically modulated single-trial event-related potential amplitudes during an early, transient window (100-150 ms), but SC continued to influence activity levels later in time (up to 250 ms). In addition, the magnitude of neural activity that discriminated between man-made versus natural ratings of individual trials was related to SC, but not CE. These results suggest that global scene information may be computed by spatial pooling of responses from early visual areas (e.g., LGN or V1). The increased sensitivity over time to SC in particular, which reflects scene fragmentation, suggests that this statistic is actively exploited to estimate scene naturalness.
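A rough, simplified proxy for the two summary statistics used above is to pool local contrast magnitudes and fit a Weibull distribution: the scale parameter then tracks overall contrast strength (CE-like) and the shape parameter tracks how fragmented versus coherent the contrast distribution is (SC-like). A sketch under that simplification (scipy; not the authors' exact filter model):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_gradient_magnitude
from scipy.stats import weibull_min

def contrast_summary_statistics(image, sigma=1.5):
    """Fit a Weibull distribution to pooled local contrast magnitudes.
    The scale parameter is used here as a rough CE-like statistic and the shape
    parameter as a rough SC-like statistic; see the cited paper for the real model."""
    contrast = gaussian_gradient_magnitude(image.astype(float), sigma=sigma).ravel()
    contrast = contrast[contrast > 1e-6]
    shape, _, scale = weibull_min.fit(contrast, floc=0)
    return scale, shape

rng = np.random.default_rng(0)
fragmented = rng.random((128, 128))                    # white noise: many small contrast edges
coherent = gaussian_filter(rng.random((128, 128)), 5)  # blurred noise: fewer, larger structures
print(contrast_summary_statistics(fragmented))
print(contrast_summary_statistics(coherent))
```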
Affiliation(s)
- Iris I.A. Groen: Cognitive Neuroscience Group, Department of Psychology, and Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies
- Sennay Ghebreab: Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies, and Intelligent Systems Laboratory Amsterdam, Institute of Informatics, University of Amsterdam, 1018 WS, Amsterdam, The Netherlands
- Hielke Prins: Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies
- H. Steven Scholte: Cognitive Neuroscience Group, Department of Psychology, and Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies
27
Mashhadi PS, Shoorehdeli MA, Teshnehlab M. Patterns with different phases but same statistics. J Opt Soc Am A Opt Image Sci Vis 2013; 30:1796-1805. PMID: 24323261; DOI: 10.1364/josaa.30.001796.
Abstract
Many successful methods in various vision tasks rely on statistical analysis of visual patterns. However, we are interested in bridging the gap between the underlying mathematical representation of visual patterns and their statistics. In line with this general trend, in this paper the relationship between the phase structure of a class of patterns and their moments, before and after filtering, is considered. First, a general formula relating the phase structure to the moments of the images is obtained. Second, a theorem is developed that states under which conditions two visual patterns with the same frequencies but different phases have the same moments up to a certain order. Finally, a theorem is developed that explains, given a set of filters, under which conditions two visual patterns with both different frequencies and different phases have the same subband statistics.
28
Castaldi E, Frijia F, Montanaro D, Tosetti M, Morrone MC. BOLD human responses to chromatic spatial features. Eur J Neurosci 2013; 38:2290-9. PMID: 23600977; DOI: 10.1111/ejn.12223.
Abstract
Animal physiological and human psychophysical studies suggest that an early step in visual processing involves the detection and identification of features such as lines and edges, by neural mechanisms with even- and odd-symmetric receptive fields. Functional imaging studies also demonstrate mechanisms with even- and odd-receptive fields in early visual areas, in response to luminance-modulated stimuli. In this study we measured fMRI BOLD responses to 2-D stimuli composed of only even or only odd symmetric features, and to an amplitude-matched random noise control, modulated in red-green equiluminant colour contrast. All these stimuli had identical power but different phase spectra, either highly congruent (even or odd symmetry stimuli) or random (noise). At equiluminance, V1 BOLD activity showed no preference between congruent- and random-phase stimuli, as well as no preference between even and odd symmetric stimuli. Areas higher in the visual hierarchy, both along the dorsal pathway (caudal part of the intraparietal sulcus, dorsal LO and V3A) and the ventral pathway (V4), responded preferentially to odd symmetry over even symmetry stimuli, and to congruent over random phase stimuli. Interestingly, V1 showed an equal increase in BOLD activity at each alternation between stimuli of different symmetry, suggesting the existence of specialised mechanisms for the detection of edges and lines such as even- and odd-chromatic receptive fields. Overall the results indicate a high selectivity of colour-selective neurons to spatial phase along both the dorsal and the ventral pathways in humans.
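Even- and odd-symmetric features of the kind described above can be built by summing harmonics in cosine phase (line-like, even symmetry) or sine phase (edge-like, odd symmetry) while keeping the amplitude spectrum identical. A minimal one-dimensional sketch (numpy; illustrative, not the authors' two-dimensional equiluminant stimuli):

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 512, endpoint=False)
harmonics = np.arange(1, 40, 2)        # odd harmonics
amplitudes = 1.0 / harmonics           # 1/f-like amplitude fall-off

# identical power spectra, different (congruent or random) phase spectra
even_profile = sum(a * np.cos(h * x) for a, h in zip(amplitudes, harmonics))   # line-like, even symmetry
odd_profile = sum(a * np.sin(h * x) for a, h in zip(amplitudes, harmonics))    # edge-like, odd symmetry
random_phase = np.random.default_rng(0).uniform(-np.pi, np.pi, harmonics.size)
noise_profile = sum(a * np.cos(h * x + p) for a, h, p in zip(amplitudes, harmonics, random_phase))

profiles = np.stack([even_profile, odd_profile, noise_profile])   # amplitude matched, phase differs
```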
Affiliation(s)
- E Castaldi
- Department of Neuroscience, Psychology, Pharmacology and Child health, University of Florence, Firenze, Italy
29
How sensitive is the human visual system to the local statistics of natural images? PLoS Comput Biol 2013; 9:e1002873. [PMID: 23358106 PMCID: PMC3554546 DOI: 10.1371/journal.pcbi.1002873] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2012] [Accepted: 11/21/2012] [Indexed: 11/19/2022] Open
Abstract
A key hypothesis in sensory system neuroscience is that sensory representations are adapted to the statistical regularities in sensory signals and thereby incorporate knowledge about the outside world. Supporting this hypothesis, several probabilistic models of local natural image regularities have been proposed that reproduce neural response properties. Although many such physiological links have been made, these models have not been linked directly to visual sensitivity. Previous psychophysical studies of sensitivity to natural image regularities focus on global perception of large images, but much less is known about sensitivity to local natural image regularities. We present a new paradigm for controlled psychophysical studies of local natural image regularities and compare how well such models capture perceptually relevant image content. To produce stimuli with precise statistics, we start with a set of patches cut from natural images and alter their content to generate a matched set whose joint statistics are equally likely under a probabilistic natural image model. The task is a forced-choice discrimination between natural patches and model patches. The results show that human observers can learn to discriminate the higher-order regularities in natural images from those of model samples after very few exposures and that no current model is perfect for patches as small as 5 by 5 pixels, let alone larger ones. Discrimination performance was accurately predicted by model likelihood, an information-theoretic measure of model efficacy, indicating that the visual system possesses a surprisingly detailed knowledge of natural image higher-order correlations, much more so than current image models. We also performed three cue identification experiments to interpret how model features correspond to perceptually relevant image features.

Several aspects of primate visual physiology have been identified as adaptations to local regularities of natural images. However, much less work has measured visual sensitivity to local natural image regularities. Most previous work focuses on global perception of large images and shows that observers are more sensitive to visual information when image properties resemble those of natural images. In this work we measure human sensitivity to local natural image regularities using stimuli generated by patch-based probabilistic natural image models that have been related to primate visual physiology. We find that human observers can learn to discriminate the statistical regularities of natural image patches from those represented by current natural image models after very few exposures and that discriminability depends on the degree of regularity captured by the model. The quick learning we observed suggests that the human visual system is biased for processing natural images, even at very fine spatial scales, and that it has a surprisingly large knowledge of the regularities in natural images, at least in comparison to state-of-the-art statistical models of natural images.
30
Ossandón JP, Onat S, Cazzoli D, Nyffeler T, Müri R, König P. Unmasking the contribution of low-level features to the guidance of attention. Neuropsychologia 2012; 50:3478-87. [PMID: 23044277 DOI: 10.1016/j.neuropsychologia.2012.09.043] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2011] [Revised: 09/19/2012] [Accepted: 09/26/2012] [Indexed: 11/18/2022]
Affiliation(s)
- José P Ossandón
- Universität Osnabrück, Institut für Kognitionswissenschaft, Albrechtstr. 28, 49076 Osnabrück, Germany.
31
Mohan K, Arun SP. Similarity relations in visual search predict rapid visual categorization. J Vis 2012; 12:19. [PMID: 23092947 PMCID: PMC3586997 DOI: 10.1167/12.11.19] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2012] [Accepted: 09/17/2012] [Indexed: 11/24/2022] Open
Abstract
How do we perform rapid visual categorization? It is widely thought that categorization involves evaluating the similarity of an object to other category items, but the underlying features and similarity relations remain unknown. Here, we hypothesized that categorization performance is based on perceived similarity relations between items within and outside the category. To this end, we measured the categorization performance of human subjects on three diverse visual categories (animals, vehicles, and tools) and across three hierarchical levels (superordinate, basic, and subordinate levels among animals). For the same subjects, we measured their perceived pairwise similarities between objects using a visual search task. Regardless of category and hierarchical level, we found that the time taken to categorize an object could be predicted using its similarity to members within and outside its category. We were able to account for several classic categorization phenomena, such as (a) the longer times required to reject category membership; (b) the longer times to categorize atypical objects; and (c) differences in performance across tasks and across hierarchical levels. These categorization times were also accounted for by a model that extracts coarse structure from an image. The striking agreement observed between categorization and visual search suggests that these two disparate tasks depend on a shared coarse object representation.
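A toy sketch of the kind of prediction described above, with simulated similarities and a plain least-squares fit (the paper's exact predictors and model form may differ): categorization time is regressed on an item's mean perceived similarity to members within and outside its category.

```python
import numpy as np

# Hypothetical data: per-object mean similarity to own-category and other-category
# items (as would be estimated from visual-search reaction times), plus its
# categorization reaction time.
rng = np.random.default_rng(1)
n_items = 40
sim_within = rng.uniform(0.4, 0.9, n_items)    # similarity to own category
sim_outside = rng.uniform(0.1, 0.6, n_items)   # similarity to other categories
rt = 0.45 - 0.20 * sim_within + 0.25 * sim_outside + rng.normal(0, 0.02, n_items)

# Linear model: categorization RT ~ within-category and outside-category similarity.
X = np.column_stack([np.ones(n_items), sim_within, sim_outside])
beta, *_ = np.linalg.lstsq(X, rt, rcond=None)
r = np.corrcoef(X @ beta, rt)[0, 1]
print(f"intercept={beta[0]:.3f}, b_within={beta[1]:.3f}, b_outside={beta[2]:.3f}, r={r:.2f}")
```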
Affiliation(s)
- Krithika Mohan
- Indian Institute of Science Education and Research, Pune, India
- Centre for Neuroscience, Indian Institute of Science, Bangalore, India
- S. P. Arun
- Centre for Neuroscience, Indian Institute of Science, Bangalore, India
32
Groen IIA, Ghebreab S, Lamme VAF, Scholte HS. Spatially pooled contrast responses predict neural and perceptual similarity of naturalistic image categories. PLoS Comput Biol 2012; 8:e1002726. [PMID: 23093921 PMCID: PMC3475684 DOI: 10.1371/journal.pcbi.1002726] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/29/2012] [Accepted: 08/02/2012] [Indexed: 11/22/2022] Open
Abstract
The visual world is complex and continuously changing. Yet, our brain transforms patterns of light falling on our retina into a coherent percept within a few hundred milliseconds. Possibly, low-level neural responses already carry substantial information to facilitate rapid characterization of the visual input. Here, we computationally estimated low-level contrast responses to computer-generated naturalistic images, and tested whether spatial pooling of these responses could predict image similarity at the neural and behavioral level. Using EEG, we show that statistics derived from pooled responses explain a large amount of variance between single-image event-related potentials (ERPs) in individual subjects. Dissimilarity analysis on multi-electrode ERPs demonstrated that large differences between images in pooled response statistics are predictive of more dissimilar patterns of evoked activity, whereas images with little difference in statistics give rise to highly similar evoked activity patterns. In a separate behavioral experiment, images with large differences in statistics were judged as different categories, whereas images with little difference were confused. These findings suggest that statistics derived from low-level contrast responses can be extracted in early visual processing and can be relevant for rapid judgment of visual similarity. We compared our results with two other, well-known contrast statistics: Fourier power spectra and higher-order properties of contrast distributions (skewness and kurtosis). Interestingly, whereas these statistics allow for accurate image categorization, they do not predict ERP response patterns or behavioral categorization confusions. These converging computational, neural and behavioral results suggest that statistics of pooled contrast responses contain information that corresponds with perceived visual similarity in a rapid, low-level categorization task.

Humans excel in rapid and accurate processing of visual scenes. However, it is unclear which computations allow the visual system to convert light hitting the retina into a coherent representation of visual input in a rapid and efficient way. Here we used simple, computer-generated image categories with similar low-level structure as natural scenes to test whether a model of early integration of low-level information can predict perceived category similarity. Specifically, we show that summarized (spatially pooled) responses of model neurons covering the entire visual field (the population response) to low-level properties of visual input (contrasts) can already be informative about differences in early visual evoked activity as well as behavioral confusions of these categories. These results suggest that low-level population responses can carry relevant information to estimate similarity of controlled images, and put forward the exciting hypothesis that the visual system may exploit these responses to rapidly process real natural scenes. We propose that the spatial pooling that allows for the extraction of this information may be a plausible first step in extracting scene gist to form a rapid impression of the visual input.
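As a rough stand-in for the pooling idea (the difference-of-Gaussians contrast filter and the mean/spread summary below are generic assumptions, not the specific contrast model used in the paper), spatially pooled contrast statistics can be computed like this:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pooled_contrast_stats(image, sigma_center=1.0, sigma_surround=3.0):
    """Summarize local contrast responses pooled over the whole image.

    Local contrast is approximated by a center-surround (difference-of-Gaussians)
    filter; the pooled 'population response' is summarized by the mean and spread
    of the absolute filter outputs.
    """
    img = image.astype(float)
    dog = gaussian_filter(img, sigma_center) - gaussian_filter(img, sigma_surround)
    responses = np.abs(dog).ravel()
    return responses.mean(), responses.std()

# Images with similar pooled statistics would be predicted to evoke similar ERPs
# and to be confused behaviorally; large differences predict dissimilarity.
img_a = np.random.default_rng(0).random((128, 128))   # placeholder images
img_b = np.random.default_rng(1).random((128, 128))
stats_a, stats_b = pooled_contrast_stats(img_a), pooled_contrast_stats(img_b)
print("pooled-statistic distance:", np.hypot(stats_a[0] - stats_b[0], stats_a[1] - stats_b[1]))
```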
Affiliation(s)
- Iris I. A. Groen
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Sennay Ghebreab
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Intelligent Systems Lab Amsterdam, Institute of Informatics, University of Amsterdam, Amsterdam, The Netherlands
- Victor A. F. Lamme
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- H. Steven Scholte
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
33
Exploiting sparsity and low-rank structure for the recovery of multi-slice breast MRIs with reduced sampling error. Med Biol Eng Comput 2012; 50:991-1000. [PMID: 22644257 DOI: 10.1007/s11517-012-0920-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2011] [Accepted: 05/13/2012] [Indexed: 10/28/2022]
Abstract
It has been shown that magnetic resonance images (MRIs) with a sparse representation in a transformed domain, e.g., spatial finite differences (FD) or the discrete cosine transform (DCT), can be restored from undersampled k-space by applying current compressive sampling theory. The paper presents a model-based method for the restoration of MRIs. The reduced-order model, in which a full system response is projected onto a subspace of lower dimensionality, has been used to accelerate image reconstruction by reducing the size of the involved linear system. In this paper, the singular value threshold (SVT) technique is applied as a denoising scheme to reduce and select the model order of the inverse Fourier transform image, and to restore multi-slice breast MRIs that have been compressively sampled in k-space. The restored MRIs denoised with SVT show reduced sampling errors compared to direct MRI restoration via spatial FD or DCT.

Compressive sampling is a technique for finding sparse solutions to underdetermined linear systems. The sparsity that is implicit in MRIs can be exploited to reconstruct images from significantly undersampled k-space. The challenge, however, is that the random undersampling introduces incoherent artifacts, adding noise-like interference to the sparse representation of the image. The recovery algorithms in the literature are not capable of fully removing these artifacts, so a denoising procedure is needed to improve the quality of image recovery. This paper applies a singular value threshold algorithm to reduce the model order of the image basis functions, which allows further improvement of the quality of image reconstruction with removal of noise artifacts. The principle of the denoising scheme is to reconstruct the sparse MRI matrices optimally with a lower rank by selecting a smaller number of dominant singular values. The singular value threshold algorithm is performed by minimizing the nuclear norm of the difference between the sampled image and the recovered image. It has been illustrated that this algorithm improves the ability of previous image reconstruction algorithms to remove noise artifacts while significantly improving the quality of MRI recovery.
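The singular value thresholding step itself is standard and easy to sketch; the toy matrix below merely stands in for image data, and the full reconstruction pipeline with k-space undersampling is not shown.

```python
import numpy as np

def svt(matrix, tau):
    """Singular value thresholding: the proximal operator of the nuclear norm.

    Soft-thresholds the singular values by tau, discarding the small ones; this is
    the low-rank denoising/model-order-selection step described in the abstract.
    """
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Toy demonstration on a noisy low-rank matrix.
rng = np.random.default_rng(0)
truth = rng.standard_normal((64, 5)) @ rng.standard_normal((5, 64))   # rank-5 signal
noisy = truth + 0.5 * rng.standard_normal((64, 64))
estimate = svt(noisy, tau=5.0)
# The thresholded estimate is typically closer to the low-rank ground truth.
print("error before:", round(float(np.linalg.norm(noisy - truth)), 1),
      "after SVT:", round(float(np.linalg.norm(estimate - truth)), 1))
```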
34
Fraedrich EM, Flanagin VL, Duann JR, Brandt T, Glasauer S. Hippocampal involvement in processing of indistinct visual motion stimuli. J Cogn Neurosci 2012; 24:1344-57. [PMID: 22524276 DOI: 10.1162/jocn_a_00226] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Perception of known patterns results from the interaction of current sensory input with existing internal representations. It is unclear how perceptual and mnemonic processes interact when visual input is dynamic and structured such that it does not allow immediate recognition of obvious objects and forms. In an fMRI experiment, meaningful visual motion stimuli depicting movement through a virtual tunnel and indistinct, meaningless visual motion stimuli, achieved through phase scrambling of the same stimuli, were presented while participants performed an optic flow task. We found that our indistinct visual motion stimuli evoked hippocampal activation, whereas the corresponding meaningful stimuli did not. Using independent component analysis, we were able to demonstrate a functional connectivity between the hippocampus and early visual areas, with increased activity for indistinct stimuli. In a second experiment, we used the same stimuli to test whether our results depended on the participants' task. We found task-independent bilateral hippocampal activation in response to indistinct motion stimuli. For both experiments, psychophysiological interaction analysis revealed a coupling from posterior hippocampus to dorsal visuospatial and ventral visual object processing areas when viewing indistinct stimuli. These results indicate a close functional link between stimulus-dependent perceptual and mnemonic processes. The observed pattern of hippocampal functional connectivity, in the absence of an explicit memory task, suggests that cortical-hippocampal networks are recruited when visual stimuli are temporally uncertain and do not immediately reveal a clear meaning.
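A generic phase-scrambling sketch of the kind of manipulation mentioned above (applied here to a single placeholder frame, not the authors' film stimuli): keep the Fourier amplitude spectrum, replace the phases with those of a random real image so that the result stays real.

```python
import numpy as np

def phase_scramble(image, rng=None):
    """Return an image with the same amplitude spectrum but randomized phases."""
    rng = np.random.default_rng(rng)
    spectrum = np.fft.fft2(image)
    # Phases of a real noise image are conjugate-symmetric, keeping the output real.
    random_phase = np.angle(np.fft.fft2(rng.random(image.shape)))
    return np.real(np.fft.ifft2(np.abs(spectrum) * np.exp(1j * random_phase)))

frame = np.random.default_rng(2).random((256, 256))   # stand-in for a film frame
scrambled = phase_scramble(frame, rng=3)
# Amplitude spectra match; the spatial structure does not.
print(np.allclose(np.abs(np.fft.fft2(frame)), np.abs(np.fft.fft2(scrambled))))  # True
```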
35
Crouzet SM, Thorpe SJ. Low-level cues and ultra-fast face detection. Front Psychol 2011; 2:342. [PMID: 22125544 PMCID: PMC3221302 DOI: 10.3389/fpsyg.2011.00342] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2011] [Accepted: 11/01/2011] [Indexed: 11/24/2022] Open
Abstract
Recent experimental work has demonstrated the existence of extremely rapid saccades toward faces in natural scenes that can be initiated only 100 ms after image onset (Crouzet et al., 2010). These ultra-rapid saccades constitute a major challenge to current models of processing in the visual system because they do not seem to leave enough time for even a single feed-forward pass through the ventral stream. Here we explore the possibility that the information required to trigger these very fast saccades could be extracted very early on in visual processing using relatively low-level amplitude spectrum (AS) information in the Fourier domain. Experiment 1 showed that AS normalization can significantly alter face-detection performance. However, a decrease of performance following AS normalization does not alone prove that AS-based information is used (Gaspar and Rousselet, 2009). In Experiment 2, following the Gaspar and Rousselet paper, we used a swapping procedure to clarify the role of AS information in fast object detection. Our experiment is composed of three conditions: (i) original images, (ii) category swapped, in which the face image has the AS of a vehicle, and the vehicle has the AS of a face, and (iii) identity swapped, where the face has the AS of another face image, and the vehicle has the AS of another vehicle image. The results showed very similar levels of performance in the original and identity swapped conditions, and a clear drop in the category swapped condition. This result demonstrates that, in the early temporal window offered by the saccadic choice task, the visual saccadic system does indeed rely on low-level AS information in order to rapidly detect faces. This sort of crude diagnostic information could potentially be derived very early on in the visual system, possibly as early as V1 and V2.
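The amplitude-swapping manipulation can be sketched as follows; the input arrays are placeholders, and details such as windowing and luminance normalization, which the published procedure presumably handles, are omitted.

```python
import numpy as np

def swap_amplitude_spectra(image_a, image_b):
    """Give each image the other's Fourier amplitude spectrum, keeping its own phases."""
    fa, fb = np.fft.fft2(image_a), np.fft.fft2(image_b)
    a_with_b_amp = np.real(np.fft.ifft2(np.abs(fb) * np.exp(1j * np.angle(fa))))
    b_with_a_amp = np.real(np.fft.ifft2(np.abs(fa) * np.exp(1j * np.angle(fb))))
    return a_with_b_amp, b_with_a_amp

# e.g. a face and a vehicle image as equally sized grayscale float arrays
face = np.random.default_rng(0).random((128, 128))      # placeholder arrays
vehicle = np.random.default_rng(1).random((128, 128))
face_cat_swapped, vehicle_cat_swapped = swap_amplitude_spectra(face, vehicle)
```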
Affiliation(s)
- Sébastien M. Crouzet
- Department of Cognitive, Linguistic and Psychological Science, Brown University, Providence, RI, USA
- Simon J. Thorpe
- Centre de Recherche Cerveau et Cognition, Université de Toulouse, UPS, Toulouse, France
- Centre National de la Recherche Scientifique, CerCo, Toulouse, France
36
Abstract
The hemodynamic response of the visual cortex to continuously moving spatial stimuli of virtual tunnels and phase-scrambled versions thereof was examined using functional magnetic resonance imaging. Earlier functional magnetic resonance imaging studies found either no difference or less early visual cortex (VC) activation when presenting normal versus phase-manipulated static natural images. Here we describe an increase in VC activation while viewing phase-scrambled films compared with normal films, although basic image statistics and average local flow were the same. The normal films, in contrast, resulted in an increased lateral occipital and precuneus activity sparing VC. In summary, our results show that earlier findings for scrambling of static images no longer hold for spatiotemporal stimuli.
37
Emrith K, Chantler MJ, Green PR, Maloney LT, Clarke ADF. Measuring perceived differences in surface texture due to changes in higher order statistics. J Opt Soc Am A Opt Image Sci Vis 2010; 27:1232-1244. [PMID: 20448792 DOI: 10.1364/josaa.27.001232] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
We investigate the ability of humans to perceive changes in the appearance of images of surface texture caused by the variation of their higher order statistics. We incrementally randomize their phase spectra while holding their first and second order statistics constant in order to ensure that the change in the appearance is due solely to changes in third and other higher order statistics. Stimuli comprise both natural and synthetically generated naturalistic images, with the latter being used to prevent observers from making pixel-wise comparisons. A difference scaling method is used to derive the perceptual scales for each observer, which show a sigmoidal relationship with the degree of randomization. Observers were maximally sensitive to changes within the 20%-60% randomization range. In order to account for this behavior we propose a biologically plausible model that computes the variance of local measurements of phase congruency.
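As an illustration of the sigmoidal perceptual scale described above (the scale values below are invented for the example; only the analysis pattern is shown), one can fit a logistic function to scale versus randomization level and read off where its slope, i.e., the observer's sensitivity, peaks:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical perceptual-scale values (arbitrary units) at each randomization level,
# standing in for the output of a difference-scaling analysis.
randomization = np.linspace(0, 100, 11)            # percent phase randomization
scale = np.array([0.00, 0.01, 0.05, 0.18, 0.42, 0.63, 0.80, 0.90, 0.95, 0.98, 1.00])

def logistic(x, x0, k):
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

(x0, k), _ = curve_fit(logistic, randomization, scale, p0=(50.0, 0.1))
slope = k * logistic(randomization, x0, k) * (1 - logistic(randomization, x0, k))
print(f"midpoint at {x0:.0f}% randomization; sensitivity peaks near "
      f"{randomization[np.argmax(slope)]:.0f}%")
```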
Affiliation(s)
- K Emrith
- School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK.
38
Abstract
An image patch can be locally decomposed into sinusoidal waves of different orientations, spatial frequencies, amplitudes, and phases. The local phase information is essential for perception, because important visual features like edges emerge at locations of maximal local phase coherence. Detection of phase coherence requires integration of spatial frequency information across multiple spatial scales. Models of early visual processing suggest that the visual system should implement phase-sensitive pooling of spatial frequency information in the identification of broadband edges. We used functional magnetic resonance imaging (fMRI) adaptation to look for phase-sensitive neural responses in the human visual cortex. We found sensitivity to the phase difference between spatial frequency components in all studied visual areas, including the primary visual cortex (V1). Control experiments demonstrated that these results were not explained by differences in contrast or position. Next, we compared fMRI responses for broadband compound grating stimuli with congruent and random phase structures. All studied visual areas showed stronger responses for the stimuli with congruent phase structure. In addition, selectivity to phase congruency increased from V1 to higher-level visual areas along both the ventral and dorsal streams. We conclude that human V1 already shows phase-sensitive pooling of spatial frequencies, but only higher-level visual areas might be capable of pooling spatial frequency information across spatial scales typical for broadband natural images.
39
Gaspar CM, Rousselet GA. How do amplitude spectra influence rapid animal detection? Vision Res 2009; 49:3001-12. [PMID: 19818804 DOI: 10.1016/j.visres.2009.09.021] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2009] [Revised: 09/23/2009] [Accepted: 09/25/2009] [Indexed: 10/20/2022]
Abstract
Amplitude spectra might provide information for natural scene classification. Amplitude does play a role in animal detection because accuracy suffers when amplitude is normalized. However, this effect could be due to an interaction between phase and amplitude, rather than to a loss of amplitude-only information. We used an amplitude-swapping paradigm to establish that animal detection is partly based on an interaction between phase and amplitude. A difference in false alarms for two subsets of our distractor stimuli suggests that the classification of scene environment (man-made versus natural) may also be based on an interaction between phase and amplitude. Examples of interaction between amplitude and phase are discussed.
Affiliation(s)
- Carl M Gaspar
- Centre for Cognitive Neuroimaging (CCNi), Department of Psychology, University of Glasgow, G12 8QB Glasgow, UK
40
Açik A, Onat S, Schumann F, Einhäuser W, König P. Effects of luminance contrast and its modifications on fixation behavior during free viewing of images from different categories. Vision Res 2009; 49:1541-53. [DOI: 10.1016/j.visres.2009.03.011] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2008] [Revised: 03/12/2009] [Accepted: 03/12/2009] [Indexed: 11/29/2022]
Affiliation(s)
- Alper Açik
- University of Osnabrück, Institute of Cognitive Science, Osnabrück, Germany.
41
Rousselet GA, Pernet CR, Bennett PJ, Sekuler AB. Parametric study of EEG sensitivity to phase noise during face processing. BMC Neurosci 2008; 9:98. [PMID: 18834518 PMCID: PMC2573889 DOI: 10.1186/1471-2202-9-98] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2008] [Accepted: 10/03/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The present paper examines the visual processing speed of complex objects, here faces, by mapping the relationship between object physical properties and single-trial brain responses. Measuring visual processing speed is challenging because uncontrolled physical differences that co-vary with object categories might affect brain measurements, thus biasing our speed estimates. Recently, we demonstrated that early event-related potential (ERP) differences between faces and objects are preserved even when images differ only in phase information, and amplitude spectra are equated across image categories. Here, we use a parametric design to study how early ERPs to faces are shaped by phase information. Subjects performed a two-alternative forced-choice discrimination between two faces (Experiment 1) or textures (two control experiments). All stimuli had the same amplitude spectrum and were presented at 11 phase noise levels, varying from 0% to 100% in 10% increments, using a linear phase interpolation technique. Single-trial ERP data from each subject were analysed using a multiple linear regression model.

RESULTS: Our results show that sensitivity to phase noise in faces emerges progressively in a short time window between the P1 and the N170 ERP visual components. The sensitivity to phase noise starts at about 120-130 ms after stimulus onset and continues for another 25-40 ms. This result was robust both within and across subjects. A control experiment using pink noise textures, which had the same second-order statistics as the faces used in Experiment 1, demonstrated that the sensitivity to phase noise observed for faces cannot be explained by the presence of global image structure alone. A second control experiment used wavelet textures that were matched to the face stimuli in terms of second- and higher-order image statistics. Results from this experiment suggest that higher-order statistics of faces are necessary but not sufficient to obtain the sensitivity to phase noise function observed in response to faces.

CONCLUSION: Our results constitute the first quantitative assessment of the time course of phase information processing by the human visual brain. We interpret our results in a framework that focuses on image statistics and single-trial analyses.
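One way such graded phase-noise stimuli could be generated (a sketch that interpolates each Fourier phase toward a random phase along the shortest angular path; not necessarily the exact published interpolation):

```python
import numpy as np

def add_phase_noise(image, level, rng=None):
    """Interpolate an image's Fourier phases toward random phases.

    level = 0.0 returns (up to rounding) the original image, level = 1.0 a fully
    phase-randomized image; the amplitude spectrum is kept at every level.
    """
    rng = np.random.default_rng(rng)
    f = np.fft.fft2(image)
    phase = np.angle(f)
    random_phase = np.angle(np.fft.fft2(rng.random(image.shape)))  # conjugate-symmetric
    step = np.angle(np.exp(1j * (random_phase - phase)))           # shortest angular path
    return np.real(np.fft.ifft2(np.abs(f) * np.exp(1j * (phase + level * step))))

face = np.random.default_rng(0).random((128, 128))   # placeholder for a face image
# 11 noise levels, 0% to 100% in 10% steps, sharing one random phase target.
stimuli = [add_phase_noise(face, level, rng=42) for level in np.linspace(0, 1, 11)]
```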
Affiliation(s)
- Guillaume A Rousselet
- Centre for Cognitive Neuroimaging (CCNi) and Department of Psychology, University of Glasgow, Glasgow, UK.
42
Clarke A, Green P, Chantler M, Emrith K. Visual search for a target against a 1/fβ continuous textured background. Vision Res 2008; 48:2193-203. [DOI: 10.1016/j.visres.2008.06.019] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2007] [Revised: 06/11/2008] [Accepted: 06/17/2008] [Indexed: 11/26/2022]
43
Hansen BC, Hess RF. Structural sparseness and spatial phase alignment in natural scenes. J Opt Soc Am A Opt Image Sci Vis 2007; 24:1873-85. [PMID: 17728809 DOI: 10.1364/josaa.24.001873] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/17/2023]
Abstract
The Fourier phase spectrum plays a central role in determining where in an image contours occur, thereby defining the spatial relationship between those structures in the overall scene. Only a handful of studies have demonstrated psychophysically the relevance of the Fourier phase spectrum to human visual processing, and none have demonstrated the relative amount of local cross-scale spatial phase alignment needed to perceptually extract meaningful structure from an image. We investigated the relative amount of spatial phase alignment needed for humans to perceptually match natural scene image structures at three different spatial frequencies [3, 6, and 12 cycles per degree (cpd)] as a function of the number of structures within the image (i.e., "structural sparseness"). The results showed that (1) the amount of spatial phase alignment needed to match structures depends on structural sparseness, with a bias for matching structures at 6 cpd, and (2) the ability to match partially phase-randomized images at a given spatial frequency is independent of structural sparseness at other spatial frequencies. The findings of the current study are discussed in terms of a network of feature integrators in the human visual system.
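A simplified sketch of frequency-specific partial phase randomization (band limits in cycles per image and the placeholder input are assumptions; the study manipulated phase alignment at calibrated spatial frequencies in cycles per degree):

```python
import numpy as np

def randomize_phase_in_band(image, cycles_low, cycles_high, rng=None):
    """Randomize Fourier phases only within a band of radial spatial frequencies."""
    rng = np.random.default_rng(rng)
    f = np.fft.fft2(image)
    fy = np.fft.fftfreq(image.shape[0]) * image.shape[0]   # cycles per image
    fx = np.fft.fftfreq(image.shape[1]) * image.shape[1]
    radius = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
    band = (radius >= cycles_low) & (radius < cycles_high)
    # Phases of a real noise image are conjugate-symmetric, so the output stays real.
    random_phase = np.angle(np.fft.fft2(rng.random(image.shape)))
    new_phase = np.where(band, random_phase, np.angle(f))
    return np.real(np.fft.ifft2(np.abs(f) * np.exp(1j * new_phase)))

scene = np.random.default_rng(5).random((256, 256))   # placeholder scene patch
# Disrupt phase alignment carried by mid frequencies (20-40 cycles per image) only.
scrambled_band = randomize_phase_in_band(scene, 20, 40, rng=6)
```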
Affiliation(s)
- Bruce C Hansen
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montreal, Quebec, Canada.