1. Time-of-day perception in paintings. J Vis 2024; 24:1. PMID: 38165679; PMCID: PMC10768702; DOI: 10.1167/jov.24.1.1.
Abstract
The spectral shape, irradiance, direction, and diffuseness of daylight vary regularly throughout the day. The variations in illumination and their effect on the light reflected from objects may in turn provide visual information as to the time of day. We suggest that artists' color choices for paintings of outdoor scenes might convey this information and that therefore the time of day might be decoded from the colors of paintings. Here we investigate whether human viewers' estimates of the depicted time of day in paintings correlate with their image statistics, specifically chromaticity and luminance variations. We tested time-of-day perception in 17th- to 20th-century Western European paintings via two online rating experiments. In Experiment 1, viewers' ratings from seven time choices varied significantly and largely consistently across paintings but with some ambiguity between morning and evening depictions. Analysis of the relationship between image statistics and ratings revealed correlations with the perceived time of day: higher "morningness" ratings associated with higher brightness, contrast, and saturation and darker yellow/brighter blue hues; "eveningness" with lower brightness, contrast, and saturation and darker blue/brighter yellow hues. Multiple linear regressions of extracted principal components yielded a predictive model that explained 76% of the variance in time-of-day perception. In Experiment 2, viewers rated paintings as morning or evening only; rating distributions differed significantly across paintings, and image statistics predicted people's perceptions. These results suggest that artists used different color palettes and patterns to depict different times of day, and the human visual system holds consistent assumptions about the variation of natural light depicted in paintings.
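The modeling pipeline this abstract describes — principal components extracted from per-painting image statistics, then a multiple linear regression onto time-of-day ratings — can be sketched as follows. This is a minimal illustration on synthetic data: the four feature columns, the weights, and the ratings are invented stand-ins, not the authors' measurements.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-painting image statistics
# (columns: brightness, contrast, saturation, blue-yellow balance).
n_paintings = 200
stats = rng.normal(size=(n_paintings, 4))

# Hypothetical "morningness" ratings generated from the statistics
# plus noise, so the regression has something to recover.
true_weights = np.array([0.8, 0.5, 0.4, -0.6])
ratings = stats @ true_weights + 0.1 * rng.normal(size=n_paintings)

# 1) PCA via SVD of the centered feature matrix.
centered = stats - stats.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pcs = centered @ vt.T          # principal-component scores

# 2) Multiple linear regression of ratings on the PC scores.
design = np.column_stack([np.ones(n_paintings), pcs])
coef, *_ = np.linalg.lstsq(design, ratings, rcond=None)

# 3) Variance explained (R^2), analogous to the paper's 76% figure.
pred = design @ coef
r2 = 1 - np.sum((ratings - pred) ** 2) / np.sum((ratings - ratings.mean()) ** 2)
print(round(r2, 3))
```

Because all principal components are kept here, the fit is equivalent to regressing on the raw statistics; in practice one would retain only the leading components.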
2. Sensitivity to naturalistic texture relies primarily on high spatial frequencies. J Vis 2023; 23:4. PMID: 36745452; PMCID: PMC9910384; DOI: 10.1167/jov.23.2.4.
Abstract
Natural images contain information at multiple spatial scales. Though we understand how early visual mechanisms split multiscale images into distinct spatial frequency channels, we do not know how the outputs of these channels are processed further by mid-level visual mechanisms. We have recently developed a texture discrimination task that uses synthetic, multi-scale, "naturalistic" textures to isolate these mid-level mechanisms. Here, we use three experimental manipulations (image blur, image rescaling, and eccentric viewing) to show that perceptual sensitivity to naturalistic structure is strongly dependent on features at high object spatial frequencies (measured in cycles/image). As a result, sensitivity depends on a texture acuity limit, a property of the visual system that sets the highest retinal spatial frequency (measured in cycles/degree) at which observers can detect naturalistic features. Analysis of the texture images using a model observer analysis shows that naturalistic image features at high object spatial frequencies carry more task-relevant information than those at low object spatial frequencies. That is, the dependence of sensitivity on high object spatial frequencies is a property of the texture images, rather than a property of the visual system. Accordingly, we find human observers' ability to extract naturalistic information (their efficiency) is similar for all object spatial frequencies. We conclude that the mid-level mechanisms that underlie perceptual sensitivity effectively extract information from all image features below the texture acuity limit, regardless of their retinal and object spatial frequency.
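The distinction drawn here between object spatial frequency (cycles/image) and retinal spatial frequency (cycles/degree) is a rescaling by the image's angular size; a minimal sketch (the example numbers are arbitrary):

```python
def cycles_per_degree(object_sf_cpi: float, image_size_deg: float) -> float:
    """Convert object spatial frequency (cycles/image) to retinal
    spatial frequency (cycles/degree) for an image subtending
    image_size_deg degrees of visual angle."""
    return object_sf_cpi / image_size_deg

# The same texture feature at 32 cycles/image lands at different
# retinal frequencies depending on viewing size: rescaling the image
# moves features relative to a fixed acuity limit in cycles/degree.
print(cycles_per_degree(32, 4.0))   # 8.0 c/deg for a 4-deg image
print(cycles_per_degree(32, 16.0))  # 2.0 c/deg for a 16-deg image
```

This is why image rescaling and eccentric viewing can push the same object-frequency features above or below a texture acuity limit defined in retinal units.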
3. Luminance contrast provides metric depth information. R Soc Open Sci 2023; 10:220567. PMID: 36816842; PMCID: PMC9929495; DOI: 10.1098/rsos.220567.
Abstract
The perception of depth from retinal images depends on information from multiple visual cues. One potential depth cue is the statistical relationship between luminance and distance; darker points in a local region of an image tend to be farther away than brighter points. We establish that this statistical relationship acts as a quantitative cue to depth. We show that luminance variations affect depth in naturalistic scenes containing multiple cues to depth. This occurred when the correlation between variations of luminance and depth was manipulated within an object, but not between objects. This is consistent with the local nature of the statistical relationship in natural scenes. We also showed that perceived depth increases as contrast is increased, but only when the depth signalled by luminance and binocular disparity are consistent. Our results show that the negative correlation between luminance and distance, as found under diffuse lighting, provides a depth cue that is combined with depth from binocular disparity, in a way that is consistent with the simultaneous estimation of surface depth and reflectance variations. Adopting more complex lighting models such as ambient occlusion in computer rendering will thus contribute to the accuracy as well as the aesthetic appearance of three-dimensional graphics.
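The "darker is farther" statistic described here is a negative correlation between luminance and distance within a local region. A toy check of that sign, using an invented diffuse-shading model (farther points receive less light) rather than any rendered scene from the study:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic local patch: distance (depth) values and a luminance map
# in which farther points are darker, as under diffuse illumination.
depth = rng.uniform(1.0, 2.0, size=(64, 64))
luminance = 1.0 - 0.4 * depth + 0.02 * rng.normal(size=(64, 64))

# Pearson correlation between luminance and depth within the patch:
# strongly negative, i.e. darker points tend to be farther away.
r = np.corrcoef(luminance.ravel(), depth.ravel())[0, 1]
print(round(r, 2))
```

Increasing the luminance contrast of such a patch scales the luminance–depth regression slope, which is one way to read the paper's finding that contrast modulates perceived depth.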
4. No-Reference Quality Assessment of Authentically Distorted Images Based on Local and Global Features. J Imaging 2022; 8:173. PMID: 35735972; PMCID: PMC9224559; DOI: 10.3390/jimaging8060173.
Abstract
With the development of digital imaging techniques, image quality assessment methods are receiving more attention in the literature. Since distortion-free versions of camera images in many practical, everyday applications are not available, the need for effective no-reference image quality assessment algorithms is growing. Therefore, this paper introduces a novel no-reference image quality assessment algorithm for the objective evaluation of authentically distorted images. Specifically, we apply a broad spectrum of local and global feature vectors to characterize the variety of authentic distortions. Among the employed local features, the statistics of popular local feature descriptors, such as SURF, FAST, BRISK, or KAZE, are proposed for NR-IQA; other features are also introduced to boost the performance of the local features. The proposed method was compared to 12 other state-of-the-art algorithms on popular and accepted benchmark datasets containing RGB images with authentic distortions (CLIVE, KonIQ-10k, and SPAQ). The introduced algorithm significantly outperforms the state of the art in terms of correlation with human perceptual quality ratings.
5. An image reconstruction framework for characterizing initial visual encoding. eLife 2022; 11:e71132. PMID: 35037622; PMCID: PMC8846596; DOI: 10.7554/elife.71132.
Abstract
We developed an image-computable observer model of the initial visual encoding that operates on natural image input, based on the framework of Bayesian image reconstruction from the excitations of the retinal cone mosaic. Our model extends previous work on ideal observer analysis and evaluation of performance beyond psychophysical discrimination, takes into account the statistical regularities of the visual environment, and provides a unifying framework for answering a wide range of questions regarding the visual front end. Using the error in the reconstructions as a metric, we analyzed variations in the number of different photoreceptor types in the human retina as an optimal design problem. In addition, the reconstructions allow both visualization and quantification of information loss due to physiological optics and cone mosaic sampling, and how these vary with eccentricity. Furthermore, in simulations of color deficiencies and interferometric experiments, we found that the reconstructed images provide a reasonable proxy for modeling subjects' percepts. Lastly, we used the reconstruction-based observer for the analysis of psychophysical thresholds, and found notable interactions between spatial frequency and chromatic direction in the resulting spatial contrast sensitivity function. Our method is widely applicable to experiments and applications in which the initial visual encoding plays an important role.
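In its simplest linear-Gaussian form, Bayesian reconstruction from cone excitations is a regularized inverse of the measurement process. The following is a toy 1-D sketch under that assumption — the measurement matrix (blur plus 2:1 subsampling) and the smoothness prior are placeholders, not the paper's model of optics and the cone mosaic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy measurement model: y = A @ x + noise, where x is the image and
# A stands in for optics + mosaic sampling (pairwise averaging, 2:1).
n_pix, n_cones = 40, 20
A = np.zeros((n_cones, n_pix))
for i in range(n_cones):
    A[i, 2 * i] = 0.5
    A[i, 2 * i + 1] = 0.5

x_true = np.cumsum(rng.normal(size=n_pix)) * 0.1   # smooth "image"
y = A @ x_true + 0.01 * rng.normal(size=n_cones)

# MAP estimate under a Gaussian smoothness prior:
#   x_hat = argmin ||y - A x||^2 + lam * ||D x||^2
D = np.diff(np.eye(n_pix), axis=0)                 # finite-difference prior
lam = 0.01
x_hat = np.linalg.solve(A.T @ A + lam * D.T @ D, A.T @ y)

# Relative reconstruction error: the prior fills in the detail that
# the subsampling discarded, so smooth signals are recovered well.
err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(round(err, 3))
```

The "error in the reconstructions" metric mentioned in the abstract corresponds to quantities like `err` here, computed under the actual retinal forward model.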
6.
Abstract
The sensitivity of the human visual system is thought to be shaped by environmental statistics. A major endeavor in vision science, therefore, is to uncover the image statistics that predict perceptual and cognitive function. When searching for targets in natural images, for example, it has recently been proposed that target detection is inversely related to the spatial similarity of the target to its local background. We tested this hypothesis by measuring observers' sensitivity to targets that were blended with natural image backgrounds. Targets were designed to have a spatial structure that was either similar or dissimilar to the background. Contrary to masking from similarity, we found that observers were most sensitive to targets that were most similar to their backgrounds. We hypothesized that a coincidence of phase alignment between target and background results in a local contrast signal that facilitates detection when target-background similarity is high. We confirmed this prediction in a second experiment. Indeed, we show that, by solely manipulating the phase of a target relative to its background, the target can be rendered easily visible or undetectable. Our study thus reveals that, in addition to its structural similarity, the phase of the target relative to the background must be considered when predicting detection sensitivity in natural images.
7. Visual Discomfort and Variations in Chromaticity in Art and Nature. Front Neurosci 2021; 15:711064. PMID: 34987354; PMCID: PMC8720932; DOI: 10.3389/fnins.2021.711064.
Abstract
Visual discomfort is related to the statistical regularity of visual images. The contribution of luminance contrast to visual discomfort is well understood and can be framed in terms of a theory of efficient coding of natural stimuli, and linked to metabolic demand. While color is important in our interaction with nature, the effect of color on visual discomfort has received less attention. In this study, we build on the established association between visual discomfort and differences in chromaticity across space. We average the local differences in chromaticity in an image and show that this average is a good predictor of visual discomfort from the image. It accounts for part of the variance left unexplained by variations in luminance. We show that the local chromaticity difference in uncomfortable stimuli is high compared to that typical in natural scenes, except in particular infrequent conditions such as the arrangement of colorful fruits against foliage. Overall, our study discloses a new link between visual ecology and discomfort whereby discomfort arises when adaptive perceptual mechanisms are overstimulated by specific classes of stimuli rarely found in nature.
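The predictor described here — the average local difference in chromaticity across an image — reduces to a mean over neighboring-pixel colour differences. A minimal sketch on synthetic data, with two channels standing in for a chromaticity plane (all values invented):

```python
import numpy as np

rng = np.random.default_rng(3)

def mean_local_chromaticity_diff(chroma: np.ndarray) -> float:
    """Average Euclidean chromaticity difference between horizontally
    and vertically adjacent pixels. chroma has shape (H, W, 2)."""
    dh = np.linalg.norm(chroma[:, 1:] - chroma[:, :-1], axis=-1)
    dv = np.linalg.norm(chroma[1:, :] - chroma[:-1, :], axis=-1)
    return float(np.concatenate([dh.ravel(), dv.ravel()]).mean())

# Smoothly varying chromaticity (nature-like) vs. high-amplitude
# pixelwise variation (the regime associated with discomfort).
smooth = np.dstack([np.tile(np.linspace(0, 1, 32), (32, 1))] * 2)
noisy = rng.uniform(0, 1, size=(32, 32, 2))

print(mean_local_chromaticity_diff(smooth)
      < mean_local_chromaticity_diff(noisy))   # True
```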
8. Rat sensitivity to multipoint statistics is predicted by efficient coding of natural scenes. eLife 2021; 10:e72081. PMID: 34872633; PMCID: PMC8651284; DOI: 10.7554/elife.72081.
Abstract
Efficient processing of sensory data requires adapting the neuronal encoding strategy to the statistics of natural stimuli. Previously, in Hermundstad et al., 2014, we showed that local multipoint correlation patterns that are most variable in natural images are also the most perceptually salient for human observers, in a way that is compatible with the efficient coding principle. Understanding the neuronal mechanisms underlying such adaptation to image statistics will require performing invasive experiments that are impossible in humans. Therefore, it is important to understand whether a similar phenomenon can be detected in animal species that allow for powerful experimental manipulations, such as rodents. Here we selected four image statistics (from single- to four-point correlations) and trained four groups of rats to discriminate between white noise patterns and binary textures containing variable intensity levels of one of such statistics. We interpreted the resulting psychometric data with an ideal observer model, finding a sharp decrease in sensitivity from two- to four-point correlations and a further decrease from four- to three-point. This ranking fully reproduces the trend we previously observed in humans, thus extending a direct demonstration of efficient coding to a species where neuronal and developmental processes can be interrogated and causally manipulated.
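The one- to four-point statistics used in this line of work can be computed as parity averages over small pixel templates on a ±1 binary texture. A minimal sketch, assuming standard 2×2-neighborhood templates (the abstract does not specify the exact templates, and the example images below are trivial test patterns, not the study's stimuli):

```python
import numpy as np

def glider_stats(img: np.ndarray) -> dict:
    """1- to 4-point correlation statistics of a binary (+1/-1) texture,
    as mean products of pixel values over small templates."""
    return {
        "gamma1": img.mean(),                            # 1-point (luminance bias)
        "beta_h": (img[:, :-1] * img[:, 1:]).mean(),     # 2-point, horizontal pairs
        "theta":  (img[:-1, :-1] * img[:-1, 1:]
                   * img[1:, :-1]).mean(),               # 3-point, L-shaped
        "alpha":  (img[:-1, :-1] * img[:-1, 1:]
                   * img[1:, :-1] * img[1:, 1:]).mean(), # 4-point, 2x2 blocks
    }

# Checkerboard: every 2x2 block has two +1s and two -1s, so the
# 4-point product is +1 everywhere, while horizontal pairs always differ.
checker = (-1) ** np.add.outer(np.arange(8), np.arange(8))
s = glider_stats(checker)
print(s["beta_h"], s["alpha"])   # -1.0 1.0
```

Textures with a controlled intensity of one such statistic (and the others near zero) are what the rats were trained to discriminate from white noise.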
9. Analysis and Synthesis of Natural Texture Perception From Visual Evoked Potentials. Front Neurosci 2021; 15:698940. PMID: 34381330; PMCID: PMC8350323; DOI: 10.3389/fnins.2021.698940.
Abstract
The primate visual system analyzes statistical information in natural images and uses it for the immediate perception of scenes, objects, and surface materials. To investigate the dynamical encoding of image statistics in the human brain, we measured visual evoked potentials (VEPs) for 166 natural textures and their synthetic versions, and performed a reverse-correlation analysis of the VEPs and representative texture statistics of the image. The analysis revealed occipital VEP components strongly correlated with particular texture statistics. VEPs correlated with low-level statistics, such as subband SDs, emerged rapidly from 100 to 250 ms in a spatial frequency dependent manner. VEPs correlated with higher-order statistics, such as subband kurtosis and cross-band correlations, were observed at slightly later times. Moreover, these robust correlations enabled us to inversely estimate texture statistics from VEP signals via linear regression and to reconstruct texture images that appear similar to those synthesized with the original statistics. Additionally, we found significant differences in VEPs at 200-300 ms between some natural textures and their Portilla-Simoncelli (PS) synthesized versions, even though they shared almost identical texture statistics. This differential VEP was related to the perceptual "unnaturalness" of PS-synthesized textures. These results suggest that the visual cortex rapidly encodes image statistics hidden in natural textures specifically enough to predict the visual appearance of a texture, while it also represents high-level information beyond image statistics, and that electroencephalography can be used to decode these cortical signals.
10. Human Texture Vision as Multi-Order Spectral Analysis. Front Comput Neurosci 2021; 15:692334. PMID: 34381346; PMCID: PMC8349988; DOI: 10.3389/fncom.2021.692334.
Abstract
Texture information plays a critical role in the rapid perception of scenes, objects, and materials. Here, we propose a novel model in which visual texture perception is essentially determined by the 1st-order (2D-luminance) and 2nd-order (4D-energy) spectra. This model is an extension of the dimensionality of the Filter-Rectify-Filter (FRF) model, and it also corresponds to the frequency representation of the Portilla-Simoncelli (PS) statistics. We show that preserving two spectra and randomizing phases of a natural texture image result in a perceptually similar texture, strongly supporting the model. Based on only two single spectral spaces, this model provides a simpler framework to describe and predict texture representations in the primate visual system. The idea of multi-order spectral analysis is consistent with the hierarchical processing principle of the visual cortex, which is approximated by a multi-layer convolutional network.
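The key manipulation reported here — preserve an image's spectra while randomizing its phases — can be sketched with a plain FFT phase scramble. Note this preserves only the 1st-order (luminance) spectrum; also matching the 2nd-order energy spectrum, as the full model requires, is more involved:

```python
import numpy as np

rng = np.random.default_rng(4)

def phase_scramble(img: np.ndarray, rng) -> np.ndarray:
    """Return an image with the same Fourier amplitude spectrum as img
    but randomized phases. Taking the phases from the FFT of a random
    real-valued array inherits the Hermitian symmetry a real image needs."""
    amp = np.abs(np.fft.fft2(img))
    random_phase = np.angle(np.fft.fft2(rng.normal(size=img.shape)))
    return np.real(np.fft.ifft2(amp * np.exp(1j * random_phase)))

texture = rng.normal(size=(64, 64))
scrambled = phase_scramble(texture, rng)

# Amplitude spectra match; pixel values do not.
same_spectrum = np.allclose(np.abs(np.fft.fft2(scrambled)),
                            np.abs(np.fft.fft2(texture)))
print(same_spectrum)   # True
```

If such a scramble leaves a natural texture perceptually similar, the preserved spectra — not the discarded phases — are carrying the percept, which is the logic of the experiment described above.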
11. The roles of lower- and higher-order surface statistics in tactile texture perception. J Neurophysiol 2021; 126:95-111. PMID: 34038163; DOI: 10.1152/jn.00577.2020.
Abstract
Humans can haptically discriminate surface textures when there is a significant difference in the statistics of the surface profile. Previous studies on tactile texture discrimination have emphasized the perceptual effects of lower-order statistical features such as carving depth, inter-ridge distance, and anisotropy, which can be characterized by local amplitude spectra or spatial-frequency/orientation subband histograms. However, the real-world surfaces we encounter in everyday life also differ in higher-order statistics, such as correlations of nearby spatial frequencies and orientations. In vision, by contrast, the human brain can use textural differences in both higher- and lower-order image statistics. In this work, we examined whether haptic texture perception can use higher-order surface statistics as visual texture perception does, by 3-D printing textured surfaces transcribed from different "photos" of natural scenes such as stones and leaves. Even though the maximum carving depth was well above the haptic detection threshold, some texture pairs were hard to discriminate. Specifically, texture pairs with similar amplitude spectra were difficult to discriminate, which suggests that the lower-order statistics have the dominant effect on tactile texture discrimination. To directly test the poor sensitivity of tactile texture perception to higher-order surface statistics, we matched the lower-order statistics across different textures using a texture synthesis algorithm and found that haptic discrimination of the matched textures was nearly impossible unless the stimuli contained salient local features. We found no evidence for the ability of the human tactile system to use higher-order surface statistics for texture discrimination.

New & Noteworthy: Humans can discriminate subtle spatial-pattern differences in the surrounding world through their hands, but the underlying computation remains poorly understood. Here, we 3-D printed textured surfaces and analyzed tactile discrimination performance with respect to sensitivity to surface statistics. The results suggest that observers are sensitive to lower-order statistics but not to higher-order statistics. That is, touch differs from vision not only in spatiotemporal resolution but also in its (in)sensitivity to high-level surface statistics.
12. No-Reference Image Quality Assessment with Global Statistical Features. J Imaging 2021; 7:29. PMID: 34460628; PMCID: PMC8321268; DOI: 10.3390/jimaging7020029.
Abstract
The perceptual quality of digital images is often deteriorated during storage, compression, and transmission. The most reliable way of assessing image quality is to ask people to provide their opinions on a number of test images. However, this is an expensive and time-consuming process which cannot be applied in real-time systems. In this study, a novel no-reference image quality assessment method is proposed. The introduced method uses a set of novel quality-aware features which globally characterizes the statistics of a given test image, such as extended local fractal dimension distribution feature, extended first digit distribution features using different domains, Bilaplacian features, image moments, and a wide variety of perceptual features. Experimental results are demonstrated on five publicly available benchmark image quality assessment databases: CSIQ, MDID, KADID-10k, LIVE In the Wild, and KonIQ-10k.
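One of the global features listed here, the first-digit distribution, compares the leading decimal digits of image-derived coefficients against Benford's law. A minimal sketch of that feature; the log-spaced test values below are an illustrative stand-in for the transform-domain coefficients the paper actually uses:

```python
import numpy as np

def first_digit_histogram(values: np.ndarray) -> np.ndarray:
    """Normalized histogram of leading decimal digits (1-9) of |values|,
    ignoring zeros. Natural-image coefficients tend toward Benford's law."""
    v = np.abs(values[values != 0]).astype(float)
    first = (v / 10.0 ** np.floor(np.log10(v))).astype(int)  # leading digit
    counts = np.bincount(first, minlength=10)[1:10]
    return counts / counts.sum()

# Benford's law: P(d) = log10(1 + 1/d) for d = 1..9.
benford = np.log10(1 + 1 / np.arange(1, 10))

# Values spread uniformly in log10 over five decades follow Benford closely;
# distorted images typically deviate, which makes this a quality-aware feature.
coeffs = 10 ** np.linspace(0, 5, 20001)[:-1]
deviation = np.abs(first_digit_histogram(coeffs) - benford).sum()
print(deviation < 0.05)   # True
```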
13. The nature effect in motion: visual exposure to environmental scenes impacts cognitive load and human gait kinematics. R Soc Open Sci 2021; 8:201100. PMID: 33614067; PMCID: PMC7890511; DOI: 10.1098/rsos.201100.
Abstract
Prolonged exposure to urban environments requires higher cognitive processing resources than exposure to nature environments, even if only visual cues are available. Here, we explored the moment-to-moment impact of environment type on visual cognitive processing load, measuring gait kinematics and reaction times. In Experiment 1, participants (n = 20) walked toward nature and urban images projected in front of them, one image per walk, and rated each image for visual discomfort. Gait speed and step length decreased for exposure to urban as compared with nature scenes in line with gait changes observed during verbal cognitive load tasks. We teased apart factors that might contribute to cognitive load: image statistics and visual discomfort. Gait changes correlated with subjective ratings of visual discomfort and their interaction with the environment but not with low-level image statistics. In Experiment 2, participants (n = 45) performed a classic shape discrimination task with the same environmental scenes serving as task-irrelevant distractors. Shape discrimination was slower when urban scenes were presented, suggesting that it is harder to disengage attention from urban than from nature scenes. This provides converging evidence that increased cognitive demands posed by exposure to urban scenes can be measured with gait kinematics and reaction times even for short exposure times.
14.
Abstract
While aesthetic experiences are not limited to any particular context, their sensorial, cognitive and behavioral properties can be profoundly affected by the circumstances in which they occur. Given the ubiquitous nature of contextual effects in nearly all aspects of behavior, investigations aimed at delineating the context-dependent and context-independent aspects of aesthetic experience and engagement with aesthetic objects in a diverse range of settings are important in empirical aesthetics. Here, we analyze the viewing behavior of visitors (N = 19) freely viewing 15 paintings in the 20th-century Australian collection room at the Art Gallery of New South Wales. In particular, we focus on how aspects of viewing behavior including viewing distance in the gallery condition and eye gaze measures such as fixation count, total fixation duration and average fixation duration are affected by the artworks’ physical characteristics including size and image statistics properties such as Fourier amplitude spectrum, fractal dimension and entropy. In addition, the same artworks were viewed in the laboratory, either scaled to fit most of the screen (N = 22) or to preserve their relative size as in the museum condition (N = 17) to assess the robustness of these relationships across different presentation contexts. We find that the effects of presentation context are modulated by the artworks’ physical characteristics.
15. Aesthetic Image Statistics Vary with Artistic Genre. Vision (Basel) 2020; 4:10. PMID: 32024058; PMCID: PMC7157489; DOI: 10.3390/vision4010010.
Abstract
Research to date has not found strong evidence for a universal link between any single low-level image statistic, such as fractal dimension or Fourier spectral slope, and aesthetic ratings of images in general. This study assessed whether different image statistics are important for artistic images containing different subjects and used partial least squares regression (PLSR) to identify the statistics that correlated most reliably with ratings. Fourier spectral slope, fractal dimension and Shannon entropy were estimated separately for paintings containing landscapes, people, still life, portraits, nudes, animals, buildings and abstracts. Separate analyses were performed on the luminance and colour information in the images. PLSR fits showed shared variance of up to 75% between image statistics and aesthetic ratings. The most important statistics and image planes varied across genres. Variation in statistics may reflect characteristic properties of the different neural sub-systems that process different types of image.
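Of the three statistics estimated here, the Fourier spectral slope is typically obtained by a log-log linear fit to the radially averaged power spectrum. A minimal sketch under that common definition, checked on synthetic noise with a known slope (the binning and fit-range choices are illustrative, not the paper's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(6)

def spectral_slope(img: np.ndarray) -> float:
    """Slope of log power vs. log radial frequency for a square image."""
    n = img.shape[0]
    power = np.abs(np.fft.fft2(img)) ** 2
    fx = np.fft.fftfreq(n) * n
    radius = np.rint(np.hypot(*np.meshgrid(fx, fx))).astype(int)
    # Radially average power over integer frequency bins (skip DC).
    bins = np.arange(1, n // 2)
    radial = np.array([power[radius == r].mean() for r in bins])
    slope, _ = np.polyfit(np.log(bins), np.log(radial), 1)
    return slope

# Synthesize noise with amplitude proportional to 1/f (power slope -2),
# the regime typical of natural images and many paintings.
n = 128
fx = np.fft.fftfreq(n) * n
f = np.hypot(*np.meshgrid(fx, fx))
f[0, 0] = 1.0
spectrum = (1.0 / f) * np.exp(2j * np.pi * rng.random((n, n)))
img = np.real(np.fft.ifft2(spectrum))

slope = spectral_slope(img)
ok = abs(slope + 2.0) < 0.3
print(ok)   # recovered slope is close to -2
```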
16. Systematic Differences Between Perceptually Relevant Image Statistics of Brain MRI and Natural Images. Front Neuroinform 2019; 13:46. PMID: 31293409; PMCID: PMC6603243; DOI: 10.3389/fninf.2019.00046.
Abstract
It is well-known that the human visual system is adapted to the statistical structure of natural scenes. Yet there are important classes of images - for example, medical images - that are not natural scenes, and therefore, that are expected to have statistical properties that deviate from the class of images that shaped the evolution and development of human vision. Here, focusing on structural brain MRI images, we quantify and characterize these deviations in terms of a set of local image statistics to which human visual sensitivity has been well-characterized, and that has previously been used for natural image analysis. We analyzed MRI images in multiple databases including T1-weighted and FLAIR sequence types, and simulated MRI images based on a published image simulation procedure for T1 images, which we also modified to generate FLAIR images. We first computed the power spectra of MRI images; spectral slopes were in the range -2.6 to -3.1 for T1 sequences, and -2.2 to -2.7 for FLAIR sequences. Analysis of local image statistics was then carried out on whitened images. For all of the databases as well as for the simulated images, we found that the three-point correlations contributed substantially to the differences between the "texture" of randomly selected ROIs. The informative nature of three-point correlations for brain MRI was greater than for natural images, and also disproportionate to human visual sensitivity. As this finding was consistent across databases, it is likely to result from brain geometry at the scale of brain MRI resolution, rather than characteristics of specific imaging and reconstruction methods.
17. Evaluation of non-Gaussian statistical properties in virtual breast phantoms. J Med Imaging (Bellingham) 2019; 6:025502. PMID: 31259201; PMCID: PMC6566002; DOI: 10.1117/1.jmi.6.2.025502.
Abstract
Images derived from a "virtual phantom" can be useful in characterizing the performance of imaging systems. This has driven the development of virtual breast phantoms implemented in simulation environments. In breast imaging, several such phantoms have been proposed. We analyze the non-Gaussian statistical properties from three classes of virtual breast phantoms and compare them to similar statistics from a database of breast images. These include clustered-blob lumpy backgrounds (CBLBs), truncated binary textures, and the UPenn virtual breast phantoms. We use Laplacian fractional entropy (LFE) as a measure of the non-Gaussian statistical properties of each simulation procedure. Our results show that, despite similar power spectra, the simulation approaches differ considerably in LFE with very low scores for the CBLB to high values for the UPenn phantom at certain frequencies. These results suggest that LFE may have value in developing and tuning virtual phantom simulation procedures.
18. Neural Mechanisms of Material Perception: Quest on Shitsukan. Neuroscience 2018; 392:329-347. PMID: 30213767; DOI: 10.1016/j.neuroscience.2018.09.001.
Abstract
In recent years, a growing body of research has addressed the nature and mechanism of material perception. Material perception entails perceiving and recognizing a material, surface quality or internal state of an object based on sensory stimuli such as visual, tactile, and/or auditory sensations. This process is ongoing in every aspect of daily life. We can, for example, easily distinguish whether an object is made of wood or metal, or whether a surface is rough or smooth. Judging whether the ground is wet or dry or whether a fish is fresh also involves material perception. Information obtained through material perception can be used to govern actions toward objects and to make decisions about whether to approach an object or avoid it. Because the physical processes leading to sensory signals related to material perception are complicated, it has been difficult to manipulate experimental stimuli in a rigorous manner. However, that situation is now changing thanks to advances in technology and knowledge in related fields. In this article, we will review what is currently known about the neural mechanisms responsible for material perception. We will show that cortical areas in the ventral visual pathway are strongly involved in material perception. Our main focus is on vision, but every sensory modality is involved in material perception. Information obtained through different sensory modalities is closely linked in material perception. Such cross-modal processing is another important feature of material perception, and will also be covered in this review.
|
19
|
Gradual Development of Visual Texture-Selective Properties Between Macaque Areas V2 and V4. Cereb Cortex 2018; 27:4867-4880. [PMID: 27655929 DOI: 10.1093/cercor/bhw282] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2016] [Accepted: 08/18/2016] [Indexed: 11/13/2022] Open
Abstract
Complex shape and texture representations are known to be constructed from V1 along the ventral visual pathway through areas V2 and V4, but the underlying mechanism remains elusive. A recent study suggests that, for the processing of textures, a collection of higher-order image statistics computed by combining V1-like filter responses serves as a possible representation of textures in both V2 and V4. Here, to gain insight into how these image statistics are processed in the extrastriate visual areas, we compared neuronal responses to textures in V2 and V4 of macaque monkeys. For individual neurons, we adaptively explored their preferred textures from among thousands of naturalistic textures and fitted the obtained responses using a combination of V1-like filter responses and higher-order statistics. We found that, while the selectivity for image statistics was largely comparable between V2 and V4, V4 showed slightly stronger sensitivity to the higher-order statistics than V2. Consistent with that finding, V4 responses were reduced to a greater extent than V2 responses when the monkeys were shown spectrally matched noise images that lacked higher-order statistics. We therefore suggest that there is a gradual development in the representation of higher-order features along the ventral visual hierarchy.
|
20
|
Abstract
Visual textures are a class of stimuli with properties that make them well suited for addressing general questions about visual function at the levels of behavior and neural mechanism. They have structure across multiple spatial scales, they put the focus on the inferential nature of visual processing, and they help bridge the gap between stimuli that are analytically convenient and the complex, naturalistic stimuli that have the greatest biological relevance. Key questions that are well suited for analysis via visual textures include the nature and structure of perceptual spaces, modulation of early visual processing by task, and the transformation of sensory stimuli into patterns of population activity that are relevant to perception.
|
21
|
Spatial Statistics for Segmenting Histological Structures in H&E Stained Tissue Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2017; 36:1522-1532. [PMID: 28328502 PMCID: PMC5498226 DOI: 10.1109/tmi.2017.2681519] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Segmenting a broad class of histological structures in transmitted light and/or fluorescence-based images is a prerequisite for determining the pathological basis of cancer, elucidating spatial interactions between histological structures in tumor microenvironments (e.g., tumor infiltrating lymphocytes), facilitating precision medicine studies with deep molecular profiling, and providing an exploratory tool for pathologists. This paper focuses on segmenting histological structures in hematoxylin- and eosin-stained images of breast tissues, e.g., invasive carcinoma, carcinoma in situ, atypical and normal ducts, adipose tissue, and lymphocytes. We propose two graph-theoretic segmentation methods based on local spatial color and nuclei neighborhood statistics. For benchmarking, we curated a data set of 232 high-power field breast tissue images together with expertly annotated ground truth. To accurately model the preference for histological structures (ducts, vessels, tumor nets, adipose, etc.) over the remaining connective tissue and non-tissue areas in ground truth annotations, we propose a new region-based score for evaluating segmentation algorithms. We demonstrate the improvement of our proposed methods over the state-of-the-art algorithms in both region- and boundary-based performance measures.
|
22
|
Entrainment of visual steady-state responses is modulated by global spatial statistics. J Neurophysiol 2017; 118:344-352. [PMID: 28446580 PMCID: PMC5498732 DOI: 10.1152/jn.00129.2017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2017] [Revised: 04/24/2017] [Accepted: 04/25/2017] [Indexed: 11/22/2022] Open
Abstract
The rhythmic delivery of visual stimuli evokes large-scale neuronal entrainment in the form of steady-state oscillatory field potentials. The spatiotemporal properties of stimulus drive appear to constrain the relative degrees of neuronal entrainment. Specific frequency ranges, for example, are uniquely suited for enhancing the strength of stimulus-driven brain oscillations. When it comes to the nature of the visual stimulus itself, studies have used a plethora of inputs ranging from spatially unstructured empty fields to simple contrast patterns (checkerboards, gratings, stripes) and complex arrays (human faces, houses, natural scenes). At present, little is known about how the global spatial statistics of the input stimulus influence entrainment of scalp-recorded electrophysiological signals. In this study, we used rhythmic entrainment source separation of scalp EEG to compare stimulus-driven phase alignment for distinct classes of visual inputs, including broadband spatial noise ensembles with varying second-order statistics, natural scenes, and narrowband sine-wave gratings delivered at a constant flicker frequency. The relative magnitude of visual entrainment was modulated by the global properties of the driving stimulus. Entrainment was strongest for pseudo-naturalistic broadband visual noise patterns in which luminance contrast is greatest at low spatial frequencies (a power spectrum falling off as 1/f²). NEW & NOTEWORTHY: Rhythmically modulated visual stimuli entrain the activity of neuronal populations, but the effect of global stimulus statistics on this entrainment is unknown. We assessed entrainment evoked by 1) visual noise ensembles with different spectral slopes, 2) complex natural scenes, and 3) narrowband sinusoidal gratings. Entrainment was most effective for broadband noise with naturalistic luminance contrast. This reveals some global properties shaping stimulus-driven brain oscillations in the human visual system.
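The 1/f² spectral slope singled out above can be made concrete with a short sketch (illustrative only; the function names and fitting range here are assumptions of this sketch, not taken from the study): synthesize noise with a prescribed power-law spectrum and recover the exponent from a log-log fit.

```python
import numpy as np

def power_law_noise(n, alpha, seed=0):
    """n x n noise image whose power spectrum falls off as 1/f^alpha."""
    rng = np.random.default_rng(seed)
    fx = np.fft.fftfreq(n)[:, None]
    fy = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0                        # avoid division by zero at DC
    amplitude = f ** (-alpha / 2.0)      # power = amplitude^2 ~ 1/f^alpha
    phases = rng.uniform(0.0, 2.0 * np.pi, (n, n))
    img = np.fft.ifft2(amplitude * np.exp(1j * phases)).real
    return (img - img.mean()) / img.std()

def spectral_slope(img, f_lo=0.02, f_hi=0.4):
    """Estimate the power-spectrum exponent via a log-log linear fit."""
    n = img.shape[0]
    power = np.abs(np.fft.fft2(img)) ** 2
    fx = np.fft.fftfreq(n)[:, None]
    fy = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fx, fy).ravel()
    p = power.ravel()
    keep = (f > f_lo) & (f < f_hi) & (p > 0)
    slope, _ = np.polyfit(np.log(f[keep]), np.log(p[keep]), 1)
    return slope                         # ~ -alpha for 1/f^alpha noise
```

In this scheme a "naturalistic" image corresponds to `alpha` near 2, i.e., a fitted slope near −2.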
|
23
|
Inferring Master Painters' Esthetic Biases from the Statistics of Portraits. Front Hum Neurosci 2017; 11:94. [PMID: 28337133 PMCID: PMC5343217 DOI: 10.3389/fnhum.2017.00094] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2016] [Accepted: 02/15/2017] [Indexed: 11/23/2022] Open
Abstract
The Processing Fluency Theory posits that the ease of sensory information processing in the brain facilitates esthetic pleasure. Accordingly, the theory would predict that master painters should display biases toward visual properties such as symmetry, balance, and moderate complexity. Have these biases occurred, and if so, have painters been optimizing these properties (fluency variables)? Here, we address these questions with statistics of portrait paintings from the Early Renaissance period. To do this, we first developed different computational measures for each of the aforementioned fluency variables. Then, we measured their statistics in 153 portraits from 26 master painters, in 27 photographs of people in three controlled poses, and in 38 quickly snapped photographs of individual persons. A statistical comparison between Early Renaissance portraits and quickly snapped photographs revealed that painters showed a bias toward balance, symmetry, and moderate complexity. However, a comparison between portraits and controlled-pose photographs showed that painters did not optimize each of these properties. Instead, different painters presented biases toward different, narrow ranges of fluency variables. Further analysis suggested that the painters' individuality stemmed in part from having to resolve the tension between complexity on one side and symmetry and balance on the other. We additionally found that constraints on the use of different painting materials by distinct painters modulated these fluency variables systematically. In conclusion, the Processing Fluency Theory of Esthetic Pleasure would need expansion if we were to apply it to the history of visual art, since it cannot explain the lack of optimization of each fluency variable. To expand the theory, we propose the existence of a Neuroesthetic Space, which encompasses the possible values that each of the fluency variables can reach in any given art period.
We discuss the neural mechanisms of this Space and propose that it has a distributed representation in the human brain. We further propose that different artists reside in different, small sub-regions of the Space. This Neuroesthetic-Space hypothesis raises the question of how painters and their paintings evolve across art periods.
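The fluency variables discussed above admit simple computational proxies. The following sketch is illustrative only: `mirror_symmetry`, `balance`, and `complexity` are generic stand-ins of my own, not the measures actually developed in the paper.

```python
import numpy as np

def mirror_symmetry(img):
    """Left-right symmetry: correlation between an image and its horizontal flip."""
    flipped = img[:, ::-1]
    a, b = img - img.mean(), flipped - flipped.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))

def balance(img):
    """Distance of the luminance centroid from the geometric center (0 = balanced)."""
    h, w = img.shape
    total = img.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    cy = (ys * img).sum() / total
    cx = (xs * img).sum() / total
    return float(np.hypot(cy - (h - 1) / 2.0, cx - (w - 1) / 2.0))

def complexity(img):
    """Mean gradient magnitude as a crude complexity proxy."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return float(np.hypot(gx, gy).mean())
```

A perfectly symmetric image scores 1 on `mirror_symmetry`; a uniform field scores 0 on both `balance` and `complexity`.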
|
24
|
Contributions of low- and high-level properties to neural processing of visual scenes in the human brain. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0102. [PMID: 28044013 DOI: 10.1098/rstb.2016.0102] [Citation(s) in RCA: 83] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/20/2016] [Indexed: 11/12/2022] Open
Abstract
Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
|
25
|
Low-Level Contrast Statistics of Natural Images Can Modulate the Frequency of Event-Related Potentials (ERP) in Humans. Front Hum Neurosci 2016; 10:630. [PMID: 28018197 PMCID: PMC5145888 DOI: 10.3389/fnhum.2016.00630] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2016] [Accepted: 11/25/2016] [Indexed: 11/20/2022] Open
Abstract
Humans are fast and accurate in categorizing complex natural images. It is, however, unclear what features of visual information are exploited by the brain to perceive images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of the amplitude of event-related potentials (ERPs) in response to rapidly presented images. In this study, we investigated the effect of these statistics on the frequency content of ERPs. We recorded ERPs from human subjects while they viewed natural images, each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of ERPs best among the image statistics we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and ERP power within the theta frequency band (~3–7 Hz). This is interesting, as the theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated by the low-level contrast statistics of natural images, highlighting their potential role in scene perception.
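A Weibull fit to the distribution of local contrast values, of the kind this abstract refers to, can be sketched as follows. This is a minimal illustration using the standard two-parameter maximum-likelihood equations; the study's actual parameterization of its Weibull contrast statistics may differ.

```python
import numpy as np

def fit_weibull(x, iters=200):
    """Damped fixed-point iteration for the Weibull maximum-likelihood
    equations; returns (shape k, scale lam)."""
    x = np.asarray(x, dtype=float)
    x = x[x > 0]
    lnx = np.log(x)
    mean_lnx = lnx.mean()
    k = 1.0
    for _ in range(iters):
        xk = x ** k
        k_new = 1.0 / ((xk * lnx).sum() / xk.sum() - mean_lnx)
        k = 0.5 * (k + k_new)            # damping for stable convergence
    lam = (x ** k).mean() ** (1.0 / k)
    return float(k), float(lam)

def contrast_values(img):
    """Local contrast proxy: gradient-magnitude values of a luminance image."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy).ravel()
```

Fitting `fit_weibull(contrast_values(img))` yields the two Weibull parameters (roughly, overall contrast strength and the grain of the scene) that such studies relate to ERP responses.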
|
26
|
Abstract
Natural image statistics play a crucial role in shaping biological visual systems, understanding their function and design principles, and designing effective computer-vision algorithms. High-order statistics are critical for conveying local features, but they are challenging to study, largely because their number and variety are large. Here, via the use of two-dimensional Hermite (TDH) functions, we identify a covert symmetry in the high-order statistics of natural images that simplifies this task. This symmetry emerges from the structure of TDH functions, which are an orthogonal set of functions organized into a hierarchy of ranks. Specifically, we find that the shape (skewness and kurtosis) of the distribution of filter coefficients depends only on the projection of the function onto a one-dimensional subspace specific to each rank. The characterization of natural image statistics provided by TDH filter coefficients reflects both their phase and amplitude structure, and we suggest an intuitive interpretation for the special subspace within each rank.
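The shape descriptors named above, skewness and kurtosis of a coefficient distribution, can be computed as in this generic sketch (it does not implement the TDH filters themselves; both statistics are zero for a Gaussian distribution).

```python
import numpy as np

def skewness(coeffs):
    """Third standardized moment of a coefficient distribution."""
    c = np.asarray(coeffs, dtype=float)
    z = (c - c.mean()) / c.std()
    return float((z ** 3).mean())

def excess_kurtosis(coeffs):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    c = np.asarray(coeffs, dtype=float)
    z = (c - c.mean()) / c.std()
    return float((z ** 4).mean() - 3.0)
```

Applied to the coefficients of any filter bank, these two numbers summarize how far the response distribution departs from Gaussian, which is the sense of "shape" used in the abstract.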
|
27
|
Image Statistics and the Representation of Material Properties in the Visual Cortex. Front Psychol 2016; 7:1185. [PMID: 27582714 PMCID: PMC4987329 DOI: 10.3389/fpsyg.2016.01185] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2016] [Accepted: 07/26/2016] [Indexed: 11/13/2022] Open
Abstract
We explored perceived material properties (roughness, texturedness, and hardness) with a novel approach that compares perception, image statistics, and brain activation, as measured with fMRI. We initially asked participants to rate 84 material images with respect to the above-mentioned properties, and then scanned 15 of the participants with fMRI while they viewed the material images. The images were analyzed with a set of image statistics capturing their spatial frequency and texture properties. Linear classifiers were then applied to the image statistics as well as the voxel patterns of visually responsive voxels and early visual areas to discriminate between images with high and low perceptual ratings. Roughness and texturedness could be classified above chance level based on image statistics. Roughness and texturedness could also be classified based on the brain activation patterns in visual cortex, whereas hardness could not. Importantly, the agreement in classification based on image statistics and brain activation was also above chance level. Our results show that information about visual material properties is to a large degree contained in low-level image statistics, and that these image statistics are also partially reflected in brain activity patterns induced by the perception of material images.
|
28
|
Abstract
Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with "equivalent" 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target.
|
29
|
Properties of artificial neurons that report lightness based on accumulated experience with luminance. Front Comput Neurosci 2014; 8:134. [PMID: 25404912 PMCID: PMC4217489 DOI: 10.3389/fncom.2014.00134] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2013] [Accepted: 10/01/2014] [Indexed: 11/13/2022] Open
Abstract
The responses of visual neurons in experimental animals have been extensively characterized. To ask whether these responses are consistent with a wholly empirical concept of visual perception, we optimized simple neural networks that responded according to the cumulative frequency of occurrence of local luminance patterns in retinal images. Based on this estimation of accumulated experience, the neuron responses showed classical center-surround receptive fields, luminance gain control and contrast gain control, the key properties of early level visual neurons determined in animal experiments. These results imply that a major purpose of pre-cortical neuronal circuitry is to contend with the inherently uncertain significance of luminance values in natural stimuli.
|
30
|
Observer efficiency in free-localization tasks with correlated noise. Front Psychol 2014; 5:345. [PMID: 24817854 PMCID: PMC4013476 DOI: 10.3389/fpsyg.2014.00345] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2014] [Accepted: 04/02/2014] [Indexed: 11/28/2022] Open
Abstract
The efficiency of visual tasks involving localization has traditionally been evaluated using forced-choice experiments that capitalize on independence across locations to simplify the performance of the ideal observer. However, developments in ideal observer analysis have shown how an ideal observer can be defined for free-localization tasks, where a target can appear anywhere in a defined search region and subjects respond by localizing the target. Since these tasks are representative of many real-world search tasks, it is of interest to evaluate the efficiency of observer performance in them. The central question of this work is whether humans are able to effectively use the information in a free-localization task relative to a similar task where target location is fixed. We use a yes-no detection task at a cued location as the reference for this comparison. Each of the tasks is evaluated using a Gaussian target profile embedded in four different Gaussian noise backgrounds having power-law noise power spectra with exponents ranging from 0 to 3. The free-localization task had a square 6.7° search region. We report on two follow-up studies investigating efficiency in a detect-and-localize task and the effect of processing the white-noise backgrounds. In the fixed-location detection task, we find average observer efficiency ranges from 35 to 59% for the different noise backgrounds. Observer efficiency improves dramatically in the tasks involving localization, ranging from 63 to 82% in the free-localization tasks and from 78 to 92% in the detect-and-localize tasks. Performance in white noise, the lowest-efficiency condition, was improved by filtering the backgrounds to give them a power-law exponent of 2. Classification images, used to examine spatial frequency weights for the tasks, show better tuning to ideal weights in the free-localization tasks. The high absolute levels of efficiency suggest that observers are well adapted to free-localization tasks.
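Observer efficiency in this literature is conventionally the squared ratio of human to ideal sensitivity. A minimal sketch, assuming a standard yes-no d′ and that conventional definition (the paper's actual ideal-observer computation is more involved):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Yes-no sensitivity index: z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def efficiency(d_observer, d_ideal):
    """Statistical efficiency: squared ratio of observer to ideal d'."""
    return (d_observer / d_ideal) ** 2
```

For example, a human d′ of 1.5 against an ideal d′ of 2.0 gives an efficiency of 0.5625, i.e., about 56%, in the same range as the fixed-location results reported above.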
|
31
|
Abstract
How do we find a target embedded in a scene? Within the framework of signal detection theory, this task is carried out by comparing each region of the scene with a "template," i.e., an internal representation of the search target. Here we ask what form this representation takes when the search target is a complex image with uncertain orientation. We examine three possible representations. The first is the matched filter. Such a representation cannot account for the ease with which humans can find a complex search target that is rotated relative to the template. A second representation attempts to deal with this by estimating the relative orientation of target and match and rotating the intensity-based template. No intensity-based template, however, can account for the ability to easily locate targets that are defined categorically and not in terms of a specific arrangement of pixels. Thus, we define a third template that represents the target in terms of image statistics rather than pixel intensities. Subjects performed a two-alternative, forced-choice search task in which they had to localize an image that matched a previously viewed target. Target images were texture patches. In one condition, match images were the same image as the target and distractors were a different image of the same textured material. In the second condition, the match image was of the same texture as the target (but different pixels) and the distractor was an image of a different texture. Match and distractor stimuli were randomly rotated relative to the target. We compared human performance to pixel-based, pixel-based with rotation, and statistic-based search models. The statistic-based search model was most successful at matching human performance. We conclude that humans use summary statistics to search for complex visual targets.
|
32
|
Conspicuous visual signals do not coevolve with increased body size in marine sea slugs. J Evol Biol 2014; 27:676-87. [PMID: 24588922 DOI: 10.1111/jeb.12348] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2013] [Revised: 01/13/2014] [Accepted: 01/23/2014] [Indexed: 11/29/2022]
Abstract
Many taxa use conspicuous colouration to attract mates, signal chemical defences (aposematism) or for thermoregulation. Conspicuousness is a key feature of aposematic signals, and experimental evidence suggests that predators avoid conspicuous prey more readily when they exhibit larger body size and/or pattern elements. Aposematic prey species may therefore evolve a larger body size due to predatory selection pressures, or alternatively, larger prey species may be more likely to evolve aposematic colouration. Therefore, a positive correlation between conspicuousness and body size should exist. Here, we investigated whether there was a phylogenetic correlation between the conspicuousness of animal patterns and body size using an intriguing, understudied model system to examine questions on the evolution of animal signals, namely nudibranchs (opisthobranch molluscs). We also used new ways to compare animal patterns quantitatively with their background habitat in terms of intensity variance and spatial frequency power spectra. In studies of aposematism, conspicuousness is usually quantified using the spectral contrast of animal colour patches against its background; however, other components of visual signals, such as pattern, luminance and spectral sensitivities of potential observers, are largely ignored. Contrary to our prediction, we found that the conspicuousness of body patterns in over 70 nudibranch species decreased as body size increased, indicating that crypsis was not limited to a smaller body size. Therefore, alternative selective pressures on body size and development of colour patterns, other than those inflicted by visual hunting predators, may act more strongly on the evolution of aposematism in nudibranch molluscs.
|
33
|
Abstract
People often make rapid visual judgments of the properties of surfaces they are going to walk on or touch. How do they do this when the interactions of illumination geometry with 3-D material structure and object shape result in images that inverse optics algorithms cannot resolve without externally imposed constraints? A possibly effective strategy would be to use heuristics based on information that can be gleaned rapidly from retinal images. By using perceptual scaling of a large sample of images, combined with correspondence and canonical correlation analyses, we discovered that material properties, such as roughness, thickness, and undulations, are characterized by specific scales of luminance variations. Using movies, we demonstrate that observers' percepts of these 3-D qualities vary continuously as a function of the relative energy in corresponding 2-D frequency bands. In addition, we show that judgments of roughness, thickness, and undulations are predictably altered by adaptation to dynamic noise at the corresponding scales. These results establish that the scale of local 3-D structure is critical in perceiving material properties, and that relative contrast at particular spatial frequencies is important for perceiving the critical 3-D structure from shading cues, so that cortical mechanisms for estimating material properties could be constructed by combining the parallel outputs of sets of frequency-selective neurons. These results also provide methods for remote sensing of material properties in machine vision, and rapid synthesis, editing and transfer of material properties for computer graphics and animation.
|
34
|
Image regions contributing to perceptual translucency: A psychophysical reverse-correlation study. Iperception 2013; 4:407-28. [PMID: 24349699 PMCID: PMC3859557 DOI: 10.1068/i0576] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2012] [Revised: 07/26/2013] [Indexed: 11/21/2022] Open
Abstract
The spatial luminance relationship between shading patterns and specular highlight is suggested to be a cue for perceptual translucency (Motoyoshi, 2010). Although local image features are also important for translucency perception (Fleming & Bulthoff, 2005), they have rarely been investigated. Here, we aimed to extract spatial regions related to translucency perception from computer graphics (CG) images of objects using a psychophysical reverse-correlation method. From many trials in which the observer compared the perceptual translucency of two CG images, we obtained translucency-related patterns showing which image regions were related to perceptual translucency judgments. An analysis of the luminance statistics calculated within these image regions showed that (1) the global rms contrast within an entire CG image was not related to perceptual translucency and (2) the local mean luminance of specific image regions within the CG images correlated well with perceptual translucency. However, the image regions contributing to perceptual translucency differed greatly between observers. These results suggest that perceptual translucency does not rely on global luminance statistics such as global rms contrast, but rather depends on local image features within specific image regions. There may be some “hot spots” effective for perceptual translucency, although which of many hot spots are used in judging translucency may be observer dependent.
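The two luminance statistics contrasted above, global RMS contrast and local mean luminance within a region, can be computed as in this sketch (one common convention for RMS contrast, std/mean, is assumed; the study's normalization may differ).

```python
import numpy as np

def rms_contrast(img):
    """Global RMS contrast: std of luminance over mean luminance
    (one common convention; others normalize differently)."""
    img = np.asarray(img, dtype=float)
    return float(img.std() / img.mean())

def local_mean_luminance(img, mask):
    """Mean luminance restricted to a boolean region mask."""
    img = np.asarray(img, dtype=float)
    return float(img[mask].mean())
```

The abstract's conclusion corresponds to `local_mean_luminance` over observer-specific "hot spot" masks predicting translucency judgments, while `rms_contrast` over the whole image does not.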
|
35
|
Abstract
The first two areas of the primate visual cortex (V1, V2) provide a paradigmatic example of hierarchical computation in the brain. However, neither the functional properties of V2 nor the interactions between the two areas are well understood. One key aspect is that the statistics of the inputs received by V2 depend on the nonlinear response properties of V1. Here, we focused on divisive normalization, a canonical nonlinear computation that is observed in many neural areas and modalities. We simulated V1 responses with (and without) different forms of surround normalization derived from statistical models of natural scenes, including canonical normalization and a statistically optimal extension that accounted for image nonhomogeneities. The statistics of the V1 population responses differed markedly across models. We then addressed how V2 receptive fields pool the responses of V1 model units with different tuning. We assumed this is achieved by learning without supervision a linear representation that removes correlations, which could be accomplished with principal component analysis. This approach revealed V2-like feature selectivity when we used the optimal normalization and, to a lesser extent, the canonical one but not in the absence of both. We compared the resulting two-stage models on two perceptual tasks; while models encompassing V1 surround normalization performed better at object recognition, only statistically optimal normalization provided systematic advantages in a task more closely matched to midlevel vision, namely figure/ground judgment. Our results suggest that experiments probing midlevel areas might benefit from using stimuli designed to engage the computations that characterize V1 optimality.
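Canonical divisive normalization, the nonlinearity at the center of this model, has the textbook form r_i = d_i^n / (σ^n + Σ_j w_j d_j^n). A minimal sketch with uniform pooling weights (the surround-weighted and statistically optimal variants discussed in the abstract are more elaborate):

```python
import numpy as np

def divisive_normalization(drives, sigma=0.1, n=2.0, weights=None):
    """Rectified, exponentiated unit drives divided by a pooled population
    signal plus a semi-saturation constant sigma."""
    d = np.abs(np.asarray(drives, dtype=float)) ** n
    if weights is None:
        weights = np.ones_like(d)        # uniform pool (an assumption here)
    pool = sigma ** n + float((weights * d).sum())
    return d / pool
```

Because every unit is divided by the same pooled signal, the normalized responses encode each unit's drive relative to the population, which is what makes the operation useful as a canonical cortical computation.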
|
36
|
Abstract
Natural textures have characteristic image statistics that make them discriminable from unnatural textures. For example, both contrast negation and texture synthesis alter the appearance of natural textures even though each manipulation preserves some features while disrupting others. Here, we examined the extent to which contrast negation and texture synthesis each introduce or remove critical perceptual features for discriminating unnatural textures from natural textures. We find that both manipulations remove information that observers use for distinguishing natural textures from transformed versions of the same patterns, but do so in different ways. Texture synthesis removes information that is relevant for discrimination in both abstract patterns and ecologically valid textures, and we also observe a category-dependent asymmetry for identifying an “oddball” real texture among synthetic distractors. Contrast negation exhibits no such asymmetry, and also does not impact discrimination performance in abstract patterns. We discuss our results in the context of the visual system’s tuning to ecologically relevant patterns and other results describing sensitivity to higher-order statistics in texture patterns.
|
37
|
Low-level contrast statistics are diagnostic of invariance of natural textures. Front Comput Neurosci 2012; 6:34. [PMID: 22701419 PMCID: PMC3370418 DOI: 10.3389/fncom.2012.00034] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2011] [Accepted: 05/23/2012] [Indexed: 11/13/2022] Open
Abstract
Texture may provide important clues for real world object and scene perception. To be reliable, these clues should ideally be invariant to common viewing variations such as changes in illumination and orientation. In a large image database of natural materials, we found textures with low-level contrast statistics that varied substantially under viewing variations, as well as textures that remained relatively constant. This led us to ask whether textures with constant contrast statistics give rise to more invariant representations compared to other textures. To test this, we selected natural texture images with either high (HV) or low (LV) variance in contrast statistics and presented these to human observers. In two distinct behavioral categorization paradigms, participants more often judged HV textures as "different" compared to LV textures, showing that textures with constant contrast statistics are perceived as being more invariant. In a separate electroencephalogram (EEG) experiment, evoked responses to single texture images (single-image ERPs) were collected. The results show that differences in contrast statistics correlated with both early and late differences in occipital ERP amplitude between individual images. Importantly, ERP differences between images of HV textures were mainly driven by illumination angle, which was not the case for LV images: there, differences were completely driven by texture membership. These converging neural and behavioral results imply that some natural textures are surprisingly invariant to illumination changes and that low-level contrast statistics are diagnostic of the extent of this invariance.
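The HV/LV distinction rests on measuring how much a texture's low-level contrast statistics move under viewing variations. A rough sketch of that measurement, with assumptions flagged: gradient-magnitude mean and standard deviation stand in for the contrast-distribution parameters used in the study, and viewing variation is reduced to a multiplicative illumination change on a random stand-in image.

```python
import numpy as np

rng = np.random.default_rng(2)
base = rng.random((128, 128))  # stand-in for a natural texture image

def contrast_stats(img):
    """Simple low-level contrast statistics: mean and std of local
    gradient magnitude (a proxy for the contrast distribution)."""
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    return np.array([mag.mean(), mag.std()])

# Simulate viewing variations as global illumination changes.
variants = [base * g for g in (0.6, 0.8, 1.0, 1.2, 1.4)]
stats = np.array([contrast_stats(v) for v in variants])

# Variance of each statistic across variants: low values would mark the
# texture as "LV" (stable statistics), high values as "HV".
variability = stats.var(axis=0)
```

Applied across a database of materials, ranking textures by this variability is the kind of selection criterion that separates candidate HV from LV stimuli.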
|
38
|
Abstract
Theories of efficient sensory processing have considered the regularities of image properties arising from the structure of the environment to explain properties of neuronal representations of the visual world. By contrast, the regularities imposed on the input to the visual system by the active selection process mediated by voluntary eye movements have received much less attention. This is surprising, given that the active nature of vision is well established. The present article investigates statistics of image features at the center of gaze of human subjects navigating through a virtual environment and avoiding and approaching different objects. The analysis shows that contrast can be significantly higher or lower at fixation location compared to random locations, depending on whether subjects avoid or approach targets. Similarly, significant differences in the distribution of responses of model simple and complex cells between horizontal and vertical orientations are found over timescales of tens of seconds. By clustering the model simple cell responses, it is established that gaze was directed toward three distinct features of intermediate complexity the vast majority of the time. Thus, this study demonstrates and quantifies how the visuomotor tasks of approaching and avoiding objects during navigation determine feature statistics of the input to the visual system through the combined influence on body and eye movements.
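The "model simple and complex cells" invoked here are standard V1 models: a simple cell as a linear Gabor filter, a complex cell as the energy of a quadrature pair. A minimal sketch of both (parameter values and the random "fixated patch" are illustrative assumptions, not those of the study):

```python
import numpy as np

def gabor(size=21, sf=0.2, theta=0.0, phase=0.0, sigma=4.0):
    """2D Gabor patch: the standard model of a V1 simple-cell receptive
    field (spatial frequency sf in cycles/pixel, orientation theta)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * sf * xr + phase)

def simple_cell(patch, theta):
    # Linear filtering: dot product of the patch with the Gabor RF.
    return np.sum(patch * gabor(theta=theta))

def complex_cell(patch, theta):
    """Energy model: summed squares of a quadrature (90-degree phase
    shifted) pair of simple cells, giving phase-invariant responses."""
    a = np.sum(patch * gabor(theta=theta, phase=0.0))
    b = np.sum(patch * gabor(theta=theta, phase=np.pi / 2))
    return a ** 2 + b ** 2

rng = np.random.default_rng(3)
patch = rng.normal(size=(21, 21))  # stand-in for a fixated image patch
resp_theta0 = complex_cell(patch, theta=0.0)
resp_theta90 = complex_cell(patch, theta=np.pi / 2)
```

Comparing distributions of `resp_theta0` and `resp_theta90` over many fixated patches is the kind of horizontal-versus-vertical comparison the analysis reports.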
|
39
|
Does face image statistics predict a preferred spatial frequency for human face processing? Proc Biol Sci 2008; 275:2095-100. [PMID: 18544506 PMCID: PMC2603213 DOI: 10.1098/rspb.2008.0486] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2008] [Revised: 05/14/2008] [Accepted: 05/20/2008] [Indexed: 11/12/2022] Open
Abstract
Psychophysical experiments have suggested that a narrow band of spatial frequencies is particularly important for recognizing face identity in humans. There is, however, no conclusive evidence as to why these frequencies are preferred. To address this question, I examined the amplitude spectra of a large number of face images and observed that face spectra generally fall off more steeply with spatial frequency than those of ordinary natural images. When external face features (such as hair) are suppressed, whitening of the corresponding mean amplitude spectra revealed higher response amplitudes at the spatial frequencies deemed important for processing face identity. The results presented here therefore support the idea that face processing characteristics match the corresponding stimulus properties.
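The two operations in this abstract, measuring the spectral fall-off exponent and whitening, can be sketched as follows. Everything here is an assumption-laden surrogate: a synthetic image with a 1/f amplitude spectrum stands in for the face images, and the fall-off is fit as a single log-log slope, a simplification of the actual spectral analysis.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 128
fy = np.fft.fftfreq(n)[:, None]
fx = np.fft.fftfreq(n)[None, :]
f = np.hypot(fx, fy)
f[0, 0] = 1.0  # avoid division by zero at DC
# Surrogate "natural image": white noise shaped to a 1/f amplitude spectrum.
spectrum = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
img = np.real(np.fft.ifft2(spectrum / f))

# Radially averaged amplitude spectrum.
amp = np.abs(np.fft.fft2(img))
r = np.hypot(*np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n)))
bins = np.linspace(1.0 / n, 0.5, 20)
idx = np.digitize(r.ravel(), bins)
radial = np.array([amp.ravel()[idx == i].mean() for i in range(1, len(bins))])

# Fit the fall-off exponent alpha in amp ~ f^(-alpha) via a log-log line;
# a steeper (larger) alpha is what the study reports for face spectra.
centers = (bins[:-1] + bins[1:]) / 2
alpha = -np.polyfit(np.log(centers), np.log(radial), 1)[0]

# Whitening: boost each frequency by f^alpha to flatten the mean spectrum.
whitened = np.real(np.fft.ifft2(np.fft.fft2(img) * (f ** alpha)))
```

After whitening, whatever amplitude remains elevated at particular frequencies reflects structure beyond the overall fall-off, which is the logic behind locating the face-relevant band.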
|
40
|
Abstract
The emotional content of visual images can be parameterized along two dimensions: valence (pleasantness) and arousal (intensity of emotion). In this study we ask how these distinct emotional dimensions affect the short-term memory of human observers viewing a rapid stream of images and trying to remember their content. We show that valence and arousal modulate short-term memory as independent factors. Arousal dramatically influences the average speed of data accumulation in memory: higher arousal results in faster accumulation. Valence has a more interesting effect: while a picture is being viewed, information from positive and neutral scenes accumulates in memory at a constant rate, whereas information from negative scenes is encoded slowly at first, then increasingly faster. We provide evidence showing that neither differences in low-level image properties nor differences in the ability to apprehend the meaning of images at short exposures can account for the observed results, and propose that the effects are specific to the short-term memory mechanism. We interpret this pattern of results to mean that information accumulation in short-term memory is a controlled process, whose gain is modulated by valence and arousal acting as endogenous attentional cues.
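The qualitative accumulation patterns described here, a constant rate for positive/neutral scenes, a slow-then-accelerating rate for negative scenes, and an overall gain scaled by arousal, can be captured in a toy model. This is an illustrative sketch only; the functional form and every parameter value are assumptions, not the authors' fitted model.

```python
import numpy as np

def accumulated_info(t, gain, accel=0.0):
    """Toy accumulation model: information grows with instantaneous rate
    gain * (1 + accel * t). accel=0 gives constant-rate accumulation;
    accel > 0 gives the slow-start, speeding-up pattern. Integrating the
    rate yields gain * (t + accel * t^2 / 2)."""
    return gain * (t + 0.5 * accel * t ** 2)

t = np.linspace(0, 1, 100)  # viewing time, arbitrary units

neutral = accumulated_info(t, gain=1.0)               # constant rate
negative = accumulated_info(t, gain=0.4, accel=4.0)   # slow, then faster
high_arousal = accumulated_info(t, gain=2.0)          # arousal scales gain
```

Under these illustrative parameters the negative-scene curve starts below the neutral one but overtakes it by the end of viewing, the crossover pattern implied by "slowly at first, then increasingly faster".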
|