101
Loxley PN. The Two-Dimensional Gabor Function Adapted to Natural Image Statistics: A Model of Simple-Cell Receptive Fields and Sparse Structure in Images. Neural Comput 2017; 29:2769-2799. PMID: 28777727; DOI: 10.1162/neco_a_00997.
Abstract
The two-dimensional Gabor function is adapted to natural image statistics, leading to a tractable probabilistic generative model that can be used to model simple cell receptive field profiles, or generate basis functions for sparse coding applications. Learning is found to be most pronounced in three Gabor function parameters representing the size and spatial frequency of the two-dimensional Gabor function and characterized by a nonuniform probability distribution with heavy tails. All three parameters are found to be strongly correlated, resulting in a basis of multiscale Gabor functions with similar aspect ratios and size-dependent spatial frequencies. A key finding is that the distribution of receptive-field sizes is scale invariant over a wide range of values, so there is no characteristic receptive field size selected by natural image statistics. The Gabor function aspect ratio is found to be approximately conserved by the learning rules and is therefore not well determined by natural image statistics. This allows for three distinct solutions: a basis of Gabor functions with sharp orientation resolution at the expense of spatial-frequency resolution, a basis of Gabor functions with sharp spatial-frequency resolution at the expense of orientation resolution, or a basis with unit aspect ratio. Arbitrary mixtures of all three cases are also possible. Two parameters controlling the shape of the marginal distributions in a probabilistic generative model fully account for all three solutions. The best-performing probabilistic generative model for sparse coding applications is found to be a gaussian copula with Pareto marginal probability density functions.
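The model above adapts the standard two-dimensional Gabor function: a Gaussian envelope (size and aspect ratio) multiplying a sinusoidal carrier (spatial frequency, orientation, phase). A minimal sketch of sampling one on a pixel grid; parameter names are illustrative and do not follow the paper's notation:

```python
import numpy as np

def gabor_2d(size, sigma, wavelength, theta, gamma=1.0, phase=0.0):
    """Sample a 2D Gabor function on a square grid.

    sigma: envelope size, wavelength: spatial period of the carrier,
    theta: orientation, gamma: aspect ratio of the Gaussian envelope.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier runs along x'
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength + phase)
    return envelope * carrier

# One oblique Gabor patch, e.g. as a candidate receptive-field profile
patch = gabor_2d(size=32, sigma=4.0, wavelength=8.0, theta=np.pi / 4)
```

In the paper's setting, sigma, wavelength, and gamma are the parameters learned from natural image statistics; here they are fixed by hand for illustration.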
Affiliation(s)
- P N Loxley
- School of Science and Technology, University of New England, Armidale 2351, NSW, Australia
102
Inference of neuronal functional circuitry with spike-triggered non-negative matrix factorization. Nat Commun 2017; 8:149. PMID: 28747662; PMCID: PMC5529558; DOI: 10.1038/s41467-017-00156-9.
Abstract
Neurons in sensory systems often pool inputs over arrays of presynaptic cells, giving rise to functional subunits inside a neuron’s receptive field. The organization of these subunits provides a signature of the neuron’s presynaptic functional connectivity and determines how the neuron integrates sensory stimuli. Here we introduce the method of spike-triggered non-negative matrix factorization for detecting the layout of subunits within a neuron’s receptive field. The method only requires the neuron’s spiking responses under finely structured sensory stimulation and is therefore applicable to large populations of simultaneously recorded neurons. Applied to recordings from ganglion cells in the salamander retina, the method retrieves the receptive fields of presynaptic bipolar cells, as verified by simultaneous bipolar and ganglion cell recordings. The identified subunit layouts allow improved predictions of ganglion cell responses to natural stimuli and reveal shared bipolar cell input into distinct types of ganglion cells.

Understanding how a neuron integrates sensory information requires knowledge of its functional presynaptic connections. Here the authors report a new method using non-negative matrix factorization to identify the layout of presynaptic bipolar cell inputs onto retinal ganglion cells and predict their responses to natural stimuli.
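The core computation can be illustrated in a few lines: collect the effective (spike-triggered) stimuli and factorize that ensemble with non-negative matrix factorization, so that one factor's rows recover subunit layouts. This is a toy sketch on synthetic data with a bare Lee-Seung NMF, not the authors' full pipeline; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(V, rank, n_iter=200, eps=1e-9):
    """Factorize non-negative V (samples x features) as W @ H
    using Lee-Seung multiplicative updates."""
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy spike-triggered ensemble: each effective stimulus is a
# non-negative mix of two localized subunit profiles.
subunits = np.zeros((2, 20))
subunits[0, 3:8] = 1.0    # subunit A covers pixels 3-7
subunits[1, 12:17] = 1.0  # subunit B covers pixels 12-16
weights = rng.random((500, 2))  # per-spike activation of each subunit
ensemble = weights @ subunits + 0.01 * rng.random((500, 20))

W, H = nmf(ensemble, rank=2)
# Rows of H approximate the two subunit layouts (up to order and scale).
```

Multiplicative updates keep both factors non-negative throughout, which is what lets the recovered rows be read as subunit layouts rather than signed filters.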
103
Kawabe T. What Property of the Contour of a Deforming Region Biases Percepts toward Liquid? Front Psychol 2017; 8:1014. PMID: 28663735; PMCID: PMC5471326; DOI: 10.3389/fpsyg.2017.01014.
Abstract
Human observers can perceive the existence of a transparent surface from dynamic image deformation. They can also easily discriminate a transparent solid material such as plastic and glass from a transparent fluid one such as water and shampoo just by viewing them. However, the image information required for material discrimination of this sort is still unclear. A liquid changes its contour shape non-rigidly. We therefore examined whether additional properties of the contour of a deformation-defined region, which indicated contour non-rigidity, biased percepts of the region toward liquid materials. Our stimuli had a translating circular region wherein a natural texture image was deformed at the spatiotemporal deformation frequency that was optimal for the perception of a transparent layer. In Experiment 1, we dynamically deformed the contour of the circular region and found that large deformation of the contour biased the percept toward liquid. In Experiment 2, we manipulated the blurriness of the contour and observed that a strongly blurred contour biased percepts toward liquid. Taken together, the results suggest that a deforming region lacking a discrete contour biases percepts toward liquid.
Affiliation(s)
- Takahiro Kawabe
- NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Atsugi, Japan
104
Jennings BJ, Kingdom FAA. Chromatic blur perception in the presence of luminance contrast. Vision Res 2017; 135:34-42. PMID: 28450052; DOI: 10.1016/j.visres.2017.04.006.
Abstract
Hel-Or showed that blurring the chromatic but not the luminance layer of an image of a natural scene failed to elicit any impression of blur. Subsequent studies have suggested that this effect is due either to chromatic blur being masked by spatially contiguous luminance edges in the scene (Journal of Vision 13 (2013) 14), or to a relatively compressed transducer function for chromatic blur (Journal of Vision 15 (2015) 6). To test between the two explanations we conducted experiments using as stimuli both images of natural scenes and simple edges. First, we found that in color-and-luminance images of natural scenes more chromatic blur was needed to perceptually match a given level of blur in an isoluminant, i.e., colour-only, scene. However, when the luminance layer in the scene was rotated relative to the chromatic layer, thus removing the colour-luminance edge correlations, the matched blur levels were near equal. Both results are consistent with Sharman et al.'s explanation. Second, when observers matched the blurs of luminance-only with isoluminant scenes, the matched blurs were equal, against Kingdom et al.'s prediction. Third, we measured the perceived blur in a square-wave as a function of (i) contrast, (ii) the number of luminance edges, and (iii) the relative spatial phase between the colour and luminance edges. We found that the perceived chromatic blur was dependent on both relative phase and the number of luminance edges, or dependent on the luminance contrast if only a single edge is present. We conclude that the Hel-Or effect is largely due to masking of chromatic blur by spatially contiguous luminance edges.
Affiliation(s)
- Ben J Jennings
- McGill Vision Research, Department of Ophthalmology, Montreal General Hospital, McGill University, Montreal, Quebec, Canada.
- Frederick A A Kingdom
- McGill Vision Research, Department of Ophthalmology, Montreal General Hospital, McGill University, Montreal, Quebec, Canada
105
Ito J, Yamane Y, Suzuki M, Maldonado P, Fujita I, Tamura H, Grün S. Switch from ambient to focal processing mode explains the dynamics of free viewing eye movements. Sci Rep 2017; 7:1082. PMID: 28439075; PMCID: PMC5430715; DOI: 10.1038/s41598-017-01076-w.
Abstract
Previous studies have reported that humans employ ambient and focal modes of visual exploration while they freely view natural scenes. These two modes have been characterized based on eye movement parameters such as saccade amplitude and fixation duration, but not by any visual features of the viewed scenes. Here we propose a new characterization of eye movements during free viewing based on how eyes are moved from and to objects in a visual scene. We applied this characterization to data obtained from freely-viewing macaque monkeys. We show that the analysis based on this characterization gives a direct indication of a behavioral shift from ambient to focal processing mode along the course of free viewing exploration. We further propose a stochastic model of saccade sequence generation incorporating a switch between the two processing modes, which quantitatively reproduces the behavioral features observed in the data.
Affiliation(s)
- Junji Ito
- Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA BRAIN Institute I, Jülich Research Centre, Jülich, Germany.
- Yukako Yamane
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Osaka, Japan
- Mika Suzuki
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Pedro Maldonado
- BNI, CENEM and Programa de Fisiología y Biofísica, ICBM, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Ichiro Fujita
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Osaka, Japan
- Hiroshi Tamura
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Osaka, Japan
- Sonja Grün
- Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA BRAIN Institute I, Jülich Research Centre, Jülich, Germany
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Theoretical Systems Neurobiology, RWTH Aachen University, Aachen, Germany
106
End A, Gamer M. Preferential Processing of Social Features and Their Interplay with Physical Saliency in Complex Naturalistic Scenes. Front Psychol 2017; 8:418. PMID: 28424635; PMCID: PMC5371661; DOI: 10.3389/fpsyg.2017.00418.
Abstract
According to so-called saliency-based attention models, attention during free viewing of visual scenes is particularly allocated to physically salient image regions. In the present study, we assumed that social features in complex naturalistic scenes would be processed preferentially irrespective of their physical saliency. Therefore, we expected worse prediction of gazing behavior by saliency-based attention models when social information is present in the visual field. To test this hypothesis, participants freely viewed color photographs of complex naturalistic social (e.g., including heads, bodies) and non-social (e.g., including landscapes, objects) scenes while their eye movements were recorded. In agreement with our hypothesis, we found that social features (especially heads) were heavily prioritized during visual exploration. Correspondingly, the presence of social information weakened the influence of low-level saliency on gazing behavior. Importantly, this pattern was most pronounced for the earliest fixations indicating automatic attentional processes. These findings were further corroborated by a linear mixed model approach showing that social features (especially heads) add substantially to the prediction of fixations beyond physical saliency. Taken together, the current study indicates gazing behavior for naturalistic scenes to be better predicted by the interplay of social and physically salient features than by low-level saliency alone. These findings strongly challenge the generalizability of saliency-based attention models and demonstrate the importance of considering social influences when investigating the driving factors of human visual attention.
Affiliation(s)
- Albert End
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Matthias Gamer
- Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Department of Psychology, Julius Maximilians University of Würzburg, Würzburg, Germany
107
Image deformation as a cue to material category judgment. Sci Rep 2017; 7:44274. PMID: 28276494; PMCID: PMC5343573; DOI: 10.1038/srep44274.
Abstract
Human observers easily recognize complex natural phenomena, such as flowing water, which often generate highly chaotic dynamic arrays of light on the retina. It has not been clarified how the visual system discerns the source of a fluid flow. Here we show that the magnitude of image deformation caused by light refraction is a critical factor for the visual system to determine the perceptual category of fluid flows. Employing a physics engine, we created computer-rendered scenes of water and hot air flows. For each flow, we manipulated the rendering parameters (distortion factors and the index of refraction) that strongly influence the magnitude of image deformation. The observers rated how strongly they felt impressions of water and hot air in the video clips of the flows. The ratings showed that the water and hot air impressions were positively and negatively related to the magnitude of image deformation. Based on the results, we discuss how the visual system heuristically utilizes image deformation to discern non-rigid materials such as water and hot air flows.
108
Khuu SK, Cham J, Hayes A. The Effect of Local Orientation Change on the Detection of Contours Defined by Constant Curvature: Psychophysics and Image Statistics. Front Psychol 2017; 7:2069. PMID: 28144224; PMCID: PMC5239794; DOI: 10.3389/fpsyg.2016.02069.
Abstract
In the present study, we investigated the detection of contours defined by constant curvature and the statistics of curved contours in natural scenes. In Experiment 1, we examined the degree to which human sensitivity to contours is affected by changing the curvature angle and disrupting contour curvature continuity by varying the orientation of end elements. We find that (1) changing the angle of contour curvature decreased detection performance, while (2) end elements oriented in the direction (i.e., clockwise) of curvature facilitated contour detection regardless of the curvature angle of the contour. In Experiment 2 we further established that the relative effect of end-element orientation on contour detection was dependent not only on their orientation (collinear or cocircular), but also on their spatial separation from the contour, and on whether the contour shape was curved or not (i.e., C-shaped or S-shaped). Increasing the spatial separation of end elements reduced contour detection performance regardless of their orientation or the contour shape. However, at small separations, cocircular end elements facilitated the detection of C-shaped contours, but not S-shaped contours. The opposite result was observed for collinear end elements, which improved the detection of S-shaped, but not C-shaped, contours. These dissociative results confirmed that the visual system specifically codes contour curvature, but that the association of contour elements occurs locally. Finally, we undertook an analysis of natural images that mapped contours with a constant angular change and determined the frequency of occurrence of end elements with different orientations. Analogous to our behavioral data, this image analysis revealed that the mapped end elements of constantly curved contours are likely to be oriented clockwise to the angle of curvature. Our findings indicate that the visual system is selectively sensitive to contours defined by constant curvature and that this might reflect the properties of curved contours in natural images.
Affiliation(s)
- Sieu K. Khuu
- School of Optometry and Vision Science, University of New South Wales, Sydney, NSW, Australia
- Correspondence: Sieu K. Khuu
- Joey Cham
- Department of Psychology, The University of Hong Kong, Hong Kong
- Anthony Hayes
- Department of Psychology, The University of Hong Kong, Hong Kong
109
Onken A, Liu JK, Karunasekara PPCR, Delis I, Gollisch T, Panzeri S. Using Matrix and Tensor Factorizations for the Single-Trial Analysis of Population Spike Trains. PLoS Comput Biol 2016; 12:e1005189. PMID: 27814363; PMCID: PMC5096699; DOI: 10.1371/journal.pcbi.1005189.
Abstract
Advances in neuronal recording techniques are leading to ever larger numbers of simultaneously monitored neurons. This poses the important analytical challenge of how to capture compactly all sensory information that neural population codes carry in their spatial dimension (differences in stimulus tuning across neurons at different locations), in their temporal dimension (temporal neural response variations), or in their combination (temporally coordinated neural population firing). Here we investigate the utility of tensor factorizations of population spike trains along space and time. These factorizations decompose a dataset of single-trial population spike trains into spatial firing patterns (combinations of neurons firing together), temporal firing patterns (temporal activation of these groups of neurons), and trial-dependent activation coefficients (strength of recruitment of such neural patterns on each trial). We validated various factorization methods on simulated data and on populations of ganglion cells simultaneously recorded in the salamander retina. We found that single-trial tensor space-by-time decompositions provided low-dimensional, data-robust representations of spike trains that efficiently capture both their spatial and temporal information about sensory stimuli. Tensor decompositions with orthogonality constraints were the most efficient in extracting sensory information, whereas non-negative tensor decompositions worked well even on non-independent and overlapping spike patterns, and retrieved informative firing patterns expressed by the same population in response to novel stimuli. Our method showed that populations of retinal ganglion cells carried information in their spike timing on the ten-millisecond scale about spatial details of natural images. This information could not be recovered from the spike counts of these cells. First-spike latencies carried the majority of information provided by the whole spike train about fine-scale image features, and supplied almost as much information about coarse natural image features as firing rates. Together, these results highlight the importance of spike timing, and particularly of first-spike latencies, in retinal coding.
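The space-by-time idea can be sketched in a few lines: given a trials × time × neurons tensor, obtain orthogonal temporal and spatial modules from the mode unfoldings (an HOSVD-style construction, in the spirit of the orthogonality-constrained variant) and trial-specific coefficient matrices by projection. A toy sketch on synthetic noiseless data; variable names are illustrative and this is not the authors' full algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data tensor: trials x time x neurons, with exact low-rank structure
n_trials, n_time, n_neurons, rank = 30, 40, 12, 3
Bt_true = rng.random((n_time, rank))         # temporal firing patterns
Bs_true = rng.random((n_neurons, rank))      # spatial firing patterns
A_true = rng.random((n_trials, rank, rank))  # per-trial coefficients
X = np.einsum('tp,kpq,nq->ktn', Bt_true, A_true, Bs_true)

# Space-by-time decomposition with orthogonal factors:
# temporal modules from the time-mode unfolding, spatial modules from
# the neuron-mode unfolding, then per-trial coefficient matrices by
# projection, so that  X_k ~= Bt @ A_k @ Bs.T
Ut, _, _ = np.linalg.svd(X.transpose(1, 0, 2).reshape(n_time, -1),
                         full_matrices=False)
Un, _, _ = np.linalg.svd(X.transpose(2, 0, 1).reshape(n_neurons, -1),
                         full_matrices=False)
Bt, Bs = Ut[:, :rank], Un[:, :rank]
A = np.einsum('tp,ktn,nq->kpq', Bt, X, Bs)       # trial coefficients
X_hat = np.einsum('tp,kpq,nq->ktn', Bt, A, Bs)   # reconstruction
```

Because the synthetic tensor has exact multilinear rank (3, 3) in the time and neuron modes, the projection reconstructs it essentially perfectly; on real spike trains the truncated modules give the low-dimensional single-trial representation described in the abstract.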
Affiliation(s)
- Arno Onken
- Neural Computation Laboratory, Center for Neuroscience and Cognitive Systems @UniTn, Istituto Italiano di Tecnologia, Rovereto, Italy
- Jian K. Liu
- Department of Ophthalmology, University Medical Center Goettingen, Goettingen, Germany
- Bernstein Center for Computational Neuroscience Goettingen, Goettingen, Germany
- P. P. Chamanthi R. Karunasekara
- Neural Computation Laboratory, Center for Neuroscience and Cognitive Systems @UniTn, Istituto Italiano di Tecnologia, Rovereto, Italy
- Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy
- Ioannis Delis
- Department of Biomedical Engineering, Columbia University, New York, New York, United States of America
- Tim Gollisch
- Department of Ophthalmology, University Medical Center Goettingen, Goettingen, Germany
- Bernstein Center for Computational Neuroscience Goettingen, Goettingen, Germany
- Stefano Panzeri
- Neural Computation Laboratory, Center for Neuroscience and Cognitive Systems @UniTn, Istituto Italiano di Tecnologia, Rovereto, Italy
110
SCEGRAM: An image database for semantic and syntactic inconsistencies in scenes. Behav Res Methods 2016; 49:1780-1791. DOI: 10.3758/s13428-016-0820-3.
111
Ernst UA, Schiffer A, Persike M, Meinhardt G. Contextual Interactions in Grating Plaid Configurations Are Explained by Natural Image Statistics and Neural Modeling. Front Syst Neurosci 2016; 10:78. PMID: 27757076; PMCID: PMC5048088; DOI: 10.3389/fnsys.2016.00078.
Abstract
Processing natural scenes requires the visual system to integrate local features into global object descriptions. To achieve coherent representations, the human brain uses statistical dependencies to guide weighting of local feature conjunctions. Pairwise interactions among feature detectors in early visual areas may form the early substrate of these local feature bindings. To investigate local interaction structures in visual cortex, we combined psychophysical experiments with computational modeling and natural scene analysis. We first measured contrast thresholds for 2 × 2 grating patch arrangements (plaids), which differed in spatial frequency composition (low, high, or mixed), number of grating patch co-alignments (0, 1, or 2), and inter-patch distances (1° and 2° of visual angle). Contrast thresholds for the different configurations were compared to the prediction of probability summation (PS) among detector families tuned to the four retinal positions. For 1° distance the thresholds for all configurations were larger than predicted by PS, indicating inhibitory interactions. For 2° distance, thresholds were significantly lower compared to PS when the plaids were homogeneous in spatial frequency and orientation, but not when spatial frequencies were mixed or there was at least one misalignment. Next, we constructed a neural population model with horizontal laminar structure, which reproduced the detection thresholds after adaptation of connection weights. Consistent with prior work, contextual interactions were medium-range inhibition and long-range, orientation-specific excitation. However, the inclusion of orientation-specific, inhibitory interactions between populations with different spatial frequency preferences was crucial for explaining detection thresholds. Finally, for all plaid configurations we computed their likelihood of occurrence in natural images. The likelihoods turned out to be inversely related to the detection thresholds obtained at larger inter-patch distances. However, the likelihoods were almost independent of inter-patch distance, implying that natural image statistics could not explain the crowding-like results at short distances. This failure of natural image statistics to resolve the patch-distance modulation of plaid visibility remains a challenge to the approach.
Affiliation(s)
- Udo A Ernst
- Computational Neuroscience Lab, Department of Physics, Institute for Theoretical Physics, University of Bremen, Bremen, Germany
- Alina Schiffer
- Computational Neuroscience Lab, Department of Physics, Institute for Theoretical Physics, University of Bremen, Bremen, Germany
- Malte Persike
- Methods Section, Department of Psychology, Johannes Gutenberg University Mainz, Mainz, Germany
- Günter Meinhardt
- Methods Section, Department of Psychology, Johannes Gutenberg University Mainz, Mainz, Germany
112
Contributions of the hippocampus to feedback learning. Cogn Affect Behav Neurosci 2016; 15:861-877. PMID: 26055632; DOI: 10.3758/s13415-015-0364-5.
Abstract
Humans learn about the world in a variety of manners, including by observation, by associating cues in the environment, and via feedback. Across species, two brain structures have been predominantly involved in these learning processes: the hippocampus (supporting learning via observation and paired association) and the striatum (critical for feedback learning). This simple dichotomy, however, has recently been challenged by reports of hippocampal engagement in feedback learning, although the role of the hippocampus is not fully understood. The purpose of this experiment was to characterize the hippocampal response during feedback learning by manipulating varying levels of memory interference. Consistent with prior reports, feedback learning recruited the striatum and midbrain. Notably, feedback learning also engaged the hippocampus. The level of activity in these regions was modulated by the degree of memory interference, such that the greatest activation occurred during the highest level of memory interference. Importantly, the accuracy of information learned via feedback correlated with hippocampal activation and was reduced by the presence of high memory interference. Taken together, these findings provide evidence of hippocampal involvement in feedback learning by demonstrating both its relevance for the accuracy of information learned via feedback and its susceptibility to interference.
113
Chromatic Information and Feature Detection in Fast Visual Analysis. PLoS One 2016; 11:e0159898. PMID: 27478891; PMCID: PMC4968813; DOI: 10.1371/journal.pone.0159898.
Abstract
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in the visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artists' sketches are usually monochromatic, and black-and-white movies provide compelling representations of real-world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions in psychophysics measurements of fast-viewing discrimination of natural scenes. We conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
114
Abstract
We have revealed a new role for colour vision in visual scene analysis: colour vision facilitates shadow identification. Shadows are important features of the visual scene, providing information about the shape, depth, and movement of objects. To be useful for perception, however, shadows must be distinguished from other types of luminance variation, principally the variation in object reflectance. A potential cue for distinguishing shadows from reflectance variations is colour, since chromatic changes typically occur at object but not shadow boundaries. We tested whether colour cues were exploited by the visual system for shadow identification, by comparing the ability of human test subjects to identify simulated shadows on chromatically variegated versus achromatically variegated backgrounds with identical luminance compositions. Performance was superior with the chromatically variegated backgrounds. Furthermore, introducing random colour contrast across the shadow boundaries degraded their identification. These findings demonstrate that the visual system exploits inbuilt assumptions about the relationships between colour and luminance in the natural visual world.
Affiliation(s)
- Frederick A A Kingdom
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, 687 Pine Avenue West, Room H4-14, Montréal, Québec H3A 1A1, Canada.
115
Zhang J, Wang M, Zhang S, Li X, Wu X. Spatiochromatic Context Modeling for Color Saliency Analysis. IEEE Trans Neural Netw Learn Syst 2016; 27:1177-1189. PMID: 26316225; DOI: 10.1109/tnnls.2015.2464316.
Abstract
Visual saliency is one of the most noteworthy perceptual abilities of human vision. Recent progress in cognitive psychology suggests that: 1) visual saliency analysis is mainly completed by the bottom-up mechanism consisting of feedforward low-level processing in primary visual cortex (area V1) and 2) color interacts with spatial cues and is influenced by the neighborhood context, and thus plays an important role in visual saliency analysis. From a computational perspective, most existing saliency modeling approaches exploit multiple independent visual cues, either ignoring their interactions or not computing them explicitly, and neglect contextual influences induced by neighboring colors. In addition, the role of color is often underestimated in visual saliency analysis. In this paper, we propose a simple yet effective color saliency model that considers color as the only visual cue and mimics the color processing in V1. Our approach uses region- and boundary-defined color features with spatiochromatic filtering that accounts for local color-orientation interactions, and therefore captures homogeneous color elements, subtle textures within objects, and the overall salient object in a color image. To account for color contextual influences, we present a divisive normalization method for chromatic stimuli based on the pooling of contrary/complementary color units. We further define a color perceptual metric over the entire scene to produce separate saliency maps for color regions and color boundaries. These maps are finally integrated into a single saliency map, which is Gaussian-blurred for robustness. We evaluate the proposed method on both synthetic stimuli and several benchmark saliency data sets, ranging from visual saliency analysis to salient object detection.
The experimental results demonstrate that using color as the sole visual cue achieves results on par with or better than 12 state-of-the-art approaches.
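The divisive normalization described above can be sketched in a generic form: each channel's response is divided by the pooled activity of all channels plus a semisaturation constant. The channel values and the `sigma` constant below are illustrative assumptions, not the paper's fitted model of contrary/complementary color pooling.

```python
import numpy as np

def divisive_normalization(responses, sigma=0.1):
    """Divide each channel's response by the pooled activity of all
    channels plus a semisaturation constant (generic divisive form)."""
    responses = np.asarray(responses, dtype=float)
    pool = np.abs(responses).sum(axis=0)  # pool over opponent channels
    return responses / (sigma + pool)

# Two opponent channels (e.g., red/green and blue/yellow) at 3 locations
resp = np.array([[0.8, 0.2, 0.0],
                 [0.4, 0.2, 0.0]])
norm = divisive_normalization(resp)
```

The effect is contextual gain control: a response is suppressed most where the surrounding channel activity is strongest.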
|
116
|
Zachevsky I, Zeevi YYJ. Statistics of Natural Stochastic Textures and Their Application in Image Denoising. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2016; 25:2130-2145. [PMID: 27045423 DOI: 10.1109/tip.2016.2539689] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Natural stochastic textures (NSTs), characterized by their fine details, are prone to corruption by artifacts introduced during image acquisition through the combined effect of blur and noise. While many successful algorithms exist for image restoration and enhancement, the restoration of natural textures and textured images based on suitable statistical models still leaves room for improvement. We examine the statistical properties of NSTs using three image databases. We show that the Gaussian distribution is suitable for many NSTs, while other natural textures can be properly represented by a model that separates the image into two layers: one contains the structural elements of smooth areas and edges, while the other contains the statistically Gaussian textural details. Based on these statistical properties, we propose an algorithm for denoising natural images containing NSTs, using a patch-based fractional Brownian motion model and regularization by means of anisotropic diffusion. We illustrate that this algorithm successfully recovers both missing textural details and the structural attributes that characterize natural images. The algorithm is compared with both classical and state-of-the-art denoising algorithms.
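As a sketch of the fractional Brownian motion (fBm) statistics invoked above: an fBm-like Gaussian texture can be synthesized by imposing a power-law amplitude spectrum on white noise, with exponent H + 1 in 2-D for Hurst parameter H (the standard spectral relation). The size, Hurst value, and seed below are arbitrary choices for illustration, not the paper's patch-based estimator.

```python
import numpy as np

def fbm_texture(n=64, hurst=0.5, seed=0):
    """Synthesize an fBm-like Gaussian texture by shaping white noise
    with a power-law amplitude spectrum, |f|^-(H+1) in 2-D."""
    rng = np.random.default_rng(seed)
    white = np.fft.fft2(rng.standard_normal((n, n)))
    fx = np.fft.fftfreq(n)[:, None]
    fy = np.fft.fftfreq(n)[None, :]
    freq = np.hypot(fx, fy)
    freq[0, 0] = 1.0                 # placeholder to avoid divide-by-zero
    amp = freq ** (-(hurst + 1.0))
    amp[0, 0] = 0.0                  # kill DC so the texture is zero-mean
    return np.real(np.fft.ifft2(white * amp))

tex = fbm_texture(n=64, hurst=0.7)
```

Larger H gives smoother, more correlated textures; H near 0 approaches white noise.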
|
117
|
Schiller F, Gegenfurtner KR. Perception of saturation in natural scenes. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2016; 33:A194-A206. [PMID: 26974924 DOI: 10.1364/josaa.33.00a194] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
We measured how well perception of color saturation in natural scenes can be predicted by different measures that are available in the literature. We presented 80 color images of natural scenes or their gray-scale counterparts to our observers, who were asked to choose the pixel from each image that appeared to be the most saturated. We compared our observers' choices to the predictions of seven popular saturation measures. For the color images, all of the measures predicted perception of saturation quite well, with CIECAM02 performing best. Differences between the measures were small but systematic. When gray-scale images were viewed, observers still chose pixels whose counterparts in the color images were saturated above average. This indicates that image structure and prior knowledge can be relevant to perception of saturation. Nevertheless, our results also show that saturation in natural scenes can be specified quite well without taking these factors into account.
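HSV saturation is one of the simple measures of the kind compared in such studies (CIECAM02 itself is far more involved); picking the pixel that maximizes it mirrors the observers' task. The toy pixel values below are made up for illustration.

```python
import numpy as np

def hsv_saturation(rgb):
    """Per-pixel HSV saturation, S = (max - min) / max (0 where max = 0)."""
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    return np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-12), 0.0)

# Pick the most saturated pixel of a tiny toy "image"
img = np.array([[[1.0, 0.0, 0.0], [0.5, 0.5, 0.5]],
                [[0.2, 0.6, 0.4], [0.0, 0.0, 0.0]]])
sat = hsv_saturation(img)
most_saturated = np.unravel_index(sat.argmax(), sat.shape)
```

Here the pure-red pixel wins, while the gray and black pixels score zero.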
|
118
|
Graham D, Schwarz B, Chatterjee A, Leder H. Preference for luminance histogram regularities in natural scenes. Vision Res 2016; 120:11-21. [DOI: 10.1016/j.visres.2015.03.018] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2014] [Revised: 03/09/2015] [Accepted: 03/24/2015] [Indexed: 10/23/2022]
|
119
|
On the second order spatiochromatic structure of natural images. Vision Res 2016; 120:22-38. [DOI: 10.1016/j.visres.2015.02.025] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2014] [Revised: 02/14/2015] [Accepted: 02/23/2015] [Indexed: 11/22/2022]
|
120
|
Yousaf S, Qin S. Closed-Loop Restoration Approach to Blurry Images Based on Machine Learning and Feedback Optimization. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2015; 24:5928-5941. [PMID: 26513786 DOI: 10.1109/tip.2015.2492825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Blind image deconvolution (BID) aims to remove or reduce degradations that occurred during image acquisition or processing. It is a challenging ill-posed problem, because a degraded image lacks enough information for unambiguous recovery of both the point spread function (PSF) and the clear image. Although many powerful algorithms have appeared recently, BID remains an active research area owing to the diversity of degraded images and degradations. Closed-loop control systems are characterized by their ability to stabilize the response and reject external disturbances through effective feedback optimization. In this paper, we employ feedback control to enhance the stability of BID by driving the estimated quality of the PSF to a desired level, without manually selected restoration parameters, using an effective combination of machine learning and feedback optimization. The foremost challenge when designing a feedback structure is to construct or choose a suitable performance metric to serve as the controlled index and feedback signal. Our proposed quality metric is based on blur assessment of deconvolved patches, identifying the best PSF and computing its relative quality. A Kalman-filter-based extremum-seeking approach is employed to find the optimum value of the controlled variable. To find better restoration parameters, learning algorithms such as multilayer perceptrons and bagged decision trees are used to estimate the generic PSF support size instead of trial-and-error methods. The problem is modeled as a combination of pattern classification and regression using multiple training features, including noise metrics, blur metrics, and low-level statistics. A multi-objective genetic algorithm is used to find key patches from multiple saliency maps, which enhances performance and saves computation by avoiding ineffectual regions of the image.
The proposed scheme is shown to outperform corresponding open-loop schemes, which often fail or require many assumptions about the images and thus yield sub-optimal results.
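The closed-loop idea of driving a quality metric toward a desired level can be sketched with a plain proportional feedback loop. Everything here, including the toy monotone quality function, is a hypothetical stand-in for the paper's Kalman-filter extremum-seeking controller and patch-based blur metric.

```python
def feedback_tune(quality_fn, param, target, gain=5.0, steps=50):
    """Proportional feedback: nudge the restoration parameter until the
    measured quality settles at the target level."""
    for _ in range(steps):
        error = target - quality_fn(param)
        param += gain * error
    return param

# Toy monotone quality metric; target quality 1.0 is reached at param = 10
quality = lambda p: 0.1 * p
tuned = feedback_tune(quality, param=0.0, target=1.0)
```

The loop converges because the update is a contraction for this gain; a real controller must also cope with noisy, non-monotone quality measurements, which is what motivates the Kalman-filter extremum-seeking machinery.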
|
121
|
Goris RLT, Simoncelli EP, Movshon JA. Origin and Function of Tuning Diversity in Macaque Visual Cortex. Neuron 2015; 88:819-31. [PMID: 26549331 DOI: 10.1016/j.neuron.2015.10.009] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2015] [Revised: 07/14/2015] [Accepted: 09/30/2015] [Indexed: 11/19/2022]
Abstract
Neurons in visual cortex vary in their orientation selectivity. We measured responses of V1 and V2 cells to orientation mixtures and fit them with a model whose stimulus selectivity arises from the combined effects of filtering, suppression, and response nonlinearity. The model explains the diversity of orientation selectivity with neuron-to-neuron variability in all three mechanisms, of which variability in the orientation bandwidth of linear filtering is the most important. The model also accounts for the cells' diversity of spatial frequency selectivity. Tuning diversity is matched to the needs of visual encoding. The orientation content found in natural scenes is diverse, and neurons with different selectivities are adapted to different stimulus configurations. Single orientations are better encoded by highly selective neurons, while orientation mixtures are better encoded by less selective neurons. A diverse population of neurons therefore provides better overall discrimination capabilities for natural images than any homogeneous population.
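A schematic version of such a filtering-suppression-nonlinearity cascade is easy to write down; the Gaussian tuning curve, the untuned subtractive suppression level, and the output exponent below are illustrative assumptions, not the paper's fitted mechanisms.

```python
import numpy as np

def ln_response(theta_deg, pref=0.0, bandwidth=20.0,
                suppression=0.3, exponent=2.0):
    """Linear Gaussian orientation filter, untuned subtractive
    suppression, then a power-law output nonlinearity."""
    theta = np.asarray(theta_deg, dtype=float)
    linear = np.exp(-0.5 * ((theta - pref) / bandwidth) ** 2)
    return np.maximum(linear - suppression, 0.0) ** exponent

angles = np.linspace(-90.0, 90.0, 181)   # one-degree steps
resp = ln_response(angles)
```

Narrowing `bandwidth` or raising `exponent` sharpens the tuning curve, which is how neuron-to-neuron variability in these stages produces diverse orientation selectivity.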
Affiliation(s)
- Robbe L T Goris
- Center for Neural Science, New York University, 4 Washington Place, Room 809, New York, NY 10003, USA; Howard Hughes Medical Institute, New York University, 4 Washington Place, Room 809, New York, NY 10003, USA
- Eero P Simoncelli
- Center for Neural Science, New York University, 4 Washington Place, Room 809, New York, NY 10003, USA; Howard Hughes Medical Institute, New York University, 4 Washington Place, Room 809, New York, NY 10003, USA
- J Anthony Movshon
- Center for Neural Science, New York University, 4 Washington Place, Room 809, New York, NY 10003, USA
|
122
|
Mannion DJ, Kersten DJ, Olman CA. Scene coherence can affect the local response to natural images in human V1. Eur J Neurosci 2015; 42:2895-903. [PMID: 26390850 DOI: 10.1111/ejn.13082] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2015] [Revised: 09/14/2015] [Accepted: 09/16/2015] [Indexed: 11/30/2022]
Abstract
Neurons in primary visual cortex (V1) can be indirectly affected by visual stimulation positioned outside their receptive fields. Although this contextual modulation has been intensely studied, we have little notion of how it manifests with naturalistic stimulation. Here, we investigated how the V1 response to a natural image fragment is affected by spatial context that is consistent or inconsistent with the scene from which it was extracted. Using functional magnetic resonance imaging at 7 T, we measured the blood oxygen level-dependent signal in human V1 (n = 8) while participants viewed an array of apertures. Most apertures showed fragments from a single scene, yielding a dominant perceptual interpretation which participants were asked to categorize, and the remaining apertures each showed fragments drawn from a set of 20 scenes. We find that the V1 response was significantly increased for apertures showing image structure that was coherent with the dominant scene relative to the response to the same image structure when it was non-coherent. Additional analyses suggest that this effect was mostly evident for apertures in the periphery of the visual field, that it peaked towards the centre of the aperture, and that it peaked in the middle to superficial regions of the cortical grey matter. These findings suggest that knowledge of typical spatial relationships is embedded in the circuitry of contextual modulation. Such mechanisms, possibly augmented by contributions from attentional factors, serve to increase the local V1 activity under conditions of contextual consistency.
Affiliation(s)
- Damien J Mannion
- School of Psychology, UNSW Australia, Sydney, NSW 2052, Australia; Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- Daniel J Kersten
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA; Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
- Cheryl A Olman
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
|
123
|
Laparra V, Malo J. Visual aftereffects and sensory nonlinearities from a single statistical framework. Front Hum Neurosci 2015; 9:557. [PMID: 26528165 PMCID: PMC4602147 DOI: 10.3389/fnhum.2015.00557] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2014] [Accepted: 09/22/2015] [Indexed: 11/13/2022] Open
Abstract
When adapted to a particular scene, our senses may fool us: colors are misinterpreted, certain spatial patterns seem to fade out, and static objects appear to move in reverse. A merely empirical description of the mechanisms tuned to color, texture, and motion may tell us where these visual illusions come from, but such empirical models of gain control do not explain why these mechanisms work in this apparently dysfunctional manner. Current normative explanations of aftereffects based on scene statistics derive gain changes by (1) invoking decorrelation and linear manifold matching/equalization, or (2) using nonlinear divisive normalization obtained from parametric scene models. These principled approaches have different drawbacks: the first is not compatible with the known saturation nonlinearities in the sensors, and its linear nature prevents it from fully accomplishing information maximization; in the second, the gain change is almost determined a priori by the parametric image model assumed in the divisive normalization. In this study we show that both the response changes that lead to aftereffects and the nonlinear behavior can be simultaneously derived from a single statistical framework: Sequential Principal Curves Analysis (SPCA). As opposed to mechanistic models, SPCA is not intended to describe how physiological sensors work; it focuses on explaining why they behave as they do. Nonparametric SPCA has two key advantages as a normative model of adaptation: (i) it improves on linear techniques, being a flexible equalization that can be tuned to criteria more sensible than plain decorrelation (either full information maximization or error minimization); and (ii) it makes no a priori functional assumption about the nonlinearity, so the saturations emerge directly from the scene data and the goal, not from an assumed functional form. It turns out that the optimal responses derived from these criteria via SPCA are consistent with apparently dysfunctional behaviors such as aftereffects.
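The linear decorrelation/equalization baseline that the abstract contrasts SPCA with can be written in a few lines as PCA whitening; SPCA replaces the straight principal axes below with data-driven principal curves. The toy mixing matrix is an assumption for illustration only.

```python
import numpy as np

def whiten(data):
    """PCA whitening: rotate onto principal axes and equalize variances,
    the linear decorrelation baseline for normative adaptation models."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    return centered @ vecs / np.sqrt(vals)

rng = np.random.default_rng(0)
raw = rng.standard_normal((500, 2)) @ np.array([[2.0, 1.0], [0.0, 1.0]])
w = whiten(raw)
cov_w = np.cov(w, rowvar=False)   # identity after whitening
```

After whitening the sample covariance is the identity, which is exactly the "plain decorrelation" criterion the text argues is too rigid for saturating sensors.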
Affiliation(s)
- Jesús Malo
- Image Processing Lab, Universitat de València, València, Spain
|
124
|
Dyakova O, Lee YJ, Longden KD, Kiselev VG, Nordström K. A higher order visual neuron tuned to the spatial amplitude spectra of natural scenes. Nat Commun 2015; 6:8522. [PMID: 26439748 PMCID: PMC4600736 DOI: 10.1038/ncomms9522] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2015] [Accepted: 09/02/2015] [Indexed: 12/26/2022] Open
Abstract
Animal sensory systems are optimally adapted to those features typically encountered in natural surrounds, thus allowing neurons with limited bandwidth to encode challengingly large input ranges. Natural scenes are not random, and peripheral visual systems in vertebrates and insects have evolved to respond efficiently to their typical spatial statistics. The mammalian visual cortex is also tuned to natural spatial statistics, but less is known about coding in higher order neurons in insects. To redress this we here record intracellularly from a higher order visual neuron in the hoverfly. We show that the cSIFE neuron, which is inhibited by stationary images, is maximally inhibited when the slope constant of the amplitude spectrum is close to the mean in natural scenes. The behavioural optomotor response is also strongest to images with naturalistic image statistics. Our results thus reveal a close coupling between the inherent statistics of natural scenes and higher order visual processing in insects.
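The slope constant of the spatial amplitude spectrum discussed above can be estimated by a log-log fit of Fourier amplitude against radial frequency; natural scenes cluster near a slope of about -1 (the 1/f regime). The synthetic test image below is an assumption for the sketch, not data from the paper.

```python
import numpy as np

def amplitude_slope(img):
    """Fit log amplitude vs. log radial frequency; returns the slope."""
    n = img.shape[0]
    amp = np.abs(np.fft.fftshift(np.fft.fft2(img - img.mean())))
    fx = np.arange(n) - n // 2
    r = np.hypot(*np.meshgrid(fx, fx)).ravel()
    a = amp.ravel()
    keep = (r >= 1) & (r <= n // 2)
    slope, _ = np.polyfit(np.log(r[keep]), np.log(a[keep] + 1e-12), 1)
    return slope

# A synthetic random-phase 1/f image should yield a slope near -1
n = 128
fx = np.fft.fftfreq(n)
f = np.hypot(*np.meshgrid(fx, fx))
f[0, 0] = 1.0
phases = np.random.default_rng(1).random((n, n))
img = np.real(np.fft.ifft2((1.0 / f) * np.exp(2j * np.pi * phases)))
slope = amplitude_slope(img)
```

Manipulating this single statistic is how stimuli of varying "naturalness" are typically constructed for experiments like the one described.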
Affiliation(s)
- Olga Dyakova
- Department of Neuroscience, Uppsala University, Box 593, 75124 Uppsala, Sweden
- Yu-Jen Lee
- Department of Neuroscience, Uppsala University, Box 593, 75124 Uppsala, Sweden
- Kit D. Longden
- HHMI Janelia Research Campus, 19700 Helix Drive, Ashburn, Virginia 20176, USA
- Valerij G. Kiselev
- Medical Physics, Department of Radiology, University Medical Center Freiburg, Breisacher Strasse 60a, 79106 Freiburg, Germany
- Karin Nordström
- Department of Neuroscience, Uppsala University, Box 593, 75124 Uppsala, Sweden; Anatomy and Histology, Centre for Neuroscience, Flinders University, GPO Box 2100, Adelaide, South Australia 5001, Australia
|
125
|
Jennings BJ, Wang K, Menzies S, Kingdom FAA. Detection of chromatic and luminance distortions in natural scenes. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2015; 32:1613-1622. [PMID: 26367428 DOI: 10.1364/josaa.32.001613] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
A number of studies have measured visual thresholds for detecting spatial distortions applied to images of natural scenes. In one study, Bex [J. Vis. 10(2):23 (2010), doi:10.1167/10.2.23] measured sensitivity to sinusoidal spatial modulations of image scale. Here, we measure sensitivity to sinusoidal scale distortions applied to the chromatic, luminance, or both layers of natural scene images. We first established that sensitivity does not depend on whether the undistorted comparison image was of the same or of a different scene. Next, we found that, when the luminance but not chromatic layer was distorted, performance was the same regardless of whether the chromatic layer was present, absent, or phase-scrambled; in other words, the chromatic layer, in whatever form, did not affect sensitivity to the luminance layer distortion. However, when the chromatic layer was distorted, sensitivity was higher when the luminance layer was intact compared to when absent or phase-scrambled. These detection threshold results complement the appearance of periodic distortions of the image scale: when the luminance layer is distorted visibly, the scene appears distorted, but when the chromatic layer is distorted visibly, there is little apparent scene distortion. We conclude that (a) observers have a built-in sense of how a normal image of a natural scene should appear, and (b) the detection of distortion in, as well as the apparent distortion of, natural scene images is mediated predominantly by the luminance layer and not chromatic layer.
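A minimal version of a sinusoidal spatial distortion is a sinusoidally varying horizontal displacement of image rows, sketched below with nearest-pixel, wrap-around remapping. The amplitude and cycle count are arbitrary; studies of this kind modulate local image scale with more careful interpolation.

```python
import numpy as np

def sinusoidal_distort(img, amplitude=1.0, cycles=1):
    """Shift each row horizontally by a sinusoidally varying amount
    (nearest-pixel, wrap-around remapping keeps the sketch simple)."""
    n_rows, n_cols = img.shape
    out = np.empty_like(img)
    cols = np.arange(n_cols)
    for y in range(n_rows):
        shift = int(round(amplitude * np.sin(2 * np.pi * cycles * y / n_rows)))
        out[y] = img[y, (cols - shift) % n_cols]
    return out

img = np.arange(64.0).reshape(8, 8)
warped = sinusoidal_distort(img)
```

Applying such a warp to only the luminance plane or only the chromatic plane of a color image reproduces the layer-wise manipulation described in the abstract.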
|
127
|
Ossandón JP, König P, Heed T. Irrelevant tactile stimulation biases visual exploration in external coordinates. Sci Rep 2015; 5:10664. [PMID: 26021612 PMCID: PMC4448131 DOI: 10.1038/srep10664] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2015] [Accepted: 04/27/2015] [Indexed: 11/30/2022] Open
Abstract
We evaluated the effect of irrelevant tactile stimulation on humans’ free-viewing behavior during the exploration of complex static scenes. Specifically, we address the questions of (1) whether task-irrelevant tactile stimulation presented to subjects’ hands can guide visual selection during free viewing; (2) whether tactile stimulation can modulate visual exploratory biases that are independent of image content and task goals; and (3) in which reference frame these effects occur. Tactile stimulation to uncrossed and crossed hands during the viewing of static images resulted in long-lasting modulation of visual orienting responses. Subjects showed a well-known leftward bias during the early exploration of images, and this bias was modulated by tactile stimulation presented at image onset. Tactile stimulation, both at image onset and later during the trials, biased visual orienting toward the space ipsilateral to the stimulated hand, both in uncrossed and crossed hand postures. The long-lasting temporal and global spatial profile of the modulation of free viewing exploration by touch indicates that cross-modal cues produce orienting responses, which are coded exclusively in an external reference frame.
Affiliation(s)
- José P Ossandón
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
- Peter König
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany; Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, Germany
- Tobias Heed
- Biological Psychology & Neuropsychology, Faculty of Psychology & Movement Science, University of Hamburg, Hamburg, Germany
|
128
|
Predicting cortical dark/bright asymmetries from natural image statistics and early visual transforms. PLoS Comput Biol 2015; 11:e1004268. [PMID: 26020624 PMCID: PMC4447361 DOI: 10.1371/journal.pcbi.1004268] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2014] [Accepted: 03/28/2015] [Indexed: 11/19/2022] Open
Abstract
The nervous system has evolved in an environment with structure and predictability. One of the ubiquitous principles of sensory systems is the creation of circuits that capitalize on this predictability. Previous work has identified predictable non-uniformities in the distributions of basic visual features in natural images that are relevant to the encoding tasks of the visual system. Here, we report that the well-established statistical distributions of visual features, such as visual contrast, spatial scale, and depth, differ between bright and dark image components. Following this analysis, we go on to trace how these differences in natural images translate into different patterns of cortical input that arise from the separate bright (ON) and dark (OFF) pathways originating in the retina. We use models of these early visual pathways to transform natural images into statistical patterns of cortical input. The models include the receptive fields and non-linear response properties of the magnocellular (M) and parvocellular (P) pathways, with their ON and OFF pathway divisions. The results indicate that there are regularities in visual cortical input beyond those that have previously been appreciated from the direct analysis of natural images. In particular, several dark/bright asymmetries provide a potential account for recently discovered asymmetries in how the brain processes visual features, such as violations of classic energy-type models. On the basis of our analysis, we expect that the dark/bright dichotomy in natural images plays a key role in the generation of both cortical and perceptual asymmetries.

Sensory systems must contend with a tremendous amount of diversity in the natural world. Gaining a detailed description of the natural world's statistical regularities is a critical part of understanding how the nervous system is adapted to its environment. Here, we report that the well-established statistical distributions of basic visual features, such as visual contrast and spatial scale, diverge when separated into bright and dark components. Operations such as dark/bright segregation are key features of early visual pathways. By modeling these pathways, we demonstrate that the dark and bright visual patterns driving cortical networks are asymmetric across a number of visual features, producing previously unappreciated second-order regularities. The results provide a parsimonious account for recently discovered asymmetries in cortical activity.
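The ON/OFF segregation at the heart of this analysis is, at its simplest, half-wave rectification of the mean-subtracted image. The toy values below are an assumption chosen to mimic the dark-heavy skew of natural scenes, not measured data.

```python
import numpy as np

def on_off_split(img):
    """Half-wave rectify a mean-subtracted image into ON (bright) and
    OFF (dark) components."""
    c = img - img.mean()
    return np.maximum(c, 0.0), np.maximum(-c, 0.0)

# Few deep darks vs. many shallow brights (a natural-image-like skew)
img = np.array([[0.0, 0.2, 0.2],
                [0.2, 0.2, -0.8]])
on, off = on_off_split(img)
```

Comparing statistics (contrast, extremes, counts) between the `on` and `off` components is the kind of dark/bright asymmetry measurement the abstract describes.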
|
129
|
Cheng W, Hirakawa K. Minimum risk wavelet shrinkage operator for Poisson image denoising. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2015; 24:1660-1671. [PMID: 25769158 DOI: 10.1109/tip.2015.2409566] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
The pixel values of images taken by an image sensor are corrupted by Poisson noise. To date, multiscale Poisson image denoising techniques have processed Haar frame and wavelet coefficients, where the modeling of coefficients is enabled by Skellam distribution analysis. We extend these results by solving for shrinkage operators for Skellam coefficients that minimize the risk functional in the multiscale Poisson image denoising setting. The minimum risk shrinkage operator of this kind effectively produces denoised wavelet coefficients with the minimum attainable L2 error.
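The Skellam connection is easy to verify numerically: an unnormalized Haar detail coefficient of Poisson counts is a difference of two independent Poisson variables, which is Skellam distributed with mean mu1 - mu2 and variance mu1 + mu2. The rates below are arbitrary; this is a sanity check, not the paper's shrinkage operator.

```python
import numpy as np

# Difference of two independent Poisson samples: Skellam distributed,
# so the sample mean should approach mu1 - mu2 and the sample variance
# should approach mu1 + mu2.
rng = np.random.default_rng(0)
mu1, mu2 = 8.0, 3.0
diff = rng.poisson(mu1, 200_000) - rng.poisson(mu2, 200_000)
mean_est = diff.mean()   # expect about mu1 - mu2 = 5
var_est = diff.var()     # expect about mu1 + mu2 = 11
```

It is this closed-form second-moment structure that makes risk-minimizing shrinkage of Haar coefficients tractable in the Poisson setting.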
|
130
|
Sawayama M, Kimura E. Stain on texture: Perception of a dark spot having a blurred edge on textured backgrounds. Vision Res 2015; 109:209-20. [DOI: 10.1016/j.visres.2014.11.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2014] [Revised: 10/12/2014] [Accepted: 11/12/2014] [Indexed: 11/25/2022]
|
131
|
Sato H, Motoyoshi I, Sato T. On-Off asymmetry in the perception of blur. Vision Res 2015; 120:5-10. [PMID: 25817715 DOI: 10.1016/j.visres.2015.03.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2014] [Revised: 03/09/2015] [Accepted: 03/10/2015] [Indexed: 11/18/2022]
Abstract
Natural images appear blurred when imperfect lens focus reduces contrast energy at higher spatial frequencies. Here, we present evidence that perceived blur also depends on asymmetries between On (positive contrast polarities) and Off (negative contrast polarities) image signals. Psychophysical matching experiments involving natural and artificial stimuli suggest that attenuating Off signals at high spatial frequencies results in increased perceptual blur relative to similar attenuations of On signals. Results support the notion that Off image signals play an important role in blur perception.
Affiliation(s)
- Hiromi Sato
- Department of Psychology, Graduate School of Humanities and Sociology, The University of Tokyo, Japan; JSPS Research Fellow, Japan
- Takao Sato
- Department of Psychology, Graduate School of Humanities and Sociology, The University of Tokyo, Japan
|
132
|
Li X, Chen Y, Lashgari R, Bereshpolova Y, Swadlow HA, Lee BB, Alonso JM. Mixing of Chromatic and Luminance Retinal Signals in Primate Area V1. Cereb Cortex 2014; 25:1920-37. [PMID: 24464943 DOI: 10.1093/cercor/bhu002] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Vision emerges from activation of chromatic and achromatic retinal channels whose interaction in visual cortex is still poorly understood. To investigate this interaction, we recorded neuronal activity from retinal ganglion cells and V1 cortical cells in macaques and measured their visual responses to grating stimuli that had either luminance contrast (luminance grating), chromatic contrast (chromatic grating), or a combination of the two (compound grating). As with parvocellular or koniocellular retinal ganglion cells, some V1 cells responded mostly to the chromatic contrast of the compound grating. As with magnocellular retinal ganglion cells, other V1 cells responded mostly to the luminance contrast and generated a frequency-doubled response to equiluminant chromatic gratings. Unlike magnocellular and parvocellular retinal ganglion cells, V1 cells formed a unimodal distribution for luminance/color preference with a 2- to 4-fold bias toward luminance. V1 cells associated with positive local field potentials in deep layers showed the strongest combined responses to color and luminance and, as a population, V1 cells encoded a diverse combination of luminance/color edges that matched edge distributions of natural scenes. Taken together, these results suggest that the primary visual cortex combines magnocellular and parvocellular retinal inputs to increase cortical receptive field diversity and to optimize visual processing of our natural environment.
Affiliation(s)
- Xiaobing Li
- Department of Biological Sciences, SUNY Optometry, New York, NY 10036, USA
- Yao Chen
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Reza Lashgari
- Department of Biological Sciences, SUNY Optometry, New York, NY 10036, USA; Department of Biomedical Engineering, School of Electrical Engineering, Iran University of Science and Technology, Narmak, Tehran, Iran
- Yulia Bereshpolova
- Department of Psychology, University of Connecticut, Storrs, CT 06269, USA
- Harvey A Swadlow
- Department of Biological Sciences, SUNY Optometry, New York, NY 10036, USA; Department of Psychology, University of Connecticut, Storrs, CT 06269, USA
- Barry B Lee
- Department of Biological Sciences, SUNY Optometry, New York, NY 10036, USA; Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
- Jose Manuel Alonso
- Department of Biological Sciences, SUNY Optometry, New York, NY 10036, USA; Department of Psychology, University of Connecticut, Storrs, CT 06269, USA
|
133
|
Groen II, Ghebreab S, Prins H, Lamme VA, Scholte HS. From image statistics to scene gist: evoked neural activity reveals transition from low-level natural image structure to scene category. J Neurosci 2013; 33:18814-24. [PMID: 24285888 PMCID: PMC6618700 DOI: 10.1523/jneurosci.3128-13.2013] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2013] [Revised: 10/07/2013] [Accepted: 10/24/2013] [Indexed: 11/21/2022] Open
Abstract
The visual system processes natural scenes in a split second. Part of this process is the extraction of "gist," a global first impression. It is unclear, however, how the human visual system computes this information. Here, we show that, when human observers categorize global information in real-world scenes, the brain exhibits strong sensitivity to low-level summary statistics. Subjects rated a specific instance of a global scene property, naturalness, for a large set of natural scenes while EEG was recorded. For each individual scene, we derived two physiologically plausible summary statistics by spatially pooling local contrast filter outputs: contrast energy (CE), indexing contrast strength, and spatial coherence (SC), indexing scene fragmentation. We show that behavioral performance is directly related to these statistics, with naturalness rating being influenced in particular by SC. At the neural level, both statistics parametrically modulated single-trial event-related potential amplitudes during an early, transient window (100-150 ms), but SC continued to influence activity levels later in time (up to 250 ms). In addition, the magnitude of neural activity that discriminated between man-made versus natural ratings of individual trials was related to SC, but not CE. These results suggest that global scene information may be computed by spatial pooling of responses from early visual areas (e.g., LGN or V1). The increased sensitivity over time to SC in particular, which reflects scene fragmentation, suggests that this statistic is actively exploited to estimate scene naturalness.
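The two summary statistics can be sketched with simpler filters than the physiologically inspired ones used in the paper: contrast energy (CE) as the mean of pooled local contrast, and a spatial-coherence (SC) proxy as the spread of that pooled distribution. Using gradient magnitude for local contrast and a coefficient of variation for SC are this sketch's simplifications.

```python
import numpy as np

def ce_sc(img):
    """CE: mean of local contrast (gradient magnitude here); SC proxy:
    coefficient of variation of that local-contrast distribution."""
    gy, gx = np.gradient(img.astype(float))
    contrast = np.hypot(gx, gy)
    ce = contrast.mean()
    sc = contrast.std() / (ce + 1e-12)
    return ce, sc

# A smooth ramp has uniform contrast: high CE, near-zero SC proxy
ramp = np.tile(np.arange(8.0), (8, 1))
ce, sc = ce_sc(ramp)
```

A fragmented scene would instead concentrate its contrast in sparse edges, yielding a broad local-contrast distribution and a large SC value.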
Affiliation(s)
- Iris I.A. Groen
- Cognitive Neuroscience Group, Department of Psychology; Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies
- Sennay Ghebreab
- Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies; Intelligent Systems Laboratory Amsterdam, Institute of Informatics, University of Amsterdam, 1018 WS, Amsterdam, The Netherlands
- Hielke Prins
- Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies
- H. Steven Scholte
- Cognitive Neuroscience Group, Department of Psychology; Amsterdam Center for Brain and Cognition, Institute for Interdisciplinary Studies
|
134
|
Elliott SL, Cao D. Scotopic hue percepts in natural scenes. J Vis 2013; 13:15. [PMID: 24233245 PMCID: PMC3829393 DOI: 10.1167/13.13.15] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2013] [Accepted: 10/08/2013] [Indexed: 11/24/2022] Open
Abstract
Traditional trichromatic theories of color vision conclude that color perception is not possible under scotopic illumination in which only one type of photoreceptor, rods, is active. The current study demonstrates the existence of scotopic color perception and indicates that perceived hue is influenced by spatial context and top-down processes of color perception. Experiment 1 required observers to report the perceived hue in various natural scene images under purely rod-mediated vision. The results showed that when the test patch had low variation in the luminance distribution and was a decrement in luminance compared to the surrounding area, reddish or orangish percepts were more likely to be reported compared to all other percepts. In contrast, when the test patch had a high variation and was an increment in luminance, the probability of perceiving blue, green, or yellow hues increased. In addition, when observers had a strong, but singular, daylight hue association for the test patch, color percepts were reported more often and hues appeared more saturated compared to patches with no daylight hue association. This suggests that experience in daylight conditions modulates the bottom-up processing for rod-mediated color perception. In Experiment 2, observers reported changes in hue percepts for a test ring surrounded by inducing rings that varied in spatial context. In sum, the results challenge the classic view that rod vision is achromatic and suggest that scotopic hue perception is mediated by cortical mechanisms.
Affiliation(s)
- Dingcai Cao
- Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, IL, USA
|
135
|
Burini N, Nadernejad E, Korhonen J, Forchhammer S, Wu X. Modeling Power-Constrained Optimal Backlight Dimming for Color Displays. J Disp Technol 2013. [DOI: 10.1109/jdt.2013.2253544] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
136
|
Del Viva MM, Punzi G, Benedetti D. Information and perception of meaningful patterns. PLoS One 2013; 8:e69154. [PMID: 23894422 PMCID: PMC3716808 DOI: 10.1371/journal.pone.0069154] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2012] [Accepted: 06/12/2013] [Indexed: 11/22/2022] Open
Abstract
The visual system needs to extract the most important elements of the external world from a large flux of information in a short time for survival purposes. It is widely believed that in performing this task, it operates a strong data reduction at an early stage, by creating a compact summary of relevant information that can be handled by further levels of processing. In this work we formulate a model of early vision based on a pattern-filtering architecture, partly inspired by high-speed digital data reduction in experimental high-energy physics (HEP). This allows a much stronger data reduction than models based just on redundancy reduction. We show that optimizing this model for best information preservation under tight constraints on computational resources yields surprisingly specific a-priori predictions for the shape of biologically plausible features, and for experimental observations on fast extraction of salient visual features by human observers. Interestingly, applying the same optimized model to HEP data acquisition systems based on pattern-filtering architectures leads to specific a-priori predictions for the relevant data patterns that these devices extract from their inputs. These results suggest that the limitedness of computing resources can play an important role in shaping the nature of perception, by determining what is perceived as “meaningful features” in the input data.
Affiliation(s)
- Maria M Del Viva
- NEUROFARBA Dipartimento di Neuroscienze, Psicologia, Area del Farmaco e Salute del Bambino, Sezione di Psicologia, Università di Firenze, Firenze, Italy.
|
137
|
Allred SR, Brainard DH. A Bayesian model of lightness perception that incorporates spatial variation in the illumination. J Vis 2013; 13:18. [PMID: 23814073 PMCID: PMC3697904 DOI: 10.1167/13.7.18] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2012] [Accepted: 03/19/2013] [Indexed: 11/24/2022] Open
Abstract
The lightness of a test stimulus depends in a complex manner on the context in which it is viewed. To predict lightness, it is necessary to leverage measurements of a feasible number of contextual configurations into predictions for a wider range of configurations. Here we pursue this goal, using the idea that lightness results from the visual system's attempt to provide stable information about object surface reflectance. We develop a Bayesian algorithm that estimates both illumination and reflectance from image luminance, and link perceived lightness to the algorithm's estimates of surface reflectance. The algorithm resolves ambiguity in the image through the application of priors that specify what illumination and surface reflectances are likely to occur in viewed scenes. The prior distributions were chosen to allow spatial variation in both illumination and surface reflectance. To evaluate our model, we compared its predictions to a data set of judgments of perceived lightness of test patches embedded in achromatic checkerboards (Allred, Radonjić, Gilchrist, & Brainard, 2012). The checkerboard stimuli incorporated the large variation in luminance that is a pervasive feature of natural scenes. In addition, the luminance profile of the checks both near to and remote from the central test patches was systematically manipulated. The manipulations provided a simplified version of spatial variation in illumination. The model can account for effects of overall changes in image luminance and the dependence of such changes on spatial location as well as some but not all of the more detailed features of the data.
Affiliation(s)
- Sarah R. Allred
- Department of Psychology, Rutgers, The State University of New Jersey, Camden, NJ, USA
- David H. Brainard
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
|
138
|
Kay KN, Winawer J, Rokem A, Mezer A, Wandell BA. A two-stage cascade model of BOLD responses in human visual cortex. PLoS Comput Biol 2013; 9:e1003079. [PMID: 23737741 PMCID: PMC3667759 DOI: 10.1371/journal.pcbi.1003079] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2012] [Accepted: 04/18/2013] [Indexed: 12/03/2022] Open
Abstract
Visual neuroscientists have discovered fundamental properties of neural representation through careful analysis of responses to controlled stimuli. Typically, different properties are studied and modeled separately. To integrate our knowledge, it is necessary to build general models that begin with an input image and predict responses to a wide range of stimuli. In this study, we develop a model that accepts an arbitrary band-pass grayscale image as input and predicts blood oxygenation level dependent (BOLD) responses in early visual cortex as output. The model has a cascade architecture, consisting of two stages of linear and nonlinear operations. The first stage involves well-established computations—local oriented filters and divisive normalization—whereas the second stage involves novel computations—compressive spatial summation (a form of normalization) and a variance-like nonlinearity that generates selectivity for second-order contrast. The parameters of the model, which are estimated from BOLD data, vary systematically across visual field maps: compared to primary visual cortex, extrastriate maps generally have larger receptive field size, stronger levels of normalization, and increased selectivity for second-order contrast. Our results provide insight into how stimuli are encoded and transformed in successive stages of visual processing. Much has been learned about how stimuli are represented in the visual system from measuring responses to carefully designed stimuli. Typically, different studies focus on different types of stimuli. Making sense of the large array of findings requires integrated models that explain responses to a wide range of stimuli. In this study, we measure functional magnetic resonance imaging (fMRI) responses in early visual cortex to a wide range of band-pass filtered images, and construct a computational model that takes the stimuli as input and predicts the fMRI responses as output. 
The model has a cascade architecture, consisting of two stages of linear and nonlinear operations. A novel component of the model is a nonlinear operation that generates selectivity for second-order contrast, that is, variations in contrast-energy across the visual field. We find that this nonlinearity is stronger in extrastriate areas V2 and V3 than in primary visual cortex V1. Our results provide insight into how stimuli are encoded and transformed in the visual system.
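The variance-like second-stage nonlinearity described above can be illustrated as follows. This is a minimal sketch assuming local spatial variance over pooled first-stage outputs as the measure of second-order contrast; the patch size and pooling scheme are placeholders, not the fitted model from the paper.

```python
import numpy as np

def second_order_contrast(first_stage, patch=4):
    """Variance-like nonlinearity over first-stage filter outputs.

    Responses are pooled in local windows and their spatial variance is
    taken, yielding sensitivity to variations in contrast energy across
    the visual field (second-order contrast). Illustrative only.
    """
    r = np.asarray(first_stage, dtype=float)
    h, w = r.shape
    h -= h % patch
    w -= w % patch
    # non-overlapping pooling windows; variance within each window
    tiles = r[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return tiles.var(axis=(1, 3))

# a field with uniform contrast produces zero second-order response
rng = np.random.default_rng(0)
modulated = second_order_contrast(rng.random((16, 16)))
```

Uniform first-stage activity gives zero output everywhere, while spatially modulated activity gives positive output, which is the selectivity the model's second stage is meant to capture.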
Affiliation(s)
- Kendrick N Kay
- Department of Psychology, Stanford University, Stanford, California, USA.
|
139
|
Shen J, Yang X, Li X, Jia Y. Intrinsic Image Decomposition Using Optimization and User Scribbles. IEEE TRANSACTIONS ON CYBERNETICS 2013; 43:425-436. [PMID: 22907970 DOI: 10.1109/tsmcb.2012.2208744] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
In this paper, we present a novel high-quality intrinsic image recovery approach using optimization and user scribbles. Our approach is based on an assumption about color characteristics in a local window in natural images: neighboring pixels in a local window that have similar intensity values should also have similar reflectance values. Thus, intrinsic image decomposition is formulated as minimizing an energy function with a weighting constraint added to the local image properties. To improve the decomposition results, we further specify local constraint cues by integrating user strokes into our energy formulation, including constant-reflectance, constant-illumination, and fixed-illumination brushes. Our experimental results demonstrate that the proposed approach recovers intrinsic reflectance and illumination components better than previous approaches.
|
140
|
Abstract
During visual exploration, saccadic eye movements scan the scene for objects of interest. During attempted fixation, the eyes are relatively still but often produce microsaccades. Saccadic rates during exploration are higher than those of microsaccades during fixation, reinforcing the classic view that exploration and fixation are two distinct oculomotor behaviors. An alternative model is that fixation and exploration are not dichotomous, but are instead two extremes of a functional continuum. Here, we measured the eye movements of human observers as they either fixed their gaze on a small spot or scanned natural scenes of varying sizes. As scene size diminished, so did saccade rates, until they were continuous with microsaccadic rates during fixation. Other saccadic properties varied as a function of image size as well, forming a continuum with microsaccadic parameters during fixation. This saccadic continuum extended to nonrestrictive, ecological viewing conditions that allowed all types of saccades and fixation positions. Eye movement simulations moreover showed that a single model of oculomotor behavior can explain the saccadic continuum from exploration to fixation, for images of all sizes. These findings challenge the view that exploration and fixation are dichotomous, suggesting instead that visual fixation is functionally equivalent to visual exploration on a spatially focused scale.
|
141
|
Ossandón JP, Onat S, Cazzoli D, Nyffeler T, Müri R, König P. Unmasking the contribution of low-level features to the guidance of attention. Neuropsychologia 2012; 50:3478-87. [PMID: 23044277 DOI: 10.1016/j.neuropsychologia.2012.09.043] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2011] [Revised: 09/19/2012] [Accepted: 09/26/2012] [Indexed: 11/18/2022]
Affiliation(s)
- José P Ossandón
- Universität Osnabrück, Institut für Kognitionswissenschaft, Albrechtstr. 28, 49076 Osnabrück, Germany.
|
142
|
Groen IIA, Ghebreab S, Lamme VAF, Scholte HS. Spatially pooled contrast responses predict neural and perceptual similarity of naturalistic image categories. PLoS Comput Biol 2012; 8:e1002726. [PMID: 23093921 PMCID: PMC3475684 DOI: 10.1371/journal.pcbi.1002726] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/29/2012] [Accepted: 08/02/2012] [Indexed: 11/22/2022] Open
Abstract
The visual world is complex and continuously changing. Yet, our brain transforms patterns of light falling on our retina into a coherent percept within a few hundred milliseconds. Possibly, low-level neural responses already carry substantial information to facilitate rapid characterization of the visual input. Here, we computationally estimated low-level contrast responses to computer-generated naturalistic images, and tested whether spatial pooling of these responses could predict image similarity at the neural and behavioral level. Using EEG, we show that statistics derived from pooled responses explain a large amount of variance between single-image evoked potentials (ERPs) in individual subjects. Dissimilarity analysis on multi-electrode ERPs demonstrated that large differences between images in pooled response statistics are predictive of more dissimilar patterns of evoked activity, whereas images with little difference in statistics give rise to highly similar evoked activity patterns. In a separate behavioral experiment, images with large differences in statistics were judged as different categories, whereas images with little differences were confused. These findings suggest that statistics derived from low-level contrast responses can be extracted in early visual processing and can be relevant for rapid judgment of visual similarity. We compared our results with two other, well-known contrast statistics: Fourier power spectra and higher-order properties of contrast distributions (skewness and kurtosis). Interestingly, whereas these statistics allow for accurate image categorization, they do not predict ERP response patterns or behavioral categorization confusions. These converging computational, neural and behavioral results suggest that statistics of pooled contrast responses contain information that corresponds with perceived visual similarity in a rapid, low-level categorization task.
Humans excel in rapid and accurate processing of visual scenes. However, it is unclear which computations allow the visual system to convert light hitting the retina into a coherent representation of visual input in a rapid and efficient way. Here we used simple, computer-generated image categories with similar low-level structure as natural scenes to test whether a model of early integration of low-level information can predict perceived category similarity. Specifically, we show that summarized (spatially pooled) responses of model neurons covering the entire visual field (the population response) to low-level properties of visual input (contrasts) can already be informative about differences in early visual evoked activity as well as behavioral confusions of these categories. These results suggest that low-level population responses can carry relevant information to estimate similarity of controlled images, and put forward the exciting hypothesis that the visual system may exploit these responses to rapidly process real natural scenes. We propose that the spatial pooling that allows for the extraction of this information may be a plausible first step in extracting scene gist to form a rapid impression of the visual input.
Affiliation(s)
- Iris I. A. Groen
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Sennay Ghebreab
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Intelligent Systems Lab Amsterdam, Institute of Informatics, University of Amsterdam, Amsterdam, The Netherlands
- Victor A. F. Lamme
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- H. Steven Scholte
- Cognitive Neuroscience Group, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
|
143
|
Ellemberg D, Hansen BC, Johnson A. The developing visual system is not optimally sensitive to the spatial statistics of natural images. Vision Res 2012; 67:1-7. [PMID: 22766478 DOI: 10.1016/j.visres.2012.06.018] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2011] [Revised: 04/04/2012] [Accepted: 06/22/2012] [Indexed: 11/28/2022]
Abstract
The adult visual system is optimally tuned to process the spatial properties of natural scenes, which is demonstrated by sensitivity to changes in the 1/f^α amplitude spectrum. It is also well documented that different aspects of spatial vision, including those likely responsible for the perception of natural scenes (e.g., spatial frequency discrimination), do not become mature until late childhood. This led us to hypothesise that the developing visual system is not optimally tuned to process the spatial properties of real-world scenes. The present study investigated how sensitivity to the statistical properties of natural images changes during development. Thresholds for discriminating a change in the slope of the amplitude spectrum of a natural scene with a reference α of 0.7, 1.0, or 1.3 were measured in children aged 6, 8, and 10 years (n=16 per age) and in adults (mean age=23). Consistent with previous studies, adults were least sensitive for the shallowest α (i.e., 0.7) and most sensitive for the steepest α (i.e., 1.3). Six- and 8-year-olds had significantly higher discrimination thresholds compared to the 10-year-olds and adults for α's of 1.0 and 1.3, and 10-year-olds did not differ significantly from adults for any of the α's tested. These data suggest that sensitivity to detecting a change in the spatial characteristics of natural scenes may not be optimally tuned to the statistics of natural images until about 10 years of age. Rather, it seems that perception of natural images could be limited by the known immaturities in spatial vision (Ellemberg, Lepore, & Turgeon, 2010). The question remains as to whether the adult's exquisite sensitivity to the spatial properties of the natural world is experience driven or whether it is part of our genetic programming that only fully expresses itself in late childhood.
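The manipulated quantity here is the slope α of the image amplitude spectrum in the 1/f^α model. A rough way to estimate it is to rotationally average the FFT amplitude and fit a line in log-log coordinates; the integer-radius binning and the fitting range below are arbitrary choices, not the paper's procedure.

```python
import numpy as np

def amplitude_spectrum_slope(image):
    """Estimate alpha in the 1/f^alpha amplitude-spectrum model.

    Rotationally averages the 2D FFT amplitude over integer frequency
    radii, then fits a straight line in log-log coordinates; the
    negated slope is the estimate of alpha.
    """
    img = np.asarray(image, dtype=float)
    amp = np.abs(np.fft.fftshift(np.fft.fft2(img - img.mean())))
    h, w = img.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2).astype(int)
    # mean amplitude in each integer frequency band (skip the DC bin)
    nbins = min(h, w) // 2
    sums = np.bincount(radius.ravel(), weights=amp.ravel())[1:nbins]
    counts = np.bincount(radius.ravel())[1:nbins]
    radial_mean = sums / counts
    freqs = np.arange(1, nbins)
    slope, _ = np.polyfit(np.log(freqs), np.log(radial_mean), 1)
    return -slope
```

Applied to synthetic noise generated with a known 1/f^α spectrum, the estimate lands near the generating α, which is the sanity check one would run before using it on natural scenes.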
Affiliation(s)
- Dave Ellemberg
- Université de Montréal, Department of Kinesiology, Montréal, Québec, Canada.
|
144
|
Wardle SG, Bex PJ, Cass J, Alais D. Stereoacuity in the periphery is limited by internal noise. J Vis 2012; 12:12. [PMID: 22685339 PMCID: PMC4502945 DOI: 10.1167/12.6.12] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/29/2012] [Accepted: 05/04/2012] [Indexed: 11/24/2022] Open
Abstract
It is well-established that depth discrimination is finer in the fovea than the periphery. Here, we study the decline in depth discrimination thresholds with distance from the fovea using an equivalent noise analysis to separate the contributions of internal noise and sampling efficiency. Observers discriminated the mean depth of patches of "dead leaves" composed of ellipses varying in size, orientation, and luminance at varying levels of disparity noise between 0.05 and 13.56 arcmin and visual field locations between 0° and 9° eccentricity. At low levels of disparity noise, depth discrimination thresholds were lower in the fovea than in the periphery. At higher noise levels (above 3.39 arcmin), thresholds converged, and there was little difference between foveal and peripheral depth discrimination. The parameters estimated from the equivalent noise model indicate that an increase in internal noise is the limiting factor in peripheral depth discrimination with no decline in sampling efficiency. Sampling efficiency was uniformly low across the visual field. The results indicate that a loss of precision of local disparity estimates early in visual processing limits fine depth discrimination in the periphery.
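Equivalent noise analyses of this kind commonly assume that squared threshold grows linearly with external noise variance, T² = (σ_int² + σ_ext²)/E, where σ_int is internal noise and E is sampling efficiency. Under that assumption both parameters fall out of a straight-line fit. The sketch below uses the noise range quoted in the abstract; the generating parameter values are made up for illustration.

```python
import numpy as np

def fit_equivalent_noise(sigma_ext, thresholds):
    """Fit the equivalent-noise model T^2 = (sigma_int^2 + sigma_ext^2) / E.

    T^2 is linear in sigma_ext^2, so a straight-line fit gives
    slope = 1/E and intercept = sigma_int^2 / E.
    """
    slope, intercept = np.polyfit(np.square(sigma_ext), np.square(thresholds), 1)
    efficiency = 1.0 / slope
    sigma_int = np.sqrt(intercept * efficiency)
    return sigma_int, efficiency

# external noise levels (arcmin) spanning the range used in the study;
# thresholds generated from hypothetical parameters for illustration
ext = np.array([0.05, 0.42, 0.85, 1.69, 3.39, 6.78, 13.56])
obs = np.sqrt((1.2**2 + ext**2) / 0.15)
sigma_int, efficiency = fit_equivalent_noise(ext, obs)
```

The key diagnostic in the paper maps onto these two parameters: higher peripheral thresholds at low external noise with convergence at high noise implies a larger fitted σ_int with unchanged E.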
Affiliation(s)
- Susan G. Wardle
- School of Psychology, The University of Sydney, Sydney, New South Wales, Australia
- Peter J. Bex
- Schepens Eye Research Institute, Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, USA
- John Cass
- School of Psychology, University of Western Sydney, Sydney, New South Wales, Australia
- David Alais
- School of Psychology, The University of Sydney, Sydney, New South Wales, Australia
|
145
|
Natural versus synthetic stimuli for estimating receptive field models: a comparison of predictive robustness. J Neurosci 2012; 32:1560-76. [PMID: 22302799 DOI: 10.1523/jneurosci.4661-12.2012] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
An ultimate goal of visual neuroscience is to understand the neural encoding of complex, everyday scenes. Yet most of our knowledge of neuronal receptive fields has come from studies using simple artificial stimuli (e.g., bars, gratings) that may fail to reveal the full nature of a neuron's actual response properties. Our goal was to compare the utility of artificial and natural stimuli for estimating receptive field (RF) models. Using extracellular recordings from simple type cells in cat A18, we acquired responses to three types of broadband stimulus ensembles: two widely used artificial patterns (white noise and short bars), and natural images. We used a primary dataset to estimate the spatiotemporal receptive field (STRF) with two hold-back datasets for regularization and validation. STRFs were estimated using an iterative regression algorithm with regularization and subsequently fit with a zero-memory nonlinearity. Each RF model (STRF and zero-memory nonlinearity) was then used in simulations to predict responses to the same stimulus type used to estimate it, as well as to other broadband stimuli and sinewave gratings. White noise stimuli often elicited poor responses leading to noisy RF estimates, while short bars and natural image stimuli were more successful in driving A18 neurons and producing clear RF estimates with strong predictive ability. Natural image-derived RF models were the most robust at predicting responses to other broadband stimulus ensembles that were not used in their estimation and also provided good predictions of tuning curves for sinewave gratings.
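The core estimation step, fitting a linear spatiotemporal filter to stimulus-response pairs under regularization, can be sketched with plain ridge regression. This is a stand-in for the paper's iterative regularized algorithm; the array shapes and the ridge penalty are assumptions for the example.

```python
import numpy as np

def estimate_strf(stimuli, responses, ridge=1.0):
    """Linear RF estimate by ridge regression (a simple stand-in for
    the paper's iterative regularized algorithm).

    Each row of `stimuli` is a flattened spatiotemporal stimulus;
    `responses` holds the corresponding spike counts. The ridge term
    shrinks the estimate toward zero to cope with noisy responses.
    """
    X = np.asarray(stimuli, dtype=float)
    y = np.asarray(responses, dtype=float)
    n_feat = X.shape[1]
    # closed-form ridge solution: (X'X + lambda I)^-1 X'y
    w = np.linalg.solve(X.T @ X + ridge * np.eye(n_feat), X.T @ y)
    return w
```

On synthetic data with a known filter and negligible noise, a small ridge penalty recovers the filter almost exactly; with real spike data the penalty (and, in the paper, the hold-back regularization set) controls how much structure survives the noise.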
|
146
|
Schauerte B, Stiefelhagen R. Quaternion-Based Spectral Saliency Detection for Eye Fixation Prediction. Computer Vision – ECCV 2012. [DOI: 10.1007/978-3-642-33709-3_9] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
147
|
A three-dimensional spatiotemporal receptive field model explains responses of area MT neurons to naturalistic movies. J Neurosci 2011; 31:14551-64. [PMID: 21994372 DOI: 10.1523/jneurosci.6801-10.2011] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
Area MT has been an important target for studies of motion processing. However, previous neurophysiological studies of MT have used simple stimuli that do not contain many of the motion signals that occur during natural vision. In this study we sought to determine whether views of area MT neurons developed using simple stimuli can account for MT responses under more naturalistic conditions. We recorded responses from macaque area MT neurons during stimulation with naturalistic movies. We then used a quantitative modeling framework to discover which specific mechanisms best predict neuronal responses under these challenging conditions. We find that the simplest model that accurately predicts responses of MT neurons consists of a bank of V1-like filters, each followed by a compressive nonlinearity, a divisive nonlinearity, and linear pooling. Inspection of the fit models shows that the excitatory receptive fields of MT neurons tend to lie on a single plane within the three-dimensional spatiotemporal frequency domain, and suppressive receptive fields lie off this plane. However, most excitatory receptive fields form a partial ring in the plane and avoid low temporal frequencies. This receptive field organization ensures that most MT neurons are tuned for velocity but do not tend to respond to ambiguous static textures that are aligned with the direction of motion. In sum, MT responses to naturalistic movies are largely consistent with predictions based on simple stimuli. However, models fit using naturalistic stimuli reveal several novel properties of MT receptive fields that had not been shown in prior experiments.
|
148
|
The efficacy of local luminance amplitude in disambiguating the origin of luminance signals depends on carrier frequency: Further evidence for the active role of second-order vision in layer decomposition. Vision Res 2011; 51:496-507. [DOI: 10.1016/j.visres.2011.01.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2010] [Revised: 01/16/2011] [Accepted: 01/19/2011] [Indexed: 11/24/2022]
|
149
|
Vazquez-Corral J, Párraga CA, Vanrell M, Baldrich R. Color Constancy Algorithms: Psychophysical Evaluation on a New Dataset. J Imaging Sci Technol 2009. [DOI: 10.2352/j.imagingsci.technol.2009.53.3.031105] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022]
|
150
|
Kingdom FA. Perceiving light versus material. Vision Res 2008; 48:2090-105. [PMID: 18479723 DOI: 10.1016/j.visres.2008.03.020] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2008] [Revised: 03/25/2008] [Accepted: 03/26/2008] [Indexed: 10/22/2022]
|