1
Li R, Li J, Wang C, Liu H, Liu T, Wang X, Zou T, Huang W, Yan H, Chen H. Multi-Semantic Decoding of Visual Perception with Graph Neural Networks. Int J Neural Syst 2024; 34:2450016. [PMID: 38372016 DOI: 10.1142/s0129065724500163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Constructing computational decoding models to account for the cortical representation of semantic information plays a crucial role in understanding visual perception. The human visual system processes interactive relationships among different objects when perceiving the semantic contents of natural visions. However, the existing semantic decoding models commonly regard categories as completely separate and independent visually and semantically and rarely consider the relationships from prior information. In this work, a novel semantic graph learning model was proposed to decode multiple semantic categories of perceived natural images from brain activity. The proposed model was validated on the functional magnetic resonance imaging data collected from five normal subjects while viewing 2750 natural images comprising 52 semantic categories. The results showed that the Graph Neural Network-based decoding model achieved higher accuracies than other deep neural network models. Moreover, the co-occurrence probability among semantic categories showed a significant correlation with the decoding accuracy. Additionally, the results suggested that semantic content organized in a hierarchical way with higher visual areas was more closely related to the internal visual experience. Together, this study provides a superior computational framework for multi-semantic decoding that supports the visual integration mechanism of semantic processing.
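The co-occurrence probability among semantic categories mentioned in this abstract can be illustrated with a minimal sketch. It assumes the probability is the conditional frequency P(b present | a present) estimated from multi-label image annotations; the function name `cooccurrence_probability` and the toy labels are hypothetical, and the study's actual graph construction may differ.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_probability(label_sets):
    """Return P(b present | a present) for every ordered pair of labels."""
    single = Counter()  # number of images containing each label
    pair = Counter()    # number of images containing both labels of an ordered pair
    for labels in label_sets:
        unique = set(labels)
        single.update(unique)
        pair.update(permutations(sorted(unique), 2))
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


# Toy annotations (hypothetical, not the study's data).
images = [
    {"person", "dog"},
    {"person", "car"},
    {"person", "dog", "grass"},
    {"car"},
]
probs = cooccurrence_probability(images)
print(probs[("dog", "person")])  # every image with a dog also has a person
print(probs[("person", "dog")])  # 2 of the 3 person images contain a dog
```

The resulting table is asymmetric, which is why ordered pairs are counted: knowing a dog is present says more about a person being present than the reverse in this toy set.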
Affiliation(s)
- Rong Li
- The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Jiyi Li
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Chong Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Haoxiang Liu
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Tao Liu
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Xuyang Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Ting Zou
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Wei Huang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Hongmei Yan
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- Huafu Chen
- The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
- MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging, Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China
2
Machida I, Shishikura M, Yamane Y, Sakai K. Representation of Natural Contours by a Neural Population in Monkey V4. eNeuro 2024; 11:ENEURO.0445-23.2024. [PMID: 38423791 PMCID: PMC10946029 DOI: 10.1523/eneuro.0445-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 02/18/2024] [Accepted: 02/22/2024] [Indexed: 03/02/2024] Open
Abstract
The cortical visual area, V4, has been considered to code contours that contribute to the intermediate-level representation of objects. The neural responses to the complex contour features intrinsic to natural contours are expected to clarify the essence of the representation. To approach the cortical coding of natural contours, we investigated the simultaneous coding of multiple contour features in monkey (Macaca fuscata) V4 neurons and their population-level representation. A substantial number of neurons showed significant tuning for two or more features such as curvature and closure, indicating that a substantial number of V4 neurons simultaneously code multiple contour features. A large portion of the neurons responded vigorously to acutely curved contours that surrounded the center of classical receptive field, suggesting that V4 neurons tend to code prominent features of object contours. The analysis of mutual information (MI) between the neural responses and each contour feature showed that most neurons exhibited similar magnitudes for each type of MI, indicating that many neurons showing the responses depended on multiple contour features. We next examined the population-level representation by using multidimensional scaling analysis. The neural preferences to the multiple contour features and that to natural stimuli compared with silhouette stimuli increased along with the primary and secondary axes, respectively, indicating the contribution of the multiple contour features and surface textures in the population responses. Our analyses suggested that V4 neurons simultaneously code multiple contour features in natural images and represent contour and surface properties in population.
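The mutual information (MI) analysis described in this abstract can be illustrated with a minimal plug-in estimate for discrete variables. This is a sketch under stated assumptions: responses and contour features are assumed to be already binned into categories, the function name `mutual_information` and the toy samples are hypothetical, and the authors' estimator (including any bias correction) is not specified here.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in MI in bits between two discrete variables, from (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (count_x * count_y)
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical binned data: (response level, contour feature).
dependent = [("high", "curved")] * 4 + [("low", "straight")] * 4
independent = [
    ("high", "curved"), ("high", "straight"),
    ("low", "curved"), ("low", "straight"),
]
print(mutual_information(dependent))    # response fully determined by the feature
print(mutual_information(independent))  # response unrelated to the feature
```

The plug-in estimate is biased upward for small samples, so comparisons across features are only meaningful when each MI is computed from a similar number of trials and bins.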
Affiliation(s)
- Itsuki Machida
- Department of Computer Science, University of Tsukuba, Tsukuba 305-8573, Japan
- Motofumi Shishikura
- Department of Computer Science, University of Tsukuba, Tsukuba 305-8573, Japan
- Yukako Yamane
- Neural Computation Unit, Okinawa Institute of Science and Technology, Okinawa 904-0495, Japan
- Ko Sakai
- Department of Computer Science, University of Tsukuba, Tsukuba 305-8573, Japan
3
Hibbard PB, Hornsey RL, Asher JM. Binocular Information Improves the Reliability and Consistency of Pictorial Relief. Vision (Basel) 2022; 7:1. [PMID: 36649048 DOI: 10.3390/vision7010001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 11/26/2022] [Accepted: 12/15/2022] [Indexed: 12/25/2022]
Abstract
Binocular disparity is an important cue to three-dimensional shape. We assessed the contribution of this cue to the reliability and consistency of depth in stereoscopic photographs of natural scenes. Observers viewed photographs of cluttered scenes while adjusting a gauge figure to indicate the apparent three-dimensional orientation of the surfaces of objects. The gauge figure was positioned on the surfaces of objects at multiple points in the scene, and settings were made under monocular and binocular, stereoscopic viewing. Settings were used to create a depth relief map, indicating the apparent three-dimensional structure of the scene. We found that binocular cues increased the magnitude of apparent depth, the reliability of settings across repeated measures, and the consistency of perceived depth across participants. These results show that binocular cues make an important contribution to the precise and accurate perception of depth in natural scenes that contain multiple pictorial cues.
4
Gert AL, Ehinger BV, Timm S, Kietzmann TC, König P. WildLab: A naturalistic free viewing experiment reveals previously unknown electroencephalography signatures of face processing. Eur J Neurosci 2022; 56:6022-6038. [PMID: 36113866 DOI: 10.1111/ejn.15824] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/09/2022] [Revised: 08/26/2022] [Accepted: 08/30/2022] [Indexed: 12/29/2022]
Abstract
Neural mechanisms of face perception are predominantly studied in well-controlled experimental settings that involve random stimulus sequences and fixed eye positions. Although powerful, the employed paradigms are far from what constitutes natural vision. Here, we demonstrate the feasibility of ecologically more valid experimental paradigms using natural viewing behaviour, by combining a free viewing paradigm on natural scenes, free of photographer bias, with advanced data processing techniques that correct for overlap effects and co-varying non-linear dependencies of multiple eye movement parameters. We validate this approach by replicating classic N170 effects in neural responses, triggered by fixation onsets (fixation event-related potentials [fERPs]). Importantly, besides finding a strong correlation between both experiments, our more natural stimulus paradigm yielded smaller variability between subjects than the classic setup. Moving beyond classic temporal and spatial effect locations, our experiment furthermore revealed previously unknown signatures of face processing: This includes category-specific modulation of the event-related potential (ERP)'s amplitude even before fixation onset, as well as adaptation effects across subsequent fixations depending on their history.
Affiliation(s)
- Anna L Gert
- Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany
- Benedikt V Ehinger
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands; Stuttgart Center for Simulation Science, University of Stuttgart, Stuttgart, Germany
- Silja Timm
- Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany
- Tim C Kietzmann
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands; MRC Cognition and Brain Sciences Unit, Cambridge University, Cambridge, UK
- Peter König
- Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany; Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
5
Mazade R, Jin J, Rahimi-Nasrabadi H, Najafian S, Pons C, Alonso JM. Cortical mechanisms of visual brightness. Cell Rep 2022; 40:111438. [PMID: 36170812 DOI: 10.1016/j.celrep.2022.111438] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/21/2021] [Revised: 06/03/2022] [Accepted: 09/09/2022] [Indexed: 11/29/2022] Open
Abstract
The primary visual cortex signals the onset of light and dark stimuli with ON and OFF cortical pathways. Here, we demonstrate that both pathways generate similar response increments to large homogeneous surfaces and their response average increases with surface brightness. We show that, in cat visual cortex, response dominance from ON or OFF pathways is bimodally distributed when stimuli are smaller than one receptive field center but unimodally distributed when they are larger. Moreover, whereas small bright stimuli drive opposite responses from ON and OFF pathways (increased versus suppressed activity), large bright surfaces drive similar response increments. We show that this size-brightness relation emerges because strong illumination increases the size of light surfaces in nature and both ON and OFF cortical neurons receive input from ON thalamic pathways. We conclude that visual scenes are perceived as brighter when the average response increments from ON and OFF cortical pathways become stronger.
Summary: Mazade et al. find that the visual cortex encodes brightness differently for small than large stimuli. Bright small stimuli drive cortical pathways signaling lights and suppress cortical pathways signaling darks. Conversely, large surfaces drive response increments from both pathways and appear brightest when the response average is strongest.
6
Abstract
The THINGS database is a freely available stimulus set that has the potential to facilitate the generation of theory that bridges multiple areas within cognitive neuroscience. The database consists of 26,107 high quality digital photos that are sorted into 1,854 concepts. While a valuable resource, relatively few technical details relevant to the design of studies in cognitive neuroscience have been described. We present an analysis of two key low-level properties of THINGS images, luminance and luminance contrast. These image statistics are known to influence common physiological and neural correlates of perceptual and cognitive processes. In general, we found that the distributions of luminance and contrast are in close agreement with the statistics of natural images reported previously. However, we found that image concepts are separable in their luminance and contrast: we show that luminance and contrast alone are sufficient to classify images into their concepts with above chance accuracy. We describe how these factors may confound studies using the THINGS images, and suggest simple controls that can be implemented a priori or post-hoc. We discuss the importance of using such natural images as stimuli in psychological research.
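The two image statistics analysed in this abstract can be illustrated with a minimal sketch. It assumes pixel values are already linear luminance in a grayscale image, and it defines RMS contrast as the standard deviation of luminance divided by its mean, which is one common convention and not necessarily the exact definition used in the study. The function name `luminance_and_rms_contrast` and the toy images are hypothetical.

```python
import statistics


def luminance_and_rms_contrast(pixels):
    """Return (mean luminance, RMS contrast) for a 2-D list of luminance values."""
    values = [v for row in pixels for v in row]
    mean = statistics.fmean(values)
    # RMS contrast here: population standard deviation normalised by the mean.
    contrast = statistics.pstdev(values) / mean if mean > 0 else 0.0
    return mean, contrast


# Toy 2x2 "images" (hypothetical values in the range 0..1).
textured = [[0.2, 0.4], [0.6, 0.8]]
uniform = [[0.5, 0.5], [0.5, 0.5]]

print(luminance_and_rms_contrast(textured))
print(luminance_and_rms_contrast(uniform))
```

Both toy images have the same mean luminance but different contrast, which is the kind of low-level difference that could separate image concepts if left uncontrolled.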
Affiliation(s)
- William J Harrison
- Queensland Brain Institute and School of Psychology, The University of Queensland
7
Yu Z, Turner MH, Baudin J, Rieke F. Adaptation in cone photoreceptors contributes to an unexpected insensitivity of primate On parasol retinal ganglion cells to spatial structure in natural images. eLife 2022; 11:70611. [PMID: 35285798 PMCID: PMC8956286 DOI: 10.7554/elife.70611] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/22/2021] [Accepted: 03/13/2022] [Indexed: 02/06/2023] Open
Abstract
Neural circuits are constructed from nonlinear building blocks, and not surprisingly overall circuit behavior is often strongly nonlinear. But neural circuits can also behave near linearly, and some circuits shift from linear to nonlinear behavior depending on stimulus conditions. Such control of nonlinear circuit behavior is fundamental to neural computation. Here, we study a surprising stimulus dependence of the responses of macaque On (but not Off) parasol retinal ganglion cells: these cells respond nonlinearly to spatial structure in some stimuli but near linearly to spatial structure in others, including natural inputs. We show that these differences in the linearity of the integration of spatial inputs can be explained by a shift in the balance of excitatory and inhibitory synaptic inputs that originates at least partially from adaptation in the cone photoreceptors. More generally, this highlights how subtle asymmetries in signaling - here in the cone signals - can qualitatively alter circuit computation.
Affiliation(s)
- Zhou Yu
- Department of Physiology and Biophysics, University of Washington, Seattle, United States
- Maxwell H Turner
- Department of Physiology and Biophysics, University of Washington, Seattle, United States
- Jacob Baudin
- Department of Physiology and Biophysics, University of Washington, Seattle, United States
- Fred Rieke
- Department of Physiology and Biophysics, University of Washington, Seattle, United States
8
Rideaux R, West RK, Wallis TSA, Bex PJ, Mattingley JB, Harrison WJ. Spatial structure, phase, and the contrast of natural images. J Vis 2022; 22:4. [PMID: 35006237 PMCID: PMC8762697 DOI: 10.1167/jov.22.1.4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/17/2021] [Accepted: 11/25/2021] [Indexed: 11/24/2022] Open
Abstract
The sensitivity of the human visual system is thought to be shaped by environmental statistics. A major endeavor in vision science, therefore, is to uncover the image statistics that predict perceptual and cognitive function. When searching for targets in natural images, for example, it has recently been proposed that target detection is inversely related to the spatial similarity of the target to its local background. We tested this hypothesis by measuring observers' sensitivity to targets that were blended with natural image backgrounds. Targets were designed to have a spatial structure that was either similar or dissimilar to the background. Contrary to masking from similarity, we found that observers were most sensitive to targets that were most similar to their backgrounds. We hypothesized that a coincidence of phase alignment between target and background results in a local contrast signal that facilitates detection when target-background similarity is high. We confirmed this prediction in a second experiment. Indeed, we show that, by solely manipulating the phase of a target relative to its background, the target can be rendered easily visible or undetectable. Our study thus reveals that, in addition to its structural similarity, the phase of the target relative to the background must be considered when predicting detection sensitivity in natural images.
Affiliation(s)
- Reuben Rideaux
- Queensland Brain Institute, University of Queensland, St. Lucia, Queensland, Australia
- Rebecca K West
- School of Psychology, University of Queensland, St. Lucia, Queensland, Australia
- Thomas S A Wallis
- Institut für Psychologie & Centre for Cognitive Science, Technische Universität Darmstadt, Darmstadt, Germany
- Peter J Bex
- Department of Psychology, Northeastern University, Boston, MA, USA
- Jason B Mattingley
- Queensland Brain Institute, University of Queensland, St. Lucia, Queensland, Australia
- School of Psychology, University of Queensland, St. Lucia, Queensland, Australia
- William J Harrison
- Queensland Brain Institute, University of Queensland, St. Lucia, Queensland, Australia
- School of Psychology, University of Queensland, St. Lucia, Queensland, Australia
9
O'Hare L, Hird E, Whybrow M. Steady-state visual evoked potential responses predict visual discomfort judgements. Eur J Neurosci 2021; 54:7575-7598. [PMID: 34661322 DOI: 10.1111/ejn.15492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2021] [Revised: 10/07/2021] [Accepted: 10/08/2021] [Indexed: 11/29/2022]
Abstract
It has been suggested that aesthetically pleasing stimuli are processed efficiently by the visual system, whereas uncomfortable stimuli are processed inefficiently. This study consists of a series of three experiments investigating this idea using a range of images of abstract artworks, photographs of natural scenes, and computer-generated stimuli previously shown to be uncomfortable. Subjective judgements and neural correlates were measured using electroencephalogram (EEG) (steady-state visual evoked potentials, SSVEPs). In addition, global image statistics (contrast, Fourier amplitude spectral slope and fractal dimension) were taken into account. When effects of physical image contrast were controlled, fractal dimension predicted discomfort judgements, suggesting the SSVEP response is more likely to be influenced by distribution of edges than the spectral slope. Importantly, when effects of physical contrast and fractal dimension were accounted for using linear mixed effects modelling, SSVEP responses predicted subjective judgements of images. Specifically, when stimuli were not matched for perceived contrast, there was a positive relationship between SSVEP responses and how pleasing a stimulus was judged to be, and conversely a negative relationship between discomfort and SSVEP response. This is significant as it shows that the neural responses in early visual areas contribute to the subjective (un)pleasantness of images, although the results of this study do not provide clear support for the theory of efficient coding as the cause of perceived pleasantness or discomfort of images, and so other explanations need to be considered.
Affiliation(s)
- Louise O'Hare
- School of Psychology, University of Lincoln, Lincoln, UK; Department of Psychology, Nottingham Trent University, Nottingham, UK
- Emily Hird
- School of Psychology, University of Lincoln, Lincoln, UK
10
Goetschalckx L, Andonian A, Wagemans J. Generative adversarial networks unlock new methods for cognitive science. Trends Cogn Sci 2021; 25:788-801. [PMID: 34364792 DOI: 10.1016/j.tics.2021.06.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/22/2021] [Revised: 06/22/2021] [Accepted: 06/22/2021] [Indexed: 11/16/2022]
Abstract
Generative adversarial networks (GANs) enable computers to learn complex data distributions and sample from these distributions. When applied to the visual domain, this allows artificial, yet photorealistic images to be synthesized. Their success at this very challenging task triggered an explosion of research within the field of artificial intelligence (AI), yielding various new GAN findings and applications. After explaining the core principles behind GANs and reviewing recent GAN innovations, we illustrate how they can be applied to tackle thorny theoretical and methodological problems in cognitive science. We focus on how GANs can reveal hidden structure in internal representations and how they offer a valuable new compromise in the trade-off between experimental control and ecological validity.
Affiliation(s)
- Lore Goetschalckx
- Department of Brain and Cognition, KU Leuven, 3000 Leuven, Belgium; Carney Institute for Brain Science, Department of Cognitive Linguistic & Psychological Sciences, Brown University, Providence, RI 02912, USA.
- Alex Andonian
- Computer Science and Artificial Intelligence Laboratory (CSAIL), MIT, Cambridge, MA 02139, USA
- Johan Wagemans
- Department of Brain and Cognition, KU Leuven, 3000 Leuven, Belgium
11
Zhu H, Yuille A, Kersten D. Three-dimensional pose discrimination in natural images of humans. Cogsci 2021; 43:223-229. [PMID: 35969705 PMCID: PMC9374112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Perceiving 3D structure in natural images is an immense computational challenge for the visual system. While many previous studies focused on the perception of rigid 3D objects, we applied a novel method to a common set of non-rigid objects: static images of the human body in the natural world. We investigated to what extent human ability to interpret 3D poses in natural images depends on the typicality of the underlying 3D pose and the informativeness of the viewpoint. Using a novel 2AFC pose matching task, we measured how well subjects were able to match a target natural pose image with one of two synthetic comparison body images from a different viewpoint: one was rendered with the same 3D pose parameters as the target, while the other was a distractor rendered with added noise on joint angles. We found that performance for typical poses was measurably better than for atypical poses; however, we found no significant difference between informative and less informative viewpoints. Further comparisons of 2D and 3D pose matching models on the same task showed that 3D body knowledge is particularly important when interpreting images of atypical poses. These results suggested that human ability to interpret 3D poses depends on pose typicality but not viewpoint informativeness, and that humans probably use prior knowledge of 3D pose structures.
Affiliation(s)
- Hongru Zhu
- Department of Cognitive Science, Johns Hopkins University
- Alan Yuille
- Department of Cognitive Science, Johns Hopkins University
- Daniel Kersten
- Department of Psychology, University of Minnesota Twin Cities
12
Bai Y, Chen S, Chen Y, Geisler WS, Seidemann E. Similar masking effects of natural backgrounds on detection performances in humans, macaques, and macaque-V1 population responses. J Neurophysiol 2021; 125:2125-2134. [PMID: 33909494 DOI: 10.1152/jn.00275.2020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open
Abstract
Visual systems evolve to process the stimuli that arise in the organism's natural environment, and hence, to fully understand the neural computations in the visual system, it is important to measure behavioral and neural responses to natural visual stimuli. Here, we measured psychometric and neurometric functions in the macaque monkey for detection of a windowed sine-wave target in uniform backgrounds and in natural backgrounds of various contrasts. The neurometric functions were obtained by near-optimal decoding of voltage-sensitive-dye-imaging (VSDI) responses at the retinotopic scale in primary visual cortex (V1). The results were compared with previous human psychophysical measurements made under the same conditions. We found that human and macaque behavioral thresholds followed the generalized Weber's law as a function of contrast, and that both the slopes and the intercepts of the threshold as a function of background contrast match each other up to a single scale factor. We also found that the neurometric thresholds followed the generalized Weber's law with slopes and intercepts matching the behavioral slopes and intercepts up to a single scale factor. We conclude that human and macaque ability to detect targets in natural backgrounds are affected in the same way by background contrast, that these effects are consistent with population decoding at the retinotopic scale by down-stream circuits, and that the macaque monkey is an appropriate animal model for gaining an understanding of the neural mechanisms in humans for detecting targets in natural backgrounds. Finally, we discuss limitations of the current study and potential next steps.
NEW & NOTEWORTHY: We measured macaque detection performance in natural images and compared their performance to the detection sensitivity of neurophysiological responses recorded in the primary visual cortex (V1), and to the performance of human subjects. We found that 1) human and macaque behavioral performances are in quantitative agreement and 2) are consistent with near-optimal decoding of V1 population responses.
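The "slopes and intercepts ... up to a single scale factor" comparison in this abstract can be illustrated with a minimal sketch. It assumes the generalized Weber's law takes the linear form threshold = slope × background contrast + intercept, which is what the abstract's wording suggests but is not spelled out there. The helper `fit_line` and all numbers are hypothetical and are not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical numbers for illustration only.
background_contrast = [0.0, 0.1, 0.2, 0.4]
human_threshold = [0.02 + 0.5 * c for c in background_contrast]
macaque_threshold = [1.5 * t for t in human_threshold]  # same law, scaled by 1.5

human_slope, human_intercept = fit_line(background_contrast, human_threshold)
macaque_slope, macaque_intercept = fit_line(background_contrast, macaque_threshold)

# If the two threshold functions differ only by a scale factor,
# the slope ratio and the intercept ratio are the same number.
slope_ratio = macaque_slope / human_slope
intercept_ratio = macaque_intercept / human_intercept
print(round(slope_ratio, 6), round(intercept_ratio, 6))
```

With real data the two ratios would only agree within measurement error, so a proper comparison would fit a shared scale factor and test the residuals rather than compare point estimates.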
Affiliation(s)
- Yoon Bai
- Center for Perceptual Systems, University of Texas, Austin, Texas; Department of Psychology, University of Texas, Austin, Texas
- Spencer Chen
- Center for Perceptual Systems, University of Texas, Austin, Texas
- Yuzhi Chen
- Center for Perceptual Systems, University of Texas, Austin, Texas
- Wilson S Geisler
- Center for Perceptual Systems, University of Texas, Austin, Texas; Department of Psychology, University of Texas, Austin, Texas
- Eyal Seidemann
- Center for Perceptual Systems, University of Texas, Austin, Texas; Department of Psychology, University of Texas, Austin, Texas; Department of Neuroscience, University of Texas, Austin, Texas
13
Tsekouras GE, Rigos A, Chatzistamatis S, Tsimikas J, Kotis K, Caridakis G, Anagnostopoulos CN. A Novel Approach to Image Recoloring for Color Vision Deficiency. Sensors (Basel) 2021; 21:2740. [PMID: 33924510 PMCID: PMC8069325 DOI: 10.3390/s21082740] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/15/2021] [Revised: 04/07/2021] [Accepted: 04/09/2021] [Indexed: 11/16/2022]
Abstract
In this paper, a novel method to modify color images for the protanopia and deuteranopia color vision deficiencies is proposed. The method admits certain criteria, such as preserving image naturalness and color contrast enhancement. Four modules are employed in the process. First, fuzzy clustering-based color segmentation extracts key colors (which are the cluster centers) of the input image. Second, the key colors are mapped onto the CIE 1931 chromaticity diagram. Then, using the concept of confusion line (i.e., loci of colors confused by the color-blind), a sophisticated mechanism translates (i.e., removes) key colors lying on the same confusion line to different confusion lines so that they can be discriminated by the color-blind. In the third module, the key colors are further adapted by optimizing a regularized objective function that combines the aforementioned criteria. Fourth, the recolored image is obtained by color transfer that involves the adapted key colors and the associated fuzzy clusters. Three related methods are compared with the proposed one, using two performance indices, and evaluated by several experiments over 195 natural images and six digitized art paintings. The main outcomes of the comparative analysis are as follows. (a) Quantitative evaluation based on nonparametric statistical analysis is conducted by comparing the proposed method to each one of the other three methods for protanopia and deuteranopia, and for each index. In most of the comparisons, the Bonferroni adjusted p-values are <0.015, favoring the superiority of the proposed method. (b) Qualitative evaluation verifies the aesthetic appearance of the recolored images.
Affiliation(s)
- George E. Tsekouras
- Department of Cultural Technology and Communications, University of the Aegean, 811 00 Mitilini, Greece
- Anastasios Rigos
- Department of Cultural Technology and Communications, University of the Aegean, 811 00 Mitilini, Greece
- Stamatis Chatzistamatis
- Department of Cultural Technology and Communications, University of the Aegean, 811 00 Mitilini, Greece
- John Tsimikas
- Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, 811 00 Mitilini, Greece
- Konstantinos Kotis
- Department of Cultural Technology and Communications, University of the Aegean, 811 00 Mitilini, Greece
- George Caridakis
- Department of Cultural Technology and Communications, University of the Aegean, 811 00 Mitilini, Greece
- Christos-Nikolaos Anagnostopoulos
- Department of Cultural Technology and Communications, University of the Aegean, 811 00 Mitilini, Greece
14
Brackbill N, Rhoades C, Kling A, Shah NP, Sher A, Litke AM, Chichilnisky EJ. Reconstruction of natural images from responses of primate retinal ganglion cells. eLife 2020; 9:e58516. [PMID: 33146609 PMCID: PMC7752138 DOI: 10.7554/elife.58516] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 05/02/2020] [Accepted: 11/02/2020] [Indexed: 11/23/2022]
Abstract
The visual message conveyed by a retinal ganglion cell (RGC) is often summarized by its spatial receptive field, but in principle also depends on the responses of other RGCs and natural image statistics. This possibility was explored by linear reconstruction of natural images from responses of the four numerically-dominant macaque RGC types. Reconstructions were highly consistent across retinas. The optimal reconstruction filter for each RGC - its visual message - reflected natural image statistics, and resembled the receptive field only when nearby, same-type cells were included. ON and OFF cells conveyed largely independent, complementary representations, and parasol and midget cells conveyed distinct features. Correlated activity and nonlinearities had statistically significant but minor effects on reconstruction. Simulated reconstructions, using linear-nonlinear cascade models of RGC light responses that incorporated measured spatial properties and nonlinearities, produced similar results. Spatiotemporal reconstructions exhibited similar spatial properties, suggesting that the results are relevant for natural vision.
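The linear reconstruction summarized in this abstract can be illustrated with a toy least-squares decoder. This is only a sketch on synthetic, noiseless data and not the authors' pipeline; the function names, array shapes, and random data are all assumptions made for illustration. Each row of the fitted matrix plays the role of one cell's reconstruction filter.

```python
import numpy as np


def fit_linear_decoder(responses, images):
    """Return least-squares filters W so that responses @ W approximates images.

    responses: (n_trials, n_cells) array of cell responses (hypothetical data).
    images:    (n_trials, n_pixels) array of flattened stimuli.
    Row i of W is the reconstruction filter assigned to cell i.
    """
    W, *_ = np.linalg.lstsq(responses, images, rcond=None)
    return W


def reconstruct(responses, W):
    """Linearly reconstruct images as a weighted sum of per-cell filters."""
    return responses @ W


# Synthetic toy data: images are an exact linear function of the responses.
rng = np.random.default_rng(0)
n_trials, n_cells, n_pixels = 200, 12, 16
true_filters = rng.normal(size=(n_cells, n_pixels))
responses = rng.normal(size=(n_trials, n_cells))
images = responses @ true_filters

W = fit_linear_decoder(responses, images)
recon = reconstruct(responses, W)
```

Because every filter is fitted jointly with all the others, the filter for one cell depends on which other cells are included, which is the dependence the abstract describes.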
Affiliation(s)
- Nora Brackbill
- Department of Physics, Stanford University, Stanford, United States
- Colleen Rhoades
- Department of Bioengineering, Stanford University, Stanford, United States
- Alexandra Kling
- Department of Neurosurgery, Stanford School of Medicine, Stanford, United States
- Department of Ophthalmology, Stanford University, Stanford, United States
- Hansen Experimental Physics Laboratory, Stanford University, Stanford, United States
- Nishal P Shah
- Department of Electrical Engineering, Stanford University, Stanford, United States
- Alexander Sher
- Santa Cruz Institute for Particle Physics, University of California, Santa Cruz, Santa Cruz, United States
- Alan M Litke
- Santa Cruz Institute for Particle Physics, University of California, Santa Cruz, Santa Cruz, United States
- EJ Chichilnisky
- Department of Neurosurgery, Stanford School of Medicine, Stanford, United States
- Department of Ophthalmology, Stanford University, Stanford, United States
- Hansen Experimental Physics Laboratory, Stanford University, Stanford, United States
15
Tesileanu T, Conte MM, Briguglio JJ, Hermundstad AM, Victor JD, Balasubramanian V. Efficient coding of natural scene statistics predicts discrimination thresholds for grayscale textures. eLife 2020; 9:e54347. [PMID: 32744505 PMCID: PMC7494356 DOI: 10.7554/elife.54347] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 12/12/2019] [Accepted: 07/31/2020] [Indexed: 11/13/2022]
Abstract
Previously, in Hermundstad et al., 2014, we showed that when sampling is limiting, the efficient coding principle leads to a 'variance is salience' hypothesis, and that this hypothesis accounts for visual sensitivity to binary image statistics. Here, using extensive new psychophysical data and image analysis, we show that this hypothesis accounts for visual sensitivity to a large set of grayscale image statistics at a striking level of detail, and also identify the limits of the prediction. We define a 66-dimensional space of local grayscale light-intensity correlations, and measure the relevance of each direction to natural scenes. The 'variance is salience' hypothesis predicts that two-point correlations are most salient, and predicts their relative salience. We tested these predictions in a texture-segregation task using un-natural, synthetic textures. As predicted, correlations beyond second order are not salient, and predicted thresholds for over 300 second-order correlations match psychophysical thresholds closely (median fractional error <0.13).
Affiliation(s)
- Mary M Conte
- Feil Family Brain and Mind Institute, Weill Cornell Medical College, New York, United States
- Jonathan D Victor
- Feil Family Brain and Mind Institute, Weill Cornell Medical College, New York, United States
16
Puckett AM, Schira MM, Isherwood ZJ, Victor JD, Roberts JA, Breakspear M. Manipulating the structure of natural scenes using wavelets to study the functional architecture of perceptual hierarchies in the brain. Neuroimage 2020; 221:117173. [PMID: 32682991 PMCID: PMC8239382 DOI: 10.1016/j.neuroimage.2020.117173] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/02/2019] [Revised: 05/11/2020] [Accepted: 07/14/2020] [Indexed: 01/08/2023]
Abstract
Functional neuroimaging experiments that employ naturalistic stimuli (natural scenes, films, spoken narratives) provide insights into cognitive function "in the wild". Natural stimuli typically possess crowded, spectrally dense, dynamic, and multimodal properties within a rich multiscale structure. However, when using natural stimuli, various challenges exist for creating parametric manipulations with tight experimental control. Here, we revisit the typical spectral composition and statistical dependences of natural scenes, which distinguish them from abstract stimuli. We then demonstrate how to selectively degrade subtle statistical dependences within specific spatial scales using the wavelet transform. Such manipulations leave basic features of the stimuli, such as luminance and contrast, intact. Using functional neuroimaging of human participants viewing degraded natural images, we demonstrate that cortical responses at different levels of the visual hierarchy are differentially sensitive to subtle statistical dependences in natural images. This demonstration supports the notion that perceptual systems in the brain are optimally tuned to the complex statistical properties of the natural world. The code to undertake these stimulus manipulations, and their natural extension to dynamic natural scenes (films), is freely available.
Affiliation(s)
- Alexander M Puckett
- School of Psychology, The University of Queensland, Brisbane QLD 4072, Australia; Queensland Brain Institute, The University of Queensland, Brisbane QLD 4072, Australia
- Mark M Schira
- School of Psychology, University of Wollongong, Wollongong NSW 2522, Australia
- Zoey J Isherwood
- School of Psychology, University of Nevada, Reno NV 89557, United States
- Jonathan D Victor
- Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medical College, New York NY 10065, United States
- James A Roberts
- Brain Modelling Group, QIMR Berghofer Medical Research Institute, Brisbane QLD 4006, Australia
- Michael Breakspear
- Brain and Mind PRC, University of Newcastle, Newcastle NSW 2308, Australia
17
Michalak H, Okarma K. Robust Combined Binarization Method of Non-Uniformly Illuminated Document Images for Alphanumerical Character Recognition. Sensors (Basel) 2020; 20:E2914. [PMID: 32455623 PMCID: PMC7287981 DOI: 10.3390/s20102914] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/30/2020] [Revised: 05/13/2020] [Accepted: 05/19/2020] [Indexed: 11/16/2022]
Abstract
Image binarization is one of the key operations decreasing the amount of information used in further analysis of image data, significantly influencing the final results. Although in some applications, where well illuminated images may be easily captured, ensuring a high contrast, even a simple global thresholding may be sufficient, there are some more challenging solutions, e.g., based on the analysis of natural images or assuming the presence of some quality degradations, such as in historical document images. Considering the variety of image binarization methods, as well as their different applications and types of images, one cannot expect a single universal thresholding method that would be the best solution for all images. Nevertheless, since one of the most common operations preceded by the binarization is the Optical Character Recognition (OCR), which may also be applied for non-uniformly illuminated images captured by camera sensors mounted in mobile phones, the development of even better binarization methods in view of the maximization of the OCR accuracy is still expected. Therefore, in this paper, the idea of the use of robust combined measures is presented, making it possible to bring together the advantages of various methods, including some recently proposed approaches based on entropy filtering and a multi-layered stack of regions. The experimental results, obtained for a dataset of 176 non-uniformly illuminated document images, referred to as the WEZUT OCR Dataset, confirm the validity and usefulness of the proposed approach, leading to a significant increase of the recognition accuracy.
Affiliation(s)
- Krzysztof Okarma
- Faculty of Electrical Engineering, West Pomeranian University of Technology in Szczecin, 70-313 Szczecin, Poland
18
Kanth ST, Ray S. Electrocorticogram (ECoG) Is Highly Informative in Primate Visual Cortex. J Neurosci 2020; 40:2430-44. [PMID: 32066581 DOI: 10.1523/JNEUROSCI.1368-19.2020] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/13/2019] [Revised: 02/08/2020] [Accepted: 02/10/2020] [Indexed: 12/21/2022]
Abstract
Neural signals recorded at different scales contain information about environment and behavior and have been used to control Brain Machine Interfaces with varying degrees of success. However, a direct comparison of their efficacy has not been possible due to different recording setups, tasks, species, etc. To address this, we implanted customized arrays having both microelectrodes and electrocorticogram (ECoG) electrodes in the primary visual cortex of 2 female macaque monkeys, and also recorded electroencephalogram (EEG), while they viewed a variety of naturalistic images and parametric gratings. Surprisingly, ECoG had higher information and decodability than all other signals. Combining a few ECoG electrodes allowed more accurate decoding than combining a much larger number of microelectrodes. Control analyses showed that higher decoding accuracy of ECoG compared with local field potential was not because of differences in low-level visual features captured by them but instead because of larger spatial summation of the ECoG. Information was high in the 30-80 Hz range and at lower frequencies. Information in different frequencies and scales was nonredundant. These results have strong implications for Brain Machine Interface applications and for study of population representation of visual stimuli. SIGNIFICANCE STATEMENT Electrophysiological signals captured across scales by different recording electrodes are regularly used for Brain Machine Interfaces, but the information content varies due to electrode size and location. A systematic comparison of their efficiency for Brain Machine Interfaces is important but technically challenging. Here, we recorded simultaneous signals across four scales: spikes, local field potential, electrocorticogram (ECoG), and EEG, and compared their information and decoding accuracy for a large variety of naturalistic stimuli. We found that ECoGs were highly informative and outperformed other signals in information content and decoding accuracy.
19
Rideaux R, Welchman AE. But Still It Moves: Static Image Statistics Underlie How We See Motion. J Neurosci 2020; 40:2538-52. [PMID: 32054676 DOI: 10.1523/JNEUROSCI.2760-19.2020] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/20/2019] [Revised: 12/18/2019] [Accepted: 01/09/2020] [Indexed: 12/03/2022]
Abstract
Seeing movement promotes survival. It results from an uncertain interplay between evolution and experience, making it hard to isolate the drivers of computational architectures found in brains. Here we seek insight into motion perception using a neural network (MotionNet) trained on moving images to classify velocity. The network recapitulates key properties of motion direction and speed processing in biological brains, and we use it to derive, and test, understanding of motion (mis)perception at the computational, neural, and perceptual levels. We show that diverse motion characteristics are largely explained by the statistical structure of natural images, rather than motion per se. First, we show how neural and perceptual biases for particular motion directions can result from the orientation structure of natural images. Second, we demonstrate an interrelation between speed and direction preferences in (macaque) MT neurons that can be explained by image autocorrelation. Third, we show that natural image statistics mean that speed and image contrast are related quantities. Finally, using behavioral tests (humans, both sexes), we show that it is knowledge of the speed-contrast association that accounts for motion illusions, rather than the distribution of movements in the environment (the “slow world” prior) as premised by Bayesian accounts. Together, this provides an exposition of motion speed and direction estimation, and produces concrete predictions for future neurophysiological experiments. More broadly, we demonstrate the conceptual value of marrying artificial systems with biological characterization, moving beyond “black box” reproduction of an architecture to advance understanding of complex systems, such as the brain. SIGNIFICANCE STATEMENT Using an artificial systems approach, we show that physiological properties of motion can result from natural image structure. In particular, we show that the anisotropic distribution of orientations in natural statistics is sufficient to explain the cardinal bias for motion direction. We show that inherent autocorrelation in natural images means that speed and direction are related quantities, which could shape the relationship between speed and direction tuning of MT neurons. Finally, we show that movement speed and image contrast are related in moving natural images, and that motion misperception can be explained by this speed-contrast association, not a “slow world” prior.
20
Abstract
Ambiguous images are widely recognized as a valuable tool for probing human perception. Perceptual biases that arise when people make judgements about ambiguous images reveal their expectations about the environment. While perceptual biases in early visual processing have been well established, their existence in higher-level vision has been explored only for faces, which may be processed differently from other objects. Here we developed a new, highly versatile method of creating ambiguous hybrid images comprising two component objects belonging to distinct categories. We used these hybrids to measure perceptual biases in object classification and found that images of man-made (manufactured) objects dominated those of naturally occurring (non-man-made) ones in hybrids. This dominance generalized to a broad range of object categories, persisted when the horizontal and vertical elements that dominate man-made objects were removed and increased with the real-world size of the manufactured object. Our findings show for the first time that people have perceptual biases to see man-made objects and suggest that extended exposure to manufactured environments in our urban-living participants has changed the way that they see the world.
Affiliation(s)
- Ahamed Miflah Hussain Ismail
- School of Psychology, University of Nottingham Malaysia, Semenyih 43500, Malaysia
- School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
- Joshua A. Solomon
- Centre for Applied Vision Research, City, University of London, London EC1V 0HB, UK
- Miles Hansard
- School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, UK
- Isabelle Mareschal
- School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
21
Zimmermann FGS, Yan X, Rossion B. An objective, sensitive and ecologically valid neural measure of rapid human individual face recognition. R Soc Open Sci 2019; 6:181904. [PMID: 31312474 PMCID: PMC6599768 DOI: 10.1098/rsos.181904] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 11/07/2018] [Accepted: 05/10/2019] [Indexed: 06/10/2023]
Abstract
Humans may be the only species able to rapidly and automatically recognize a familiar face identity in a crowd of unfamiliar faces, an important social skill. Here, by combining electroencephalography (EEG) and fast periodic visual stimulation (FPVS), we introduce an ecologically valid, objective and sensitive neural measure of this human individual face recognition function. Natural images of various unfamiliar faces are presented at a fast rate of 6 Hz, allowing one fixation per face, with variable natural images of a highly familiar face identity, a celebrity, appearing every seven images (0.86 Hz). Following a few minutes of stimulation, a high signal-to-noise ratio neural response reflecting the generalized discrimination of the familiar face identity from unfamiliar faces is observed over the occipito-temporal cortex at 0.86 Hz and harmonics. When face images are presented upside-down, the individual familiar face recognition response is negligible, being reduced by a factor of 5 over occipito-temporal regions. Differences in the magnitude of the individual face recognition response across different familiar face identities suggest that factors such as exposure, within-person variability and distinctiveness mediate this response. Our findings of a biological marker for fast and automatic recognition of individual familiar faces with ecological stimuli open an avenue for understanding this function, its development and neural basis in neurotypical individual brains along with its pathology. This should also have implications for the use of facial recognition measures in forensic science.
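The frequencies quoted in this abstract follow from simple arithmetic: a familiar face shown at every seventh position of a 6 Hz stream recurs at 6/7 Hz, which rounds to 0.86 Hz. The short sketch below works this out with exact fractions; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def oddball_frequency(base_hz, every_n):
    """Frequency at which an item shown every `every_n` images recurs."""
    return Fraction(base_hz) / every_n


def harmonics(fundamental, count):
    """First `count` integer multiples of the fundamental frequency."""
    return [fundamental * k for k in range(1, count + 1)]


# Values taken from the abstract: 6 Hz base rate, familiar face every 7 images.
f_identity = oddball_frequency(6, 7)
print(round(float(f_identity), 2))  # 0.86

# The seventh harmonic of 6/7 Hz coincides with the 6 Hz base rate.
print(harmonics(f_identity, 7)[-1])  # 6
```

Because the seventh harmonic lands exactly on the base stimulation rate, it cannot be attributed to the familiar-face response alone.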
Affiliation(s)
- Friederike G. S. Zimmermann
- Institute of Research in Psychological Science, Institute of Neuroscience, Université de Louvain, Louvain-la-Neuve, Belgium
- BG Klinikum Hamburg, Bergedorfer Straße 10, 21033 Hamburg, Germany
- Xiaoqian Yan
- Institute of Research in Psychological Science, Institute of Neuroscience, Université de Louvain, Louvain-la-Neuve, Belgium
- Bruno Rossion
- Institute of Research in Psychological Science, Institute of Neuroscience, Université de Louvain, Louvain-la-Neuve, Belgium
- Université de Lorraine, CNRS, CRAN, 54000 Nancy, France
- CHRU-Nancy, Service de Neurologie, 54000 Nancy, France
22
Abstract
Human observers readily detect targets in stimuli presented briefly and in rapid succession. Here, we show that even without predefined targets, humans can spot repetitions in streams of thousands of images. We presented sequences of natural images reoccurring a number of times interleaved with either one or two distractors, and we asked participants to detect the repetitions and to identify the repeated images after a delay that could last for minutes. Performance improved with the number of repeated-image presentations up to a ceiling around seven repetitions and was above chance even after only two to three presentations. The task was easiest for slow streams; performance dropped with increasing image-presentation rate but stabilized above 15 Hz and remained well above chance even at 120 Hz. To summarize, we reveal that the human brain has an impressive capacity to detect repetitions in rapid-serial-visual-presentation streams and to remember repeated images over a time course of minutes.
Affiliation(s)
- Evelina Thunell
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique (CNRS), Université Toulouse III-Paul Sabatier
- Simon J Thorpe
- Centre de Recherche Cerveau et Cognition, Centre National de la Recherche Scientifique (CNRS), Université Toulouse III-Paul Sabatier
23
Astudillo C, Muñoz K, Maldonado PE. Emotional Content Modulates Attentional Visual Orientation During Free Viewing of Natural Images. Front Hum Neurosci 2018; 12:459. [PMID: 30498438 PMCID: PMC6249414 DOI: 10.3389/fnhum.2018.00459] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/31/2018] [Accepted: 10/29/2018] [Indexed: 11/13/2022]
Abstract
Visual attention is the process that enables us to select relevant visual stimuli in our environment to achieve a goal or perform adaptive behaviors. In this process, bottom-up mechanisms interact with top-down mechanisms underlying the automatic and voluntary orienting of attention. Cognitive functions, such as emotional processing, can influence visual attention by increasing or decreasing the resources destined for processing stimuli. The relationship between attention and emotion has been explored mainly in the field of automatic attentional capture; in particular, emotional stimuli are presented suddenly and detection rates or reaction times are recorded. Unlike these paradigms, natural visual scenes may comprise multiple stimuli with different emotional valences. In this setting, the mechanisms supporting voluntary visual orientation, under the influence of the emotional components of stimuli, are unknown. We employed a mosaic of pictures with different emotional valences (positive, negative, and neutral) and explored the dynamics of attentional visual orientation, assessed by eye tracking and measurements of pupil diameter. We found that pictures with affective content display increased dwelling times when compared to neutral pictures, with a larger effect for negative pictures. The valence, regardless of the arousal levels, was the main factor driving the behavioral modulation of visual orientation. On the other hand, the visual exploration was accompanied by a systematic pupillary response, with the pupil contraction and dilation influenced by the arousal levels, with minor effects driven by the valence. Our results emphasize that arousal and valence should be considered different dimensions of emotional processing, both interacting with cognitive processes such as visual attention.
Affiliation(s)
- Carolina Astudillo
- Biomedical Neuroscience Institute, Universidad de Chile, Santiago, Chile
- Kristofher Muñoz
- Biomedical Neuroscience Institute, Universidad de Chile, Santiago, Chile
- Pedro E Maldonado
- Biomedical Neuroscience Institute, Universidad de Chile, Santiago, Chile; Department of Neuroscience, Faculty of Medicine, Universidad de Chile, Santiago, Chile
24
|
Yao Y, Hu W, Zhang W, Wu T, Shi YQ. Distinguishing Computer-Generated Graphics from Natural Images Based on Sensor Pattern Noise and Deep Learning. Sensors (Basel) 2018; 18:s18041296. [PMID: 29690629 PMCID: PMC5948567 DOI: 10.3390/s18041296] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/07/2018] [Revised: 04/19/2018] [Accepted: 04/20/2018] [Indexed: 11/24/2022]
Abstract
Computer-generated graphics (CGs) are images generated by computer software. The rapid development of computer graphics technologies has made it easier to generate photorealistic computer graphics, and these graphics are quite difficult to distinguish from natural images (NIs) with the naked eye. In this paper, we propose a method based on sensor pattern noise (SPN) and deep learning to distinguish CGs from NIs. Before being fed into our convolutional neural network (CNN)-based model, these images—CGs and NIs—are clipped into image patches. Furthermore, three high-pass filters (HPFs) are used to remove low-frequency signals, which represent the image content. These filters are also used to reveal the residual signal as well as SPN introduced by the digital camera device. Different from the traditional methods of distinguishing CGs from NIs, the proposed method utilizes a five-layer CNN to classify the input image patches. Based on the classification results of the image patches, we deploy a majority vote scheme to obtain the classification results for the full-size images. The experiments have demonstrated that (1) the proposed method with three HPFs can achieve better results than that with only one HPF or no HPF and that (2) the proposed method with three HPFs achieves 100% accuracy, although the NIs undergo a JPEG compression with a quality factor of 75.
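The patch-then-vote scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the CNN and the high-pass filters are omitted, the per-patch labels are supplied by hand, and the helper names, patch size, and tie-breaking behavior are assumptions.

```python
from collections import Counter

import numpy as np


def clip_patches(image, size):
    """Cut an image into non-overlapping size x size patches.

    Border pixels that do not fill a whole patch are discarded (an assumption).
    """
    height, width = image.shape[:2]
    return [
        image[row:row + size, col:col + size]
        for row in range(0, height - size + 1, size)
        for col in range(0, width - size + 1, size)
    ]


def majority_vote(patch_labels):
    """Return the most frequent patch label as the full-image label.

    Ties go to the label that appears first in the list; the abstract does
    not say how ties are handled, so this choice is an assumption.
    """
    return Counter(patch_labels).most_common(1)[0][0]


# Toy usage: an 8 x 8 image yields four 4 x 4 patches.
image = np.zeros((8, 8))
patches = clip_patches(image, 4)

# Hypothetical per-patch outputs standing in for a patch classifier.
patch_labels = ["NI", "CG", "NI", "NI"]
image_label = majority_vote(patch_labels)
```

Voting over patches lets a few misclassified patches be outvoted by the rest when the full-size image is labeled.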
Affiliation(s)
- Ye Yao
- School of CyberSpace, Hangzhou Dianzi University, Hangzhou 310018, China.
- Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen 518060, China.
- Weitong Hu
- School of CyberSpace, Hangzhou Dianzi University, Hangzhou 310018, China.
- Wei Zhang
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China.
- Ting Wu
- School of CyberSpace, Hangzhou Dianzi University, Hangzhou 310018, China.
- Yun-Qing Shi
- Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102, USA.
25
Zuiderbaan W, van Leeuwen J, Dumoulin SO. Change Blindness Is Influenced by Both Contrast Energy and Subjective Importance within Local Regions of the Image. Front Psychol 2017; 8:1718. [PMID: 29046655 PMCID: PMC5632668 DOI: 10.3389/fpsyg.2017.01718] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/23/2017] [Accepted: 09/19/2017] [Indexed: 11/13/2022]
Abstract
Our visual system receives an enormous amount of information, but not all information is retained. This is exemplified by the fact that subjects fail to detect large changes in a visual scene, i.e., change-blindness. Current theories propose that our ability to detect these changes is influenced by the gist or interpretation of an image. On the other hand, stimulus-driven image features such as contrast energy dominate the representation in early visual cortex (De Valois and De Valois, 1988; Boynton et al., 1999; Olman et al., 2004; Mante and Carandini, 2005; Dumoulin et al., 2008). Here we investigated whether contrast energy contributes to our ability to detect changes within a visual scene. We compared the ability to detect changes in contrast energy together with changes to a measure of the interpretation of an image. We used subjective important aspects of the image as a measure of the interpretation of an image. We measured reaction times while manipulating the contrast energy and subjective important properties using the change blindness paradigm. Our results suggest that our ability to detect changes in a visual scene is not only influenced by the subjective importance, but also by contrast energy. Also, we find that contrast energy and subjective importance interact. We speculate that contrast energy and subjective important properties are not independently represented in the visual system. Thus, our results suggest that the information that is retained of a visual scene is both influenced by stimulus-driven information as well as the interpretation of a scene.
Affiliation(s)
- Wietske Zuiderbaan
- Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, Netherlands
- Jonathan van Leeuwen
- Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, Netherlands; Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Serge O Dumoulin
- Department of Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, Netherlands; Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands; Spinoza Centre for Neuroimaging, Amsterdam, Netherlands
26
Orbán G, Berkes P, Fiser J, Lengyel M. Neural Variability and Sampling-Based Probabilistic Representations in the Visual Cortex. Neuron 2016; 92:530-43. [PMID: 27764674 DOI: 10.1016/j.neuron.2016.09.038] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Received: 03/22/2016] [Revised: 07/27/2016] [Accepted: 09/06/2016] [Indexed: 11/21/2022]
Abstract
Neural responses in the visual cortex are variable, and there is now an abundance of data characterizing how the magnitude and structure of this variability depends on the stimulus. Current theories of cortical computation fail to account for these data; they either ignore variability altogether or only model its unstructured Poisson-like aspects. We develop a theory in which the cortex performs probabilistic inference such that population activity patterns represent statistical samples from the inferred probability distribution. Our main prediction is that perceptual uncertainty is directly encoded by the variability, rather than the average, of cortical responses. Through direct comparisons to previously published data as well as original data analyses, we show that a sampling-based probabilistic representation accounts for the structure of noise, signal, and spontaneous response variability and correlations in the primary visual cortex. These results suggest a novel role for neural variability in cortical dynamics and computations.
|
27
|
Abstract
The human ventral visual pathway is implicated in higher order form processing, but the organizational principles within this region are not yet well understood. Recently, Lafer-Sousa, Conway, and Kanwisher (J Neurosci 36: 1682-1697, 2016) used functional magnetic resonance imaging to demonstrate that functional responses in the human ventral visual pathway share a broad homology with those in macaque inferior temporal cortex, providing new evidence supporting the validity of the macaque as a model of the human visual system in this region. In addition, these results give new clues for understanding the organizational principles within the ventral visual pathway and the processing of higher order color and form, suggesting new avenues for research into this cortical region.
Affiliation(s)
- Erin Goddard
- School of Psychology, The University of Sydney, Sydney, New South Wales, Australia; and Australian Research Council Centre of Excellence in Cognition and its Disorders, Macquarie University, Sydney, New South Wales, Australia
|
28
|
Ghodrati M, Ghodousi M, Yoonessi A. Low-Level Contrast Statistics of Natural Images Can Modulate the Frequency of Event-Related Potentials (ERP) in Humans. Front Hum Neurosci 2016; 10:630. [PMID: 28018197 PMCID: PMC5145888 DOI: 10.3389/fnhum.2016.00630] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/25/2016] [Accepted: 11/25/2016] [Indexed: 11/20/2022]
Abstract
Humans are fast and accurate in categorizing complex natural images. It is, however, unclear what features of visual information are exploited by the brain to perceive the images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of amplitude of event-related potentials (ERP) in response to rapidly presented images. In this study, we investigated the effect of these statistics on the frequency content of ERPs. We recorded ERPs from human subjects while they viewed natural images, each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of ERPs the best, compared to other image statistics that we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and ERPs' power within the theta frequency band (~3–7 Hz). This is interesting, as the theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset, and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated by low-level contrast statistics of natural images, and highlight their potential role in scene perception.
Affiliation(s)
- Masoud Ghodrati
- Department of Physiology, Monash University, Clayton, VIC, Australia; Neuroscience Program, Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Mahrad Ghodousi
- Department of Neuroscience, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | - Ali Yoonessi
- Department of Neuroscience, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran, Iran
|
29
|
Hussain Ismail AM, Solomon JA, Hansard M, Mareschal I. A tilt after-effect for images of buildings: evidence of selectivity for the orientation of everyday scenes. R Soc Open Sci 2016; 3:160551. [PMID: 28018643 PMCID: PMC5180141 DOI: 10.1098/rsos.160551] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/27/2016] [Accepted: 10/25/2016] [Indexed: 06/06/2023]
Abstract
The tilt after-effect (TAE) is thought to be a manifestation of gain control in mechanisms selective for spatial orientation in visual stimuli. It has been demonstrated with luminance-defined stripes, contrast-defined stripes, orientation-defined stripes and even with natural images. Of course, all images can be decomposed into a sum of stripes, so it should not be surprising to find a TAE when adapting and test images contain stripes that differ by 15° or so. We show this latter condition is not necessary for the TAE with natural images: adaptation to slightly tilted and vertically filtered houses produced a 'repulsive' bias in the perceived orientation of horizontally filtered houses. These results suggest gain control in mechanisms selective for spatial orientation in natural images.
Affiliation(s)
- Ahamed Miflah Hussain Ismail
- Department of Experimental Psychology, School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
| | - Joshua A. Solomon
- Centre for Applied Vision Research, City, University of London, London, UK
| | - Miles Hansard
- School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
| | - Isabelle Mareschal
- Department of Experimental Psychology, School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
|
30
|
Huth AG, Lee T, Nishimoto S, Bilenko NY, Vu AT, Gallant JL. Decoding the Semantic Content of Natural Movies from Human Brain Activity. Front Syst Neurosci 2016; 10:81. [PMID: 27781035 PMCID: PMC5057448 DOI: 10.3389/fnsys.2016.00081] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 12/07/2015] [Accepted: 09/21/2016] [Indexed: 11/13/2022]
Abstract
One crucial test for any quantitative model of the brain is to show that the model can be used to accurately decode information from evoked brain activity. Several recent neuroimaging studies have decoded the structure or semantic content of static visual images from human brain activity. Here we present a decoding algorithm that makes it possible to decode detailed information about the object and action categories present in natural movies from human brain activity signals measured by functional MRI. Decoding is accomplished using a hierarchical logistic regression (HLR) model that is based on labels that were manually assigned from the WordNet semantic taxonomy. This model makes it possible to simultaneously decode information about both specific and general categories, while respecting the relationships between them. Our results show that we can decode the presence of many object and action categories from averaged blood-oxygen level-dependent (BOLD) responses with a high degree of accuracy (area under the ROC curve > 0.9). Furthermore, we used this framework to test whether semantic relationships defined in the WordNet taxonomy are represented the same way in the human brain. This analysis showed that hierarchical relationships between general categories and atypical examples, such as organism and plant, did not seem to be reflected in representations measured by BOLD fMRI.
Affiliation(s)
- Alexander G Huth
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
| | - Tyler Lee
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
| | - Shinji Nishimoto
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
| | - Natalia Y Bilenko
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
| | - An T Vu
- Bioengineering Graduate Group, University of California Berkeley, Berkeley, CA, USA
| | - Jack L Gallant
- Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA; Bioengineering Graduate Group, University of California Berkeley, Berkeley, CA, USA; Department of Psychology, University of California Berkeley, Berkeley, CA, USA
|
31
|
Talebi V, Baker CL. Categorically distinct types of receptive fields in early visual cortex. J Neurophysiol 2016; 115:2556-76. [PMID: 26936978 DOI: 10.1152/jn.00659.2015] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 07/02/2015] [Accepted: 02/29/2016] [Indexed: 12/11/2022]
Abstract
In the visual cortex, distinct types of neurons have been identified based on cellular morphology, response to injected current, or expression of specific markers, but neurophysiological studies have revealed visual receptive field (RF) properties that appear to be on a continuum, with only two generally recognized classes: simple and complex. Most previous studies have characterized visual responses of neurons using stereotyped stimuli such as bars, gratings, or white noise and simple system identification approaches (e.g., reverse correlation). Here we estimate visual RF models of cortical neurons using visually rich natural image stimuli and regularized regression system identification methods and characterize their spatial tuning, temporal dynamics, spatiotemporal behavior, and spiking properties. We quantitatively demonstrate the existence of three functionally distinct categories of simple cells, distinguished by their degree of orientation selectivity (isotropic or oriented) and the nature of their output nonlinearity (expansive or compressive). In addition, these three types have differing average values of several other properties. Cells with nonoriented RFs tend to have smaller RFs, shorter response durations, no direction selectivity, and high reliability. Orientation-selective neurons with an expansive output nonlinearity have Gabor-like RFs, lower spontaneous activity and responsivity, and spiking responses with higher sparseness. Oriented RFs with a compressive nonlinearity are spatially nondescript and tend to show longer response latency. Our findings indicate multiple physiologically defined types of RFs beyond the simple/complex dichotomy, suggesting that cortical neurons may have more specialized functional roles rather than lying on a multidimensional continuum.
Affiliation(s)
- Vargha Talebi
- McGill Vision Research, Department of Ophthalmology, McGill University, Montreal, Quebec, Canada
| | - Curtis L Baker
- McGill Vision Research, Department of Ophthalmology, McGill University, Montreal, Quebec, Canada
|
32
|
Liu L, She L, Chen M, Liu T, Lu HD, Dan Y, Poo MM. Spatial structure of neuronal receptive field in awake monkey secondary visual cortex (V2). Proc Natl Acad Sci U S A 2016; 113:1913-8. [PMID: 26839410 DOI: 10.1073/pnas.1525505113] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022]
Abstract
Visual processing depends critically on the receptive field (RF) properties of visual neurons. However, comprehensive characterization of RFs beyond the primary visual cortex (V1) remains a challenge. Here we report fine RF structures in secondary visual cortex (V2) of awake macaque monkeys, identified through a projection pursuit regression analysis of neuronal responses to natural images. We found that V2 RFs could be broadly classified as V1-like (typical Gabor-shaped subunits), ultralong (subunits with high aspect ratios), or complex-shaped (subunits with multiple oriented components). Furthermore, single-unit recordings from functional domains identified by intrinsic optical imaging showed that neurons with ultralong RFs were primarily localized within pale stripes, whereas neurons with complex-shaped RFs were more concentrated in thin stripes. Thus, by combining single-unit recording with optical imaging and a computational approach, we identified RF subunits underlying spatial feature selectivity of V2 neurons and demonstrated the functional organization of these RF properties.
|
33
|
Abstract
Human performance at categorizing natural visual images surpasses automatic algorithms, but how and when this function arises and develops remain unanswered. We recorded scalp electrical brain activity in 4- to 6-month-old infants viewing images of objects in their natural background at a rapid rate of 6 images/second (6 Hz). Widely variable face images appearing every 5 stimuli generate an electrophysiological response over the right hemisphere exactly at 1.2 Hz (6 Hz/5). This face-selective response is absent for phase-scrambled images and therefore not due to low-level information. These findings indicate that right lateralized face-selective processes emerge well before reading acquisition in the infant brain, which can perform figure-ground segregation and generalize face-selective responses across changes in size, viewpoint, illumination as well as expression, age and gender. These observations made with a highly sensitive and objective approach open an avenue for clarifying the developmental course of natural image categorization in the human brain. DOI: http://dx.doi.org/10.7554/eLife.06564.001
Putting names to faces can sometimes be challenging, but humans are generally extremely good at recognising faces. Computers, on the other hand, often find it difficult to categorize a face as a face. Indeed, a major challenge in face recognition arises because faces come in many different shapes and sizes. Moreover, both the lighting conditions and the orientation of the head can change, which makes the challenge even more difficult. Young infants also show a preference for pictures of human faces over nonsense images, which suggests that the ability to recognise faces is at least partly hard-wired. Neuroimaging studies have revealed that face recognition depends on activity in specific regions of the right hemisphere of the brain, and adults who sustain damage to these regions lose their face recognition skills.
De Heering and Rossion have now provided the first evidence that the right hemisphere is specialized for distinguishing between natural images of faces and ‘non-face objects’ in infants as young as 4 to 6 months. By using scalp electrodes to record electrical activity in the brain as the infants viewed images on a screen, De Heering and Rossion showed that photographs of human faces triggered a distinct pattern of electrical activity in the right hemisphere: this pattern was clearly different from the patterns triggered by photographs of animals or objects. A consistent response was triggered by faces of different genders and expressions, and by faces presented from various viewpoints and under different lighting conditions. In a control experiment, De Heering and Rossion demonstrated that low-level visual features such as differences in luminance or contrast do not contribute to this selective response to faces. These results argue against the idea that face perception only becomes assigned to the right hemisphere of the brain when children learn to read (that is, when language processing begins to occupy parts of the left hemisphere). By generating significant responses in a short period of time (just five minutes or less), the protocol developed by De Heering and Rossion has the potential to prove very useful to researchers investigating developmental changes to the perception of visual images during childhood. DOI: http://dx.doi.org/10.7554/eLife.06564.002
Affiliation(s)
- Adélaïde de Heering
- Psychological Sciences Research Institute, University of Louvain, Louvain-la-Neuve, Belgium
| | - Bruno Rossion
- Psychological Sciences Research Institute, University of Louvain, Louvain-la-Neuve, Belgium
|
34
|
Hibbard PB, O'Hare L. Uncomfortable images produce non-sparse responses in a model of primary visual cortex. R Soc Open Sci 2015; 2:140535. [PMID: 26064607 PMCID: PMC4448811 DOI: 10.1098/rsos.140535] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 12/19/2014] [Accepted: 01/29/2015] [Indexed: 05/26/2023]
Abstract
The processing of visual information by the nervous system requires significant metabolic resources. To minimize the energy needed, our visual system appears to be optimized to encode typical natural images as efficiently as possible. One consequence of this is that some atypical images will produce inefficient, non-optimal responses. Here, we show that images that are reported to be uncomfortable to view, and that can trigger migraine attacks and epileptic seizures, produce relatively non-sparse responses in a model of the primary visual cortex. In comparison with the responses to typical inputs, responses to aversive images were larger and less sparse. We propose that this difference in the neural population response may be one cause of visual discomfort in the general population, and can produce more extreme responses in clinical populations such as migraine and epilepsy sufferers.
Affiliation(s)
- Paul B. Hibbard
- Department of Psychology, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
- School of Psychology and Neuroscience, University of St Andrews, St Andrews, Fife KY16 9JP, UK
| | - Louise O'Hare
- School of Psychology, University of Lincoln, Lincoln, UK
|
35
|
Rossion B, Torfs K, Jacques C, Liu-Shuang J. Fast periodic presentation of natural images reveals a robust face-selective electrophysiological response in the human brain. J Vis 2015; 15:15.1.18. [PMID: 25597037 DOI: 10.1167/15.1.18] [Citation(s) in RCA: 99] [Impact Index Per Article: 11.0] [Indexed: 11/24/2022]
Abstract
We designed a fast periodic visual stimulation approach to identify an objective signature of face categorization incorporating both visual discrimination (from nonface objects) and generalization (across widely variable face exemplars). Scalp electroencephalographic (EEG) data were recorded in 12 human observers viewing natural images of objects at a rapid frequency of 5.88 images/s for 60 s. Natural images of faces were interleaved every five stimuli, i.e., at 1.18 Hz (5.88/5). Face categorization was indexed by a high signal-to-noise ratio response, specifically at an oddball face stimulation frequency of 1.18 Hz and its harmonics. This face-selective periodic EEG response was highly significant for every participant, even for a single 60-s sequence, and was generally localized over the right occipitotemporal cortex. The periodicity constraint and the large selection of stimuli ensured that this selective response to natural face images was free of low-level visual confounds, as confirmed by the absence of any oddball response for phase-scrambled stimuli. Without any subtraction procedure, time-domain analysis revealed a sequence of differential face-selective EEG components between 120 and 400 ms after oddball face image onset, progressing from medial occipital (P1-faces) to occipitotemporal (N1-faces) and anterior temporal (P2-faces) regions. Overall, this fast periodic visual stimulation approach provides a direct signature of natural face categorization and opens an avenue for efficiently measuring categorization responses of complex visual stimuli in the human brain.
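The frequency-tagging arithmetic described in this abstract can be illustrated with a minimal Python sketch. It assumes NumPy is available; the function names are hypothetical, and the signal-to-noise definition used here (amplitude at the target frequency bin divided by the mean amplitude of neighbouring bins) is an illustrative assumption, not a procedure stated in the abstract.

```python
import numpy as np


def oddball_frequency(base_hz, every_n):
    """Frequency at which the oddball category recurs (e.g. 5.88 Hz / 5)."""
    return base_hz / every_n


def oddball_harmonics(base_hz, every_n, max_hz):
    """Oddball harmonics up to max_hz, skipping those that coincide with
    harmonics of the base stimulation rate (assumed convention)."""
    f = oddball_frequency(base_hz, every_n)
    harmonics = []
    k = 1
    while k * f <= max_hz + 1e-9:
        if k % every_n != 0:  # k*f equals a base-rate harmonic when k is a multiple of every_n
            harmonics.append(k * f)
        k += 1
    return harmonics


def snr_at(freqs, amplitude, target_hz, n_neighbors=10, skip=1):
    """Amplitude at the bin nearest target_hz divided by the mean amplitude
    of n_neighbors bins on each side, leaving `skip` bins adjacent to the
    target out of the noise estimate. Hypothetical helper for illustration."""
    idx = int(np.argmin(np.abs(freqs - target_hz)))
    left = amplitude[max(idx - skip - n_neighbors, 0): max(idx - skip, 0)]
    right = amplitude[idx + skip + 1: idx + skip + 1 + n_neighbors]
    noise = np.concatenate([left, right])
    return amplitude[idx] / noise.mean()


if __name__ == "__main__":
    # Values taken from the abstract: 5.88 images/s, a face every 5 stimuli.
    print(round(oddball_frequency(5.88, 5), 2))  # 1.18
```

With a 60-s sequence the spectral resolution is 1/60 Hz, which is why long sequences help isolate a narrow-band response at the oddball frequency from broadband noise.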
Affiliation(s)
- Bruno Rossion
- Psychological Sciences Research Institute, Institute of Neuroscience, University of Louvain, Louvain-la-Neuve, Belgium
| | - Katrien Torfs
- Psychological Sciences Research Institute, Institute of Neuroscience, University of Louvain, Louvain-la-Neuve, Belgium
| | - Corentin Jacques
- Psychological Sciences Research Institute, Institute of Neuroscience, University of Louvain, Louvain-la-Neuve, Belgium
| | - Joan Liu-Shuang
- Psychological Sciences Research Institute, Institute of Neuroscience, University of Louvain, Louvain-la-Neuve, Belgium
|
36
|
Abstract
A practical model is proposed for predicting the detectability of targets at arbitrary locations in the visual field, in arbitrary gray scale backgrounds, and under photopic viewing conditions. The major factors incorporated into the model include (a) the optical point spread function of the eye, (b) local luminance gain control (Weber's law), (c) the sampling array of retinal ganglion cells, (d) orientation and spatial frequency-dependent contrast masking, (e) broadband contrast masking, and (f) efficient response pooling. The model is tested against previously reported threshold measurements on uniform backgrounds (the ModelFest data set and data from Foley, Varadharajan, Koh, & Farias, 2007) and against new measurements reported here for several ModelFest targets presented on uniform, 1/f noise, and natural backgrounds at retinal eccentricities ranging from 0° to 10°. Although the model has few free parameters, it is able to account quite well for all the threshold measurements.
|
37
|
Abstract
Studies of visual masking have provided a wide range of important insights into the processes involved in visual coding. However, very few of these studies have employed natural scenes as masks. Little is known about how the particular features found in natural scenes affect visual detection thresholds and how the results obtained using unnatural masks relate to the results obtained using natural masks. To address this issue, this paper describes a psychophysical study designed to obtain local contrast detection thresholds for a database of natural images. Via a three-alternative forced-choice experiment, we measured thresholds for detecting 3.7 cycles/° vertically oriented log-Gabor noise targets placed within an 85 × 85-pixel patch (1.9° patch) drawn from 30 natural images from the CSIQ image database (Larson & Chandler, Journal of Electronic Imaging, 2010). Thus, for each image, we obtained a masking map in which each entry in the map denotes the root mean squared contrast threshold for detecting the log-Gabor noise target at the corresponding spatial location in the image. From qualitative observations we found that detection thresholds were affected by several patch properties such as visual complexity, fineness of textures, sharpness, and overall luminance. Our quantitative analysis shows that except for the sharpness measure (correlation coefficient of 0.7), the other tested low-level mask features showed a weak correlation (correlation coefficients less than or equal to 0.52) with the detection thresholds. Furthermore, we evaluated the performance of a computational contrast gain control model that performed fairly well with an average correlation coefficient of 0.79 in predicting the local contrast detection thresholds. We also describe specific choices of parameters for the gain control model. The objective of this database is to provide researchers with a large ground-truth dataset in order to further investigate the properties of the human visual system using natural masks.
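As a rough illustration of the per-patch contrast measure mentioned in this abstract, the sketch below computes root-mean-squared (RMS) contrast over 85 × 85-pixel patches. It assumes NumPy, assumes RMS contrast is defined as the standard deviation of luminance divided by mean luminance (a common definition, not confirmed by the abstract), and tiles the image into non-overlapping patches for simplicity; the function names are hypothetical. It computes the contrast of the patch itself, not the detection thresholds the study measured.

```python
import numpy as np


def rms_contrast(patch):
    """RMS contrast of a luminance patch: std / mean (assumed definition)."""
    patch = np.asarray(patch, dtype=float)
    mean = patch.mean()
    if mean == 0:
        return 0.0  # avoid division by zero for an all-black patch
    return float(patch.std() / mean)


def patch_contrast_map(image, patch=85):
    """Tile a 2-D luminance image into non-overlapping patch x patch blocks
    and return the RMS contrast of each block. Edge pixels that do not fill
    a whole block are ignored (a simplifying assumption)."""
    image = np.asarray(image, dtype=float)
    rows = image.shape[0] // patch
    cols = image.shape[1] // patch
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            block = image[r * patch:(r + 1) * patch,
                          c * patch:(c + 1) * patch]
            out[r, c] = rms_contrast(block)
    return out
```

A map like this could be correlated against a measured threshold map to check how well a single low-level feature tracks masking, which is the kind of comparison the abstract reports for its tested features.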
Affiliation(s)
- Md Mushfiqul Alam
- School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK, USA
| | | | - David J Field
- Department of Psychology, Cornell University, Ithaca, NY, USA
| | - Damon M Chandler
- School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK, USA
|
38
|
Abstract
We have developed a low-cost, practical gaze-contingent display in which natural images are presented to the observer with dioptric blur and stereoscopic disparity that are dependent on the three-dimensional structure of natural scenes. Our system simulates a distribution of retinal blur and depth similar to that experienced in real-world viewing conditions by emmetropic observers. We implemented the system using light-field photographs taken with a plenoptic camera which supports digital refocusing anywhere in the images. We coupled this capability with an eye-tracking system and stereoscopic rendering. With this display, we examine how the time course of binocular fusion depends on depth cues from blur and stereoscopic disparity in naturalistic images. Our results show that disparity and peripheral blur interact to modify eye-movement behavior and facilitate binocular fusion, and the greatest benefit was gained by observers who struggled most to achieve fusion. Even though plenoptic images do not replicate an individual’s aberrations, the results demonstrate that a naturalistic distribution of depth-dependent blur may improve 3-D virtual reality, and that interruptions of this pattern (e.g., with intraocular lenses) which flatten the distribution of retinal blur may adversely affect binocular fusion.
Affiliation(s)
- Guido Maiello
- Department of Ophthalmology, Harvard Medical School, Boston, MA, USA; Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genoa, Genoa, Italy; UCL Institute of Ophthalmology, University College London, London, UK
| | - Manuela Chessa
- Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genoa, Genoa, Italy
| | - Fabio Solari
- Department of Informatics, Bioengineering, Robotics and System Engineering, University of Genoa, Genoa, Italy
| | - Peter J Bex
- Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
|
39
|
McCamy MB, Otero-Millan J, Di Stasi LL, Macknik SL, Martinez-Conde S. Highly informative natural scene regions increase microsaccade production during visual scanning. J Neurosci 2014; 34:2956-66. [PMID: 24553936 DOI: 10.1523/JNEUROSCI.4448-13.2014] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Indexed: 11/21/2022]
Abstract
Classical image statistics, such as contrast, entropy, and the correlation between central and nearby pixel intensities, are thought to guide ocular fixation targeting. However, these statistics are not necessarily task relevant and therefore do not provide a complete picture of the relationship between informativeness and ocular targeting. Moreover, it is not known whether either informativeness or classical image statistics affect microsaccade production; thus, the role of microsaccades in information acquisition is also unknown. The objective quantification of the informativeness of a scene region is a major challenge, because it can vary with both image features and the task of the viewer. Thus, previous definitions of informativeness suffered from subjectivity and inconsistency across studies. Here we developed an objective measure of informativeness based on fixation consistency across human observers, which accounts for both bottom-up and top-down influences in ocular targeting. We then analyzed fixations in more versus less informative image regions in relation to classical statistics. Observers generated more microsaccades on more informative than less informative image regions, and such regions also exhibited low redundancy in their classical statistics. Increased microsaccade production was not explained by increased fixation duration, suggesting that the visual system specifically uses microsaccades to heighten information acquisition from informative regions.
|
40
|
Abstract
Visual discomfort has been reported for certain visual stimuli and under particular viewing conditions, such as stereoscopic viewing. In stereoscopic viewing, visual discomfort can be caused by a conflict between accommodation and convergence cues that may specify different distances in depth. Earlier research has shown that depth-of-field, which is the distance range in depth in the scene that is perceived to be sharp, influences both the perception of egocentric distance to the focal plane, and the distance range in depth between objects in the scene. Because depth-of-field may also be in conflict with convergence and the accommodative state of the eyes, we raised the question of whether depth-of-field affects discomfort when viewing stereoscopic photographs. The first experiment assessed whether discomfort increases when depth-of-field is in conflict with coherent accommodation-convergence cues to distance in depth. The second experiment assessed whether depth-of-field influences discomfort from a pre-existing accommodation-convergence conflict. Results showed no effect of depth-of-field on visual discomfort. These results suggest therefore that depth-of-field can be used as a cue to depth without inducing discomfort in the viewer, even when cue conflicts are large.
Affiliation(s)
- Louise O'Hare
- School of Psychology and Neuroscience, University of St Andrews, St Mary's Quad, South Street, St Andrews, Fife KY16 9JP, UK
|
41
|
Abstract
Symmetry is a biologically relevant, mathematically involving, and aesthetically compelling visual phenomenon. Mirror symmetry detection is considered particularly rapid and efficient, based on experiments with random noise. Symmetry detection in natural settings, however, is often accomplished against structured backgrounds. To measure salience of symmetry in diverse contexts, we assembled mirror symmetric patterns from 101 natural textures. Temporal thresholds for detecting the symmetry axis ranged from 28 to 568 ms indicating a wide range of salience (1/Threshold). We built a model for estimating symmetry-energy by connecting pairs of mirror-symmetric filters that simulated cortical receptive fields. The model easily identified the axis of symmetry for all patterns. However, symmetry-energy quantified at this axis correlated weakly with salience. To examine context effects on symmetry detection, we used the same model to estimate approximate symmetry resulting from the underlying texture throughout the image. Magnitudes of approximate symmetry at flanking and orthogonal axes showed strong negative correlations with salience, revealing context interference with symmetry detection. A regression model that included the context-based measures explained the salience results, and revealed why perceptual symmetry can differ from mathematical characterizations. Using natural patterns thus produces new insights into symmetry perception and its possible neural circuits.
Affiliation(s)
- Elias H Cohen
- Psychology Department and Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN, USA.
|
42
|
Egaña JI, Devia C, Mayol R, Parrini J, Orellana G, Ruiz A, Maldonado PE. Small Saccades and Image Complexity during Free Viewing of Natural Images in Schizophrenia. Front Psychiatry 2013; 4:37. [PMID: 23730291 PMCID: PMC3657715 DOI: 10.3389/fpsyt.2013.00037] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 12/31/2012] [Accepted: 05/05/2013] [Indexed: 11/22/2022]
Abstract
In schizophrenia, patients display dysfunctions during the execution of simple visual tasks such as antisaccade or smooth pursuit. In more ecological scenarios, such as free viewing of natural images, patients appear to make fewer and longer visual fixations and display shorter scanpaths. It is not clear whether these measurements reflect alterations in their proficiency to perform basic eye movements, such as saccades and fixations, or are related to high-level mechanisms, such as exploration or attention. We utilized free exploration of natural images of different complexities as a model of an ecological context where normally operative mechanisms of visual control can be accurately measured. We quantified visual exploration as Euclidean distance, scanpaths, saccades, and visual fixation, using the standard SR-Research eye tracker algorithm (SR). We then compared this result with a computation that includes microsaccades (EM). We evaluated eight schizophrenia patients and corresponding healthy controls (HC). Next, we tested whether the decrement in the number of saccades and fixations, as well as their increment in duration reported previously in schizophrenia patients, resulted from the increasing occurrence of undetected microsaccades. We found that when utilizing the standard SR algorithm, patients displayed shorter scanpaths as well as fewer and shorter saccades and fixations. When we employed the EM algorithm, the differences in these parameters between patients and HC were no longer significant. On the other hand, we found that image complexity plays an important role in exploratory behaviors, demonstrating that this factor explains most of the differences between eye-movement behaviors in schizophrenia patients. These results help elucidate the mechanisms of visual motor control that are affected in schizophrenia and contribute to the finding of adequate markers for diagnosis and treatment for this condition.
Affiliation(s)
- Jose Ignacio Egaña
  - Laboratorio de Neurosistemas, Programa de Fisiología y Biofísica, Facultad de Medicina, Universidad de Chile, Santiago, Chile
  - Biomedical Neuroscience Institute, Faculty of Medicine, Universidad de Chile, Santiago, Chile
  - Departamento de Anestesiología y Reanimación, Hospital Clínico Universidad de Chile, Santiago, Chile
- Christ Devia
  - Laboratorio de Neurosistemas, Programa de Fisiología y Biofísica, Facultad de Medicina, Universidad de Chile, Santiago, Chile
  - Biomedical Neuroscience Institute, Faculty of Medicine, Universidad de Chile, Santiago, Chile
- Rocío Mayol
  - Laboratorio de Neurosistemas, Programa de Fisiología y Biofísica, Facultad de Medicina, Universidad de Chile, Santiago, Chile
  - Biomedical Neuroscience Institute, Faculty of Medicine, Universidad de Chile, Santiago, Chile
- Javiera Parrini
  - Departamento de Psiquiatría y Salud Mental, Campus Oriente, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Gricel Orellana
  - Departamento de Psiquiatría y Salud Mental, Campus Oriente, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Aida Ruiz
  - Departamento de Psiquiatría y Salud Mental, Campus Norte, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Pedro E. Maldonado
  - Laboratorio de Neurosistemas, Programa de Fisiología y Biofísica, Facultad de Medicina, Universidad de Chile, Santiago, Chile
  - Biomedical Neuroscience Institute, Faculty of Medicine, Universidad de Chile, Santiago, Chile
|
43
|
Buffat S, Plantier J, Roumes C, Lorenceau J. Repetition blindness for natural images of objects with viewpoint changes. Front Psychol 2013; 3:622. [PMID: 23346069 PMCID: PMC3551441 DOI: 10.3389/fpsyg.2012.00622] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/01/2012] [Accepted: 12/30/2012] [Indexed: 11/13/2022] Open
Abstract
When stimuli are repeated in a rapid serial visual presentation (RSVP), observers sometimes fail to report the second occurrence of a target. This phenomenon is referred to as “repetition blindness” (RB). We report an RSVP experiment with photographs in which we manipulated object viewpoints between the first and second occurrences of a target (0°, 45°, or 90° changes), and spatial frequency (SF) content. Natural images were spatially filtered to produce low, medium, or high SF stimuli. RB was observed for all filtering conditions. Surprisingly, for full-spectrum (FS) images, RB increased significantly as the viewpoint reached 90°. For filtered images, a similar pattern of results was found for all conditions except for medium SF stimuli. These findings suggest that object recognition in RSVP is subtended by viewpoint-specific representations for all spatial frequencies except medium ones.
Affiliation(s)
- Stéphane Buffat
  - Institut de Recherche Biomédicale des Armées, Brétigny sur Orge, France
|
44
|
Groen IIA, Ghebreab S, Lamme VAF, Scholte HS. Low-level contrast statistics are diagnostic of invariance of natural textures. Front Comput Neurosci 2012; 6:34. [PMID: 22701419 PMCID: PMC3370418 DOI: 10.3389/fncom.2012.00034] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 12/15/2011] [Accepted: 05/23/2012] [Indexed: 11/13/2022] Open
Abstract
Texture may provide important clues for real world object and scene perception. To be reliable, these clues should ideally be invariant to common viewing variations such as changes in illumination and orientation. In a large image database of natural materials, we found textures with low-level contrast statistics that varied substantially under viewing variations, as well as textures that remained relatively constant. This led us to ask whether textures with constant contrast statistics give rise to more invariant representations compared to other textures. To test this, we selected natural texture images with either high (HV) or low (LV) variance in contrast statistics and presented these to human observers. In two distinct behavioral categorization paradigms, participants more often judged HV textures as "different" compared to LV textures, showing that textures with constant contrast statistics are perceived as being more invariant. In a separate electroencephalogram (EEG) experiment, evoked responses to single texture images (single-image ERPs) were collected. The results show that differences in contrast statistics correlated with both early and late differences in occipital ERP amplitude between individual images. Importantly, ERP differences between images of HV textures were mainly driven by illumination angle, which was not the case for LV images: there, differences were completely driven by texture membership. These converging neural and behavioral results imply that some natural textures are surprisingly invariant to illumination changes and that low-level contrast statistics are diagnostic of the extent of this invariance.
Affiliation(s)
- Iris I A Groen
  - Department of Psychology, Cognitive Neuroscience Group, University of Amsterdam, Amsterdam, Netherlands
|
45
|
Babies B, Lindemann JP, Egelhaaf M, Möller R. Contrast-independent biologically inspired motion detection. Sensors (Basel) 2011; 11:3303-26. [PMID: 22163800 PMCID: PMC3231623 DOI: 10.3390/s110303303] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 01/28/2011] [Revised: 03/15/2011] [Accepted: 03/17/2011] [Indexed: 11/16/2022]
Abstract
Optic flow, i.e., retinal image movement resulting from ego-motion, is a crucial source of information used for obstacle avoidance and course control in flying insects. Optic flow analysis may prove promising for mobile robotics although it is currently not among the standard techniques. Insects have developed a computationally cheap analysis mechanism for image motion. Detailed computational models, the so-called elementary motion detectors (EMDs), describe motion detection in insects. However, the technical application of EMDs is complicated by the strong effect of local pattern contrast on their motion response. Here we present augmented versions of an EMD, the (s)cc-EMDs, which normalise their responses for contrast and thereby reduce the sensitivity to contrast changes. Thus, velocity changes of moving natural images are reflected more reliably in the detector response. The (s)cc-EMDs can easily be implemented in hardware and software and can be a valuable novel visual motion sensor for mobile robots.
Affiliation(s)
- Birthe Babies
  - Center of Excellence ‘Cognitive Interaction Technology’, Bielefeld University, D-33594 Bielefeld, Germany
  - Computer Engineering Group, Faculty of Technology, Bielefeld University, D-33594 Bielefeld, Germany
- Jens Peter Lindemann
  - Center of Excellence ‘Cognitive Interaction Technology’, Bielefeld University, D-33594 Bielefeld, Germany
  - Department of Neurobiology, Faculty of Biology, Bielefeld University, D-33594 Bielefeld, Germany
- Martin Egelhaaf
  - Center of Excellence ‘Cognitive Interaction Technology’, Bielefeld University, D-33594 Bielefeld, Germany
  - Department of Neurobiology, Faculty of Biology, Bielefeld University, D-33594 Bielefeld, Germany
- Ralf Möller
  - Center of Excellence ‘Cognitive Interaction Technology’, Bielefeld University, D-33594 Bielefeld, Germany
  - Computer Engineering Group, Faculty of Technology, Bielefeld University, D-33594 Bielefeld, Germany
|
46
|
Abstract
Recent studies have used fMRI signals from early visual areas to reconstruct simple geometric patterns. Here, we demonstrate a new Bayesian decoder that uses fMRI signals from early and anterior visual areas to reconstruct complex natural images. Our decoder combines three elements: a structural encoding model that characterizes responses in early visual areas, a semantic encoding model that characterizes responses in anterior visual areas, and prior information about the structure and semantic content of natural images. By combining all these elements, the decoder produces reconstructions that accurately reflect both the spatial structure and semantic category of the objects contained in the observed natural image. Our results show that prior information has a substantial effect on the quality of natural image reconstructions. We also demonstrate that much of the variance in the responses of anterior visual areas to complex natural images is explained by the semantic category of the image alone.
Affiliation(s)
- Thomas Naselaris
  - Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720, USA
- Ryan J. Prenger
  - Department of Physics, University of California, Berkeley, CA 94720, USA
- Kendrick N. Kay
  - Department of Psychology, University of California, Berkeley, CA 94720, USA
- Michael Oliver
  - Vision Science Program, University of California, Berkeley, CA 94720, USA
- Jack L. Gallant
  - Helen Wills Neuroscience Institute, University of California, Berkeley, CA 94720, USA
  - Department of Psychology, University of California, Berkeley, CA 94720, USA
  - Vision Science Program, University of California, Berkeley, CA 94720, USA
|
47
|
Abstract
The contrast sensitivity function is routinely measured in the laboratory with sine-wave gratings presented on homogenous gray backgrounds; natural images are instead composed of a broad range of spatial and temporal structures. In order to extend channel-based models of visual processing to more natural conditions, we examined how contrast sensitivity varies with the context in which it is measured. We report that contrast sensitivity is quite different under laboratory than natural viewing conditions: adaptation or masking with natural scenes attenuates contrast sensitivity at low spatial and temporal frequencies. Expressed another way, viewing stimuli presented on homogenous screens overcomes chronic adaptation to the natural environment and causes a sharp, unnatural increase in sensitivity to low spatial and temporal frequencies. Consequently, the standard contrast sensitivity function is a poor indicator of sensitivity to structure in natural scenes. The magnitude of masking by natural scenes is relatively independent of local contrast but depends strongly on the density of edges even though neither greatly affects the local amplitude spectrum. These results suggest that sensitivity to spatial structure in natural scenes depends on the distribution of local edges as well as the local amplitude spectrum.
Affiliation(s)
- Peter J Bex
  - Department of Ophthalmology, Schepens Eye Research Institute, Harvard Medical School, Boston, MA 02114, USA.
|
48
|
Carandini M, Demb JB, Mante V, Tolhurst DJ, Dan Y, Olshausen BA, Gallant JL, Rust NC. Do we know what the early visual system does? J Neurosci 2005; 25:10577-97. [PMID: 16291931 PMCID: PMC6725861 DOI: 10.1523/jneurosci.3726-05.2005] [Citation(s) in RCA: 315] [Impact Index Per Article: 16.6] [Received: 09/02/2005] [Revised: 10/10/2005] [Accepted: 10/11/2005] [Indexed: 11/21/2022] Open
Abstract
We can claim that we know what the visual system does once we can predict neural responses to arbitrary stimuli, including those seen in nature. In the early visual system, models based on one or more linear receptive fields hold promise to achieve this goal as long as the models include nonlinear mechanisms that control responsiveness, based on stimulus context and history, and take into account the nonlinearity of spike generation. These linear and nonlinear mechanisms might be the only essential determinants of the response, or alternatively, there may be additional fundamental determinants yet to be identified. Research is progressing with the goals of defining a single "standard model" for each stage of the visual pathway and testing the predictive power of these models on the responses to movies of natural scenes. These predictive models represent, at a given stage of the visual pathway, a compact description of visual computation. They would be an invaluable guide for understanding the underlying biophysical and anatomical mechanisms and relating neural responses to visual perception.
Affiliation(s)
- Matteo Carandini
  - Smith-Kettlewell Eye Research Institute, San Francisco, California 94115, USA.
|
49
|
Smyth D, Willmore B, Baker GE, Thompson ID, Tolhurst DJ. The receptive-field organization of simple cells in primary visual cortex of ferrets under natural scene stimulation. J Neurosci 2003; 23:4746-59. [PMID: 12805314 PMCID: PMC6740783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2023] Open
Abstract
The responses of simple cells in primary visual cortex to sinusoidal gratings can primarily be predicted from their spatial receptive fields, as mapped using spots or bars. Although this quasilinearity is well documented, it is not clear whether it holds for complex natural stimuli. We recorded from simple cells in the primary visual cortex of anesthetized ferrets while stimulating with flashed digitized photographs of natural scenes. We applied standard reverse-correlation methods to quantify the average natural stimulus that invokes a neuronal response. Although these maps cannot be the receptive fields, we find that they still predict the preferred orientation of grating for each cell very well (r = 0.91); they do not predict the spatial-frequency tuning. Using a novel application of the linear reconstruction method called regularized pseudoinverse, we were able to recover high-resolution receptive-field maps from the responses to a relatively small number of natural scenes. These receptive-field maps not only predict the optimum orientation of each cell (r = 0.96) but also the spatial-frequency optimum (r = 0.89); the maps also predict the tuning bandwidths of many cells. Therefore, our first conclusion is that the tuning preferences of the cells are primarily linear and constant across stimulus type. However, when we used these maps to predict the actual responses of the cells to natural scenes, we did find evidence of expansive output nonlinearity and nonlinear influences from outside the classical receptive fields, orientation tuning, and spatial-frequency tuning.
Affiliation(s)
- Darragh Smyth
  - Laboratory of Physiology, Oxford University, Oxford OX1 3PT, United Kingdom.
|
50
|
Abstract
The human visual system encodes the chromatic signals conveyed by the three types of retinal cone photoreceptors in an opponent fashion. This opponency is thought to reduce redundant information by decorrelating the photoreceptor signals. Correlations in the receptor signals are caused by the substantial overlap of the spectral sensitivities of the receptors, but it is not clear to what extent the properties of natural spectra contribute to the correlations. To investigate the influences of natural spectra and photoreceptor spectral sensitivities, we attempted to find linear codes with minimal redundancy for trichromatic images assuming human cone spectral sensitivities, or hypothetical non-overlapping cone sensitivities, respectively. The resulting properties of basis functions are similar in both cases. They are non-orthogonal, show strong opponency along an achromatic direction (luminance edges) and along chromatic directions, and they achieve a highly efficient encoding of natural chromatic signals. Thus, color opponency arises for the encoding of human cone signals, i.e. with strongly overlapping spectral sensitivities, but also under the assumption of non-overlapping spectral sensitivities. Our results suggest that color opponency may in part be a result of the properties of natural spectra and not solely a consequence of the cone spectral sensitivities.
Affiliation(s)
- Te-Won Lee
  - Institute for Neural Computation, University of California, San Diego, La Jolla, CA 92093-0523, USA.
|