51
Güçlü U, van Gerven MAJ. Modeling the Dynamics of Human Brain Activity with Recurrent Neural Networks. Front Comput Neurosci 2017; 11:7. [PMID: 28232797] [PMCID: PMC5299026] [DOI: 10.3389/fncom.2017.00007]
Abstract
Encoding models are used for predicting brain activity in response to sensory stimuli with the objective of elucidating how sensory information is represented in the brain. Encoding models typically comprise a nonlinear transformation of stimuli to features (feature model) and a linear convolution of features to responses (response model). While there has been extensive work on developing better feature models, the work on developing better response models has been rather limited. Here, we investigate the extent to which recurrent neural network models can use their internal memories for nonlinear processing of arbitrary feature sequences to predict feature-evoked response sequences as measured by functional magnetic resonance imaging. We show that the proposed recurrent neural network models can significantly outperform established response models by accurately estimating long-term dependencies that drive hemodynamic responses. The results open a new window into modeling the dynamics of brain activity in response to sensory stimuli.
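The two-stage encoding model this abstract describes (a nonlinear feature model followed by a linear response model fit to hemodynamic responses) can be sketched schematically. Everything below is a synthetic stand-in, not the authors' implementation: random features replace the feature model, a toy gamma-shaped kernel replaces the canonical HRF, and ridge regression serves as the linear response model.

```python
import numpy as np

# Hypothetical encoding-model sketch: features -> HRF convolution -> ridge fit.
rng = np.random.default_rng(0)

T, F = 200, 8                              # time points, feature channels
features = rng.standard_normal((T, F))     # stand-in for the feature model output

# Toy HRF: difference of gamma-like kernels (an assumption, not the real HRF).
t = np.arange(0, 20)
hrf = (t**5 * np.exp(-t)) / 120.0 - 0.1 * (t**10 * np.exp(-t)) / 3628800.0

# Convolve each feature channel with the HRF (input to the response model).
X = np.column_stack([np.convolve(features[:, i], hrf)[:T] for i in range(F)])

true_w = rng.standard_normal(F)
y = X @ true_w + 0.1 * rng.standard_normal(T)   # simulated voxel response

# Ridge regression: w = (X'X + lambda I)^-1 X'y
lam = 1.0
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(F), X.T @ y)

pred = X @ w_hat
r = np.corrcoef(pred, y)[0, 1]             # prediction accuracy (correlation)
```

The recurrent models in the paper replace this fixed convolution-plus-linear-map with a learned, stateful mapping; the sketch only illustrates the established baseline it is compared against.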
Affiliation(s)
- Umut Güçlü
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands
- Marcel A J van Gerven
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands
52
Groen IIA, Silson EH, Baker CI. Contributions of low- and high-level properties to neural processing of visual scenes in the human brain. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160102. [PMID: 28044013] [DOI: 10.1098/rstb.2016.0102]
Abstract
Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Iris I A Groen
- Laboratory of Brain and Cognition, National Institutes of Health, 10 Center Drive 10-3N228, Bethesda, MD, USA
- Edward H Silson
- Laboratory of Brain and Cognition, National Institutes of Health, 10 Center Drive 10-3N228, Bethesda, MD, USA
- Chris I Baker
- Laboratory of Brain and Cognition, National Institutes of Health, 10 Center Drive 10-3N228, Bethesda, MD, USA
53
Cichy RM, Teng S. Resolving the neural dynamics of visual and auditory scene processing in the human brain: a methodological approach. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160108. [PMID: 28044019] [PMCID: PMC5206276] [DOI: 10.1098/rstb.2016.0108]
Abstract
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’.
Affiliation(s)
- Santani Teng
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
54
Ghodrati M, Ghodousi M, Yoonessi A. Low-Level Contrast Statistics of Natural Images Can Modulate the Frequency of Event-Related Potentials (ERP) in Humans. Front Hum Neurosci 2016; 10:630. [PMID: 28018197] [PMCID: PMC5145888] [DOI: 10.3389/fnhum.2016.00630]
Abstract
Humans are fast and accurate in categorizing complex natural images. It is, however, unclear which features of visual information the brain exploits to perceive images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of the amplitude of event-related potentials (ERPs) in response to rapidly presented images. In this study, we investigated the effect of these statistics on the frequency content of ERPs. We recorded ERPs from human subjects while they viewed natural images, each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of the ERPs best among the image statistics we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and the ERPs' power within the theta frequency band (~3–7 Hz). This is interesting, as the theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated by low-level contrast statistics of natural images, and they highlight a potential role for these statistics in scene perception.
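The Weibull contrast statistics mentioned here are typically obtained by fitting a Weibull distribution to the distribution of local contrast (e.g. gradient magnitude) values of an image; the fitted shape and scale parameters then summarize the image. A minimal sketch, with a synthetic noise image standing in for a natural scene and a plain two-parameter Weibull fit standing in for the paper's exact estimation procedure:

```python
import numpy as np
from scipy.stats import weibull_min

# Synthetic stand-in for a natural image (the paper uses real photographs).
rng = np.random.default_rng(1)
img = rng.random((64, 64))

# Local contrast as gradient magnitude.
gy, gx = np.gradient(img.astype(float))
grad_mag = np.hypot(gx, gy).ravel()
grad_mag = grad_mag[grad_mag > 0]          # Weibull support is x > 0

# Fit a two-parameter Weibull (location fixed at 0); `shape` and `scale`
# are the image's contrast-statistic summary.
shape, loc, scale = weibull_min.fit(grad_mag, floc=0)
```

Across natural images, these two parameters vary systematically with scene structure, which is what makes them usable as single-trial regressors for ERP amplitude and power.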
Affiliation(s)
- Masoud Ghodrati
- Department of Physiology, Monash University, Clayton, VIC, Australia; Neuroscience Program, Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
- Mahrad Ghodousi
- Department of Neuroscience, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Ali Yoonessi
- Department of Neuroscience, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran, Iran
55
Sun Q, Ren Y, Zheng Y, Sun M, Zheng Y. Superordinate Level Processing Has Priority Over Basic-Level Processing in Scene Gist Recognition. Iperception 2016; 7:2041669516681307. [PMID: 28382195] [PMCID: PMC5367644] [DOI: 10.1177/2041669516681307]
Abstract
By combining a perceptual discrimination task and a visuospatial working memory task, the present study examined the effects of visuospatial working memory load on the hierarchical processing of scene gist. In the perceptual discrimination task, two scene images from the same (manmade-manmade pairing or natural-natural pairing) or different superordinate level categories (manmade-natural pairing) were presented simultaneously, and participants were asked to judge whether the two images belonged to the same basic-level category (e.g., street-street pairing) or not (e.g., street-highway pairing). In the concurrent working memory task, spatial load (position-based load in Experiment 1) and object load (figure-based load in Experiment 2) were manipulated. The results were as follows: (a) spatial load and object load had stronger effects on the discrimination of same basic-level scene pairings than of same superordinate level scene pairings; (b) spatial load had a larger impact on the discrimination of scene pairings at early stages than at later stages; by contrast, object load had a larger influence at later stages than at early stages. It follows that superordinate level processing has priority over basic-level processing in scene gist recognition, and that spatial information contributes to the earlier, and object information to the later, stages of scene gist recognition.
Affiliation(s)
- Qi Sun
- School of Psychology, Shandong Normal University, Jinan, P. R. China
- Yanju Ren
- School of Psychology, Shandong Normal University, Jinan, P. R. China
- Yang Zheng
- Department of Psychology, Zhejiang Normal University, Jinhua, P. R. China
- Mingxia Sun
- School of Public Administration, Shandong Normal University, Jinan, P. R. China
- Yuanjie Zheng
- School of Information Science and Engineering, Shandong Normal University, Jinan, P. R. China
56
Making Sense of Real-World Scenes. Trends Cogn Sci 2016; 20:843-856. [PMID: 27769727] [DOI: 10.1016/j.tics.2016.09.003]
Abstract
To interact with the world, we have to make sense of the continuous sensory input conveying information about our environment. A recent surge of studies has investigated the processes enabling scene understanding, using increasingly complex stimuli and sophisticated analyses to highlight the visual features and brain regions involved. However, there are two major challenges to producing a comprehensive framework for scene understanding. First, scene perception is highly dynamic, subserving multiple behavioral goals. Second, a multitude of different visual properties co-occur across scenes and may be correlated or independent. We synthesize the recent literature and argue that for a complete view of scene understanding, it is necessary to account for both differing observer goals and the contribution of diverse scene properties.
57
The Temporal Dynamics of Scene Processing: A Multifaceted EEG Investigation. eNeuro 2016; 3:ENEURO.0139-16.2016. [PMID: 27699208] [PMCID: PMC5037322] [DOI: 10.1523/eneuro.0139-16.2016]
Abstract
Our remarkable ability to process complex visual scenes is supported by a network of scene-selective cortical regions. Despite growing knowledge about the scene representation in these regions, much less is known about the temporal dynamics with which these representations emerge. We conducted two experiments aimed at identifying and characterizing the earliest markers of scene-specific processing. In the first experiment, human participants viewed images of scenes, faces, and everyday objects while event-related potentials (ERPs) were recorded. We found that the first ERP component to evince a significantly stronger response to scenes than the other categories was the P2, peaking ∼220 ms after stimulus onset. To establish that the P2 component reflects scene-specific processing, in the second experiment, we recorded ERPs while the participants viewed diverse real-world scenes spanning the following three global scene properties: spatial expanse (open/closed), relative distance (near/far), and naturalness (man-made/natural). We found that P2 amplitude was sensitive to these scene properties at both the categorical level, distinguishing between open and closed natural scenes, as well as at the single-image level, reflecting both computationally derived scene statistics and behavioral ratings of naturalness and spatial expanse. Together, these results establish the P2 as an ERP marker for scene processing, and demonstrate that scene-specific global information is available in the neural response as early as 220 ms.
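The core measurement in this study, a component amplitude read out from trial-averaged ERPs, can be sketched as follows. The data, sampling rate, and P2 window below are illustrative assumptions, not the paper's recording parameters:

```python
import numpy as np

# Sketch: trial-average ERPs per condition, then read out P2 amplitude in a
# ~180-260 ms post-stimulus window (all parameters are assumptions).
rng = np.random.default_rng(5)
fs = 500                                    # sampling rate in Hz (assumed)
t = np.arange(-0.1, 0.5, 1 / fs)            # epoch from -100 to 500 ms

def simulate_trials(p2_gain, n_trials=60):
    """Synthetic trials: a Gaussian P2-like bump at 220 ms plus noise."""
    p2 = p2_gain * np.exp(-((t - 0.22) ** 2) / (2 * 0.02 ** 2))
    return p2 + rng.standard_normal((n_trials, t.size))

scenes = simulate_trials(3.0)               # stronger P2 for scenes (simulated)
objects = simulate_trials(1.0)

erp_scenes = scenes.mean(axis=0)            # event-related potential = trial mean
erp_objects = objects.mean(axis=0)

win = (t >= 0.18) & (t <= 0.26)             # P2 measurement window
p2_scenes = erp_scenes[win].max()
p2_objects = erp_objects[win].max()
```

The paper's second experiment goes further, relating single-image P2 amplitudes to computationally derived scene statistics; the sketch covers only the categorical averaging step.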
58
Silson EH, Groen IIA, Kravitz DJ, Baker CI. Evaluating the correspondence between face-, scene-, and object-selectivity and retinotopic organization within lateral occipitotemporal cortex. J Vis 2016; 16:14. [PMID: 27105060] [PMCID: PMC4898275] [DOI: 10.1167/16.6.14]
Abstract
The organization of human lateral occipitotemporal cortex (lOTC) has been characterized largely according to two distinct principles: retinotopy and category-selectivity. Whereas category-selective regions were originally thought to exist beyond retinotopic maps, recent evidence highlights overlap. Here, we combined detailed mapping of retinotopy, using population receptive fields (pRFs), and category-selectivity to examine and contrast the retinotopic profiles of scene- (occipital place area, OPA), face- (occipital face area, OFA) and object- (lateral occipital cortex, LO) selective regions of lOTC. We observe striking differences in the relationship each region has to the underlying retinotopy. Whereas OPA overlapped multiple retinotopic maps (including V3A, V3B, LO1, and LO2), and LO overlapped two maps (LO1 and LO2), OFA overlapped almost none. There appears to be no simple, consistent relationship between category-selectivity and retinotopic maps: category-selective regions are not consistently constrained spatially by retinotopic map borders. The multiple maps that overlap OPA suggest it is likely not appropriate to conceptualize it as a single scene-selective region, whereas the absence of any systematic map overlapping OFA suggests it may constitute a more uniform area. Beyond their relationship to retinotopy, all three regions evidenced strongly retinotopic voxels, with pRFs exhibiting a significant bias towards the contralateral lower visual field despite differences in pRF size, contributing to an emerging literature suggesting this bias is present across much of lOTC. Taken together, these results suggest that whereas category-selective regions do not consistently contain ordered retinotopic maps, they nonetheless likely inherit the retinotopic characteristics of the maps from which they draw information.
59
Abstract
Naturalistic textures with an intermediate degree of statistical regularity can capture key structural features of natural images (Freeman and Simoncelli, 2011). V2 and later visual areas are sensitive to these features, while primary visual cortex is not (Freeman et al., 2013). Here we expand on this work by investigating a class of textures that have maximal formal regularity, the 17 crystallographic wallpaper groups (Fedorov, 1891). We used texture stimuli from four of the groups that differ in the maximum order of rotation symmetry they contain, and measured neural responses in human participants using functional MRI and high-density EEG. We found that cortical area V3 has a parametric representation of the rotation symmetries in the textures that is not present in either V1 or V2, the first discovery of a stimulus property that differentiates processing in V3 from that of lower-level areas. Parametric responses were also seen in higher-order ventral stream areas V4, VO1, and lateral occipital complex (LOC), but not in dorsal stream areas. The parametric response pattern was replicated in the EEG data, and source localization indicated that responses in V3 and V4 lead responses in LOC, which is consistent with a feedforward mechanism. Finally, we presented our stimuli to four well-developed feedforward models and found that none of them were able to account for our results. Our results highlight structural regularity as an important stimulus dimension for distinguishing the early stages of visual processing, and suggest a previously unrecognized role for V3 in the visual form-processing hierarchy. Significance statement: Hierarchical processing is a fundamental organizing principle in visual neuroscience, with each successive processing stage being sensitive to increasingly complex stimulus properties. Here, we probe the encoding hierarchy in human visual cortex using a class of visual textures--wallpaper patterns--that are maximally regular. Through a combination of fMRI and EEG source imaging, we find specific responses to texture regularity that depend parametrically on the maximum order of rotation symmetry in the textures. These parametric responses are seen in several areas of the ventral visual processing stream, as well as in area V3, but not in V1 or V2. This is the first demonstration of a stimulus property that differentiates processing in V3 from that of lower-level visual areas.
60
Ramkumar P, Hansen BC, Pannasch S, Loschky LC. Visual information representation and rapid-scene categorization are simultaneous across cortex: An MEG study. Neuroimage 2016; 134:295-304. [PMID: 27001497] [DOI: 10.1016/j.neuroimage.2016.03.027]
Abstract
Perceiving the visual world around us requires the brain to represent the features of stimuli and to categorize the stimulus based on these features. Incorrect categorization can result either from errors in visual representation or from errors in processes that lead to categorical choice. To understand the temporal relationship between the neural signatures of such systematic errors, we recorded whole-scalp magnetoencephalography (MEG) data from human subjects performing a rapid-scene categorization task. We built scene category decoders based on (1) spatiotemporally resolved neural activity, (2) spatial envelope (SpEn) image features, and (3) behavioral responses. Using confusion matrices, we tracked how well the pattern of errors from neural decoders could be explained by SpEn decoders and behavioral errors, over time and across cortical areas. Across the visual cortex and the medial temporal lobe, we found that both SpEn and behavioral errors explained unique variance in the errors of neural decoders. Critically, these effects were nearly simultaneous, and most prominent between 100 and 250 ms after stimulus onset. Thus, during rapid-scene categorization, neural processes that ultimately result in behavioral categorization are simultaneous and co-localized with neural processes underlying visual information representation.
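The confusion-matrix logic described here, comparing the error pattern of one decoder with that of another, can be sketched with synthetic labels. Everything below is a toy stand-in: random category labels replace scenes, and a simple noisy predictor replaces the neural and SpEn decoders (the paper additionally partials out shared variance, which this sketch omits):

```python
import numpy as np

rng = np.random.default_rng(2)
n_classes, n_trials = 4, 400
true = rng.integers(0, n_classes, n_trials)     # synthetic scene categories

def noisy_predictions(true, acc, rng):
    """Predict the true class with probability `acc`, otherwise a random class."""
    pred = true.copy()
    flip = rng.random(true.size) > acc
    pred[flip] = rng.integers(0, n_classes, flip.sum())
    return pred

def confusion(true, pred, k):
    """Row-normalized confusion matrix: rows = true class, cols = predicted."""
    cm = np.zeros((k, k))
    for tr, pr in zip(true, pred):
        cm[tr, pr] += 1
    return cm / cm.sum(axis=1, keepdims=True)

cm_neural = confusion(true, noisy_predictions(true, 0.7, rng), n_classes)
cm_feature = confusion(true, noisy_predictions(true, 0.7, rng), n_classes)

# Compare only the off-diagonal (error) entries, as in confusion analyses:
# a high correlation means the two decoders make similar mistakes.
off = ~np.eye(n_classes, dtype=bool)
r = np.corrcoef(cm_neural[off], cm_feature[off])[0, 1]
```

Repeating this comparison at each MEG time point and cortical area is what yields the time-resolved "errors explained" trajectories reported in the paper.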
Affiliation(s)
- Pavan Ramkumar
- Brain Research Unit, O.V. Lounasmaa Laboratory, Aalto University School of Science, Espoo, Finland.
- Bruce C Hansen
- Department of Psychology and Neuroscience Program, Colgate University, Hamilton, NY, USA.
- Sebastian Pannasch
- Brain Research Unit, O.V. Lounasmaa Laboratory, Aalto University School of Science, Espoo, Finland; Department of Psychology, Technische Universität Dresden, Dresden, Germany.
- Lester C Loschky
- Department of Psychological Sciences, Kansas State University, Manhattan, KS, USA.
61
Groen IIA, Ghebreab S, Lamme VAF, Scholte HS. The time course of natural scene perception with reduced attention. J Neurophysiol 2016; 115:931-46. [DOI: 10.1152/jn.00896.2015]
Abstract
Attention is thought to impose an informational bottleneck on vision by selecting particular information from visual scenes for enhanced processing. Behavioral evidence suggests, however, that some scene information is extracted even when attention is directed elsewhere. Here, we investigated the neural correlates of this ability by examining how attention affects electrophysiological markers of scene perception. In two electroencephalography (EEG) experiments, human subjects categorized real-world scenes as manmade or natural (full attention condition) or performed tasks on unrelated stimuli in the center or periphery of the scenes (reduced attention conditions). Scene processing was examined in two ways: traditional trial averaging was used to assess the presence of a categorical manmade/natural distinction in event-related potentials, whereas single-trial analyses assessed whether EEG activity was modulated by scene statistics that are diagnostic of the naturalness of individual scenes. The results indicated that evoked activity up to 250 ms was unaffected by reduced attention, showing intact categorical differences between manmade and natural scenes and strong modulations of single-trial activity by scene statistics in all conditions. Thus initial processing of both categorical and individual scene information remained intact with reduced attention. Importantly, however, attention did have profound effects on later evoked activity; full attention on the scene resulted in prolonged manmade/natural differences, increased neural sensitivity to scene statistics, and enhanced scene memory. These results show that initial processing of real-world scene information is intact with diminished attention but that the depth of processing of this information does depend on attention.
Affiliation(s)
- Iris I. A. Groen
- Amsterdam Brain and Cognition Center, Department of Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
- Sennay Ghebreab
- Amsterdam Brain and Cognition Center, Department of Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
- Intelligent Systems Lab Amsterdam, Institute of Informatics, University of Amsterdam, Amsterdam, The Netherlands
- Victor A. F. Lamme
- Amsterdam Brain and Cognition Center, Department of Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
- H. Steven Scholte
- Amsterdam Brain and Cognition Center, Department of Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
62
Bieniek MM, Bennett PJ, Sekuler AB, Rousselet GA. A robust and representative lower bound on object processing speed in humans. Eur J Neurosci 2015; 44:1804-14. [PMID: 26469359] [PMCID: PMC4982026] [DOI: 10.1111/ejn.13100]
Abstract
How early does the brain decode object categories? Addressing this question is critical to constrain the type of neuronal architecture supporting object categorization. In this context, much effort has been devoted to estimating face processing speed. With onsets estimated from 50 to 150 ms, the timing of the first face-sensitive responses in humans remains controversial. This controversy is due partially to the susceptibility of dynamic brain measurements to filtering distortions and analysis issues. Here, using distributions of single-trial event-related potentials (ERPs), causal filtering, statistical analyses at all electrodes and time points, and effective correction for multiple comparisons, we present evidence that the earliest categorical differences start around 90 ms following stimulus presentation. These results were obtained from a representative group of 120 participants, aged 18-81, who categorized images of faces and noise textures. The results were reliable across testing days, as determined by test-retest assessment in 74 of the participants. Furthermore, a control experiment showed similar ERP onsets for contrasts involving images of houses or white noise. Face onsets did not change with age, suggesting that face sensitivity occurs within 100 ms across the adult lifespan. Finally, the simplicity of the face-texture contrast, and the dominant midline distribution of the effects, suggest the face responses were evoked by relatively simple image properties and are not face specific. Our results provide a new lower benchmark for the earliest neuronal responses to complex objects in the human visual system.
Affiliation(s)
- Magdalena M Bieniek
- Institute of Neuroscience and Psychology, College of Medical, Veterinary and Life Sciences, University of Glasgow, 58 Hillhead Street, Glasgow, G12 8QB, UK
- Patrick J Bennett
- Department of Psychology, Neuroscience and Behaviour, McMaster University, Hamilton, ON, Canada
- Allison B Sekuler
- Department of Psychology, Neuroscience and Behaviour, McMaster University, Hamilton, ON, Canada
- Guillaume A Rousselet
- Institute of Neuroscience and Psychology, College of Medical, Veterinary and Life Sciences, University of Glasgow, 58 Hillhead Street, Glasgow, G12 8QB, UK
63
Choi LK, You J, Bovik AC. Referenceless Prediction of Perceptual Fog Density and Perceptual Image Defogging. IEEE Trans Image Process 2015; 24:3888-3901. [PMID: 26186784] [DOI: 10.1109/tip.2015.2456502]
Abstract
We propose a referenceless perceptual fog density prediction model based on natural scene statistics (NSS) and fog-aware statistical features. The proposed model, called Fog Aware Density Evaluator (FADE), predicts the visibility of a foggy scene from a single image without reference to a corresponding fog-free image, without dependence on salient objects in a scene, without side geographical camera information, without estimating a depth-dependent transmission map, and without training on human-rated judgments. FADE only makes use of measurable deviations from statistical regularities observed in natural foggy and fog-free images. The fog-aware statistical features that define the perceptual fog density index derive from a space-domain NSS model and the observed characteristics of foggy images. FADE not only predicts perceptual fog density for the entire image, but also provides a local fog density index for each patch. The fog density predicted by FADE correlates well with human judgments of fog density obtained in a subjective study on a large foggy image database. As applications, FADE not only accurately assesses the performance of defogging algorithms designed to enhance the visibility of foggy images, but is also well suited for image defogging. A new FADE-based referenceless perceptual image defogging method, dubbed DEnsity of Fog Assessment-based DEfogger (DEFADE), achieves better results for darker, denser foggy images, as well as on standard foggy images, than state-of-the-art defogging methods. A software release of FADE and DEFADE is available online for public use: http://live.ece.utexas.edu/research/fog/index.html.
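One family of space-domain NSS features of the kind this abstract invokes is mean-subtracted contrast-normalized (MSCN) coefficients: in sharp natural images their variance is high, while fog compresses local contrast and lowers it. The sketch below is an illustration, not FADE itself: it approximates the usual Gaussian weighting with a box filter and simulates fog as an affine contrast reduction.

```python
import numpy as np

def mscn(img, win=7, eps=1.0):
    """Mean-subtracted contrast-normalized coefficients with a box-filter
    local mean/std (an approximation of the standard Gaussian window)."""
    pad = win // 2
    padded = np.pad(img.astype(float), pad, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (win, win))
    mu = windows.mean(axis=(-1, -2))       # local mean, same shape as img
    sigma = windows.std(axis=(-1, -2))     # local contrast
    return (img - mu) / (sigma + eps)

rng = np.random.default_rng(3)
sharp = rng.random((32, 32)) * 255         # high-contrast stand-in image
foggy = 0.2 * sharp + 200                  # simulated fog: compressed, brightened

var_sharp = mscn(sharp).var()
var_foggy = mscn(foggy).var()              # expected to be lower under fog
```

FADE combines many such deviation-from-regularity features into a density index; the sketch shows only the direction of the effect for one feature.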
64
Sofer I, Crouzet SM, Serre T. Explaining the Timing of Natural Scene Understanding with a Computational Model of Perceptual Categorization. PLoS Comput Biol 2015; 11:e1004456. [PMID: 26335683] [PMCID: PMC4559373] [DOI: 10.1371/journal.pcbi.1004456]
Abstract
Observers can rapidly perform a variety of visual tasks such as categorizing a scene as open, as outdoor, or as a beach. Although we know that different tasks are typically associated with systematic differences in behavioral responses, to date, little is known about the underlying mechanisms. Here, we implemented a single integrated paradigm that links perceptual processes with categorization processes. Using a large image database of natural scenes, we trained machine-learning classifiers to derive quantitative measures of task-specific perceptual discriminability based on the distance between individual images and different categorization boundaries. We showed that the resulting discriminability measure accurately predicts variations in behavioral responses across categorization tasks and stimulus sets. We further used the model to design an experiment, which challenged previous interpretations of the so-called "superordinate advantage." Overall, our study suggests that observed differences in behavioral responses across rapid categorization tasks reflect natural variations in perceptual discriminability.
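The discriminability measure described here, the distance between an image and a trained categorization boundary, can be sketched with a generic linear classifier. Synthetic features stand in for the image descriptors, and a least-squares linear classifier stands in for the paper's machine-learning classifiers:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 300, 5
X = rng.standard_normal((n, d))            # stand-in image features
w_true = rng.standard_normal(d)
labels = np.sign(X @ w_true)               # two synthetic "categories"

# Least-squares linear classifier (a stand-in, not the paper's model).
Xb = np.column_stack([X, np.ones(n)])
w = np.linalg.lstsq(Xb, labels, rcond=None)[0]

scores = Xb @ w
# Per-image discriminability: distance to the decision boundary.
distance = np.abs(scores) / np.linalg.norm(w[:-1])

# The model's prediction: images far from the boundary are handled more
# reliably, mirroring easier behavioral categorization.
correct = np.sign(scores) == labels
far = distance > np.median(distance)
acc_far = correct[far].mean()
acc_near = correct[~far].mean()
```

In the paper, this per-image distance is what predicts behavioral response variations across tasks and stimulus sets, including the reinterpretation of the "superordinate advantage".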
Affiliation(s)
- Imri Sofer
- Cognitive, Linguistic & Psychological Sciences Department, Brown Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
- Sébastien M. Crouzet
- Cognitive, Linguistic & Psychological Sciences Department, Brown Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
- Thomas Serre
- Cognitive, Linguistic & Psychological Sciences Department, Brown Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
65
Can the Outputs of LGN Y-Cells Support Emotion Recognition? A Computational Study. Comput Intell Neurosci 2015; 2015:695921. [PMID: 26175756] [PMCID: PMC4484845] [DOI: 10.1155/2015/695921]
Abstract
It has been suggested that emotional visual input is processed along both a slower cortical pathway and a faster subcortical pathway comprising the lateral geniculate nucleus (LGN), the superior colliculus, the pulvinar, and finally the amygdala. However, anatomical as well as functional evidence concerning the subcortical route is lacking. Here, we adopt a computational approach to investigate whether the visual representation achieved in the LGN can support emotion recognition and emotional responses along the subcortical route. In four experiments, we show that the outputs of LGN Y-cells support neither facial expression categorization nor same/different expression matching by an artificial classifier. However, the same classifier performs above chance in statistics-based categorizations of scenes containing animals versus scenes containing people, and of light versus dark patterns. We conclude that the visual representation achieved in the LGN is insufficient for the recognition of emotional facial expressions.
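A toy version of the kind of test the abstract describes: pool a simple nonlinear luminance response over large receptive fields (a crude stand-in for Y-cell outputs, not the paper's actual model), then ask whether a nearest-class-mean classifier can separate the one contrast the abstract says is separable, light versus dark patterns. The pooling size, nonlinearity, and synthetic stimuli are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def y_cell_output(img, pool=8):
    """Crude stand-in for LGN Y-cell responses: a static nonlinearity
    (squaring) followed by pooling over large receptive fields.
    Illustrative simplification only."""
    resp = img ** 2
    h, w = img.shape
    return resp.reshape(h // pool, pool, w // pool, pool).mean((1, 3)).ravel()

# Synthetic stimuli: predominantly light vs predominantly dark patterns.
light = [rng.uniform(0.5, 1.0, (64, 64)) for _ in range(40)]
dark = [rng.uniform(0.0, 0.5, (64, 64)) for _ in range(40)]

X = np.array([y_cell_output(im) for im in light + dark])
y = np.r_[np.zeros(40), np.ones(40)]

# Leave-one-out nearest-class-mean classification.
correct = 0
for i in range(len(X)):
    keep = np.arange(len(X)) != i
    mu0 = X[keep & (y == 0)].mean(0)
    mu1 = X[keep & (y == 1)].mean(0)
    pred = 0 if np.linalg.norm(X[i] - mu0) < np.linalg.norm(X[i] - mu1) else 1
    correct += pred == y[i]
accuracy = correct / len(X)
print(f"light-vs-dark accuracy from pooled responses: {accuracy:.2f}")
```

Coarse pooled statistics trivially separate these classes; the abstract's point is that the same kind of representation fails on the much finer spatial structure that distinguishes facial expressions.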
66
Tibon R, Levy DA. Striking a balance: analyzing unbalanced event-related potential data. Front Psychol 2015; 6:555. [PMID: 25983716 PMCID: PMC4416363 DOI: 10.3389/fpsyg.2015.00555] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2015] [Accepted: 04/16/2015] [Indexed: 11/13/2022] Open
Affiliation(s)
- Roni Tibon
- Baruch Ivcher School of Psychology and Sagol Unit for Applied Neuroscience, The Interdisciplinary Center, Herzliya, Israel; Cognition and Brain Sciences Unit, Medical Research Council, Cambridge, UK
- Daniel A Levy
- Baruch Ivcher School of Psychology and Sagol Unit for Applied Neuroscience, The Interdisciplinary Center Herzliya, Israel
67
Redies C. Combining universal beauty and cultural context in a unifying model of visual aesthetic experience. Front Hum Neurosci 2015; 9:218. [PMID: 25972799 PMCID: PMC4412058 DOI: 10.3389/fnhum.2015.00218] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2015] [Accepted: 04/07/2015] [Indexed: 12/21/2022] Open
Abstract
In this work, I propose a model of visual aesthetic experience that combines formalist and contextual aspects of aesthetics. The model distinguishes between two modes of processing. First, perceptual processing is based on the intrinsic form of an artwork, which may or may not be beautiful. If it is beautiful, a beauty-responsive mechanism is activated in the brain. This bottom-up mechanism is universal amongst humans; it is widespread in the visual brain and responsive across visual modalities. Second, cognitive processing is based on contextual information, such as the depicted content, the intentions of the artist or the circumstances of the presentation of the artwork. Cognitive processing is partially top-down and varies between individuals according to their cultural experience. Processing in the two channels is parallel and largely independent. In the general case, an aesthetic experience is induced if processing in both channels is favorable, i.e., if there is resonance in the perceptual processing channel ("aesthetics of perception"), and successful mastering in the cognitive processing channel ("aesthetics of cognition"). I speculate that this combinatorial mechanism has evolved to mediate social bonding between members of a (cultural) group of people. Primary emotions can be elicited via both channels and modulate the degree of the aesthetic experience. Two special cases are discussed. First, in a subset of (post-)modern art, beauty no longer plays a prominent role. Second, in some forms of abstract art, beautiful form can be enjoyed with minimal cognitive processing. The model is applied to examples of Western art. Finally, implications of the model are discussed. In summary, the proposed model resolves the seeming contradiction between formalist perceptual approaches to aesthetic experience, which are based on the intrinsic beauty of artworks, and contextual approaches, which account for highly individual and culturally dependent aspects of aesthetics.
Affiliation(s)
- Christoph Redies
- Experimental Aesthetics Group, Institute of Anatomy I, University of Jena School of Medicine, Jena University Hospital, Jena, Germany
68
What you see is what you expect: rapid scene understanding benefits from prior experience. Atten Percept Psychophys 2015; 77:1239-51. [DOI: 10.3758/s13414-015-0859-8] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]