51
Cheng A, Walther DB, Park S, Dilks DD. Concavity as a diagnostic feature of visual scenes. Neuroimage 2021; 232:117920. PMID: 33652147; PMCID: PMC8256888; DOI: 10.1016/j.neuroimage.2021.117920.
Abstract
Despite over two decades of research on the neural mechanisms underlying human visual scene, or place, processing, it remains unknown what exactly a “scene” is. Intuitively, we are always inside a scene, interacting with the outsides of objects. Hence, we hypothesize that one diagnostic feature of a scene may be concavity, portraying “inside”, and predict that if concavity is a scene-diagnostic feature, then: 1) images that depict concavity, even non-scene images (e.g., the “inside” of an object – or concave object), will be behaviorally categorized as scenes more often than those that depict convexity, and 2) the cortical scene-processing system will respond more to concave images than to convex images. As predicted, participants categorized concave objects as scenes more often than convex objects, and, using functional magnetic resonance imaging (fMRI), two scene-selective cortical regions (the parahippocampal place area, PPA, and the occipital place area, OPA) responded significantly more to concave than convex objects. Surprisingly, we found no behavioral or neural differences between images of concave versus convex buildings. However, in a follow-up experiment, using tightly-controlled images, we unmasked a selective sensitivity to concavity over convexity of scene boundaries (i.e., walls) in PPA and OPA. Furthermore, we found that even highly impoverished line drawings of concave shapes are behaviorally categorized as scenes more often than convex shapes. Together, these results provide converging behavioral and neural evidence that concavity is a diagnostic feature of visual scenes.
Affiliation(s)
- Annie Cheng: Department of Psychology, Emory University, Atlanta, GA 30322, USA
- Dirk B Walther: Department of Psychology, University of Toronto, Toronto, ON, Canada
- Soojin Park: Department of Psychology, Yonsei University, Seoul, Republic of Korea
- Daniel D Dilks: Department of Psychology, Emory University, Atlanta, GA 30322, USA
52
Herrera-Esposito D, Coen-Cagli R, Gomez-Sena L. Flexible contextual modulation of naturalistic texture perception in peripheral vision. J Vis 2021; 21:1. PMID: 33393962; PMCID: PMC7794279; DOI: 10.1167/jov.21.1.1.
Abstract
Peripheral vision comprises most of our visual field, and is essential in guiding visual behavior. The most influential theory of peripheral vision explains its characteristic capabilities and limitations, which distinguish it from foveal vision, as the product of representing the visual input using summary statistics. Despite its success, this account may provide a limited understanding of peripheral vision, because it neglects processes of perceptual grouping and segmentation. To test this hypothesis, we studied how contextual modulation, namely the modulation of the perception of a stimulus by its surrounds, interacts with segmentation in human peripheral vision. We used naturalistic textures, which are directly related to summary-statistics representations. We show that segmentation cues affect contextual modulation, and that this is not captured by our implementation of the summary-statistics model. We then characterize the effects of different texture statistics on contextual modulation, providing guidance for extending the model, as well as for probing neural mechanisms of peripheral vision.
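To make the summary-statistics idea in this abstract concrete, here is a minimal toy sketch (not the Portilla-Simoncelli texture model actually used in this literature; the `pooled_stats` helper and the pool size are illustrative assumptions): an image is reduced to local means and standard deviations per pooling region, so two images that differ only in the arrangement of pixels within a pool become indistinguishable ("metamers") under the representation.

```python
import numpy as np

def pooled_stats(img, pool=8):
    """Toy summary-statistics representation: keep only the mean and
    standard deviation of each non-overlapping pooling region,
    discarding the exact spatial arrangement within each pool."""
    h, w = img.shape
    stats = []
    for i in range(0, h - pool + 1, pool):
        for j in range(0, w - pool + 1, pool):
            patch = img[i:i + pool, j:j + pool]
            stats.append((patch.mean(), patch.std()))
    return np.array(stats)  # shape: (n_pools, 2)

rng = np.random.default_rng(1)
texture = rng.random((32, 32))

# Scramble pixels WITHIN each pool: a physically different image...
scrambled = texture.copy()
for i in range(0, 32, 8):
    for j in range(0, 32, 8):
        block = scrambled[i:i + 8, j:j + 8].flatten()
        rng.shuffle(block)
        scrambled[i:i + 8, j:j + 8] = block.reshape(8, 8)

# ...yet its summary-statistics representation is unchanged.
print(np.allclose(pooled_stats(texture), pooled_stats(scrambled)))
```

The segmentation effects reported in the paper are precisely what such a pooling-only scheme, by construction, cannot capture.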
Affiliation(s)
- Daniel Herrera-Esposito: Laboratorio de Neurociencias, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- Ruben Coen-Cagli: Department of Systems and Computational Biology and Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
- Leonel Gomez-Sena: Laboratorio de Neurociencias, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
53
Lee SM, Jin SW, Park SB, Park EH, Lee CH, Lee HW, Lim HY, Yoo SW, Ahn JR, Shin J, Lee SA, Lee I. Goal-directed interaction of stimulus and task demand in the parahippocampal region. Hippocampus 2021; 31:717-736. PMID: 33394547; PMCID: PMC8359334; DOI: 10.1002/hipo.23295.
Abstract
The hippocampus and parahippocampal region are essential for representing episodic memories involving various spatial locations and objects, and for using those memories for future adaptive behavior. The “dual-stream model” was initially formulated based on anatomical characteristics of the medial temporal lobe, dividing the parahippocampal region into two streams that separately process and relay spatial and nonspatial information to the hippocampus. Despite its significance, the dual-stream model in its original form cannot explain recent experimental results, and many researchers have recognized the need for a modification of the model. Here, we argue that dividing the parahippocampal region into spatial and nonspatial streams a priori may be too simplistic, particularly in light of ambiguous situations in which a sensory cue alone (e.g., visual scene) may not allow such a definitive categorization. Upon reviewing evidence, including our own, that reveals the importance of goal-directed behavioral responses in determining the relative involvement of the parahippocampal processing streams, we propose the Goal-directed Interaction of Stimulus and Task-demand (GIST) model. In the GIST model, input stimuli such as visual scenes and objects are first processed by both the postrhinal and perirhinal cortices—the postrhinal cortex more heavily involved with visual scenes and perirhinal cortex with objects—with relatively little dependence on behavioral task demand. However, once perceptual ambiguities are resolved and the scenes and objects are identified and recognized, the information is then processed through the medial or lateral entorhinal cortex, depending on whether it is used to fulfill navigational or non-navigational goals, respectively. As complex sensory stimuli are utilized for both navigational and non-navigational purposes in an intermixed fashion in naturalistic settings, the hippocampus may be required to then put together these experiences into a coherent map to allow flexible cognitive operations for adaptive behavior to occur.
Affiliation(s)
- Su-Min Lee: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Seung-Woo Jin: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Seong-Beom Park: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Eun-Hye Park: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Choong-Hee Lee: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Hyun-Woo Lee: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Heung-Yeol Lim: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Seung-Woo Yoo: Department of Biomedical Science, Charles E. Schmidt College of Medicine, Brain Institute, Florida Atlantic University, Jupiter, Florida, USA
- Jae Rong Ahn: Department of Biology, Tufts University, Medford, Massachusetts, USA
- Jhoseph Shin: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
- Sang Ah Lee: Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
- Inah Lee: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, South Korea
54
Gamal M, Mounier E, Eldawlatly S. On the Extraction of High-Level Visual Features from Lateral Geniculate Nucleus Activity: A Rat Study. Brain Inform 2021. DOI: 10.1007/978-3-030-86993-9_4.
55
Abstract
The accurate perception of human crowds is integral to social understanding and interaction. Previous studies have shown that observers are sensitive to several crowd characteristics such as average facial expression, gender, identity, joint attention, and heading direction. In two experiments, we examined ensemble perception of crowd speed using standard point-light walkers (PLW). Participants were asked to estimate the average speed of a crowd consisting of 12 figures moving at different speeds. In Experiment 1, trials of intact PLWs alternated with trials of scrambled PLWs with a viewing duration of 3 seconds. We found that ensemble processing of crowd speed could rely on local motion alone, although a globally intact configuration enhanced performance. In Experiment 2, observers estimated the average speed of intact-PLW crowds that were displayed at reduced viewing durations across five blocks of trials (between 2500 ms and 500 ms). Estimation of fast crowds was precise and accurate regardless of viewing duration, and we estimated that three to four walkers could still be integrated at 500 ms. For slow crowds, we found a systematic deterioration in performance as viewing time reduced, and performance at 500 ms could not be distinguished from a single-walker response strategy. Overall, our results suggest that rapid and accurate ensemble perception of crowd speed is possible, although sensitive to the precise speed range examined.
56
Finlayson NJ, Neacsu V, Schwarzkopf DS. Spatial Heterogeneity in Bistable Figure-Ground Perception. i-Perception 2020; 11:2041669520961120. PMID: 33194167; PMCID: PMC7594238; DOI: 10.1177/2041669520961120.
Abstract
The appearance of visual objects varies substantially across the visual field. Could such spatial heterogeneity be due to undersampling of the visual field by neurons selective for stimulus categories? Here, we show that which parts of a bistable vase-face image observers perceive as figure and ground depends on the retinal location where the image appears. The spatial patterns of these perceptual biases were similar regardless of whether the images were upright or inverted. Undersampling by neurons tuned to an object class (e.g., faces) or variability in general local versus global processing cannot readily explain this spatial heterogeneity. Rather, these biases could result from idiosyncrasies in low-level sensitivity across the visual field.
Affiliation(s)
- Nonie J. Finlayson: Department of Experimental Psychology, University College London, London, UK; School of Optometry & Vision Science, University of Auckland, Auckland, New Zealand
- Victorita Neacsu: Department of Experimental Psychology, University College London, London, UK; School of Optometry & Vision Science, University of Auckland, Auckland, New Zealand
- D. S. Schwarzkopf: Department of Experimental Psychology, University College London, London, UK; School of Optometry & Vision Science, University of Auckland, Auckland, New Zealand
57
Rigby SN, Jakobson LS, Pearson PM, Stoesz BM. Alexithymia and the Evaluation of Emotionally Valenced Scenes. Front Psychol 2020; 11:1820. PMID: 32793083; PMCID: PMC7394003; DOI: 10.3389/fpsyg.2020.01820.
Abstract
Alexithymia is a personality trait characterized by difficulties identifying and describing feelings (DIF and DDF) and an externally oriented thinking (EOT) style. The primary aim of the present study was to investigate links between alexithymia and the evaluation of emotional scenes. We also investigated whether viewers' evaluations of emotional scenes were better predicted by specific alexithymic traits or by individual differences in sensory processing sensitivity (SPS). Participants (N = 106) completed measures of alexithymia and SPS along with a task requiring speeded judgments of the pleasantness of 120 moderately arousing scenes. We did not replicate laterality effects previously described with the scene perception task. Compared to those with weak alexithymic traits, individuals with moderate-to-strong alexithymic traits were less likely to classify positively valenced scenes as pleasant and were less likely to classify scenes with (vs. without) implied motion (IM) in a way that was consistent with normative scene valence ratings. In addition, regression analyses confirmed that reporting strong EOT and a tendency to be easily overwhelmed by busy sensory environments negatively predicted classification accuracy for positive scenes, and that both DDF and EOT negatively predicted classification accuracy for scenes depicting IM. These findings highlight the importance of accounting for stimulus characteristics and individual differences in specific traits associated with alexithymia and SPS when investigating the processing of emotional stimuli. Learning more about the links between these individual difference variables may have significant clinical implications, given that alexithymia is an important, transdiagnostic risk factor for a wide range of psychopathologies.
Affiliation(s)
- Sarah N Rigby: Department of Psychology, University of Manitoba, Winnipeg, MB, Canada
- Lorna S Jakobson: Department of Psychology, University of Manitoba, Winnipeg, MB, Canada
- Pauline M Pearson: Department of Psychology, University of Manitoba, Winnipeg, MB, Canada; Department of Psychology, University of Winnipeg, Winnipeg, MB, Canada
- Brenda M Stoesz: Department of Psychology, University of Manitoba, Winnipeg, MB, Canada; Centre for the Advancement of Teaching and Learning, University of Manitoba, Winnipeg, MB, Canada
58
Disentangling the Independent Contributions of Visual and Conceptual Features to the Spatiotemporal Dynamics of Scene Categorization. J Neurosci 2020; 40:5283-5299. PMID: 32467356; DOI: 10.1523/jneurosci.2088-19.2020.
Abstract
Human scene categorization is characterized by its remarkable speed. While many visual and conceptual features have been linked to this ability, significant correlations exist between feature spaces, impeding our ability to determine their relative contributions to scene categorization. Here, we used a whitening transformation to decorrelate a variety of visual and conceptual features and assess the time course of their unique contributions to scene categorization. Participants (both sexes) viewed 2250 full-color scene images drawn from 30 different scene categories while having their brain activity measured through 256-channel EEG. We examined the variance explained at each electrode and time point of visual event-related potential (vERP) data from nine different whitened encoding models. These ranged from low-level features obtained from filter outputs to high-level conceptual features requiring human annotation. The amount of category information in the vERPs was assessed through multivariate decoding methods. Behavioral similarity measures were obtained in separate crowdsourced experiments. We found that all nine models together contributed 78% of the variance of human scene similarity assessments and were within the noise ceiling of the vERP data. Low-level models explained earlier vERP variability (88 ms after image onset), whereas high-level models explained later variance (169 ms). Critically, only high-level models shared vERP variability with behavior. Together, these results suggest that scene categorization is primarily a high-level process, but reliant on previously extracted low-level features.

SIGNIFICANCE STATEMENT: In a single fixation, we glean enough information to describe a general scene category. Many types of features are associated with scene categories, ranging from low-level properties, such as colors and contours, to high-level properties, such as objects and attributes. Because these properties are correlated, it is difficult to understand each property's unique contributions to scene categorization. This work uses a whitening transformation to remove the correlations between features and examines the extent to which each feature contributes to visual event-related potentials over time. We found that low-level visual features contributed first but were not correlated with categorization behavior. High-level features followed 80 ms later, providing key insights into how the brain makes sense of a complex visual world.
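The whitening step this abstract relies on can be illustrated with a small sketch (a generic ZCA-style whitening of a feature matrix; the `whiten` helper is a hypothetical name and the paper's actual pipeline is more involved): after the transform, the feature dimensions are decorrelated with unit variance, which is what allows each feature's unique contribution to be assessed.

```python
import numpy as np

def whiten(F, eps=1e-8):
    """ZCA-whiten a feature matrix F (n_samples x n_features) so that
    the resulting columns are decorrelated with unit variance."""
    Fc = F - F.mean(axis=0)                  # center each feature
    cov = Fc.T @ Fc / (len(Fc) - 1)          # feature covariance
    vals, vecs = np.linalg.eigh(cov)         # eigendecomposition
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return Fc @ W

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[:, 1] += 0.9 * X[:, 0]                     # introduce a correlation
Xw = whiten(X)
C = np.cov(Xw, rowvar=False)                 # near-identity covariance
print(np.allclose(C, np.eye(4), atol=1e-2))
```

With correlated predictors removed, variance explained by each whitened model can be attributed to that feature space alone rather than to its overlap with the others.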
59
Hung SM, Wu DA, Shimojo S. Task-induced attention load guides and gates unconscious semantic interference. Nat Commun 2020; 11:2088. PMID: 32350246; PMCID: PMC7190740; DOI: 10.1038/s41467-020-15439-x.
Abstract
The tight relationship between attention and conscious perception has been extensively researched in the past decades. However, whether attentional modulation extends to unconscious processes remains largely unknown, particularly when it comes to abstract and high-level processing. Here we use a double Stroop paradigm to demonstrate that attention load gates unconscious semantic processing. We find that word and color incongruencies between a subliminal prime and a supraliminal target cause slower responses to non-Stroop target words, but only if the task is to name the target word (low-load task), and not if the task is to name the target's color (high-load task). The task load hypothesis is confirmed by showing that the word-induced incongruence effect can be detected in the color-naming task, but only in the late, practiced trials. We further replicate this task-induced attentional modulation phenomenon in separate experiments with colorless words (word-only) and words with semantic relationship but no orthographic similarities (semantics-only).
Affiliation(s)
- Shao-Min Hung: Biology and Biological Engineering, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA, 91125, USA; Huntington Medical Research Institutes, 686 South Fair Oaks Avenue, Pasadena, CA, 91105, USA
- Daw-An Wu: Biology and Biological Engineering, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA, 91125, USA
- Shinsuke Shimojo: Biology and Biological Engineering, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA, 91125, USA; Computation and Neural Systems, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA, 91125, USA
60
Valdés-Sosa M, Ontivero-Ortega M, Iglesias-Fuster J, Lage-Castellanos A, Gong J, Luo C, Castro-Laguardia AM, Bobes MA, Marinazzo D, Yao D. Objects seen as scenes: Neural circuitry for attending whole or parts. Neuroimage 2020; 210:116526. DOI: 10.1016/j.neuroimage.2020.116526.
61
Harel A, Mzozoyana MW, Al Zoubi H, Nador JD, Noesen BT, Lowe MX, Cant JS. Artificially-generated scenes demonstrate the importance of global scene properties for scene perception. Neuropsychologia 2020; 141:107434. PMID: 32179102; DOI: 10.1016/j.neuropsychologia.2020.107434.
Abstract
Recent electrophysiological research highlights the significance of global scene properties (GSPs) for scene perception. However, since real-world scenes span a range of low-level stimulus properties and high-level contextual semantics, GSP effects may also reflect additional processing of such non-global factors. We examined this question by asking whether Event-Related Potentials (ERPs) to GSPs will still be observed when specific low- and high-level scene properties are absent from the scene. We presented participants with computer-based artificially-manipulated scenes varying in two GSPs (spatial expanse and naturalness) which minimized other sources of scene information (color and semantic object detail). We found that the peak amplitude of the P2 component was sensitive to the spatial expanse and naturalness of the artificially-generated scenes: P2 amplitude was higher to closed than open scenes, and in response to manmade than natural scenes. A control experiment showed that the effect of Naturalness on the P2 is not driven by local texture information, while earlier effects of naturalness, expressed as a modulation of the P1 and N1 amplitudes, are sensitive to texture information. Our results demonstrate that GSPs are processed robustly around 220 ms and that P2 can be used as an index of global scene perception.
Affiliation(s)
- Assaf Harel: Department of Psychology, Wright State University, Dayton, OH, USA
- Mavuso W Mzozoyana: Department of Neuroscience, Cell Biology and Physiology, Wright State University, Dayton, OH, USA
- Hamada Al Zoubi: Department of Neuroscience, Cell Biology and Physiology, Wright State University, Dayton, OH, USA
- Jeffrey D Nador: Department of Psychology, Wright State University, Dayton, OH, USA
- Birken T Noesen: Department of Psychology, Wright State University, Dayton, OH, USA
- Matthew X Lowe: Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jonathan S Cant: Department of Psychology, University of Toronto Scarborough, Toronto, ON, Canada
62
Functional Imaging of Visuospatial Attention in Complex and Naturalistic Conditions. Curr Top Behav Neurosci 2020. PMID: 30547430; DOI: 10.1007/7854_2018_73.
Abstract
One of the ultimate goals of cognitive neuroscience is to understand how the brain works in the real world. Functional imaging with naturalistic stimuli provides us with the opportunity to study the brain in situations similar to everyday life. This includes the processing of complex stimuli that can trigger many types of signals related both to the physical characteristics of the external input and to the internal knowledge that we have about natural objects and environments. In this chapter, I will first outline different types of stimuli that have been used in naturalistic imaging studies. These include static pictures, short video clips, full-length movies, and virtual reality, each with specific advantages and disadvantages. Next, I will turn to the main issue of visual-spatial orienting in naturalistic conditions and its neural substrates. I will discuss different classes of internal signals, related to objects, scene structure, and long-term memory. All of these, together with external signals about stimulus salience, have been found to modulate the activity and the connectivity of the frontoparietal attention networks. I will conclude by pointing out some promising future directions for functional imaging with naturalistic stimuli. Although this field of research is still in its early days, I believe it will play a major role in bridging the gap between standard laboratory paradigms and mechanisms of brain functioning in the real world.
63
64
King ML, Groen IIA, Steel A, Kravitz DJ, Baker CI. Similarity judgments and cortical visual responses reflect different properties of object and scene categories in naturalistic images. Neuroimage 2019; 197:368-382. PMID: 31054350; PMCID: PMC6591094; DOI: 10.1016/j.neuroimage.2019.04.079.
Abstract
Numerous factors have been reported to underlie the representation of complex images in high-level human visual cortex, including categories (e.g. faces, objects, scenes), animacy, and real-world size, but the extent to which this organization reflects behavioral judgments of real-world stimuli is unclear. Here, we compared representations derived from explicit behavioral similarity judgments and ultra-high field (7T) fMRI of human visual cortex for multiple exemplars of a diverse set of naturalistic images from 48 object and scene categories. While there was a significant correlation between similarity judgments and fMRI responses, there were striking differences between the two representational spaces. Behavioral judgments primarily revealed a coarse division between man-made (including humans) and natural (including animals) images, with clear groupings of conceptually-related categories (e.g. transportation, animals), while these conceptual groupings were largely absent in the fMRI representations. Instead, fMRI responses primarily seemed to reflect a separation of both human and non-human faces/bodies from all other categories. Further, comparison of the behavioral and fMRI representational spaces with those derived from the layers of a deep neural network (DNN) showed a strong correspondence with behavior in the top-most layer and with fMRI in the mid-level layers. These results suggest a complex relationship between localized responses in high-level visual cortex and behavioral similarity judgments: each domain reflects different properties of the images, and responses in high-level visual cortex may correspond to intermediate stages of processing between basic visual features and the conceptual categories that dominate the behavioral response.
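Comparisons between behavioral and brain-derived representational spaces, as described above, are commonly carried out with representational similarity analysis; here is a minimal sketch on toy data (the `rdm` helper and the simulated "behavior" and "brain" matrices are illustrative assumptions, not the study's data or pipeline): each space is summarized as a representational dissimilarity matrix (RDM) over conditions, and the two RDMs are then correlated.

```python
import numpy as np

def rdm(X):
    """Condensed representational dissimilarity vector:
    1 - Pearson correlation between each pair of condition patterns."""
    C = np.corrcoef(X)                       # condition-by-condition correlations
    iu = np.triu_indices(len(X), k=1)        # upper triangle, no diagonal
    return 1.0 - C[iu]

rng = np.random.default_rng(0)
behavior = rng.normal(size=(10, 5))          # 10 conditions x 5 judgment dims
# Simulated "brain" responses: a noisy linear readout of the same structure.
brain = behavior @ rng.normal(size=(5, 20)) + 0.5 * rng.normal(size=(10, 20))

# Second-order similarity: correlate the two dissimilarity structures.
r = np.corrcoef(rdm(behavior), rdm(brain))[0, 1]
print(r > 0)
```

In the toy case the two geometries agree by construction; the paper's point is that for real judgments and real fMRI responses this second-order correlation, while significant, leaves striking differences between the spaces.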
Affiliation(s)
- Marcie L King: Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA; Department of Psychological and Brain Sciences, University of Iowa, W311 Seashore Hall, Iowa City, IA, 52242, USA
- Iris I A Groen: Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA; Department of Psychology, New York University, 6 Washington Place, New York, NY, 10003, USA
- Adam Steel: Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
- Dwight J Kravitz: Department of Psychology, George Washington University, 2125 G St. NW, Washington, DC, 20008, USA
- Chris I Baker: Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, 20892, USA
65
Abstract
Humans are remarkably adept at perceiving and understanding complex real-world scenes. Uncovering the neural basis of this ability is an important goal of vision science. Neuroimaging studies have identified three cortical regions that respond selectively to scenes: parahippocampal place area, retrosplenial complex/medial place area, and occipital place area. Here, we review what is known about the visual and functional properties of these brain areas. Scene-selective regions exhibit retinotopic properties and sensitivity to low-level visual features that are characteristic of scenes. They also mediate higher-level representations of layout, objects, and surface properties that allow individual scenes to be recognized and their spatial structure ascertained. Challenges for the future include developing computational models of information processing in scene regions, investigating how these regions support scene perception under ecologically realistic conditions, and understanding how they operate in the context of larger brain networks.
Affiliation(s)
- Russell A Epstein: Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Chris I Baker: Section on Learning and Plasticity, Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, Maryland 20892, USA
66
Grassini S, Valli K, Souchet J, Aubret F, Segurini GV, Revonsuo A, Koivisto M. Pattern matters: Snakes exhibiting triangular and diamond-shaped skin patterns modulate electrophysiological activity in human visual cortex. Neuropsychologia 2019; 131:62-72. PMID: 31153966; DOI: 10.1016/j.neuropsychologia.2019.05.024.
Abstract
The neural and perceptual mechanisms that support the efficient visual detection of snakes in humans are still not fully understood. According to the Snake Detection Theory, selection pressures posed by snakes on early primates have shaped the development of the visual system. Previous studies in humans have investigated early visual electrophysiological activity in response to snake images vs. various alternative dangerous or non-dangerous stimuli. These studies have shown that the Early Posterior Negativity (EPN) component is selectively elicited by snake or snake-like images. Recent findings yielded the complementary/alternative hypothesis that early humans (and possibly other primates) evolved an aversion especially for potentially harmful triangular shapes, such as teeth, claws or spikes. In the present study we investigated the effect of triangular and diamond-shaped patterns in snake skins on the ERP correlates of visual processing in humans. In the first experiment, we employed pictures of snakes displaying either triangular/diamond-shaped patterns or no particular pattern on their skins, and pictures of frogs as control. Participants observed a random visual presentation of these pictures. Consistent with previous studies, snakes elicited an enhanced negativity between 225 and 300 ms (EPN) compared to frogs. However, snakes featuring triangular/diamond-shaped patterns on their skin produced an enhanced EPN compared to the snakes that did not display such patterns. In a second experiment we used pictures displaying only skin patterns of snakes and frogs. Results from the second experiment confirmed the results of the first experiment, suggesting that triangular snake-skin patterns modulate the activity in human visual cortex. Taken together, our results constitute an important contribution to the snake detection theory.
Affiliation(s)
- Simone Grassini: Department of Psychology, University of Turku, 20014, Finland
- Katja Valli: Department of Psychology, University of Turku, 20014, Finland; Department of Cognitive Neuroscience and Philosophy, School of Bioscience, University of Skövde, 54128, Sweden
- Jérémie Souchet: Station d'Ecologie Théorique et Expérimentale du CNRS, 2 Route du CNRS, 09200, Moulis, France
- Fabien Aubret: Station d'Ecologie Théorique et Expérimentale du CNRS, 2 Route du CNRS, 09200, Moulis, France
- Antti Revonsuo: Department of Psychology, University of Turku, 20014, Finland; Department of Cognitive Neuroscience and Philosophy, School of Bioscience, University of Skövde, 54128, Sweden
- Mika Koivisto: Department of Psychology, University of Turku, 20014, Finland
67
Heissl A, Betancourt AJ, Hermann P, Povysil G, Arbeithuber B, Futschik A, Ebner T, Tiemann-Boege I. The impact of poly-A microsatellite heterologies in meiotic recombination. Life Sci Alliance 2019; 2:e201900364. PMID: 31023833. PMCID: PMC6485458. DOI: 10.26508/lsa.201900364.
Abstract
Meiosis strongly influences the transmission and evolution of heterozygous poly-A repeats, as measured experimentally in a large collection of single recombination products in a human hotspot. Meiotic recombination has strong, but poorly understood, effects on short tandem repeat (STR) instability. Here, we screened thousands of single recombinant products with sperm typing to characterize the role of polymorphic poly-A repeats at a human recombination hotspot in terms of hotspot activity and STR evolution. We show that the length asymmetry between heterozygous poly-A’s strongly influences the recombination outcome: a heterology of 10 A’s (9A/19A) reduces the number of crossovers and elevates the frequency of non-crossovers, complex recombination products, and long conversion tracts. Moreover, the length of the heterology also influences STR transmission during meiotic repair, with a strong and significant insertion bias for the short heterology (6A/7A) and a deletion bias for the long heterology (9A/19A). In spite of this opposing insertion-/deletion-biased gene conversion, we find that poly-A’s are enriched at human recombination hotspots, which could have important consequences for hotspot activation.
Affiliation(s)
- Angelika Heissl: Institute of Biophysics, Johannes Kepler University, Linz, Austria
- Philipp Hermann: Institute of Applied Statistics, Johannes Kepler University, Linz, Austria
- Gundula Povysil: Institute of Bioinformatics, Johannes Kepler University, Linz, Austria
- Andreas Futschik: Institute of Applied Statistics, Johannes Kepler University, Linz, Austria
- Thomas Ebner: Department of Gynecology, Obstetrics and Gynecological Endocrinology, Kepler University Clinic, Linz, Austria
68
Nag S, Berman D, Golomb JD. Category-selective areas in human visual cortex exhibit preferences for stimulus depth. Neuroimage 2019; 196:289-301. PMID: 30978498. DOI: 10.1016/j.neuroimage.2019.04.025.
Abstract
Multiple regions in the human brain are dedicated to accomplish the feat of object recognition; yet our brains must also compute the 2D and 3D locations of the objects we encounter in order to make sense of our visual environments. A number of studies have explored how various object category-selective regions are sensitive to and have preferences for specific 2D spatial locations in addition to processing their preferred-stimulus categories, but there is no survey of how these regions respond to depth information. In a blocked functional MRI experiment, subjects viewed a series of category-specific (i.e., faces, objects, scenes) and unspecific (e.g., random moving dots) stimuli with red/green anaglyph glasses. Critically, these stimuli were presented at different depth planes such that they appeared in front of, behind, or at the same (i.e., middle) depth plane as the fixation point (Experiment 1) or simultaneously in front of and behind fixation (i.e., mixed depth; Experiment 2). Comparisons of mean response magnitudes between back, middle, and front depth planes reveal that face and object regions OFA and LOC exhibit a preference for front depths, and motion area MT+ exhibits a strong linear preference for front, followed by middle, followed by back depth planes. In contrast, scene-selective regions PPA and OPA prefer front and/or back depth planes (relative to middle). Moreover, the occipital place area demonstrates a strong preference for "mixed" depth above and beyond back alone, raising potential implications about its particular role in scene perception. Crucially, the observed depth preferences in nearly all areas were evoked irrespective of the semantic stimulus category being viewed. These results reveal that the object category-selective regions may play a role in processing or incorporating depth information that is orthogonal to their primary processing of object category information.
Affiliation(s)
- Samoni Nag: Department of Psychology, Center for Cognitive & Brain Sciences, The Ohio State University, USA; Department of Psychology, The George Washington University, USA
- Daniel Berman: Department of Psychology, Center for Cognitive & Brain Sciences, The Ohio State University, USA
- Julie D Golomb: Department of Psychology, Center for Cognitive & Brain Sciences, The Ohio State University, USA
69
De Cesarei A, Cavicchi S, Micucci A, Codispoti M. Categorization Goals Modulate the Use of Natural Scene Statistics. J Cogn Neurosci 2019; 31:109-125. DOI: 10.1162/jocn_a_01333.
Abstract
Understanding natural scenes involves the contribution of bottom-up analysis and top-down modulatory processes. However, the interaction of these processes during the categorization of natural scenes is not well understood. In the current study, we approached this issue using ERPs together with behavioral and computational data. We presented pictures of natural scenes and asked participants to categorize them in response to different questions (Is it an animal/vehicle? Is it indoors/outdoors? Are there one/two foreground elements?). ERPs for target scenes requiring a “yes” response began to differ from those of nontarget scenes beginning at 250 msec from picture onset, and this ERP difference was unmodulated by the categorization questions. Earlier ERPs showed category-specific differences (e.g., between animals and vehicles), which were associated with the processing of scene statistics. From 180 msec after scene onset, these category-specific ERP differences were modulated by the categorization question that was asked. Categorization goals thus modulate not only later stages associated with the target/nontarget decision but also earlier perceptual stages involved in the processing of scene statistics.
70
Greene MR. The information content of scene categories. Psychology of Learning and Motivation 2019. DOI: 10.1016/bs.plm.2019.03.004.
71
Rosenfield KA, Semple S, Georgiev AV, Maestripieri D, Higham JP, Dubuc C. Experimental evidence that female rhesus macaques (Macaca mulatta) perceive variation in male facial masculinity. R Soc Open Sci 2019; 6:181415. PMID: 30800385. PMCID: PMC6366174. DOI: 10.1098/rsos.181415.
Abstract
Among many primate species, face shape is sexually dimorphic, and male facial masculinity has been proposed to influence female mate choice and male-male competition by signalling competitive ability. However, whether conspecifics pay attention to facial masculinity has only been assessed in humans. In a study of free-ranging rhesus macaques, Macaca mulatta, we used a two-alternative look-time experiment to test whether females perceive male facial masculinity. We presented 107 females with pairs of images of male faces, one with a more masculine shape and one more feminine, and recorded their looking behaviour. Females looked at the masculine face longer than at the feminine face in more trials than predicted by chance. Although there was no overall difference in average look-time between masculine and feminine faces across all trials, females looked significantly longer at masculine faces in a subset of trials for which the within-pair difference in masculinity was most pronounced. Additionally, the proportion of time subjects looked toward the masculine face increased as the within-pair difference in masculinity increased. This study provides evidence that female macaques perceive variation in male facial shape, a necessary condition for intersexual selection to operate on such a trait. It also highlights the potential impact of perceptual thresholds on look-time experiments.
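The trial-count result ("more trials than predicted by chance") is in essence a sign test on per-trial look-time differences. Below is a minimal sketch using an exact one-sided binomial computation; the counts are hypothetical illustrations, not the study's data.

```python
from math import comb

def sign_test_p(n_success, n_trials, p=0.5):
    """Exact one-sided binomial test: probability of observing at
    least n_success 'masculine-longer' trials if looking were at
    chance (p = 0.5)."""
    return sum(comb(n_trials, k) * p**k * (1 - p)**(n_trials - k)
               for k in range(n_success, n_trials + 1))

# Hypothetical counts for illustration: 65 of 100 valid trials in
# which the subject looked longer at the masculine face.
p_value = sign_test_p(65, 100)
```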
Affiliation(s)
- Kevin A. Rosenfield: Centre for Research in Evolutionary, Social and Interdisciplinary Anthropology, University of Roehampton, Holybourne Avenue, London SW15 4JD, UK; Department of Anthropology, Pennsylvania State University, 409 Carpenter Building, University Park, PA 16802, USA
- Stuart Semple: Centre for Research in Evolutionary, Social and Interdisciplinary Anthropology, University of Roehampton, Holybourne Avenue, London SW15 4JD, UK
- Alexander V. Georgiev: School of Natural Sciences, Bangor University, Bangor, Gwynedd LL57 2UW, UK; Institute for Mind and Biology, The University of Chicago, 940 East 57th Street, Chicago, IL 60637, USA
- Dario Maestripieri: Institute for Mind and Biology, The University of Chicago, 940 East 57th Street, Chicago, IL 60637, USA
- James P. Higham: Department of Anthropology, New York University, 25 Waverly Place, New York, NY 10003, USA
- Constance Dubuc: Department of Anthropology, New York University, 25 Waverly Place, New York, NY 10003, USA; Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
72
Corradi G, Rosselló-Mir J, Vañó J, Chuquichambi E, Bertamini M, Munar E. The effects of presentation time on preference for curvature of real objects and meaningless novel patterns. Br J Psychol 2018; 110:670-685. PMID: 30536967. DOI: 10.1111/bjop.12367.
Abstract
Objects with curved contours are generally preferred to sharp-angled ones. In this study, we aimed to determine whether different presentation times influence this preference. We used images of real objects (experiment 1) and meaningless novel patterns (experiment 2). Participants had to select one of two images from a contour pair, curved and sharp-angled versions of the same object/pattern. With real objects, the preference for curved versions was greatest when presented for 84 ms, and it faded when participants were given unlimited viewing time. Curved meaningless patterns were preferred when presented for 84 and 150 ms. However, in contrast to real objects, preference for meaningless patterns increased significantly in the unlimited viewing time condition. Participants discriminated poorly between the two versions (curved and sharp-angled) of the meaningless patterns in the 84- and 150-ms presentations (experiment 3). Therefore, at short presentation times, participants mostly selected the curved version of the meaningless patterns without being aware of the difference. In conclusion, presentation time, type of stimulus, and their interaction influence preference for curvature.
Affiliation(s)
- Guido Corradi: Human Evolution and Cognition Group (EvoCog), University of the Balearic Islands and IFISC, Associated Unit to CSIC, Palma de Mallorca, Spain
- Jaume Rosselló-Mir: Human Evolution and Cognition Group (EvoCog), University of the Balearic Islands and IFISC, Associated Unit to CSIC, Palma de Mallorca, Spain
- Javier Vañó: Human Evolution and Cognition Group (EvoCog), University of the Balearic Islands and IFISC, Associated Unit to CSIC, Palma de Mallorca, Spain
- Erick Chuquichambi: Human Evolution and Cognition Group (EvoCog), University of the Balearic Islands and IFISC, Associated Unit to CSIC, Palma de Mallorca, Spain
- Marco Bertamini: Department of Psychological Sciences, University of Liverpool, UK
- Enric Munar: Human Evolution and Cognition Group (EvoCog), University of the Balearic Islands and IFISC, Associated Unit to CSIC, Palma de Mallorca, Spain
73
Groen IIA, Jahfari S, Seijdel N, Ghebreab S, Lamme VAF, Scholte HS. Scene complexity modulates degree of feedback activity during object detection in natural scenes. PLoS Comput Biol 2018; 14:e1006690. PMID: 30596644. PMCID: PMC6329519. DOI: 10.1371/journal.pcbi.1006690.
Abstract
Selective brain responses to objects arise within a few hundred milliseconds of neural processing, suggesting that visual object recognition is mediated by rapid feed-forward activations. Yet disruption of neural responses in early visual cortex beyond feed-forward processing stages affects object recognition performance. Here, we unite these discrepant findings by reporting that object recognition involves enhanced feedback activity (recurrent processing within early visual cortex) when target objects are embedded in natural scenes that are characterized by high complexity. Human participants performed an animal target detection task on natural scenes with low, medium or high complexity as determined by a computational model of low-level contrast statistics. Three converging lines of evidence indicate that feedback was selectively enhanced for high complexity scenes. First, functional magnetic resonance imaging (fMRI) activity in early visual cortex (V1) was enhanced for target objects in scenes with high, but not low or medium complexity. Second, event-related potentials (ERPs) evoked by target objects were selectively enhanced at feedback stages of visual processing (from ~220 ms onwards) for high complexity scenes only. Third, behavioral performance for high complexity scenes deteriorated when participants were pressed for time and thus less able to incorporate the feedback activity. Modeling of the reaction time distributions using drift diffusion revealed that object information accumulated more slowly for high complexity scenes, with evidence accumulation being coupled to trial-to-trial variation in the EEG feedback response. Together, these results suggest that while feed-forward activity may suffice to recognize isolated objects, the brain employs recurrent processing more adaptively in naturalistic settings, using minimal feedback for simple scenes and increasing feedback for complex scenes.
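The drift-diffusion account of the reaction-time results can be sketched by simulating a one-boundary accumulator in which scene complexity lowers the drift rate, so evidence accumulates more slowly and RTs lengthen. All parameters below are illustrative choices, not the fitted values from the paper.

```python
import numpy as np

def simulate_ddm(drift, threshold=1.0, noise=1.0, dt=0.005,
                 n_trials=300, seed=0):
    """First-passage times (in seconds) of a one-boundary
    drift-diffusion process: evidence accumulates at rate `drift`
    plus Gaussian noise until it crosses `threshold`."""
    rng = np.random.default_rng(seed)
    rts = np.empty(n_trials)
    for i in range(n_trials):
        x, t = 0.0, 0.0
        while x < threshold:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts[i] = t
    return rts

# Illustrative drift rates: higher scene complexity -> lower drift.
rt_simple = simulate_ddm(drift=2.0, seed=1)    # low-complexity scenes
rt_complex = simulate_ddm(drift=1.0, seed=2)   # high-complexity scenes
```

With these settings the mean RT for the low-drift (complex-scene) condition comes out longer, mirroring the slower evidence accumulation reported in the abstract.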
Affiliation(s)
- Iris I. A. Groen: New York University, Department of Psychology, New York, New York, United States of America
- Sara Jahfari: Spinoza Centre for Neuroimaging, Royal Netherlands Academy of Arts and Sciences (KNAW), Amsterdam, The Netherlands; University of Amsterdam, Department of Psychology, Section Brain and Cognition, Amsterdam, The Netherlands
- Noor Seijdel: University of Amsterdam, Department of Psychology, Section Brain and Cognition, Amsterdam, The Netherlands
- Sennay Ghebreab: University of Amsterdam, Department of Psychology, Section Brain and Cognition, Amsterdam, The Netherlands; University of Amsterdam, Department of Informatics, Intelligent Systems Lab, Amsterdam, The Netherlands
- Victor A. F. Lamme: University of Amsterdam, Department of Psychology, Section Brain and Cognition, Amsterdam, The Netherlands
- H. Steven Scholte: University of Amsterdam, Department of Psychology, Section Brain and Cognition, Amsterdam, The Netherlands
74
Dima DC, Perry G, Singh KD. Spatial frequency supports the emergence of categorical representations in visual cortex during natural scene perception. Neuroimage 2018; 179:102-116. PMID: 29902586. PMCID: PMC6057270. DOI: 10.1016/j.neuroimage.2018.06.033.
Abstract
In navigating our environment, we rapidly process and extract meaning from visual cues. However, the relationship between visual features and categorical representations in natural scene perception is still not well understood. Here, we used natural scene stimuli from different categories and filtered at different spatial frequencies to address this question in a passive viewing paradigm. Using representational similarity analysis (RSA) and cross-decoding of magnetoencephalography (MEG) data, we show that categorical representations emerge in human visual cortex at ∼180 ms and are linked to spatial frequency processing. Furthermore, dorsal and ventral stream areas reveal temporally and spatially overlapping representations of low and high-level layer activations extracted from a feedforward neural network. Our results suggest that neural patterns from extrastriate visual cortex switch from low-level to categorical representations within 200 ms, highlighting the rapid cascade of processing stages essential in human visual perception.
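As a toy illustration of the representational similarity analysis (RSA) used here, the sketch below builds correlation-distance RDMs for two sets of condition patterns and compares their upper triangles by rank correlation. The arrays are random stand-ins for MEG sensor patterns and network-layer activations, illustration only.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the response patterns of every pair of conditions."""
    return 1.0 - np.corrcoef(patterns)

def rsa_similarity(patterns_a, patterns_b):
    """Spearman-style comparison of two RDMs: Pearson correlation of
    the ranked upper-triangle entries (numpy only, assumes no ties)."""
    iu = np.triu_indices(patterns_a.shape[0], k=1)
    va, vb = rdm(patterns_a)[iu], rdm(patterns_b)[iu]
    ra, rb = va.argsort().argsort(), vb.argsort().argsort()
    return np.corrcoef(ra, rb)[0, 1]

# Synthetic 'MEG' and 'DNN-layer' patterns for 8 hypothetical scene
# conditions; dnn is correlated with meg by construction.
rng = np.random.default_rng(1)
meg = rng.standard_normal((8, 50))
dnn = meg + 0.5 * rng.standard_normal((8, 50))
score = rsa_similarity(meg, dnn)
```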
Affiliation(s)
- Diana C Dima: Cardiff University Brain Research Imaging Centre (CUBRIC), School of Psychology, Cardiff University, Cardiff, CF24 4HQ, United Kingdom
- Gavin Perry: Cardiff University Brain Research Imaging Centre (CUBRIC), School of Psychology, Cardiff University, Cardiff, CF24 4HQ, United Kingdom
- Krish D Singh: Cardiff University Brain Research Imaging Centre (CUBRIC), School of Psychology, Cardiff University, Cardiff, CF24 4HQ, United Kingdom
75
Abstract
Inferior temporal cortex (IT) is a key part of the ventral visual pathway implicated in object, face, and scene perception. But how does IT work? Here, I describe an organizational scheme that marries form and function and provides a framework for future research. The scheme consists of a series of stages arranged along the posterior-anterior axis of IT, defined by anatomical connections and functional responses. Each stage comprises a complement of subregions that have a systematic spatial relationship. The organization of each stage is governed by an eccentricity template, and corresponding eccentricity representations across stages are interconnected. Foveal representations take on a role in high-acuity object vision (including face recognition); intermediate representations compute other aspects of object vision such as behavioral valence (using color and surface cues); and peripheral representations encode information about scenes. This multistage, parallel-processing model invokes an innately determined organization refined by visual experience that is consistent with principles of cortical development. The model is also consistent with principles of evolution, which suggest that visual cortex expanded through replication of retinotopic areas. Finally, the model predicts that the most extensively studied network within IT-the face patches-is not unique but rather one manifestation of a canonical set of operations that reveal general principles of how IT works.
Affiliation(s)
- Bevil R Conway: Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, Maryland 20892, USA; National Institute of Mental Health and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland 20892, USA
76
Hansen NE, Noesen BT, Nador JD, Harel A. The influence of behavioral relevance on the processing of global scene properties: An ERP study. Neuropsychologia 2018; 114:168-180. PMID: 29729276. DOI: 10.1016/j.neuropsychologia.2018.04.040.
Abstract
Recent work studying the temporal dynamics of visual scene processing (Harel et al., 2016) has found that global scene properties (GSPs) modulate the amplitude of early Event-Related Potentials (ERPs). It is still not clear, however, to what extent the processing of these GSPs is influenced by their behavioral relevance, determined by the goals of the observer. To address this question, we investigated how behavioral relevance, operationalized by the task context, impacts the electrophysiological responses to GSPs. In a set of two experiments we recorded ERPs while participants viewed images of real-world scenes, varying along two GSPs: naturalness (manmade/natural) and spatial expanse (open/closed). In Experiment 1, very little attention to scene content was required, as participants viewed the scenes while performing an orthogonal fixation-cross task. In Experiment 2, participants saw the same scenes but now had to actively categorize them, based either on their naturalness or spatial expanse. We found that task context had very little impact on the early ERP responses to the naturalness and spatial expanse of the scenes: P1, N1, and P2 could distinguish between open and closed scenes and between manmade and natural scenes across both experiments. Further, the specific effects of naturalness and spatial expanse on the ERP components were largely unaffected by their relevance for the task. A task effect was found at the N1 and P2 level, but this effect was manifest across all scene dimensions, indicating a general effect rather than an interaction between task context and GSPs. Together, these findings suggest that the extraction of global scene information reflected in the early ERP components is rapid and influenced very little by top-down, observer-based goals.
Affiliation(s)
- Natalie E Hansen: Department of Psychology, Wright State University, Dayton, OH, United States
- Birken T Noesen: Department of Psychology, Wright State University, Dayton, OH, United States
- Jeffrey D Nador: Department of Psychology, Wright State University, Dayton, OH, United States
- Assaf Harel: Department of Psychology, Wright State University, Dayton, OH, United States
77
Bonner MF, Epstein RA. Computational mechanisms underlying cortical responses to the affordance properties of visual scenes. PLoS Comput Biol 2018; 14:e1006111. PMID: 29684011. PMCID: PMC5933806. DOI: 10.1371/journal.pcbi.1006111.
Abstract
Biologically inspired deep convolutional neural networks (CNNs), trained for computer vision tasks, have been found to predict cortical responses with remarkable accuracy. However, the internal operations of these models remain poorly understood, and the factors that account for their success are unknown. Here we develop a set of techniques for using CNNs to gain insights into the computational mechanisms underlying cortical responses. We focused on responses in the occipital place area (OPA), a scene-selective region of dorsal occipitoparietal cortex. In a previous study, we showed that fMRI activation patterns in the OPA contain information about the navigational affordances of scenes; that is, information about where one can and cannot move within the immediate environment. We hypothesized that this affordance information could be extracted using a set of purely feedforward computations. To test this idea, we examined a deep CNN with a feedforward architecture that had been previously trained for scene classification. We found that responses in the CNN to scene images were highly predictive of fMRI responses in the OPA. Moreover, the CNN accounted for the portion of OPA variance relating to the navigational affordances of scenes. The CNN could thus serve as an image-computable candidate model of affordance-related responses in the OPA. We then ran a series of in silico experiments on this model to gain insights into its internal operations. These analyses showed that the computation of affordance-related features relied heavily on visual information at high spatial frequencies and cardinal orientations, both of which have previously been identified as low-level stimulus preferences of scene-selective visual cortex. These computations also exhibited a strong preference for information in the lower visual field, which is consistent with known retinotopic biases in the OPA. Visualizations of feature selectivity within the CNN suggested that affordance-based responses encoded features that define the layout of the spatial environment, such as boundary-defining junctions and large extended surfaces. Together, these results map the sensory functions of the OPA onto a fully quantitative model that provides insights into its visual computations. More broadly, they advance integrative techniques for understanding visual cortex across multiple levels of analysis: from the identification of cortical sensory functions to the modeling of their underlying algorithms.
How does visual cortex compute behaviorally relevant properties of the local environment from sensory inputs? For decades, computational models have been able to explain only the earliest stages of biological vision, but recent advances in deep neural networks have yielded a breakthrough in the modeling of high-level visual cortex. However, these models are not explicitly designed for testing neurobiological theories, and, like the brain itself, their internal operations remain poorly understood. We examined a deep neural network for insights into the cortical representation of navigational affordances in visual scenes. In doing so, we developed a set of high-throughput techniques and statistical tools that are broadly useful for relating the internal operations of neural networks with the information processes of the brain. Our findings demonstrate that a deep neural network with purely feedforward computations can account for the processing of navigational layout in high-level visual cortex. We next performed a series of experiments and visualization analyses on this neural network. These analyses characterized a set of stimulus input features that may be critical for computing navigationally related cortical representations, and they identified a set of high-level, complex scene features that may serve as a basis set for the cortical coding of navigational layout. These findings suggest a computational mechanism through which high-level visual cortex might encode the spatial structure of the local navigational environment, and they demonstrate an experimental approach for leveraging the power of deep neural networks to understand the visual computations of the brain.
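The core encoding step, predicting region responses from CNN layer activations, can be sketched as a ridge regression from features to voxel responses with a held-out prediction test. Everything below is synthetic (random "activations" and a simulated voxel), illustrating the analysis style rather than the paper's actual pipeline.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X'X + alpha*I)^-1 X'y."""
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_feat), X.T @ y)

# Synthetic stand-ins: 'cnn_feats' for layer activations to n_scenes
# images, 'voxel_resp' for OPA responses (illustration only).
rng = np.random.default_rng(0)
n_scenes, n_feat = 100, 20
cnn_feats = rng.standard_normal((n_scenes, n_feat))
true_w = rng.standard_normal(n_feat)
voxel_resp = cnn_feats @ true_w + 0.1 * rng.standard_normal(n_scenes)

w = ridge_fit(cnn_feats[:80], voxel_resp[:80])   # fit on training scenes
pred = cnn_feats[80:] @ w                        # predict held-out scenes
r = np.corrcoef(pred, voxel_resp[80:])[0, 1]     # prediction accuracy
```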
Affiliation(s)
- Michael F. Bonner: Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States of America
- Russell A. Epstein: Department of Psychology, University of Pennsylvania, Philadelphia, PA, United States of America
78
Revealing Detail along the Visual Hierarchy: Neural Clustering Preserves Acuity from V1 to V4. Neuron 2018; 98:417-428.e3. PMID: 29606580. DOI: 10.1016/j.neuron.2018.03.009.
Abstract
How primates perceive objects along with their detailed features remains a mystery. This ability to make fine visual discriminations depends upon a high-acuity analysis of spatial frequency (SF) along the visual hierarchy from V1 to inferotemporal cortex. By studying the transformation of SF across macaque parafoveal V1, V2, and V4, we discovered SF-selective functional domains in V4 encoding higher SFs up to 12 cycles/°. These intermittent higher-SF-selective domains, surrounded by domains encoding lower SFs, violate the inverse relationship between SF preference and retinal eccentricity. The neural activities of higher- and lower-SF domains correspond to local and global features, respectively, of the same stimuli. Neural response latencies in high-SF domains are around 10 ms later than in low-SF domains, consistent with the coarse-to-fine nature of perception. Thus, our finding of preserved resolution from V1 into V4, separated both spatially and temporally, may serve as a connecting link for detailed object representation.
79
Groen II, Greene MR, Baldassano C, Fei-Fei L, Beck DM, Baker CI. Distinct contributions of functional and deep neural network features to representational similarity of scenes in human brain and behavior. eLife 2018. PMID: 29513219. PMCID: PMC5860866. DOI: 10.7554/elife.32962.
Abstract
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information.
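The variance-partitioning logic described here, isolating each feature model's unique contribution, can be sketched as differences in R² between nested regressions: the unique share of one model is the R² of all models together minus the R² of the remaining models alone. The predictors below are random stand-ins for the functional, DNN, and object-label model vectors, illustration only.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on X (intercept included)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def unique_variance(models, y, idx):
    """Unique contribution of models[idx]: R^2 of the full set minus
    R^2 of the other models alone."""
    full = np.column_stack(models)
    rest = np.column_stack([m for i, m in enumerate(models) if i != idx])
    return r_squared(full, y) - r_squared(rest, y)

# Synthetic predictors; 'behavior' is driven by functional + dnn only.
rng = np.random.default_rng(2)
functional = rng.standard_normal((60, 1))
dnn = rng.standard_normal((60, 1))
labels = rng.standard_normal((60, 1))
behavior = (functional + dnn + 0.2 * rng.standard_normal((60, 1))).ravel()

u_func = unique_variance([functional, dnn, labels], behavior, 0)
u_lab = unique_variance([functional, dnn, labels], behavior, 2)
# u_func is large, u_lab near zero, mirroring the paper's dissociation
# between contributing and non-contributing models.
```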
Affiliation(s)
- Iris Ia Groen
- Laboratory of Brain and Cognition, National Institutes of Health, Bethesda, United States; Department of Psychology, New York University, New York City, United States
- Li Fei-Fei
- Stanford Vision Lab, Stanford University, Stanford, United States
- Diane M Beck
- Department of Psychology, University of Illinois, Urbana-Champaign, United States; Beckman Institute, University of Illinois, Urbana-Champaign, United States
- Chris I Baker
- Laboratory of Brain and Cognition, National Institutes of Health, Bethesda, United States
80
Visual pathways from the perspective of cost functions and multi-task deep neural networks. Cortex 2018; 98:249-261. [DOI: 10.1016/j.cortex.2017.09.019] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/14/2017] [Revised: 06/02/2017] [Accepted: 09/25/2017] [Indexed: 11/18/2022]
81
Berman D, Golomb JD, Walther DB. Scene content is predominantly conveyed by high spatial frequencies in scene-selective visual cortex. PLoS One 2017; 12:e0189828. [PMID: 29272283 PMCID: PMC5741213 DOI: 10.1371/journal.pone.0189828] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 08/11/2017] [Accepted: 12/01/2017] [Indexed: 11/19/2022]
Abstract
In complex real-world scenes, image content is conveyed by a large collection of intertwined visual features. The visual system disentangles these features in order to extract information about image content. Here, we investigate the role of one integral component: the content of spatial frequencies in an image. Specifically, we measure the amount of image content carried by low versus high spatial frequencies for the representation of real-world scenes in scene-selective regions of human visual cortex. To this end, we attempted to decode scene categories from the brain activity patterns of participants viewing scene images that contained the full spatial frequency spectrum, only low spatial frequencies, or only high spatial frequencies, all carefully controlled for contrast and luminance. Contrary to the findings from numerous behavioral studies and computational models that have highlighted how low spatial frequencies preferentially encode image content, decoding of scene categories from the scene-selective brain regions, including the parahippocampal place area (PPA), was significantly more accurate for high than low spatial frequency images. In fact, decoding accuracy was just as high for high spatial frequency images as for images containing the full spatial frequency spectrum in scene-selective areas PPA, RSC, OPA and object selective area LOC. We also found an interesting dissociation between the posterior and anterior subdivisions of PPA: categories were decodable from both high and low spatial frequency scenes in posterior PPA but only from high spatial frequency scenes in anterior PPA; and spatial frequency was explicitly decodable from posterior but not anterior PPA. Our results are consistent with recent findings that line drawings, which consist almost entirely of high spatial frequencies, elicit a neural representation of scene categories that is equivalent to that of full-spectrum color photographs. 
Collectively, these findings demonstrate the importance of high spatial frequencies for conveying the content of complex real-world scenes.
Affiliation(s)
- Daniel Berman
- Department of Psychology, The Ohio State University, Columbus, Ohio, United States of America
- Julie D. Golomb
- Department of Psychology, The Ohio State University, Columbus, Ohio, United States of America
- Dirk B. Walther
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada
82
Bracci S, Ritchie JB, de Beeck HO. On the partnership between neural representations of object categories and visual features in the ventral visual pathway. Neuropsychologia 2017; 105:153-164. [PMID: 28619529 PMCID: PMC5680697 DOI: 10.1016/j.neuropsychologia.2017.06.010] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 10/24/2016] [Revised: 06/04/2017] [Accepted: 06/12/2017] [Indexed: 11/05/2022]
Abstract
A dominant view in the cognitive neuroscience of object vision is that regions of the ventral visual pathway exhibit some degree of category selectivity. However, recent findings obtained with multivariate pattern analyses (MVPA) suggest that apparent category selectivity in these regions is dependent on more basic visual features of stimuli, in which case a rethinking of the function and organization of the ventral pathway may be in order. We suggest that addressing this issue of functional specificity requires clear coding hypotheses, about object category and visual features, which make contrasting predictions about neuroimaging results in ventral pathway regions. One way to differentiate between categorical and featural coding hypotheses is to test for residual categorical effects: effects of category selectivity that cannot be accounted for by visual features of stimuli. A strong method for testing these effects, we argue, is to make object category and target visual features orthogonal in stimulus design. Recent studies that adopt this approach support a feature-based categorical coding hypothesis according to which regions of the ventral stream do indeed code for object category, but in a format at least partially based on the visual features of stimuli.
83
Pantazis D, Fang M, Qin S, Mohsenzadeh Y, Li Q, Cichy RM. Decoding the orientation of contrast edges from MEG evoked and induced responses. Neuroimage 2017; 180:267-279. [PMID: 28712993 DOI: 10.1016/j.neuroimage.2017.07.022] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 02/28/2017] [Revised: 06/09/2017] [Accepted: 07/12/2017] [Indexed: 10/19/2022]
Abstract
Visual gamma oscillations have been proposed to subserve perceptual binding, but their strong modulation by diverse stimulus features confounds interpretations of their precise functional role. Overcoming this challenge necessitates a comprehensive account of the relationship between gamma responses and stimulus features. Here we used multivariate pattern analyses on human MEG data to characterize the relationships between gamma responses and one basic stimulus feature, the orientation of contrast edges. Our findings confirmed that we could decode orientation information from induced responses in two dominant frequency bands at 24-32 Hz and 50-58 Hz. Decoding was higher for cardinal than oblique orientations, with similar results also obtained for evoked MEG responses. In contrast to multivariate analyses, orientation information was mostly absent in univariate signals: evoked and induced responses in early visual cortex were similar across all orientations, the only exception being an inverse oblique effect in induced responses, whereby cardinal orientations produced weaker oscillatory signals than oblique orientations. Taken together, our results show that multivariate methods are well suited to the analysis of gamma oscillations, with multivariate patterns robustly encoding orientation information and predominantly discriminating cardinal from oblique stimuli.
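Decoding a stimulus property from band-limited induced power, as in this study, can be illustrated in miniature: simulate two stimulus classes whose 24-32 Hz rhythm has different multivariate (across-sensor) amplitude patterns, extract band power per sensor via the FFT, and classify with a leave-one-out nearest-centroid rule. All signal parameters below are invented for illustration; the actual study applied multivariate pattern analyses to real MEG responses.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 250
t = np.arange(fs) / fs            # one 1-s trial at 250 Hz
n_trials, n_sensors = 40, 8       # trials per class, sensors

def make_trial(cls):
    # Each class drives the 28 Hz rhythm with a different spatial pattern.
    pattern = np.linspace(1, 2, n_sensors) if cls else np.linspace(2, 1, n_sensors)
    osc = np.sin(2 * np.pi * 28 * t)
    return pattern[:, None] * osc + rng.normal(scale=1.0, size=(n_sensors, len(t)))

X = np.array([make_trial(c) for c in range(2) for _ in range(n_trials)])
y = np.repeat([0, 1], n_trials)

def band_power(trial, lo=24, hi=32):
    """Mean spectral power per sensor within a frequency band (Hz)."""
    freqs = np.fft.rfftfreq(trial.shape[1], 1 / fs)
    spec = np.abs(np.fft.rfft(trial, axis=1)) ** 2
    return spec[:, (freqs >= lo) & (freqs <= hi)].mean(axis=1)

feats = np.array([band_power(tr) for tr in X])

# Leave-one-out nearest-centroid classification on multivariate band power.
correct = 0
for i in range(len(feats)):
    mask = np.arange(len(feats)) != i
    c0 = feats[mask & (y == 0)].mean(axis=0)
    c1 = feats[mask & (y == 1)].mean(axis=0)
    pred = int(np.linalg.norm(feats[i] - c1) < np.linalg.norm(feats[i] - c0))
    correct += pred == y[i]
accuracy = correct / len(feats)
```

Note the key contrast with a univariate analysis: averaged over sensors the two classes have similar total band power, but the pattern across sensors separates them, which is why multivariate decoding succeeds where univariate comparisons can fail.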
Affiliation(s)
- Dimitrios Pantazis
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Mingtong Fang
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Sheng Qin
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Yalda Mohsenzadeh
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Quanzheng Li
- Department of Radiology, Massachusetts General Hospital, Boston, MA, USA
84
Watson DM, Andrews TJ, Hartley T. A data driven approach to understanding the organization of high-level visual cortex. Sci Rep 2017; 7:3596. [PMID: 28620238 PMCID: PMC5472563 DOI: 10.1038/s41598-017-03974-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 01/22/2017] [Accepted: 05/08/2017] [Indexed: 11/16/2022]
Abstract
The neural representation in scene-selective regions of human visual cortex, such as the PPA, has been linked to the semantic and categorical properties of the images. However, the extent to which patterns of neural response in these regions reflect more fundamental organizing principles is not yet clear. Existing studies generally employ stimulus conditions chosen by the experimenter, potentially obscuring the contribution of more basic stimulus dimensions. To address this issue, we used a data-driven approach to describe a large database of scenes (>100,000 images) in terms of their visual properties (orientation, spatial frequency, spatial location). K-means clustering was then used to select images from distinct regions of this feature space. Images in each cluster did not correspond to typical scene categories. Nevertheless, they elicited distinct patterns of neural response in the PPA. Moreover, the similarity of the neural response to different clusters in the PPA could be predicted by the similarity in their image properties. Interestingly, the neural response in the PPA was also predicted by perceptual responses to the scenes, but not by their semantic properties. These findings provide an image-based explanation for the emergence of higher-level representations in scene-selective regions of the human brain.
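The stimulus-selection step described here — describe each image by low-level visual properties, k-means-cluster that feature space, and sample images from distinct clusters — can be sketched as below. The feature vectors are random stand-ins for the orientation/spatial-frequency/location descriptors, and the tiny k-means is a generic implementation written for this sketch, not the authors' code.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Minimal k-means with a deterministic init (evenly spaced data points)."""
    centroids = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = ((X[:, None, :] - centroids[None]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Hypothetical image descriptors: two well-separated groups in a 4-D
# feature space (stand-ins for real orientation/SF image statistics).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 4)),
               rng.normal(3.0, 0.3, size=(50, 4))])
labels, centroids = kmeans(X, k=2)
# Pick one exemplar image per cluster: the image nearest each centroid.
exemplars = [np.argmin(((X - c) ** 2).sum(1)) for c in centroids]
```

The point of the design is that cluster membership is defined purely by image properties, so any differential neural response to the selected exemplars cannot be attributed to experimenter-chosen semantic categories.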
Affiliation(s)
- David M Watson
- Department of Psychology and York Neuroimaging Centre, University of York, York, YO10 5DD, United Kingdom; School of Psychology, The University of Nottingham, Nottingham, NG7 2RD, United Kingdom
- Timothy J Andrews
- Department of Psychology and York Neuroimaging Centre, University of York, York, YO10 5DD, United Kingdom
- Tom Hartley
- Department of Psychology and York Neuroimaging Centre, University of York, York, YO10 5DD, United Kingdom
85
Hillstrom AP, Segabinazi JD, Godwin HJ, Liversedge SP, Benson V. Cat and mouse search: the influence of scene and object analysis on eye movements when targets change locations during search. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0106. [PMID: 28044017 DOI: 10.1098/rstb.2016.0106] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 10/14/2016] [Indexed: 11/12/2022]
Abstract
We explored the influence of early scene analysis and visible object characteristics on eye movements when searching for objects in photographs of scenes. On each trial, participants were shown, in sequence, either a scene preview or a uniform grey screen (250 ms), a visual mask, the name of the target, and the scene, now including the target at a likely location. During the participant's first saccade of the search, the target location was changed to: (i) a different likely location, (ii) an unlikely but possible location or (iii) a very implausible location. The results showed that the first saccade landed more often on the likely location in which the target re-appeared than on unlikely or implausible locations, and overall the first saccade landed nearer the first target location with a preview than without. Hence, rapid scene analysis influenced initial eye movement planning, but availability of the target rapidly modified that plan. After the target moved, it was found more quickly when it appeared in a likely location than when it appeared in an unlikely or implausible location. The findings show that both scene gist and object properties are extracted rapidly, and are used in conjunction to guide saccadic eye movements during visual search. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Anne P Hillstrom
- Psychology Department, University of Southampton, Shackleton Building, Highfield Campus, Southampton SO17 1BJ, UK
- Joice D Segabinazi
- Federal University of Rio Grande do Sul, CAPES Foundation, Ministry of Education of Brazil, Brasília, DF 70040-020, Brazil
- Hayward J Godwin
- Psychology Department, University of Southampton, Shackleton Building, Highfield Campus, Southampton SO17 1BJ, UK
- Simon P Liversedge
- Psychology Department, University of Southampton, Shackleton Building, Highfield Campus, Southampton SO17 1BJ, UK
- Valerie Benson
- Psychology Department, University of Southampton, Shackleton Building, Highfield Campus, Southampton SO17 1BJ, UK
86
Kondo HM, van Loon AM, Kawahara JI, Moore BCJ. Auditory and visual scene analysis: an overview. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0099. [PMID: 28044011 DOI: 10.1098/rstb.2016.0099] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Accepted: 11/03/2016] [Indexed: 01/23/2023]
Abstract
We perceive the world as stable and composed of discrete objects even though auditory and visual inputs are often ambiguous owing to spatial and temporal occluders and changes in the conditions of observation. This raises important questions regarding where and how 'scene analysis' is performed in the brain. Recent advances from both auditory and visual research suggest that the brain does not simply process the incoming scene properties. Rather, top-down processes such as attention, expectations and prior knowledge facilitate scene perception. Thus, scene analysis is linked not only with the extraction of stimulus features and formation and selection of perceptual objects, but also with selective attention, perceptual binding and awareness. This special issue covers novel advances in scene-analysis research obtained using a combination of psychophysics, computational modelling, neuroimaging and neurophysiology, and presents new empirical and theoretical approaches. For integrative understanding of scene analysis beyond and across sensory modalities, we provide a collection of 15 articles that enable comparison and integration of recent findings in auditory and visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Hirohito M Kondo
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
- Anouk M van Loon
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam 1081 BT, The Netherlands; Institute of Brain and Behavior Amsterdam, Vrije Universiteit Amsterdam, Amsterdam 1081 BT, The Netherlands
- Jun-Ichiro Kawahara
- Department of Psychology, Graduate School of Letters, Hokkaido University, Sapporo 060-0810, Japan
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK
87
Cichy RM, Teng S. Resolving the neural dynamics of visual and auditory scene processing in the human brain: a methodological approach. Philos Trans R Soc Lond B Biol Sci 2017; 372:rstb.2016.0108. [PMID: 28044019 PMCID: PMC5206276 DOI: 10.1098/rstb.2016.0108] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Accepted: 09/22/2016] [Indexed: 01/06/2023]
Abstract
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’.
Affiliation(s)
- Santani Teng
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
88
Making Sense of Real-World Scenes. Trends Cogn Sci 2016; 20:843-856. [PMID: 27769727 DOI: 10.1016/j.tics.2016.09.003] [Citation(s) in RCA: 82] [Impact Index Per Article: 10.3] [Received: 07/19/2016] [Revised: 09/06/2016] [Accepted: 09/06/2016] [Indexed: 11/23/2022]
Abstract
To interact with the world, we have to make sense of the continuous sensory input conveying information about our environment. A recent surge of studies has investigated the processes enabling scene understanding, using increasingly complex stimuli and sophisticated analyses to highlight the visual features and brain regions involved. However, there are two major challenges to producing a comprehensive framework for scene understanding. First, scene perception is highly dynamic, subserving multiple behavioral goals. Second, a multitude of different visual properties co-occur across scenes and may be correlated or independent. We synthesize the recent literature and argue that for a complete view of scene understanding, it is necessary to account for both differing observer goals and the contribution of diverse scene properties.
89
The Temporal Dynamics of Scene Processing: A Multifaceted EEG Investigation. eNeuro 2016; 3:eN-NWR-0139-16. [PMID: 27699208 PMCID: PMC5037322 DOI: 10.1523/eneuro.0139-16.2016] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 05/24/2016] [Revised: 08/12/2016] [Accepted: 09/06/2016] [Indexed: 11/25/2022]
Abstract
Our remarkable ability to process complex visual scenes is supported by a network of scene-selective cortical regions. Despite growing knowledge about the scene representation in these regions, much less is known about the temporal dynamics with which these representations emerge. We conducted two experiments aimed at identifying and characterizing the earliest markers of scene-specific processing. In the first experiment, human participants viewed images of scenes, faces, and everyday objects while event-related potentials (ERPs) were recorded. We found that the first ERP component to evince a significantly stronger response to scenes than the other categories was the P2, peaking ∼220 ms after stimulus onset. To establish that the P2 component reflects scene-specific processing, in the second experiment, we recorded ERPs while the participants viewed diverse real-world scenes spanning the following three global scene properties: spatial expanse (open/closed), relative distance (near/far), and naturalness (man-made/natural). We found that P2 amplitude was sensitive to these scene properties at both the categorical level, distinguishing between open and closed natural scenes, as well as at the single-image level, reflecting both computationally derived scene statistics and behavioral ratings of naturalness and spatial expanse. Together, these results establish the P2 as an ERP marker for scene processing, and demonstrate that scene-specific global information is available in the neural response as early as 220 ms.