1
Xu ZJ, Buetti S, Xia Y, Lleras A. Skills and cautiousness predict performance in difficult search. Atten Percept Psychophys 2024; 86:1897-1912. [PMID: 38997576] [DOI: 10.3758/s13414-024-02923-5]
Abstract
People differ in how well they search. What factors might contribute to this variability? We tested the contribution of two cognitive abilities: visual working memory (VWM) capacity and object recognition ability. Participants completed three tasks: a difficult, inefficient visual search task, where they searched for a target letter T among skewed L distractors; a VWM task, where they memorized a color array and then identified whether a probed color belonged to the previous array; and the Novel Object Memory Test (NOMT), where they learnt complex novel objects and then identified them amongst objects that closely resembled them. Exploratory and confirmatory factor analyses revealed two latent factors that explain the shared variance among these three tasks: a factor indicative of the level of caution participants exercised during the challenging visual search task, and a factor representing their visual cognitive abilities. People who score high on the search cautiousness factor tend to perform a more accurate but slower search. People who score high on the visual cognitive ability factor tend to have a higher VWM capacity, better object recognition ability, and a faster search speed. The results make two points: (1) visual search tasks share components with visual working memory and object recognition tasks; (2) search performance is influenced not only by the search display's properties but also by individual predispositions such as caution and general visual abilities. This study introduces new factors for consideration when interpreting variations in visual search behaviors.
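The two-factor logic of the abstract can be made concrete with a small simulation. The sketch below is illustrative only: the loadings, measure names, and sample size are invented, and it uses scikit-learn's FactorAnalysis rather than the EFA/CFA tooling the authors used.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(5)
n = 200  # hypothetical number of participants

# Two simulated latent traits: search cautiousness and visual ability
caution = rng.normal(size=n)
ability = rng.normal(size=n)

# Observed task measures loading on the latents, plus noise
search_rt = 0.8 * caution - 0.5 * ability + rng.normal(scale=0.4, size=n)
search_acc = 0.7 * caution + rng.normal(scale=0.4, size=n)
vwm_capacity = 0.8 * ability + rng.normal(scale=0.4, size=n)
nomt_score = 0.7 * ability + rng.normal(scale=0.4, size=n)

X = np.column_stack([search_rt, search_acc, vwm_capacity, nomt_score])
fa = FactorAnalysis(n_components=2, random_state=0).fit(X)
print(np.round(fa.components_, 2))  # 2 x 4 matrix of factor loadings
```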
Affiliation(s)
- Zoe Jing Xu
- University of Illinois, 603 E. Daniel St., Champaign, IL, 61820, USA
- Simona Buetti
- University of Illinois, 603 E. Daniel St., Champaign, IL, 61820, USA
- Yan Xia
- University of Illinois, 603 E. Daniel St., Champaign, IL, 61820, USA
- Alejandro Lleras
- University of Illinois, 603 E. Daniel St., Champaign, IL, 61820, USA
2
Wang S, Lin Y, Ding X. Unmasking social attention: The key distinction between social and non-social attention emerges in disengagement, not engagement. Cognition 2024; 249:105834. [PMID: 38797054] [DOI: 10.1016/j.cognition.2024.105834]
Abstract
The debate over whether social and non-social attention share the same mechanism has been contentious. While prior studies predominantly focused on engagement, we examined the potential disparity between social and non-social attention from the perspectives of both engagement and disengagement. We developed a two-stage attention-shifting paradigm to capture both attention engagement and disengagement. Combining results from five eye-tracking experiments, we found that disengagement of social attention markedly outpaces that of non-social attention, while no significant discrepancy emerges in engagement. We showed that the faster disengagement of social attention stems from its social nature by eliminating alternative explanations, including broader fixation distribution width, reduced directional salience in the peripheral visual field, decreased cue-object categorical consistency, reduced perceived validity, and faster processing time. Our study supports the view that the distinction between social and non-social attention is rooted in attention disengagement, not engagement.
Affiliation(s)
- Shengyuan Wang
- Department of Psychology, Guangdong Provincial Key Laboratory of Social Cognitive Neuroscience and Mental Health, Sun Yat-sen University, Guangzhou, China
- Yanhua Lin
- Department of Psychology, Guangdong Provincial Key Laboratory of Social Cognitive Neuroscience and Mental Health, Sun Yat-sen University, Guangzhou, China
- Xiaowei Ding
- Department of Psychology, Guangdong Provincial Key Laboratory of Social Cognitive Neuroscience and Mental Health, Sun Yat-sen University, Guangzhou, China
3
Sigurdardottir HM, Omarsdottir HR, Valgeirsdottir AS. Reading problems and their connection with visual search and attention. Dyslexia 2024; 30:e1764. [PMID: 38385948] [DOI: 10.1002/dys.1764]
Abstract
Attention has been hypothesized to act as a sequential gating mechanism for the orderly processing of letters and words. These same visuoattentional processes are often assumed to partake in some, but not all, types of visual search. In the current study, 24 dyslexic and 36 typical readers completed an attentionally demanding visual conjunction search. Visual feature search served as an internal control. It has been suggested that reading problems should go hand in hand with specific problems in visual conjunction search, particularly elevated conjunction search slopes (time per search item), often interpreted as a problem with visual attention. Results showed that reading problems were associated with slower visual search, especially conjunction search. However, reading deficits were not associated with increased conjunction search slopes but instead with increased search intercepts, which are traditionally not interpreted as reflecting attention. We discuss these results in the context of hypothesized visuoattentional problems in dyslexia. Remaining open to multiple interpretations of the data, the current study demonstrates that difficulties in visual search are associated with reading problems, in accordance with a growing literature on visual cognition problems in developmental dyslexia.
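The slope/intercept distinction that carries the key result comes from the standard linear summary of search performance; the decomposition below is the field's convention, not notation taken from this paper:

```latex
% Mean response time as a function of set size N
% a: search intercept (set-size-independent costs: encoding, decision, response)
% b: search slope (cost per additional item, often read as an attentional index)
\mathrm{RT}(N) = a + b \cdot N
```

On this reading, the dyslexic readers' deficit loaded on the intercept a rather than the slope b.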
Affiliation(s)
- Hilma Ros Omarsdottir
- Icelandic Vision Lab, Department of Psychology, University of Iceland, Reykjavik, Iceland
4
Bertamini M, Oletto CM, Contemori G. The role of uniform textures in making texture elements visible in the visual periphery. Open Mind (Camb) 2024; 8:462-482. [PMID: 38665546] [PMCID: PMC11045036] [DOI: 10.1162/opmi_a_00136]
Abstract
There are important differences between central and peripheral vision. With respect to shape, contours retain phenomenal sharpness in the periphery, although some contours disappear if they are near other contours. This leads some uniform textures to appear non-uniform (Honeycomb illusion; Bertamini et al., 2016). Unlike other phenomena of shape perception in the periphery, this illusion shows that continuity of the texture does not contribute to phenomenal continuity. We systematically varied the relationship between central and peripheral regions, and we collected subjective reports (how far out one can see the lines) as well as judgments of line orientation. We used extended textures created with a square grid plus additional lines that are invisible when located at the corners of the grid, or visible when separated from the grid (control condition). With respect to subjective reports, we compared the region of visibility for cases in which the texture was uniform (Exp 1a) or the lines differed in a central region (Exp 1b). There were no differences, showing no role of objective uniformity in visibility. Next, in addition to the region of visibility, we measured sensitivity using a forced-choice task (line tilted left or right; Exp 2). The drop in sensitivity with eccentricity matched the size of the region in which lines were perceived in the illusion condition, but not in the control condition. When participants were asked to report whether the lines were present or absent (Exp 3), they confirmed that they did not see them in the illusion condition but saw them in the control condition. We conclude that mechanisms controlling the perception of contours operate differently in the periphery and override prior expectations, including that of uniformity. Conversely, when elements are detected in the periphery, we assign to them properties based on information from central vision, but these shapes cannot be identified correctly when the task requires such discrimination.
5
Semizer Y, Yu D, Rosenholtz R. Peripheral vision and crowding in mental maze-solving. J Vis 2024; 24:22. [PMID: 38662347] [PMCID: PMC11055501] [DOI: 10.1167/jov.24.4.22]
Abstract
Solving a maze effectively relies on both perception and cognition. Studying maze-solving behavior contributes to our knowledge about these important processes. Through psychophysical experiments and modeling simulations, we examine the role of peripheral vision, specifically visual crowding in the periphery, in mental maze-solving. Experiment 1 measured gaze patterns while varying maze complexity, revealing a direct relationship between visual complexity and maze-solving efficiency. Simulations of the maze-solving task using a peripheral vision model confirmed the observed crowding effects while making an intriguing prediction that saccades provide a conservative measure of how far ahead observers can perceive the path. Experiment 2 confirms that observers can judge whether a point lies on the path at considerably greater distances than their average saccade. Taken together, our findings demonstrate that peripheral vision plays a key role in mental maze-solving.
Affiliation(s)
- Yelda Semizer
- Department of Humanities and Social Sciences, New Jersey Institute of Technology, Newark, NJ, USA
- Dian Yu
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ruth Rosenholtz
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
6
Keshvari S, Wijntjes MWA. Peripheral material perception. J Vis 2024; 24:13. [PMID: 38625088] [PMCID: PMC11033595] [DOI: 10.1167/jov.24.4.13]
Abstract
Humans can rapidly identify materials, such as wood or leather, even within a complex visual scene. Given a single image, one can easily identify the underlying "stuff," even though a given material can have highly variable appearance; fabric comes in unlimited variations of shape, pattern, color, and smoothness, yet we have little trouble categorizing it as fabric. What visual cues do we use to determine material identity? Prior research suggests that simple "texture" features of an image, such as the power spectrum, capture information about material properties and identity. Few studies, however, have tested richer and biologically motivated models of texture. We compared baseline material classification performance to performance with synthetic textures generated from the Portilla-Simoncelli model and several common image degradations. The textures retain statistical information but are otherwise random. We found that performance with textures and most degradations was well below baseline, suggesting insufficient information to support foveal material perception. Interestingly, modern research suggests that peripheral vision might use a statistical, texture-like representation. In a second set of experiments, we found that peripheral performance is more closely predicted by texture and other image degradations. These findings delineate the nature of peripheral material classification.
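The "simple texture features, such as the power spectrum" mentioned above can be isolated with phase scrambling, which preserves an image's amplitude spectrum while destroying the arrangement of its features. A minimal numpy sketch (our own illustration; the study itself used the much richer Portilla-Simoncelli synthesis):

```python
import numpy as np

def phase_scramble(image: np.ndarray, seed: int = 0) -> np.ndarray:
    """Randomize an image's phase spectrum while preserving its
    amplitude (power) spectrum."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.fft2(image)
    # Phases of a real-valued noise image are conjugate-symmetric,
    # so adding them keeps the inverse transform (essentially) real.
    noise_phase = np.angle(np.fft.fft2(rng.standard_normal(image.shape)))
    scrambled = np.abs(spectrum) * np.exp(1j * (np.angle(spectrum) + noise_phase))
    return np.real(np.fft.ifft2(scrambled))

# The scramble keeps the power spectrum but discards spatial structure
img = np.outer(np.sin(np.linspace(0, 8 * np.pi, 128)), np.ones(128))
out = phase_scramble(img)
assert np.allclose(np.abs(np.fft.fft2(img)), np.abs(np.fft.fft2(out)))
```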
Affiliation(s)
- Maarten W A Wijntjes
- Perceptual Intelligence Lab, Industrial Design Engineering, Delft University of Technology, Delft, Netherlands
7
Veríssimo IS, Nudelman Z, Olivers CNL. Does crowding predict conjunction search? An individual differences approach. Vision Res 2024; 216:108342. [PMID: 38198971] [DOI: 10.1016/j.visres.2023.108342]
Abstract
Searching for objects in the visual environment is an integral part of human behavior. Most of the information used during such visual search comes from the periphery of our vision, and understanding the basic mechanisms of search therefore requires taking into account the inherent limitations of peripheral vision. Our previous work using an individual differences approach has shown that one of the major factors limiting peripheral vision (crowding) is predictive of single-feature search, as reflected in response time and eye movement measures. Here we extended this work by testing the relationship between crowding and visual search in a conjunction-search paradigm. Given that conjunction search involves more fine-grained discrimination and more serial behavior, we predicted it would be strongly affected by crowding. We tested sixty participants with regard to their sensitivity to both orientation- and color-based crowding (as measured by critical spacing) and their efficiency in searching for a color/orientation conjunction (as indicated by manual response times and eye movements). While the correlations between the different crowding tasks were high, the correlations between the different crowding measures and search performance were relatively modest, and no higher than those previously observed for single-feature search. Instead, observers showed very strong color selectivity during search. The results suggest that conjunction search behavior relies more on top-down guidance (here by color) and is therefore relatively less determined by individual differences in sensory limitations caused by crowding.
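The individual-differences logic here boils down to cross-observer correlations between psychophysical measures. A toy sketch of that computation (all numbers simulated; the variable names are ours, not the authors'):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 60  # observers, as in the study

# Hypothetical per-observer measures standing in for the real data
critical_spacing = rng.normal(0.4, 0.1, n)  # crowding: critical spacing
search_rt = 600 + 300 * critical_spacing + rng.normal(0, 50, n)  # mean RT (ms)

r, p = stats.pearsonr(critical_spacing, search_rt)
print(f"crowding vs. conjunction search: r = {r:.2f}, p = {p:.3g}")
```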
Affiliation(s)
- Inês S Veríssimo
- Department of Experimental and Applied Psychology, Cognitive Psychology Section, Vrije Universiteit Amsterdam, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands; Institute for Brain and Behavior, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands
- Zachary Nudelman
- Department of Experimental and Applied Psychology, Cognitive Psychology Section, Vrije Universiteit Amsterdam, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands
- Christian N L Olivers
- Department of Experimental and Applied Psychology, Cognitive Psychology Section, Vrije Universiteit Amsterdam, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands
8
Tiurina NA, Markov YA, Whitney D, Pascucci D. The functional role of spatial anisotropies in ensemble perception. BMC Biol 2024; 22:28. [PMID: 38317216] [PMCID: PMC10845794] [DOI: 10.1186/s12915-024-01822-3]
Abstract
BACKGROUND The human brain can rapidly represent sets of similar stimuli by their ensemble summary statistics, such as the average orientation or size. Classic models assume that ensemble statistics are computed by integrating all elements with equal weight. Challenging this view, we show here that ensemble statistics are estimated by combining parafoveal and foveal statistics in proportion to their reliability. In a series of experiments, observers reproduced the average orientation of an ensemble of stimuli under varying levels of visual uncertainty. RESULTS Ensemble statistics were affected by multiple spatial biases, in particular a strong and persistent bias towards the center of the visual field. This bias, evident in the majority of subjects and in all experiments, scaled with uncertainty: the higher the uncertainty in the ensemble statistics, the larger the bias towards the element shown at the fovea. CONCLUSIONS Our findings indicate that ensemble perception cannot be explained by simple uniform pooling. The visual system weights information anisotropically from both the parafovea and the fovea, taking the intrinsic spatial anisotropies of vision into account to compensate for visual uncertainty.
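"In proportion to their reliability" has a standard formalization in cue-combination models, where reliability is inverse variance. The equation below is our gloss of how that would apply here, not one taken from the paper:

```latex
% Combined ensemble estimate from foveal (f) and parafoveal (p) statistics,
% each weighted by its reliability w = 1/\sigma^2
\hat{\theta} = \frac{\theta_f/\sigma_f^2 + \theta_p/\sigma_p^2}
                    {1/\sigma_f^2 + 1/\sigma_p^2}
```

As parafoveal uncertainty grows, the combined estimate is pulled toward the foveal element, which is the uncertainty-scaled central bias the abstract reports.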
Affiliation(s)
- Natalia A Tiurina
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Department of Psychology, Technische Universität Dresden, Dresden, Germany
- Yuri A Markov
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Department of Psychology, Goethe University Frankfurt, Frankfurt am Main, Germany
- David Whitney
- Vision Science Graduate Group, University of California, Berkeley, Berkeley, USA
- Department of Psychology, University of California, Berkeley, Berkeley, USA
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, USA
- David Pascucci
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
9
Hughes AE, Nowakowska A, Clarke ADF. Bayesian multi-level modelling for predicting single and double feature visual search. Cortex 2024; 171:178-193. [PMID: 38007862] [DOI: 10.1016/j.cortex.2023.10.014]
Abstract
Performance in visual search tasks is frequently summarised by "search slopes": the additional cost in reaction time for each additional distractor. While search tasks with shallow search slopes are termed efficient (pop-out, parallel, feature), there is no clear dichotomy between efficient and inefficient (serial, conjunction) search. Indeed, a range of search slopes is observed in empirical data. The Target Contrast Signal (TCS) Theory is a rare example of a quantitative model that attempts to predict search slopes for efficient visual search. One study using the TCS framework has shown that the search slope in a double-feature search (where the target differs in both colour and shape from the distractors) can be estimated from the slopes of the associated single-feature searches. This estimation is done using a contrast combination model, and a collinear contrast integration model was shown to outperform other options. In our work, we extend TCS to a Bayesian multi-level framework. We investigate modelling using normal and shifted-lognormal distributions, and show that the latter allows a better fit to previously published data. We ran a new, fully within-subjects experiment to attempt to replicate the key original findings, and show that, overall, TCS does a good job of predicting the data. However, we do not replicate the finding that the collinear combination model outperforms the other contrast combination models, instead finding that it may be difficult to conclusively distinguish between them.
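The distributional comparison at the heart of the modelling (normal versus shifted-lognormal response times) can be previewed outside the Bayesian machinery with two maximum-likelihood fits. A sketch using scipy, whose lognorm parameterization exposes the shift as loc (simulated data; the paper's multi-level models are far richer):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated RTs (s): a 0.25 s shift plus lognormally distributed decision time
rts = 0.25 + rng.lognormal(mean=-1.0, sigma=0.5, size=500)

mu, sd = stats.norm.fit(rts)                 # normal fit
shape, loc, scale = stats.lognorm.fit(rts)   # shifted-lognormal fit; loc = shift

ll_norm = stats.norm.logpdf(rts, mu, sd).sum()
ll_slog = stats.lognorm.logpdf(rts, shape, loc, scale).sum()
print(f"log-likelihood  normal: {ll_norm:.1f}  shifted lognormal: {ll_slog:.1f}")
```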
Affiliation(s)
- Anna E Hughes
- Department of Psychology, University of Essex, Colchester, CO4 3SQ, UK
- Anna Nowakowska
- School of Psychology, University of Aberdeen, Aberdeen, AB24 3FX, UK; School of Psychology and Vision Sciences, University of Leicester, Leicester, LE1 7RH, UK
10
Van der Burg E, Cass J, Olivers CNL. A CODE model bridging crowding in sparse and dense displays. Vision Res 2024; 215:108345. [PMID: 38142531] [DOI: 10.1016/j.visres.2023.108345]
Abstract
Visual crowding is arguably the strongest limitation imposed on extrafoveal vision, and is a relatively well-understood phenomenon. However, most investigations and theories are based on sparse displays consisting of a target and at most a handful of flanker objects. Recent findings suggest that the laws thought to govern crowding may not hold for densely cluttered displays, and that grouping and nearest neighbour effects may be more important. Here we present a computational model that accounts for crowding effects in both sparse and dense displays. The model is an adaptation and extension of an earlier model that has previously successfully accounted for spatial clustering, numerosity and object-based attention phenomena. Our model combines grouping by proximity and similarity with a nearest neighbour rule, and defines crowding as the extent to which target and flankers fail to segment. We show that when the model is optimized for explaining crowding phenomena in classic, sparse displays, it also does a good job in capturing novel crowding patterns in dense displays, in both existing and new data sets. The model thus ties together different principles governing crowding, specifically Bouma's law, grouping, and nearest neighbour similarity effects.
Affiliation(s)
- John Cass
- MARCS Institute of Brain, Behaviour & Development, Western Sydney University, Australia
- Christian N L Olivers
- Institute for Brain and Behaviour Amsterdam, the Netherlands; Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, the Netherlands
11
Siviengphanom S, Lewis SJ, Brennan PC, Gandomkar Z. Computer-extracted global radiomic features can predict the radiologists' first impression about the abnormality of a screening mammogram. Br J Radiol 2024; 97:168-179. [PMID: 38263826] [PMCID: PMC11027311] [DOI: 10.1093/bjr/tqad025]
Abstract
OBJECTIVE Radiologists can detect the gist of abnormality based on their rapid initial impression of a mammogram (i.e., the global gist signal [GGS]). This study explores (1) whether global radiomic (i.e., computer-extracted) features can predict the GGS and, if so, (2) which features are the most important drivers of the signal. METHODS The GGS of cases in two extreme conditions was considered: when observers detect a very strong gist (high-gist) and when the gist of abnormality was poorly perceived or not perceived at all (low-gist). Gist signals/scores from 13 observers reading 4191 craniocaudal mammograms were collected. As gist is a noisy signal, the gist scores from all observers were averaged and assigned to each image. The high-gist and low-gist categories contained all images in the fourth and first quartiles, respectively. One hundred thirty handcrafted global radiomic features (GRFs) per mammogram were extracted and utilized to construct eight separate machine learning random forest classifiers (All, Normal, Cancer, Prior-1, Prior-2, Missed, Prior-Visible, and Prior-Invisible) for distinguishing high-gist from low-gist images. The models were trained and validated using the 10-fold cross-validation approach. The models' performances were evaluated by the area under the receiver operating characteristic curve (AUC). Important features for each model were identified through a scree test. RESULTS The Prior-Visible model achieved the highest AUC of 0.84, followed by the Prior-Invisible (0.83), Normal (0.82), Prior-1 (0.81), All (0.79), Prior-2 (0.77), Missed (0.75), and Cancer (0.69) models. Cluster shade, standard deviation, skewness, kurtosis, and range were identified as the most important features. CONCLUSIONS Our findings suggest that GRFs can accurately classify high- from low-gist images. ADVANCES IN KNOWLEDGE Global mammographic radiomic features can accurately predict high- from low-gist images, with five features identified as valuable in describing high-gist images. These are critical in providing a better understanding of the mammographic image characteristics that drive the strength of the GGS, which could be exploited to advance breast cancer (BC) screening and risk prediction, enabling early detection and treatment of BC and thereby further reducing BC-related deaths.
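A skeletal version of the classification pipeline described above (random forest, 10-fold cross-validation, AUC), with random numbers standing in for the 130 global radiomic features; apart from those structural choices, nothing here comes from the paper:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_images, n_features = 400, 130  # 130 handcrafted GRFs per mammogram

X = rng.normal(size=(n_images, n_features))                           # stand-in GRFs
y = (X[:, 0] + rng.normal(scale=2.0, size=n_images) > 0).astype(int)  # 1 = high-gist

clf = RandomForestClassifier(n_estimators=500, random_state=0)
aucs = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
print(f"mean AUC over 10 folds: {aucs.mean():.2f}")
```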
Affiliation(s)
- Somphone Siviengphanom
- Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
- Sarah J Lewis
- Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
- Patrick C Brennan
- Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
- Ziba Gandomkar
- Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
12
Bao X, Gu Z, Yang J, Li Y, Wang D, Tian Y. Duration perception in peripheral vision: Underestimation increases with greater stimuli eccentricity. Atten Percept Psychophys 2024; 86:237-247. [PMID: 38087157] [DOI: 10.3758/s13414-023-02822-1]
Abstract
Duration perception plays a fundamental role in our daily visual activities; however, it can easily be distorted, and the distortion can depend on retinal location. While this topic has been extensively investigated in central vision, similar exploration in peripheral vision is still at an early stage. To investigate the influence of eccentricity, a commonly used indicator for quantifying retinal location, on duration perception in peripheral vision, we conducted two psychophysical experiments. In Experiment 1, we observed that retinal location influenced the Point of Subjective Equality (PSE) but not the Weber Fraction (WF) of stimuli appearing at eccentricities ranging from 30° to 70°. Except at 30°, the PSEs were significantly longer than 416.7 ms (25 frames), the duration of the standard stimuli. This suggests that participants underestimated duration, and this underestimation increased with greater distance from the central fixation point on the retina. To eliminate the potential interference of the central task used in Experiment 1, we conducted a supplementary experiment (Experiment 2) demonstrating that this central task did not change the underestimation (PSE) but did influence the sensitivity (WF) at an eccentricity of 50°. In summary, our findings reveal a compressive effect of eccentricity on duration perception in peripheral vision: as stimuli appeared more peripherally on the retina, there was an increasing underestimation of subjective duration. Reasons for, and survival advantages of, this underestimation are discussed. The findings provide new insight into duration perception in peripheral vision, highlighting a compressive underestimation effect that grows with eccentricity.
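PSE and WF are read off a fitted psychometric function. A minimal sketch with a cumulative-Gaussian fit (the 416.7 ms standard is from the abstract; the comparison durations, response proportions, and the JND convention are our illustrative choices):

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def psychometric(duration, pse, sigma):
    """P('comparison seemed longer than the standard')."""
    return norm.cdf(duration, loc=pse, scale=sigma)

standard = 416.7  # ms, duration of the standard stimulus
comparisons = np.array([250.0, 320.0, 390.0, 460.0, 530.0, 600.0])
p_longer = np.array([0.02, 0.10, 0.30, 0.55, 0.85, 0.97])  # simulated

(pse, sigma), _ = curve_fit(psychometric, comparisons, p_longer, p0=[standard, 50.0])
jnd = sigma * norm.ppf(0.75)  # one common JND convention
wf = jnd / pse                # Weber fraction
print(f"PSE = {pse:.0f} ms (PSE > standard implies underestimation), WF = {wf:.2f}")
```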
Affiliation(s)
- Xinle Bao
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China
- Zhengyin Gu
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China
- Jinxing Yang
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China
- You Li
- National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, 100094, China
- Duming Wang
- Department of Psychology, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China
- Yu Tian
- National Key Laboratory of Human Factors Engineering, China Astronaut Research and Training Center, Beijing, 100094, China
13
Balas B, Greene MR. The role of texture summary statistics in material recognition from drawings and photographs. J Vis 2023; 23:3. [PMID: 38064227] [PMCID: PMC10709799] [DOI: 10.1167/jov.23.14.3]
Abstract
Material depictions in artwork are useful tools for revealing the image features that support material categorization. For example, artistic recipes for drawing specific materials make explicit the critical information leading to recognizable material properties (Di Cicco, Wijntjes, & Pont, 2020), and investigating the recognizability of material renderings as a function of their visual features supports conclusions about the vocabulary of material perception. Here, we examined how the recognition of materials from photographs and drawings was affected by the application of the Portilla-Simoncelli texture synthesis model. This manipulation allowed us to examine how categorization may be affected differently across materials and image formats when only summary-statistic information about appearance is retained. Further, we compared human performance to the categorization accuracy obtained from a pretrained deep convolutional neural network to determine whether observers' performance was reflected in the network. Although we found some similarities between human and network performance for photographic images, the results obtained from drawings differed substantially. Our results demonstrate that texture statistics play a variable role in material categorization across rendering formats and material categories, and that the human perception of material drawings is not effectively captured by deep convolutional neural networks trained for object recognition.
Affiliation(s)
- Benjamin Balas
- Psychology Department, North Dakota State University, Fargo, ND, USA
- Michelle R Greene
- Psychology Department, Barnard College, Columbia University, New York, NY, USA
14
Henderson MM, Tarr MJ, Wehbe L. A texture statistics encoding model reveals hierarchical feature selectivity across human visual cortex. J Neurosci 2023; 43:4144-4161. [PMID: 37127366] [PMCID: PMC10255092] [DOI: 10.1523/jneurosci.1822-22.2023]
Abstract
Midlevel features, such as contour and texture, provide a computational link between low- and high-level visual representations. Although the nature of midlevel representations in the brain is not fully understood, past work has suggested a texture statistics model, called the P-S model (Portilla and Simoncelli, 2000), is a candidate for predicting neural responses in areas V1-V4 as well as human behavioral data. However, it is not currently known how well this model accounts for the responses of higher visual cortex to natural scene images. To examine this, we constructed single-voxel encoding models based on P-S statistics and fit the models to fMRI data from human subjects (both sexes) from the Natural Scenes Dataset (Allen et al., 2022). We demonstrate that the texture statistics encoding model can predict the held-out responses of individual voxels in early retinotopic areas and higher-level category-selective areas. The ability of the model to reliably predict signal in higher visual cortex suggests that the representation of texture statistics features is widespread throughout the brain. Furthermore, using variance partitioning analyses, we identify which features are most uniquely predictive of brain responses and show that the contributions of higher-order texture features increase from early areas to higher areas on the ventral and lateral surfaces. We also demonstrate that patterns of sensitivity to texture statistics can be used to recover broad organizational axes within visual cortex, including dimensions that capture semantic image content. These results provide a key step forward in characterizing how midlevel feature representations emerge hierarchically across the visual system.SIGNIFICANCE STATEMENT Intermediate visual features, like texture, play an important role in cortical computations and may contribute to tasks like object and scene recognition. Here, we used a texture model proposed in past work to construct encoding models that predict the responses of neural populations in human visual cortex (measured with fMRI) to natural scene stimuli. We show that responses of neural populations at multiple levels of the visual system can be predicted by this model, and that the model is able to reveal an increase in the complexity of feature representations from early retinotopic cortex to higher areas of ventral and lateral visual cortex. These results support the idea that texture-like representations may play a broad underlying role in visual processing.
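The generic recipe behind a voxelwise encoding model of this kind is: compute model features for each image, fit a regularized linear map per voxel, and score predictions on held-out images. A bare-bones sketch with synthetic data (ridge regression and held-out correlation are our generic choices, not details taken from the paper):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n_images, n_features = 1000, 200  # stand-ins for texture statistics per image

features = rng.normal(size=(n_images, n_features))  # e.g., P-S statistics
weights = rng.normal(size=n_features)
voxel = features @ weights + rng.normal(scale=5.0, size=n_images)  # one voxel

X_tr, X_te, y_tr, y_te = train_test_split(features, voxel, random_state=0)
model = Ridge(alpha=10.0).fit(X_tr, y_tr)
r = np.corrcoef(model.predict(X_te), y_te)[0, 1]  # held-out prediction accuracy
print(f"held-out prediction r = {r:.2f}")
```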
Affiliation(s)
- Margaret M Henderson
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Michael J Tarr
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
- Leila Wehbe
- Neuroscience Institute, Department of Psychology, and Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
15
Yu X, Zhou Z, Becker SI, Boettcher SEP, Geng JJ. Good-enough attentional guidance. Trends Cogn Sci 2023; 27:391-403. [PMID: 36841692] [DOI: 10.1016/j.tics.2023.01.007]
Abstract
Theories of attention posit that attentional guidance operates on information held in a target template within memory. The template is often thought to contain veridical target features, akin to a photograph, and to guide attention to objects that match the exact target features. However, recent evidence suggests that attentional guidance is highly flexible and often guided by non-veridical features, a subset of features, or only associated features. We integrate these findings and propose that attentional guidance maximizes search efficiency based on a 'good-enough' principle to rapidly localize candidate target objects. Candidates are then serially interrogated to make target-match decisions using more precise information. We suggest that good-enough guidance optimizes the speed-accuracy-effort trade-offs inherent in each stage of visual search.
Affiliation(s)
- Xinger Yu
- Center for Mind and Brain, University of California Davis, Davis, CA, USA; Department of Psychology, University of California Davis, Davis, CA, USA
- Zhiheng Zhou
- Center for Mind and Brain, University of California Davis, Davis, CA, USA
- Stefanie I Becker
- School of Psychology, University of Queensland, Brisbane, QLD, Australia
- Joy J Geng
- Center for Mind and Brain, University of California Davis, Davis, CA, USA; Department of Psychology, University of California Davis, Davis, CA, USA
16
Ensemble perception without phenomenal awareness of elements. Sci Rep 2022; 12:11922. [PMID: 35831387] [PMCID: PMC9279487] [DOI: 10.1038/s41598-022-15850-y]
Abstract
Humans efficiently recognize complex scenes by grouping multiple features and objects into ensembles. It has been suggested that ensemble processing does not require, or even impairs, conscious discrimination of individual element properties. The present study examined whether ensemble perception requires phenomenal awareness of the elements. We asked observers to judge the mean orientation of a line-based texture pattern whose central region was made invisible by backward masks. Masks were composed either of a Mondrian pattern (Exp. 1) or of an annular contour (Exp. 2), which, unlike the Mondrian, did not overlap spatially with elements in the central region. In the Mondrian-mask experiment, perceived mean orientation was determined only by the visible elements outside the central region. However, in the annular-mask experiment, perceived mean orientation matched the mean orientation of all elements, including the invisible elements within the central region. The results suggest that the visual system can compute spatial ensembles even without phenomenal awareness of the stimuli.
17
Chapman AF, Störmer VS. Feature similarity is non-linearly related to attentional selection: Evidence from visual search and sustained attention tasks. J Vis 2022; 22:4. [PMID: 35834377] [PMCID: PMC9290316] [DOI: 10.1167/jov.22.8.4]
Abstract
Although many theories of attention highlight the importance of similarity between target and distractor items for selection, few studies have directly quantified the function underlying this relationship. Across two commonly used tasks, visual search and sustained attention, we investigated how target-distractor similarity impacts feature-based attentional selection. Importantly, we found comparable patterns of performance in both tasks, with performance (response times and d', respectively) plateauing at medium target-distractor distances (40°-50° around a luminance-matched color wheel). In contrast, visual search efficiency, as measured by search slopes, was affected by a much narrower range of similarity levels (10°-20°). We assessed the relationship between target-distractor similarity and attentional performance using both a stimulus-based and a psychologically based measure of similarity and found this nonlinear relationship in both cases. However, psychological similarity accounted for some of the nonlinearities observed in the data, suggesting that measures of psychological similarity are more appropriate when studying effects of target-distractor similarity. These findings place novel constraints on models of selective attention and emphasize the importance of considering the similarity structure of the feature space over which attention operates. Broadly, the nonlinear effects of similarity on attention are consistent with accounts proposing that attention exaggerates the distance between competing representations, possibly through enhancement of off-tuned neurons.
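For reference, the d' measure used in the sustained-attention task is computed from hit and false-alarm rates; the example rates below are made up:

```python
from scipy.stats import norm

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Signal-detection sensitivity: z(hits) - z(false alarms)."""
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# e.g., a large vs. small target-distractor distance
print(d_prime(0.90, 0.10))  # ~2.56
print(d_prime(0.65, 0.35))  # ~0.77
```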
Affiliation(s)
- Angus F Chapman
- Department of Psychology, University of California, San Diego, La Jolla, CA, USA
- Viola S Störmer
- Department of Brain and Psychological Sciences, Dartmouth College, Hanover, NH, USA
18
Herzog MH. The irreducibility of vision: Gestalt, crowding and the fundamentals of vision. Vision (Basel) 2022; 6:35. [PMID: 35737422] [PMCID: PMC9228288] [DOI: 10.3390/vision6020035]
Abstract
What is fundamental in vision has been discussed for millennia. For philosophical realists and the physiological approach to vision, the objects of the outer world are truly given, and failures to perceive objects properly, such as in illusions, are just sporadic misperceptions. The goal is to replace the subjectivity of the mind with careful physiological analyses. Continental philosophy and the Gestaltists are rather skeptical about, or simply ignore, external objects. The percepts themselves are their starting point, because it is hard to deny the truth of one's own percepts. I will show that, whereas both approaches can explain many visual phenomena with classic visual stimuli, both have trouble when stimuli become slightly more complex. I suggest that these failures have a deeper conceptual reason, namely that their foundations (objects, percepts) do not hold true. I propose that only physical states exist in a mind-independent manner and that everyday objects, such as bottles and trees, are perceived in a mind-dependent way. The fundamental units for processing objects are extended windows of unconscious processing, followed by short, discrete conscious percepts.
Affiliation(s)
- Michael H Herzog
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
19
Wolfe B, Sawyer BD, Rosenholtz R. Toward a theory of visual information acquisition in driving. Hum Factors 2022; 64:694-713. [PMID: 32678682] [PMCID: PMC9136385] [DOI: 10.1177/0018720820939693]
Abstract
OBJECTIVE The aim of this study is to describe information acquisition theory, explaining how drivers acquire and represent the information they need. BACKGROUND While questions of what drivers are aware of underlie many questions in driver behavior, existing theories do not directly address how drivers in particular and observers in general acquire visual information. Understanding the mechanisms of information acquisition is necessary to build predictive models of drivers' representation of the world and can be applied beyond driving to a wide variety of visual tasks. METHOD We describe our theory of information acquisition, looking to questions in driver behavior and results from vision science research that speak to its constituent elements. We focus on the intersection of peripheral vision, visual attention, and eye movement planning and identify how an understanding of these visual mechanisms and processes in the context of information acquisition can inform more complete models of driver knowledge and state. RESULTS We set forth our theory of information acquisition, describing the gap in understanding that it fills and how existing questions in this space can be better understood using it. CONCLUSION Information acquisition theory provides a new and powerful way to study, model, and predict what drivers know about the world, reflecting our current understanding of visual mechanisms and enabling new theories, models, and applications. APPLICATION Using information acquisition theory to understand how drivers acquire, lose, and update their representation of the environment will aid development of driver assistance systems, semiautonomous vehicles, and road safety overall.
20
Yang S, Wilson K, Roady T, Kuo J, Lenné MG. Beyond gaze fixation: Modeling peripheral vision in relation to speed, Tesla Autopilot, cognitive load, and age in highway driving. Accid Anal Prev 2022; 171:106670. [PMID: 35429654] [DOI: 10.1016/j.aap.2022.106670]
Abstract
OBJECTIVE The study aims to model driver perception across the visual field in dynamic, real-world highway driving. BACKGROUND Peripheral vision acquires information across the visual field and guides a driver's information search. Studies in naturalistic settings are lacking, however, with most research having been conducted in controlled simulation environments with limited eccentricities and driving dynamics. METHODS We analyzed data from 24 participants who drove a Tesla Model S with Autopilot on the highway. While driving, participants completed a peripheral detection task (PDT) using LEDs and the N-back task to generate cognitive load. The I-DT (identification by dispersion threshold) algorithm sampled naturalistic gaze fixations during PDTs to cover a broader and continuous spectrum of eccentricity. A generalized Bayesian regression model predicted LED detection probability during the PDT (a surrogate for peripheral vision) in relation to eccentricity, vehicle speed, driving mode, cognitive load, and age. RESULTS The model predicted that LED detection probability was high and stable through near-peripheral vision but declined rapidly beyond 20°-30° eccentricity, showing a narrower useful field within a broader visual field (maximum 70°) during highway driving. Reduced speed (while following another vehicle), cognitive load, and older age were the main factors that degraded mid-peripheral vision (20°-50°), while using Autopilot had little effect. CONCLUSIONS Drivers can reliably detect objects through near-peripheral vision, but their peripheral detection degrades gradually with greater eccentricity, foveal demand during low-speed vehicle following, cognitive load, and age. APPLICATIONS The findings encourage the development of further multivariate computational models to estimate peripheral vision and assess driver situation awareness for crash prevention.
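The shape of the reported detection curve (stable centrally, rapid decline past 20°-30°) is the kind of relationship a logistic model captures. A deliberately simplified, non-Bayesian stand-in for the paper's generalized Bayesian regression, run on simulated data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
ecc = rng.uniform(0, 70, 2000)  # LED eccentricity (deg), up to the 70 deg maximum
p_true = 1 / (1 + np.exp(0.15 * (ecc - 25)))  # detection degrades past ~25 deg
detected = (rng.random(2000) < p_true).astype(int)

model = LogisticRegression().fit(ecc.reshape(-1, 1), detected)
for e in (10, 30, 50):
    p = model.predict_proba([[e]])[0, 1]
    print(f"predicted detection probability at {e} deg: {p:.2f}")
```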
Affiliation(s)
- Shiyan Yang
- Seeing Machines, 80 Mildura St, Fyshwick 2609 ACT, Australia
- Kyle Wilson
- Seeing Machines, 80 Mildura St, Fyshwick 2609 ACT, Australia; Department of Psychology, University of Huddersfield, West Yorkshire, UK
- Trey Roady
- Seeing Machines, 80 Mildura St, Fyshwick 2609 ACT, Australia
- Jonny Kuo
- Seeing Machines, 80 Mildura St, Fyshwick 2609 ACT, Australia
- Michael G Lenné
- Seeing Machines, 80 Mildura St, Fyshwick 2609 ACT, Australia
21
Rummens K, Sayim B. Multidimensional feature interactions in visual crowding: When configural cues eliminate the polarity advantage. J Vis 2022; 22:2. [PMID: 35503508] [PMCID: PMC9078080] [DOI: 10.1167/jov.22.6.2]
Abstract
Crowding occurs when surrounding objects (flankers) impair target perception. A key property of crowding is the weaker interference when target and flankers strongly differ on a given dimension. For instance, identification of a target letter is usually superior with flankers of opposite versus the same contrast polarity as the target (the "polarity advantage"). High performance when target-flanker similarity is low has been attributed to the ungrouping of target and flankers. Here, we show that configural cues can override the usual advantage of low target-flanker similarity, and strong target-flanker grouping can reduce, instead of exacerbate, crowding. In Experiment 1, observers were presented with line triplets in the periphery and reported the tilt (left or right) of the central line. Target and flankers had the same (uniform condition) or opposite contrast polarity (alternating condition). Flanker configurations were either upright (||), unidirectionally tilted (\\ or //), or bidirectionally tilted (\/ or /\). Upright flankers yielded stronger crowding than unidirectional flankers, and weaker crowding than bidirectional flankers. Importantly, our results revealed a clear interaction between contrast polarity and flanker configuration. Triplets with upright and bidirectional flankers, but not unidirectional flankers, showed the polarity advantage. In Experiments 2 and 3, we showed that emergent features and redundancy masking (i.e., the reduction of the number of perceived items in repeating configurations) made it easier to discriminate between uniform triplets when flanker tilts were unidirectional (but not when bidirectional). We propose that the spatial configurations of uniform triplets with unidirectional flankers provided sufficient task-relevant information to enable performance similar to that with alternating triplets: strong target-flanker grouping alleviated crowding. We suggest that features which modulate crowding strength can interact non-additively, limiting the validity of typical crowding rules to contexts where only single, independent dimensions determine the effects of target-flanker similarity.
Affiliation(s)
- Koen Rummens
- University of Bern, Institute of Psychology, Bern, Switzerland
- Bilge Sayim
- University of Bern, Institute of Psychology, Bern, Switzerland
- Université de Lille, CNRS, UMR 9193 - SCALab - Sciences Cognitives et Sciences Affectives, Lille, France
22
Abstract
Humans are exquisitely sensitive to the spatial arrangement of visual features in objects and scenes, but not in visual textures. Category-selective regions in the visual cortex are widely believed to underlie object perception, suggesting such regions should distinguish natural images of objects from synthesized images containing similar visual features in scrambled arrangements. To the contrary, we demonstrate that representations in category-selective cortex do not discriminate natural images from feature-matched scrambles but can discriminate images of different categories, suggesting a texture-like encoding. We find similar insensitivity to feature arrangement in ImageNet-trained deep convolutional neural networks. This suggests the need to reconceptualize the role of category-selective cortex as representing a basis set of complex texture-like features, useful for a myriad of behaviors.

The human visual ability to recognize objects and scenes is widely thought to rely on representations in category-selective regions of the visual cortex. These representations could support object vision by specifically representing objects or, more simply, by representing complex visual features regardless of the particular spatial arrangement needed to constitute real-world objects, that is, by representing visual textures. To discriminate between these hypotheses, we leveraged an image synthesis approach that, unlike previous methods, provides independent control over the complexity and spatial arrangement of visual features. We found that human observers could easily detect a natural object among synthetic images with similar complex features that were spatially scrambled. However, observer models built from BOLD responses from category-selective regions, as well as a model of macaque inferotemporal cortex and ImageNet-trained deep convolutional neural networks, were all unable to identify the real object. This inability was not due to a lack of signal to noise, as all observer models could predict human performance in image categorization tasks. How then might these texture-like representations in category-selective regions support object perception? An image-specific readout from category-selective cortex yielded a representation that was more selective for natural feature arrangement, showing that the information necessary for natural object discrimination is available. Thus, our results suggest that the role of the human category-selective visual cortex is not to explicitly encode objects but rather to provide a basis set of texture-like features that can be infinitely reconfigured to flexibly learn and identify new object categories.
23
Gupta SK, Zhang M, Wu CC, Wolfe JM, Kreiman G. Visual search asymmetry: Deep nets and humans share similar inherent biases. Adv Neural Inf Process Syst 2021; 34:6946-6959. [PMID: 36062138] [PMCID: PMC9436507]
Abstract
Visual search is a ubiquitous and often challenging daily task, exemplified by looking for the car keys at home or a friend in a crowd. An intriguing property of some classical search tasks is an asymmetry such that finding a target A among distractors B can be easier than finding B among A. To elucidate the mechanisms responsible for asymmetry in visual search, we propose a computational model that takes a target and a search image as inputs and produces a sequence of eye movements until the target is found. The model integrates eccentricity-dependent visual recognition with target-dependent top-down cues. We compared the model against human behavior in six paradigmatic search tasks that show asymmetry in humans. Without prior exposure to the stimuli or task-specific training, the model provides a plausible mechanism for search asymmetry. We hypothesized that the polarity of search asymmetry arises from experience with the natural environment. We tested this hypothesis by training the model on augmented versions of ImageNet where the biases of natural images were either removed or reversed. The polarity of search asymmetry disappeared or was altered depending on the training protocol. This study highlights how classical perceptual properties can emerge in neural network models, without the need for task-specific training, but rather as a consequence of the statistical properties of the developmental diet fed to the model. All source code and data are publicly available at https://github.com/kreimanlab/VisualSearchAsymmetry.
Affiliation(s)
- Mengmi Zhang
- Children's Hospital, Harvard Medical School
- Center for Brains, Minds and Machines
- Chia-Chien Wu
- Brigham and Women's Hospital, Harvard Medical School
- Gabriel Kreiman
- Children's Hospital, Harvard Medical School
- Center for Brains, Minds and Machines
24
Bornet A, Choung OH, Doerig A, Whitney D, Herzog MH, Manassi M. Global and high-level effects in crowding cannot be predicted by either high-dimensional pooling or target cueing. J Vis 2021; 21:10. [PMID: 34812839] [PMCID: PMC8626847] [DOI: 10.1167/jov.21.12.10]
Abstract
In visual crowding, the perception of a target deteriorates in the presence of nearby flankers. Traditionally, target-flanker interactions have been considered as local, mostly deleterious, low-level, and feature specific, occurring when information is pooled along the visual processing hierarchy. Recently, a vast literature of high-level effects in crowding (grouping effects and face-holistic crowding in particular) led to a different understanding of crowding, as a global, complex, and multilevel phenomenon that cannot be captured or explained by simple pooling models. It was recently argued that these high-level effects may still be captured by more sophisticated pooling models, such as the Texture Tiling model (TTM). Unlike simple pooling models, the high-dimensional pooling stage of the TTM preserves rich information about a crowded stimulus and, in principle, this information may be sufficient to drive high-level and global aspects of crowding. In addition, it was proposed that grouping effects in crowding may be explained by post-perceptual target cueing. Here, we extensively tested the predictions of the TTM on the results of six different studies that highlighted high-level effects in crowding. Our results show that the TTM cannot explain any of these high-level effects, and that the behavior of the model is equivalent to a simple pooling model. In addition, we show that grouping effects in crowding cannot be predicted by post-perceptual factors, such as target cueing. Taken together, these results reinforce once more the idea that complex target-flanker interactions determine crowding and that crowding occurs at multiple levels of the visual hierarchy.
Collapse
Affiliation(s)
- Alban Bornet
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Oh-Hyeon Choung
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Adrien Doerig
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
| | - David Whitney
- Department of Psychology, University of California, Berkeley, California, USA
- Helen Wills Neuroscience Institute, University of California, Berkeley, California, USA
- Vision Science Group, University of California, Berkeley, California, USA
| | - Michael H Herzog
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Mauro Manassi
- School of Psychology, University of Aberdeen, King's College, Aberdeen, UK
| |
Collapse
|
25
|
Abstract
In crowding, perception of a target deteriorates in the presence of nearby flankers. Surprisingly, perception can be rescued from crowding if additional flankers are added (uncrowding). Uncrowding is a major challenge for all classic models of crowding and vision in general, because the global configuration of the entire stimulus is crucial. However, it is unclear which characteristics of the configuration impact (un)crowding. Here, we systematically dissected flanker configurations and showed that (un)crowding cannot be easily explained by the effects of the sub-parts or low-level features of the stimulus configuration. Our modeling results suggest that (un)crowding requires global processing. These results are well in line with previous studies showing the importance of global aspects in crowding.
Collapse
Affiliation(s)
- Oh-Hyeon Choung
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Alban Bornet
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Adrien Doerig
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
| | - Michael H Herzog
- Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| |
Collapse
|
26
|
Wu R, Wang B, Zhuo Y, Chen L. Topological dominance in peripheral vision. J Vis 2021; 21:19. [PMID: 34570176 PMCID: PMC8479572 DOI: 10.1167/jov.21.10.19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
The question of what peripheral vision is good for, especially in pattern recognition, is one of the most important and controversial issues in cognitive science. In a series of experiments, we provide substantial evidence that observers' behavioral performance in the periphery is consistently superior to that in central vision for topological change detection, while nontopological change detection deteriorates with increasing eccentricity. These experiments generalize the topological account of object perception in the periphery to different kinds of topological changes (i.e., the introduction, disappearance, and change in number of holes) in comparison with a broad spectrum of geometric properties (e.g., luminance, similarity, spatial frequency, perimeter, and shape of the contour). Moreover, when the stimuli were scaled according to the cortical magnification factor and task difficulty was well controlled by adjusting the luminance of the background, the advantage of topological change detection in the periphery remained. The observed advantage of topological change detection in the periphery supports the view that the topological definition of objects provides a coherent account of object perception in peripheral vision, allowing pattern recognition with limited acuity.
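The topological property driving these effects, the number of holes, is simple to operationalize for binary images. This is our own minimal sketch (not the authors' stimuli or analysis), treating a hole as a background component that never touches the image border.

```python
import numpy as np
from scipy import ndimage

def count_holes(shape):
    """Count holes in a binary image: background components that are
    fully enclosed by the figure (i.e., never reach the border)."""
    background, n = ndimage.label(~shape.astype(bool))
    border_labels = np.unique(np.concatenate([
        background[0, :], background[-1, :],
        background[:, 0], background[:, -1]]))
    # Subtract the border-touching background components (label 0 marks
    # figure pixels, not background, so it is excluded).
    return n - len(set(border_labels) - {0})

ring = np.zeros((9, 9), dtype=bool)
ring[2:7, 2:7] = True
ring[4, 4] = False        # punch a hole
print(count_holes(ring))  # -> 1
```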
Collapse
Affiliation(s)
- Ruijie Wu
- State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
| | - Bo Wang
- State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Hefei Comprehensive National Science Center, Institute of Artificial Intelligence, Hefei, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yan Zhuo
- State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Lin Chen
- State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Hefei Comprehensive National Science Center, Institute of Artificial Intelligence, Hefei, China
- University of Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China
| |
Collapse
|
27
|
Variance misperception under skewed empirical noise statistics explains overconfidence in the visual periphery. Atten Percept Psychophys 2021; 84:161-178. [PMID: 34426932 DOI: 10.3758/s13414-021-02358-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/16/2021] [Indexed: 11/08/2022]
Abstract
Perceptual confidence typically corresponds to accuracy. However, observers can be overconfident relative to accuracy, termed "subjective inflation." Inflation is stronger in the visual periphery relative to central vision, especially under conditions of peripheral inattention. Previous literature suggests inflation stems from errors in estimating noise (i.e., "variance misperception"). However, despite previous Bayesian hypotheses about metacognitive noise estimation, no work has systematically explored how noise estimation may critically depend on empirical noise statistics, which may differ across the visual field, with central noise distributed symmetrically but peripheral noise positively skewed. Here, we examined central and peripheral vision predictions from five Bayesian-inspired noise-estimation algorithms under varying usage of noise priors, including effects of attention. Models that failed to optimally estimate noise exhibited peripheral inflation, but only models that explicitly used peripheral noise priors, yet used them incorrectly, showed increasing peripheral inflation under increasing peripheral inattention. Further, only one model successfully captured previous empirical results, in which confidence selectively increased for incorrect responses when inattention reduced performance, with no change in confidence for correct responses; this was the model that implemented Bayesian estimation of peripheral noise using an (incorrect) symmetric rather than the correct positively skewed peripheral noise prior. Our findings explain peripheral inflation, especially under inattention, and suggest future experiments that might reveal the noise expectations used by the visual metacognitive system.
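To convey the core claim in miniature: if trial-to-trial noise is positively skewed but confidence is computed under a symmetric Gaussian assumption with the same spread, confidence and accuracy decouple. The simulation below is our own toy construction, not one of the five algorithms compared in the paper, and its parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sig = 200_000, 0.5

# Positively skewed noise, standardized to zero mean and unit variance.
raw = rng.lognormal(0.0, 0.8, n)
noise = (raw - raw.mean()) / raw.std()

s = rng.choice([-sig, sig], n)   # 2AFC stimulus on each trial
x = s + noise                    # internal evidence
correct = np.sign(x) == np.sign(s)

# Confidence in the chosen alternative, computed as if noise were N(0, 1):
# with equal priors, p(chosen | x) = 1 / (1 + exp(-2 * sig * |x|)).
conf = 1.0 / (1.0 + np.exp(-2.0 * sig * np.abs(x)))

print(f"accuracy        {correct.mean():.3f}")
print(f"mean confidence {conf.mean():.3f}")  # miscalibrated under skew
```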
Collapse
|
28
|
Redundancy between spectral and higher-order texture statistics for natural image segmentation. Vision Res 2021; 187:55-65. [PMID: 34217005 DOI: 10.1016/j.visres.2021.06.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2020] [Revised: 06/09/2021] [Accepted: 06/11/2021] [Indexed: 11/23/2022]
Abstract
Visual texture, defined by local image statistics, provides important information to the human visual system for perceptual segmentation. Second-order or spectral statistics (equivalent to the Fourier power spectrum) are a well-studied segmentation cue. However, the role of higher-order statistics (HOS) in segmentation remains unclear, particularly for natural images. Recent experiments indicate that, in peripheral vision, the HOS of the widely adopted Portilla-Simoncelli texture model are a weak segmentation cue compared to spectral statistics, despite the fact that both are necessary to explain other perceptual phenomena and to support high-quality texture synthesis. Here we test whether this discrepancy reflects a property of natural image statistics. First, we observe that differences in spectral statistics across segments of natural images are redundant with differences in HOS. Second, using linear and nonlinear classifiers, we show that each set of statistics individually affords high performance in natural scenes and texture segmentation tasks, but combining spectral statistics and HOS produces relatively small improvements. Third, we find that HOS improve segmentation for a subset of images, although these images are difficult to identify. We also find that different subsets of HOS improve segmentation to a different extent, in agreement with previous physiological and perceptual work. These results show that the HOS add modestly to spectral statistics for natural image segmentation. We speculate that tuning to natural image statistics under resource constraints could explain the weak contribution of HOS to perceptual segmentation in human peripheral vision.
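As a rough sketch of the two feature sets being compared (with crude stand-ins for the Portilla-Simoncelli statistics, which are far richer), one can describe a patch by band-averaged spectral power and by simple higher-order marginal statistics, then test how much the latter add to any off-the-shelf classifier. The function names and statistic choices below are ours, not the paper's.

```python
import numpy as np

def spectral_stats(patch, n_bands=6):
    """Second-order descriptor: Fourier power averaged in radial bands."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(patch))) ** 2
    rows, cols = np.indices(patch.shape)
    center = (np.array(patch.shape) - 1) / 2.0
    radius = np.hypot(rows - center[0], cols - center[1])
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bands + 1)
    return np.array([power[(radius >= lo) & (radius < hi)].mean()
                     for lo, hi in zip(edges[:-1], edges[1:])])

def higher_order_stats(patch):
    """Crude higher-order descriptor: marginal skewness and excess kurtosis."""
    z = (patch - patch.mean()) / patch.std()
    return np.array([(z ** 3).mean(), (z ** 4).mean() - 3.0])

patch = np.random.default_rng(0).standard_normal((32, 32))
features = np.concatenate([spectral_stats(patch), higher_order_stats(patch)])
```

Comparing classifier performance with and without the higher-order block mirrors the redundancy test reported above.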
Collapse
|
29
|
Veríssimo IS, Hölsken S, Olivers CNL. Individual differences in crowding predict visual search performance. J Vis 2021; 21:29. [PMID: 34038508 PMCID: PMC8164367 DOI: 10.1167/jov.21.5.29] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2020] [Accepted: 03/12/2021] [Indexed: 11/24/2022] Open
Abstract
Visual search is an integral part of human behavior and has proven important to understanding mechanisms of perception, attention, memory, and oculomotor control. Thus far, the dominant theoretical framework posits that search is mainly limited by covert attentional mechanisms, comprising a central bottleneck in visual processing. A different class of theories seeks the cause in the inherent limitations of peripheral vision, with search being constrained by what is known as the functional viewing field (FVF). One of the major factors limiting peripheral vision, and thus the FVF, is crowding. We adopted an individual differences approach to test the prediction from FVF theories that visual search performance is determined by the efficacy of peripheral vision, in particular crowding. Forty-four participants were assessed with regard to their sensitivity to crowding (as measured by critical spacing) and their search efficiency (as indicated by manual responses and eye movements). This revealed substantial correlations between the two tasks, as stronger susceptibility to crowding was predictive of slower search, more eye movements, and longer fixation durations. Our results support FVF theories in showing that peripheral vision is an important determinant of visual search efficiency.
Collapse
Affiliation(s)
- Inês S Veríssimo
- Cognitive Psychology, Institute for Brain and Behavior, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Stefanie Hölsken
- Cognitive Psychology, Institute for Brain and Behavior, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Christian N L Olivers
- Cognitive Psychology, Institute for Brain and Behavior, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- https://www.vupsy.nl/
| |
Collapse
|
30
|
Bates CJ, Jacobs RA. Optimal attentional allocation in the presence of capacity constraints in uncued and cued visual search. J Vis 2021; 21:3. [PMID: 33944906 PMCID: PMC8107488 DOI: 10.1167/jov.21.5.3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2020] [Accepted: 03/09/2021] [Indexed: 11/24/2022] Open
Abstract
The vision sciences literature contains a large diversity of experimental and theoretical approaches to the study of visual attention. We argue that this diversity arises, at least in part, from the field's inability to unify differing theoretical perspectives. In particular, the field has been hindered by a lack of a principled formal framework for simultaneously thinking about both optimal attentional processing and capacity-limited attentional processing, where capacity is limited in a general, task-independent manner. Here, we supply such a framework based on rate-distortion theory (RDT) and optimal lossy compression. Our approach defines Bayes-optimal performance when an upper limit on information processing rate is imposed. In this article, we compare Bayesian and RDT accounts in both uncued and cued visual search tasks. We start by highlighting a typical shortcoming of unlimited-capacity Bayesian models that is not shared by RDT models, namely, that they often overestimate task performance when information-processing demands are increased. Next, we reexamine data from two cued-search experiments that have previously been modeled as the result of unlimited-capacity Bayesian inference and demonstrate that they can just as easily be explained as the result of optimal lossy compression. To model cued visual search, we introduce the concept of a "conditional communication channel." This simple extension generalizes the lossy-compression framework such that it can, in principle, predict optimal attentional-shift behavior in any kind of perceptual task, even when inputs to the model are raw sensory data such as image pixels. To demonstrate this idea's viability, we compare our idealized model of cued search, which operates on a simplified abstraction of the stimulus, to a deep neural network version that performs approximately optimal lossy compression on the real (pixel-level) experimental stimuli.
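The "optimal lossy compression" benchmark here is standard rate-distortion theory. The sketch below is the generic Blahut-Arimoto iteration for tracing the rate-distortion trade-off of a discrete source, not the authors' search model (their cued-search extension adds a conditional channel on top of this machinery).

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, n_iter=500):
    """Trace one point on the rate-distortion curve of a discrete source.

    Minimizes I(X; Xhat) + beta * E[d(X, Xhat)] over channels q(xhat | x).
    p_x:  (n,) source distribution; dist: (n, m) distortion matrix;
    beta: trade-off weight (larger -> lower distortion, higher rate).
    """
    n, m = dist.shape
    q = np.full((n, m), 1.0 / m)          # initial channel q(xhat | x)
    for _ in range(n_iter):
        r = p_x @ q                       # output marginal over xhat
        q = r[None, :] * np.exp(-beta * dist)
        q /= q.sum(axis=1, keepdims=True)
    r = p_x @ q
    rate = np.sum(p_x[:, None] * q * np.log2(q / r[None, :]))
    distortion = np.sum(p_x[:, None] * q * dist)
    return rate, distortion

# Fair binary source with Hamming distortion: rate approaches 1 bit as
# beta grows (zero distortion) and 0 bits as beta shrinks.
print(blahut_arimoto(np.array([0.5, 0.5]), 1.0 - np.eye(2), beta=3.0))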
Collapse
Affiliation(s)
| | - Robert A Jacobs
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
| |
Collapse
|
31
|
Abstract
This chapter starts by reviewing the various interpretations of Bálint syndrome over time. We then develop a novel integrative view in which we propose that the various symptoms, historically reported and labeled by various authors, result from a core mislocalization deficit. This idea is in accordance with our previous proposal that the core deficit of Bálint syndrome is attentional (Pisella et al., 2009, 2013, 2017), since covert attention improves spatial resolution in the visual periphery (Yeshurun and Carrasco, 1998); a deficit of covert attention would thus increase spatial uncertainty and thereby impair both visual object identification and visuomotor accuracy. In peripheral vision, we perceive the intrinsic characteristics of the perceptual elements surrounding us, but not their precise localization (Rosenholtz et al., 2012a,b), such that without covert attention we cannot organize them into their respective, recognizable objects; this explains why perceptual symptoms (simultanagnosia, neglect) could result from visual mislocalization. The visuomotor symptoms (optic ataxia) can be accounted for by both visual and proprioceptive mislocalizations in an oculocentric reference frame, leading to field and hand effects, respectively. This new pathophysiological account is presented along with a model of posterior parietal cortex organization in which the superior part is devoted to covert attention, while the right inferior part is involved in visual remapping. When the right inferior parietal cortex is damaged, additional representational mislocalizations across saccades worsen the clinical picture of peripheral mislocalizations due to an impairment of covert attention.
Collapse
|
32
|
Contributions of ensemble perception to outlier representation precision. Atten Percept Psychophys 2021; 83:1141-1151. [PMID: 33728510 DOI: 10.3758/s13414-021-02270-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/31/2021] [Indexed: 11/08/2022]
Abstract
It is known that the visual system can efficiently extract mean and variance information, facilitating the detection of outliers. However, no research to date has directly investigated whether ensemble perception mechanisms contribute to outlier representation precision. We specifically were interested in how the distinctiveness of outliers impacts their precision. Across two experiments, we compared how accurately viewers represented the orientation of spatial outliers that varied in distinctiveness and found that increased outlier distinctiveness resulted in greater precision. Based on comparisons of our data to simulations reflecting particular selective strategies, we eliminated the possibility that participants were selectively processing the outlier, at the expense of the ensemble. Thus, we argued that participants separately represented distinct outliers along with ensemble summaries of the remaining items in a display. We also found that outlier distinctiveness moderated the precision of how the remaining items were summarized. We discuss these findings in relation to computational capacity and constraints of ensemble perception mechanisms.
Collapse
|
33
|
Yildirim FZ, Coates DR, Sayim B. Redundancy masking: The loss of repeated items in crowded peripheral vision. J Vis 2021; 20:14. [PMID: 32330230 PMCID: PMC7405779 DOI: 10.1167/jov.20.4.14] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Crowding is the deterioration of target identification in the presence of neighboring objects. Recent studies using appearance-based methods showed that the perceived number of target elements is often diminished in crowding. Here we introduce a related type of diminishment in repeating patterns (sets of parallel lines), which we term “redundancy masking.” In four experiments, observers were presented with arrays of small numbers of lines centered at 10° eccentricity. The task was to indicate the number of lines. In Experiment 1, spatial characteristics of redundancy masking were examined by varying the inter-line spacing. We found that redundancy masking decreased with increasing inter-line spacing and ceased at spacings of approximately 0.25 times the eccentricity. In Experiment 2, we assessed whether the strength of redundancy masking differed between radial and tangential arrangements of elements as it does in crowding. Redundancy masking was strong with radially arranged lines (horizontally arranged vertical lines), and absent with tangentially arranged lines (vertically arranged horizontal lines). In Experiment 3, we investigated whether target size (line width and length) modulated redundancy masking. There was an effect of width: Thinner lines yielded stronger redundancy masking. We did not find any differences between the tested line lengths. In Experiment 4, we varied the regularity of the line arrays by vertically or horizontally jittering the positions of the lines. Redundancy masking was strongest with regular spacings and weakened with decreasing regularity. Our experiments show under which conditions whole items are lost in crowded displays, and how this redundancy masking resembles—and partly diverges from—crowded identification. We suggest that redundancy masking is a contributor to the deterioration of performance in crowded displays with redundant patterns.
Collapse
|
34
|
Multiple concurrent centroid judgments imply multiple within-group salience maps. Atten Percept Psychophys 2021; 83:934-955. [PMID: 33400221 DOI: 10.3758/s13414-020-02197-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/30/2020] [Indexed: 11/08/2022]
Abstract
Subjects viewed a brief flash of 8-24 dots of either two or three colors, randomly arrayed. Their task was to move a mouse cursor to the centroid (center of gravity) of each color in a pre-designated order. Conventional and ideal-detector analyses show that subjects accurately judged all three centroids, utilizing an astounding 13/24 stimulus dots, with only a modest loss of accuracy compared to judging a single predesignated color centroid. The ability to concurrently compute three centroids is important because it is believed that centroid judgments are made on salience maps that record only salience and are ignorant of the features that produced the salience. Our explanation, instantiated in a computational model of salience processing, is that subjects have three salience maps. Dots are initially segregated into three groups according to color; each color group is then recorded on a different salience map to compute a centroid. In Part 2, the data are analyzed in terms of Attention Operating Characteristics to characterize impairments in subjects' color-attention filters (mostly insignificant) and encoding efficiency (a 20% drop for the hardest task) when making multiple versus single centroid judgments. A new, more sensitive analysis measured five sources of subject error variance: four independent, additive sources (imperfect color-attention filters; a Bayesian-like bias towards a central tendency; storage, retrieval, and cursor-misplacement error; and a large residual error due mostly to inefficient encoding), plus a fifth, interactive source: error in all four components that increases when multiple centroid judgments, rather than a single centroid judgment, are required on each trial.

SIGNIFICANCE STATEMENT: An important brain process is a salience map, a representation of the relative importance (salience) of the locations of visual space. It is needed to guide where to look next, to compute the center (technically the "centroid") of a cluster of items, and for other important computations. Here we show that in a brief flash of dots of three different colors, randomly interleaved, subjects can compute all three centroids. As a single salience map cannot discriminate dots of different colors, accurately reporting three centroids demonstrates that subjects have not just one, as is commonly believed, but at least three salience maps.
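The judgment itself, one centroid per color group, is computationally trivial, which is what makes the concurrent-maps claim interesting. A minimal sketch of the readout (our illustration, not the paper's model or analyses):

```python
import numpy as np

def color_centroids(positions, colors):
    """positions: (n, 2) dot coordinates; colors: (n,) color labels.
    Returns {color: centroid}, mimicking one salience-map readout
    per color group."""
    return {c: positions[colors == c].mean(axis=0)
            for c in np.unique(colors)}

rng = np.random.default_rng(0)
pos = rng.uniform(0, 100, size=(18, 2))
cols = np.repeat(np.array(["red", "green", "blue"]), 6)
print(color_centroids(pos, cols))
```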
Collapse
|
35
|
Herrera-Esposito D, Coen-Cagli R, Gomez-Sena L. Flexible contextual modulation of naturalistic texture perception in peripheral vision. J Vis 2021; 21:1. [PMID: 33393962 PMCID: PMC7794279 DOI: 10.1167/jov.21.1.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2020] [Accepted: 12/01/2020] [Indexed: 11/24/2022] Open
Abstract
Peripheral vision comprises most of our visual field, and is essential in guiding visual behavior. Its characteristic capabilities and limitations, which distinguish it from foveal vision, have been explained by the most influential theory of peripheral vision as the product of representing the visual input using summary statistics. Despite its success, this account may provide a limited understanding of peripheral vision, because it neglects processes of perceptual grouping and segmentation. To test this hypothesis, we studied how contextual modulation, namely the modulation of the perception of a stimulus by its surrounds, interacts with segmentation in human peripheral vision. We used naturalistic textures, which are directly related to summary-statistics representations. We show that segmentation cues affect contextual modulation, and that this is not captured by our implementation of the summary-statistics model. We then characterize the effects of different texture statistics on contextual modulation, providing guidance for extending the model, as well as for probing neural mechanisms of peripheral vision.
Collapse
Affiliation(s)
- Daniel Herrera-Esposito
- Laboratorio de Neurociencias, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
| | - Ruben Coen-Cagli
- Department of Systems and Computational Biology and Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Leonel Gomez-Sena
- Laboratorio de Neurociencias, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
| |
Collapse
|
36
|
Vialatte A, Yeshurun Y, Khan AZ, Rosenholtz R, Pisella L. Superior Parietal Lobule: A Role in Relative Localization of Multiple Different Elements. Cereb Cortex 2021; 31:658-671. [PMID: 32959044 DOI: 10.1093/cercor/bhaa250] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2020] [Revised: 08/10/2020] [Accepted: 08/10/2020] [Indexed: 12/13/2022] Open
Abstract
Simultanagnosia is an impairment in processing multiple visual elements simultaneously following bilateral posterior parietal damage, and neuroimaging data have specifically implicated the superior parietal lobule (SPL) in multiple-element processing. We previously reported that a patient with focal, bilateral lesions of the SPL performed more slowly than controls in visual search, but only for stimuli consisting of separable lines. Here, we further explored this patient's visual processing of plain objects (colored disks) versus objects consisting of separable lines (letters), presented in isolation (single object) versus in triplets. Identification of objects was normal in isolation but dropped to chance level when they were surrounded by distracters, irrespective of eccentricity and spacing. We speculate that this poor performance reflects a deficit in processing objects' relative locations within the triplet (for colored disks), aggravated by a deficit in processing the relative location of each separable line (for letters). Confirming this, performance improved when the patient merely had to detect the presence of a specific colored disk within the triplets (visual search instruction), while the inability to identify the middle letter was alleviated when the distracters were identical letters that could be grouped, thereby reducing the number of ways individual lines could be bound.
Collapse
Affiliation(s)
- A Vialatte
- Integrative Multisensory Perception Action & Cognition Team (ImpAct), INSERM U1028, CNRS UMR5292, Lyon Neuroscience Research Center (CRNL), Lyon, France
- University of Lyon 1, Lyon, France
- Hospices Civils de Lyon, Mouvement & Handicap, Neuro-Immersion Platforms, Lyon, France
| | - Y Yeshurun
- Psychology Department, University of Haifa, Haifa, Israel
| | - A Z Khan
- School of Optometry, University of Montreal, Montreal, Canada
| | - R Rosenholtz
- Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - L Pisella
- Integrative Multisensory Perception Action & Cognition Team (ImpAct), INSERM U1028, CNRS UMR5292, Lyon Neuroscience Research Center (CRNL), Lyon, France
- University of Lyon 1, Lyon, France
- Hospices Civils de Lyon, Mouvement & Handicap, Neuro-Immersion Platforms, Lyon, France
| |
Collapse
|
37
|
Vacher J, Davila A, Kohn A, Coen-Cagli R. Texture Interpolation for Probing Visual Perception. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 2020; 33:22146-22157. [PMID: 36420050 PMCID: PMC9681139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Texture synthesis models are important tools for understanding visual processing. In particular, statistical approaches based on neurally relevant features have been instrumental in understanding aspects of visual perception and of neural coding. New deep learning-based approaches further improve the quality of synthetic textures. Yet, it is still unclear why deep texture synthesis performs so well, and applications of this new framework to probe visual perception are scarce. Here, we show that distributions of deep convolutional neural network (CNN) activations of a texture are well described by elliptical distributions and that, therefore, following optimal transport theory, constraining their mean and covariance is sufficient to generate new texture samples. Then, we propose using the natural geodesics (i.e., the shortest paths between two points) arising from the optimal transport metric to interpolate between arbitrary textures. Compared to other CNN-based approaches, our interpolation method appears to match more closely the geometry of texture perception, and our mathematical framework is better suited to studying its statistical nature. We apply our method by measuring the perceptual scale associated with the interpolation parameter in human observers, and the neural sensitivity of different areas of visual cortex in macaque monkeys.
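For Gaussian (and, with a shared radial profile, elliptical) feature distributions, the optimal-transport geodesic has a closed form in the mean and covariance. The sketch below interpolates two Gaussians along the Wasserstein-2 geodesic; this is the generic formula, not the paper's full CNN pipeline.

```python
import numpy as np
from scipy.linalg import sqrtm

def w2_gaussian_geodesic(m0, S0, m1, S1, t):
    """Mean and covariance at time t in [0, 1] along the Wasserstein-2
    geodesic from N(m0, S0) to N(m1, S1)."""
    S0h = np.real(sqrtm(S0))             # matrix square root of S0
    S0h_inv = np.linalg.inv(S0h)
    # Linear part of the optimal transport map between the two Gaussians.
    T = S0h_inv @ np.real(sqrtm(S0h @ S1 @ S0h)) @ S0h_inv
    A = (1 - t) * np.eye(len(m0)) + t * T
    return (1 - t) * m0 + t * m1, A @ S0 @ A.T

# Halfway between an isotropic and an elongated feature distribution.
m_t, S_t = w2_gaussian_geodesic(np.zeros(2), np.eye(2),
                                np.ones(2), np.diag([4.0, 0.25]), t=0.5)
print(m_t, S_t, sep="\n")
```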
Collapse
Affiliation(s)
- Jonathan Vacher
- Albert Einstein College of Medicine, Dept. of Systems and Comp. Biology, 10461 Bronx, NY, USA
| | - Aida Davila
- Albert Einstein College of Medicine, Dominick P. Purpura Dept. of Neuroscience, 10461 Bronx, NY, USA
| | - Adam Kohn
- Albert Einstein College of Medicine, Dept. of Systems and Comp. Biology, and Dominick P. Purpura Dept. of Neuroscience, 10461 Bronx, NY, USA
| | - Ruben Coen-Cagli
- Albert Einstein College of Medicine, Dept. of Systems and Comp. Biology, and Dominick P. Purpura Dept. of Neuroscience, 10461 Bronx, NY, USA
| |
Collapse
|
38
|
Abstract
Feature Integration Theory (FIT) set out the groundwork for much of the work in visual cognition since its publication. One of the most important legacies of this theory has been the emphasis on feature-specific processing. Nowadays, visual features are thought of as a sort of currency of visual attention (e.g., features can be attended, processing of attended features is enhanced), and attended features are thought to guide attention towards likely targets in a scene. Here we propose an alternative theory - the Target Contrast Signal Theory - based on the idea that when we search for a specific target, it is not the target-specific features that guide our attention towards the target; rather, what determines behavior is the result of an active comparison between the target template in mind and every element present in the scene. This comparison occurs in parallel and is aimed at rejecting from consideration items that peripheral vision can confidently reject as being non-targets. The speed at which each item is evaluated is determined by the overall contrast between that item and the target template. We present computational simulations to demonstrate the workings of the theory as well as eye-movement data that support core predictions of the theory. The theory is discussed in the context of FIT and other important theories of visual search.
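A minimal sketch of the theory's central quantity, as we read it from this summary: each scene item is evaluated in parallel, and the time to reject it as a non-target shrinks as its contrast with the target template grows. The functional form, parameters, and max-time readout below are illustrative assumptions, not the published simulations.

```python
import numpy as np

def rejection_times(items, target, a=150.0, b=400.0, rng=None):
    """Per-item evaluation time, inversely related to target-item contrast.

    items: (n, k) feature vectors; target: (k,) template in the same space.
    a (ms) and b (ms per contrast unit) are illustrative parameters.
    """
    rng = np.random.default_rng(rng)
    contrast = np.linalg.norm(items - target, axis=1)
    times = a + b / (contrast + 1e-6)
    return times * rng.lognormal(0.0, 0.1, size=len(times))  # trial noise

target = np.array([1.0, 0.0])
similar = np.tile([0.8, 0.1], (8, 1))    # high target-distractor similarity
distinct = np.tile([-1.0, 2.0], (8, 1))  # low similarity: fast rejection
print(rejection_times(similar, target, rng=0).max(),
      rejection_times(distinct, target, rng=0).max())
```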
Collapse
|
39
|
Alexander RG, Waite S, Macknik SL, Martinez-Conde S. What do radiologists look for? Advances and limitations of perceptual learning in radiologic search. J Vis 2020; 20:17. [PMID: 33057623 PMCID: PMC7571277 DOI: 10.1167/jov.20.10.17] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2019] [Accepted: 09/14/2020] [Indexed: 12/31/2022] Open
Abstract
Supported by guidance from training during residency programs, radiologists learn clinically relevant visual features by viewing thousands of medical images. Yet the precise visual features that expert radiologists use in their clinical practice remain unknown. Identifying such features would allow the development of perceptual learning training methods targeted to the optimization of radiology training and the reduction of medical error. Here we review attempts to bridge current gaps in understanding with a focus on computational saliency models that characterize and predict gaze behavior in radiologists. There have been great strides toward the accurate prediction of relevant medical information within images, thereby facilitating the development of novel computer-aided detection and diagnostic tools. In some cases, computational models have achieved equivalent sensitivity to that of radiologists, suggesting that we may be close to identifying the underlying visual representations that radiologists use. However, because the relevant bottom-up features vary across task context and imaging modalities, it will also be necessary to identify relevant top-down factors before perceptual expertise in radiology can be fully understood. Progress along these dimensions will improve the tools available for educating new generations of radiologists, and aid in the detection of medically relevant information, ultimately improving patient health.
Collapse
Affiliation(s)
- Robert G Alexander
- Department of Ophthalmology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| | - Stephen Waite
- Department of Radiology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| | - Stephen L Macknik
- Department of Ophthalmology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| | - Susana Martinez-Conde
- Department of Ophthalmology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| |
Collapse
|
40
|
Abstract
Area V4-the focus of this review-is a mid-level processing stage along the ventral visual pathway of the macaque monkey. V4 is extensively interconnected with other visual cortical areas along the ventral and dorsal visual streams, with frontal cortical areas, and with several subcortical structures. Thus, it is well poised to play a broad and integrative role in visual perception and recognition-the functional domain of the ventral pathway. Neurophysiological studies in monkeys engaged in passive fixation and behavioral tasks suggest that V4 responses are dictated by tuning in a high-dimensional stimulus space defined by form, texture, color, depth, and other attributes of visual stimuli. This high-dimensional tuning may underlie the development of object-based representations in the visual cortex that are critical for tracking, recognizing, and interacting with objects. Neurophysiological and lesion studies also suggest that V4 responses are important for guiding perceptual decisions and higher-order behavior.
Collapse
Affiliation(s)
- Anitha Pasupathy
- Department of Biological Structure, University of Washington, Seattle, Washington 98195, USA
- Washington National Primate Research Center, University of Washington, Seattle, Washington 98121, USA
| | - Dina V Popovkina
- Department of Psychology, University of Washington, Seattle, Washington 98105, USA
| | - Taekjun Kim
- Department of Biological Structure, University of Washington, Seattle, Washington 98195, USA
- Washington National Primate Research Center, University of Washington, Seattle, Washington 98121, USA
| |
Collapse
|
41
|
Tanrıkulu ÖD, Chetverikov A, Kristjánsson Á. Encoding perceptual ensembles during visual search in peripheral vision. J Vis 2020; 20:20. [PMID: 32810275 PMCID: PMC7445363 DOI: 10.1167/jov.20.8.20] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2019] [Accepted: 06/24/2020] [Indexed: 11/24/2022] Open
Abstract
Observers can learn complex statistical properties of visual ensembles, such as their probability distributions. Even though ensemble encoding is considered critical for peripheral vision, whether observers learn such distributions in the periphery has not been studied. Here, we used a visual search task to investigate how the shape of distractor distributions influences search performance and ensemble encoding in peripheral and central vision. Observers looked for an oddly oriented bar among distractors taken from either uniform or Gaussian orientation distributions with the same mean and range. The search arrays were presented in either the foveal or the peripheral visual field. The repetition and role-reversal effects on search times revealed observers' internal model of the distractor distributions. Our results showed that the shape of the distractor distribution influenced search times only in foveal, but not in peripheral, search. However, role-reversal effects revealed that the shape of the distractor distribution could be encoded peripherally, depending on the interitem spacing in the search array. Our results suggest that, although peripheral vision might rely heavily on summary statistical representations of feature distributions, it can also encode information about the distributions themselves.
Collapse
Affiliation(s)
- Ömer Dağlar Tanrıkulu
- Faculty of Psychology, School of Health Sciences, University of Iceland, Reykjavik, Iceland
| | - Andrey Chetverikov
- Visual Computation Lab, Center for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behavior, Nijmegen, the Netherlands
| | - Árni Kristjánsson
- Faculty of Psychology, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- School of Psychology, National Research University Higher School of Economics, Moscow, Russia
| |
Collapse
|
42
|
Zinchenko A, Conci M, Hauser J, Müller HJ, Geyer T. Distributed attention beats the down-side of statistical context learning in visual search. J Vis 2020; 20:4. [PMID: 38755793 PMCID: PMC7424102 DOI: 10.1167/jov.20.7.4] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2020] [Accepted: 05/15/2020] [Indexed: 11/24/2022] Open
Abstract
Spatial attention can be deployed with a narrower focus to process individual items or distributed relatively broadly to process larger parts of a scene. This study investigated how focused- versus distributed-attention modes contribute to the adaptation of context-based memories that guide visual search. In two experiments, participants were either required to fixate the screen center and use peripheral vision for search ("distributed attention"), or they could freely move their eyes, enabling serial scanning of the search array ("focused attention"). Both experiments consisted of an initial learning phase and a subsequent test phase. During learning, participants searched for targets presented either among repeated (invariant) or nonrepeated (randomly generated) spatial layouts of distractor items. Prior research showed that repeated encounters of invariant display arrangements lead to long-term context memory about these arrays, which can then come to guide search (contextual-cueing effect). The crucial manipulation in the test phase was a change of the target location within an otherwise constant distractor layout, which has previously been shown to abolish the cueing effect. The current results replicated these findings, although importantly only when attention was focused. By contrast, with distributed attention, the cueing effect recovered rapidly and attained a level comparable to the initial effect (before the target location change). This indicates that contextual cueing can adapt more easily when attention is distributed, likely because a broad attentional set facilitates the flexible updating of global (distractor-distractor), as compared to more local (distractor-target), context representations-allowing local changes to be incorporated more readily.
Collapse
Affiliation(s)
- Artyom Zinchenko
- Department Psychologie, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Markus Conci
- Department Psychologie, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Johannes Hauser
- Department Psychologie, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Hermann J Müller
- Department Psychologie, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Thomas Geyer
- Department Psychologie, Ludwig-Maximilians-Universität München, Munich, Germany
| |
Collapse
|
43
|
Visual search asymmetry depends on target-distractor feature similarity: Is the asymmetry simply a result of distractor rejection speed? Atten Percept Psychophys 2020; 82:80-97. [PMID: 31359376 DOI: 10.3758/s13414-019-01818-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Previous studies have shown that in visual search, varying the target and distractor familiarity produces a search asymmetry: Detecting a novel target among familiar distractors is more efficient than detecting a familiar target among novel distractors. One explanation is that novel targets have enhanced salience and are detected preattentively. Conversely, familiar distractors may be easier to reject. The current study postulates that target-distractor feature similarity, in addition to target or distractor familiarity, is a key determinant of visual search efficiency. The results of two experiments reveal that visual search is more efficient when distractors are familiar regardless of target familiarity, but only when the target-distractor similarity is high. When similarity is low, the visual search asymmetry disappears and the search times become highly efficient, with search slopes not different from zero regardless of target or distractor familiarity. However, although distractor familiarity plays an important role in inducing the search asymmetry, comparisons of search efficiency in target-present and target-absent trials reveal that search asymmetries cannot be explained solely by the faster speed of rejecting familiar distractors, as proposed by previous studies. Rather, distractor familiarity influences processes outside of stimulus selection, such as search monitoring and termination decisions. Competition among bottom-up item salience effects and top-down shape recognition processes is proposed to account for these findings.
Collapse
|
44
|
Medium versus difficult visual search: How a quantitative change in the functional visual field leads to a qualitative difference in performance. Atten Percept Psychophys 2020; 82:118-139. [PMID: 31267479 PMCID: PMC6994550 DOI: 10.3758/s13414-019-01787-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
The dominant theories of visual search assume that search is a process involving comparisons of individual items against a target description that is based on the properties of the target in isolation. Here, we present four experiments that demonstrate that this holds true only in difficult search. In medium search it seems that the relation between the target and neighbouring items is also part of the target description. We used two sets of oriented lines to construct the search items. The cardinal set contained horizontal and vertical lines, the diagonal set contained left diagonal and right diagonal lines. In all experiments, participants knew the identity of the target and the line set used to construct it. In difficult search this knowledge allowed performance to improve in displays where only half of the search items came from the same line set as the target (50% eligibility), relative to displays where all items did (100% eligibility). However, in medium search, performance was actually poorer for 50% eligibility, especially on target-absent trials. This opposite effect of ineligible items in medium search and difficult search is hard to reconcile with theories based on individual items. It is more in line with theories that conceive search as a sequence of fixations where the number of items processed during a fixation depends on the difficulty of the search task: When search is medium, multiple items are processed per fixation. But when search is difficult, only a single item is processed.
Collapse
|
45
|
Rosenholtz R. Demystifying visual awareness: Peripheral encoding plus limited decision complexity resolve the paradox of rich visual experience and curious perceptual failures. Atten Percept Psychophys 2020; 82:901-925. [PMID: 31970709 PMCID: PMC7303063 DOI: 10.3758/s13414-019-01968-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]
Abstract
Human beings subjectively experience a rich visual percept. However, when behavioral experiments probe the details of that percept, observers perform poorly, suggesting that vision is impoverished. What can explain this awareness puzzle? Is the rich percept a mere illusion? How does vision work as well as it does? This paper argues for two important pieces of the solution. First, peripheral vision encodes its inputs using a scheme that preserves a great deal of useful information, while losing the information necessary to perform certain tasks. The tasks rendered difficult by the peripheral encoding include many of those used to probe the details of visual experience. Second, many tasks used to probe attentional and working memory limits are, arguably, inherently difficult, and poor performance on these tasks may indicate limits on decision complexity. Two assumptions are critical to making sense of this hypothesis: (1) All visual perception, conscious or not, results from performing some visual task; and (2) all visual tasks face the same limit on decision complexity. Together, peripheral encoding plus decision complexity can explain a wide variety of phenomena, including vision's marvelous successes, its quirky failures, and our rich subjective impression of the visual world.
Collapse
Affiliation(s)
- Ruth Rosenholtz
- MIT Department of Brain & Cognitive Sciences, CSAIL, Cambridge, MA, 02139, USA.
| |
Collapse
|
46
|
Cant JS, Xu Y. One bad apple spoils the whole bushel: The neural basis of outlier processing. Neuroimage 2020; 211:116629. [PMID: 32057998 PMCID: PMC7942194 DOI: 10.1016/j.neuroimage.2020.116629] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2019] [Revised: 02/01/2020] [Accepted: 02/09/2020] [Indexed: 10/25/2022] Open
Abstract
How are outliers in an otherwise homogeneous object ensemble represented by our visual system? Are outliers ignored because they are the minority? Or do outliers alter our perception of an otherwise homogeneous ensemble? We have previously demonstrated ensemble representation in human anterior-medial ventral visual cortex (overlapping the scene-selective parahippocampal place area; PPA). In this study we investigated how outliers impact object-ensemble representation in this human brain region as well as visual representation throughout posterior brain regions. We presented a homogeneous ensemble followed by an ensemble containing either identical elements or a majority of identical elements with a few outliers. Human participants ignored the outliers and made a same/different judgment between the two ensembles. In PPA, fMRI adaptation was observed when the outliers in the second ensemble matched the items in the first, even though the majority of the elements in the second ensemble were distinct from those in the first; conversely, release from fMRI adaptation was observed when the outliers in the second ensemble were distinct from the items in the first, even though the majority of the elements in the second ensemble were identical to those in the first. A similarly robust outlier effect was also found in other brain regions, including a shape-processing region in lateral occipital cortex (LO) and task-processing fronto-parietal regions. These brain regions likely work in concert to flag the presence of outliers during visual perception and then weigh the outliers appropriately in subsequent behavioral decisions. To our knowledge, this is the first time the neural mechanisms involved in outlier processing have been systematically documented in the human brain. Such an outlier effect could well provide the neural basis mediating our perceptual experience in situations like "one bad apple spoils the whole bushel".
Collapse
Affiliation(s)
- Jonathan S Cant
- Department of Psychology, University of Toronto Scarborough, Toronto, ON, M1C 1A4, Canada.
| | - Yaoda Xu
- Department of Psychology, Yale University, New Haven, CT, 06477, USA
| |
Collapse
|
47
|
The optimal spatial noise for continuous flash suppression masking is pink. Sci Rep 2020; 10:6943. [PMID: 32332984 PMCID: PMC7181696 DOI: 10.1038/s41598-020-63888-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2019] [Accepted: 04/01/2020] [Indexed: 12/01/2022] Open
Abstract
A basic question in cognitive neuroscience is how sensory stimuli are processed within and outside of conscious awareness. In the past decade, continuous flash suppression (CFS) has become the most popular tool for investigating unconscious visual processing, although the exact nature of some of the underlying mechanisms remains unclear. Here, we investigate which kind of random noise is optimal for CFS masking, and whether the addition of visible edges to noise patterns affects suppression duration. We tested noise patterns of various densities as well as composite patterns with added edges, along with classic Mondrian masks and phase-scrambled (edgeless) Mondrian masks for comparison. We find that spatial pink noise (1/f noise) achieves the longest suppression of the tested random noises; however, classic Mondrian masks are still significantly more effective in terms of suppression duration. Further analysis reveals that global contrast and general spectral similarity between target and mask cannot account for this difference in effectiveness.
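Spatial 1/f noise of the kind tested here is easy to synthesize by shaping white noise in the Fourier domain. A minimal sketch follows; the experiment's display calibration and density manipulations are not reproduced.

```python
import numpy as np

def pink_noise_2d(size, exponent=1.0, rng=None):
    """2D noise with a 1/f**exponent amplitude spectrum (exponent=1 gives
    the pink, natural-image-like falloff)."""
    rng = np.random.default_rng(rng)
    f = np.hypot(np.fft.fftfreq(size)[:, None], np.fft.fftfreq(size)[None, :])
    f[0, 0] = 1.0                 # avoid division by zero at DC
    amplitude = f ** -exponent
    amplitude[0, 0] = 0.0         # zero out DC for a zero-mean image
    # White noise supplies conjugate-symmetric phases, so ifft2 is real.
    spectrum = amplitude * np.fft.fft2(rng.standard_normal((size, size)))
    noise = np.real(np.fft.ifft2(spectrum))
    return (noise - noise.mean()) / noise.std()

mask_frame = pink_noise_2d(256, rng=0)  # one candidate CFS mask frame
```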
Collapse
|
48
|
Nasr S, LaPierre C, Vaughn CE, Witzel T, Stockmann JP, Polimeni JR. In vivo functional localization of the temporal monocular crescent representation in human primary visual cortex. Neuroimage 2020; 209:116516. [PMID: 31904490 DOI: 10.1016/j.neuroimage.2020.116516] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2019] [Revised: 12/02/2019] [Accepted: 01/01/2020] [Indexed: 12/19/2022] Open
Abstract
The temporal monocular crescent (TMC) is the most peripheral portion of the visual field whose perception relies solely on input from the ipsilateral eye. According to a handful of post-mortem histological studies in humans and non-human primates, the TMC is represented visuotopically within the most anterior portion of the primary visual cortical area (V1). However, functional evidence of the TMC visuotopic representation in human visual cortex is rare, mostly due to the small size of the TMC representation (~6% of V1) and due to the technical challenges of stimulating the most peripheral portion of the visual field inside the MRI scanner. In this study, by taking advantage of custom-built MRI-compatible visual stimulation goggles with curved displays, we successfully stimulated the TMC region of the visual field in eight human subjects, half of them right-eye dominant, inside a 3 T MRI scanner. This enabled us to localize the representation of TMC, along with the blind spot representation (another visuotopic landmark in V1), in all volunteers, which match the expected spatial pattern based on prior anatomical studies. In all hemispheres, the TMC visuotopic representation was localized along the peripheral border of V1, within the most anterior portion of the calcarine sulcus, without any apparent extension into the second visual area (V2). We further demonstrate the reliability of this localization within/across experimental sessions, and consistency in the spatial location of TMC across individuals after accounting for inter-subject structural differences.
Collapse
Affiliation(s)
- Shahin Nasr
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, 02129, United States; Harvard Medical School, Boston, MA, United States.
| | - Cristen LaPierre
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, 02129, United States
| | - Christopher E Vaughn
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, 02129, United States
| | - Thomas Witzel
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, 02129, United States; Harvard Medical School, Boston, MA, United States
| | - Jason P Stockmann
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, 02129, United States; Harvard Medical School, Boston, MA, United States
| | - Jonathan R Polimeni
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, 02129, United States; Harvard Medical School, Boston, MA, United States; Massachusetts Institute of Technology, Division of Health Sciences and Technology, Cambridge, MA, United States
| |
Collapse
|
49
|
Reiter S, Laurent G. Visual perception and cuttlefish camouflage. Curr Opin Neurobiol 2019; 60:47-54. [PMID: 31837480 DOI: 10.1016/j.conb.2019.10.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2019] [Accepted: 10/08/2019] [Indexed: 12/01/2022]
Abstract
Visual perception is inherently statistical: brains exploit repeating features of natural scenes to disambiguate images that could, in principle, have many causes. A clear case for the relevance of statistical inference in vision is animal camouflage. Although visual scenes are each composed of unique arrangements of pixels, they are usually perceived mainly as groupings of statistically defined patches (sandy/leafy/smooth etc…); this fact is exploited by camouflaging animals. The unique ability of certain cephalopods to camouflage actively within many different surroundings provides a rare and direct behavioral readout for texture perception. In addition, because cephalopods and chordates each arose after a phylogenetic split that occurred some 600M years ago, the apparent convergence of texture perception across these groups suggests common principles. Studying cephalopod camouflage may thus help us resolve general problems of visual perception.
Collapse
Affiliation(s)
- Sam Reiter
- Max Planck Institute for Brain Research, Max-von-Laue-Str. 4, 60438 Frankfurt am Main, Germany
| | - Gilles Laurent
- Max Planck Institute for Brain Research, Max-von-Laue-Str. 4, 60438 Frankfurt am Main, Germany.
| |
Collapse
|
50
|
Ramezani F, Kheradpisheh SR, Thorpe SJ, Ghodrati M. Object categorization in visual periphery is modulated by delayed foveal noise. J Vis 2019; 19:1. [PMID: 31369042 DOI: 10.1167/19.9.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Behavioral studies in humans indicate that peripheral vision can support object recognition to some extent. Moreover, recent studies have shown that some information from brain regions retinotopic to the visual periphery is somehow fed back to regions retinotopic to the fovea, and that disrupting this feedback impairs object recognition in humans. However, it is unclear to what extent information in the visual periphery contributes to human object categorization. Here, we designed two series of rapid object categorization tasks: first, to investigate the performance of human peripheral vision in categorizing natural object images at different eccentricities and abstraction levels (superordinate, basic, and subordinate); then, using a delayed foveal noise mask, to study how modulating the foveal representation impacts peripheral object categorization at each abstraction level. We found that peripheral vision can quickly and accurately accomplish superordinate categorization, while its performance at finer categorization levels drops dramatically as the object is presented farther in the periphery. We also found that a 300-ms delayed foveal noise mask significantly disturbs categorization performance at the basic and subordinate levels, while having no effect at the superordinate level. Our results suggest that human peripheral vision can easily process objects at high abstraction levels, and that this information is fed back to foveal vision to prime the foveal cortex for finer categorization when a saccade is made toward the target object.
Collapse
Affiliation(s)
- Farzad Ramezani
- Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran
| | - Saeed Reza Kheradpisheh
- Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
| | - Simon J Thorpe
- Centre de Recherche Cerveau et Cognition (CerCo) Université Paul Sabatier, Toulouse, France
| | - Masoud Ghodrati
- Neuroscience Program, Biomedicine Discovery Institute, Monash University, Clayton, Victoria, Australia
| |
Collapse
|