1. Robert S, Granovetter MC, Ling S, Behrmann M. Space- and object-based attention in patients with a single hemisphere following childhood resection. bioRxiv 2024:2024.12.06.627251. [PMID: 39713352; PMCID: PMC11661120; DOI: 10.1101/2024.12.06.627251]
Abstract
The neural processes underlying attentional processing are typically lateralized in adults, with spatial attention associated with the right hemisphere (RH) and object-based attention with the left hemisphere (LH). Using a modified two-rectangle attention paradigm, we compared the lateralization profiles of individuals with childhood hemispherectomy (either LH or RH) and age-matched, typically developing controls. Although patients exhibited slower reaction times (RTs) compared to controls, both groups benefited from valid attentional cueing. However, patients experienced significantly higher costs for invalid trials, reflected by larger RT differences between validly and invalidly cued targets. This was true for invalid trials on both cued and uncued objects, probes of object- and space-based attentional processes, respectively. Notably, controls showed no significant RT cost differences between invalidly cued locations on cued versus uncued objects. By contrast, patients exhibited greater RT costs for targets on uncued versus cued objects, suggesting greater difficulty shifting attention across objects. We explore potential explanations for this group difference and the lack of difference between patients with LH or RH resection. These findings enhance our understanding of spatial and object-based attention in typical development and reveal how significant neural injury affects the development of attentional systems in the LH and RH.
Affiliation(s)
- Sophia Robert: Department of Psychology and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Michael C. Granovetter: Department of Psychology and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA; Departments of Pediatrics and Neurology, New York University, New York, NY, USA
- Shouyu Ling: Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA, USA
- Marlene Behrmann: Department of Psychology and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA, USA
2. Kwak Y, Zhao Y, Lu ZL, Hanning NM, Carrasco M. Presaccadic attention enhances and reshapes the contrast sensitivity function differentially around the visual field. eNeuro 2024; 11:ENEURO.0243-24.2024. [PMID: 39197949; PMCID: PMC11397507; DOI: 10.1523/eneuro.0243-24.2024]
Abstract
Contrast sensitivity (CS), which constrains human vision, decreases from the fovea to the periphery, from the horizontal to the vertical meridian, and from the lower vertical to the upper vertical meridian. It also depends on spatial frequency (SF), and the contrast sensitivity function (CSF) depicts this relation. To compensate for these visual constraints, we constantly make saccades to foveate relevant objects in the scene. Even before saccade onset, presaccadic attention shifts to the saccade target and enhances perception. However, it is unknown whether and how presaccadic attention modulates the interplay between CS and SF, and whether this effect varies around the polar angle meridians. CS enhancement may result from a horizontal or vertical shift of the CSF, an increase in bandwidth, or any combination of these. In addition, presaccadic attention could enhance CS similarly around the visual field, or it could benefit perception more at locations with poorer performance (i.e., the vertical meridian). Here, we investigated these possibilities by extracting key attributes of the CSF of human observers. The results reveal that presaccadic attention (1) increases CS across SF, (2) increases the most preferred and the highest discernable SF, and (3) narrows the bandwidth. Therefore, presaccadic attention helps bridge the gap between presaccadic and postsaccadic input by increasing visibility at the saccade target. Counterintuitively, this CS enhancement was more pronounced where perception is already better, along the horizontal rather than the vertical meridian, exacerbating polar angle asymmetries. Our results call for an investigation of the differential neural modulations underlying presaccadic perceptual changes for different saccade directions.
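To make the three reported attribute changes concrete, here is a minimal sketch of one common CSF parameterization, a log-parabola with peak gain, peak SF, and bandwidth; the parameter values and the form of the attentional change below are illustrative assumptions, not the authors' fits:

```python
import numpy as np

def log_parabola_csf(f, peak_gain, peak_sf, bandwidth):
    """Contrast sensitivity at spatial frequency f (cycles/deg).

    Log-parabola CSF: a parabola in log10 sensitivity over log2 spatial
    frequency, with full width `bandwidth` (octaves) at half-maximum.
    """
    log_sens = np.log10(peak_gain) - np.log10(2) * (
        2 * (np.log2(f) - np.log2(peak_sf)) / bandwidth) ** 2
    return 10 ** log_sens

freqs = np.logspace(np.log10(0.5), np.log10(32), 200)  # cycles/deg

# Neutral condition vs. presaccadic attention (illustrative values):
# higher gain, higher preferred SF, narrower bandwidth, the three
# attribute changes reported in the abstract.
neutral = log_parabola_csf(freqs, peak_gain=50, peak_sf=2.0, bandwidth=3.0)
presacc = log_parabola_csf(freqs, peak_gain=65, peak_sf=2.4, bandwidth=2.6)

cutoff = freqs[presacc >= 1].max()  # highest discernable SF (sensitivity >= 1)
print(f"peak CS {presacc.max():.0f} vs {neutral.max():.0f}; "
      f"cutoff with attention: {cutoff:.1f} c/deg")
```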
Affiliation(s)
- Yuna Kwak: Department of Psychology, New York University, New York, New York 10003
- Yukai Zhao: Center for Neural Science, New York University, New York, New York 10003
- Zhong-Lin Lu: Department of Psychology, New York University, New York, New York 10003; Center for Neural Science, New York University, New York, New York 10003; Division of Arts and Sciences, New York University Shanghai, Shanghai 200124, China; NYU-ECNU Institute of Brain and Cognitive Science, Shanghai 200062, China
- Nina Maria Hanning: Department of Psychology, New York University, New York, New York 10003; Center for Neural Science, New York University, New York, New York 10003; Department of Psychology, Humboldt-Universität zu Berlin, Berlin 12489, Germany
- Marisa Carrasco: Department of Psychology, New York University, New York, New York 10003; Center for Neural Science, New York University, New York, New York 10003
3. Ting F, Zeyi Z. Effects of different sensory integration tasks on the biomechanical characteristics of the lower limb during walking in patients with patellofemoral pain. Front Bioeng Biotechnol 2024; 12:1441027. [PMID: 39257445; PMCID: PMC11383783; DOI: 10.3389/fbioe.2024.1441027]
Abstract
Purpose: This study aimed to analyze the biomechanical characteristics of the lower limb in patients with patellofemoral pain (PFP) while walking under different sensory integration tasks and to elucidate the relationship between these characteristics and patellofemoral joint stress (PFJS). The findings may provide insights that could help establish new approaches to treating and preventing PFP.
Method: Twenty-eight male university students presenting with PFP were enrolled in this study. Kinematic and kinetic data were collected while the participants walked. The effects of different sensory integration tasks, baseline (BL), a tactile integration task (TIT), a listening integration task (LIT), and a visual integration task (VIT), on the biomechanical characteristics of the lower limb were examined using a one-way repeated-measures ANOVA. The relationship between these biomechanical characteristics and PFJS was investigated using Pearson correlation analysis.
Results: The increased hip flexion angle (P = 0.016), increased knee extension moment (P = 0.047), decreased step length (P < 0.001), decreased knee flexion angle (P = 0.010), and decreased cadence (P < 0.001) exhibited by patients with PFP while performing a VIT were associated with increased PFJS. The reduced cadence (P < 0.050) in patients with PFP performing a LIT was also associated with increased PFJS.
Conclusion: VIT significantly influenced lower limb movement patterns during walking in patients with PFP. Specifically, the increased hip flexion angle, increased knee extension moment, decreased knee flexion angle, and decreased cadence resulting from this task may have increased PFJS and may have contributed to the recurrence of PFP. Similarly, patients with PFP often demonstrate a reduction in cadence under TIT and LIT, which may be the main trigger for increased PFJS under those tasks.
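For readers who want to mirror the statistical pipeline, here is a minimal sketch of the two analyses named above (a one-way repeated-measures ANOVA across task conditions, then a Pearson correlation against joint stress), run on synthetic data; the variable names and numbers are illustrative, not the study's data:

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
n = 28  # participants, matching the study's sample size
tasks = ["BL", "TIT", "LIT", "VIT"]

# Synthetic cadence (steps/min): one row per participant x task condition.
rows = [{"subject": s, "task": t,
         "cadence": rng.normal(110 - 4 * (t == "VIT") - 2 * (t == "LIT"), 5)}
        for s in range(n) for t in tasks]
df = pd.DataFrame(rows)

# One-way repeated-measures ANOVA: does task condition change cadence?
res = AnovaRM(df, depvar="cadence", subject="subject", within=["task"]).fit()
print(res.anova_table)

# Pearson correlation between cadence and (synthetic) PFJS within VIT.
vit = df[df.task == "VIT"].copy()
vit["pfjs"] = 30 - 0.15 * vit["cadence"] + rng.normal(0, 0.5, n)
r, p = pearsonr(vit["cadence"], vit["pfjs"])
print(f"cadence vs PFJS: r = {r:.2f}, p = {p:.3g}")
```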
Affiliation(s)
- Fan Ting: Shanghai Zhuoyue Ruixin Digital Technology Company Limited, Shanghai, China
- Zhang Zeyi: School of Physical Education and Health Care, East China Normal University, Shanghai, China; Key Laboratory of Adolescent Health Assessment and Exercise Intervention of Ministry of Education, East China Normal University, Shanghai, China
4. Quaia C, Krauzlis RJ. Object recognition in primates: what can early visual areas contribute? Front Behav Neurosci 2024; 18:1425496. [PMID: 39070778; PMCID: PMC11272660; DOI: 10.3389/fnbeh.2024.1425496]
Abstract
Introduction: If neuroscientists were asked which brain area is responsible for object recognition in primates, most would probably answer infero-temporal (IT) cortex. While IT is likely responsible for fine discriminations, and it is accordingly dominated by foveal visual inputs, there is more to object recognition than fine discrimination. Importantly, foveation of an object of interest usually requires recognizing, with reasonable confidence, its presence in the periphery. Arguably, IT plays a secondary role in such peripheral recognition, and other visual areas might instead be more critical.
Methods: To investigate how signals carried by early visual processing areas (such as LGN and V1) could be used for object recognition in the periphery, we focused here on the task of distinguishing faces from non-faces. We tested how sensitive various models were to nuisance parameters, such as changes in scale and orientation of the image, and the type of image background.
Results: We found that a model of V1 simple or complex cells could provide quite reliable information, resulting in performance better than 80% in realistic scenarios. An LGN model performed considerably worse.
Discussion: Because peripheral recognition is both crucial to enable fine recognition (by bringing an object of interest on the fovea), and probably sufficient to account for a considerable fraction of our daily recognition-guided behavior, we think that the current focus on area IT and foveal processing is too narrow. We propose that rather than a hierarchical system with IT-like properties as its primary aim, object recognition should be seen as a parallel process, with high-accuracy foveal modules operating in parallel with lower-accuracy and faster modules that can operate across the visual field.
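A standard way to instantiate the V1 simple- and complex-cell models referenced here is the Gabor energy model: simple cells as quadrature-pair Gabor filters, complex cells as the summed energy of the pair. A minimal sketch (filter parameters are illustrative assumptions, not the paper's):

```python
import numpy as np

def gabor(size, wavelength, theta, phase, sigma):
    """2D Gabor filter: a model of a V1 simple-cell receptive field."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)  # project onto preferred axis
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength + phase)

def complex_cell_energy(patch, wavelength, theta, sigma):
    """Complex-cell response: summed energy of a quadrature pair of Gabors."""
    size = patch.shape[0]
    even = np.sum(patch * gabor(size, wavelength, theta, 0.0, sigma))
    odd = np.sum(patch * gabor(size, wavelength, theta, np.pi / 2, sigma))
    return even ** 2 + odd ** 2  # phase-invariant local energy

rng = np.random.default_rng(1)
patch = rng.standard_normal((33, 33))  # stand-in for a peripheral image patch

# A small orientation bank; such energies, pooled over space and scale,
# would form the V1 feature vector fed to a face/non-face classifier.
energies = [complex_cell_energy(patch, wavelength=8, theta=t, sigma=5)
            for t in np.linspace(0, np.pi, 4, endpoint=False)]
print(np.round(energies, 1))
```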
Affiliation(s)
- Christian Quaia: Laboratory of Sensorimotor Research, National Eye Institute, NIH, Bethesda, MD, United States
5. Srivastava S, Wang WY, Eckstein MP. Emergent human-like covert attention in feedforward convolutional neural networks. Curr Biol 2024; 34:579-593.e12. [PMID: 38244541; DOI: 10.1016/j.cub.2023.12.058]
Abstract
Covert attention allows the selection of locations or features of the visual scene without moving the eyes. Cues and contexts predictive of a target's location orient covert attention and improve perceptual performance. The performance benefits are widely attributed to theories of covert attention as a limited resource, zoom, spotlight, or weighting of visual information. However, such concepts are difficult to map to neuronal populations. We show that a feedforward convolutional neural network (CNN), trained on images to optimize target detection accuracy and with no explicit attention mechanism, limited resource, or feedback connections, learns to utilize cues and contexts in the three most prominent covert attention tasks (Posner cueing, set-size effects in search, and contextual cueing) and predicts the cue/context influences on human accuracy. The CNN's cueing/context effects generalize across network training schemes, to peripheral and central pre-cues, discrimination tasks, and reaction time measures, and, critically, do not vary with reductions in network resources (size). The CNN shows cueing/context effects comparable to a model that optimally uses image information to make decisions (a Bayesian ideal observer) but generalizes these effects to cue instances unseen during training. Together, the findings suggest that human-like behavioral signatures of covert attention in the three landmark paradigms might be an emergent property of task-accuracy optimization in neuronal populations, without positing limited attentional resources. The findings might explain recent behavioral results showing cueing and context effects across a variety of simple organisms with no neocortex, from archerfish to fruit flies.
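A toy version of the setup described above, a plain feedforward CNN trained only for target detection on displays containing a predictive cue, with no attention module anywhere, can be sketched as follows (architecture, display geometry, and cue validity are illustrative assumptions, not the authors' network):

```python
import torch
import torch.nn as nn

class DetectorCNN(nn.Module):
    """Plain feedforward CNN; single logit for target present/absent."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4))
        self.head = nn.Linear(32 * 4 * 4, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def posner_batch(batch, valid_p=0.8):
    """Toy Posner displays: a cue box left/right; target usually at cued side."""
    x = torch.zeros(batch, 1, 64, 64)
    present = torch.rand(batch) < 0.5
    for i in range(batch):
        cx = 8 if torch.rand(1) < 0.5 else 48       # cue side
        x[i, 0, 28:36, cx:cx + 8] = 0.5             # cue box
        if present[i]:
            tx = cx if torch.rand(1) < valid_p else 56 - cx
            x[i, 0, 30:34, tx + 2:tx + 6] = 1.0     # target bar
    return x + 0.3 * torch.randn_like(x), present.float()

net = DetectorCNN()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for step in range(200):  # trained purely for detection accuracy
    x, y = posner_batch(32)
    opt.zero_grad()
    loss_fn(net(x).squeeze(1), y).backward()
    opt.step()
# The cueing effect is then measured as accuracy(valid) - accuracy(invalid)
# on held-out displays; no attention mechanism was ever built in.
```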
Affiliation(s)
- Sudhanshu Srivastava: Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- William Yang Wang: Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- Miguel P Eckstein: Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
6. Thayer DD, Sprague TC. Feature-specific salience maps in human cortex. J Neurosci 2023; 43:8785-8800. [PMID: 37907257; PMCID: PMC10727177; DOI: 10.1523/jneurosci.1104-23.2023]
Abstract
Priority map theory is a leading framework for understanding how various aspects of stimulus displays and task demands guide visual attention. Per this theory, the visual system computes a priority map, a representation of visual space indexing the relative importance, or priority, of locations in the environment. Priority is computed from both salience, defined by image-computable properties, and relevance, defined by an individual's current goals, and is used to direct attention to the highest-priority locations for further processing. Computational theories suggest that priority maps identify salient locations based on individual feature dimensions (e.g., color, motion), which are integrated into an aggregate priority map. While widely accepted, a core assumption of this framework, the existence of independent feature dimension maps in visual cortex, remains untested. Here, we tested the hypothesis that retinotopic regions selective for specific feature dimensions (color or motion) in human cortex act as neural feature dimension maps, indexing salient locations based on their preferred feature. We used fMRI activation patterns to reconstruct spatial maps while male and female human participants viewed stimuli with salient regions defined by relative color or motion direction. Activation in reconstructed spatial maps was localized to the salient stimulus position in the display. Moreover, the stimulus representation was strongest in the ROI selective for the salience-defining feature. Together, these results suggest that feature-selective extrastriate visual regions highlight salient locations based on local feature contrast within their preferred feature dimensions, supporting their role as neural feature dimension maps.
Significance statement: Identifying salient information is important for navigating the world. For example, it is critical to detect a quickly approaching car when crossing the street. Leading models of computer vision and visual search rely on compartmentalized salience computations based on individual features; however, there has been no direct empirical demonstration identifying neural regions as responsible for performing these dissociable operations. Here, we provide evidence of a critical double dissociation: neural activation patterns from color-selective regions prioritize the location of color-defined salience while minimally representing motion-defined salience, whereas motion-selective regions show the complementary result. These findings reveal that specialized cortical regions act as neural "feature dimension maps" that are used to index salient locations based on specific features to guide attention.
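Spatial reconstructions of this kind are commonly built with an inverted encoding model: fit voxel-wise weights for a set of spatial channels on training data, then invert those weights to reconstruct a spatial map from new activation patterns. A minimal linear sketch on synthetic data (dimensions are illustrative; this is not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(2)
n_vox, n_chan, n_trials = 100, 9, 120  # voxels, spatial channels, trials

# Each trial's stimulus position activates a Gaussian set of spatial channels.
positions = rng.integers(0, n_chan, n_trials)
C = np.exp(-0.5 * (np.arange(n_chan)[None, :] - positions[:, None]) ** 2)

W = rng.standard_normal((n_chan, n_vox))                  # true (unknown) weights
B = C @ W + 0.5 * rng.standard_normal((n_trials, n_vox))  # voxel patterns

# 1) Training: solve B_train ~= C_train @ W_hat for the voxel weights.
train, test = slice(0, 80), slice(80, None)
W_hat = np.linalg.lstsq(C[train], B[train], rcond=None)[0]

# 2) Inversion: reconstruct channel activity (a spatial map) on held-out data.
C_hat = B[test] @ np.linalg.pinv(W_hat)
print("decoded position:", C_hat[0].argmax(), "| true:", positions[test][0])
```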
Affiliation(s)
- Daniel D Thayer: Department of Psychological and Brain Sciences, University of California-Santa Barbara, Santa Barbara, California 93106
- Thomas C Sprague: Department of Psychological and Brain Sciences, University of California-Santa Barbara, Santa Barbara, California 93106
7. Bohil CJ, Phelps A, Neider MB, Schmidt J. Explicit and implicit category learning in categorical visual search. Atten Percept Psychophys 2023; 85:2131-2149. [PMID: 37784002; DOI: 10.3758/s13414-023-02789-z]
Abstract
Categorical search has been heavily investigated over the past decade, mostly using natural categories that leave the underlying category mental representation unknown. The categorization literature offers several theoretical accounts of category mental representations. One prominent account is that separate learning systems account for classification: an explicit learning system that relies on easily verbalized rules and an implicit learning system that relies on an associatively learned (nonverbalizable) information integration strategy. The current study assessed the contributions of these separate category learning systems in the context of categorical search using simple stimuli. Participants learned to classify sinusoidal grating stimuli according to explicit or implicit categorization strategies, followed by a categorical search task using these same stimulus categories. Computational modeling determined which participants used the appropriate classification strategy during training and search, and eye movements collected during categorical search were assessed. We found that the trained categorization strategies overwhelmingly transferred to the verification (classification response) phase of search. Implicit category learning led to faster search response and shorter target dwell times relative to explicit category learning, consistent with the notion that explicit rule classification relies on a more deliberative response strategy. Participants who transferred the correct category learning strategy to the search guidance phase produced stronger search guidance (defined as the proportion of trials on which the target was the first item fixated) with evidence of greater guidance in implicit-strategy learners. This demonstrates that both implicit and explicit categorization systems contribute to categorical search and produce dissociable patterns of data.
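The explicit/implicit distinction is easy to make concrete for sinusoidal gratings that vary in spatial frequency and orientation: a rule-based category boundary uses one verbalizable dimension, whereas an information-integration boundary combines both dimensions. A sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
freq = rng.uniform(0, 10, n)    # grating spatial frequency (arbitrary units)
orient = rng.uniform(0, 10, n)  # grating orientation (arbitrary units)

# Explicit / rule-based structure: one verbalizable dimension,
# e.g. "category A if spatial frequency is below 5".
rule_labels = (freq < 5).astype(int)

# Implicit / information-integration structure: the optimal boundary
# combines both dimensions and resists verbalization.
ii_labels = (freq < orient).astype(int)

# A single-dimension rule is optimal for the first structure but
# necessarily suboptimal for the second:
for name, labels in [("rule-based", rule_labels),
                     ("information-integration", ii_labels)]:
    acc = ((freq < 5).astype(int) == labels).mean()
    print(f"frequency-only rule on {name} categories: {acc:.0%} correct")
```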
Affiliation(s)
- Corey J Bohil: Department of Psychology, University of Central Florida, Orlando, FL, USA; Lawrence Technological University, 21000 West Ten Mile Road, Southfield, MI, 48075, USA
- Ashley Phelps: Department of Psychology, University of Central Florida, Orlando, FL, USA
- Mark B Neider: Department of Psychology, University of Central Florida, Orlando, FL, USA
- Joseph Schmidt: Department of Psychology, University of Central Florida, Orlando, FL, USA
8. Robinson MM, Brady TF. A quantitative model of ensemble perception as summed activation in feature space. Nat Hum Behav 2023; 7:1638-1651. [PMID: 37402880; PMCID: PMC10810262; DOI: 10.1038/s41562-023-01602-z]
Abstract
Ensemble perception is a process by which we summarize complex scenes. Despite the importance of ensemble perception to everyday cognition, there are few computational models that provide a formal account of this process. Here we develop and test a model in which ensemble representations reflect the global sum of activation signals across all individual items. We leverage this set of minimal assumptions to formally connect a model of memory for individual items to ensembles. We compare our ensemble model against a set of alternative models in five experiments. Our approach uses performance on a visual memory task for individual items to generate zero-free-parameter predictions of interindividual and intraindividual differences in performance on an ensemble continuous-report task. Our top-down modelling approach formally unifies models of memory for individual items and ensembles and opens an avenue for building and comparing models of distinct memory processes and representations.
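The model's core assumption, that the ensemble representation is the global sum of each item's activation profile in feature space, fits in a few lines. A sketch for a circular feature dimension, assuming von Mises item activation; the item values and concentration are illustrative:

```python
import numpy as np

def item_activation(space, item, kappa):
    """Von Mises activation one item spreads over a circular feature space."""
    return np.exp(kappa * np.cos(space - item))

space = np.linspace(-np.pi, np.pi, 360, endpoint=False)
items = np.array([-0.6, -0.2, 0.1, 0.9])  # studied feature values (radians)

# Core assumption: ensemble representation = global sum of item activations.
ensemble = sum(item_activation(space, v, kappa=8.0) for v in items)

# A continuous ensemble report can then be modeled as a sample from the
# normalized summed-activation profile.
p_report = ensemble / ensemble.sum()
print(f"peak of summed activation: {space[p_report.argmax()]:.2f} rad")
```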
Affiliation(s)
- Maria M Robinson: Psychology Department, University of California, San Diego, La Jolla, CA, USA
- Timothy F Brady: Psychology Department, University of California, San Diego, La Jolla, CA, USA
9. Han NX, Eckstein MP. Head and body cues guide eye movements and facilitate target search in real-world videos. J Vis 2023; 23:5. [PMID: 37294703; PMCID: PMC10259675; DOI: 10.1167/jov.23.6.5]
Abstract
Static gaze cues presented in central vision result in observer shifts of covert attention and eye movements, and benefits in perceptual performance in the detection of simple targets. Less is known about how dynamic gazer behaviors with head and body motion influence search eye movements and performance in perceptual tasks in real-world scenes. Participants searched for a target person (yes/no task, 50% presence) while watching videos of one to three gazers looking at a designated person (50% valid gaze cue, looking at the target). To assess the contributions of different body parts, we digitally erased parts of the gazers in the videos to create three different body parts/whole conditions: floating heads (only head movements), headless bodies (only lower body movements), and the baseline condition with intact head and body. We show that valid dynamic gaze cues guided participants' eye movements (up to 3 fixations) closer to the target, reduced the time to foveate the target, reduced fixations to the gazers, and improved target detection. The effect of gaze cues in guiding eye movements to the search target was smallest when the gazer's head was removed from the videos. To assess the inherent information about gaze goal location in each body parts/whole condition, we collected perceptual judgments estimating gaze goals from a separate group of observers with unlimited time. Observers' perceptual judgments showed larger estimation errors when the gazer's head was removed. This suggests that the reduced eye movement guidance from lower-body cueing is related to observers' difficulty extracting gaze information without the presence of the head. Together, the study extends previous work by evaluating the impact of dynamic gazer behaviors on search with videos of real-world cluttered scenes.
Affiliation(s)
- Nicole X Han: Department of Psychological and Brain Sciences, Institute for Collaborative Biotechnologies, University of California, Santa Barbara, CA, USA
- Miguel P Eckstein: Department of Psychological and Brain Sciences, Institute for Collaborative Biotechnologies, University of California, Santa Barbara, CA, USA
10. Fiser J, Lengyel G. Statistical learning in vision. Annu Rev Vis Sci 2022; 8.
Abstract
Vision and learning have long been considered to be two areas of research linked only distantly. However, recent developments in vision research have changed the conceptual definition of vision from a signal-evaluating process to a goal-oriented interpreting process, and this shift binds learning, together with the resulting internal representations, intimately to vision. In this review, we consider various types of learning (perceptual, statistical, and rule/abstract) associated with vision in the past decades and argue that they represent differently specialized versions of the fundamental learning process, which must be captured in its entirety when applied to complex visual processes. We show why the generalized version of statistical learning can provide the appropriate setup for such a unified treatment of learning in vision, what computational framework best accommodates this kind of statistical learning, and what plausible neural scheme could feasibly implement this framework. Finally, we list the challenges that the field of statistical learning faces in fulfilling the promise of being the right vehicle for advancing our understanding of vision in its entirety.
Affiliation(s)
- József Fiser: Department of Cognitive Science, Center for Cognitive Computation, Central European University, Vienna 1100, Austria
- Gábor Lengyel: Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York 14627, USA
11. Peng P, Yang KF, Liang SQ, Li YJ. Contour-guided saliency detection with long-range interactions. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.03.006]
12. Nicholson DA, Prinz AA. Could simplified stimuli change how the brain performs visual search tasks? A deep neural network study. J Vis 2022; 22:3. [PMID: 35675057; PMCID: PMC9187944; DOI: 10.1167/jov.22.7.3]
Abstract
Visual search is a complex behavior influenced by many factors. To control for these factors, many studies use highly simplified stimuli. However, the statistics of these stimuli are very different from the statistics of the natural images that the human visual system is optimized by evolution and experience to perceive. Could this difference change search behavior? If so, simplified stimuli may contribute to effects typically attributed to cognitive processes, such as selective attention. Here we use deep neural networks to test how optimizing models for the statistics of one distribution of images constrains performance on a task using images from a different distribution. We train four deep neural network architectures on one of three source datasets-natural images, faces, and x-ray images-and then adapt them to a visual search task using simplified stimuli. This adaptation produces models that exhibit performance limitations similar to humans, whereas models trained on the search task alone exhibit no such limitations. However, we also find that deep neural networks trained to classify natural images exhibit similar limitations when adapted to a search task that uses a different set of natural images. Therefore, the distribution of data alone cannot explain this effect. We discuss how future work might integrate an optimization-based approach into existing models of visual search behavior.
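The train-on-source-images, adapt-to-search procedure described above is standard transfer learning. A minimal sketch using a torchvision ResNet as a stand-in backbone (the paper tested four architectures and three source datasets; everything below is an illustrative assumption, not the authors' exact setup):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# 1) "Source training": a backbone optimized for one image distribution.
#    ImageNet weights stand in for natural-image optimization here.
backbone = resnet18(weights="IMAGENET1K_V1")

# 2) Adapt to the search task: swap in a target present/absent readout and
#    fine-tune only that readout on search displays.
backbone.fc = nn.Linear(backbone.fc.in_features, 2)
for p in backbone.parameters():
    p.requires_grad = False
for p in backbone.fc.parameters():
    p.requires_grad = True

opt = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(16, 3, 224, 224)  # placeholder search displays
y = torch.randint(0, 2, (16,))    # target present (1) / absent (0)
opt.zero_grad()
loss_fn(backbone(x), y).backward()
opt.step()
# Comparing set-size effects for source-trained vs. task-only models asks
# whether human-like search limitations follow from the source statistics.
```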
Affiliation(s)
- David A Nicholson: Emory University, Department of Biology, O. Wayne Rollins Research Center, Atlanta, Georgia
- Astrid A Prinz: Emory University, Department of Biology, O. Wayne Rollins Research Center, Atlanta, Georgia
13. Gaze-cued shifts of attention and microsaccades are sustained for whole bodies but are transient for body parts. Psychon Bull Rev 2022; 29:1854-1878. [PMID: 35381913; PMCID: PMC9568497; DOI: 10.3758/s13423-022-02087-z]
Abstract
Gaze direction is an evolutionarily important mechanism in daily social interactions. It reflects a person’s internal cognitive state, spatial locus of interest, and predicts future actions. Studies have used static head images presented foveally and simple synthetic tasks to find that gaze orients attention and facilitates target detection at the cued location in a sustained manner. Little is known about how people’s natural gaze behavior, including eyes, head, and body movements, jointly orient covert attention, microsaccades, and facilitate performance in more ecological dynamic scenes. Participants completed a target person detection task with videos of real scenes. The videos showed people looking toward (valid cue) or away from a target (invalid cue) location. We digitally manipulated the individuals in the videos directing gaze to create three conditions: whole-intact (head and body movements), floating heads (only head movements), and headless bodies (only body movements). We assessed their impact on participants’ behavioral performance and microsaccades during the task. We show that, in isolation, an individual’s head or body orienting toward the target-person direction led to facilitation in detection that is transient in time (200 ms). In contrast, only the whole-intact condition led to sustained facilitation (500 ms). Furthermore, observers executed microsaccades more frequently towards the cued direction for valid trials, but this bias was sustained in time only with the joint presence of head and body parts. Together, the results differ from previous findings with foveally presented static heads. In more real-world scenarios and tasks, sustained attention requires the presence of the whole-intact body of the individuals dynamically directing their gaze.
14. Lev-Ari T, Beeri H, Gutfreund Y. The ecological view of selective attention. Front Integr Neurosci 2022; 16:856207. [PMID: 35391754; PMCID: PMC8979825; DOI: 10.3389/fnint.2022.856207]
Abstract
Accumulating evidence is supporting the hypothesis that our selective attention is a manifestation of mechanisms that evolved early in evolution and are shared by many organisms from different taxa. This surge of new data calls for the re-examination of our notions about attention, which have been dominated mostly by human psychology. Here, we present a hypothesis that challenges, on evolutionary grounds, a common view of attention as a means to manage limited brain resources. We begin by arguing that evolutionary considerations do not favor the basic proposition of the limited-brain-resources view of attention, namely, that the capacity of the sensory organs to provide information exceeds the capacity of the brain to process this information. Moreover, physiological studies in animals and humans show that mechanisms of selective attention are highly demanding of brain resources, making it paradoxical to see attention as a means to release brain resources. Next, we build on the above arguments to address the question of why attention evolved. We hypothesize that, to a certain extent, limiting sensory processing is adaptive irrespective of brain capacity. We call this hypothesis the ecological view of attention (EVA) because it is centered on interactions of an animal with its environment rather than on internal brain resources. At its essence is the notion that inherently noisy and degraded sensory inputs serve the animal's adaptive, dynamic interactions with its environment. Attention primarily functions to resolve behavioral conflicts and false distractions. Hence, we evolved to focus on a particular target at the expense of others, not because of internal limitations, but to ensure that behavior is properly oriented and committed to its goals. Here, we expand on this notion and review evidence supporting it. We show how common results in human psychophysics and physiology can be reconciled with the EVA and discuss possible implications of the notion for interpreting current results and guiding future research.
Affiliation(s)
- Yoram Gutfreund: The Ruth and Bruce Rappaport Faculty of Medicine and Research Institute, The Technion, Haifa, Israel
15. Han NX, Chakravarthula PN, Eckstein MP. Peripheral facial features guiding eye movements and reducing fixational variability. J Vis 2021; 21:7. [PMID: 34347018; PMCID: PMC8340657; DOI: 10.1167/jov.21.8.7]
Abstract
Face processing is a fast and efficient process due to its evolutionary and social importance. A majority of people direct their first eye movement to a featureless point just below the eyes that maximizes accuracy in recognizing a person's identity and gender. Yet, the exact properties or features of the face that guide the first eye movements and reduce fixational variability are unknown. Here, we manipulated the presence of the facial features and the spatial configuration of features to investigate their effect on the location and variability of first and second fixations to peripherally presented faces. Our results showed that observers can utilize the face outline, individual facial features, and feature spatial configuration to guide the first eye movements to their preferred point of fixation. The eyes have a preferential role in guiding the first eye movements and reducing fixation variability. Eliminating the eyes or altering their position had the greatest influence on the location and variability of fixations and resulted in the largest detriment to face identification performance. The other internal features (nose and mouth) also contribute to reducing fixation variability. A subsequent experiment measuring detection of single features showed that the eyes have the highest detectability (relative to other features) in the visual periphery providing a strong sensory signal to guide the oculomotor system. Together, the results suggest a flexible multiple-cue approach that might be a robust solution to cope with how the varying eccentricities in the real world influence the ability to resolve individual feature properties and the preferential role of the eyes.
Affiliation(s)
- Nicole X Han: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, USA
- Puneeth N Chakravarthula: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, USA
- Miguel P Eckstein: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, USA
16. Welbourne LE, Jonnalagadda A, Giesbrecht B, Eckstein MP. The transverse occipital sulcus and intraparietal sulcus show neural selectivity to object-scene size relationships. Commun Biol 2021; 4:768. [PMID: 34158579; PMCID: PMC8219818; DOI: 10.1038/s42003-021-02294-9]
Abstract
To optimize visual search, humans attend to objects with the expected size of the sought target relative to its surrounding scene (object-scene scale consistency). We investigate how the human brain responds to variations in object-scene scale consistency. We use functional magnetic resonance imaging and a voxel-wise feature encoding model to estimate tuning to different object/scene properties. We find that regions involved in scene processing (transverse occipital sulcus) and spatial attention (intraparietal sulcus) have the strongest responsiveness and selectivity to object-scene scale consistency: reduced activity to mis-scaled objects (unusually small or large). The findings show how and where the brain incorporates object-scene size relationships in the processing of scenes. The response properties of these brain areas might explain why, during visual search, humans often miss objects that are salient but at atypical sizes relative to the surrounding scene.
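Voxel-wise feature encoding models of this kind are typically fit as regularized linear regression from stimulus properties (for example, an object-scene scale-consistency regressor) to each voxel's response. A minimal ridge-regression sketch on synthetic data (dimensions and features are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n_trials, n_feat, n_vox = 200, 6, 500

# Per-trial stimulus properties, e.g. object size, scene depth, and an
# object-scene scale-consistency regressor.
X = rng.standard_normal((n_trials, n_feat))
true_W = rng.standard_normal((n_feat, n_vox))
Y = X @ true_W + rng.standard_normal((n_trials, n_vox))  # voxel responses

# Ridge solution, all voxels at once: W = (X'X + lam I)^-1 X'Y
lam = 10.0
W_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ Y)

# Selectivity is then assessed by held-out prediction accuracy per voxel.
X_test = rng.standard_normal((50, n_feat))
Y_test = X_test @ true_W + rng.standard_normal((50, n_vox))
pred = X_test @ W_hat
r = [np.corrcoef(pred[:, v], Y_test[:, v])[0, 1] for v in range(n_vox)]
print(f"median held-out voxel correlation: {np.median(r):.2f}")
```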
Affiliation(s)
- Lauren E Welbourne: Department of Psychological and Brain Sciences, University of California, Santa Barbara, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, USA; York NeuroImaging Centre, Department of Psychology, University of York, York, UK
- Aditya Jonnalagadda: Electrical and Computer Engineering, University of California, Santa Barbara, USA
- Barry Giesbrecht: Department of Psychological and Brain Sciences, University of California, Santa Barbara, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, USA; Interdepartmental Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, USA
- Miguel P Eckstein: Department of Psychological and Brain Sciences, University of California, Santa Barbara, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, USA; Electrical and Computer Engineering, University of California, Santa Barbara, USA; Interdepartmental Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, USA
17. Bates CJ, Jacobs RA. Optimal attentional allocation in the presence of capacity constraints in uncued and cued visual search. J Vis 2021; 21:3. [PMID: 33944906; PMCID: PMC8107488; DOI: 10.1167/jov.21.5.3]
Abstract
The vision sciences literature contains a large diversity of experimental and theoretical approaches to the study of visual attention. We argue that this diversity arises, at least in part, from the field's inability to unify differing theoretical perspectives. In particular, the field has been hindered by a lack of a principled formal framework for simultaneously thinking about both optimal attentional processing and capacity-limited attentional processing, where capacity is limited in a general, task-independent manner. Here, we supply such a framework based on rate-distortion theory (RDT) and optimal lossy compression. Our approach defines Bayes-optimal performance when an upper limit on information processing rate is imposed. In this article, we compare Bayesian and RDT accounts in both uncued and cued visual search tasks. We start by highlighting a typical shortcoming of unlimited-capacity Bayesian models that is not shared by RDT models, namely, that they often overestimate task performance when information-processing demands are increased. Next, we reexamine data from two cued-search experiments that have previously been modeled as the result of unlimited-capacity Bayesian inference and demonstrate that they can just as easily be explained as the result of optimal lossy compression. To model cued visual search, we introduce the concept of a "conditional communication channel." This simple extension generalizes the lossy-compression framework such that it can, in principle, predict optimal attentional-shift behavior in any kind of perceptual task, even when inputs to the model are raw sensory data such as image pixels. To demonstrate this idea's viability, we compare our idealized model of cued search, which operates on a simplified abstraction of the stimulus, to a deep neural network version that performs approximately optimal lossy compression on the real (pixel-level) experimental stimuli.
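The rate-distortion optimum the authors build on can be computed for discrete problems with the Blahut-Arimoto algorithm: find the encoder q(xhat|x) minimizing expected distortion plus (1/beta) times the information rate. A minimal sketch (the stimulus set and squared-error distortion are illustrative assumptions, not the paper's task):

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, n_iter=200):
    """Optimal lossy encoder q(xhat|x) at trade-off parameter beta.

    Larger beta buys lower expected distortion at a higher information
    rate; the curve traced over beta is the rate-distortion frontier.
    """
    n, m = dist.shape
    q_xhat = np.full(m, 1.0 / m)                    # marginal over percepts
    for _ in range(n_iter):
        q = q_xhat[None, :] * np.exp(-beta * dist)  # encoder update
        q /= q.sum(axis=1, keepdims=True)
        q_xhat = p_x @ q                            # marginal update
    rate = np.sum(p_x[:, None] * q *
                  np.log2(np.maximum(q / q_xhat[None, :], 1e-12)))
    distortion = np.sum(p_x[:, None] * q * dist)
    return q, rate, distortion

# Four stimulus values, squared-error distortion between stimulus and percept.
x = np.arange(4, dtype=float)
p_x = np.full(4, 0.25)
dist = (x[:, None] - x[None, :]) ** 2
for beta in (0.1, 1.0, 10.0):
    _, R, D = blahut_arimoto(p_x, dist, beta)
    print(f"beta={beta:5.1f}  rate={R:.2f} bits  distortion={D:.2f}")
```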
Affiliation(s)
- Robert A Jacobs: Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
18. Souto D, Kerzel D. Visual selective attention and the control of tracking eye movements: a critical review. J Neurophysiol 2021; 125:1552-1576. [DOI: 10.1152/jn.00145.2019]
Abstract
People’s eyes are directed at objects of interest with the aim of acquiring visual information. However, processing this information is constrained in capacity, requiring task-driven and salience-driven attentional mechanisms to select few among the many available objects. A wealth of behavioral and neurophysiological evidence has demonstrated that visual selection and the motor selection of saccade targets rely on shared mechanisms. This coupling supports the premotor theory of visual attention put forth more than 30 years ago, which postulates visual selection as a necessary stage in motor selection. In this review, we examine to what extent the coupling of visual and motor selection observed with saccades is replicated during ocular tracking. Ocular tracking combines catch-up saccades and smooth pursuit to foveate a moving object. We find evidence that ocular tracking requires visual selection of the speed and direction of the moving target, but the position of the motion signal may not coincide with the position of the pursuit target. Further, visual and motor selection can be spatially decoupled when pursuit is initiated (open-loop pursuit). We propose that a main function of coupled visual and motor selection is to serve the coordination of catch-up saccades and pursuit eye movements. A simple race-to-threshold model is proposed to explain the variable coupling of visual selection during pursuit, catch-up, and regular saccades, while generating testable predictions. We discuss pending issues, such as disentangling visual selection from preattentive visual processing and response selection, and the pinpointing of visual selection mechanisms, which have begun to be addressed in the neurophysiological literature.
Affiliation(s)
- David Souto: Department of Neuroscience, Psychology and Behaviour, University of Leicester, Leicester, United Kingdom
- Dirk Kerzel: Faculté de Psychologie et des Sciences de l’Education, University of Geneva, Geneva, Switzerland
19. Lago MA, Jonnalagadda A, Abbey CK, Barufaldi BB, Bakic PR, Maidment ADA, Leung WK, Weinstein SP, Englander BS, Eckstein MP. Under-exploration of three-dimensional images leads to search errors for small salient targets. Curr Biol 2021; 31:1099-1106.e5. [PMID: 33472051; PMCID: PMC8048135; DOI: 10.1016/j.cub.2020.12.029]
Abstract
Advances in 3D imaging technology are transforming how radiologists search for cancer [1, 2] and how security officers scrutinize baggage for dangerous objects [3]. These new 3D technologies often improve search over 2D images [4, 5] but vastly increase the image data. Here, we investigate 3D search for targets of various sizes in filtered noise and digital breast phantoms. For a Bayesian ideal observer optimally processing the filtered noise and a convolutional neural network processing the digital breast phantoms, search with 3D image stacks increases target information and improves accuracy over search with 2D images. In contrast, 3D search by humans leads to high miss rates for small targets easily detected in 2D search, but not for larger targets more visible in the visual periphery. Analyses of human eye movements, perceptual judgments, and a computational model with a foveated visual system suggest that human errors can be explained by interaction among a target's peripheral visibility, eye movement under-exploration of the 3D images, and a perceived overestimation of the explored area. Instructing observers to extend the search reduces 75% of the small target misses without increasing false positives. Results with twelve radiologists confirm that even medical professionals reading realistic breast phantoms have high miss rates for small targets in 3D search. Thus, under-exploration represents a fundamental limitation to the efficacy with which humans search in 3D image stacks and miss targets with these prevalent image technologies.
Affiliation(s)
- Miguel A Lago: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- Aditya Jonnalagadda: Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- Craig K Abbey: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
- Bruno B Barufaldi: Department of Radiology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA
- Predrag R Bakic: Department of Radiology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA
- Andrew D A Maidment: Department of Radiology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA
- Winifred K Leung: Ridley-Tree Cancer Center, Sansum Clinic, 540 W. Pueblo Street, Santa Barbara, CA 93105, USA
- Susan P Weinstein: Department of Radiology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA
- Brian S Englander: Department of Radiology, University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA
- Miguel P Eckstein: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
20. Meghdadi AH, Giesbrecht B, Eckstein MP. EEG signatures of contextual influences on visual search with real scenes. Exp Brain Res 2021; 239:797-809. [PMID: 33398454; DOI: 10.1007/s00221-020-05984-8]
Abstract
The use of scene context is a powerful way by which biological organisms guide and facilitate visual search. Although many studies have shown enhancements of target-related electroencephalographic (EEG) activity with synthetic cues, there have been fewer studies demonstrating such enhancements during search with scene context and objects in real-world scenes. Here, observers covertly searched for a target in images of real scenes while we used EEG to measure the steady-state visual evoked response to objects flickering at different frequencies. The target appeared in its typical contextual location or out of context, while we controlled for low-level properties of the image, including target saliency against the background and retinal eccentricity. A pattern classifier using EEG activity at the relevant modulated frequencies showed that target detection accuracy increased when the target was in a contextually appropriate location. A control condition, in which observers searched the same images for a different target orthogonal to the contextual manipulation, resulted in no effects of scene context on classifier performance, confirming that image properties cannot explain the contextual modulations of neural activity. Pattern classifier decisions for individual images were also related to the aggregated observer behavioral decisions for individual images. Together, these findings demonstrate that target-related neural responses are modulated by scene context during visual search with real-world scenes and can be related to behavioral search decisions.
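Frequency tagging of this kind is typically analyzed by measuring EEG spectral power at each object's flicker frequency and feeding those per-trial values to a classifier. A minimal sketch on a synthetic signal (sampling rate and tag frequencies are illustrative assumptions):

```python
import numpy as np

fs = 500.0                     # sampling rate (Hz)
t = np.arange(0, 2.0, 1 / fs)  # one 2-second trial
tags = {"target object": 12.0, "distractor object": 15.0}

# Synthetic EEG: a stronger 12 Hz drive (attended object) plus noise.
rng = np.random.default_rng(5)
eeg = (1.0 * np.sin(2 * np.pi * 12.0 * t)
       + 0.4 * np.sin(2 * np.pi * 15.0 * t)
       + rng.standard_normal(t.size))

# SSVEP features: spectral power at each tagging frequency.
power = np.abs(np.fft.rfft(eeg)) ** 2
freqs = np.fft.rfftfreq(t.size, 1 / fs)
features = {name: power[np.argmin(np.abs(freqs - f))]
            for name, f in tags.items()}
print(features)
# A pattern classifier over such per-trial features (target in vs. out of
# its typical context) yields the decoding accuracies described above.
```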
Affiliation(s)
- Amir H Meghdadi: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106-9660, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106-5100, USA
- Barry Giesbrecht: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106-9660, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106-5100, USA; Interdepartmental Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106-5100, USA
- Miguel P Eckstein: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106-9660, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA 93106-5100, USA; Interdepartmental Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, Santa Barbara, CA 93106-5100, USA
21. Sagar V, Sengupta R, Sridharan D. Dissociable sensitivity and bias mechanisms mediate behavioral effects of exogenous attention. Sci Rep 2019; 9:12657. [PMID: 31477747; PMCID: PMC6718663; DOI: 10.1038/s41598-019-42759-w]
Abstract
Attention can be directed endogenously, based on task-relevant goals, or captured exogenously, by salient stimuli. While recent studies have shown that endogenous attention can facilitate behavior through dissociable sensitivity (sensory) and choice bias (decisional) mechanisms, it is unknown if exogenous attention also operates through dissociable sensitivity and bias mechanisms. We tested human participants on a multialternative change detection task with exogenous attention cues, which preceded or followed change events in close temporal proximity. Analyzing participants’ behavior with a multidimensional signal detection model revealed clear dissociations between exogenous cueing effects on sensitivity and bias. While sensitivity was, overall, lower at the cued location compared to other locations, bias was highest at the cued location. With an appropriately designed post-cue control condition, we discovered that the attentional effect of exogenous pre-cueing was to enhance sensitivity proximal to the cue. In contrast, exogenous attention enhanced bias even for distal stimuli in the cued hemifield. Reaction time effects of exogenous cueing could be parsimoniously explained with a diffusion-decision model, in which drift rate was determined by independent contributions from sensitivity and bias at each location. The results suggest a mechanistic schema of how exogenous attention engages dissociable sensitivity and bias mechanisms to shape behavior.
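In classic equal-variance signal detection terms, the sensitivity/bias dissociation is between d′ and criterion c, both computable from hit and false-alarm rates; the paper used a multidimensional, multialternative extension of this idea. A minimal one-dimensional sketch with illustrative rates:

```python
from scipy.stats import norm

def sdt_indices(hit_rate, fa_rate):
    """Equal-variance SDT: sensitivity d' and criterion c from H and FA."""
    z_h, z_f = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return z_h - z_f, -0.5 * (z_h + z_f)  # c > 0: conservative responding

# Illustrative rates matching the abstract's pattern: at the cued location,
# bias is more liberal while sensitivity is somewhat lower.
for loc, (h, f) in {"cued": (0.80, 0.35), "uncued": (0.75, 0.15)}.items():
    d, c = sdt_indices(h, f)
    print(f"{loc:6s}  d' = {d:.2f}  c = {c:+.2f}")
```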
Affiliation(s)
- Vishak Sagar: Centre for Neuroscience, Indian Institute of Science, C. V. Raman Avenue, Bangalore, 560012, India
- Ranit Sengupta: Centre for Neuroscience, Indian Institute of Science, C. V. Raman Avenue, Bangalore, 560012, India
- Devarajan Sridharan: Centre for Neuroscience, Indian Institute of Science, C. V. Raman Avenue, Bangalore, 560012, India
22. Smith KG, Schmidt J, Wang B, Henderson JM, Fridriksson J. Task-related differences in eye movements in individuals with aphasia. Front Psychol 2018; 9:2430. [PMID: 30618911; PMCID: PMC6305326; DOI: 10.3389/fpsyg.2018.02430]
Abstract
Background: Neurotypical young adults show task-based modulation and stability of their eye movements across tasks. This study aimed to determine whether persons with aphasia (PWA) modulate their eye movements and show stability across tasks similarly to control participants.
Methods: Forty-eight PWA and age-matched control participants completed four eye-tracking tasks: scene search, scene memorization, text-reading, and pseudo-reading.
Results: Main effects of task emerged for mean fixation duration, saccade amplitude, and the standard deviations of each, demonstrating task-based modulation of eye movements. Group by task interactions indicated that PWA produced shorter fixations relative to controls. This effect was most pronounced for scene memorization and for individuals who had recently suffered a stroke. PWA produced longer fixations, shorter saccades, and less variable eye movements in reading tasks compared to controls. Three-way interactions of group, aphasia subtype, and task also emerged. Text-reading and scene memorization were particularly effective at distinguishing aphasia subtype. Persons with anomic aphasia showed a reduction in reading saccade amplitudes relative to their respective control group and other PWA. Persons with conduction/Wernicke's aphasia produced shorter scene memorization fixations relative to controls or PWA of other subtypes, suggesting a memorization-specific effect. Positive correlations across most tasks emerged for fixation duration and did not significantly differ between controls and PWA.
Conclusion: PWA generally produced shorter fixations and smaller saccades relative to controls, particularly in scene memorization and text-reading, respectively. The effect was most pronounced soon after a stroke. Selectively in reading tasks, PWA produced longer fixations and shorter saccades relative to controls, consistent with reading difficulty. PWA showed task-based modulation of eye movements, though the pattern of results was somewhat abnormal relative to controls. All subtypes of PWA also demonstrated task-based modulation of eye movements. However, persons with anomic aphasia showed reduced modulation of saccade amplitude and smaller reading saccades, possibly to improve reading comprehension. Controls and PWA generally produced stable fixation durations across tasks and did not differ in their relationship across tasks. Overall, these results suggest there is potential to differentiate among PWA with varying subtypes and from controls using eye movement measures of task-based modulation, especially in reading and scene memorization tasks.
Affiliation(s)
- Kimberly G Smith: Department of Speech Pathology & Audiology, University of South Alabama, Mobile, AL, United States; Department of Communication Sciences & Disorders, University of South Carolina, Columbia, SC, United States
- Joseph Schmidt: Department of Psychology, University of Central Florida, Orlando, FL, United States
- Bin Wang: Department of Mathematics and Statistics, University of South Alabama, Mobile, AL, United States
- John M Henderson: Department of Psychology, Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Julius Fridriksson: Department of Communication Sciences & Disorders, University of South Carolina, Columbia, SC, United States
23. Akbas E, Eckstein MP. Object detection through search with a foveated visual system. PLoS Comput Biol 2017; 13:e1005743. [PMID: 28991906; PMCID: PMC5669499; DOI: 10.1371/journal.pcbi.1005743]
Abstract
Humans and many other species sense visual information with varying spatial resolution across the visual field (foveated vision) and deploy eye movements to actively sample regions of interest in scenes. The advantage of such a varying-resolution architecture is a reduced computational, and hence metabolic, cost. But what are the performance costs of such a processing strategy relative to a scheme that processes the visual field at high spatial resolution? Here we first focus on visual search and combine object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We develop a foveated object detector that processes the entire scene with varying resolution, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image, and integrates observations across multiple fixations. We compared the foveated object detector against a non-foveated version of the same object detector which processes the entire image at homogeneous high spatial resolution. We evaluated the accuracy of the foveated and non-foveated object detectors identifying 20 different object classes in scenes from a standard computer vision data set (the PASCAL VOC 2007 dataset). We show that the foveated object detector can approximate the performance of the object detector with homogeneous high spatial resolution processing while bringing significant computational cost savings. Additionally, we assessed the impact of foveation on the computation of bottom-up saliency. An implementation of a simple foveated bottom-up saliency model with eye movements showed agreement in the selection of top salient regions of scenes with those selected by a non-foveated high-resolution saliency model. Together, our results might help explain the evolution of foveated visual systems with eye movements as a solution that preserves perceptual performance in visual search while resulting in computational and metabolic savings to the brain.
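The core architectural ingredient, resolution that falls off with eccentricity, can be sketched as pooling features over windows that grow with distance from the fovea. A minimal 1D illustration (the linear pooling-size scaling is an illustrative assumption, simpler than the V1 pooling model the authors used):

```python
import numpy as np

def foveate(signal, fovea_idx, scale=0.3):
    """Average-pool a 1D feature array in windows that grow with eccentricity.

    Window size grows as scale * |i - fovea_idx|: full resolution at the
    fovea, progressively coarser summaries in the periphery.
    """
    out = np.empty_like(signal, dtype=float)
    for i in range(signal.size):
        half = int(scale * abs(i - fovea_idx)) // 2
        lo, hi = max(0, i - half), min(signal.size, i + half + 1)
        out[i] = signal[lo:hi].mean()
    return out

rng = np.random.default_rng(6)
image_row = rng.standard_normal(200)
fov = foveate(image_row, fovea_idx=100)

# A small target far from fixation is smeared into its surround, which is
# why the detector must saccade its fovea to candidate locations.
print("foveal value preserved:", np.isclose(fov[100], image_row[100]))
print("peripheral value smoothed:", abs(fov[5] - image_row[5]) > 0)
```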
Affiliation(s)
- Emre Akbas: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, California, United States of America; Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
- Miguel P. Eckstein: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, California, United States of America; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, California, United States of America
24. Eckstein MP, Koehler K, Welbourne LE, Akbas E. Humans, but not deep neural networks, often miss giant targets in scenes. Curr Biol 2017; 27:2827-2832.e3. [PMID: 28889976; DOI: 10.1016/j.cub.2017.07.068]
Abstract
Even with great advances in machine vision, animals are still unmatched in their ability to visually search complex scenes. Animals from bees [1, 2] to birds [3] to humans [4-12] learn about the statistical relations in visual environments to guide and aid their search for targets. Here, we investigate a novel manner in which humans utilize rapidly acquired information about scenes by guiding search toward likely target sizes. We show that humans often miss targets when their size is inconsistent with the rest of the scene, even when the targets were made larger and more salient and observers fixated the target. In contrast, we show that state-of-the-art deep neural networks do not exhibit such deficits in finding mis-scaled targets but, unlike humans, can be fooled by target-shaped distractors that are inconsistent with the expected target's size within the scene. Thus, it is not a human deficiency to miss targets when they are inconsistent in size with the scene; instead, it is a byproduct of a useful strategy that the brain has implemented to rapidly discount potential distractors.
Affiliation(s)
- Miguel P Eckstein: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, 93103, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA, 93103, USA
- Kathryn Koehler: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, 93103, USA
- Lauren E Welbourne: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, 93103, USA; Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA, 93103, USA
- Emre Akbas: Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, 93103, USA; Department of Computer Engineering, Middle East Technical University, 06800 Ankara, Turkey