1
Bohil CJ, Phelps A, Neider MB, Schmidt J. Explicit and implicit category learning in categorical visual search. Atten Percept Psychophys 2023;85:2131-2149. PMID: 37784002. DOI: 10.3758/s13414-023-02789-z.
Abstract
Categorical search has been heavily investigated over the past decade, mostly using natural categories that leave the underlying category mental representation unknown. The categorization literature offers several theoretical accounts of category mental representations. One prominent account holds that classification is supported by two separate learning systems: an explicit learning system that relies on easily verbalized rules and an implicit learning system that relies on an associatively learned (nonverbalizable) information-integration strategy. The current study assessed the contributions of these separate category learning systems in the context of categorical search using simple stimuli. Participants learned to classify sinusoidal grating stimuli according to explicit or implicit categorization strategies, followed by a categorical search task using these same stimulus categories. Computational modeling determined which participants used the appropriate classification strategy during training and search, and eye movements collected during categorical search were assessed. We found that the trained categorization strategies overwhelmingly transferred to the verification (classification response) phase of search. Implicit category learning led to faster search responses and shorter target dwell times relative to explicit category learning, consistent with the notion that explicit rule classification relies on a more deliberative response strategy. Participants who transferred the correct category learning strategy to the search guidance phase produced stronger search guidance (defined as the proportion of trials on which the target was the first item fixated), with evidence of greater guidance in implicit-strategy learners. This demonstrates that both implicit and explicit categorization systems contribute to categorical search and produce dissociable patterns of data.
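The strategy-identification step described in this abstract is typically handled with decision-bound modeling: a verbalizable single-dimension rule model and a linear information-integration model are each fit to a participant's classification responses and compared with an information criterion. A minimal sketch of that comparison is shown below; the stimulus dimensions, simulated responses, and Nelder-Mead fitting choices are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def nll_rule(params, x, resp):
    """Explicit rule: respond 'A' when dimension 0 exceeds a criterion."""
    criterion, log_sigma = params
    p_a = norm.cdf((x[:, 0] - criterion) / np.exp(log_sigma))
    p_a = np.clip(p_a, 1e-6, 1 - 1e-6)
    return -np.sum(resp * np.log(p_a) + (1 - resp) * np.log(1 - p_a))

def nll_integration(params, x, resp):
    """Implicit strategy: linear integration of both stimulus dimensions."""
    w0, w1, bias, log_sigma = params
    d = (w0 * x[:, 0] + w1 * x[:, 1] + bias) / np.exp(log_sigma)
    p_a = np.clip(norm.cdf(d), 1e-6, 1 - 1e-6)
    return -np.sum(resp * np.log(p_a) + (1 - resp) * np.log(1 - p_a))

def bic(nll, k, n):
    return 2 * nll + k * np.log(n)

# Hypothetical data: x holds spatial-frequency/orientation coordinates,
# resp holds 0/1 category responses for one participant.
rng = np.random.default_rng(0)
x = rng.normal(size=(300, 2))
resp = (x @ np.array([1.0, 1.0]) + rng.normal(scale=0.5, size=300) > 0).astype(int)

fit_rule = minimize(nll_rule, [0.0, 0.0], args=(x, resp), method="Nelder-Mead")
fit_int = minimize(nll_integration, [1.0, 1.0, 0.0, 0.0], args=(x, resp), method="Nelder-Mead")
print("BIC rule:", bic(fit_rule.fun, 2, len(resp)))
print("BIC integration:", bic(fit_int.fun, 4, len(resp)))  # lower BIC = preferred strategy
```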
Affiliation(s)
- Corey J Bohil: Department of Psychology, University of Central Florida, Orlando, FL, USA; Lawrence Technological University, 21000 West Ten Mile Road, Southfield, MI 48075, USA
- Ashley Phelps: Department of Psychology, University of Central Florida, Orlando, FL, USA
- Mark B Neider: Department of Psychology, University of Central Florida, Orlando, FL, USA
- Joseph Schmidt: Department of Psychology, University of Central Florida, Orlando, FL, USA
2
Sakata C, Ueda Y, Moriguchi Y. Visual memory of a co-actor's target during joint search. Psychol Res 2023;87:2068-2085. PMID: 36976364; PMCID: PMC10043510. DOI: 10.1007/s00426-023-01819-7.
Abstract
Studies on joint action show that when two actors take turns attending to each other's targets, which appear one at a time, the partner's target is accumulated in memory. However, in the real world, actors may not be certain that they are attending to the same object, because multiple objects often appear simultaneously. In this study, we asked participant pairs to search for different targets in parallel among multiple objects and investigated memory for a partner's target. We employed the contextual cueing paradigm, in which repeated search forms associative memory between a target and a configuration of distractors that facilitates search. During the learning phase, exemplars of three target categories (i.e., bird, shoe, and tricycle) were presented among unique objects, and participant pairs searched for them. In Experiment 1, the learning phase was followed by a memory test on the target exemplars, in which the partner's target was recognized better than the target that nobody had searched for. In Experiments 2a and 2b, the memory test was replaced with a transfer phase, in which one individual from the pair searched for the category that nobody had searched for while the other individual searched for the category the partner had searched for in the learning phase. The transfer phase showed no search facilitation underpinned by associative memory between the partner's target and the distractors. These results suggest that when participant pairs search for different targets in parallel, they accumulate the partner's target in memory but may not form the associative memory with distractors that facilitates its search.
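Contextual cueing is conventionally quantified as the response-time benefit for repeated (predictive) displays over novel displays, emerging across blocks of the learning phase. The sketch below shows that computation on hypothetical per-trial data; the trial counts, epoch size, and simulated response times are assumptions for illustration only.

```python
import numpy as np

# Hypothetical per-trial data: whether the display configuration is repeated,
# the block number, and correct-trial response times in ms.
rng = np.random.default_rng(1)
n_trials = 480
block = np.repeat(np.arange(1, 21), 24)                       # 20 blocks x 24 trials
repeated = np.tile(np.r_[np.ones(12), np.zeros(12)], 20).astype(bool)
rt = rng.normal(1200, 150, n_trials) - 8 * block - 60 * repeated * (block > 5)

# Collapse blocks into epochs of 4 and compute the cueing effect per epoch.
epoch = (block - 1) // 4
for e in np.unique(epoch):
    novel_rt = rt[(epoch == e) & ~repeated].mean()
    repeated_rt = rt[(epoch == e) & repeated].mean()
    print(f"epoch {e + 1}: cueing effect = {novel_rt - repeated_rt:.0f} ms")
```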
Affiliation(s)
- Chifumi Sakata: Graduate School of Letters, Kyoto University, Yoshida Hon-Machi, Sakyo-Ku, Kyoto, 606-8501, Japan
- Yoshiyuki Ueda: Institute for the Future of Human Society, Kyoto University, 46 Yoshida Shimoadachi-Cho, Sakyo-Ku, Kyoto, 606-8501, Japan
- Yusuke Moriguchi: Graduate School of Letters, Kyoto University, Yoshida Hon-Machi, Sakyo-Ku, Kyoto, 606-8501, Japan
3
Phelps AM, Alexander RG, Schmidt J. Negative cues minimize visual search specificity effects. Vision Res 2022;196:108030. PMID: 35313163; PMCID: PMC9090971. DOI: 10.1016/j.visres.2022.108030.
Abstract
Prior target knowledge (i.e., positive cues) improves visual search performance. However, there is considerable debate about whether distractor knowledge (i.e., negative cues) can guide search. Some studies suggest active suppression of negatively cued search items, while others suggest that negatively cued items initially capture attention. Prior work has used pictorial or specific text cues but has not explicitly compared them. We build on that work by comparing positive and negative cues presented pictorially and as categorical text labels, using photorealistic objects and eye movement measures. Search displays contained a target (cued on positive trials), a lure from the target category (cued on negative trials), and four categorically unrelated distractors. With positive cues, pictorial cues produced stronger attentional guidance and faster object recognition than categorical cues (i.e., a pictorial advantage, suggesting that the specific visual details afforded by pictorial cues improved search). However, in most search performance metrics, negative cues mitigated the pictorial advantage. Given that negatively cued items captured attention and generated target guidance, yet mitigated the pictorial advantage, these results are partly consistent with both existing accounts. Specific visual details provided in positive cues produced a large pictorial advantage in all measures, whereas specific visual details in negative cues produced only a small pictorial advantage for object recognition and none for attentional guidance. This asymmetry in the pictorial advantage suggests that the down-weighting of specific negatively cued visual features is less efficient than the up-weighting of specific positively cued visual features.
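Two eye movement measures used throughout this literature are search guidance (the proportion of trials on which the target is the first object fixated) and verification time (the interval from first fixating the target to the manual response). A minimal sketch of both computations on hypothetical trial records; the object labels and timing fields are assumed for illustration.

```python
import numpy as np

# Hypothetical per-trial records: sequence of fixated objects ("target", "lure",
# "distractor"), time of the first target fixation, and the response time (ms).
trials = [
    {"fixated": ["target", "distractor"], "t_first_target_fix": 240, "rt": 820},
    {"fixated": ["lure", "target"], "t_first_target_fix": 510, "rt": 1100},
    {"fixated": ["distractor", "lure", "target"], "t_first_target_fix": 650, "rt": 1290},
]

guidance = np.mean([t["fixated"][0] == "target" for t in trials])
verification = np.mean([t["rt"] - t["t_first_target_fix"]
                        for t in trials if "target" in t["fixated"]])

print(f"search guidance: {guidance:.2f}")        # proportion of target-first trials
print(f"mean verification time: {verification:.0f} ms")
```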
Affiliation(s)
- Ashley M Phelps: Department of Psychology, University of Central Florida, Orlando, FL, USA
- Robert G Alexander: Departments of Ophthalmology, Neurology, and Physiology & Pharmacology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
- Joseph Schmidt: Department of Psychology, University of Central Florida, Orlando, FL, USA
4
Killingsworth CD, Bohil CJ. Breast Tissue Density Influences Tumor Malignancy Perception and Decisions in Mammography. J Appl Res Mem Cogn 2021. DOI: 10.1016/j.jarmac.2021.07.005.
5
Chen Y, Yang Z, Ahn S, Samaras D, Hoai M, Zelinsky G. COCO-Search18 fixation dataset for predicting goal-directed attention control. Sci Rep 2021;11:8776. PMID: 33888734; PMCID: PMC8062491. DOI: 10.1038/s41598-021-87715-9.
Abstract
Attention control is a basic behavioral process that has been studied for decades. The best current models of attention control are deep networks trained on free-viewing behavior to predict bottom-up attention control, or saliency. We introduce COCO-Search18, the first dataset of laboratory-quality goal-directed behavior large enough to train deep-network models. We collected eye-movement behavior from 10 people searching for each of 18 target-object categories in 6202 natural-scene images, yielding approximately 300,000 search fixations. We thoroughly characterize COCO-Search18 and benchmark it using three machine-learning methods: a ResNet50 object detector, a ResNet50 trained on fixation-density maps, and an inverse-reinforcement-learning model trained on behavioral search scanpaths. Models were also trained and tested on images transformed to approximate a foveated retina, a fundamental biological constraint. These models, each with a different reliance on behavioral training, collectively comprise the new state of the art in predicting goal-directed search fixations. Our expectation is that future work using COCO-Search18 will far surpass these initial efforts, finding applications in domains ranging from human-computer interactive systems that can anticipate a person's intent and render assistance, to the potentially early identification of attention-related clinical disorders (ADHD, PTSD, phobia) based on deviation from neurotypical fixation behavior.
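One benchmark target mentioned above is a fixation-density map: a per-image map built by accumulating search fixations and smoothing them with a Gaussian whose width approximates the fovea. A minimal sketch follows; the image size, fixation coordinates, and smoothing sigma are illustrative assumptions, not COCO-Search18 values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_density_map(fixations, height, width, sigma_px=25.0):
    """Accumulate (x, y) fixations into a count map and blur with a Gaussian."""
    counts = np.zeros((height, width))
    for x, y in fixations:
        counts[int(round(y)), int(round(x))] += 1.0
    density = gaussian_filter(counts, sigma=sigma_px)
    return density / density.sum()      # normalize to a probability map

# Hypothetical fixations from several searchers on one 320 x 512 image.
fixations = [(100, 80), (105, 90), (300, 150), (310, 160), (120, 85)]
fdm = fixation_density_map(fixations, height=320, width=512)
print(fdm.shape, fdm.max())
```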
Affiliation(s)
- Yupei Chen: Department of Psychology, Stony Brook University, New York, USA
- Zhibo Yang: Department of Computer Science, Stony Brook University, New York, USA
- Seoyoung Ahn: Department of Psychology, Stony Brook University, New York, USA
- Dimitris Samaras: Department of Computer Science, Stony Brook University, New York, USA
- Minh Hoai: Department of Computer Science, Stony Brook University, New York, USA
- Gregory Zelinsky: Department of Psychology, Stony Brook University, New York, USA; Department of Computer Science, Stony Brook University, New York, USA
6
Zelinsky GJ, Chen Y, Ahn S, Adeli H, Yang Z, Huang L, Samaras D, Hoai M. Predicting Goal-directed Attention Control Using Inverse-Reinforcement Learning. Neurons Behav Data Anal Theory 2021. PMID: 34164631; PMCID: PMC8218820. DOI: 10.51628/001c.22322.
Abstract
Understanding how goals control behavior is a question ripe for interrogation by new methods from machine learning. These methods require large, labeled datasets to train models. To annotate a large-scale image dataset with observed search fixations, we collected 16,184 fixations from people searching for either microwaves or clocks in a dataset of 4,366 images (MS-COCO). We then used this behaviorally annotated dataset and the machine learning method of inverse-reinforcement learning (IRL) to learn target-specific reward functions and policies for these two target goals. Finally, we used these learned policies to predict the fixations of 60 new behavioral searchers (clock = 30, microwave = 30) in a disjoint test dataset of kitchen scenes depicting both a microwave and a clock (thus controlling for differences in low-level image contrast). We found that the IRL model predicted behavioral search efficiency and fixation-density maps across multiple metrics. Moreover, reward maps from the IRL model revealed target-specific patterns that suggest not just attention guidance by target features, but also guidance by scene context (e.g., fixations along walls when searching for clocks). Using machine learning and the psychologically meaningful principle of reward, it is possible to learn the visual features used in goal-directed attention control.
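Once a reward (or priority) map has been learned, predicted scanpaths like those described above are often generated by repeatedly fixating the current maximum of the map and then suppressing that neighborhood (inhibition of return). The sketch below is a generic rollout of that kind, not the authors' IRL policy; the map, suppression radius, and fixation count are assumptions.

```python
import numpy as np

def rollout_scanpath(reward_map, n_fixations=5, ior_radius=20):
    """Greedy fixation sequence: fixate the peak, then suppress its neighborhood."""
    priority = reward_map.copy()
    h, w = priority.shape
    yy, xx = np.mgrid[0:h, 0:w]
    scanpath = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(priority), priority.shape)
        scanpath.append((x, y))
        # Inhibition of return: suppress a disk around the selected location.
        priority[(yy - y) ** 2 + (xx - x) ** 2 <= ior_radius ** 2] = -np.inf
    return scanpath

rng = np.random.default_rng(2)
reward_map = rng.random((240, 320))          # stand-in for a learned reward map
print(rollout_scanpath(reward_map))
```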
Affiliation(s)
- Gregory J. Zelinsky: Department of Psychology, Stony Brook University, Stony Brook, NY, 11794, USA; Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA
- Yupei Chen: Department of Psychology, Stony Brook University, Stony Brook, NY, 11794, USA
- Seoyoung Ahn: Department of Psychology, Stony Brook University, Stony Brook, NY, 11794, USA
- Hossein Adeli: Department of Psychology, Stony Brook University, Stony Brook, NY, 11794, USA
- Zhibo Yang: Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA
- Lihan Huang: Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA
- Dimitrios Samaras: Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA
- Minh Hoai: Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA
7
Alexander RG, Waite S, Macknik SL, Martinez-Conde S. What do radiologists look for? Advances and limitations of perceptual learning in radiologic search. J Vis 2020;20:17. PMID: 33057623; PMCID: PMC7571277. DOI: 10.1167/jov.20.10.17.
Abstract
Guided by training during residency programs, radiologists learn clinically relevant visual features by viewing thousands of medical images. Yet the precise visual features that expert radiologists use in their clinical practice remain unknown. Identifying such features would allow the development of perceptual learning training methods targeted to the optimization of radiology training and the reduction of medical error. Here we review attempts to bridge current gaps in understanding with a focus on computational saliency models that characterize and predict gaze behavior in radiologists. There have been great strides toward the accurate prediction of relevant medical information within images, thereby facilitating the development of novel computer-aided detection and diagnostic tools. In some cases, computational models have achieved equivalent sensitivity to that of radiologists, suggesting that we may be close to identifying the underlying visual representations that radiologists use. However, because the relevant bottom-up features vary across task context and imaging modalities, it will also be necessary to identify relevant top-down factors before perceptual expertise in radiology can be fully understood. Progress along these dimensions will improve the tools available for educating new generations of radiologists, and aid in the detection of medically relevant information, ultimately improving patient health.
Affiliation(s)
- Robert G Alexander: Department of Ophthalmology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
- Stephen Waite: Department of Radiology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
- Stephen L Macknik: Department of Ophthalmology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
- Susana Martinez-Conde: Department of Ophthalmology, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
8
Clayden AC, Fisher RB, Nuthmann A. On the relative (un)importance of foveal vision during letter search in naturalistic scenes. Vision Res 2020;177:41-55. PMID: 32957035. DOI: 10.1016/j.visres.2020.07.005.
Abstract
The importance of high-acuity foveal vision to visual search can be assessed by denying foveal vision using the gaze-contingent Moving Mask technique. Foveal vision was necessary to attain normal performance when searching for a target letter in alphanumeric displays (Perception & Psychophysics, 62 (2000), 576-585). In contrast, foveal vision was not necessary to correctly locate and identify medium-sized target objects in natural scenes (Journal of Experimental Psychology: Human Perception and Performance, 40 (2014), 342-360). To explore these task differences, we used grayscale pictures of real-world scenes which included a target letter (Experiment 1: T; Experiment 2: T or L). To reduce between-scene variability with regard to target salience, we developed the Target Embedding Algorithm (T.E.A.) to place the letter in a location for which there was a median change in local contrast when inserting the letter into the scene. The presence or absence of foveal vision was crossed with four target sizes. In both experiments, search performance decreased for smaller targets and was impaired when searching the scene without foveal vision. For correct trials, the process of target localization remained completely unimpaired by the foveal scotoma, but it took longer to accept the target. We reasoned that the size of the target may affect the importance of foveal vision to the task, but the present data remain ambiguous. In summary, the data highlight the importance of extrafoveal vision for target localization, and the importance of foveal vision for target verification during letter-in-scene search.
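The Target Embedding Algorithm described above selects, among candidate placements, the location where inserting the letter produces the median change in local contrast. A minimal sketch of that selection logic follows; the RMS contrast measure, candidate grid, and crude letter patch are simplifying assumptions rather than the published algorithm.

```python
import numpy as np

def rms_contrast(patch):
    return patch.std() / max(patch.mean(), 1e-6)

def embed_target(scene, letter, candidates):
    """Pick the candidate location whose letter insertion yields the median
    change in local RMS contrast."""
    h, w = letter.shape
    changes = []
    for (y, x) in candidates:
        patch = scene[y:y + h, x:x + w]
        patched = patch.copy()
        patched[letter > 0] = letter[letter > 0]
        changes.append(abs(rms_contrast(patched) - rms_contrast(patch)))
    order = np.argsort(changes)
    return candidates[order[len(order) // 2]]   # median-change location

rng = np.random.default_rng(3)
scene = rng.random((600, 800))                  # stand-in for a grayscale scene
letter = np.zeros((12, 12))
letter[2:4, 2:10] = 1.0                         # crude "T"-like mark
letter[2:10, 5:7] = 1.0
candidates = [(y, x) for y in range(50, 550, 100) for x in range(50, 750, 100)]
print("chosen location:", embed_target(scene, letter, candidates))
```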
Affiliation(s)
- Adam C Clayden: Psychology Department, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, UK; School of Engineering, Arts, Science and Technology, University of Suffolk, UK
- Antje Nuthmann: Psychology Department, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, UK; Institute of Psychology, University of Kiel, Germany
9
Yang Z, Huang L, Chen Y, Wei Z, Ahn S, Zelinsky G, Samaras D, Hoai M. Predicting Goal-directed Human Attention Using Inverse Reinforcement Learning. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2020;2020:190-199. PMID: 34163124; PMCID: PMC8218821. DOI: 10.1109/cvpr42600.2020.00027.
Abstract
Human gaze behavior prediction is important for behavioral vision and for computer vision applications. Most models mainly focus on predicting free-viewing behavior using saliency maps, but do not generalize to goal-directed behavior, such as when a person searches for a visual target object. We propose the first inverse reinforcement learning (IRL) model to learn the internal reward function and policy used by humans during visual search. We modeled the viewer's internal belief states as dynamic contextual belief maps of object locations. These maps were learned and then used to predict behavioral scanpaths for multiple target categories. To train and evaluate our IRL model we created COCO-Search18, which is now the largest dataset of high-quality search fixations in existence. COCO-Search18 has 10 participants searching for each of 18 target-object categories in 6202 images, making about 300,000 goal-directed fixations. When trained and evaluated on COCO-Search18, the IRL model outperformed baseline models in predicting search fixation scanpaths, both in terms of similarity to human search behavior and search efficiency. Finally, reward maps recovered by the IRL model reveal distinctive target-dependent patterns of object prioritization, which we interpret as a learned object context.
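Scanpath similarity, one of the evaluation criteria mentioned above, is often scored by discretizing fixations into grid cells and computing a string-edit (Levenshtein) distance between predicted and observed sequences. The sketch below illustrates that scoring; the grid size and example scanpaths are assumptions, and the authors' actual metrics may differ.

```python
import numpy as np

def to_string(scanpath, img_w, img_h, n_cols=8, n_rows=5):
    """Map (x, y) fixations to grid-cell labels."""
    labels = []
    for x, y in scanpath:
        col = min(int(x / img_w * n_cols), n_cols - 1)
        row = min(int(y / img_h * n_rows), n_rows - 1)
        labels.append(row * n_cols + col)
    return labels

def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return d[-1, -1]

human = [(100, 80), (420, 210), (450, 230)]      # hypothetical observed scanpath
model = [(110, 90), (400, 200), (600, 300)]      # hypothetical predicted scanpath
print(edit_distance(to_string(human, 640, 400), to_string(model, 640, 400)))
```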
10
Changing perspectives on goal-directed attention control: The past, present, and future of modeling fixations during visual search. Psychol Learn Motiv 2020. DOI: 10.1016/bs.plm.2020.08.001.
11
Yu CP, Liu H, Samaras D, Zelinsky GJ. Modelling attention control using a convolutional neural network designed after the ventral visual pathway. Vis Cogn 2019. DOI: 10.1080/13506285.2019.1661927.
Affiliation(s)
- Chen-Ping Yu: Department of Computer Science, Stony Brook University, Stony Brook, NY, USA; Department of Psychology, Harvard University, Cambridge, MA, USA
- Huidong Liu: Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Dimitrios Samaras: Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Gregory J. Zelinsky: Department of Computer Science, Stony Brook University, Stony Brook, NY, USA; Department of Psychology, Stony Brook University, Stony Brook, NY, USA
12
Alexander RG, Nahvi RJ, Zelinsky GJ. Specifying the precision of guiding features for visual search. J Exp Psychol Hum Percept Perform 2019;45:1248-1264. PMID: 31219282; PMCID: PMC6706321. DOI: 10.1037/xhp0000668.
Abstract
Visual search is the task of finding things with uncertain locations. Despite decades of research, the features that guide visual search remain poorly specified, especially in realistic contexts. This study tested the role of two features, shape and orientation, both in the presence and absence of hue information. We conducted five experiments to describe preview-target mismatch effects: decreases in performance caused by differences between the image of the target as it appears in the preview and as it appears in the actual search display. These mismatch effects provide direct measures of feature importance, with larger performance decrements expected for more important features. Contrary to previous conclusions, our data suggest that shape and orientation only guide visual search when color is not available. By varying the probability of mismatch in each feature dimension, we also show that these patterns of feature guidance do not change with the probability that the previewed feature will be invalid. We conclude that the target representations used to guide visual search are much less precise than previously believed, with participants encoding and using color and little else.
13
Abstract
We proposed to abandon the item as the conceptual unit in visual search and to adopt a fixation-based framework instead. We address various themes raised by our commentators, including the nature of the Functional Visual Field and existing similar ideas, alongside the importance of items, covert attention, and top-down/contextual influences. We reflect on the current state of, and future directions for, visual search.
14
Smith KG, Schmidt J, Wang B, Henderson JM, Fridriksson J. Task-Related Differences in Eye Movements in Individuals With Aphasia. Front Psychol 2018;9:2430. PMID: 30618911; PMCID: PMC6305326. DOI: 10.3389/fpsyg.2018.02430.
Abstract
Background: Neurotypical young adults show task-based modulation and stability of their eye movements across tasks. This study aimed to determine whether persons with aphasia (PWA) modulate their eye movements and show stability across tasks similarly to control participants.
Methods: Forty-eight PWA and age-matched control participants completed four eye-tracking tasks: scene search, scene memorization, text-reading, and pseudo-reading.
Results: Main effects of task emerged for mean fixation duration, saccade amplitude, and the standard deviations of each, demonstrating task-based modulation of eye movements. Group by task interactions indicated that PWA produced shorter fixations relative to controls. This effect was most pronounced for scene memorization and for individuals who had recently suffered a stroke. PWA produced longer fixations, shorter saccades, and less variable eye movements in reading tasks compared to controls. Three-way interactions of group, aphasia subtype, and task also emerged. Text-reading and scene memorization were particularly effective at distinguishing aphasia subtypes. Persons with anomic aphasia showed a reduction in reading saccade amplitudes relative to their respective control group and other PWA. Persons with conduction/Wernicke's aphasia produced shorter scene memorization fixations relative to controls or PWA of other subtypes, suggesting a memorization-specific effect. Positive correlations across most tasks emerged for fixation duration and did not significantly differ between controls and PWA.
Conclusion: PWA generally produced shorter fixations and smaller saccades relative to controls, particularly in scene memorization and text-reading, respectively. The effect was most pronounced recently after a stroke. Selectively in reading tasks, PWA produced longer fixations and shorter saccades relative to controls, consistent with reading difficulty. PWA showed task-based modulation of eye movements, though the pattern of results was somewhat abnormal relative to controls. All subtypes of PWA also demonstrated task-based modulation of eye movements. However, persons with anomic aphasia showed reduced modulation of saccade amplitude and smaller reading saccades, possibly to improve reading comprehension. Controls and PWA generally produced stable fixation durations across tasks and did not differ in their relationship across tasks. Overall, these results suggest there is potential to differentiate among PWA with varying subtypes and from controls using eye movement measures of task-based modulation, especially reading and scene memorization tasks.
Affiliation(s)
- Kimberly G Smith: Department of Speech Pathology & Audiology, University of South Alabama, Mobile, AL, United States; Department of Communication Sciences & Disorders, University of South Carolina, Columbia, SC, United States
- Joseph Schmidt: Department of Psychology, University of Central Florida, Orlando, FL, United States
- Bin Wang: Department of Mathematics and Statistics, University of South Alabama, Mobile, AL, United States
- John M Henderson: Department of Psychology, Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Julius Fridriksson: Department of Communication Sciences & Disorders, University of South Carolina, Columbia, SC, United States
15
Alexander RG, Zelinsky GJ. Occluded information is restored at preview but not during visual search. J Vis 2018;18:4. PMID: 30347091; PMCID: PMC6181188. DOI: 10.1167/18.11.4.
Abstract
Objects often appear with some amount of occlusion. We fill in missing information using local shape features even before attending to those objects, a process called amodal completion. Here we explore the possibility that knowledge about common realistic objects can be used to "restore" missing information even in cases where amodal completion is not expected. We systematically varied whether visual search targets were occluded or not, both at preview and in search displays. Button-press responses were longest when the preview was unoccluded and the target was occluded in the search display. This pattern is consistent with a target-verification process that uses the features visible at preview but does not restore missing information in the search display. However, visual search guidance was weakest whenever the target was occluded in the search display, regardless of whether it was occluded at preview. This pattern suggests that information missing during the preview was restored and used to guide search, thereby resulting in a feature mismatch and poor guidance. If this process were preattentive, as with amodal completion, we should have found roughly equivalent search guidance across all conditions, because the target would always be either unoccluded or restored, resulting in no mismatch. We conclude that realistic objects are restored behind occluders during search target preview, even in situations not prone to amodal completion, and that this restoration does not occur preattentively during search.
Affiliation(s)
- Gregory J Zelinsky: Department of Psychology, Stony Brook University, Stony Brook, NY, USA; Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
16
Mishler AD, Neider MB. Redundancy gain for categorical targets depends on display configuration and duration. Vis Cogn 2018. DOI: 10.1080/13506285.2018.1470587.
Affiliation(s)
- Ada D. Mishler: Psychology Department, University of Central Florida, Orlando, Florida, USA
- Mark B. Neider: Psychology Department, University of Central Florida, Orlando, Florida, USA
17
A Model of the Superior Colliculus Predicts Fixation Locations during Scene Viewing and Visual Search. J Neurosci 2016;37:1453-1467. DOI: 10.1523/jneurosci.0825-16.2016.
Abstract
Modern computational models of attention predict fixations using saliency maps and target maps, which prioritize locations for fixation based on feature contrast and target goals, respectively. But whereas many such models are biologically plausible, none have looked to the oculomotor system for design constraints or parameter specification. Conversely, although most models of saccade programming are tightly coupled to underlying neurophysiology, none have been tested using real-world stimuli and tasks. We combined the strengths of these two approaches in MASC, a model of attention in the superior colliculus (SC) that captures known neurophysiological constraints on saccade programming. We show that MASC predicted the fixation locations of humans freely viewing naturalistic scenes and performing exemplar and categorical search tasks, a breadth achieved by no other existing model. Moreover, it did this as well or better than its more specialized state-of-the-art competitors. MASC's predictive success stems from its inclusion of high-level but core principles of SC organization: an over-representation of foveal information, size-invariant population codes, cascaded population averaging over distorted visual and motor maps, and competition between motor point images for saccade programming, all of which cause further modulation of priority (attention) after projection of saliency and target maps to the SC. Only by incorporating these organizing brain principles into our models can we fully understand the transformation of complex visual information into the saccade programs underlying movements of overt attention. With MASC, a theoretical footing now exists to generate and test computationally explicit predictions of behavioral and neural responses in visually complex real-world contexts.
Significance Statement: The superior colliculus (SC) performs a visual-to-motor transformation vital to overt attention, but existing SC models cannot predict saccades to visually complex real-world stimuli. We introduce a brain-inspired SC model that outperforms state-of-the-art image-based competitors in predicting the sequences of fixations made by humans performing a range of everyday tasks (scene viewing and exemplar and categorical search), making clear the value of looking to the brain for model design. This work is significant in that it will drive new research by making computationally explicit predictions of SC neural population activity in response to naturalistic stimuli and tasks. It will also serve as a blueprint for the construction of other brain-inspired models, helping to usher in the next generation of truly intelligent autonomous systems.
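Models of this kind project image-based priority onto a distorted collicular surface before averaging population activity. A common ingredient is an anisotropic log-polar mapping from visual eccentricity and direction to millimeters on the SC map. The sketch below uses that general form with constants typical of published monkey-SC mappings; they are illustrative assumptions here, not MASC's exact parameters.

```python
import numpy as np

def visual_to_collicular(ecc_deg, direction_deg, A=3.0, Bu=1.4, Bv=1.8):
    """Anisotropic log-polar mapping from visual-field coordinates
    (eccentricity in degrees, meridional direction in degrees) to
    approximate collicular coordinates (mm along the u and v axes)."""
    phi = np.deg2rad(direction_deg)
    R = ecc_deg
    u = Bu * np.log(np.sqrt(R**2 + 2 * A * R * np.cos(phi) + A**2) / A)
    v = Bv * np.arctan2(R * np.sin(phi), R * np.cos(phi) + A)
    return u, v

# Foveal magnification: equal visual steps near the fovea cover more collicular mm.
for ecc in (1, 2, 4, 8, 16, 32):
    u, v = visual_to_collicular(ecc, 0.0)
    print(f"{ecc:>2} deg -> u = {u:.2f} mm")
```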
18
Infrared Human Posture Recognition Method for Monitoring in Smart Homes Based on Hidden Markov Model. Sustainability 2016. DOI: 10.3390/su8090892.
19
Yu CP, Maxfield JT, Zelinsky GJ. Searching for Category-Consistent Features: A Computational Approach to Understanding Visual Category Representation. Psychol Sci 2016;27:870-884. PMID: 27142461. DOI: 10.1177/0956797616640237.
Abstract
This article introduces a generative model of category representation that uses computer vision methods to extract category-consistent features (CCFs) directly from images of category exemplars. The model was trained on 4,800 images of common objects, and CCFs were obtained for 68 categories spanning subordinate, basic, and superordinate levels in a category hierarchy. When participants searched for these same categories, targets cued at the subordinate level were preferentially fixated, but fixated targets were verified faster when they followed a basic-level cue. The subordinate-level advantage in guidance is explained by the number of target-category CCFs, a measure of category specificity that decreases with movement up the category hierarchy. The basic-level advantage in verification is explained by multiplying the number of CCFs by sibling distance, a measure of category distinctiveness. With this model, the visual representations of real-world object categories, each learned from the vast numbers of image exemplars accumulated throughout everyday experience, can finally be studied.
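Category-consistent features are, in essence, the features that appear both frequently and consistently across a category's exemplars. Below is a toy sketch of that selection using a bag-of-visual-words count matrix and a simple mean-to-variability ratio as the score; the feature matrix, threshold, and scoring rule are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def category_consistent_features(counts, snr_threshold=2.0):
    """counts: exemplars x features matrix of visual-word frequencies.
    Keep features with a high mean-to-standard-deviation ratio, i.e. features
    that occur often and with low variability across category exemplars."""
    mean = counts.mean(axis=0)
    std = counts.std(axis=0) + 1e-6
    snr = mean / std
    return np.where(snr >= snr_threshold)[0], snr

rng = np.random.default_rng(4)
n_exemplars, n_features = 100, 1000
counts = rng.poisson(0.5, size=(n_exemplars, n_features)).astype(float)
counts[:, :30] += rng.poisson(5.0, size=(n_exemplars, 30))   # consistently frequent features
ccf_idx, snr = category_consistent_features(counts)
print("number of CCFs:", len(ccf_idx))
```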
Affiliation(s)
- Gregory J Zelinsky: Department of Computer Science and Department of Psychology, Stony Brook University
20
Hout MC, Godwin HJ, Fitzsimmons G, Robbins A, Menneer T, Goldinger SD. Using multidimensional scaling to quantify similarity in visual search and beyond. Atten Percept Psychophys 2016;78:3-20. PMID: 26494381; PMCID: PMC5523409. DOI: 10.3758/s13414-015-1010-6.
Abstract
Visual search is one of the most widely studied topics in vision science, both as an independent topic of interest and as a tool for studying attention and visual cognition. A wide literature seeks to understand how people find things under varying conditions of difficulty and complexity, and in situations ranging from the mundane (e.g., looking for one's keys) to those with significant societal importance (e.g., baggage or medical screening). A primary determinant of the ease and probability of success during search is the set of similarity relationships in the search environment, such as the similarity between the background and the target, or the likeness of the non-targets to one another. A sense of similarity is often intuitive, but it is seldom quantified directly. This presents a problem: similarity relationships are imprecisely specified, limiting the researcher's capacity to adequately examine their influence. In this article, we present a novel approach to overcoming this problem that combines multidimensional scaling (MDS) analyses with behavioral and eye-tracking measurements. We propose a method whereby MDS can be repurposed to quantify the similarity of experimental stimuli, thereby opening up theoretical questions in visual search and attention that cannot currently be addressed. These quantifications, in conjunction with behavioral and oculomotor measures, allow for critical observations about how similarity affects performance, information selection, and information processing. We provide a demonstration and tutorial of the approach, identify documented examples of its use, discuss how complementary computer vision methods could also be adopted, and close with a discussion of potential avenues for future application of this technique.
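The core move in this approach is to collect pairwise dissimilarity judgments for the stimulus set and submit them to MDS, which returns coordinates whose inter-point distances approximate the judged dissimilarities. A minimal sketch using scikit-learn; the rating matrix and two-dimensional solution are illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical mean pairwise dissimilarity ratings (0 = identical) for five
# search stimuli; the matrix must be symmetric with a zero diagonal.
dissim = np.array([
    [0.0, 0.2, 0.8, 0.9, 0.7],
    [0.2, 0.0, 0.7, 0.8, 0.6],
    [0.8, 0.7, 0.0, 0.3, 0.4],
    [0.9, 0.8, 0.3, 0.0, 0.2],
    [0.7, 0.6, 0.4, 0.2, 0.0],
])

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)          # one 2-D point per stimulus
print(np.round(coords, 2))
print("stress:", round(mds.stress_, 3))     # lower stress = better fit
```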
21
Overt attention in natural scenes: Objects dominate features. Vision Res 2015;107:36-48. DOI: 10.1016/j.visres.2014.11.006.
22
Maxfield JT, Stalder WD, Zelinsky GJ. Effects of target typicality on categorical search. J Vis 2014;14:1. PMID: 25274990; PMCID: PMC4181372. DOI: 10.1167/14.12.1.
Abstract
The role of target typicality in a categorical visual search task was investigated by cueing observers with a target name, followed by a five-item target present/absent search array in which the target images were rated in a pretest to be high, medium, or low in typicality with respect to the basic-level target cue. Contrary to previous work, we found that search guidance was better for high-typicality targets compared to low-typicality targets, as measured by both the proportion of immediate target fixations and the time to fixate the target. Consistent with previous work, we also found an effect of typicality on target verification times, the time between target fixation and the search judgment; as target typicality decreased, verification times increased. To model these typicality effects, we trained Support Vector Machine (SVM) classifiers on the target categories, and tested these on the corresponding specific targets used in the search task. This analysis revealed significant differences in classifier confidence between the high-, medium-, and low-typicality groups, paralleling the behavioral results. Collectively, these findings suggest that target typicality broadly affects both search guidance and verification, and that differences in typicality can be predicted by distance from an SVM classification boundary.
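The modeling step described above treats an item's signed distance from an SVM category boundary as a graded measure of typicality: exemplars far on the category side of the boundary are better category members. A toy sketch of that idea with scikit-learn; the feature vectors and labels are simulated, whereas the original work trained classifiers on image features of real category exemplars.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(5)
# Simulated image-feature vectors for target-category (+1) and non-target (-1) exemplars.
X = np.vstack([rng.normal(1.0, 1.0, (100, 20)), rng.normal(-1.0, 1.0, (100, 20))])
y = np.r_[np.ones(100), -np.ones(100)]

clf = SVC(kernel="linear").fit(X, y)

# "Typicality" of three held-out exemplars = signed distance from the boundary.
probes = np.vstack([rng.normal(2.0, 1.0, 20),   # highly typical
                    rng.normal(0.8, 1.0, 20),   # moderately typical
                    rng.normal(0.1, 1.0, 20)])  # atypical
print(np.round(clf.decision_function(probes), 2))
```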
Affiliation(s)
- Gregory J. Zelinsky: Department of Psychology, Stony Brook University, Stony Brook, NY; Department of Computer Science, Stony Brook University, Stony Brook, NY
23
Schmidt J, MacNamara A, Proudfit GH, Zelinsky GJ. More target features in visual working memory leads to poorer search guidance: evidence from contralateral delay activity. J Vis 2014;14:8. PMID: 24599946. DOI: 10.1167/14.3.8.
Abstract
The visual-search literature has assumed that the top-down target representation used to guide search resides in visual working memory (VWM). We directly tested this assumption using contralateral delay activity (CDA) to estimate the VWM load imposed by the target representation. In Experiment 1, observers previewed four photorealistic objects and were cued to remember the two objects appearing to the left or right of central fixation; Experiment 2 was identical except that observers previewed two photorealistic objects and were cued to remember one. CDA was measured during a delay following preview offset but before onset of a four-object search array. One of the targets was always present, and observers were asked to make an eye movement to it and press a button. We found lower-magnitude CDA on trials when the initial search saccade was directed to the target (strong guidance) compared to when it was not (weak guidance). This difference also tended to be larger shortly before search-display onset and was largely unaffected by VWM item-capacity limits or the number of previews. Moreover, the difference between mean strong- and weak-guidance CDA was proportional to the increase in search time between mean strong- and weak-guidance trials (as measured by time-to-target and reaction-time difference scores). Contrary to most search models, our data suggest that trials on which more target features are maintained result in poorer search guidance to the target. We interpret these counterintuitive findings as evidence for strong search guidance using a small set of highly discriminative target features that remain after pruning from a larger set of features, with the load imposed on VWM varying with this feature-consolidation process.
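CDA is computed from the EEG as the difference between posterior electrodes contralateral versus ipsilateral to the remembered hemifield, averaged over the retention interval. A minimal sketch of that computation on hypothetical epoched data; the electrode pairing, sampling, and delay window are assumptions, not the study's recording parameters.

```python
import numpy as np

def cda_amplitude(left_chan, right_chan, cue_side, times, window=(0.4, 1.0)):
    """left_chan, right_chan: trials x samples ERP data from a left/right
    posterior electrode pair (e.g., PO7/PO8). cue_side: 'left' or 'right'
    per trial. Returns the mean contralateral-minus-ipsilateral amplitude
    within the retention window."""
    mask = (times >= window[0]) & (times <= window[1])
    contra = np.where(np.array(cue_side)[:, None] == "left", right_chan, left_chan)
    ipsi = np.where(np.array(cue_side)[:, None] == "left", left_chan, right_chan)
    return (contra[:, mask] - ipsi[:, mask]).mean()

rng = np.random.default_rng(6)
times = np.linspace(-0.2, 1.2, 350)
po7 = rng.normal(0, 1, (40, 350))            # stand-ins for epoched ERP data
po8 = rng.normal(0, 1, (40, 350))
sides = rng.choice(["left", "right"], size=40)
print(f"CDA = {cda_amplitude(po7, po8, sides, times):.3f} microvolts")
```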
Affiliation(s)
- Joseph Schmidt: Institute for Mind and Brain, Department of Psychology, University of South Carolina, Columbia, SC, USA
24
Cognitive control of fixation duration in visual search: The role of extrafoveal processing. Vis Cogn 2014. DOI: 10.1080/13506285.2014.881443.
25
Abstract
Human vision is an active process in which information is sampled during brief periods of stable fixation in between gaze shifts. Foveal analysis serves to identify the currently fixated object and has to be coordinated with a peripheral selection process of the next fixation location. Models of visual search and scene perception typically focus on the latter, without considering foveal processing requirements. We developed a dual-task noise classification technique that enables identification of the information uptake for foveal analysis and peripheral selection within a single fixation. Human observers had to use foveal vision to extract visual feature information (orientation) from different locations for a psychophysical comparison. The selection of to-be-fixated locations was guided by a different feature (luminance contrast). We inserted noise in both visual features and identified the uptake of information by looking at correlations between the noise at different points in time and behavior. Our data show that foveal analysis and peripheral selection proceeded completely in parallel. Peripheral processing stopped some time before the onset of an eye movement, but foveal analysis continued during this period. Variations in the difficulty of foveal processing did not influence the uptake of peripheral information and the efficacy of peripheral selection, suggesting that foveal analysis and peripheral selection operated independently. These results provide important theoretical constraints on how to model target selection in conjunction with foveal object identification: in parallel and independently.
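The noise-classification logic described above identifies when information was used by correlating, time point by time point, the externally added feature noise with the observer's eventual response. A toy sketch of that temporal reverse correlation; the noise fields, response rule, and time base are simulated assumptions, not the study's stimuli.

```python
import numpy as np

rng = np.random.default_rng(7)
n_trials, n_timepoints = 500, 60          # e.g., 60 frames within a fixation

# Orientation noise added to the foveal stimulus on each frame of each trial.
noise = rng.normal(0, 1, (n_trials, n_timepoints))

# Simulated observer: the response is driven only by noise early in the fixation.
drive = noise[:, :30].mean(axis=1)
response = (drive + rng.normal(0, 0.5, n_trials) > 0).astype(float)

# Reverse correlation: correlate noise at each time point with the response.
kernel = np.array([np.corrcoef(noise[:, t], response)[0, 1] for t in range(n_timepoints)])
print("mean |r| early:", np.abs(kernel[:30]).mean().round(3))
print("mean |r| late: ", np.abs(kernel[30:]).mean().round(3))
```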
26
Zelinsky GJ, Adeli H, Peng Y, Samaras D. Modelling eye movements in a categorical search task. Philos Trans R Soc Lond B Biol Sci 2013;368:20130058. PMID: 24018720. DOI: 10.1098/rstb.2013.0058.
Abstract
We introduce a model of eye movements during categorical search, the task of finding and recognizing categorically defined targets. It extends a previous model of eye movements during search (the target acquisition model, TAM) by using distances from a support vector machine classification boundary to create probability maps indicating pixel-by-pixel evidence for the target category in search images. Other additions include functionality enabling target-absent searches, and a fixation-based blurring of the search images now based on a mapping between visual and collicular space. We tested this model on images from a previously conducted variable set-size (6/13/20) present/absent search experiment in which participants searched for categorically defined teddy bear targets among random-category distractors. The model not only captured target-present/absent set-size effects, but also accurately predicted, for all conditions, the numbers of fixations made prior to search judgements. It also predicted the percentages of first eye movements during search landing on targets, a conservative measure of search guidance. Effects of set size on false negative and false positive errors were also captured, but error rates in general were overestimated. We conclude that visual features discriminating a target category from non-targets can be learned and used to guide eye movements during categorical search.
Affiliation(s)
- Gregory J Zelinsky: Department of Psychology, Stony Brook University, Stony Brook, NY 11794-2500, USA