1
Jiang C, Chen Z, Wolfe JM. Toward viewing behavior for aerial scene categorization. Cogn Res Princ Implic 2024; 9:17. [PMID: 38530617 PMCID: PMC10965882 DOI: 10.1186/s41235-024-00541-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 03/07/2024] [Indexed: 03/28/2024] Open
Abstract
Previous work has demonstrated similarities and differences between aerial and terrestrial image viewing. Aerial scene categorization, a pivotal visual processing task for gathering geoinformation, heavily depends on rotation-invariant information. Aerial image-centered research has revealed effects of low-level features on performance of various aerial image interpretation tasks. However, there are fewer studies of viewing behavior for aerial scene categorization and of higher-level factors that might influence that categorization. In this paper, experienced subjects' eye movements were recorded while they were asked to categorize aerial scenes. A typical viewing center bias was observed. Eye movement patterns varied among categories. We explored the relationship of nine image statistics to observers' eye movements. Results showed that if the images were less homogeneous, and/or if they contained fewer or no salient diagnostic objects, viewing behavior became more exploratory. Higher- and object-level image statistics were predictive at both the image and scene category levels. Scanpaths were generally organized and small differences in scanpath randomness could be roughly captured by critical object saliency. Participants tended to fixate on critical objects. Image statistics included in this study showed rotational invariance. The results supported our hypothesis that the availability of diagnostic objects strongly influences eye movements in this task. In addition, this study provides supporting evidence for Loschky et al.'s (Journal of Vision, 15(6), 11, 2015) speculation that aerial scenes are categorized on the basis of image parts and individual objects. The findings were discussed in relation to theories of scene perception and their implications for automation development.
Affiliation(s)
- Chenxi Jiang
- School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China
- Zhenzhong Chen
- School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei, China.
- Hubei Luojia Laboratory, Wuhan, Hubei, China.
- Jeremy M Wolfe
- Harvard Medical School, Boston, MA, USA
- Brigham & Women's Hospital, Boston, MA, USA
2
Wiesmann SL, Võ MLH. Disentangling diagnostic object properties for human scene categorization. Sci Rep 2023; 13:5912. [PMID: 37041222 PMCID: PMC10090043 DOI: 10.1038/s41598-023-32385-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 03/27/2023] [Indexed: 04/13/2023] Open
Abstract
It usually only takes a single glance to categorize our environment into different scene categories (e.g. a kitchen or a highway). Object information has been suggested to play a crucial role in this process, and some proposals even claim that the recognition of a single object can be sufficient to categorize the scene around it. Here, we tested this claim in four behavioural experiments by having participants categorize real-world scene photographs that were reduced to a single, cut-out object. We show that single objects can indeed be sufficient for correct scene categorization and that scene category information can be extracted within 50 ms of object presentation. Furthermore, we identified object frequency and specificity for the target scene category as the most important object properties for human scene categorization. Interestingly, despite the statistical definition of specificity and frequency, human ratings of these properties were better predictors of scene categorization behaviour than more objective statistics derived from databases of labelled real-world images. Taken together, our findings support a central role of object information during human scene categorization, showing that single objects can be indicative of a scene category if they are assumed to frequently and exclusively occur in a certain environment.
Affiliation(s)
- Sandro L Wiesmann
- Department of Psychology, Johann Wolfgang Goethe-Universität, Theodor-W.-Adorno-Platz 6, 60323, Frankfurt Am Main, Germany.
- Melissa L-H Võ
- Department of Psychology, Johann Wolfgang Goethe-Universität, Theodor-W.-Adorno-Platz 6, 60323, Frankfurt Am Main, Germany
3
Fan CL, Sokolowski HM, Rosenbaum RS, Levine B. What about "space" is important for episodic memory? Wiley Interdiscip Rev Cogn Sci 2023; 14:e1645. [PMID: 36772875 DOI: 10.1002/wcs.1645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 01/16/2023] [Accepted: 01/18/2023] [Indexed: 02/12/2023]
Abstract
Early cognitive neuroscientific research revealed that the hippocampus is crucial for spatial navigation in rodents, and for autobiographical episodic memory in humans. Researchers quickly linked these streams to propose that the human hippocampus supports memory through its role in representing space, and research on the link between spatial cognition and episodic memory in humans has proliferated over the past several decades. Different researchers apply the term "spatial" in a variety of contexts, however, and it remains unclear what aspect of space may be critical to memory. Similarly, "episodic" has been defined and tested in different ways. Naturalistic assessment of spatial memory and episodic memory (i.e., episodic autobiographical memory) is required to unify the scale and biological relevance in comparisons of spatial and mnemonic processing. Limitations regarding the translation of rodent to human research, human ontogeny, and inter-individual variability require greater consideration in the interpretation of this literature. In this review, we outline the aspects of space that are (and are not) commonly linked to episodic memory, and then we discuss these dimensions through the lens of individual differences in naturalistic autobiographical memory. Future studies should carefully consider which aspect(s) of space are being linked to memory within the context of naturalistic human cognition. This article is categorized under: Psychology > Memory.
Affiliation(s)
- Carina L Fan
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Rotman Research Institute, Baycrest, Toronto, Ontario, Canada
- R Shayna Rosenbaum
- Rotman Research Institute, Baycrest, Toronto, Ontario, Canada; Department of Psychology, York University, Toronto, Ontario, Canada
- Brian Levine
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Rotman Research Institute, Baycrest, Toronto, Ontario, Canada; Department of Medicine, Neurology, University of Toronto, Toronto, Ontario, Canada
4
Quilty-Dunn J, Porot N, Mandelbaum E. The best game in town: The reemergence of the language-of-thought hypothesis across the cognitive sciences. Behav Brain Sci 2022; 46:e261. [PMID: 36471543 DOI: 10.1017/s0140525x22002849] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 12/12/2022]
Abstract
Mental representations remain the central posits of psychology after many decades of scrutiny. However, there is no consensus about the representational format(s) of biological cognition. This paper provides a survey of evidence from computational cognitive psychology, perceptual psychology, developmental psychology, comparative psychology, and social psychology, and concludes that one type of format that routinely crops up is the language-of-thought (LoT). We outline six core properties of LoTs: (i) discrete constituents; (ii) role-filler independence; (iii) predicate-argument structure; (iv) logical operators; (v) inferential promiscuity; and (vi) abstract content. These properties cluster together throughout cognitive science. Bayesian computational modeling, compositional features of object perception, complex infant and animal reasoning, and automatic, intuitive cognition in adults all implicate LoT-like structures. Instead of regarding LoT as a relic of the previous century, researchers in cognitive science and philosophy-of-mind must take seriously the explanatory breadth of LoT-based architectures. We grant that the mind may harbor many formats and architectures, including iconic and associative structures as well as deep-neural-network-like architectures. However, as computational/representational approaches to the mind continue to advance, classical compositional symbolic structures - that is, LoTs - only prove more flexible and well-supported over time.
Affiliation(s)
- Jake Quilty-Dunn
- Department of Philosophy and Philosophy-Neuroscience-Psychology Program, Washington University in St. Louis, St. Louis, MO, USA. sites.google.com/site/jakequiltydunn/
- Nicolas Porot
- Africa Institute for Research in Economics and Social Sciences, Mohammed VI Polytechnic University, Rabat, Morocco. nicolasporot.com
- Eric Mandelbaum
- Departments of Philosophy and Psychology, The Graduate Center & Baruch College, CUNY, New York, NY, USA. ericmandelbaum.com
5
Helbing J, Draschkow D, Võ MLH. Auxiliary Scene-Context Information Provided by Anchor Objects Guides Attention and Locomotion in Natural Search Behavior. Psychol Sci 2022; 33:1463-1476. [PMID: 35942922 DOI: 10.1177/09567976221091838] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022] Open
Abstract
Successful adaptive behavior requires efficient attentional and locomotive systems. Previous research has thoroughly investigated how we achieve this efficiency during natural behavior by exploiting prior knowledge related to targets of our actions (e.g., attending to metallic targets when looking for a pot) and to the environmental context (e.g., looking for the pot in the kitchen). Less is known about whether and how individual nontarget components of the environment support natural behavior. In our immersive virtual reality task, 24 adult participants searched for objects in naturalistic scenes in which we manipulated the presence and arrangement of large, static objects that anchor predictions about targets (e.g., the sink provides a prediction for the location of the soap). Our results show that gaze and body movements in this naturalistic setting are strongly guided by these anchors. These findings demonstrate that objects auxiliary to the target are incorporated into the representations guiding attention and locomotion.
Affiliation(s)
- Jason Helbing
- Scene Grammar Lab, Department of Psychology, Goethe University Frankfurt
- Dejan Draschkow
- Brain and Cognition Laboratory, Department of Experimental Psychology, University of Oxford; Oxford Centre for Human Brain Activity, Wellcome Centre for Integrative Neuroimaging, Department of Psychiatry, University of Oxford
- Melissa L-H Võ
- Scene Grammar Lab, Department of Psychology, Goethe University Frankfurt
6
Helo A, Guerra E, Coloma CJ, Aravena-Bravo P, Rämä P. Do Children With Developmental Language Disorder Activate Scene Knowledge to Guide Visual Attention? Effect of Object-Scene Inconsistencies on Gaze Allocation. Front Psychol 2022; 12:796459. [PMID: 35069387 PMCID: PMC8776641 DOI: 10.3389/fpsyg.2021.796459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2021] [Accepted: 12/09/2021] [Indexed: 12/03/2022] Open
Abstract
Our visual environment is highly predictable in terms of where and in which locations objects can be found. Based on visual experience, children extract rules about visual scene configurations, allowing them to generate scene knowledge. Similarly, children extract the linguistic rules from relatively predictable linguistic contexts. It has been proposed that the capacity of extracting rules from both domains might share some underlying cognitive mechanisms. In the present study, we investigated the link between language and scene knowledge development. To do so, we assessed whether preschool children (age range = 5;4–6;6) with Developmental Language Disorder (DLD), who present several difficulties in the linguistic domain, are equally attracted to object-scene inconsistencies in a visual free-viewing task in comparison with age-matched children with Typical Language Development (TLD). All children explored visual scenes containing semantic (e.g., soap on a breakfast table), syntactic (e.g., bread on the chair back), or both inconsistencies (e.g., soap on the chair back). Since scene knowledge interacts with image properties (i.e., saliency) to guide gaze allocation during visual exploration from the early stages of development, we also included the objects’ saliency rank in the analysis. The results showed that children with DLD were less attracted to semantic and syntactic inconsistencies than children with TLD. In addition, saliency modulated syntactic effect only in the group of children with TLD. Our findings indicate that children with DLD do not activate scene knowledge to guide visual attention as efficiently as children with TLD, especially at the syntactic level, suggesting a link between scene knowledge and language development.
Affiliation(s)
- Andrea Helo
- Departamento de Fonoaudiología, Facultad de Medicina, Universidad de Chile, Santiago, Chile; Departamento de Neurociencias, Facultad de Medicina, Universidad de Chile, Santiago, Chile; Centro de Investigación Avanzada en Educación, Instituto de Educación-IE, Universidad de Chile, Santiago, Chile
- Ernesto Guerra
- Centro de Investigación Avanzada en Educación, Instituto de Educación-IE, Universidad de Chile, Santiago, Chile
- Carmen Julia Coloma
- Departamento de Fonoaudiología, Facultad de Medicina, Universidad de Chile, Santiago, Chile; Centro de Investigación Avanzada en Educación, Instituto de Educación-IE, Universidad de Chile, Santiago, Chile
- Paulina Aravena-Bravo
- Departamento de Fonoaudiología, Facultad de Medicina, Universidad de Chile, Santiago, Chile; Escuela de Psicología, Pontificia Universidad Católica de Chile, Santiago, Chile
- Pia Rämä
- Integrative Neuroscience and Cognition Center (UMR 8002), CNRS, Université Paris Descartes, Paris, France
7
Anderson BA, Kim H, Kim AJ, Liao MR, Mrkonja L, Clement A, Grégoire L. The past, present, and future of selection history. Neurosci Biobehav Rev 2021; 130:326-350. [PMID: 34499927 PMCID: PMC8511179 DOI: 10.1016/j.neubiorev.2021.09.004] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 04/16/2021] [Revised: 07/08/2021] [Accepted: 09/02/2021] [Indexed: 01/22/2023]
Abstract
The last ten years of attention research have witnessed a revolution, replacing a theoretical dichotomy (top-down vs. bottom-up control) with a trichotomy (biased by current goals, physical salience, and selection history). This third new mechanism of attentional control, selection history, is multifaceted. Some aspects of selection history must be learned over time whereas others reflect much more transient influences. A variety of different learning experiences can shape the attention system, including reward, aversive outcomes, past experience searching for a target, target‒non-target relations, and more. In this review, we provide an overview of the historical forces that led to the proposal of selection history as a distinct mechanism of attentional control. We then propose a formal definition of selection history, with concrete criteria, and identify different components of experience-driven attention that fit within this definition. The bulk of the review is devoted to exploring how these different components relate to one another. We conclude by proposing an integrative account of selection history centered on underlying themes that emerge from our review.
Affiliation(s)
- Brian A Anderson
- Texas A&M University, College Station, TX, 77843, United States.
- Haena Kim
- Texas A&M University, College Station, TX, 77843, United States
- Andy J Kim
- Texas A&M University, College Station, TX, 77843, United States
- Ming-Ray Liao
- Texas A&M University, College Station, TX, 77843, United States
- Lana Mrkonja
- Texas A&M University, College Station, TX, 77843, United States
- Andrew Clement
- Texas A&M University, College Station, TX, 77843, United States
8
David EJ, Beitner J, Võ MLH. The importance of peripheral vision when searching 3D real-world scenes: A gaze-contingent study in virtual reality. J Vis 2021; 21:3. [PMID: 34251433 PMCID: PMC8287039 DOI: 10.1167/jov.21.7.3] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022] Open
Abstract
Visual search in natural scenes is a complex task relying on peripheral vision to detect potential targets and central vision to verify them. The segregation of the visual fields has been particularly established by on-screen experiments. We conducted a gaze-contingent experiment in virtual reality in order to test how the perceived roles of central and peripheral vision translated to more natural settings. The use of everyday scenes in virtual reality allowed us to study visual attention by implementing a fairly ecological protocol that cannot be implemented in the real world. Central or peripheral vision was masked during visual search, with target objects selected according to scene semantic rules. Analyzing the resulting search behavior, we found that target objects that were not spatially constrained to a probable location within the scene impacted search measures negatively. Our results diverge from on-screen studies in that search performance was only slightly affected by central vision loss. In particular, a central mask did not impact verification times when the target was grammatically constrained to an anchor object. Our findings demonstrate that the role of central vision (up to 6 degrees of eccentricity) in identifying objects in natural scenes seems to be minor, while the role of peripheral preprocessing of targets in immersive real-world searches may have been underestimated by on-screen experiments.
Affiliation(s)
- Erwan Joël David
- Department of Psychology, Goethe-Universität, Frankfurt, Germany.
- Julia Beitner
- Department of Psychology, Goethe-Universität, Frankfurt, Germany.
9
Võ MLH. The meaning and structure of scenes. Vision Res 2021; 181:10-20. [PMID: 33429218 DOI: 10.1016/j.visres.2020.11.003] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 08/23/2019] [Revised: 10/31/2020] [Accepted: 11/03/2020] [Indexed: 01/09/2023]
Abstract
We live in a rich, three-dimensional world with complex arrangements of meaningful objects. For decades, however, theories of visual attention and perception have been based on findings generated from lines and color patches. While these theories have been indispensable for our field, the time has come to move on from this rather impoverished view of the world and (at least try to) get closer to the real thing. After all, our visual environment consists of objects that we not only look at, but constantly interact with. Having incorporated the meaning and structure of scenes, i.e. their "grammar", then allows us to easily understand objects and scenes we have never encountered before. Studying this grammar provides us with the fascinating opportunity to gain new insights into the complex workings of attention, perception, and cognition. In this review, I will discuss how the meaning and the complex, yet predictive structure of real-world scenes influence attention allocation, search, and object identification.
Affiliation(s)
- Melissa Le-Hoa Võ
- Department of Psychology, Johann Wolfgang-Goethe-Universität, Frankfurt, Germany. https://www.scenegrammarlab.com/