1. Al-Hindawi A, Vizcaychipi M, Demiris Y. A prospective multi-center study quantifying visual inattention in delirium using generative models of the visual processing stream. Sci Rep 2024; 14:15698. [PMID: 38977712] [PMCID: PMC11231180] [DOI: 10.1038/s41598-024-66368-4]
Abstract
The visual attentional deficits in delirium are poorly characterized. Studies have highlighted neuro-anatomical abnormalities in the visual processing stream but fail to quantify these abnormalities at a functional level. To identify these deficits, we undertook a multi-center eye-tracking study in which we recorded 210 sessions from 42 patients using a novel eye-tracking system built specifically for free viewing in the intensive care unit (ICU); each session lasted 10 min and was labeled with the delirium status of the patient using the Confusion Assessment Method for the ICU (CAM-ICU). To analyze these data, we formulate the task of visual attention as a hierarchical generative process that yields a probabilistic distribution of the location of the next fixation. This distribution can then be compared with the measured patient fixation, producing a correctness score that is tallied and compared across delirium statuses. This analysis demonstrated that the visual processing system of patients suffering from delirium is functionally restricted to a statistically significant degree. This is the first study to explore the potential mechanisms underpinning visual inattention in delirium and suggests a new target for future research into a disease process that affects one in four hospitalized patients, with severe short- and long-term consequences.
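The model-to-measurement comparison described above suggests a simple scoring recipe. Below is a minimal, hypothetical sketch (not the authors' code) of how a measured fixation can be scored against a model-predicted probability map of the next fixation location; the map shape, normalisation, and log-probability score are all assumptions for illustration.

```python
# Hypothetical sketch: score a measured fixation against a predicted
# probability map of the next fixation location (not the authors' code).
import numpy as np

def fixation_score(pred_map: np.ndarray, x: int, y: int) -> float:
    """Log-probability of a measured fixation (x, y) under a predicted
    distribution over pixels; pred_map is assumed to sum to 1."""
    eps = 1e-12  # guard against log(0)
    return float(np.log(pred_map[y, x] + eps))

rng = np.random.default_rng(0)
pred = rng.random((48, 64))
pred /= pred.sum()                    # normalise into a probability map
print(fixation_score(pred, 20, 10))   # higher (less negative) = better fit
```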
Affiliation(s)
- Ahmed Al-Hindawi
- Personal Robotics Laboratory, Department of Electrical and Electronic Engineering, Imperial College London, London, SW7 2AZ, UK
- Department of Anaesthesia, Pain Medicine and Intensive Care, Chelsea and Westminster Hospital NHS Foundation Trust, London, SW10 9NH, UK
- Marcela Vizcaychipi
- Department of Anaesthesia, Pain Medicine and Intensive Care, Chelsea and Westminster Hospital NHS Foundation Trust, London, SW10 9NH, UK
- Yiannis Demiris
- Personal Robotics Laboratory, Department of Electrical and Electronic Engineering, Imperial College London, London, SW7 2AZ, UK
2. Newport RA, Liu S, Di Ieva A. Analyzing Eye Paths Using Fractals. Adv Neurobiol 2024; 36:827-848. [PMID: 38468066] [DOI: 10.1007/978-3-031-47606-8_42]
Abstract
Visual patterns reflect the anatomical and cognitive processes governing how we perceive information, influenced by stimulus characteristics and our own visual perception. These patterns are both spatially complex and display the self-similarity seen in fractal geometry at different scales, making them challenging to measure using the traditional topological dimensions of Euclidean geometry. However, methods for measuring eye gaze patterns using fractals have shown success in quantifying geometric complexity, matchability, and implementation into machine learning methods. This success is due to the inherent capabilities that fractals possess when reducing dimensionality using Hilbert curves, measuring temporal complexity using the Higuchi fractal dimension (HFD), and determining geometric complexity using the Minkowski-Bouligand dimension. Understanding the many applications of fractals when measuring and analyzing eye gaze patterns can extend the current growing body of knowledge by identifying markers tied to neurological pathology. Additionally, in future work, fractals can facilitate defining imaging modalities in eye tracking diagnostics by exploiting their capability to acquire multiscale information, including complementary functions, structures, and dynamics.
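As a concrete illustration of one tool named above, here is a hedged sketch of the Higuchi fractal dimension (HFD) applied to a one-dimensional gaze signal; the curve-length normalisation follows Higuchi's standard formulation, and the test signal is an assumption for illustration.

```python
# Sketch of the Higuchi fractal dimension (HFD) for a 1-D signal,
# e.g. horizontal gaze position over time.
import numpy as np

def higuchi_fd(signal: np.ndarray, k_max: int = 8) -> float:
    n = len(signal)
    mean_lengths = []
    for k in range(1, k_max + 1):
        lk = []
        for m in range(k):
            idx = np.arange(m, n, k)          # subsample with lag k
            if len(idx) < 2:
                continue
            dist = np.abs(np.diff(signal[idx])).sum()
            # Higuchi's curve-length normalisation
            lk.append(dist * (n - 1) / ((len(idx) - 1) * k * k))
        mean_lengths.append(np.mean(lk))
    k = np.arange(1, k_max + 1)
    # slope of log L(k) versus log(1/k) estimates the fractal dimension
    slope, _ = np.polyfit(np.log(1.0 / k), np.log(mean_lengths), 1)
    return float(slope)

rng = np.random.default_rng(1)
brownian = np.cumsum(rng.standard_normal(2000))
print(higuchi_fd(brownian))  # approx. 1.5 for Brownian motion
```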
Affiliation(s)
- Robert Ahadizad Newport
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Faculty of Medicine, Human and Health Sciences, Macquarie University, Sydney, NSW, Australia
- Sidong Liu
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Faculty of Medicine, Human and Health Sciences, Macquarie University, Sydney, NSW, Australia
- Antonio Di Ieva
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Faculty of Medicine, Human and Health Sciences, Macquarie University, Sydney, NSW, Australia
3. Malpica S, Martin D, Serrano A, Gutierrez D, Masia B. Task-Dependent Visual Behavior in Immersive Environments: A Comparative Study of Free Exploration, Memory and Visual Search. IEEE Trans Vis Comput Graph 2023; 29:4417-4425. [PMID: 37788210] [DOI: 10.1109/tvcg.2023.3320259]
Abstract
Visual behavior depends on both bottom-up mechanisms, where gaze is driven by the visual conspicuity of the stimuli, and top-down mechanisms, guiding attention towards relevant areas based on the task or goal of the viewer. While this is well-known, visual attention models often focus on bottom-up mechanisms. Existing works have analyzed the effect of high-level cognitive tasks like memory or visual search on visual behavior; however, they have often done so with different stimuli, methodology, metrics and participants, which makes drawing conclusions and comparisons between tasks particularly difficult. In this work we present a systematic study of how different cognitive tasks affect visual behavior in a novel within-subjects design scheme. Participants performed free exploration, memory and visual search tasks in three different scenes while their eye and head movements were being recorded. We found significant, consistent differences between tasks in the distributions of fixations, saccades and head movements. Our findings can provide insights for practitioners and content creators designing task-oriented immersive applications.
4. Kümmerer M, Bethge M. Predicting Visual Fixations. Annu Rev Vis Sci 2023; 9:269-291. [PMID: 37419107] [DOI: 10.1146/annurev-vision-120822-072528]
Abstract
As we navigate and behave in the world, we are constantly deciding, a few times per second, where to look next. The outcomes of these decisions in response to visual input are comparatively easy to measure as trajectories of eye movements, offering insight into many unconscious and conscious visual and cognitive processes. In this article, we review recent advances in predicting where we look. We focus on evaluating and comparing models: How can we consistently measure how well models predict eye movements, and how can we judge the contribution of different mechanisms? Probabilistic models facilitate a unified approach to fixation prediction that allows us to measure how much of the explainable information a model accounts for, and thus to compare different models across different settings, such as static and video saliency, as well as scanpath prediction. We review how the large variety of saliency maps and scanpath models can be translated into this unifying framework, how much different factors contribute, and how we can select the most informative examples for model comparison. We conclude that the universal scale of information gain offers a powerful tool for the inspection of candidate mechanisms and experimental design that helps us understand the continual decision-making process that determines where we look.
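The information-gain comparison the review advocates can be made concrete with a small sketch: the average log-likelihood advantage, in bits per fixation, of a probabilistic fixation model over a baseline such as a uniform map. The maps and fixations below are simulated placeholders, not data from the paper.

```python
# Sketch of information gain (bits/fixation) of a fixation model over a
# baseline; maps are probability distributions over pixels summing to 1.
import numpy as np

def information_gain(model_map, baseline_map, fixations):
    """fixations: iterable of (x, y) pixel coordinates."""
    eps = 1e-12
    gains = [np.log2(model_map[y, x] + eps) - np.log2(baseline_map[y, x] + eps)
             for x, y in fixations]
    return float(np.mean(gains))

rng = np.random.default_rng(2)
h, w = 48, 64
model = rng.random((h, w)); model /= model.sum()
uniform = np.full((h, w), 1.0 / (h * w))            # uninformative baseline
fixations = [(int(rng.integers(w)), int(rng.integers(h))) for _ in range(100)]
print(information_gain(model, uniform, fixations))  # >0: model beats baseline
```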
Affiliation(s)
- Matthias Bethge
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
5. Vásquez-Amézquita M, Leongómez JD, Salvador A, Seto MC. What can the eyes tell us about atypical sexual preferences as a function of sex and age? Linking eye movements with child-related chronophilias. Forensic Sci Res 2023; 8:5-15. [PMID: 37712065] [PMCID: PMC10498142] [DOI: 10.1093/fsr/owad009]
Abstract
Visual attention plays a central role in current theories of sexual information processing and is key to informing the use of eye-tracking techniques in the study of typical sexual preferences and more recently, in the study of atypical preferences such as pedophilia (prepubescent children) and hebephilia (pubescent children). The aim of this theoretical-empirical review is to connect the concepts of a visual attention-based model of sexual arousal processing with eye movements as indicators of atypical sexual interests, to substantiate the use of eye-tracking as a useful indirect measure of sexual preferences according to sex and age of the stimuli. Implications for research are discussed in terms of recognizing the value, scope and limitations of eye-tracking in the study of pedophilia and other chronophilias in males and females, and the generation of new hypotheses using this type of indirect measure of human sexual response.
Affiliation(s)
- Milena Vásquez-Amézquita
- Faculty of Psychology, Universidad El Bosque, Bogotá, Colombia
- Department of Psychobiology, Laboratory of Social Cognitive Neuroscience, IDOCAL, University of Valencia, Valencia, Spain
- Department of Social Sciences, Faculty of Human and Social Sciences, Universidad de la Costa, Barranquilla, Colombia
- Alicia Salvador
- Department of Psychobiology, Laboratory of Social Cognitive Neuroscience, IDOCAL, University of Valencia, Valencia, Spain
- Michael C Seto
- Forensic Research Unit, Royal Ottawa HealthCare Group, Ottawa, ON, Canada
6. D’Amelio A, Patania S, Bursic S, Cuculo V, Boccignone G. Using Gaze for Behavioural Biometrics. Sensors (Basel) 2023; 23:1262. [PMID: 36772302] [PMCID: PMC9920149] [DOI: 10.3390/s23031262]
Abstract
A principled approach to the analysis of eye movements for behavioural biometrics is laid down. The approach is grounded in foraging theory, which provides a sound basis for capturing the uniqueness of individual eye movement behaviour. We propose a composite Ornstein-Uhlenbeck process for quantifying the exploration/exploitation signature characterising foraging eye behaviour. The relevant parameters of the composite model, inferred from eye-tracking data via Bayesian analysis, are shown to yield a suitable feature set for biometric identification; the latter is eventually accomplished via a classical classification technique. A proof of concept of the method is provided by measuring its identification performance on a publicly available dataset. Data and code for reproducing the analyses are made available. Overall, we argue that the approach offers a fresh view on both the analysis of eye-tracking data and prospective applications in this field.
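To make the modelling idea tangible, here is a hedged Euler-Maruyama sketch of a single Ornstein-Uhlenbeck gaze component (the paper's composite model is richer): gaze is pulled toward a patch centre mu at rate theta while sigma-scaled noise produces exploratory jitter. All parameter values are illustrative assumptions.

```python
# Euler-Maruyama simulation of a 2-D Ornstein-Uhlenbeck gaze component:
# dx = theta * (mu - x) dt + sigma dW
import numpy as np

def simulate_ou(mu, theta, sigma, x0, dt=0.001, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.empty((steps, 2))
    x[0] = x0
    for t in range(1, steps):
        drift = theta * (mu - x[t - 1])
        x[t] = x[t - 1] + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(2)
    return x

trace = simulate_ou(mu=np.array([0.5, 0.5]), theta=8.0, sigma=0.3,
                    x0=np.array([0.1, 0.9]))
print(trace[-1])  # gaze settles near the exploited patch centre
```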
Affiliation(s)
- Alessandro D’Amelio
- PHuSe Lab, Department of Computer Science, University of Milano Statale, Via Celoria 18, 20133 Milan, Italy
- Sabrina Patania
- PHuSe Lab, Department of Computer Science, University of Milano Statale, Via Celoria 18, 20133 Milan, Italy
- Sathya Bursic
- PHuSe Lab, Department of Computer Science, University of Milano Statale, Via Celoria 18, 20133 Milan, Italy
- Department of Psychology, University of Milano-Bicocca, Piazza dell’Ateneo Nuovo 1, 20126 Milan, Italy
- Vittorio Cuculo
- PHuSe Lab, Department of Computer Science, University of Milano Statale, Via Celoria 18, 20133 Milan, Italy
- Giuseppe Boccignone
- PHuSe Lab, Department of Computer Science, University of Milano Statale, Via Celoria 18, 20133 Milan, Italy
7. Kucharský Š, Zaharieva M, Raijmakers M, Visser I. Habituation, part II. Rethinking the habituation paradigm. Infant Child Dev 2022. [DOI: 10.1002/icd.2383]
Affiliation(s)
- Šimon Kucharský
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Department of Psychological Methods, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Martina Zaharieva
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Research Institute of Child Development and Education, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Maartje Raijmakers
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Department of Educational Studies and Learn!, Faculty of Behavioral and Movement Sciences, Free University Amsterdam, Amsterdam, the Netherlands
- Ingmar Visser
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Amsterdam Brain & Cognition (ABC), University of Amsterdam, Amsterdam, the Netherlands
8. Yang L, Xu M, Guo Y, Deng X, Gao F, Guan Z. Hierarchical Bayesian LSTM for Head Trajectory Prediction on Omnidirectional Images. IEEE Trans Pattern Anal Mach Intell 2022; 44:7563-7580. [PMID: 34596534] [DOI: 10.1109/tpami.2021.3117019]
Abstract
When viewing omnidirectional images (ODIs), viewers can access different viewports via head movement (HM), which sequentially forms head trajectories in the spatio-temporal domain. Thus, head trajectories play a key role in modeling human attention on ODIs. In this paper, we establish a large-scale dataset collecting 21,600 head trajectories on 1,080 ODIs. By mining our dataset, we find two important factors influencing head trajectories, i.e., temporal dependency and subject-specific variance. Accordingly, we propose a novel approach integrating hierarchical Bayesian inference into a long short-term memory (LSTM) network for head trajectory prediction on ODIs, called HiBayes-LSTM. In HiBayes-LSTM, we develop a mechanism of Future Intention Estimation (FIE), which captures the temporal correlations from previous, current and estimated future information, for predicting viewport transitions. Additionally, a training scheme called hierarchical Bayesian inference (HBI) is developed for modeling inter-subject uncertainty in HiBayes-LSTM. For HBI, we introduce a joint Gaussian distribution in a hierarchy to approximate the posterior distribution over network weights. By sampling subject-specific weights from the approximated posterior distribution, our HiBayes-LSTM approach can yield diverse viewport transitions among different subjects and obtain multiple head trajectories. Extensive experiments validate that our HiBayes-LSTM approach significantly outperforms 9 state-of-the-art approaches for trajectory prediction on ODIs, and it is then successfully applied to predict saliency on ODIs.
9. Newport RA, Russo C, Liu S, Suman AA, Di Ieva A. SoftMatch: Comparing Scanpaths Using Combinatorial Spatio-Temporal Sequences with Fractal Curves. Sensors (Basel) 2022; 22:7438. [PMID: 36236535] [PMCID: PMC9570610] [DOI: 10.3390/s22197438]
Abstract
Recent studies matching eye gaze patterns with those of others rely heavily on string editing methods borrowed from early work in bioinformatics. Previous studies have shown string editing methods to be susceptible to false negative results when matching mutated genes or unordered regions of interest in scanpaths. Even as new methods have emerged for matching amino acids using novel combinatorial techniques, scanpath matching is still limited by a traditional collinear approach. This approach reduces the ability to discriminate between free viewing scanpaths of two people looking at the same stimulus due to the heavy weight placed on linearity. To overcome this limitation, we here introduce a new method called SoftMatch to compare pairs of scanpaths. SoftMatch diverges from traditional scanpath matching in two ways: firstly, by preserving locality using fractal curves to reduce dimensionality from 2D Cartesian (x,y) coordinates into 1D (h) Hilbert distances, and secondly, by taking a combinatorial approach to fixation matching using discrete Fréchet distance measurements between segments of scanpath fixation sequences. SoftMatch is a loose acronym for this matching of "sequences of fixations over time". Results indicate high degrees of statistical and substantive significance when scoring matches between scanpaths made during free-form viewing of unfamiliar stimuli. Applications of this method can be used to better understand bottom-up perceptual processes, extending to scanpath outlier detection, expertise analysis, pathological screening, and salience prediction.
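The two ingredients named above, Hilbert-curve dimensionality reduction and the discrete Fréchet distance, can both be sketched compactly. The following is a hedged illustration, not the published SoftMatch implementation; the grid size and toy scanpaths are assumptions.

```python
# Sketch of SoftMatch's two building blocks: (1) map 2-D fixations to 1-D
# Hilbert distances (locality-preserving), (2) compare fixation-sequence
# segments with the discrete Fréchet distance.
import numpy as np

def xy_to_hilbert(n: int, x: int, y: int) -> int:
    """Hilbert distance of (x, y) on an n x n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                    # rotate/flip the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def discrete_frechet(p, q):
    """Discrete Fréchet distance between two 1-D sequences."""
    n, m = len(p), len(q)
    ca = np.full((n, m), -1.0)
    def c(i, j):
        if ca[i, j] >= 0:
            return ca[i, j]
        d = abs(p[i] - q[j])
        if i == 0 and j == 0:
            ca[i, j] = d
        elif i == 0:
            ca[i, j] = max(c(0, j - 1), d)
        elif j == 0:
            ca[i, j] = max(c(i - 1, 0), d)
        else:
            ca[i, j] = max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
        return ca[i, j]
    return c(n - 1, m - 1)

path_a = [(3, 5), (10, 12), (30, 31)]     # toy fixations on a 32 x 32 grid
path_b = [(4, 5), (11, 13), (29, 30)]
h_a = [xy_to_hilbert(32, x, y) for x, y in path_a]
h_b = [xy_to_hilbert(32, x, y) for x, y in path_b]
print(discrete_frechet(h_a, h_b))         # small value = similar scanpaths
```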
Affiliation(s)
- Robert Ahadizad Newport
- Faculty of Medicine, Health and Human Sciences, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Carlo Russo
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Sidong Liu
- Faculty of Medicine, Health and Human Sciences, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Abdulla Al Suman
- Faculty of Medicine, Health and Human Sciences, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Antonio Di Ieva
- Faculty of Medicine, Health and Human Sciences, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
- Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Balaclava Road, Sydney, NSW 2109, Australia
10. Yu H, Shamsi F, Kwon M. Altered eye movements during reading under degraded viewing conditions: Background luminance, text blur, and text contrast. J Vis 2022; 22:4. [PMID: 36069942] [PMCID: PMC9465940] [DOI: 10.1167/jov.22.10.4]
Abstract
Degraded viewing conditions caused by either natural environments or visual disorders lead to slow reading. Here, we systematically investigated how eye movement patterns during reading are affected by degraded viewing conditions in terms of spatial resolution, contrast, and background luminance. Using a high-speed eye tracker, binocular eye movements were obtained from 14 young normally sighted adults. Images of text passages were manipulated with varying degrees of background luminance (1.3-265 cd/m2), text blur (severe blur to no blur), or text contrast (2.6%-100%). We analyzed changes in key eye movement features, such as saccades, microsaccades, regressive saccades, fixations, and return-sweeps across different viewing conditions. No significant changes were observed for the range of tested background luminance values. However, with increasing text blur and decreasing text contrast, we observed a significant decrease in saccade amplitude and velocity, as well as a significant increase in fixation duration, number of fixations, proportion of regressive saccades, microsaccade rate, and duration of return-sweeps. Of these, saccade amplitude, fixation duration, and proportion of regressive saccades turned out to be the most significant contributors to reading speed, together accounting for 90% of the variance in reading speed. Our results together showed that, when presented with degraded viewing conditions, the patterns of eye movements during reading were altered accordingly. These findings suggest that the seemingly deviant eye movements observed in individuals with visual impairments may in part result from active and optimal information acquisition strategies adopted when visual sensory input becomes substantially deprived.
Affiliation(s)
- Haojue Yu
- Department of Psychology, Northeastern University, Boston, MA, USA
- Foroogh Shamsi
- Department of Psychology, Northeastern University, Boston, MA, USA
- MiYoung Kwon
- Department of Psychology, Northeastern University, Boston, MA, USA
11. Modeling eye movement in dynamic interactive tasks for maximizing situation awareness based on Markov decision process. Sci Rep 2022; 12:13298. [PMID: 35918377] [PMCID: PMC9346140] [DOI: 10.1038/s41598-022-17433-3]
Abstract
For complex dynamic interactive tasks (such as aviating), operators need to continuously extract information from areas of interest (AOIs) through eye movements to maintain a high level of situation awareness (SA), as failures of SA may degrade task performance or even cause system accidents. Most current eye movement models focus on either static tasks (such as image viewing) or simple dynamic tasks (such as video watching), without considering SA. In this study, an eye movement model with the goal of maximizing SA is proposed based on a Markov decision process (MDP), designed to describe the dynamic eye movements of experienced operators in dynamic interactive tasks. Two top-down factors, expectancy and value, are introduced into this model to represent the update probability and the importance of information in AOIs, respectively. In particular, the model regards the sequence of eye fixations to different AOIs as sequential decisions to maximize the SA-related reward (value) in the context of uncertain information updates (expectancy). The model was validated with a flight simulation experiment. Results show that the predicted probabilities of fixation on and shift between AOIs are highly correlated with those of the experimental data (R = 0.928 and R = 0.951, respectively).
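To illustrate the expectancy-value mechanism described above, here is a hedged toy sketch (not the authors' model): each AOI carries an update probability (expectancy) and an importance weight (value), and the simulated observer fixates the AOI with the greatest expected SA gain, as a one-step-greedy stand-in for the full MDP policy. All numbers are illustrative.

```python
# Toy expectancy-value fixation policy: fixate the AOI whose information
# is most likely stale, weighted by its importance (one-step greedy).
import numpy as np

def simulate_fixations(expectancy, value, steps=10):
    n = len(expectancy)
    stale = np.zeros(n)          # P(AOI holds unseen information)
    sequence = []
    for _ in range(steps):
        stale = stale + (1 - stale) * expectancy   # information may update
        reward = value * stale                     # expected SA gain per AOI
        i = int(np.argmax(reward))
        sequence.append(i)
        stale[i] = 0.0                             # fixated AOI is now fresh
    return sequence

print(simulate_fixations(expectancy=np.array([0.6, 0.2, 0.4]),
                         value=np.array([1.0, 3.0, 1.5])))
```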
12. Martin D, Serrano A, Bergman AW, Wetzstein G, Masia B. ScanGAN360: A Generative Model of Realistic Scanpaths for 360° Images. IEEE Trans Vis Comput Graph 2022; 28:2003-2013. [PMID: 35167469] [DOI: 10.1109/tvcg.2022.3150502]
Abstract
Understanding and modeling the dynamics of human gaze behavior in 360° environments is crucial for creating, improving, and developing emerging virtual reality applications. However, recruiting human observers and acquiring enough data to analyze their behavior when exploring virtual environments requires complex hardware and software setups, and can be time-consuming. Being able to generate virtual observers can help overcome this limitation, and thus stands as an open problem in this medium. Particularly, generative adversarial approaches could alleviate this challenge by generating a large number of scanpaths that reproduce human behavior when observing new scenes, essentially mimicking virtual observers. However, existing methods for scanpath generation do not adequately predict realistic scanpaths for 360° images. We present ScanGAN360, a new generative adversarial approach to address this problem. We propose a novel loss function based on dynamic time warping and tailor our network to the specifics of 360° images. The quality of our generated scanpaths outperforms competing approaches by a large margin, and is almost on par with the human baseline. ScanGAN360 allows fast simulation of large numbers of virtual observers, whose behavior mimics real users, enabling a better understanding of gaze behavior, facilitating experimentation, and aiding novel applications in virtual reality and beyond.
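The dynamic time warping idea behind the loss mentioned above can be sketched plainly; note that a trainable loss would presumably need a differentiable variant, while this is ordinary DTW for illustration, with toy normalised fixation coordinates.

```python
# Plain dynamic time warping (DTW) distance between two scanpaths given
# as (n, 2) arrays of normalised fixation coordinates.
import numpy as np

def dtw(p: np.ndarray, q: np.ndarray) -> float:
    n, m = len(p), len(q)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(p[i - 1] - q[j - 1])   # pointwise distance
            acc[i, j] = cost + min(acc[i - 1, j],        # alignment recursion
                                   acc[i, j - 1],
                                   acc[i - 1, j - 1])
    return float(acc[n, m])

a = np.array([[0.1, 0.2], [0.5, 0.5], [0.9, 0.8]])
b = np.array([[0.1, 0.25], [0.45, 0.55], [0.80, 0.80], [0.9, 0.85]])
print(dtw(a, b))   # small values indicate similar scanpaths
```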
13. Ernst D, Wolfe JM. How fixation durations are affected by search difficulty manipulations. Vis Cogn 2022. [DOI: 10.1080/13506285.2022.2063465]
Affiliation(s)
- Daniel Ernst
- Brigham & Women’s Hospital, Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
- Bielefeld University, Bielefeld, Germany
- Jeremy M. Wolfe
- Brigham & Women’s Hospital, Boston, MA, United States
- Harvard Medical School, Boston, MA, United States
14. Engbert R, Rabe MM, Schwetlick L, Seelig SA, Reich S, Vasishth S. Data assimilation in dynamical cognitive science. Trends Cogn Sci 2022; 26:99-102. [PMID: 34972646] [DOI: 10.1016/j.tics.2021.11.006]
Abstract
Dynamical models make specific assumptions about the cognitive processes that generate human behavior. In data assimilation, these models are tested against time-ordered data. Recent progress on Bayesian data assimilation demonstrates that this approach combines the strengths of statistical modeling of individual differences with those of dynamical cognitive models.
Affiliation(s)
- Ralf Engbert
- Department of Psychology, University of Potsdam, Potsdam, Germany; Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Lisa Schwetlick
- Department of Psychology, University of Potsdam, Potsdam, Germany
- Stefan A Seelig
- Department of Psychology, University of Potsdam, Potsdam, Germany
- Sebastian Reich
- Institute of Mathematics, University of Potsdam, Potsdam, Germany; Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Shravan Vasishth
- Department of Linguistics, University of Potsdam, Potsdam, Germany; Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
15. Silva-Gago M, Ioannidou F, Fedato A, Hodgson T, Bruner E. Visual Attention and Cognitive Archaeology: An Eye-Tracking Study of Palaeolithic Stone Tools. Perception 2021; 51:3-24. [PMID: 34967251] [DOI: 10.1177/03010066211069504]
Abstract
The study of lithic technology can provide information on human cultural evolution. This article aims to analyse visual behaviour associated with the exploration of ancient stone artefacts and how this relates to perceptual mechanisms in humans. In Experiment 1, we used eye tracking to record patterns of eye fixations while participants viewed images of stone tools, including examples of worked pebbles and handaxes. The results showed that the focus of gaze was directed more towards the upper regions of worked pebbles and on the basal areas for handaxes. Knapped surfaces also attracted more fixation than natural cortex for both tool types. Fixation distribution was different to that predicted by models that calculate visual salience. Experiment 2 was an online study using a mouse-click attention tracking technique and included images of unworked pebbles and 'mixed' images combining the handaxe's outline with the pebble's unworked texture. The pattern of clicks corresponded to that revealed using eye tracking and there were differences between tools and other images. Overall, the findings suggest that visual exploration is directed towards functional aspects of tools. Studies of visual attention and exploration can supply useful information to inform understanding of human cognitive evolution and tool use.
Affiliation(s)
- María Silva-Gago
- Centro Nacional de Investigación sobre la Evolución Humana, Burgos, Spain
- Annapaola Fedato
- Centro Nacional de Investigación sobre la Evolución Humana, Burgos, Spain
- Timothy Hodgson
- College of Social Science, University of Lincoln, Lincoln, UK
- Emiliano Bruner
- Centro Nacional de Investigación sobre la Evolución Humana, Burgos, Spain
16. No-Reference Image Quality Assessment with Convolutional Neural Networks and Decision Fusion. Appl Sci (Basel) 2021. [DOI: 10.3390/app12010101]
Abstract
No-reference image quality assessment (NR-IQA) has always been a difficult research problem because digital images may suffer very diverse types of distortions and their contents are extremely various. Moreover, IQA is a very hot topic in the research community since the number and role of digital images in everyday life are continuously growing. Recently, a huge amount of effort has been devoted to exploiting convolutional neural networks and other deep learning techniques for no-reference image quality assessment. Since deep learning relies on a massive amount of labeled data, utilizing pretrained networks has become very popular in the literature. In this study, we introduce a novel, deep learning-based NR-IQA architecture that relies on the decision fusion of multiple image quality scores coming from different types of convolutional neural networks. The main idea behind this scheme is that a diverse set of different types of networks is able to better characterize authentic image distortions than a single network. The experimental results show that our method can effectively estimate perceptual image quality on four large IQA benchmark databases containing either authentic or artificial distortions. These results are also confirmed in significance and cross-database tests.
17. Xia C, Han J, Zhang D. Evaluation of Saccadic Scanpath Prediction: Subjective Assessment Database and Recurrent Neural Network Based Metric. IEEE Trans Pattern Anal Mach Intell 2021; 43:4378-4395. [PMID: 32750785] [DOI: 10.1109/tpami.2020.3002168]
Abstract
In recent years, predicting the saccadic scanpaths of humans has become a new trend in the field of visual attention modeling. Given various saccadic algorithms, determining how to evaluate their ability to model a dynamic saccade has become an important yet understudied issue. To the best of our knowledge, existing metrics for evaluating saccadic prediction models are often heuristically designed, which may produce results that are inconsistent with human subjective assessment. To this end, we first construct a subjective database by collecting the assessments on 5,000 pairs of scanpaths from ten subjects. Based on this database, we can compare different metrics according to their consistency with human visual perception. In addition, we also propose a data-driven metric to measure scanpath similarity based on the human subjective comparison. To achieve this goal, we employ a long short-term memory (LSTM) network to learn the inference from the relationship of encoded scanpaths to a binary measurement. Experimental results have demonstrated that the LSTM-based metric outperforms other existing metrics. Moreover, we believe the constructed database can be used as a benchmark to inspire more insights for future metric selection.
18. A Comparative Eye Tracking Study of Usability—Towards Sustainable Web Design. Sustainability 2021. [DOI: 10.3390/su131810415]
Abstract
Websites are one of the most frequently used communication environments, and creating sustainable web designs should be an objective for all companies. Ensuring high usability is proving to be one of the main contributors to sustainable web design, reducing usage time, eliminating frustration and increasing satisfaction and retention. The present paper studies the usability of different website landing pages, seeking to identify the elements, structures and designs that increase usability. The study analyzed the behavior of 22 participants during their interaction with five different landing pages while they performed three tasks on the webpage and freely viewed each page for one minute. The stimuli were represented by five different banking websites, each of them presenting the task content in a different mode (text, image, symbol, graph, etc.). The data obtained from the eye tracker (fixation location, order and duration, saccades, revisits of the same element, etc.), together with the data from the applied survey, lead to interesting conclusions: the top, center and right sides of the webpage attract the most attention; the use of pictures depicting persons increases visibility; the scanpaths follow a vertical and horizontal direction; numerical data should be presented through graphs or tables. Even if a user's past experience influences their experience on a website, we show that the design of the webpage itself has a greater influence on webpage usability.
19. Liu H, Hu X, Ren Y, Wang L, Guo L, Guo CC, Han J. Neural Correlates of Interobserver Visual Congruency in Free-Viewing Condition. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2020.3002765]
20. Human scanpath estimation based on semantic segmentation guided by common eye fixation behaviors. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.07.121]
22. Huber-Huber C, Buonocore A, Melcher D. The extrafoveal preview paradigm as a measure of predictive, active sampling in visual perception. J Vis 2021; 21:12. [PMID: 34283203] [PMCID: PMC8300052] [DOI: 10.1167/jov.21.7.12]
Abstract
A key feature of visual processing in humans is the use of saccadic eye movements to look around the environment. Saccades are typically used to bring relevant information, which is glimpsed with extrafoveal vision, into the high-resolution fovea for further processing. With the exception of some unusual circumstances, such as the first fixation when walking into a room, our saccades are mainly guided by this extrafoveal preview. In contrast, the majority of experimental studies in vision science have investigated "passive" behavioral and neural responses to suddenly appearing and often temporally or spatially unpredictable stimuli. As reviewed here, a growing number of studies have investigated visual processing of objects under more natural viewing conditions in which observers move their eyes to a stationary stimulus, visible previously in extrafoveal vision, during each trial. These studies demonstrate that the extrafoveal preview has a profound influence on visual processing of objects, both for behavior and for neural activity. Starting from the preview effect in reading research, we follow subsequent developments in vision research more generally and finally argue that taking such evidence seriously leads to a reconceptualization of the nature of human visual perception that incorporates the strong influence of prediction and action on sensory processing. We review theoretical perspectives on visual perception under naturalistic viewing conditions, including theories of active vision, active sensing, and sampling. Although the extrafoveal preview paradigm has already provided useful information about the timing of, and potential mechanisms for, the close interaction of the oculomotor and visual systems while reading and in natural scenes, the findings thus far also raise many new questions for future research.
Affiliation(s)
- Christoph Huber-Huber
- Radboud University, Donders Institute for Brain, Cognition and Behaviour, The Netherlands
- CIMeC, University of Trento, Italy
- Antimo Buonocore
- Werner Reichardt Centre for Integrative Neuroscience, Tübingen University, Tübingen, BW, Germany
- Hertie Institute for Clinical Brain Research, Tübingen University, Tübingen, BW, Germany
- David Melcher
- CIMeC, University of Trento, Italy
- Division of Science, New York University Abu Dhabi, UAE
23.
Abstract
Saliency and visual attention have been studied in a computational context for decades, mostly in the capacity of predicting spatial topographical saliency maps or simulated heatmaps. Spatial selection by an attentive mechanism is, however, inherently a sequential sampling process in humans. There have been recent efforts in analyzing and modeling scanpaths; however, there is as yet no universal agreement on which metrics should be applied to measure scanpath similarity or the quality of a predicted scanpath from a computational model. Many similarity measures have been suggested in different contexts, and little is known about their behavior or properties. This paper presents in one place a review of these metrics, an axiomatic analysis of gaze metrics for scanpaths, and a careful analysis of the discriminative power of different metrics in order to provide a roadmap for future analysis. This is accompanied by experimentation based on classic modeling strategies for simulating sequential selection from traditional representations of saliency, and on deep neural networks that produce sequences by construction. Experiments provide strong support for the necessity of sequential analysis of attention and support for certain metrics, including a family of metrics introduced in this paper motivated by the notion of scanpath plausibility.
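One classic member of the metric family reviewed here is the string-edit (Levenshtein) distance between scanpaths coded as strings of AOI labels; a minimal sketch follows (the AOI coding itself is an assumption for illustration).

```python
# Levenshtein distance between AOI-coded scanpaths: the minimum number of
# insertions, deletions, and substitutions turning one string into the other.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

print(levenshtein("ABCD", "ABDC"))  # 2: reordered revisits are penalised
```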
24. Sun W, Chen Z, Wu F. Visual Scanpath Prediction Using IOR-ROI Recurrent Mixture Density Network. IEEE Trans Pattern Anal Mach Intell 2021; 43:2101-2118. [PMID: 31796389] [DOI: 10.1109/tpami.2019.2956930]
Abstract
A visual scanpath represents the human eye movements made when scanning the visual field to acquire and receive visual information. Predicting visual scanpaths when a certain stimulus is presented plays an important role in modeling overt human visual attention and search behavior. In this paper, we present an 'Inhibition of Return - Region of Interest' (IOR-ROI) recurrent mixture density network based framework that learns to produce human-like visual scanpaths under task-free viewing conditions. The proposed model simultaneously predicts a sequence of ordered fixation positions and their corresponding fixation durations. Our model integrates bottom-up features and semantic features extracted by convolutional neural networks. The integrated feature maps are then fed into the IOR-ROI Long Short-Term Memory (LSTM), which is the core component of the proposed model. The IOR-ROI LSTM is a dual LSTM unit, i.e., the IOR-LSTM and the ROI-LSTM, capturing IOR dynamics and gaze shift behavior simultaneously. The IOR-LSTM simulates visual working memory to adaptively maintain and update visual information regarding previously fixated regions. The ROI-LSTM is responsible for predicting the next possible ROIs given the spatially inhibited image feature maps on a feature-wise basis. Fixation duration is predicted by a regression neural network given the viewing history and the image feature maps corresponding to the currently fixated ROI. Considering eye movement pattern variations among subjects, a mixture density network is adopted to model the next fixation distribution as Gaussian mixtures, and the fixation duration is also modeled using a Gaussian distribution. Our model is evaluated on the OSIE and MIT low-resolution eye-tracking datasets, and experimental results indicate that the proposed method achieves superior performance in predicting visual scanpaths. The code will be publicly available at https://github.com/sunwj/scanpath.
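At sampling time, the mixture-density output step described above reduces to drawing from a Gaussian mixture over the image plane. A hedged sketch with diagonal covariances and made-up parameters (not the published network's output head):

```python
# Sample a next-fixation location from a predicted Gaussian mixture
# (weights, means, diagonal standard deviations).
import numpy as np

def sample_next_fixation(weights, means, stds, seed=None):
    rng = np.random.default_rng(seed)
    k = rng.choice(len(weights), p=weights)     # pick a mixture component
    return means[k] + stds[k] * rng.standard_normal(2)

weights = np.array([0.7, 0.3])
means = np.array([[0.3, 0.4], [0.8, 0.6]])      # normalised image coords
stds = np.array([[0.05, 0.05], [0.02, 0.02]])
print(sample_next_fixation(weights, means, stds, seed=3))
```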
25. Image Quality Assessment without Reference by Combining Deep Learning-Based Features and Viewing Distance. Appl Sci (Basel) 2021. [DOI: 10.3390/app11104661]
Abstract
An abundance of objective image quality metrics have been introduced in the literature. One essential aspect on which perceived image quality depends is the viewing distance from the observer to the image. We introduce in this study a novel image quality metric able to estimate the quality of a given image without reference for different viewing distances between the image and the observer. We first select relevant patches from the image using saliency information. For each patch, a feature vector is extracted from a convolutional neural network model and concatenated with the viewing distance for which the quality is predicted. The resulting vector is fed to fully connected layers to predict subjective scores for the considered viewing distance. The proposed method was evaluated using the Colourlab Image Database: Image Quality and the Viewing Distance-changed Image Database. Both databases provide subjective scores at two different viewing distances. In the Colourlab Image Database: Image Quality we obtained a Pearson correlation of 0.87 at both 50 cm and 100 cm viewing distances, while in the Viewing Distance-changed Image Database we obtained Pearson correlations of 0.93 and 0.94 at viewing distances of four and six times the image height. The results show the efficiency of our method and its generalization ability.
26. D'Amelio A, Boccignone G. Gazing at Social Interactions Between Foraging and Decision Theory. Front Neurorobot 2021; 15:639999. [PMID: 33859558] [PMCID: PMC8042312] [DOI: 10.3389/fnbot.2021.639999]
Abstract
Finding the underlying principles of social attention in humans seems to be essential for the design of the interaction between natural and artificial agents. Here, we focus on the computational modeling of gaze dynamics as exhibited by humans when perceiving socially relevant multimodal information. The audio-visual landscape of social interactions is distilled into a number of multimodal patches that convey different social value, and we work under the general framework of foraging as a tradeoff between local patch exploitation and landscape exploration. We show that the spatio-temporal dynamics of gaze shifts can be parsimoniously described by Langevin-type stochastic differential equations triggering a decision equation over time. In particular, value-based patch choice and handling is reduced to a simple multi-alternative perceptual decision-making process that relies on a race-to-threshold between independent continuous-time perceptual evidence integrators, each integrator being associated with a patch.
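The race-to-threshold decision rule described above admits a compact sketch: one independent noisy accumulator per patch, with the first to reach threshold selecting the next gaze target. Drift rates, noise level, and threshold below are illustrative assumptions, not the paper's fitted values.

```python
# Race-to-threshold among independent evidence accumulators, one per patch.
import numpy as np

def race_to_threshold(drifts, noise=0.1, threshold=1.0, dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(len(drifts))
    t = 0.0
    while np.all(x < threshold):
        x += drifts * dt + noise * np.sqrt(dt) * rng.standard_normal(len(x))
        t += dt
    return int(np.argmax(x)), t     # winning patch and decision time

winner, rt = race_to_threshold(np.array([0.8, 0.5, 0.3]))
print(winner, rt)   # the high-value patch tends to win, and sooner
```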
Affiliation(s)
- Alessandro D'Amelio
- PHuSe Lab, Department of Computer Science, Università degli Studi di Milano, Milan, Italy
- Giuseppe Boccignone
- PHuSe Lab, Department of Computer Science, Università degli Studi di Milano, Milan, Italy
27. Abeles D, Yuval-Greenberg S. Active sensing and overt avoidance: Gaze shifts as a mechanism of predictive avoidance in vision. Cognition 2021; 211:104648. [PMID: 33714871] [DOI: 10.1016/j.cognition.2021.104648]
Abstract
Sensory organs are not only involved in passively transmitting sensory input, but are also involved in actively seeking it. Some sensory organs move dynamically to allow highly prioritized input to be detected by their most sensitive parts. Such 'active sensing' systems engage in pursuing relevant input, relying on attentional prioritizations. However, pursuing input may not always be advantageous. Task-irrelevant input may be distracting and interfere with task performance. We hypothesize that an efficient 'active sensing' mechanism should be able to not only pursue relevant input but also to predict irrelevant input and avoid it. Moreover, we hypothesize that this mechanism should be evident even when the task is non-visual and all visual information acts as a distractor. In this study, we demonstrate the existence of a predictive 'overt avoidance' mechanism in vision. In two experiments, participants were asked to perform a continuous mental-arithmetic task while occasionally being presented with task-irrelevant crowded displays limited to one quadrant of a screen. The locations of these visual stimuli were constant within a block but varied between blocks. Results show that gaze was consistently shifted away from the predicted location of distraction, even prior to its appearance, confirming the existence of a predictive 'overt avoidance' mechanism in vision. Based on these findings, we propose a conceptual model to explain how an 'active sensing' system, hardwired to explore, can overcome this drive when presented with distracting information. According to the model, distraction is handled through a dual mechanism of suppression and avoidance processes that are causally linked. This framework demonstrates how perception and motion work together to approach relevant information while avoiding irrelevant distraction.
Affiliation(s)
- Dekel Abeles
- School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Shlomit Yuval-Greenberg
- School of Psychological Sciences and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
28. Park K, Chin S. A Smart Interface HUD Optimized for VR HMD and Leap Motion. J Imaging Sci Technol 2021. [DOI: 10.2352/j.imagingsci.technol.2021.65.2.020501]
29. Okada KI, Miura K, Fujimoto M, Morita K, Yoshida M, Yamamori H, Yasuda Y, Iwase M, Inagaki M, Shinozaki T, Fujita I, Hashimoto R. Impaired inhibition of return during free-viewing behaviour in patients with schizophrenia. Sci Rep 2021; 11:3237. [PMID: 33547381] [PMCID: PMC7865073] [DOI: 10.1038/s41598-021-82253-w]
Abstract
Schizophrenia affects various aspects of cognitive and behavioural functioning. Eye movement abnormalities are commonly observed in patients with schizophrenia (SZs). Here we examined whether such abnormalities reflect an anomaly in inhibition of return (IOR), the mechanism that inhibits orienting to previously fixated or attended locations. We analyzed spatiotemporal patterns of eye movement during free-viewing of visual images including natural scenes, geometrical patterns, and pseudorandom noise in SZs and healthy control participants (HCs). SZs made saccades to previously fixated locations more frequently than HCs. The time lapse from the preceding saccade was longer for return saccades than for forward saccades in both SZs and HCs, but the difference was smaller in SZs. SZs explored a smaller area than HCs. Generalized linear mixed-effect model analysis indicated that the frequent return saccades served to confine SZs' visual exploration to localized regions. The higher probability of return saccades in SZs was related to cognitive decline after disease onset but not to the dose of prescribed antipsychotics. We conclude that SZs exhibited attenuated IOR under free-viewing conditions, which led to restricted scene scanning. IOR attenuation will be a useful clue for detecting impairment in attention/orienting control and accompanying cognitive decline in schizophrenia.
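One quantity implied by this analysis is the fraction of saccades that return gaze to a previously fixated location. A hedged sketch follows; the radius criterion and coordinates are assumptions for illustration, not the authors' exact definition.

```python
# Fraction of saccades landing within `radius` of any earlier fixation,
# excluding the immediately preceding one (a simple IOR-related measure).
import numpy as np

def return_saccade_rate(fixations: np.ndarray, radius: float = 1.0) -> float:
    returns, total = 0, 0
    for i in range(1, len(fixations)):
        total += 1
        past = fixations[:i - 1]      # all fixations before the previous one
        if len(past) and np.min(np.linalg.norm(past - fixations[i], axis=1)) < radius:
            returns += 1
    return returns / total if total else 0.0

path = np.array([[0, 0], [5, 0], [5, 5], [0.2, 0.1]])  # last saccade returns
print(return_saccade_rate(path))   # 1 of 3 saccades is a return
```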
Affiliation(s)
- Ken-ichi Okada
- Graduate School of Frontier Biosciences, Osaka University, Osaka, 565-0871, Japan
- Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, and Osaka University, Osaka, 565-0871, Japan
- Present address: Department of Physiology, Hokkaido University School of Medicine, Hokkaido, 060-8638, Japan
- Kenichiro Miura
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, Ogawa-Higashi 4-1-1, Kodaira, Tokyo, 187-8553, Japan
- Michiko Fujimoto
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, Ogawa-Higashi 4-1-1, Kodaira, Tokyo, 187-8553, Japan
- Department of Psychiatry, Osaka University Graduate School of Medicine, Osaka, 565-0871, Japan
- Kentaro Morita
- Department of Rehabilitation, University of Tokyo Hospital, Tokyo, 113-8655, Japan
- Masatoshi Yoshida
- Department of Developmental Physiology, National Institute for Physiological Sciences, Aichi, 444-8585, Japan
- School of Life Science, The Graduate University for Advanced Studies, Kanagawa, 240-0193, Japan
- Center for Human Nature, Artificial Intelligence, and Neuroscience, Hokkaido University, Hokkaido, 060-0812, Japan
- Hidenaga Yamamori
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, Ogawa-Higashi 4-1-1, Kodaira, Tokyo, 187-8553, Japan
- Department of Psychiatry, Osaka University Graduate School of Medicine, Osaka, 565-0871, Japan
- Japan Community Health Care Organization Osaka Hospital, Osaka, 553-0003, Japan
- Yuka Yasuda
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, Ogawa-Higashi 4-1-1, Kodaira, Tokyo, 187-8553, Japan
- Life Grow Brilliant Mental Clinic, Medical Corporation Foster, Osaka, 530-0012, Japan
- Molecular Research Center for Children’s Mental Development, United Graduate School of Child Development, Osaka University, Osaka, 565-0871, Japan
- Masao Iwase
- Department of Psychiatry, Osaka University Graduate School of Medicine, Osaka, 565-0871, Japan
- Mikio Inagaki
- Graduate School of Frontier Biosciences, Osaka University, Osaka, 565-0871, Japan
- Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, and Osaka University, Osaka, 565-0871, Japan
- Takashi Shinozaki
- Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, and Osaka University, Osaka, 565-0871, Japan
- Graduate School of Information Science and Technology, Osaka University, Osaka, 565-0871, Japan
- Ichiro Fujita
- Graduate School of Frontier Biosciences, Osaka University, Osaka, 565-0871, Japan
- Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology, and Osaka University, Osaka, 565-0871, Japan
- Ryota Hashimoto
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, Ogawa-Higashi 4-1-1, Kodaira, Tokyo, 187-8553, Japan
- Department of Psychiatry, Osaka University Graduate School of Medicine, Osaka, 565-0871, Japan
- Molecular Research Center for Children’s Mental Development, United Graduate School of Child Development, Osaka University, Osaka, 565-0871, Japan
30. Zhou Y, Yu Y. Human visual search follows a suboptimal Bayesian strategy revealed by a spatiotemporal computational model and experiment. Commun Biol 2021; 4:34. [PMID: 33397998] [PMCID: PMC7782508] [DOI: 10.1038/s42003-020-01485-0]
Abstract
There is conflicting evidence regarding whether humans can make spatially optimal eye movements during visual search. Some studies have shown that humans can optimally integrate information across fixations and determine the next fixation location; however, these models have generally ignored the control of fixation duration and memory limitations, and their results do not agree well with the details of human eye movement metrics. Here, we measured the temporal course of the human visibility map and performed a visual search experiment. We further built a continuous-time eye movement model that accounts for saccadic inaccuracy, saccadic bias, and memory constraints. We show that this model agrees better with the spatial and temporal properties of human eye movements and predicts that humans have a memory capacity of around eight previous fixations. The model results reveal that humans employ a suboptimal eye movement strategy to find a target, one that may minimize costs while still achieving sufficiently high search performance.
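For intuition, the following is a minimal sketch of the core mechanism under simplifying assumptions: an ideal-observer posterior update over target locations, a hypothetical eccentricity-dependent visibility (d') function, and the paper's estimated eight-fixation capacity implemented as a hard memory window. The grid size, d' falloff, greedy fixation policy, and stopping rule are illustrative choices, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
GRID, MEMORY, STEPS = 25, 8, 25        # MEMORY: the paper's ~8-fixation estimate

def dprime(ecc):
    """Hypothetical visibility map: discriminability decays with eccentricity."""
    return 3.0 * np.exp(-ecc / 5.0)

xs, ys = np.meshgrid(np.arange(GRID), np.arange(GRID), indexing="ij")
target = rng.integers(0, GRID, 2)
is_target = (xs == target[0]) & (ys == target[1])
evidence = []                           # one log-likelihood-ratio map per fixation
fix = np.array([GRID // 2, GRID // 2])  # start at the display center

for step in range(STEPS):
    d = dprime(np.hypot(xs - fix[0], ys - fix[1]))
    x = rng.normal(is_target * d, 1.0)          # noisy feature response everywhere
    evidence.append(d * x - d ** 2 / 2)         # LLR for N(d,1) vs N(0,1)
    evidence = evidence[-MEMORY:]               # forget old fixations (key constraint)
    log_post = np.sum(evidence, axis=0)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    if post.max() > 0.995:                      # confident enough: stop searching
        break
    # greedy (hence suboptimal) policy: fixate the current posterior maximum
    fix = np.array(np.unravel_index(post.argmax(), post.shape))

print("target", target, "final fixation", fix, "fixations used", step + 1)
```

Dropping old likelihood maps is what makes the searcher suboptimal here: evidence gathered more than MEMORY fixations ago no longer shapes the posterior.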
Collapse
Affiliation(s)
- Yunhui Zhou
- School of Life Sciences, Fudan University, 200433, Shanghai, China
| | - Yuguo Yu
- School of Life Sciences, Fudan University, 200433, Shanghai, China.
- State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, 200433, Shanghai, China.
- Human Phenome Institute, Fudan University, 200433, Shanghai, China.
- Research Institute of Intelligent Complex Systems and Institutes of Brain Science, Fudan University, 200433, Shanghai, China.
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, 200433, Shanghai, China.
| |
Collapse
|
31
|
Schwetlick L, Rothkegel LOM, Trukenbrod HA, Engbert R. Modeling the effects of perisaccadic attention on gaze statistics during scene viewing. Commun Biol 2020; 3:727. [PMID: 33262536 PMCID: PMC7708631 DOI: 10.1038/s42003-020-01429-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2020] [Accepted: 10/21/2020] [Indexed: 11/09/2022] Open
Abstract
How we perceive a visual scene depends critically on the selection of gaze positions. Visual attention is known to play a key role in this selection process in two ways. First, image features attract visual attention, a fact that is captured well by time-independent fixation models. Second, millisecond-level attentional dynamics around the time of a saccade drive our gaze from one position to the next. These two related research areas on attention are typically treated as separate, both theoretically and experimentally. Here we link the two by demonstrating that perisaccadic attentional dynamics improve predictions of scan path statistics. In a mathematical model, we integrated perisaccadic covert attention with dynamic scan path generation. Our model reproduces saccade amplitude distributions, angular statistics, intersaccadic turning angles, and their impact on fixation durations, as well as inter-individual differences, using Bayesian inference. These results therefore lend support to the relevance of perisaccadic attention to gaze statistics.
Collapse
Affiliation(s)
- Lisa Schwetlick
- Department of Psychology, University of Potsdam, 14469, Potsdam, Germany.
- DFG Collaborative Research Center 1294, University of Potsdam, 14469, Potsdam, Germany.
| | | | | | - Ralf Engbert
- Department of Psychology, University of Potsdam, 14469, Potsdam, Germany
- DFG Collaborative Research Center 1294, University of Potsdam, 14469, Potsdam, Germany
- Research Focus Cognitive Science, University of Potsdam, 14469, Potsdam, Germany
| |
Collapse
|
32
|
Zanca D, Melacci S, Gori M. Gravitational Laws of Focus of Attention. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2020; 42:2983-2995. [PMID: 31180885 DOI: 10.1109/tpami.2019.2920636] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
The understanding of the mechanisms behind the focus of attention in a visual scene is a problem of great interest in visual perception and computer vision. In this paper, we describe a model of scanpaths as a dynamic process that can be interpreted as a variational law related to mechanics, in which the focus of attention is subject to a gravitational field. The distributed virtual mass that drives eye movements is associated with the presence of details and motion in the video. Unlike most current models, the proposed approach does not estimate the saliency map directly; instead, the prediction of eye movements allows us to integrate the positions of interest over time. The process of inhibition of return is also supported within the same dynamic model for the purpose of simulating fixations and saccades. The differential equations of motion of the proposed model are numerically integrated to simulate scanpaths on both images and videos. Experimental results for the tasks of saliency and scanpath prediction on a wide collection of datasets are presented to support the theory. Top-level performance is achieved especially in the prediction of scanpaths, which is the primary purpose of the proposed model.
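A minimal numerical sketch of the gravitational idea, not the authors' model: the focus of attention is a point mass pulled by pixel-wise virtual masses, with velocity damping, and a simple local mass-dissipation term standing in for inhibition of return. The mass map, damping constant, softening term, and dissipation kernel are all assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)
H = W = 64
mass = rng.random((H, W)) ** 4          # stand-in for detail/motion-based virtual mass
ys, xs = np.mgrid[0:H, 0:W].astype(float)

pos = np.array([H / 2.0, W / 2.0])      # focus of attention
vel = np.zeros(2)
dt, damping = 0.1, 0.2

trajectory = [pos.copy()]
for t in range(2000):                   # explicit Euler integration of the dynamics
    dy, dx = ys - pos[0], xs - pos[1]
    r2 = dy ** 2 + dx ** 2 + 4.0        # softened radius to avoid the singularity
    # gravitational pull of every pixel's virtual mass on the focus of attention
    force = np.array([np.sum(mass * dy / r2 ** 1.5),
                      np.sum(mass * dx / r2 ** 1.5)])
    vel += dt * (force - damping * vel)
    pos = np.clip(pos + dt * vel, 0, [H - 1, W - 1])
    # inhibition of return: dissipate mass around the currently attended location
    mass *= 1 - 0.05 * np.exp(-((ys - pos[0]) ** 2 + (xs - pos[1]) ** 2) / 20.0)
    trajectory.append(pos.copy())
```

Slow drifts of `pos` play the role of fixations and fast transitions the role of saccades; the dissipation term is what keeps the trajectory from collapsing onto the single strongest mass.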
Collapse
|
33
|
Berga D, Otazu X. Modeling bottom-up and top-down attention with a neurodynamic model of V1. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.07.047] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
34
|
Malem-Shinitski N, Opper M, Reich S, Schwetlick L, Seelig SA, Engbert R. A mathematical model of local and global attention in natural scene viewing. PLoS Comput Biol 2020; 16:e1007880. [PMID: 33315888 PMCID: PMC7769622 DOI: 10.1371/journal.pcbi.1007880] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2020] [Revised: 12/28/2020] [Accepted: 10/23/2020] [Indexed: 11/24/2022] Open
Abstract
Understanding the decision process underlying gaze control is an important question in cognitive neuroscience, with applications in diverse fields ranging from psychology to computer vision. The decision for choosing an upcoming saccade target can be framed as a selection process between two states: should the observer further inspect the information near the current gaze position (local attention), or continue with the exploration of other patches of the given scene (global attention)? Here we propose and investigate a mathematical model motivated by switching between these two attentional states during scene viewing. The model is derived from a minimal set of assumptions that generate realistic eye movement behavior. We implemented a Bayesian approach for model parameter inference based on the model's likelihood function. To simplify the inference, we applied data augmentation methods that allowed the use of conjugate priors and the construction of an efficient Gibbs sampler. This approach turned out to be numerically efficient and permitted fitting interindividual differences in saccade statistics. Thus, the main contribution of our modeling approach is twofold: first, we propose a new model for saccade generation in scene viewing; second, we demonstrate the use of novel methods from Bayesian inference in the field of scan path modeling.
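The generative side of such a two-state model can be sketched as a hidden Markov switch between short (local) and long (global) saccades. The transition probabilities and gamma amplitude parameters below are hypothetical placeholders, and the paper's Gibbs-sampler inference is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical transition probabilities between the two attentional states
P = {"local": {"local": 0.7, "global": 0.3},
     "global": {"local": 0.4, "global": 0.6}}
# gamma-distributed saccade amplitudes: short for local, long for global
AMP = {"local": (2.0, 0.5), "global": (4.0, 2.0)}   # (shape, scale), degrees

state, pos = "local", np.zeros(2)
scanpath = [pos.copy()]
for _ in range(50):
    state = str(rng.choice(["local", "global"],
                           p=[P[state]["local"], P[state]["global"]]))
    amplitude = rng.gamma(*AMP[state])
    angle = rng.uniform(0, 2 * np.pi)      # isotropic directions, for simplicity
    pos = pos + amplitude * np.array([np.cos(angle), np.sin(angle)])
    scanpath.append(pos.copy())

print(np.array(scanpath).round(1))
```

Fitting the model then amounts to inferring the hidden state sequence and these parameters from observed scanpaths, which is where the paper's conjugate priors and Gibbs sampler come in.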
Collapse
Affiliation(s)
| | - Manfred Opper
- Department of Artificial Intelligence, Technische Universität Berlin, Berlin, Germany
| | - Sebastian Reich
- Institute of Mathematics, University of Potsdam, Potsdam, Germany
| | - Lisa Schwetlick
- Department of Psychology, University of Potsdam, Potsdam, Germany
| | - Stefan A. Seelig
- Department of Psychology, University of Potsdam, Potsdam, Germany
| | - Ralf Engbert
- Department of Psychology, University of Potsdam, Potsdam, Germany
| |
Collapse
|
35
|
Le Meur O, Le Pen T, Cozot R. Can we accurately predict where we look at paintings? PLoS One 2020; 15:e0239980. [PMID: 33035250 PMCID: PMC7546463 DOI: 10.1371/journal.pone.0239980] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2020] [Accepted: 09/17/2020] [Indexed: 11/27/2022] Open
Abstract
The objective of this study is to investigate and simulate the gaze deployment of observers viewing paintings. For that purpose, we built a large eye tracking dataset composed of 150 paintings belonging to 5 art movements. We observed that gaze deployment over these paintings was very similar to gaze deployment over natural scenes. We therefore evaluated existing saliency models and propose a new one that significantly outperforms the most recent deep-learning-based saliency models. Thanks to this new saliency model, we can predict very accurately which areas of a painting are salient. This opens new avenues for many image-based applications, such as animating paintings or transforming a still painting into a video clip.
Collapse
|
36
|
|
37
|
Harada Y, Ohyama J. The effect of task-irrelevant spatial contexts on 360-degree attention. PLoS One 2020; 15:e0237717. [PMID: 32810159 PMCID: PMC7437462 DOI: 10.1371/journal.pone.0237717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2020] [Accepted: 07/31/2020] [Indexed: 11/19/2022] Open
Abstract
The effect of spatial contexts on attention is important for evaluating the risk of human error and the accessibility of information in different situations. In traditional studies, this effect has been investigated using display-based and non-laboratory procedures. However, these two procedures are inadequate for measuring attention directed toward 360-degree environments while controlling exogenous stimuli. To overcome these limitations, we used a virtual-reality-based procedure and investigated how the spatial contexts of 360-degree environments influence attention. In the experiment, 20 students were asked to search for and report a target that could be presented at any location in a 360-degree virtual space, as accurately and quickly as possible. Spatial contexts comprised a basic context (a grey, objectless space) and three specific contexts (a square grid floor, a cubic room, and an infinite floor). We found that response times for the task and eye movements were influenced by the spatial context of the 360-degree surrounding space. In particular, although total viewing times for the contexts did not match the saliency maps, the differences in total viewing times between the basic and specific contexts did resemble the maps. These results suggest that attention comprises basic and context-dependent characteristics, and that the latter are influenced by the saliency of 360-degree contexts even when the contexts are irrelevant to the task.
Collapse
Affiliation(s)
- Yuki Harada
- National Institute of Advanced Industrial Science and Technology, Human Augmentation Research Center, Tsukuba, Ibaraki, Japan
- Department of Rehabilitation for Brain Functions, Research Institute of National Rehabilitation Center for Persons with Disabilities, Tokorozawa, Saitama, Japan
| | - Junji Ohyama
- National Institute of Advanced Industrial Science and Technology, Human Augmentation Research Center, Tsukuba, Ibaraki, Japan
| |
Collapse
|
38
|
Abstract
Unmanned Aerial Vehicle (UAV) imagery has been gaining momentum lately. Indeed, information gathered from a bird's-eye point of view is particularly relevant for numerous applications, from agriculture to surveillance services. Here we study visual saliency to verify whether there are tangible differences between this imagery and more conventional content. We first characterize typical and UAV content based on their human saliency maps in a high-dimensional space, encompassing saliency map statistics, distribution characteristics, and other specifically designed features. Thanks to a large amount of eye tracking data collected on UAV footage, we highlight the differences between typical and UAV videos, and more importantly, within UAV sequences. We then designed a process to extract new visual attention biases in UAV imagery, leading to the definition of a new dictionary of visual biases. Finally, we conducted a benchmark on two different datasets, whose results confirm that the 20 defined biases are relevant as a low-complexity saliency prediction system.
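The "dictionary of biases" idea reduces to regressing a measured saliency map onto a fixed set of bias maps. The sketch below uses random Gaussian blobs as stand-ins for the paper's 20 UAV-specific biases and synthetic data throughout; only the least-squares fitting logic is the point.

```python
import numpy as np

rng = np.random.default_rng(4)
H, W, N_BIAS = 36, 64, 20

ys, xs = np.mgrid[0:H, 0:W]

def gaussian(cy, cx, sy, sx):
    """Normalized anisotropic Gaussian blob, flattened to a vector."""
    g = np.exp(-((ys - cy) ** 2 / (2 * sy ** 2) + (xs - cx) ** 2 / (2 * sx ** 2)))
    return (g / g.sum()).ravel()

# hypothetical bias dictionary: random Gaussians standing in for the paper's
# 20 extracted UAV attention biases
D = np.stack([gaussian(rng.uniform(0, H), rng.uniform(0, W),
                       rng.uniform(3, 10), rng.uniform(3, 15))
              for _ in range(N_BIAS)])

ground_truth = gaussian(H / 3, W / 2, 5, 8)              # stand-in human saliency map
w, *_ = np.linalg.lstsq(D.T, ground_truth, rcond=None)   # fit the mixing weights
prediction = (w @ D).reshape(H, W)                       # low-complexity predictor
```

Once the weights are fit, predicting saliency for a new frame costs only a weighted sum of precomputed maps, which is what makes the system low-complexity.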
Collapse
|
39
|
David EJ, Lebranchu P, Perreira Da Silva M, Le Callet P. Predicting artificial visual field losses: A gaze-based inference study. J Vis 2020; 19:22. [PMID: 31868896 DOI: 10.1167/19.14.22] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Visual field defects are a worldwide concern, and the proportion of the population experiencing vision loss is ever increasing. Macular degeneration and glaucoma are among the four leading causes of permanent vision loss. Identifying and characterizing visual field losses from gaze alone could prove crucial in the future for screening tests, rehabilitation therapies, and monitoring. In this experiment, 54 participants took part in a free-viewing task of visual scenes while experiencing artificial scotomas (central and peripheral) of varying radii in a gaze-contingent paradigm. We studied the importance of a set of gaze features as predictors to best differentiate between artificial scotoma conditions. Linear mixed models were used to measure differences between scotoma conditions. Correlation and factorial analyses revealed redundancies in our data. Finally, hidden Markov models and recurrent neural networks were implemented as classifiers in order to measure the predictive usefulness of gaze features. The results show distinct saccade direction biases depending on scotoma type. We demonstrate that saccade relative angle, amplitude, and peak velocity are the best features for distinguishing between artificial scotomas in a free-viewing task. Finally, we discuss the usefulness of our protocol and analyses as a gaze-feature identifier tool that discriminates between artificial scotomas of different types and sizes.
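A sketch of the three most diagnostic gaze features follows; fixation detection and peak-velocity estimation are assumed to have happened upstream, and the classifier itself (HMM or RNN in the study) is omitted.

```python
import numpy as np

def saccade_features(fixations, peak_velocities):
    """fixations: (N, 2) fixation centers in degrees of visual angle;
    peak_velocities: per-saccade peak velocity (deg/s), length N-1.
    Returns the three features the study found most diagnostic."""
    vecs = np.diff(fixations, axis=0)                  # saccade vectors
    amplitude = np.linalg.norm(vecs, axis=1)           # saccade amplitude
    absolute_angle = np.arctan2(vecs[:, 1], vecs[:, 0])
    # relative angle: turn between successive saccades, wrapped to [-pi, pi]
    relative_angle = np.angle(np.exp(1j * np.diff(absolute_angle)))
    return amplitude, relative_angle, np.asarray(peak_velocities)
```

Per-trial summary statistics of these features (means, dispersions, direction histograms) would then form the input to whatever classifier separates the scotoma conditions.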
Collapse
Affiliation(s)
| | - Pierre Lebranchu
- University of Nantes and Nantes University Hospital, Nantes, France
| | | | | |
Collapse
|
40
|
Backhaus D, Engbert R, Rothkegel LOM, Trukenbrod HA. Task-dependence in scene perception: Head unrestrained viewing using mobile eye-tracking. J Vis 2020; 20:3. [PMID: 32392286 PMCID: PMC7409614 DOI: 10.1167/jov.20.5.3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2019] [Accepted: 12/15/2019] [Indexed: 11/24/2022] Open
Abstract
Real-world scene perception is typically studied in the laboratory using static picture viewing with restrained head position. Consequently, the transfer of results obtained in this paradigm to real-world scenarios has been questioned. The advancement of mobile eye-trackers and progress in image processing, however, permit a more natural experimental setup that, at the same time, maintains the high experimental control of the standard laboratory setting. We investigated eye movements while participants stood in front of a projector screen and explored images under four specific task instructions. Eye movements were recorded with a mobile eye-tracking device, and raw gaze data were transformed from head-centered into image-centered coordinates. We observed differences between tasks in temporal and spatial eye-movement parameters and found that the bias to fixate images near the center differed between tasks. Our results demonstrate that current mobile eye-tracking technology and a highly controlled design support the study of fine-scaled task dependencies in an experimental setting that permits more natural viewing behavior than the static picture viewing paradigm.
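The head-centered to image-centered transformation can be done with a planar homography estimated from the projector screen's corners as seen by the scene camera. The sketch below uses a textbook DLT estimate with made-up corner correspondences; the authors' actual pipeline is not specified in the abstract.

```python
import numpy as np

def homography(src, dst):
    """DLT estimate of the 3x3 homography mapping src -> dst (4+ point pairs)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    return Vt[-1].reshape(3, 3)          # null-space vector, reshaped

# hypothetical example: screen corners detected in the scene-camera frame,
# mapped to the corresponding image (pixel) coordinates
corners_cam = [(102, 88), (548, 95), (560, 421), (95, 415)]
corners_img = [(0, 0), (1280, 0), (1280, 960), (0, 960)]
H = homography(corners_cam, corners_img)

gaze_cam = np.array([330.0, 250.0, 1.0])   # raw gaze, homogeneous coordinates
u, v, w = H @ gaze_cam
print("image-centered gaze:", u / w, v / w)
```

Re-estimating the homography every frame absorbs head movement, which is what lets a mobile recording be analyzed in stable image coordinates.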
Collapse
Affiliation(s)
- Daniel Backhaus
- Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
| | - Ralf Engbert
- Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
| | | | - Hans A. Trukenbrod
- Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
| |
Collapse
|
41
|
van Renswoude DR, Raijmakers MEJ, Visser I. Looking (for) patterns: Similarities and differences between infant and adult free scene-viewing patterns. J Eye Mov Res 2020; 13:10.16910/jemr.13.1.2. [PMID: 33828784 PMCID: PMC7881888 DOI: 10.16910/jemr.13.1.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Systematic tendencies such as the center and horizontal biases are known to have a large influence on how and where we move our eyes during static onscreen free scene viewing. However, it is unknown whether these tendencies are learned viewing strategies or default tendencies in the way we move our eyes. To gain insight into their origin, we explored the systematic tendencies of infants (3- to 20-month-olds, N = 157) and adults (N = 88) in three different scene-viewing data sets. We replicated common findings, such as longer fixation durations and shorter saccade amplitudes in infants compared to adults. The leftward bias had never been studied in infants, and our results indicate that it is absent in infants, while we replicated it in adults. The general pattern of results highlights the similarity between infant and adult eye movements. Similar to adults, infants' fixation durations increase with viewing time, and the dependencies between successive fixations and saccades show very similar patterns. A straightforward conclusion from this set of studies is that infant and adult eye movements are mainly driven by similar underlying basic processes.
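The three tendencies can be summarized with simple per-trial statistics; the sketch below shows one plausible operationalization, with thresholds and normalizations that are illustrative choices, not the authors'.

```python
import numpy as np

def viewing_biases(fixations, img_w, img_h):
    """fixations: (N, 2) array of (x, y) gaze positions for one trial, in pixels.
    Returns crude summaries of the three classic systematic tendencies."""
    x, y = fixations[:, 0], fixations[:, 1]
    # center bias: mean normalized distance from screen center (0 = fully central)
    center = np.mean(np.hypot((x - img_w / 2) / (img_w / 2),
                              (y - img_h / 2) / (img_h / 2)))
    dx, dy = np.diff(x), np.diff(y)
    angles = np.degrees(np.arctan2(dy, dx))
    # horizontal bias: share of saccades within 30 degrees of the horizontal
    horizontal = np.mean((np.abs(angles) < 30) | (np.abs(angles) > 150))
    # leftward bias: whether the first saccade goes left (aggregate over trials)
    first_saccade_left = bool(dx[0] < 0)
    return {"center": center, "horizontal": horizontal,
            "first_saccade_left": first_saccade_left}
```

Comparing the distributions of these summaries between infant and adult trials is the kind of analysis that grounds the similarity claims in the abstract.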
Collapse
|
42
|
Ryan JD, Shen K, Liu Z. The intersection between the oculomotor and hippocampal memory systems: empirical developments and clinical implications. Ann N Y Acad Sci 2020; 1464:115-141. [PMID: 31617589 PMCID: PMC7154681 DOI: 10.1111/nyas.14256] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2019] [Revised: 08/29/2019] [Accepted: 09/19/2019] [Indexed: 12/28/2022]
Abstract
Decades of cognitive neuroscience research have shown that where we look is intimately connected to what we remember. In this article, we review findings from humans and nonhuman animals, drawing on behavioral, neuropsychological, neuroimaging, and computational modeling methods, to show that the oculomotor and hippocampal memory systems interact in a reciprocal manner, on a moment-to-moment basis, mediated by a vast structural and functional network. Visual exploration serves to efficiently gather information from the environment for the purpose of creating new memories, updating existing memories, and reconstructing rich, vivid details from memory. Conversely, memory increases the efficiency of visual exploration. We call for models of oculomotor control to consider the influence of the hippocampal memory system on the cognitive control of eye movements, and for models of hippocampal and broader medial temporal lobe function to consider the influence of the oculomotor system on the development and expression of memory. We describe eye-movement-based applications for the detection of neurodegeneration and the delivery of therapeutic interventions for mental health disorders in which the hippocampus is implicated and memory dysfunctions are at the forefront.
Collapse
Affiliation(s)
- Jennifer D. Ryan
- Rotman Research Institute, Baycrest, Toronto, Ontario, Canada
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada
- Department of Psychiatry, University of Toronto, Toronto, Ontario, Canada
| | - Kelly Shen
- Rotman Research Institute, Baycrest, Toronto, Ontario, Canada
| | - Zhong‐Xu Liu
- Department of Behavioral Sciences, University of Michigan-Dearborn, Dearborn, Michigan
| |
Collapse
|
43
|
Hoang K, Pitti A, Goudou JF, Dufour JY, Gaussier P. Active vision: on the relevance of a bio-inspired approach for object detection. BIOINSPIRATION & BIOMIMETICS 2020; 15:025003. [PMID: 31639780 DOI: 10.1088/1748-3190/ab504c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Starting from biological systems, we review the relevance of active perception for object recognition in an autonomous system. Foveated vision and control of eye saccades bring strong benefits related to the differentiation of a 'what' pathway, which recognizes local parts of the image, and a 'where' pathway, which moves the fovea to those parts. Experiments on a dataset illustrate the capability of our model to deal with complex visual scenes. The results highlight the value of top-down contextual information for serializing exploration and performing a kind of hypothesis test. Moreover, learning to control each ocular saccade from the previous one can help reduce the exploration area and improve recognition performance. Yet our results show that the selection of the next saccade should take broader statistical information into account. This opens new avenues for the control of ocular saccades and the active exploration of complex visual scenes.
Collapse
Affiliation(s)
- Kevin Hoang
- ETIS UMR 8051/ENSEA, University of Cergy-Pontoise, France
- Thales SIX GTS, Vision and Sensing Laboratory, Palaiseau, France
| | | | | | | | | |
Collapse
|
44
|
Le Moan S, Pedersen M. A Three-Feature Model to Predict Colour Change Blindness. Vision (Basel) 2019; 3:vision3040061. [PMID: 31735862 PMCID: PMC6969898 DOI: 10.3390/vision3040061] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2019] [Revised: 10/22/2019] [Accepted: 11/06/2019] [Indexed: 11/16/2022] Open
Abstract
Change blindness is a striking shortcoming of our visual system, exploited in the popular 'Spot the difference' game, as it makes us unable to notice large visual changes happening right before our eyes. Change blindness illustrates the fact that we see much less than we think we do. In this paper, we introduce a fully automated model to predict colour change blindness in cartoon images based on image complexity, change magnitude, and observer experience. Using linear regression with only three parameters, the model's predictions correlate significantly with measured detection times. We also demonstrate the efficacy of the model in classifying stimuli by difficulty.
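The model's form, a linear regression on three predictors, is easy to reproduce. The sketch below fits it by ordinary least squares on made-up data, since the paper's stimuli and measured detection times are not given here.

```python
import numpy as np

# hypothetical training data: one row per (image, change) stimulus, columns are
# image complexity, change magnitude, and observer experience (0/1)
X = np.array([[0.62, 4.1, 1], [0.35, 9.8, 0], [0.71, 2.3, 1],
              [0.20, 12.5, 1], [0.55, 6.0, 0], [0.80, 3.3, 0]], float)
t = np.array([14.2, 4.1, 21.7, 2.9, 8.8, 18.5])    # detection times in seconds

A = np.column_stack([X, np.ones(len(X))])          # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, t, rcond=None)       # ordinary least squares fit

predict = lambda x: np.append(x, 1.0) @ coef
print("predicted detection time:", predict([0.5, 5.0, 1]))
```

Thresholding the predicted time then gives the easy/hard stimulus classification the paper reports.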
Collapse
Affiliation(s)
- Steven Le Moan
- Department of Mechanical and Electrical Engineering, Massey University, 4410 Palmerston North, New Zealand
| | - Marius Pedersen
- Department of Computer Science, Norwegian University of Science and Technology, 2815 Gjøvik, Norway
| |
Collapse
|
45
|
Schütt HH, Rothkegel LOM, Trukenbrod HA, Engbert R, Wichmann FA. Disentangling bottom-up versus top-down and low-level versus high-level influences on eye movements over time. J Vis 2019; 19:1. [PMID: 30821809 DOI: 10.1167/19.3.1] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Bottom-up and top-down as well as low-level and high-level factors influence where we fixate when viewing natural scenes. However, the importance of each of these factors and how they interact remains a matter of debate. Here, we disentangle these factors by analyzing their influence over time. For this purpose, we develop a saliency model that is based on the internal representation of a recent early spatial vision model to measure the low-level, bottom-up factor. To measure the influence of high-level, bottom-up features, we use a recent deep neural network-based saliency model. To account for top-down influences, we evaluate the models on two large data sets with different tasks: first, a memorization task and, second, a search task. Our results lend support to a separation of visual scene exploration into three phases: the first saccade, an initial guided exploration characterized by a gradual broadening of the fixation density, and a steady state that is reached after roughly 10 fixations. Saccade-target selection during the initial exploration and in the steady state is related to similar areas of interest, which are better predicted when including high-level features. In the search data set, fixation locations are determined predominantly by top-down processes. In contrast, the first fixation follows a different fixation density and contains a strong central fixation bias. Nonetheless, first fixations are guided strongly by image properties, and as early as 200 ms after image onset, fixations are better predicted by high-level information. We conclude that any low-level, bottom-up factors are mainly limited to the generation of the first saccade. All saccades are better explained when high-level features are considered, and later, this high-level, bottom-up control can be overruled by top-down influences.
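One way to compare low-level and high-level saliency models over time, in the spirit of this analysis, is to score fixated locations under each model's density separately at each fixation index; a sketch follows, noting that the paper's actual evaluation metric may differ.

```python
import numpy as np

def loglik_by_fixation_index(scanpaths, density):
    """scanpaths: list of (T, 2) integer (x, y) fixation arrays for one image;
    density: (H, W) model prediction normalized to sum to 1. Returns the mean
    log-likelihood of fixated locations at each fixation index."""
    max_t = max(len(s) for s in scanpaths)
    curve = []
    for t in range(max_t):
        ll = [np.log(density[s[t][1], s[t][0]] + 1e-12)   # index as [row, col]
              for s in scanpaths if len(s) > t]
        curve.append(np.mean(ll))
    return np.array(curve)
```

Plotting this curve for a low-level model against a high-level model would show the crossover the abstract describes: low-level features matter most for the first saccade, high-level features thereafter.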
Collapse
Affiliation(s)
- Heiko H Schütt
- Neural Information Processing Group, Universität Tübingen, Tübingen, Germany
- Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
| | - Lars O M Rothkegel
- Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
| | - Hans A Trukenbrod
- Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
| | - Ralf Engbert
- Experimental and Biological Psychology and Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
| | - Felix A Wichmann
- Neural Information Processing Group, Universität Tübingen, Tübingen, Germany
| |
Collapse
|
46
|
Xia C, Han J, Qi F, Shi G. Predicting Human Saccadic Scanpaths Based on Iterative Representation Learning. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:3502-3515. [PMID: 30735998 DOI: 10.1109/tip.2019.2897966] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Visual attention is a dynamic process of scene exploration and information acquisition. However, existing research on attention modeling has concentrated on estimating static salient locations. In contrast, the dynamic attributes presented by saccades have not been well explored in previous attention models. In this paper, we address the problem of saccadic scanpath prediction by introducing an iterative representation learning framework. Within this framework, a saccade can be interpreted as an iterative process of predicting the next fixation from the current representation and then updating the representation based on the gaze shift. In the prediction phase, we propose a Bayesian definition of the saccade that combines the influence of perceptual residual and spatial location on the selection of fixations. In implementation, we compute the representation error of an autoencoder-based network to measure the perceptual residual of each area. Simultaneously, we integrate saccade amplitude and a center-weighted mechanism to model the influence of spatial location. Based on these two estimates, the final fixation is defined as the point with the largest posterior probability of gaze shift. In the updating phase, we update the representation pattern for the subsequent calculation by retraining the network with samples extracted around the current fixation. In experiments, the proposed model replicates fundamental psychophysical properties of visual search. In addition, it achieves superior performance on several benchmark eye-tracking data sets.
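The predict-then-update loop can be sketched with cheap stand-ins: a global residual map in place of the autoencoder's representation error, a saccade-amplitude prior and a center weight for spatial location, and a local residual reduction in place of network retraining. All kernels and constants below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
H = W = 48
image = rng.random((H, W))
ys, xs = np.mgrid[0:H, 0:W]

residual = np.abs(image - image.mean())        # stand-in for autoencoder error
center = np.exp(-((ys - H / 2) ** 2 + (xs - W / 2) ** 2) / (2 * (H / 3) ** 2))

fix = (H // 2, W // 2)
scanpath = [fix]
for _ in range(8):
    # spatial prior: a preferred saccade amplitude around the current fixation
    dist = np.hypot(ys - fix[0], xs - fix[1])
    amplitude_prior = np.exp(-((dist - 8.0) ** 2) / (2 * 3.0 ** 2))
    posterior = residual * amplitude_prior * center
    fix = np.unravel_index(posterior.argmax(), posterior.shape)
    scanpath.append(fix)
    # "retraining" stand-in: the representation improves near the new fixation,
    # so the residual (and future attractiveness) of that region drops
    residual *= 1 - np.exp(-((ys - fix[0]) ** 2 + (xs - fix[1]) ** 2) / 30.0)
```

The residual decay doubles as an inhibition-of-return mechanism, which is why the simulated scanpath keeps exploring rather than revisiting the same maximum.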
Collapse
|
47
|
Morita K, Miura K, Fujimoto M, Yamamori H, Yasuda Y, Kudo N, Azechi H, Okada N, Koshiyama D, Ikeda M, Kasai K, Hashimoto R. Eye movement abnormalities and their association with cognitive impairments in schizophrenia. Schizophr Res 2019; 209:255-262. [PMID: 30661730 DOI: 10.1016/j.schres.2018.12.051] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/15/2018] [Revised: 12/28/2018] [Accepted: 12/28/2018] [Indexed: 12/26/2022]
Abstract
BACKGROUND: Eye movement abnormalities have been identified in schizophrenia; however, their relevance to cognition remains unknown. In this study, we explored the general relationship between eye movements and cognitive function. METHODS: Three eye movement measures (scanpath length, horizontal position gain, and duration of fixations) that were previously reported to be useful in distinguishing subjects with schizophrenia from healthy subjects, as well as Wechsler Adult Intelligence Scale-III (WAIS-III) scores, were collected and tested for association in 113 subjects with schizophrenia and 404 healthy subjects. RESULTS: Scanpath length was positively correlated with matrix reasoning and digit symbol coding in subjects with schizophrenia, and with vocabulary and symbol search in healthy subjects. Upon testing for interaction effects of diagnosis and scanpath length on the correlated WAIS-III scores, a significant interaction effect was observed only for matrix reasoning. The positive correlation between scanpath length and matrix reasoning, which was specific to subjects with schizophrenia, remained significant after controlling for confounders such as medication and negative symptoms. No correlation was observed between the two other eye movement measures and any of the WAIS-III scores. CONCLUSIONS: Herein, we report novel findings on the association between eye-movement-based measures of visual exploration and cognitive scores requiring visual search in subjects with schizophrenia and in healthy subjects. The association between scanpath length and matrix reasoning, a measure of perceptual organization, in subjects with schizophrenia implies the existence of common cognitive processes, and subjects with longer scanpath lengths may have an advantage in performing perceptual organization tasks.
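The slope-difference question in such studies reduces to an interaction term in an ordinary regression. The sketch below shows that logic only; it omits the study's covariate adjustments and significance testing.

```python
import numpy as np

def interaction_ols(scanpath_len, score, is_scz):
    """OLS of a cognitive score on scanpath length, diagnosis (0/1), and their
    product; the interaction coefficient captures whether the scanpath-score
    slope differs between diagnostic groups."""
    z = (scanpath_len - scanpath_len.mean()) / scanpath_len.std()  # standardize
    X = np.column_stack([np.ones_like(z), z, is_scz, z * is_scz])
    beta, *_ = np.linalg.lstsq(X, score, rcond=None)
    return dict(zip(["intercept", "scanpath", "diagnosis", "interaction"], beta))
```

A nonzero interaction coefficient (tested against its standard error in a full analysis) is what licenses the claim that the correlation is specific to the schizophrenia group.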
Collapse
Affiliation(s)
- Kentaro Morita
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo 1138655, Japan
| | - Kenichiro Miura
- Department of Integrative Brain Science, Graduate School of Medicine, Kyoto University, Konoe-cho, Yoshida, Kyoto, Kyoto 6068501, Japan.
| | - Michiko Fujimoto
- Department of Psychiatry, Osaka University Graduate School of Medicine, D3, 2-2, Yamadaoka, Suita, Osaka 5650871, Japan; Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, 4-1-1, Ogawahigashi, Kodaira, Tokyo 1878553, Japan
| | - Hidenaga Yamamori
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, 4-1-1, Ogawahigashi, Kodaira, Tokyo 1878553, Japan; Japan Community Health Care Organization Osaka Hospital, 4-2-78, Fukushima, Fukushima-ku, Osaka-city, Osaka 5530033, Japan
| | - Yuka Yasuda
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, 4-1-1, Ogawahigashi, Kodaira, Tokyo 1878553, Japan; Life Grow Brilliant Mental Clinic, Takahashi Bldg. 7F, 2-1-21, Shibata, Kita-ku, Osaka-city, Osaka 5300012, Japan
| | - Noriko Kudo
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, 4-1-1, Ogawahigashi, Kodaira, Tokyo 1878553, Japan
| | - Hirotsugu Azechi
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, 4-1-1, Ogawahigashi, Kodaira, Tokyo 1878553, Japan
| | - Naohiro Okada
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo 1138655, Japan; The International Research Center for Neurointelligence (WPI-IRCN) at The University of Tokyo Institutes for Advanced Study (UTIAS), The University of Tokyo, 7-3-1, Hongo, Tokyo 1138655, Japan
| | - Daisuke Koshiyama
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo 1138655, Japan
| | - Manabu Ikeda
- Department of Psychiatry, Osaka University Graduate School of Medicine, D3, 2-2, Yamadaoka, Suita, Osaka 5650871, Japan
| | - Kiyoto Kasai
- Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo 1138655, Japan; The International Research Center for Neurointelligence (WPI-IRCN) at The University of Tokyo Institutes for Advanced Study (UTIAS), The University of Tokyo, 7-3-1, Hongo, Tokyo 1138655, Japan
| | - Ryota Hashimoto
- Department of Pathology of Mental Diseases, National Institute of Mental Health, National Center of Neurology and Psychiatry, 4-1-1, Ogawahigashi, Kodaira, Tokyo 1878553, Japan; Osaka University, D3, 2-2, Yamadaoka, Suita, Osaka 5650871, Japan.
| |
Collapse
|
48
|
Trukenbrod HA, Barthelmé S, Wichmann FA, Engbert R. Spatial statistics for gaze patterns in scene viewing: Effects of repeated viewing. J Vis 2019; 19:5. [PMID: 31173630 DOI: 10.1167/19.6.5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Scene viewing is used to study attentional selection in complex but still controlled environments. One of the main observations on eye movements during scene viewing is the inhomogeneous distribution of fixation locations: While some parts of an image are fixated by almost all observers and are inspected repeatedly by the same observer, other image parts remain unfixated by observers even after long exploration intervals. Here, we apply spatial point process methods to investigate the relationship between pairs of fixations. More precisely, we use the pair correlation function, a powerful statistical tool, to evaluate dependencies between fixation locations along individual scanpaths. We demonstrate that aggregation of fixation locations within 4° is stronger than expected from chance. Furthermore, the pair correlation function reveals stronger aggregation of fixations when the same image is presented a second time. We use simulations of a dynamical model to show that a narrower spatial attentional span may explain differences in pair correlations between the first and the second inspection of the same image.
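The pair correlation function itself is standard spatial statistics. A naive estimator looks like the sketch below; it ignores edge correction, which a careful analysis (as in the paper) would handle.

```python
import numpy as np

def pair_correlation(points, area, r_edges):
    """Naive estimate of the pair correlation function g(r) for a 2-D point
    pattern. points: (N, 2) fixation locations; area: observation window area;
    r_edges: histogram bin edges for inter-point distance r."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d = d[np.triu_indices(n, k=1)]                 # each unordered pair once
    counts, _ = np.histogram(d, bins=r_edges)
    r_mid = 0.5 * (r_edges[:-1] + r_edges[1:])
    ring_area = np.pi * (r_edges[1:] ** 2 - r_edges[:-1] ** 2)
    density = n / area
    # expected unordered pair count under complete spatial randomness
    expected = 0.5 * n * density * ring_area
    return r_mid, counts / expected                # g(r) > 1 means aggregation
```

Values of g(r) above 1 at small r correspond to the within-4-degree aggregation the abstract reports; comparing curves from first and second viewings exposes the repetition effect.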
Collapse
Affiliation(s)
| | - Simon Barthelmé
- Centre National de la Recherche Scientifique, Gipsa-lab, Grenoble Institut National Polytechnique, France
| | - Felix A Wichmann
- Eberhard Karls University of Tübingen, Tübingen, Germany
- Bernstein Center for Computational Neuroscience Tübingen, Tübingen, Germany
- Max Planck Institute for Intelligent Systems, Tübingen, Germany
| | | |
Collapse
|
49
|
Rothkegel LOM, Schütt HH, Trukenbrod HA, Wichmann FA, Engbert R. Searchers adjust their eye-movement dynamics to target characteristics in natural scenes. Sci Rep 2019; 9:1635. [PMID: 30733470 PMCID: PMC6367441 DOI: 10.1038/s41598-018-37548-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2018] [Accepted: 12/07/2018] [Indexed: 11/30/2022] Open
Abstract
When searching for a target in a natural scene, both the target's visual properties and its similarity to the background have been shown to influence whether and how fast humans are able to find it. So far, however, it has been unclear whether searchers adjust the dynamics of their eye movements (e.g., fixation durations, saccade amplitudes) to the target they search for. In our experiment, participants searched natural scenes for six artificial targets with different spatial frequency content throughout eight consecutive sessions. High-spatial-frequency targets led to smaller saccade amplitudes and shorter fixation durations than low-spatial-frequency targets if target identity was known. If a saccade was programmed in the same direction as the previous saccade, fixation durations and successive saccade amplitudes were not influenced by target type. Visual saliency and empirical fixation density at the endpoints of saccades that maintained direction were comparatively low, indicating that these saccades were less selective. Our results suggest that searchers adjust their eye movement dynamics to the search target efficiently, since previous research has shown that low spatial frequencies are visible farther into the periphery than high spatial frequencies. We interpret the saccade direction specificity of our effects as reflecting an underlying separation into a default scanning mechanism and a selective, target-dependent mechanism.
Collapse
Affiliation(s)
- Lars O M Rothkegel
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany.
| | - Heiko H Schütt
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany
- Neural Information Processing Group, University of Tübingen, Sand 6, 72076, Tübingen, Germany
| | - Hans A Trukenbrod
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany
| | - Felix A Wichmann
- Neural Information Processing Group, University of Tübingen, Sand 6, 72076, Tübingen, Germany
- Max Planck Institute for Intelligent Systems, Max-Planck-Ring 4, 72076, Tübingen, Germany
| | - Ralf Engbert
- Department of Psychology, University of Potsdam, Karl-Liebknechtstraße 24/25, 14476, Potsdam, Germany
| |
Collapse
|
50
|
Kasprowski P, Harezlak K, Skurowski P. Implicit Calibration Using Probable Fixation Targets. SENSORS 2019; 19:s19010216. [PMID: 30626162 PMCID: PMC6339230 DOI: 10.3390/s19010216] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/08/2018] [Revised: 12/13/2018] [Accepted: 12/25/2018] [Indexed: 11/16/2022]
Abstract
Proper calibration of the eye movement signal registered by an eye tracker is one of the main challenges in popularizing eye trackers as yet another user-input device. Classic calibration methods, which take time and impose unnatural behavior on the eyes, must be replaced by intelligent methods able to calibrate the signal without conscious cooperation from the user. Such implicit calibration requires some knowledge about the stimulus a user is looking at and uses this information to predict probable gaze targets. This paper describes one possible method of implicit calibration: it starts by finding probable fixation targets (PFTs) and then uses these targets to build a mapping from the recorded signal to the probable gaze path. Various algorithms that may be used for finding PFTs and mappings are presented, and errors are calculated using two data sets registered with two different types of eye trackers. The results show that although implicit calibration currently yields worse results than classic calibration, it may become comparable with it and is already sufficient for some applications.
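Once PFTs have been paired with raw gaze samples, the mapping step is a regression. The sketch below fits the second-order polynomial form commonly used for eye tracker calibration; the paper's own mapping algorithms may differ, and the PFT-matching step is assumed to have happened upstream.

```python
import numpy as np

def fit_implicit_calibration(raw, targets):
    """raw: (N, 2) uncalibrated gaze samples; targets: (N, 2) probable fixation
    targets (PFTs) assumed to correspond to those samples. Fits a second-order
    polynomial mapping by least squares and returns it as a function."""
    x, y = raw[:, 0], raw[:, 1]
    A = np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])
    coef, *_ = np.linalg.lstsq(A, targets, rcond=None)   # (6, 2) coefficients
    def apply(g):
        gx, gy = g[:, 0], g[:, 1]
        B = np.column_stack([np.ones_like(gx), gx, gy, gx * gy, gx ** 2, gy ** 2])
        return B @ coef
    return apply
```

Because the PFT correspondences are only probable rather than certain, a robust variant (e.g., iteratively discarding the worst-fitting pairs) would be the natural next step.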
Collapse
Affiliation(s)
- Pawel Kasprowski
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.
| | - Katarzyna Harezlak
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.
| | - Przemysław Skurowski
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.
| |
Collapse
|