1. Schwetlick L, Kümmerer M, Bethge M, Engbert R. Potsdam data set of eye movement on natural scenes (DAEMONS). Front Psychol 2024; 15:1389609. [PMID: 38800681 PMCID: PMC11116805 DOI: 10.3389/fpsyg.2024.1389609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 04/16/2024] [Indexed: 05/29/2024]
Affiliation(s)
- Lisa Schwetlick
- Department of Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
- Matthias Bethge
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
- Ralf Engbert
- Department of Experimental and Biological Psychology, University of Potsdam, Potsdam, Germany
2. Burlingham CS, Sendhilnathan N, Komogortsev O, Murdison TS, Proulx MJ. Motor "laziness" constrains fixation selection in real-world tasks. Proc Natl Acad Sci U S A 2024; 121:e2302239121. [PMID: 38470927 PMCID: PMC10962974 DOI: 10.1073/pnas.2302239121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Accepted: 02/02/2024] [Indexed: 03/14/2024]
Abstract
Humans coordinate their eye, head, and body movements to gather information from a dynamic environment while maximizing reward and minimizing biomechanical and energetic costs. However, such natural behavior is not possible in traditional experiments employing head/body restraints and artificial, static stimuli. Therefore, it is unclear to what extent mechanisms of fixation selection discovered in lab studies, such as inhibition-of-return (IOR), influence everyday behavior. To address this gap, participants performed nine real-world tasks, including driving, visually searching for an item, and building a Lego set, while wearing a mobile eye tracker (169 recordings; 26.6 h). Surprisingly, in all tasks, participants most often returned to what they just viewed and saccade latencies were shorter preceding return than forward saccades, i.e., consistent with facilitation, rather than inhibition, of return. We hypothesize that conservation of eye and head motor effort ("laziness") contributes. Correspondingly, we observed center biases in fixation position and duration relative to the head's orientation. A model that generates scanpaths by randomly sampling these distributions reproduced all return phenomena we observed, including distinct 3-fixation sequences for forward versus return saccades. After controlling for orbital eccentricity, one task (building a Lego set) showed evidence for IOR. This, along with small discrepancies between model and data, indicates that the brain balances minimization of motor costs with maximization of rewards (e.g., accomplished by IOR and other mechanisms) and that the optimal balance varies according to task demands. Supporting this account, the orbital range of motion used in each task traded off lawfully with fixation duration.
Affiliation(s)
- Charlie S. Burlingham
- Reality Labs Research, Meta Platforms Inc., Redmond, WA 98052
- Department of Psychology, New York University, New York, NY 10003
- Oleg Komogortsev
- Reality Labs Research, Meta Platforms Inc., Redmond, WA 98052
- Department of Computer Science, Texas State University, San Marcos, TX 78666
3. Roth N, Rolfs M, Hellwich O, Obermayer K. Objects guide human gaze behavior in dynamic real-world scenes. PLoS Comput Biol 2023; 19:e1011512. [PMID: 37883331 PMCID: PMC10602265 DOI: 10.1371/journal.pcbi.1011512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 09/12/2023] [Indexed: 10/28/2023]
Abstract
The complexity of natural scenes makes it challenging to experimentally study the mechanisms behind human gaze behavior when viewing dynamic environments. Historically, eye movements were believed to be driven primarily by space-based attention towards locations with salient features. Increasing evidence suggests, however, that visual attention does not select locations with high saliency but operates on attentional units given by the objects in the scene. We present a new computational framework to investigate the importance of objects for attentional guidance. This framework is designed to simulate realistic scanpaths for dynamic real-world scenes, including saccade timing and smooth pursuit behavior. Individual model components are based on psychophysically uncovered mechanisms of visual attention and saccadic decision-making. All mechanisms are implemented in a modular fashion with a small number of well-interpretable parameters. To systematically analyze the importance of objects in guiding gaze behavior, we implemented five different models within this framework: two purely spatial models, where one is based on low-level saliency and one on high-level saliency, two object-based models, with one incorporating low-level saliency for each object and the other one not using any saliency information, and a mixed model with object-based attention and selection but space-based inhibition of return. We optimized each model's parameters to reproduce the saccade amplitude and fixation duration distributions of human scanpaths using evolutionary algorithms. We compared model performance with respect to spatial and temporal fixation behavior, including the proportion of fixations exploring the background, as well as detecting, inspecting, and returning to objects. 
A model with object-based attention and inhibition, which uses saliency information to prioritize between objects for saccadic selection, leads to scanpath statistics with the highest similarity to the human data. This demonstrates that scanpath models benefit from object-based attention and selection, suggesting that object-level attentional units play an important role in guiding attentional processing.
Affiliation(s)
- Nicolas Roth
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany
- Martin Rolfs
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Department of Psychology, Humboldt-Universität zu Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Olaf Hellwich
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Institute of Computer Engineering and Microelectronics, Technische Universität Berlin, Germany
- Klaus Obermayer
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
4. Kümmerer M, Bethge M. Predicting Visual Fixations. Annu Rev Vis Sci 2023; 9:269-291. [PMID: 37419107 DOI: 10.1146/annurev-vision-120822-072528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/09/2023]
Abstract
As we navigate and behave in the world, we are constantly deciding, a few times per second, where to look next. The outcomes of these decisions in response to visual input are comparatively easy to measure as trajectories of eye movements, offering insight into many unconscious and conscious visual and cognitive processes. In this article, we review recent advances in predicting where we look. We focus on evaluating and comparing models: How can we consistently measure how well models predict eye movements, and how can we judge the contribution of different mechanisms? Probabilistic models facilitate a unified approach to fixation prediction that allows us to use explainable information explained to compare different models across different settings, such as static and video saliency, as well as scanpath prediction. We review how the large variety of saliency maps and scanpath models can be translated into this unifying framework, how much different factors contribute, and how we can select the most informative examples for model comparison. We conclude that the universal scale of information gain offers a powerful tool for the inspection of candidate mechanisms and experimental design that helps us understand the continual decision-making process that determines where we look.
Affiliation(s)
- Matthias Bethge
- Tübingen AI Center, University of Tübingen, Tübingen, Germany
5. Borovska P, de Haas B. Faces in scenes attract rapid saccades. J Vis 2023; 23:11. [PMID: 37552021 PMCID: PMC10411644 DOI: 10.1167/jov.23.8.11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 06/29/2023] [Indexed: 08/09/2023]
Abstract
During natural vision, the human visual system has to process upcoming eye movements in parallel to currently fixated stimuli. Saccades targeting isolated faces are known to have lower latency and higher velocity, but it is unclear how this generalizes to the natural cycle of saccades and fixations during free-viewing of complex scenes. To which degree can the visual system process high-level features of extrafoveal stimuli when they are embedded in visual clutter and compete with concurrent foveal input? Here, we investigated how free-viewing dynamics vary as a function of an upcoming fixation target while controlling for various low-level factors. We found strong evidence that face- versus inanimate object-directed saccades are preceded by shorter fixations and have higher peak velocity. Interestingly, the boundary conditions for these two effects are dissociated. The effect on fixation duration was limited to face saccades, which were small and followed the trajectory of the preceding one, early in a trial. This is reminiscent of a recently proposed model of perisaccadic retinotopic shifts of attention. The effect on saccadic velocity, however, extended to very large saccades and increased with trial duration. These findings suggest that multiple, independent mechanisms interact to process high-level features of extrafoveal targets and modulate the dynamics of natural vision.
Affiliation(s)
- Petra Borovska
- Experimental Psychology, Justus Liebig University, Giessen, Germany
- Benjamin de Haas
- Experimental Psychology, Justus Liebig University, Giessen, Germany
6. Kucharský Š, Zaharieva M, Raijmakers M, Visser I. Habituation, part II. Rethinking the habituation paradigm. Infant and Child Development 2022. [DOI: 10.1002/icd.2383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Šimon Kucharský
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Department of Psychological Methods, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Martina Zaharieva
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Research Institute of Child Development and Education, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Maartje Raijmakers
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Department of Educational Studies and Learn!, Faculty of Behavioral and Movement Sciences, Free University Amsterdam, Amsterdam, the Netherlands
- Ingmar Visser
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, the Netherlands
- Amsterdam Brain & Cognition (ABC), University of Amsterdam, Amsterdam, the Netherlands
7. Zhang M, Armendariz M, Xiao W, Rose O, Bendtz K, Livingstone M, Ponce C, Kreiman G. Look twice: A generalist computational model predicts return fixations across tasks and species. PLoS Comput Biol 2022; 18:e1010654. [PMID: 36413523 PMCID: PMC9681066 DOI: 10.1371/journal.pcbi.1010654] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/22/2021] [Accepted: 10/13/2022] [Indexed: 11/23/2022]
Abstract
Primates constantly explore their surroundings via saccadic eye movements that bring different parts of an image into high resolution. In addition to exploring new regions in the visual field, primates also make frequent return fixations, revisiting previously foveated locations. We systematically studied a total of 44,328 return fixations out of 217,440 fixations. Return fixations were ubiquitous across different behavioral tasks, in monkeys and humans, both when subjects viewed static images and when subjects performed natural behaviors. Return fixation locations were consistent across subjects, tended to occur within short temporal offsets, and typically followed a 180-degree turn in saccadic direction. To understand the origin of return fixations, we propose a proof-of-principle, biologically-inspired and image-computable neural network model. The model combines five key modules: an image feature extractor, bottom-up saliency cues, task-relevant visual features, finite inhibition-of-return, and saccade size constraints. Even though there are no free parameters that are fine-tuned for each specific task, species, or condition, the model produces fixation sequences resembling the universal properties of return fixations. These results provide initial steps towards a mechanistic understanding of the trade-off between rapid foveal recognition and the need to scrutinize previous fixation locations.
Affiliation(s)
- Mengmi Zhang
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
- CFAR and I2R, Agency for Science, Technology and Research, Singapore
- Marcelo Armendariz
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
- Laboratory for Neuro- and Psychophysiology, KU Leuven, Leuven, Belgium
- Will Xiao
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- Olivia Rose
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- Katarina Bendtz
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
- Margaret Livingstone
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- Carlos Ponce
- Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America
- Gabriel Kreiman
- Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Center for Brains, Minds and Machines, Cambridge, Massachusetts, United States of America
8. Kümmerer M, Bethge M, Wallis TSA. DeepGaze III: Modeling free-viewing human scanpaths with deep learning. J Vis 2022; 22:7. [PMID: 35472130 PMCID: PMC9055565 DOI: 10.1167/jov.22.5.7] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 11/24/2022]
Abstract
Humans typically move their eyes in “scanpaths” of fixations linked by saccades. Here we present DeepGaze III, a new model that predicts the spatial location of consecutive fixations in a free-viewing scanpath over static images. DeepGaze III is a deep learning–based model that combines image information with information about the previous fixation history to predict where a participant might fixate next. As a high-capacity and flexible model, DeepGaze III captures many relevant patterns in the human scanpath data, setting a new state of the art in the MIT300 dataset and thereby providing insight into how much information in scanpaths across observers exists in the first place. We use this insight to assess the importance of mechanisms implemented in simpler, interpretable models for fixation selection. Due to its architecture, DeepGaze III allows us to disentangle several factors that play an important role in fixation selection, such as the interplay of scene content and scanpath history. The modular nature of DeepGaze III allows us to conduct ablation studies, which show that scene content has a stronger effect on fixation selection than previous scanpath history in our main dataset. In addition, we can use the model to identify scenes for which the relative importance of these sources of information differs most. These data-driven insights would be difficult to accomplish with simpler models that do not have the computational capacity to capture such patterns, demonstrating an example of how deep learning advances can be used to contribute to scientific understanding.
Affiliation(s)
- Thomas S A Wallis
- Technical University of Darmstadt, Institute of Psychology and Centre for Cognitive Science, Darmstadt, Germany
9. Cajar A, Engbert R, Laubrock J. Potsdam Eye-Movement Corpus for Scene Memorization and Search With Color and Spatial-Frequency Filtering. Front Psychol 2022; 13:850482. [PMID: 35282209 PMCID: PMC8904922 DOI: 10.3389/fpsyg.2022.850482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Accepted: 01/31/2022] [Indexed: 11/24/2022]
Affiliation(s)
- Anke Cajar
- Department of Psychology and Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Ralf Engbert
- Department of Psychology and Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Jochen Laubrock
- Department of Psychology and Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Medizinische Hochschule Brandenburg Theodor Fontane, Neuruppin, Germany
10. Engbert R, Rabe MM, Schwetlick L, Seelig SA, Reich S, Vasishth S. Data assimilation in dynamical cognitive science. Trends Cogn Sci 2022; 26:99-102. [PMID: 34972646 DOI: 10.1016/j.tics.2021.11.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/04/2021] [Revised: 11/23/2021] [Accepted: 11/24/2021] [Indexed: 11/26/2022]
Abstract
Dynamical models make specific assumptions about cognitive processes that generate human behavior. In data assimilation, these models are tested against time-ordered data. Recent progress on Bayesian data assimilation demonstrates that this approach combines the strengths of statistical modeling of individual differences with those of dynamical cognitive models.
Affiliation(s)
- Ralf Engbert
- Department of Psychology, University of Potsdam, Potsdam, Germany; Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany.
- Lisa Schwetlick
- Department of Psychology, University of Potsdam, Potsdam, Germany
- Stefan A Seelig
- Department of Psychology, University of Potsdam, Potsdam, Germany
- Sebastian Reich
- Institute of Mathematics, University of Potsdam, Potsdam, Germany; Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Shravan Vasishth
- Department of Linguistics, University of Potsdam, Potsdam, Germany; Research Focus Cognitive Sciences, University of Potsdam, Potsdam, Germany
11. Malem-Shinitski N, Opper M, Reich S, Schwetlick L, Seelig SA, Engbert R. A mathematical model of local and global attention in natural scene viewing. PLoS Comput Biol 2020; 16:e1007880. [PMID: 33315888 PMCID: PMC7769622 DOI: 10.1371/journal.pcbi.1007880] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/09/2020] [Revised: 12/28/2020] [Accepted: 10/23/2020] [Indexed: 11/24/2022]
Abstract
Understanding the decision process underlying gaze control is an important question in cognitive neuroscience with applications in diverse fields ranging from psychology to computer vision. The decision for choosing an upcoming saccade target can be framed as a selection process between two states: Should the observer further inspect the information near the current gaze position (local attention) or continue with exploration of other patches of the given scene (global attention)? Here we propose and investigate a mathematical model motivated by switching between these two attentional states during scene viewing. The model is derived from a minimal set of assumptions that generates realistic eye movement behavior. We implemented a Bayesian approach for model parameter inference based on the model's likelihood function. In order to simplify the inference, we applied data augmentation methods that allowed the use of conjugate priors and the construction of an efficient Gibbs sampler. This approach turned out to be numerically efficient and permitted fitting interindividual differences in saccade statistics. Thus, the main contribution of our modeling approach is two-fold; first, we propose a new model for saccade generation in scene viewing. Second, we demonstrate the use of novel methods from Bayesian inference in the field of scan path modeling.
Affiliation(s)
- Manfred Opper
- Department of Artificial Intelligence, Technische Universität Berlin, Berlin, Germany
- Sebastian Reich
- Institute of Mathematics, University of Potsdam, Potsdam, Germany
- Lisa Schwetlick
- Department of Psychology, University of Potsdam, Potsdam, Germany
- Stefan A. Seelig
- Department of Psychology, University of Potsdam, Potsdam, Germany
- Ralf Engbert
- Department of Psychology, University of Potsdam, Potsdam, Germany