1
Mohamed Selim A, Barz M, Bhatti OS, Alam HMT, Sonntag D. A review of machine learning in scanpath analysis for passive gaze-based interaction. Front Artif Intell 2024; 7:1391745. PMID: 38903158; PMCID: PMC11188426; DOI: 10.3389/frai.2024.1391745.
Abstract
The scanpath is an important concept in eye tracking. It refers to a person's eye movements over a period of time, commonly represented as a series of alternating fixations and saccades. Machine learning has been increasingly used for the automatic interpretation of scanpaths over the past few years, particularly in research on passive gaze-based interaction, i.e., interfaces that implicitly observe and interpret human eye movements, with the goal of improving the interaction. This literature review investigates research on machine learning applications in scanpath analysis for passive gaze-based interaction between 2012 and 2022, starting from 2,425 publications and focusing on 77 publications. We provide insights into research domains and common learning tasks in passive gaze-based interaction and present common machine learning practices from data collection and preparation to model selection and evaluation. We discuss commonly followed practices and identify gaps and challenges, especially concerning emerging machine learning topics, to guide future research in the field.
Affiliation(s)
- Abdulrahman Mohamed Selim
- German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Michael Barz
- German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Applied Artificial Intelligence, University of Oldenburg, Oldenburg, Germany
- Omair Shahzad Bhatti
- German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Hasan Md Tusfiqur Alam
- German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Daniel Sonntag
- German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Applied Artificial Intelligence, University of Oldenburg, Oldenburg, Germany
2
Martinez-Cedillo AP, Foulsham T. Don't look now! Social elements are harder to avoid during scene viewing. Vision Res 2024; 216:108356. PMID: 38184917; DOI: 10.1016/j.visres.2023.108356.
Abstract
Regions of social importance (i.e., other people) attract attention in real world scenes, but it is unclear how automatic this bias is and how it might interact with other guidance factors. To investigate this, we recorded eye movements while participants were explicitly instructed to avoid looking at one of two objects in a scene (either a person or a non-social object). The results showed that, while participants could follow these instructions, they still made errors (especially on the first saccade). Crucially, there were about twice as many erroneous looks towards the person than there were towards the other object. This indicates that it is hard to suppress the prioritization of social information during scene viewing, with implications for how quickly and automatically this information is perceived and attended to.
Affiliation(s)
- A P Martinez-Cedillo
- Department of Psychology, University of York, York YO10 5DD, England
- Department of Psychology, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, England
- T Foulsham
- Department of Psychology, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, England
3
Recker L, Poth CH. Test-retest reliability of eye tracking measures in a computerized Trail Making Test. J Vis 2023; 23:15. PMID: 37594452; PMCID: PMC10445213; DOI: 10.1167/jov.23.8.15.
Abstract
The Trail Making Test (TMT) is a frequently applied neuropsychological test that evaluates participants' executive functions based on their time to connect a sequence of numbers (TMT-A) or alternating numbers and letters (TMT-B). Test performance is associated with various cognitive functions ranging from visuomotor speed to working memory capabilities. However, although the test can screen for impaired executive functioning in a variety of neuropsychiatric disorders, it provides only little information about which specific cognitive impairments underlie performance detriments. To resolve this lack of specificity, recent cognitive research combined the TMT with eye tracking so that eye movements could help uncover reasons for performance impairments. However, using eye-tracking-based test scores to examine differences between persons, and ultimately apply the scores for diagnostics, presupposes that the reliability of the scores is established. Therefore, we investigated the test-retest reliabilities of scores in an eye-tracking version of the TMT recently introduced by Recker et al. (2022). We examined two healthy samples performing an initial test and then a retest 3 days (n = 31) or 10 to 30 days (n = 34) later. Results reveal that, although reliabilities of classic completion times were overall good, comparable with earlier versions, reliabilities of eye-tracking-based scores ranged from excellent (e.g., durations of fixations) to poor (e.g., number of fixations guiding manual responses). These findings indicate that some eye-tracking measures offer a strong basis for assessing interindividual differences beyond classic behavioral measures when examining processes related to information accumulation, but they are less suitable for diagnosing differences in eye-hand coordination.
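Test-retest reliabilities of the kind reported above are typically quantified with an intraclass correlation coefficient. As a rough illustration (the exact ICC variant used by the study is not stated here), a minimal pure-Python sketch of the common two-way random-effects, absolute-agreement ICC(2,1) for a subjects-by-sessions score table:

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random-effects, absolute-agreement reliability
    for scores[subject][session], via the standard ANOVA decomposition."""
    n, k = len(scores), len(scores[0])          # n subjects, k sessions
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(r) / k for r in scores]    # per-subject means
    col_means = [sum(r[j] for r in scores) / n for j in range(k)]  # per-session means
    ss_total = sum((x - grand) ** 2 for r in scores for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    msr = ss_rows / (n - 1)                     # between-subjects mean square
    msc = ss_cols / (k - 1)                     # between-sessions mean square
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Perfect agreement across sessions yields 1.0, while a constant practice-effect shift between sessions lowers the coefficient, because ICC(2,1) penalizes systematic session differences.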
Affiliation(s)
- Lukas Recker
- Neuro-Cognitive Psychology and Center for Cognitive Interaction Technology, Bielefeld University, Bielefeld, Germany
- https://orcid.org/0000-0001-8465-9643
- https://www.uni-bielefeld.de/fakultaeten/psychologie/abteilung/arbeitseinheiten/01/people/scientificstaff/recker/
- Christian H Poth
- Neuro-Cognitive Psychology and Center for Cognitive Interaction Technology, Bielefeld University, Bielefeld, Germany
- https://orcid.org/0000-0003-1621-4911
4
Martinez-Cedillo AP, Dent K, Foulsham T. Do cognitive load and ADHD traits affect the tendency to prioritise social information in scenes? Q J Exp Psychol (Hove) 2022; 75:1904-1918. PMID: 34844477; PMCID: PMC9424720; DOI: 10.1177/17470218211066475.
Abstract
We report two experiments investigating the effect of working memory (WM) load on selective attention. Experiment 1 was a modified version of Lavie et al. and confirmed that increasing memory load disrupted performance in the classic flanker task. Experiment 2 used the same manipulation of WM load to probe attention during the viewing of complex scenes while also investigating individual differences in attention deficit hyperactivity disorder (ADHD) traits. In the image-viewing task, we measured the degree to which fixations targeted each of two crucial objects: (1) a social object (a person in the scene) and (2) a non-social object of higher or lower physical salience. We compared the extent to which increasing WM load would change the pattern of viewing of the physically salient and socially salient objects. If attending to the social item requires greater default voluntary top-down resources, then the viewing of social objects should show stronger modulation by WM load compared with viewing of physically salient objects. The results showed that the social object was fixated to a greater degree than the other object (regardless of physical salience). Increased salience drew fixations away from the background leading to slightly increased fixations on the non-social object, without changing fixations on the social object. Increased levels of ADHD-like traits were associated with fewer fixations on the social object, but only in the high-salient, low-load condition. Importantly, WM load did not affect the number of fixations on the social object. Such findings suggest rather surprisingly that attending to a social area in complex stimuli is not dependent on the availability of voluntary top-down resources.
Affiliation(s)
- Kevin Dent
- Department of Psychology, University of Essex, Colchester, UK
- Tom Foulsham
- Department of Psychology, University of Essex, Colchester, UK
5
Wiebel-Herboth CB, Krüger M, Wollstadt P. Measuring inter- and intra-individual differences in visual scan patterns in a driving simulator experiment using active information storage. PLoS One 2021; 16:e0248166. PMID: 33735199; PMCID: PMC7971706; DOI: 10.1371/journal.pone.0248166.
Abstract
Scan pattern analysis has been discussed as a promising tool in the context of real-time gaze-based applications. In particular, information-theoretic measures of scan path predictability, such as the gaze transition entropy (GTE), have been proposed for detecting relevant changes in user state or task demand. These measures model scan patterns as first-order Markov chains, assuming that only the location of the previous fixation is predictive of the next fixation in time. However, this assumption may not be sufficient in general, as recent research has shown that scan patterns may also exhibit more long-range temporal correlations. Thus, we here evaluate the active information storage (AIS) as a novel information-theoretic approach to quantifying scan path predictability in a dynamic task. In contrast to the GTE, the AIS provides the means to statistically test and account for temporal correlations in scan path data beyond the previous fixation. We compare AIS to GTE in a driving simulator experiment, in which participants drove in a highway scenario, where trials were defined based on an experimental manipulation that encouraged the driver to start an overtaking maneuver. Two levels of difficulty were realized by varying the time left to complete the task. We found that individual observers indeed showed temporal correlations beyond a single past fixation and that the length of the correlation varied between observers. No effect of task difficulty was observed on scan path predictability for either AIS or GTE, but we found a significant increase in predictability during overtaking. Importantly, for participants for which the first-order Markov chain assumption did not hold, this was only shown using AIS but not GTE. We conclude that accounting for longer time horizons in scan paths in a personalized fashion is beneficial for interpreting gaze patterns in dynamic tasks.
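The gaze transition entropy referred to here treats a scanpath as a first-order Markov chain over areas of interest (AOIs) and measures the conditional entropy of the next fixated AOI given the current one. A minimal sketch of that computation (the AOI labels and toy sequence are illustrative, not from the study):

```python
from collections import Counter
from math import log2

def gaze_transition_entropy(fixations):
    """GTE of an AOI fixation sequence: the conditional entropy
    H(next AOI | current AOI), estimated from observed transition
    frequencies and weighted by the empirical source-AOI distribution."""
    transitions = Counter(zip(fixations[:-1], fixations[1:]))
    sources = Counter(fixations[:-1])
    n = sum(sources.values())
    h = 0.0
    for (src, dst), count in transitions.items():
        p_src = sources[src] / n        # empirical P(current AOI)
        p_cond = count / sources[src]   # empirical P(next AOI | current AOI)
        h -= p_src * p_cond * log2(p_cond)
    return h

# A perfectly predictable scanpath (A→B→A→B…) has zero transition entropy.
print(gaze_transition_entropy(["A", "B", "A", "B", "A", "B"]))  # 0.0
```

Higher values indicate less predictable transitions; the first-order assumption the paper questions is visible in the code, which conditions on the previous fixation only.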
Affiliation(s)
- Matti Krüger
- Honda Research Institute Europe, Offenbach/Main, Germany
6
Leroy A, Spotorno S, Faure S. Emotional scene processing in children and adolescents with attention deficit/hyperactivity disorder: a systematic review. Eur Child Adolesc Psychiatry 2021; 30:331-346. PMID: 32034554; DOI: 10.1007/s00787-020-01480-0.
Abstract
Impairments in emotional information processing are frequently reported in attention deficit hyperactivity disorder (ADHD) at a voluntary, explicit level (e.g., emotion recognition) and at an involuntary, implicit level (e.g., emotional interference). Most previous studies have used faces with emotional expressions, rarely examining other important sources of information usually co-occurring with faces in our everyday experience. Here, we examined how the emotional content of an entire visual scene depicting real-world environments and situations is processed in ADHD. We systematically reviewed in PubMed, SCOPUS and ScienceDirect, using the PRISMA guidelines, empirical studies published in English until March 2019, about processing of visual scenes, with or without emotional content, in children and adolescents with ADHD. We included 17 studies among the 154 initially identified. Fifteen used scenes with emotional content (which was task-relevant in seven and irrelevant in eight studies) and two used scenes without emotional content. Even though the interpretation of the results differed according to the theoretical model of emotions of the study and the presence of comorbidity, differences in scene information processing between ADHD and typically developing children and adolescents were reported in all but one study. ADHD children and adolescents show difficulties in the processing of emotional information conveyed by visual scenes, which may stem from a stronger bottom-up impact of emotional stimuli in ADHD, increasing the emotional experience, and from core deficits of the disorder, decreasing the overall processing of the scene.
Affiliation(s)
- Anaïs Leroy
- Laboratoire D'Anthropologie Et de Psychologie Cliniques, Cognitives Et Sociales (LAPCOS), MSHS Sud Est, Université Côte D'Azur, Pôle Universitaire Saint Jean D'Angely, 24 avenue des Diables Bleus, 06357, Nice Cédex 4, France
- CERTA, Reference Centre for Learning Disabilities, Fondation Lenval, Hôpitaux Pédiatriques de Nice CHU-Lenval, Nice, France
- Sylvane Faure
- Laboratoire D'Anthropologie Et de Psychologie Cliniques, Cognitives Et Sociales (LAPCOS), MSHS Sud Est, Université Côte D'Azur, Pôle Universitaire Saint Jean D'Angely, 24 avenue des Diables Bleus, 06357, Nice Cédex 4, France
7
Quantifying the Predictability of Visual Scanpaths Using Active Information Storage. Entropy 2021; 23:e23020167. PMID: 33573069; PMCID: PMC7912697; DOI: 10.3390/e23020167.
Abstract
Entropy-based measures are an important tool for studying human gaze behavior under various conditions. In particular, gaze transition entropy (GTE) is a popular method to quantify the predictability of a visual scanpath as the entropy of transitions between fixations and has been shown to correlate with changes in task demand or changes in observer state. Measuring scanpath predictability is thus a promising approach to identifying viewers' cognitive states in behavioral experiments or gaze-based applications. However, GTE does not account for temporal dependencies beyond two consecutive fixations and may thus underestimate the actual predictability of the current fixation given past gaze behavior. Instead, we propose to quantify scanpath predictability by estimating the active information storage (AIS), which can account for dependencies spanning multiple fixations. AIS is calculated as the mutual information between a process's multivariate past state and its next value. It is thus able to measure how much information a sequence of past fixations provides about the next fixation, hence covering a longer temporal horizon. Applying the proposed approach, we were able to distinguish between induced observer states based on estimated AIS, providing first evidence that AIS may be used in the inference of user states to improve human-machine interaction.
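As a rough illustration of the definition above, AIS can be written as the mutual information between the previous k fixations (the "past state") and the next fixation. The naive frequency-based plug-in estimator below conveys the idea only; the cited work uses bias-corrected estimators with statistical testing and data-driven selection of the history length k:

```python
from collections import Counter
from math import log2

def active_information_storage(seq, k=2):
    """Plug-in estimate of AIS for a symbolic fixation sequence:
    I(past k fixations ; next fixation), in bits."""
    # Build (past-state, next-symbol) observations over the sequence.
    pairs = [(tuple(seq[i - k:i]), seq[i]) for i in range(k, len(seq))]
    n = len(pairs)
    joint = Counter(pairs)                    # P(past, next)
    past = Counter(p for p, _ in pairs)       # P(past)
    nxt = Counter(x for _, x in pairs)        # P(next)
    ais = 0.0
    for (p, x), count in joint.items():
        p_joint = count / n
        ais += p_joint * log2(p_joint / ((past[p] / n) * (nxt[x] / n)))
    return ais
```

A constant scanpath stores no information (AIS = 0), while a deterministic alternating scanpath has AIS equal to the entropy of the next fixation, since the past fully determines it.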
8
Rigby SN, Jakobson LS, Pearson PM, Stoesz BM. Alexithymia and the Evaluation of Emotionally Valenced Scenes. Front Psychol 2020; 11:1820. PMID: 32793083; PMCID: PMC7394003; DOI: 10.3389/fpsyg.2020.01820.
Abstract
Alexithymia is a personality trait characterized by difficulties identifying and describing feelings (DIF and DDF) and an externally oriented thinking (EOT) style. The primary aim of the present study was to investigate links between alexithymia and the evaluation of emotional scenes. We also investigated whether viewers' evaluations of emotional scenes were better predicted by specific alexithymic traits or by individual differences in sensory processing sensitivity (SPS). Participants (N = 106) completed measures of alexithymia and SPS along with a task requiring speeded judgments of the pleasantness of 120 moderately arousing scenes. We did not replicate laterality effects previously described with the scene perception task. Compared to those with weak alexithymic traits, individuals with moderate-to-strong alexithymic traits were less likely to classify positively valenced scenes as pleasant and were less likely to classify scenes with (vs. without) implied motion (IM) in a way that was consistent with normative scene valence ratings. In addition, regression analyses confirmed that reporting strong EOT and a tendency to be easily overwhelmed by busy sensory environments negatively predicted classification accuracy for positive scenes, and that both DDF and EOT negatively predicted classification accuracy for scenes depicting IM. These findings highlight the importance of accounting for stimulus characteristics and individual differences in specific traits associated with alexithymia and SPS when investigating the processing of emotional stimuli. Learning more about the links between these individual difference variables may have significant clinical implications, given that alexithymia is an important, transdiagnostic risk factor for a wide range of psychopathologies.
Affiliation(s)
- Sarah N Rigby
- Department of Psychology, University of Manitoba, Winnipeg, MB, Canada
- Lorna S Jakobson
- Department of Psychology, University of Manitoba, Winnipeg, MB, Canada
- Pauline M Pearson
- Department of Psychology, University of Manitoba, Winnipeg, MB, Canada
- Department of Psychology, University of Winnipeg, Winnipeg, MB, Canada
- Brenda M Stoesz
- Department of Psychology, University of Manitoba, Winnipeg, MB, Canada
- Centre for the Advancement of Teaching and Learning, University of Manitoba, Winnipeg, MB, Canada
9
Hayes TR, Henderson JM. Center bias outperforms image salience but not semantics in accounting for attention during scene viewing. Atten Percept Psychophys 2020; 82:985-994. PMID: 31456175; PMCID: PMC11149060; DOI: 10.3758/s13414-019-01849-7.
Abstract
How do we determine where to focus our attention in real-world scenes? Image saliency theory proposes that our attention is 'pulled' to scene regions that differ in low-level image features. However, models that formalize image saliency theory often contain significant scene-independent spatial biases. In the present studies, three different viewing tasks were used to evaluate whether image saliency models account for variance in scene fixation density based primarily on scene-dependent, low-level feature contrast, or on their scene-independent spatial biases. For comparison, fixation density was also compared to semantic feature maps (Meaning Maps; Henderson & Hayes, Nature Human Behaviour, 1, 743-747, 2017) that were generated using human ratings of isolated scene patches. The squared correlations (R2) between scene fixation density and each image saliency model's center bias, each full image saliency model, and meaning maps were computed. The results showed that in tasks that produced observer center bias, the image saliency models on average explained 23% less variance in scene fixation density than their center biases alone. In comparison, meaning maps explained on average 10% more variance than center bias alone. We conclude that image saliency theory generalizes poorly to real-world scenes.
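The comparison described above reduces to squared correlations between a scene's fixation density map and candidate predictor maps, one of which is a scene-independent center-bias baseline. A minimal pure-Python sketch (the Gaussian form and the sigma_frac value are illustrative choices, not the actual center biases extracted from the saliency models):

```python
from math import exp

def pearson_r2(a, b):
    """Squared Pearson correlation between two flattened maps."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov * cov / (va * vb)

def center_bias(h, w, sigma_frac=0.25):
    """Flattened isotropic Gaussian centered on the image: a simple
    scene-independent center-bias predictor map."""
    cy, cx = (h - 1) / 2, (w - 1) / 2
    s2 = 2 * (sigma_frac * min(h, w)) ** 2
    return [exp(-((y - cy) ** 2 + (x - cx) ** 2) / s2)
            for y in range(h) for x in range(w)]
```

The study's logic follows directly: if R² between fixation density and the center-bias map alone approaches (or exceeds) the R² of the full saliency model, the model's apparent success owes little to scene-dependent feature contrast.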
Affiliation(s)
- Taylor R Hayes
- Center for Mind and Brain, University of California, Davis, CA, USA
- John M Henderson
- Center for Mind and Brain, University of California, Davis, CA, USA
- Department of Psychology, University of California, Davis, CA, USA
10
Cronin DA, Hall EH, Goold JE, Hayes TR, Henderson JM. Eye Movements in Real-World Scene Photographs: General Characteristics and Effects of Viewing Task. Front Psychol 2020; 10:2915. PMID: 32010016; PMCID: PMC6971407; DOI: 10.3389/fpsyg.2019.02915.
Abstract
The present study examines eye movement behavior in real-world scenes with a large (N = 100) sample. We report baseline measures of eye movement behavior in our sample, including mean fixation duration, saccade amplitude, and initial saccade latency. We also characterize how eye movement behaviors change over the course of a 12 s trial. These baseline measures will be of use to future work studying eye movement behavior in scenes in a variety of literatures. We also examine effects of viewing task on when and where the eyes move in real-world scenes: participants engaged in a memorization and an aesthetic judgment task while viewing 100 scenes. While we find no difference at the mean-level between the two tasks, temporal- and distribution-level analyses reveal significant task-driven differences in eye movement behavior.
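Baseline measures like mean fixation duration and mean saccade amplitude can be computed directly from a fixation list. A minimal sketch, assuming fixations are given as hypothetical (x_deg, y_deg, onset_ms, offset_ms) tuples and approximating saccade amplitude as the distance between consecutive fixation centers (initial saccade latency would additionally require the stimulus-onset time, which this sketch omits):

```python
from math import hypot

def summarize_fixations(fixations):
    """Mean fixation duration (ms) and mean saccade amplitude (deg)
    from (x_deg, y_deg, onset_ms, offset_ms) fixation records."""
    durations = [off - on for _, _, on, off in fixations]
    # Amplitude approximated as Euclidean distance between the centers
    # of consecutive fixations, in degrees of visual angle.
    amps = [hypot(x2 - x1, y2 - y1)
            for (x1, y1, *_), (x2, y2, *_) in zip(fixations, fixations[1:])]
    return sum(durations) / len(durations), sum(amps) / len(amps)
```

Per-trial summaries like these, aggregated over participants, are the kind of baseline statistics the study tabulates.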
Affiliation(s)
- Deborah A. Cronin
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Elizabeth H. Hall
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Department of Psychology, University of California, Davis, Davis, CA, United States
- Jessica E. Goold
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Taylor R. Hayes
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- John M. Henderson
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Department of Psychology, University of California, Davis, Davis, CA, United States
11
Ioannou C, Seernani D, Stefanou ME, Biscaldi-Schaefer M, Tebartz Van Elst L, Fleischhaker C, Boccignone G, Klein C. Social Visual Perception Under the Eye of Bayesian Theories in Autism Spectrum Disorder Using Advanced Modeling of Spatial and Temporal Parameters. Front Psychiatry 2020; 11:585149. PMID: 33101094; PMCID: PMC7546363; DOI: 10.3389/fpsyt.2020.585149.
Abstract
Social interaction in individuals with Autism Spectrum Disorder (ASD) is characterized by qualitative impairments that highly impact quality of life. Bayesian theories in ASD frame an understanding of underlying mechanisms suggesting atypicalities in the evaluation of probabilistic links within the perceptual environment of the affected individual. To address these theories, the present study explores the applicability of an innovative Bayesian framework on social visual perception in ASD and demonstrates the use of gaze transitions between different parts of social scenes. We applied advanced analyses with Bayesian Hidden Markov Modeling (BHMM) to track gaze movements while presenting real-life scenes to typically developing (TD) children and adolescents (N = 25) and participants with ASD and Attention-Deficit/Hyperactivity Disorder (ASD+ADHD, N = 15) and ASD without comorbidity (ASD, N = 12). Regions of interest (ROIs) were generated by BHMM based both on spatial and temporal gaze behavior. Social visual perception was compared between groups using transition and fixation variables for social (faces, bodies) and non-social ROIs. Transition variables between faces, namely gaze transitions between faces and likelihood of linking faces, were reduced in the ASD+ADHD compared to TD participants. Fixation count to faces was also reduced in this group. The ASD group showed similar performance to TD in the studied variables. There was no difference between groups for non-social ROIs. Our study provides an innovative, interpretable example of applying Bayesian theories of social visual perception in ASD. BHMM analyses and gaze transitions have the potential to reveal fundamental social perception components in ASD, thus contributing to the improvement of social-skill interventions.
Affiliation(s)
- Chara Ioannou
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, Medical Faculty, University of Freiburg, Freiburg, Germany
- Divya Seernani
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, Medical Faculty, University of Freiburg, Freiburg, Germany
- Maria Elena Stefanou
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, Medical Faculty, University of Freiburg, Freiburg, Germany
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, United Kingdom
- Monica Biscaldi-Schaefer
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, Medical Faculty, University of Freiburg, Freiburg, Germany
- Ludger Tebartz Van Elst
- Department of Psychiatry and Psychotherapy, Medical Faculty, University of Freiburg, Freiburg, Germany
- Christian Fleischhaker
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, Medical Faculty, University of Freiburg, Freiburg, Germany
- Christoph Klein
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, Medical Faculty, University of Freiburg, Freiburg, Germany
- Department of Child and Adolescent Psychiatry, University Hospital Cologne, Cologne, North Rhine-Westphalia, Germany
- Department of Psychiatry, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece
12
Loschky LC, Larson AM, Smith TJ, Magliano JP. The Scene Perception &amp; Event Comprehension Theory (SPECT) Applied to Visual Narratives. Top Cogn Sci 2019; 12:311-351. PMID: 31486277; PMCID: PMC9328418; DOI: 10.1111/tops.12455.
Abstract
Understanding how people comprehend visual narratives (including picture stories, comics, and film) requires the combination of traditionally separate theories that span the initial sensory and perceptual processing of complex visual scenes, the perception of events over time, and comprehension of narratives. Existing piecemeal approaches fail to capture the interplay between these levels of processing. Here, we propose the Scene Perception & Event Comprehension Theory (SPECT), as applied to visual narratives, which distinguishes between front-end and back-end cognitive processes. Front-end processes occur during single eye fixations and are comprised of attentional selection and information extraction. Back-end processes occur across multiple fixations and support the construction of event models, which reflect understanding of what is happening now in a narrative (stored in working memory) and over the course of the entire narrative (stored in long-term episodic memory). We describe relationships between front- and back-end processes, and medium-specific differences that likely produce variation in front-end and back-end processes across media (e.g., picture stories vs. film). We describe several novel research questions derived from SPECT that we have explored. By addressing these questions, we provide greater insight into how attention, information extraction, and event model processes are dynamically coordinated to perceive and understand complex naturalistic visual events in narratives and the real world. Comprehension of visual narratives like comics, picture stories, and films involves both decoding the visual content and construing the meaningful events they represent. The Scene Perception & Event Comprehension Theory (SPECT) proposes a framework for understanding how a comprehender perceptually negotiates the surface of a visual representation and integrates its meaning into a growing mental model.
Affiliation(s)
- Tim J Smith
- Department of Psychological Sciences, Birkbeck, University of London
13
Frost-Karlsson M, Galazka MA, Gillberg C, Gillberg C, Miniscalco C, Billstedt E, Hadjikhani N, Åsberg Johnels J. Social scene perception in autism spectrum disorder: An eye-tracking and pupillometric study. J Clin Exp Neuropsychol 2019; 41:1024-1032. DOI: 10.1080/13803395.2019.1646214.
Affiliation(s)
- Morgan Frost-Karlsson
- Gillberg Neuropsychiatry Centre, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Center for Social and Affective Neuroscience, Linköping University, Linköping, Sweden
- Christopher Gillberg
- Gillberg Neuropsychiatry Centre, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Carina Gillberg
- Gillberg Neuropsychiatry Centre, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Carmela Miniscalco
- Gillberg Neuropsychiatry Centre, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Eva Billstedt
- Gillberg Neuropsychiatry Centre, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Nouchine Hadjikhani
- Gillberg Neuropsychiatry Centre, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Harvard Medical School/MGH/MIT, Athinoula A. Martinos Center for Biomedical Imaging, Boston, MA, USA
- Jakob Åsberg Johnels
- Gillberg Neuropsychiatry Centre, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Section for Speech and Language Pathology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
14
Król ME, Król M. Scanpath similarity measure reveals not only a decreased social preference, but also an increased nonsocial preference in individuals with autism. Autism 2019; 24:374-386. [DOI: 10.1177/1362361319865809] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
We compared scanpath similarity in response to repeated presentations of social and nonsocial images representing natural scenes in a sample of 30 participants with autism spectrum disorder and 32 matched typically developing individuals. We used scanpath similarity (calculated using ScanMatch) as a novel measure of attentional bias or preference, which constrains eye-movement patterns by directing attention to specific visual or semantic features of the image. We found that, compared with the control group, scanpath similarity of participants with autism was significantly higher in response to nonsocial images, and significantly lower in response to social images. Moreover, scanpaths of participants with autism were more similar to scanpaths of other participants with autism in response to nonsocial images, and less similar in response to social images. Finally, we also found that in response to nonsocial images, scanpath similarity of participants with autism did not decline with stimulus repetition to the same extent as in the control group, which suggests more perseverative attention in the autism spectrum disorder group. These results show a preferential fixation on certain elements of social stimuli in typically developing individuals compared with individuals with autism, and on certain elements of nonsocial stimuli in the autism spectrum disorder group, compared with the typically developing group.
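The ScanMatch measure used above works by quantizing fixations into regions and scoring the resulting symbol sequences with a Needleman-Wunsch alignment. A minimal sketch of that idea, with hypothetical grid and scoring parameters (the published method additionally weights substitutions by spatial distance and encodes fixation duration):

```python
# Simplified ScanMatch-style scanpath similarity: bin fixations into a
# spatial grid, then align the resulting symbol sequences with
# Needleman-Wunsch and normalize the score.

def encode(fixations, grid=(4, 4), screen=(1920, 1080)):
    """Map (x, y) fixation coordinates to grid-cell symbols."""
    cols, rows = grid
    w, h = screen
    seq = []
    for x, y in fixations:
        c = min(int(x / w * cols), cols - 1)
        r = min(int(y / h * rows), rows - 1)
        seq.append(r * cols + c)
    return seq

def similarity(a, b, match=1.0, mismatch=-1.0, gap=-0.5):
    """Global (Needleman-Wunsch) alignment score, normalized by length."""
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    return score[n][m] / (match * max(n, m))

path1 = encode([(100, 100), (900, 500), (1700, 900)])
path2 = encode([(120, 90), (880, 520), (1650, 950)])
print(similarity(path1, path2))  # identical cell sequences give 1.0
```

Averaging this score over repeated presentations of the same image, per participant, yields the kind of per-condition similarity index the study compares between groups.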
Affiliation(s)
- Michał Król
- School of Social Sciences, The University of Manchester, UK
15
de Haas B, Iakovidis AL, Schwarzkopf DS, Gegenfurtner KR. Individual differences in visual salience vary along semantic dimensions. Proc Natl Acad Sci U S A 2019; 116:11687-11692. [PMID: 31138705 PMCID: PMC6576124 DOI: 10.1073/pnas.1820553116] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023] Open
Abstract
What determines where we look? Theories of attentional guidance hold that image features and task demands govern fixation behavior, while differences between observers are interpreted as a "noise-ceiling" that strictly limits predictability of fixations. However, recent twin studies suggest a genetic basis of gaze-trace similarity for a given stimulus. This leads to the question of how individuals differ in their gaze behavior and what may explain these differences. Here, we investigated the fixations of >100 human adults freely viewing a large set of complex scenes containing thousands of semantically annotated objects. We found systematic individual differences in fixation frequencies along six semantic stimulus dimensions. These differences were large (>twofold) and highly stable across images and time. Surprisingly, they also held for first fixations directed toward each image, commonly interpreted as "bottom-up" visual salience. Their perceptual relevance was documented by a correlation between individual face salience and face recognition skills. The set of reliable individual salience dimensions and their covariance pattern replicated across samples from three different countries, suggesting they reflect fundamental biological mechanisms of attention. Our findings show stable individual differences in salience along a set of fundamental semantic dimensions and that these differences have meaningful perceptual implications. Visual salience reflects features of the observer as well as the image.
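The individual salience measure described above amounts to the fraction of each observer's fixations that land on objects of a given semantic category. A minimal sketch with a hypothetical data format (fixations pre-mapped to object-category labels, `None` for background):

```python
# Per-observer semantic salience: share of fixations landing on objects
# of a given category (e.g. faces or text), across annotated scenes.

from collections import Counter

def category_salience(fixations_by_observer, category):
    """fixations_by_observer: {observer: [category_label_or_None, ...]}
    Returns each observer's proportion of fixations on `category`."""
    out = {}
    for obs, labels in fixations_by_observer.items():
        counts = Counter(labels)
        total = sum(counts.values())
        out[obs] = counts[category] / total if total else 0.0
    return out

data = {
    "s1": ["face", "text", "face", None, "face"],
    "s2": ["text", None, "text", "face", None],
}
print(category_salience(data, "face"))  # {'s1': 0.6, 's2': 0.2}
```

Computing such proportions separately on disjoint halves of the image set and correlating them across observers is one way to assess the cross-image stability the study reports.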
Affiliation(s)
- Benjamin de Haas
- Department of Psychology, Justus Liebig Universität, 35394 Giessen, Germany
- Experimental Psychology, University College London, WC1H 0AP London, United Kingdom
- Alexios L Iakovidis
- Experimental Psychology, University College London, WC1H 0AP London, United Kingdom
- D Samuel Schwarzkopf
- Experimental Psychology, University College London, WC1H 0AP London, United Kingdom
- School of Optometry & Vision Science, University of Auckland, 1142 Auckland, New Zealand