1
Roth N, Rolfs M, Hellwich O, Obermayer K. Objects guide human gaze behavior in dynamic real-world scenes. PLoS Comput Biol 2023; 19:e1011512. PMID: 37883331; PMCID: PMC10602265; DOI: 10.1371/journal.pcbi.1011512. Received 03/20/2023; accepted 09/12/2023. Open access.
Abstract
The complexity of natural scenes makes it challenging to experimentally study the mechanisms behind human gaze behavior when viewing dynamic environments. Historically, eye movements were believed to be driven primarily by space-based attention towards locations with salient features. Increasing evidence suggests, however, that visual attention does not select locations with high saliency but operates on attentional units given by the objects in the scene. We present a new computational framework to investigate the importance of objects for attentional guidance. The framework is designed to simulate realistic scanpaths for dynamic real-world scenes, including saccade timing and smooth pursuit behavior. Individual model components are based on psychophysically uncovered mechanisms of visual attention and saccadic decision-making, implemented in a modular fashion with a small number of well-interpretable parameters. To systematically analyze the importance of objects in guiding gaze behavior, we implemented five models within this framework: two purely spatial models (one based on low-level and one on high-level saliency); two object-based models (one incorporating low-level saliency for each object, the other using no saliency information); and a mixed model with object-based attention and selection but space-based inhibition of return. We optimized each model's parameters to reproduce the saccade amplitude and fixation duration distributions of human scanpaths using evolutionary algorithms, and compared model performance with respect to spatial and temporal fixation behavior, including the proportion of fixations exploring the background, as well as detecting, inspecting, and returning to objects. A model with object-based attention and inhibition, which uses saliency information to prioritize between objects for saccadic selection, produced scanpath statistics with the highest similarity to the human data. This demonstrates that scanpath models benefit from object-based attention and selection, and suggests that object-level attentional units play an important role in guiding attentional processing.
Affiliation(s)
- Nicolas Roth
  - Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
  - Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany
- Martin Rolfs
  - Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
  - Department of Psychology, Humboldt-Universität zu Berlin, Germany
  - Bernstein Center for Computational Neuroscience Berlin, Germany
- Olaf Hellwich
  - Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
  - Institute of Computer Engineering and Microelectronics, Technische Universität Berlin, Germany
- Klaus Obermayer
  - Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
  - Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany
  - Bernstein Center for Computational Neuroscience Berlin, Germany
2
Lin TC, Krishnan AU, Li Z. Perception-Motion Coupling in Active Telepresence: Human Behavior and Teleoperation Interface Design. ACM Transactions on Human-Robot Interaction 2022. DOI: 10.1145/3571599.
Abstract
Teleoperation enables complex robot platforms to perform tasks beyond the scope of current state-of-the-art robot autonomy by imparting human intelligence and critical thinking to these operations. For seamless control of robot platforms, it is essential to give the operator optimal situational awareness of the workspace through active telepresence cameras. However, controlling these active telepresence cameras adds a further degree of complexity to the teleoperation task. In this paper, we present results from a user study that investigates: (1) how the teleoperator learns or adapts to performing tasks via active cameras modeled after the camera placements on the TRINA humanoid robot; (2) the perception-action coupling that operators implement to control active telepresence cameras; and (3) camera preferences for performing the tasks. These findings from the human motion analysis and post-study survey will help us determine desired design features for robot teleoperation interfaces and assistive autonomy.
Affiliation(s)
- Tsung-Chi Lin
  - Worcester Polytechnic Institute, Robotics Engineering
- Zhi Li
  - Worcester Polytechnic Institute, Robotics Engineering
3
Morales A, Costela FM, Woods RL. Saccade Landing Point Prediction Based on Fine-Grained Learning Method. IEEE Access 2021; 9:52474-52484. PMID: 33981520; PMCID: PMC8112574; DOI: 10.1109/access.2021.3070511.
Abstract
The landing point of a saccade defines the new fixation region, the new region of interest. We asked whether it is possible to predict the saccade landing point early in this very fast eye movement. This work proposes a new algorithm, based on LSTM networks and a fine-grained loss function, for saccade landing point prediction in real-world scenarios. Predicting the landing point is a critical milestone toward reducing the problems caused by display-update latency in gaze-contingent systems, which make real-time changes to the display based on eye tracking. Saccadic eye movements are among the fastest human neuro-motor activities, with angular velocities of up to 1,000°/s. We present a comprehensive analysis of the performance of our method using a database of almost 220,000 saccades from 75 participants, captured during natural viewing of videos, and include a comparison with state-of-the-art saccade landing point prediction algorithms. Our proposed method outperformed existing approaches, reducing prediction error by up to 50%. Finally, we analyzed factors that affected prediction error, including saccade duration, saccade length, participant age, and other participant-intrinsic characteristics.
Affiliation(s)
- Aythami Morales
  - BiDA-Lab, Department of Electrical Engineering, Universidad Autonoma de Madrid, 28049 Madrid, Spain
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA 02114, USA
- Francisco M Costela
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA 02114, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA 02115, USA
- Russell L Woods
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA 02114, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA 02115, USA
4
Costela FM, Reeves SM, Woods RL. Orientation of the preferred retinal locus (PRL) is maintained following changes in simulated scotoma size. J Vis 2020; 20(7):25. PMID: 33555170; PMCID: PMC7424101; DOI: 10.1167/jov.20.7.25. Open access.
Abstract
Although macular lesions often enlarge, we know little about what happens when the preferred retinal locus (PRL) is enveloped by the lesion. We present a prospective study of subjects with normal vision who were trained to develop a PRL using simulated scotomas in a gaze-contingent visual display. We hypothesized that, once subjects had developed a robust PRL and the scotoma size was increased, the PRL would move to remain outside the scotoma, in a direction that maintained the orientation (theta) of the PRL relative to the fovea. Nine subjects with normal vision were trained to develop a PRL and were then exposed to scotomas ranging from 4° to 24° in diameter. Subjects tracked a stimulus using saccades or smooth pursuit. Fixation stability was measured by calculating the bivariate contour ellipse area (BCEA). To measure the reassignment of the oculomotor reference (OMR) to the PRL, we analyzed the spread (BCEA) of saccade first landing points. All subjects developed a robust PRL that varied by no more than 0.8° on average between blocks of trials at a given scotoma size, and they maintained the orientation of the PRL as the simulated scotoma size varied (median standard deviation in theta of ±9°). Fixation stability and OMR to the PRL worsened (larger BCEA) with increasing scotoma size. This, and related studies, could guide the development of a PRL training method to help people with central vision loss.
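The BCEA used above as the fixation-stability measure has a standard closed form. Here is a minimal sketch, assuming the common parameterization BCEA = 2kπ·σH·σV·√(1 − ρ²), where P = 1 − e^(−k) sets the proportion of gaze samples the ellipse encloses; the paper's exact parameter choices are not stated here, so treat this as illustrative:

```python
import math

def bcea(x, y, p=0.682):
    """Bivariate contour ellipse area (squared units of x, y).

    x, y: horizontal and vertical gaze samples during fixation.
    p:    proportion of samples the ellipse should enclose.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x) / (n - 1))
    sy = math.sqrt(sum((v - my) ** 2 for v in y) / (n - 1))
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    rho = cov / (sx * sy)
    k = -math.log(1 - p)  # from P = 1 - exp(-k)
    # guard against |rho| marginally exceeding 1 due to rounding
    return 2 * k * math.pi * sx * sy * math.sqrt(max(0.0, 1 - rho ** 2))
```

Larger values indicate less stable fixation; in the study above, BCEA grew with scotoma size.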
Affiliation(s)
- Francisco M Costela
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Stephanie M Reeves
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA, USA
- Russell L Woods
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
5
Zhu H, Salcudean S, Rohling R. The Neyman Pearson detection of microsaccades with maximum likelihood estimation of parameters. J Vis 2019; 19(13):17. DOI: 10.1167/19.13.17. Open access.
Affiliation(s)
- Hongzhi Zhu
  - School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada
- Septimiu Salcudean
  - Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
  - https://www.ece.ubc.ca/faculty/tim-salcudean
- Robert Rohling
  - Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
  - https://www.ece.ubc.ca/faculty/robert-rohling
6
Costela FM, Woods RL. When Watching Video, Many Saccades Are Curved and Deviate From a Velocity Profile Model. Front Neurosci 2019; 12:960. PMID: 30666178; PMCID: PMC6330331; DOI: 10.3389/fnins.2018.00960. Received 08/25/2017; accepted 12/03/2018. Open access.
Abstract
Commonly, saccades are thought to be ballistic eye movements, not modified during flight, with a straight path and a well-described velocity profile. However, they do not always follow a straight path, and studies of saccade curvature have been reported previously. In a prior study, we developed a real-time saccade-trajectory prediction algorithm to improve the updating of gaze-contingent displays. We found that saccades with a curved path, or whose velocity profiles deviated from the expected model (velocity-profile deviation), were not well fit by our saccade-prediction algorithm and thus had larger updating errors than saccades with a straight path and a well-fit velocity profile. Further, we noticed that curved saccades and saccades with high velocity-profile deviations were more common than we had expected when participants performed a natural-viewing task. Since those saccades caused larger display-updating errors, we sought a better understanding of them. Here we examine factors that could affect the curvature and velocity profile of saccades, using a pool of 218,744 saccades from 71 participants watching "Hollywood" video clips. Those factors included characteristics of the participants (e.g., age), of the videos (importance of faces for following the story, genre), of the saccade (e.g., magnitude, direction), time during the session (e.g., fatigue), and the presence and timing of scene cuts. While viewing the video clips, saccades were more likely to be horizontal or vertical than oblique. Measured curvature and velocity-profile deviation had continuous, skewed frequency distributions. We used mixed-effects regression models that included cubic terms and found a complex relationship between curvature, velocity-profile deviation, and saccade duration (or magnitude). Curvature and velocity-profile deviation were related to some video-dependent features, such as lighting, face presence, or nature and human-figure content. Time during the session was a predictor of velocity-profile deviations. Further, in univariable models, saccades that were in flight at the time of a scene cut had higher velocity-profile deviations and lower curvature. Saccade characteristics vary with a variety of factors, which suggests complex interactions between oculomotor control and scene content that could be explored further.
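Saccade curvature, as discussed above, can be quantified in several ways. A minimal sketch of one common metric, the peak signed perpendicular deviation of the trajectory from the straight line joining saccade onset and offset, expressed as a percentage of amplitude (the paper's exact definition may differ):

```python
import math

def saccade_curvature(xs, ys):
    """Peak deviation of a saccade path from its straight chord, in
    percent of amplitude. Sign indicates which side of the chord the
    trajectory bowed toward."""
    x0, y0 = xs[0], ys[0]
    x1, y1 = xs[-1], ys[-1]
    dx, dy = x1 - x0, y1 - y0
    amp = math.hypot(dx, dy)  # saccade amplitude
    if amp == 0:
        raise ValueError("zero-amplitude saccade")
    # signed perpendicular distance of each sample from the chord
    devs = [((x - x0) * dy - (y - y0) * dx) / amp for x, y in zip(xs, ys)]
    peak = max(devs, key=abs)
    return 100.0 * peak / amp
```

A perfectly straight saccade scores 0; curved saccades of the kind the study found surprisingly common score away from 0.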
Affiliation(s)
- Francisco M Costela
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA, United States
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, United States
- Russell L Woods
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, MA, United States
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, United States
7
Costela FM, Saunders DR, Rose DJ, Kajtezovic S, Reeves SM, Woods RL. People With Central Vision Loss Have Difficulty Watching Videos. Invest Ophthalmol Vis Sci 2019; 60:358-364. PMID: 30682208; PMCID: PMC6354940; DOI: 10.1167/iovs.18-25540. Received 08/24/2018; accepted 12/03/2018. Open access.
Abstract
Purpose: People with central vision loss (CVL) often report difficulties watching video. We objectively evaluated their ability to follow the story using the information acquisition method.
Methods: Subjects with CVL (n = 23) or normal vision (NV, n = 60) described the content of 30-second video clips from movies and documentaries. We derived an objective information acquisition (IA) score for each response using natural-language processing. To test whether the impact of CVL was simply due to reduced resolution, another group of NV subjects (n = 15) described video clips viewed with defocus blur that reduced visual acuity to between 20/50 and 20/800. Mixed models included random effects correcting for differences between subjects and between clips, with age, gender, cognitive status, and education as covariates.
Results: Compared with both NV groups, IA scores were worse for the CVL group (P < 0.001). IA declined with worsening visual acuity (P < 0.001), and this decline was greater for the CVL group than for the NV-defocus group (P = 0.01), seen as a greater discrepancy at worse levels of visual acuity.
Conclusions: The IA method detected the difficulties in following the story experienced by people with CVL. Defocus blur failed to recreate the CVL experience. IA is likely to be useful for evaluating the effects of vision rehabilitation.
Affiliation(s)
- Francisco M. Costela
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
  - Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, United States
- Daniel R. Saunders
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
  - Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, United States
- Dylan J. Rose
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
- Sidika Kajtezovic
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
- Stephanie M. Reeves
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
- Russell L. Woods
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
  - Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, United States
8
Lodato C, Ribino P. A Novel Vision-Enhancing Technology for Low-Vision Impairments. J Med Syst 2018; 42:256. PMID: 30406503; DOI: 10.1007/s10916-018-1108-1. Received 08/08/2018; accepted 10/23/2018.
Abstract
Ocular disorders such as vitreoretinal pathologies are widespread, especially in older adults. In particular, degenerative diseases of the retina, such as senile macular degeneration, are on the rise and affect millions of people, with hundreds of thousands of new cases each year. These diseases can cause profoundly disabling visual impairments, in some cases severely compromising central and/or peripheral vision in one or both eyes. In this paper, we present a novel vision-aid technology that corrects or attenuates the perception of visual field defects due to ocular pathologies of diverse origins or to trauma, using techniques of 3D visualisation, eye tracking, and image processing. The technology is mainly conceived to provide vision aids that can significantly improve the quality of life of people with these visual disorders. It could also be employed to support the diagnosis of ocular dysfunctions and to monitor the progression of disease. The technology shown in this work is protected by an international application under the Patent Cooperation Treaty (PCT).
Affiliation(s)
- Carmelo Lodato
  - Istituto di Calcolo e Reti ad Alte Prestazioni, via Ugo La Malfa 153, Palermo, Italy
- Patrizia Ribino
  - Istituto di Calcolo e Reti ad Alte Prestazioni, via Ugo La Malfa 153, Palermo, Italy
9
Costela FM, Kajtezovic S, Woods RL. The Preferred Retinal Locus Used to Watch Videos. Invest Ophthalmol Vis Sci 2017; 58:6073-6081. PMID: 29204647; PMCID: PMC5714047; DOI: 10.1167/iovs.17-21839. Open access.
Abstract
Purpose: Eccentric viewing is a common strategy used by people with central vision loss (CVL) to direct the eye so that the image falls onto functioning peripheral retina, at a location known as the preferred retinal locus (PRL). It has long been acknowledged that we do not know whether the PRL used in a fixation test is also used when performing tasks. We present an innovative method to determine whether the PRL observed during a fixation task was also used to watch videos, and whether poor resolution affects gaze location.
Methods: The gaze of a group of 60 normal-vision (NV) observers was used to define a democratic center of interest (COI) for video clips from movies and television. For each CVL participant (N = 20), we computed the gaze offsets from the COI across the video clips. The distribution of gaze offsets of the NV participants was used to define the limits of NV behavior: if a gaze offset fell within this 95% confidence interval, we presumed that the same PRL was used for fixation and video watching. Another 15 NV participants watched the video clips with various levels of defocus blur.
Results: CVL participants had wider gaze-offset distributions than NV participants (P < 0.001). Gaze offsets of 18/20 CVL participants were outside the NV confidence interval. Further, none of the 15 NV participants watching the same videos with spherical defocus blur had a decentered gaze offset (outside the NV confidence interval), suggesting that resolution was not the problem.
Conclusions: Many CVL participants used a PRL to view videos that differed from the one found with a fixation task, and this was not caused by poor resolution alone. The relationship between these locations needs further investigation.
Affiliation(s)
- Francisco M Costela
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
  - Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, United States
- Sidika Kajtezovic
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
- Russell L Woods
  - Schepens Eye Research Institute, Massachusetts Eye and Ear, Boston, Massachusetts, United States
  - Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, United States
10
Wang S, Woods RL, Costela FM, Luo G. Dynamic gaze-position prediction of saccadic eye movements using a Taylor series. J Vis 2017; 17(14):3. PMID: 29196761; PMCID: PMC5710308; DOI: 10.1167/17.14.3. Open access.
Abstract
Gaze-contingent displays have been widely used in vision research and virtual-reality applications. Because of data transmission, image processing, and display preparation, the time delay between the eye tracker and the monitor update may lead to a misalignment between the eye position and the image manipulation during eye movements. We propose a method to reduce this misalignment by using a Taylor series to predict the saccadic eye movement. The proposed method was evaluated using two large datasets: 219,335 human saccades (collected with an EyeLink 1000 system; 95% range from 1° to 32°) and 21,844 monkey saccades (collected with a scleral search coil; 95% range from 1° to 9°). Assuming a 10-ms time delay, predicting saccade movements with the proposed method reduced the misalignment more than state-of-the-art methods, with an average error of about 0.93° for human saccades and 0.26° for monkey saccades. Our results suggest that this saccade-prediction method will enable more accurate gaze-contingent displays.
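The idea of Taylor-series gaze prediction can be sketched as follows: estimate velocity and acceleration from the most recent samples by finite differences, then extrapolate the position over the system delay. This is a minimal illustration only; the paper's estimator (its expansion order, smoothing, and 2-D handling) may differ:

```python
def predict_gaze(samples, dt, delay):
    """Second-order Taylor extrapolation of a 1-D gaze trace.

    samples: at least three gaze positions (deg), uniformly spaced dt s apart.
    delay:   display-update latency to compensate for (s).
    """
    p0, p1, p2 = samples[-3:]
    v = (3 * p2 - 4 * p1 + p0) / (2 * dt)  # second-order backward-difference velocity
    a = (p2 - 2 * p1 + p0) / (dt * dt)     # backward-difference acceleration
    # Taylor expansion about the latest sample: p(t + delay)
    return p2 + v * delay + 0.5 * a * delay ** 2
```

With finite-difference estimates of this order, the extrapolation is exact for constant-acceleration motion; real saccades deviate from that, which is where the reported ~0.93° average error for human saccades comes from.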
Affiliation(s)
- Shuhang Wang
  - Schepens Eye Research Institute, Mass Eye and Ear, and Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Russell L Woods
  - Schepens Eye Research Institute, Mass Eye and Ear, and Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Francisco M Costela
  - Schepens Eye Research Institute, Mass Eye and Ear, and Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Gang Luo
  - Schepens Eye Research Institute, Mass Eye and Ear, and Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
11
Aguilar C, Castet E. Evaluation of a gaze-controlled vision enhancement system for reading in visually impaired people. PLoS One 2017; 12:e0174910. PMID: 28380004; PMCID: PMC5381883; DOI: 10.1371/journal.pone.0174910. Received 07/20/2016; accepted 03/17/2017. Open access.
Abstract
People with low vision, especially those with central field loss (CFL), need magnification to read. The flexibility of Electronic Vision Enhancement Systems (EVES) offers several ways of magnifying text, but because of the restricted field of view of EVES, the need for magnification conflicts with the need to navigate through the text (panning). We developed and implemented a real-time gaze-controlled system whose goal is to allow magnification of a portion of text while maintaining a global view of the rest of the text (condition 1). Two other conditions mimicked commercially available advanced systems known as CCTVs (closed-circuit television systems): conditions 2 and 3. In these two conditions, magnification was applied uniformly to the whole text, with no possibility of selecting a specific region of interest. The three conditions were implemented on the same computer to remove differences that might have been induced by dissimilar equipment. A gaze-contingent artificial 10° scotoma (a mask continuously displayed in real time on the screen at the gaze location) was used in all three conditions to simulate macular degeneration. Ten healthy subjects with a gaze-contingent scotoma read aloud sentences from a French newspaper in nine one-hour experimental sessions. Reading speed was the main dependent variable used to compare the three conditions. All subjects were able to use condition 1, and they found it slightly more comfortable than condition 2 (and similar to condition 3). Importantly, reading speed did not differ significantly between the three systems, and learning curves were similar in the three conditions. This proof-of-concept study suggests that the principles underlying the gaze-controlled enhancement system might be further developed and fruitfully incorporated into different kinds of EVES for low-vision reading.
Affiliation(s)
- Eric Castet
  - LPC, Aix Marseille Univ, CNRS, Marseille, France