1. Kasowski J, Johnson BA, Neydavood R, Akkaraju A, Beyeler M. A systematic review of extended reality (XR) for understanding and augmenting vision loss. J Vis 2023; 23:5. PMID: 37140911. PMCID: PMC10166121. DOI: 10.1167/jov.23.5.5.
Abstract
Over the past decade, extended reality (XR) has emerged as an assistive technology not only to augment residual vision of people losing their sight but also to study the rudimentary vision restored to blind people by a visual neuroprosthesis. A defining quality of these XR technologies is their ability to update the stimulus based on the user's eye, head, or body movements. To make the best use of these emerging technologies, it is valuable and timely to understand the state of this research and identify any shortcomings that are present. Here we present a systematic literature review of 227 publications from 106 different venues assessing the potential of XR technology to further visual accessibility. In contrast to other reviews, we sample studies from multiple scientific disciplines, focus on technology that augments a person's residual vision, and require studies to feature a quantitative evaluation with appropriate end users. We summarize prominent findings from different XR research areas, show how the landscape has changed over the past decade, and identify scientific gaps in the literature. Specifically, we highlight the need for real-world validation, the broadening of end-user participation, and a more nuanced understanding of the usability of different XR-based accessibility aids.
Affiliation(s)
- Justin Kasowski: Graduate Program in Dynamical Neuroscience, University of California, Santa Barbara, CA, USA
- Byron A Johnson: Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
- Ryan Neydavood: Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
- Anvitha Akkaraju: Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA, USA
- Michael Beyeler: Department of Psychological & Brain Sciences and Department of Computer Science, University of California, Santa Barbara, CA, USA
2. Htike HM, Margrain TH, Lai YK, Eslambolchilar P. Ability of Head-Mounted Display Technology to Improve Mobility in People With Low Vision: A Systematic Review. Transl Vis Sci Technol 2020; 9:26. PMID: 33024619. PMCID: PMC7521174. DOI: 10.1167/tvst.9.10.26.
Abstract
Purpose: To undertake a systematic literature review on how vision enhancements, implemented using head-mounted displays (HMDs), can improve mobility, orientation, and associated aspects of visual function in people with low vision. Methods: The databases Medline, CINAHL, Scopus, and Web of Science were searched for potentially relevant studies. Publications from all years until November 2018 were identified based on predefined inclusion and exclusion criteria. The data were tabulated and synthesized to produce a systematic review. Results: The search identified 28 relevant papers describing the performance of vision enhancement techniques on mobility and associated visual tasks. Simplifying visual scenes improved obstacle detection and object recognition but decreased walking speed. Minification techniques increased the size of the visual field by 3 to 5 times and improved visual search performance; however, the impact of minification on mobility has not been studied extensively. Clinical trials with commercially available devices recorded poor results relative to conventional aids. Conclusions: The effects of current vision enhancements using HMDs are mixed: they appear to reduce mobility efficiency but improve obstacle detection and object recognition. The review highlights the lack of controlled studies with robust designs. To strengthen the evidence base, well-designed trials with larger samples that represent different types of impairment and real-life scenarios are required. Future work should focus on identifying the needs of people with different types of vision impairment and providing targeted enhancements. Translational Relevance: This review examines the evidence regarding the ability of HMD technology to improve mobility in people with sight loss.
Affiliation(s)
- Hein Min Htike: School of Computer Science and Informatics, Cardiff University, Cardiff, UK
- Tom H Margrain: School of Optometry and Vision Sciences, Cardiff University, Cardiff, UK
- Yu-Kun Lai: School of Computer Science and Informatics, Cardiff University, Cardiff, UK
3. McKone E, Robbins RA, He X, Barnes N. Caricaturing faces to improve identity recognition in low vision simulations: How effective is current-generation automatic assignment of landmark points? PLoS One 2018; 13:e0204361. PMID: 30286112. PMCID: PMC6171855. DOI: 10.1371/journal.pone.0204361.
Abstract
Purpose: Previous behavioural studies demonstrate that face caricaturing can provide an effective image enhancement method for improving poor face identity perception in low vision simulations (e.g., age-related macular degeneration, bionic eye). To translate caricaturing usefully to patients, assignment of the multiple face landmark points needed to produce the caricatures must be fully automatised. Recent developments in computer science allow automatic detection of 68 face landmark points in real time and across multiple viewpoints. However, previous demonstrations of the behavioural effectiveness of caricaturing used higher-precision caricatures with 147 landmark points per face, assigned by hand. Here, we test the effectiveness of the auto-assigned 68-point caricatures and compare it to that of the hand-assigned 147-point caricatures. Method: We assessed human perception of how different in identity pairs of faces appear when veridical (uncaricatured), caricatured with 68 points, and caricatured with 147 points. Across two experiments, we tested two types of low-vision images: a simulation of the blur experienced in macular degeneration (at two blur levels), and a simulation of the phosphenised images seen in prosthetic vision (at three resolutions). Results: The 68-point caricatures produced significant improvements in identity discrimination relative to veridical images, and were approximately 50% as effective as the 147-point caricatures. Conclusion: Realistic translation to patients (e.g., via real-time caricaturing with the enhanced signal sent to smart glasses or a visual prosthesis) is approaching feasibility. For maximum effectiveness, software needs to be able to assign landmark points tracing out all details of feature and face shape, to produce high-precision caricatures.
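For readers wanting a concrete starting point, automatic 68-point landmark assignment of the kind evaluated above is available in off-the-shelf tools. A minimal sketch using dlib's pretrained 68-point shape predictor follows; the model filename and this pipeline are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch of automatic 68-point face landmark detection, as one
# possible front end for real-time caricaturing. Assumes dlib's pretrained
# predictor file has been downloaded separately; not the authors' pipeline.
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(gray_image):
    """Return an (n_faces, 68, 2) array of landmark coordinates."""
    faces = detector(gray_image, 1)          # upsample once for small faces
    all_points = []
    for face in faces:
        shape = predictor(gray_image, face)  # fit 68 landmark points
        pts = np.array([(p.x, p.y) for p in shape.parts()])
        all_points.append(pts)
    return np.array(all_points)
```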
Affiliation(s)
- Elinor McKone: Research School of Psychology, and ARC Centre of Excellence in Cognition and its Disorders, The Australian National University, Canberra, Australian Capital Territory, Australia
- Rachel A. Robbins: Research School of Psychology, The Australian National University, Canberra, Australian Capital Territory, Australia
- Xuming He: School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Nick Barnes: Research School of Engineering, Australian National University, Canberra, Australian Capital Territory, Australia; Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, Australian Capital Territory, Australia; Bionic Vision Australia, Carlton, Victoria, Australia
4. Irons JL, Gradden T, Zhang A, He X, Barnes N, Scott AF, McKone E. Face identity recognition in simulated prosthetic vision is poorer than previously reported and can be improved by caricaturing. Vision Res 2017; 137:61-79. PMID: 28688907. DOI: 10.1016/j.visres.2017.06.002.
Abstract
The visual prosthesis (or "bionic eye") has become a reality but provides a low-resolution view of the world. Simulating prosthetic vision in normal-vision observers, previous studies reported good face recognition ability using tasks that allow recognition to be achieved on the basis of information that survives low resolution well, including basic category (sex, age) and extra-face information (hairstyle, glasses). Here, we test within-category individuation for face-only information (e.g., distinguishing between multiple Caucasian young men with hair covered). Under these conditions, recognition was poor (although above chance) even for a simulated 40×40 array with all phosphene elements assumed functional, a resolution above the upper end of current-generation prosthetic implants. This indicates that a significant challenge is to develop methods that improve face identity recognition. Inspired by "bionic ear" improvements achieved by altering the signal input to match high-level perceptual (speech) requirements, we test a high-level perceptual enhancement of face images, namely face caricaturing (exaggerating identity information away from an average face). Results show that caricaturing improved identity recognition in memory and/or perception (the degree to which two faces look dissimilar) down to a resolution of 32×32 with 30% phosphene dropout. These findings imply caricaturing may offer benefits for patients at resolutions realistic for some current-generation or in-development implants.
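The core caricaturing operation the abstract describes, exaggerating identity information away from an average face, reduces to simple landmark geometry. A minimal sketch follows (the function name, the 60% default strength, and the landmark format are assumptions; rendering the caricatured photograph additionally requires warping the image, e.g. piecewise-affinely over a landmark triangulation, so its landmarks move to the new positions):

```python
import numpy as np

def caricature_landmarks(face_pts, avg_pts, strength=0.6):
    """Exaggerate identity by moving each landmark away from the average
    face: p' = avg + (1 + strength) * (p - avg).
    face_pts, avg_pts: (N, 2) arrays of corresponding landmark points.
    strength=0.6 gives a 60% caricature; 0.0 returns the veridical face.
    """
    return avg_pts + (1.0 + strength) * (face_pts - avg_pts)
```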
Affiliation(s)
- Jessica L Irons: Research School of Psychology, Australian National University, Australia; ARC Centre for Cognition and Its Disorders, Australian National University, Australia
- Tamara Gradden: Research School of Psychology, Australian National University, Australia
- Angel Zhang: Research School of Psychology, Australian National University, Australia
- Xuming He: National Information and Communication Technology Australia (NICTA), Australia; College of Engineering and Computer Science, Australian National University, Australia; Data61, CSIRO, Australia
- Nick Barnes: National Information and Communication Technology Australia (NICTA), Australia; College of Engineering and Computer Science, Australian National University, Australia; Bionic Vision Australia, Australia; Data61, CSIRO, Australia
- Adele F Scott: National Information and Communication Technology Australia (NICTA), Australia; Bionic Vision Australia, Australia; Data61, CSIRO, Australia
- Elinor McKone: Research School of Psychology, Australian National University, Australia; ARC Centre for Cognition and Its Disorders, Australian National University, Australia
5. Bermudez-Cameo J, Badias-Herbera A, Guerrero-Viu M, Lopez-Nicolas G, Guerrero JJ. RGB-D Computer Vision Techniques for Simulated Prosthetic Vision. In: Pattern Recognition and Image Analysis; 2017. DOI: 10.1007/978-3-319-58838-4_47.
6. Macé MJM, Guivarch V, Denis G, Jouffrais C. Simulated Prosthetic Vision: The Benefits of Computer-Based Object Recognition and Localization. Artif Organs 2015; 39:E102-13. PMID: 25900238. DOI: 10.1111/aor.12476.
7. Jung JH, Aloni D, Yitzhaky Y, Peli E. Active confocal imaging for visual prostheses. Vision Res 2014; 111:182-96. PMID: 25448710. DOI: 10.1016/j.visres.2014.10.023.
Abstract
There are encouraging advances in prosthetic vision for the blind, including retinal and cortical implants, and other "sensory substitution devices" that use tactile or electrical stimulation. However, they all have low resolution, a limited visual field, and can display only a few gray levels (limited dynamic range), severely restricting their utility. To overcome these limitations, image processing or the imaging system could emphasize objects of interest and suppress background clutter. We propose an active confocal imaging system based on light-field technology that will enable a blind user of any visual prosthesis to efficiently scan, focus on, and "see" only an object of interest while suppressing interference from background clutter. The system captures three-dimensional scene information using a light-field sensor and displays only an in-focus plane and the objects in it. After capturing a confocal image, a de-cluttering process removes the clutter based on blur difference. In preliminary experiments, we verified the positive impact of confocal-based background clutter removal on recognition of objects in low-resolution, limited-dynamic-range simulated phosphene images. Using a custom-made multiple-camera system based on light-field imaging, we confirmed that the concept of a confocal de-cluttered image can be realized effectively.
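The authors' de-cluttering operates on light-field data; as a rough single-image stand-in for the underlying idea of suppressing background by blur difference, one can keep only locally sharp pixels. A minimal sketch, where the kernel size and keep fraction are arbitrary assumptions:

```python
import cv2
import numpy as np

def declutter_by_focus(gray, ksize=15, keep_fraction=0.3):
    """Suppress blurry background and keep locally sharp (in-focus) pixels.
    Sharpness proxy: local mean of the squared Laplacian response."""
    lap = cv2.Laplacian(gray.astype(np.float32), cv2.CV_32F)
    sharpness = cv2.blur(lap * lap, (ksize, ksize))
    thresh = np.quantile(sharpness, 1.0 - keep_fraction)
    mask = sharpness >= thresh                # keep the sharpest fraction
    out = np.zeros_like(gray)
    out[mask] = gray[mask]
    return out
```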
Affiliation(s)
- Jae-Hyun Jung: Schepens Eye Research Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Doron Aloni: Department of Electro-Optics Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Yitzhak Yitzhaky: Department of Electro-Optics Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Eli Peli: Schepens Eye Research Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
8. Li Y, McCarthy C, Barnes N. On just noticeable difference for bionic eye. Annu Int Conf IEEE Eng Med Biol Soc 2012:2961-4. PMID: 23366546. DOI: 10.1109/embc.2012.6346585.
Abstract
We propose to use the just noticeable difference (JND) as the guiding principle for visualizing the output of image processing modules for prosthetic vision. Current bionic eye implants offer only a limited number of separately perceivable brightness levels (i.e., a low dynamic range for visualizing images). It is therefore important to ensure that critical contrast remains perceivable by maintaining visual differences in downsampled images with reduced dynamic range. JND provides a mathematical framework for these psychophysical phenomena: an increase of 1 in JND space corresponds to the smallest detectable change in visual space (i.e., just noticeable). Combining this principle with the dynamic range constraint, we cast the visualization problem as a linear optimization problem, which enables us to generate optimal visualization images. We demonstrate the usefulness of this principle on visualizing ground-plane segmentation. Experiments show that the proposed principle effectively preserves critical visual information at different dynamic ranges and generates consistent results for image sequences.
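The paper's exact linear program is not reproduced in the abstract, but the general shape of such a formulation can be sketched: choose display levels close to the target intensities while forcing perceptibly different region pairs to stay at least one JND apart. A hypothetical version using scipy; the variable layout and constraints are assumptions, not the authors' model:

```python
import numpy as np
from scipy.optimize import linprog

def jnd_optimal_levels(s, brighter_pairs, n_levels):
    """Map target intensities s (length n) to display levels in
    [0, n_levels - 1] so that every pair (i, j) in brighter_pairs keeps
    v[i] - v[j] >= 1 (one JND apart) while v stays close to s in the L1
    sense. Variables x = [v (n), t (n)] with t_i >= |v_i - s_i|.
    Assumes the pair constraints are feasible within the display range."""
    n = len(s)
    c = np.concatenate([np.zeros(n), np.ones(n)])   # minimize sum of t
    rows, b = [], []
    for i in range(n):                              # t_i >= |v_i - s_i|
        r = np.zeros(2 * n); r[i] = 1.0; r[n + i] = -1.0
        rows.append(r); b.append(s[i])              #  v_i - t_i <= s_i
        r = np.zeros(2 * n); r[i] = -1.0; r[n + i] = -1.0
        rows.append(r); b.append(-s[i])             # -v_i - t_i <= -s_i
    for i, j in brighter_pairs:                     # enforce v_i - v_j >= 1
        r = np.zeros(2 * n); r[i] = -1.0; r[j] = 1.0
        rows.append(r); b.append(-1.0)
    bounds = [(0, n_levels - 1)] * n + [(0, None)] * n
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(b), bounds=bounds)
    return res.x[:n]                                # optimized display levels
```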
Affiliation(s)
- Yi Li: National Information and Communication Technology Australia (NICTA) and College of Engineering and Computer Science, Australian National University, Canberra, ACT 2601, Australia
9. Horne L, Barnes N, McCarthy C, He X. Image segmentation for enhancing symbol recognition in prosthetic vision. Annu Int Conf IEEE Eng Med Biol Soc 2012:2792-5. PMID: 23366505. DOI: 10.1109/embc.2012.6346544.
Abstract
Current and near-term implantable prosthetic vision systems offer the potential to restore some visual function, but suffer from poor resolution and dynamic range of induced phosphenes. This can make it difficult for users of prosthetic vision systems to identify symbolic information (such as signs) except in controlled conditions. Using image segmentation techniques from computer vision, we show it is possible to improve the clarity of such symbolic information for users of prosthetic vision implants in uncontrolled conditions. We use image segmentation to automatically divide a natural image into regions, and using a fixation point controlled by the user, select a region to phosphenize. This technique improves the apparent contrast and clarity of symbolic information over traditional phosphenization approaches.
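As an illustration of the described pipeline (segment the image, select the region under the user's fixation point, phosphenize it), here is a minimal sketch using scikit-image's Felzenszwalb segmenter as a stand-in for whatever segmentation the authors used; all parameter values are assumptions:

```python
import numpy as np
from skimage.segmentation import felzenszwalb

def phosphenize_fixated_region(image, fixation_xy, grid=(32, 32)):
    """Segment the image, keep only the region under the user's fixation
    point, and downsample its mask to a low-resolution phosphene grid."""
    labels = felzenszwalb(image, scale=100, sigma=0.8, min_size=50)
    fx, fy = fixation_xy
    region = labels == labels[fy, fx]       # region containing the fixation
    h, w = region.shape
    gh, gw = grid
    phosphenes = np.zeros(grid)
    for r in range(gh):                     # area-average the region mask
        for c in range(gw):
            block = region[r*h//gh:(r+1)*h//gh, c*w//gw:(c+1)*w//gw]
            phosphenes[r, c] = block.mean()
    return phosphenes                       # brightness per phosphene
```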
Affiliation(s)
- Lachlan Horne: NICTA Canberra Research Laboratory, Tower A, 7 London Circuit, Canberra ACT 2600, Australia
10. Zapf MP, Matteucci PB, Lovell NH, Suaning GJ. Smartphones as image processing systems for prosthetic vision. Annu Int Conf IEEE Eng Med Biol Soc 2013:3690-3. PMID: 24110531. DOI: 10.1109/embc.2013.6610344.
Abstract
The feasibility of implants for prosthetic vision has been demonstrated by research and commercial organizations. In most devices, an essential forerunner to the internal stimulation circuit is an external electronics solution for capturing, processing, and relaying image information, as well as extracting useful features from the scene surrounding the patient. The capability and multitude of image processing algorithms that the device can perform in real time play a major part in the final quality of the prosthetic vision. It is therefore desirable to use powerful hardware while avoiding bulky, cumbersome solutions. Recent publications have reported on portable single-board computers fast enough for computationally intensive image processing. Following the rapid evolution of commercial, ultra-portable ARM (Advanced RISC Machine) mobile devices, the authors investigated the feasibility of modern smartphones running complex face detection as external processing devices for vision implants. The role of dedicated graphics processors in speeding up computation was evaluated while performing a demanding noise reduction algorithm (image denoising). The time required for face detection was found to decrease by 95% from 2.5-year-old devices to recent ones. In denoising, graphics acceleration played a major role, speeding up the computation by a factor of 18. These results demonstrate that the technology has matured sufficiently to be considered a valid external electronics platform for visual prosthesis research.
11. Barnes N, He X, McCarthy C, Horne L, Kim J, Scott A, Lieby P. The role of vision processing in prosthetic vision. Annu Int Conf IEEE Eng Med Biol Soc 2012:308-11. PMID: 23365891. DOI: 10.1109/embc.2012.6345930.
Abstract
Prosthetic vision is reduced in resolution and dynamic range compared to normal human vision. This comes about both from residual damage to the visual system caused by the condition that led to vision loss and from limitations of current technology. Even with these limitations, however, prosthetic vision may still support functional performance sufficient for tasks that are key to restoring independent living and quality of life. Here, vision processing can play a key role by ensuring that information critical to the performance of key tasks is available within the capabilities of the available prosthetic vision. In this paper, we frame vision processing for prosthetic vision, highlight some key areas that present problems in terms of quality of life, and present examples where vision processing can help achieve better outcomes.
12. Savage CO, Kiral-Kornek FI, Tahayori B, Grayden DB. Can electric current steering be used to control perception of a retinal prosthesis patient? Annu Int Conf IEEE Eng Med Biol Soc 2012:3013-6. PMID: 23366559. DOI: 10.1109/embc.2012.6346598.
Abstract
We consider a form of current steering to elicit desired perceptions in users of a retinal prosthesis. While it is common to use a single, remote return electrode to balance electrical stimulation, advances in chip design and electrical switching have enabled more flexibility in stimulation paradigms. We have created a finite-element model of a retina and a ten-electrode prosthesis in COMSOL. Different configurations of stimulating and return electrodes are considered and employed to predict possible user perception. We investigate charge balance on the electrodes across varying geometries and consider the impact of inhomogeneous resistance between the electrodes and the tissue.
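The paper's predictions come from a finite-element COMSOL model; as a far simpler intuition-builder for current steering, one can superpose ideal point-source potentials in a homogeneous medium, V = I / (4*pi*sigma*r). Everything below (geometry, conductivity, currents) is an illustrative assumption, not the authors' model:

```python
import numpy as np

def tissue_potential(xx, yy, electrodes, sigma=0.1, z=0.1e-3):
    """Extracellular potential (V) from point-source electrodes in a
    homogeneous medium: V = sum_k I_k / (4*pi*sigma*r_k).
    electrodes: list of (x, y, I) with positions in m and currents in A;
    sigma: tissue conductivity in S/m; z: depth of the target plane in m."""
    v = np.zeros_like(xx, dtype=float)
    for ex, ey, current in electrodes:
        r = np.sqrt((xx - ex) ** 2 + (yy - ey) ** 2 + z ** 2)
        v += current / (4.0 * np.pi * sigma * r)
    return v

# Steering between two electrodes: split the (charge-balanced) source
# current unevenly to shift the potential peak between their positions.
x = np.linspace(-1e-3, 1e-3, 200)
xx, yy = np.meshgrid(x, x)
v = tissue_potential(xx, yy, [(-0.2e-3, 0, 70e-6), (0.2e-3, 0, 30e-6),
                              (0, 0.5e-3, -100e-6)])   # remote return
```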
Affiliation(s)
- Craig O Savage: NeuroEngineering Laboratory, Department of Electrical and Electronic Engineering, University of Melbourne, VIC 3010, Australia
13. McCarthy C, Barnes N. Time-to-contact maps for navigation with a low resolution visual prosthesis. Annu Int Conf IEEE Eng Med Biol Soc 2012:2780-3. PMID: 23366502. DOI: 10.1109/embc.2012.6346541.
Abstract
The perception of independently moving objects in the scene is an important capability for prosthetic vision, but is impeded by the limited resolution and dynamic range of current and near-term retinal prostheses. We propose a novel, biologically-inspired visual representation for prosthetic vision based on the recovery of time-to-contact (τ) with surfaces in the scene. The representation directly encodes the extent of motion towards the observer, placing greatest emphasis on objects posing an imminent threat of collision. Our results suggest the proposed τ-based representation may facilitate earlier perception of incoming objects, and provide clearer distinction between moving objects and the static structure of the scene compared with intensity and depth-based scene representations.
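A rough way to approximate such a tau map from a head-mounted camera is via the divergence of dense optical flow, since for approach to a roughly frontoparallel surface tau is about 2 / div(flow). The sketch below uses OpenCV's Farneback flow; it is a crude proxy under stated assumptions, not the authors' estimator:

```python
import cv2
import numpy as np

def time_to_contact_map(prev_gray, curr_gray, eps=1e-6):
    """Approximate a time-to-contact (tau) map from dense optical flow.
    For an observer approaching a surface the flow field expands, and
    tau (in frames) is roughly 2 / div(flow); large divergence signals an
    imminent collision. Non-approaching pixels are left at infinity."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    du_dx = np.gradient(flow[..., 0], axis=1)
    dv_dy = np.gradient(flow[..., 1], axis=0)
    div = du_dx + dv_dy
    tau = np.full(div.shape, np.inf)
    approaching = div > eps                 # keep expanding flow only
    tau[approaching] = 2.0 / div[approaching]
    return tau
```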
Affiliation(s)
- Chris McCarthy: NICTA Canberra Research Laboratory, Canberra ACT, Australia
14. Xie Y, Liu N, Barnes N. Phosphene vision of depth and boundary from segmentation-based associative MRFs. Annu Int Conf IEEE Eng Med Biol Soc 2012:5314-8. PMID: 23367129. DOI: 10.1109/embc.2012.6347194.
Abstract
This paper presents a novel low-resolution phosphene visualization of depth and boundary computed by a two-layer associative Markov random field (MRF). Unlike conventional methods that model depth and boundary as individual MRFs, our algorithm combines depth with geometry-based surface boundary estimation in a two-layer associative MRF framework in which both variables are inferred globally and simultaneously. With surface boundary integration, the experiments demonstrate three significant improvements: 1) depth ambiguities are eliminated and accuracy increases; 2) comprehensive depth and boundary information is provided for human navigation under low-resolution phosphene vision; 3) when boundary cues are integrated into the downsampling process, foreground obstacles are clearly enhanced and discriminated from the surrounding background. To gain higher efficiency and lower computational cost, the work is initialized with segmentation-based depth-plane fitting and labeling, followed by the latest projected graph cut for global optimization. The proposed approach has been tested on both the Middlebury and an indoor real-scene data set, and achieves significantly better accuracy than other popular methods at both regular and low resolutions.
Affiliation(s)
- Yiran Xie: College of Engineering and Computer Science, Australian National University, and Canberra Research Laboratory, National ICT Australia
15. He X, Kim J, Barnes N. A face-based visual fixation system for prosthetic vision. Annu Int Conf IEEE Eng Med Biol Soc 2012:2981-4. PMID: 23366551. DOI: 10.1109/embc.2012.6346590.
Abstract
Recent studies have shown the success of face recognition using low-resolution prosthetic vision, but it requires a zoomed-in and stably fixated view, which is challenging for a user given the limited resolution of current prosthetic vision devices. We propose a real-time object detection and tracking system capable of fixating human faces. By integrating both static and temporal information, we improve the robustness of face localization so that the system can fixate on faces with large pose variations. Our qualitative and quantitative results demonstrate the viability of supplementing visual prosthetic devices with the ability to fixate objects automatically, providing a stable zoomed-in image stream to facilitate face and expression recognition.
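A minimal sketch of the detect-smooth-crop loop such a fixation system implies, using OpenCV's stock Haar face detector as a stand-in for the paper's detector/tracker combination (the smoothing constant and the hold-last-box fallback are assumptions):

```python
import cv2

# Detect faces, smooth the bounding box over time, and emit a zoomed-in
# crop as a stable image stream for the prosthesis encoder.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def fixate_face(frame_gray, prev_box, alpha=0.3):
    """Return (zoomed_crop, smoothed_box); prev_box may be None."""
    faces = cascade.detectMultiScale(frame_gray, 1.1, 5)
    if len(faces) > 0:
        x, y, w, h = max(faces, key=lambda b: b[2] * b[3])  # largest face
        if prev_box is None:
            box = (x, y, w, h)
        else:  # exponential smoothing stabilizes fixation across frames
            box = tuple(int(alpha * n + (1 - alpha) * p)
                        for n, p in zip((x, y, w, h), prev_box))
    else:
        box = prev_box      # hold the last fixation when detection drops out
    if box is None:
        return None, None
    x, y, w, h = box
    return frame_gray[y:y + h, x:x + w], box
```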