1. Minh Tran TT, Brown S, Weidlich O, Billinghurst M, Parker C. Wearable Augmented Reality: Research Trends and Future Directions from Three Major Venues. IEEE Transactions on Visualization and Computer Graphics 2023;29:4782-4793. [PMID: 37782599] [DOI: 10.1109/tvcg.2023.3320231]
Abstract
Wearable Augmented Reality (AR) has attracted considerable attention in recent years, as evidenced by the growing number of research publications and industry investments. With swift advancements and a multitude of interdisciplinary research areas within wearable AR, a comprehensive review is crucial for integrating the current state of the field. In this paper, we present a review of 389 research papers on wearable AR, published between 2018 and 2022 in three major venues: ISMAR, TVCG, and CHI. Drawing inspiration from previous works by Zhou et al. and Kim et al., which summarized AR research at ISMAR over the past two decades (1998-2017), we categorize the papers into different topics and identify prevailing trends. One notable finding is that wearable AR research is increasingly geared towards enabling broader consumer adoption. From our analysis, we highlight key observations related to potential future research areas essential for capitalizing on this trend and achieving widespread adoption. These include addressing challenges in Display, Tracking, Interaction, and Applications, and exploring emerging frontiers in Ethics, Accessibility, Avatar and Embodiment, and Intelligent Virtual Agents.
2. Jang Y, Jeong I, Younesi Heravi M, Sarkar S, Shin H, Ahn Y. Multi-Camera-Based Human Activity Recognition for Human-Robot Collaboration in Construction. Sensors (Basel) 2023;23:6997. [PMID: 37571779] [PMCID: PMC10422633] [DOI: 10.3390/s23156997]
Abstract
As the use of construction robots continues to increase, ensuring safety and productivity while they work alongside human workers becomes crucial. To prevent collisions, robots must recognize human behavior in close proximity. However, single RGB or RGB-depth cameras have limitations, such as detection failure, sensor malfunction, occlusions, unconstrained lighting, and motion blur. Therefore, this study proposes a multiple-camera approach for human activity recognition during human-robot collaborative activities in construction. The proposed approach employs a particle filter to estimate the 3D human pose by fusing 2D joint locations extracted from multiple cameras, and applies a long short-term memory (LSTM) network to recognize ten activities associated with human-robot collaboration tasks in construction. The study compared the performance of human activity recognition models using one, two, three, and four cameras. Results showed that using multiple cameras enhances recognition performance, providing a more accurate and reliable means of identifying and differentiating between various activities. The results of this study are expected to contribute to the advancement of human activity recognition and its use in human-robot collaboration in construction.
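As an assumption-level illustration of the two-stage pipeline described in the abstract (not the authors' code), the sketch below fuses 2D joint detections from several calibrated cameras with a simple particle filter and classifies the fused pose sequence with an LSTM; the camera model, noise parameters, and 17-joint skeleton are illustrative choices.

```python
# Hedged sketch: particle-filter fusion of multi-camera 2D joints, then an
# LSTM activity classifier. All parameters are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

def project(P, x):
    """Project a 3D point x (3,) with a 3x4 camera matrix P to a 2D pixel."""
    h = P @ np.append(x, 1.0)
    return h[:2] / h[2]

def particle_filter_step(particles, weights, obs_2d, cams, sigma_q=0.02, sigma_r=8.0):
    """One predict/update step for a single joint.
    particles: (N,3) 3D hypotheses; obs_2d: per-camera (2,) pixel or None."""
    # Predict: random-walk motion model.
    particles = particles + np.random.normal(0.0, sigma_q, particles.shape)
    # Update: multiply per-camera Gaussian reprojection likelihoods.
    for P, z in zip(cams, obs_2d):
        if z is None:                      # camera missed the joint (occlusion etc.)
            continue
        err = np.linalg.norm([project(P, p) - z for p in particles], axis=1)
        weights *= np.exp(-0.5 * (err / sigma_r) ** 2)
    weights /= weights.sum() + 1e-12
    # Resample when the effective sample size collapses.
    if 1.0 / (weights ** 2).sum() < 0.5 * len(weights):
        idx = np.random.choice(len(weights), len(weights), p=weights)
        particles, weights = particles[idx], np.full(len(weights), 1.0 / len(weights))
    return particles, weights

class ActivityLSTM(nn.Module):
    """LSTM classifier over fused 3D pose sequences (ten activity classes)."""
    def __init__(self, n_joints=17, hidden=128, n_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(n_joints * 3, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, seq):                # seq: (B, T, n_joints*3)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])       # classify from the last time step
```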
Affiliation(s)
- Youjin Jang: Department of Civil, Construction and Environmental Engineering, North Dakota State University, Fargo, ND 58108, USA
- Inbae Jeong: Department of Mechanical Engineering, North Dakota State University, Fargo, ND 58108, USA
- Moein Younesi Heravi: Department of Civil, Construction and Environmental Engineering, North Dakota State University, Fargo, ND 58108, USA
- Sajib Sarkar: Department of Civil, Construction and Environmental Engineering, North Dakota State University, Fargo, ND 58108, USA
- Hyunkyu Shin: Sustainable Smart City Convergence Educational Research Center, Hanyang University ERICA, Ansan 15588, Republic of Korea
- Yonghan Ahn: Department of Architectural Engineering, Hanyang University ERICA, Ansan 15588, Republic of Korea
3. Parger M, Tang C, Xu Y, Twigg CD, Tao L, Li Y, Wang R, Steinberger M. UNOC: Understanding Occlusion for Embodied Presence in Virtual Reality. IEEE Transactions on Visualization and Computer Graphics 2022;28:4240-4251. [PMID: 34061744] [DOI: 10.1109/tvcg.2021.3085407]
Abstract
Tracking body and hand motions in 3D space is essential for social and self-presence in augmented and virtual environments. Unlike the popular 3D pose estimation setting, the problem is often formulated as egocentric tracking based on embodied perception (e.g., egocentric cameras, handheld sensors). In this article, we propose a new data-driven framework for egocentric body tracking, targeting the challenge of omnipresent occlusions in optimization-based methods (e.g., inverse kinematics solvers). We first collect a large-scale motion capture dataset with both body and finger motions using optical markers and inertial sensors. This dataset focuses on social scenarios and captures ground-truth poses under self-occlusions and body-hand interactions. We then simulate the occlusion patterns in head-mounted camera views on the captured ground truth using a ray-casting algorithm and train a deep neural network to infer the occluded body parts. Our experiments show that the method generates high-fidelity embodied poses when applied to real-time egocentric body tracking, finger motion synthesis, and 3-point inverse kinematics.
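A minimal sketch of the occlusion-labeling idea, assuming body parts can be approximated by spheres (the paper casts rays against the captured body itself): a joint is marked occluded when the ray from the head-mounted camera to it hits an occluder first.

```python
# Hedged sketch of ray-cast occlusion labeling; sphere occluders and all
# thresholds are illustrative assumptions, not the paper's geometry.
import numpy as np

def ray_hits_sphere(origin, direction, center, radius):
    """Smallest positive t with ||origin + t*direction - center|| = radius, else None."""
    oc = origin - center
    b = 2.0 * np.dot(direction, oc)
    c = np.dot(oc, oc) - radius ** 2
    disc = b * b - 4.0 * c              # direction is unit length, so a = 1
    if disc < 0.0:
        return None
    t = (-b - np.sqrt(disc)) / 2.0
    return t if t > 1e-6 else None

def occluded_joints(cam_pos, joints, occluders):
    """occluders: list of (center (3,), radius). Returns a bool mask per joint."""
    mask = np.zeros(len(joints), dtype=bool)
    for i, j in enumerate(joints):
        d = j - cam_pos
        dist = np.linalg.norm(d)
        d /= dist
        for center, radius in occluders:
            t = ray_hits_sphere(cam_pos, d, center, radius)
            if t is not None and t < dist - 1e-3:   # something closer blocks the joint
                mask[i] = True
                break
    return mask
```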
4. Pose Estimation-Assisted Dance Tracking System Based on Convolutional Neural Network. Computational Intelligence and Neuroscience 2022;2022:2301395. [PMID: 35694578] [PMCID: PMC9187454] [DOI: 10.1155/2022/2301395]
Abstract
In the field of music-driven, computer-assisted dance movement generation, traditional music-movement adaptation and statistical mapping models have the following problems: first, the dance sequences generated by the model do not fit the music well; second, the generated dance movements lack completeness; third, the smoothness and plausibility of long-term dance sequences need improvement; and fourth, traditional models cannot produce new dance movements. How to generate smooth and complete dance gesture sequences that follow the music is the problem investigated in this paper. To address these problems, we design a deep learning dance generation algorithm to extract the association between audio and movement features. During the feature extraction phase, rhythmic and beat features extracted from the music audio are used as musical features, and the coordinates of key points of the human skeleton extracted from dance videos are used as movement features for training. During the model building phase, the generator module is used to achieve a basic mapping between music and dance movements and to generate smooth dance gestures; the discriminator module is used to enforce consistency between dance and music; and the autoencoder module is used to make the audio features more representative. Experimental results on the DeepFashion dataset show that the model can synthesize new views of a target person in any given pose, transform between different poses of the same person, and preserve the person's appearance and clothing textures. A whole-to-detail generation strategy improves the final video synthesis. To address incoherent character movements in video synthesis, we propose optimizing the character movements with a generative adversarial network, specifically by inserting generated motion-compensation frames into incoherent movement sequences to improve the smoothness of the synthesized video.
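The following is a hedged sketch of the generator idea only: an LSTM mapping per-frame audio features to 2D skeleton key points. The feature dimension and key-point count are assumptions, not the paper's configuration.

```python
# Illustrative sketch (not the paper's code) of a music-to-dance generator:
# an LSTM maps per-frame audio features (e.g., beat + rhythm descriptors)
# to 2D skeleton key-point coordinates.
import torch
import torch.nn as nn

class Music2DanceGenerator(nn.Module):
    def __init__(self, audio_dim=38, hidden=256, n_keypoints=18):
        super().__init__()
        self.encoder = nn.LSTM(audio_dim, hidden, num_layers=2, batch_first=True)
        self.decoder = nn.Linear(hidden, n_keypoints * 2)   # (x, y) per key point

    def forward(self, audio_feats):        # audio_feats: (B, T, audio_dim)
        h, _ = self.encoder(audio_feats)
        poses = self.decoder(h)            # (B, T, n_keypoints*2)
        return poses.view(*poses.shape[:2], -1, 2)

# A discriminator enforcing music-dance consistency would score concatenated
# per-frame audio and pose features, trained adversarially against this generator.
```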
5. Miura T, Sako S. Simple yet effective 3D ego-pose lift-up based on vector and distance for a mounted omnidirectional camera. Applied Intelligence 2022. [DOI: 10.1007/s10489-022-03417-3]
Abstract
Following the advances in convolutional neural networks and synthetic data generation, 3D egocentric body pose estimation from a mounted fisheye camera has been developed. Previous works estimated 3D joint positions from raw image pixels with intermediate supervision during the process. A mounted fisheye camera captures notably different images depending on the optical properties of the lens, angle of view, and setup position; therefore, 3D ego-pose estimation from a mounted fisheye camera must be trained for each combination of camera optics and setup. We propose 3D ego-pose estimation from a single mounted omnidirectional camera that captures the entire circumference with back-to-back dual fisheye lenses. The omnidirectional camera can capture the user's body in a 360° field of view under a wide variety of motions. We also propose a simple feed-forward network model that estimates 3D joint positions from 2D joint locations. The lift-up model runs in real time yet achieves accuracy comparable to that of previous works on our new dataset. Moreover, our model is trainable with ground-truth 3D joint positions and the unit vectors toward the 3D joint positions, which are easily generated from existing publicly available 3D mocap datasets. This advantage alleviates the data collection and training burden caused by changes in camera optics and setup, although it applies only to the stages after 2D joint location estimation.
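A minimal sketch of a "vector and distance" lift-up model consistent with the description above: a feed-forward network predicts a unit direction vector and a scalar distance per joint from the 2D joint locations, and the 3D joint is their product. Layer sizes and the joint count are illustrative assumptions.

```python
# Hedged sketch of the lift-up idea: 2D joints -> (unit vector, distance)
# per joint -> 3D joints. Dimensions are assumptions, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LiftUpNet(nn.Module):
    def __init__(self, n_joints=15, hidden=1024):
        super().__init__()
        self.n_joints = n_joints
        self.backbone = nn.Sequential(
            nn.Linear(n_joints * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.vec_head = nn.Linear(hidden, n_joints * 3)   # direction per joint
        self.dist_head = nn.Linear(hidden, n_joints)      # distance per joint

    def forward(self, joints_2d):                    # (B, n_joints*2)
        h = self.backbone(joints_2d)
        vec = self.vec_head(h).view(-1, self.n_joints, 3)
        vec = F.normalize(vec, dim=-1)               # constrain to unit vectors
        dist = self.dist_head(h).unsqueeze(-1)       # (B, n_joints, 1)
        return vec * dist                            # 3D joints relative to the camera
```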
6. Semantic-Structural Graph Convolutional Networks for Whole-Body Human Pose Estimation. Information 2022. [DOI: 10.3390/info13030109]
Abstract
Existing whole-body human pose estimation methods mostly segment the hands and feet for separate processing, which not only splits the overall semantics of the body but also increases the computational cost and model complexity. To address these drawbacks, we designed a novel semantic-structural graph convolutional network (SSGCN) for whole-body human pose estimation, which leverages the whole-body graph structure to analyze the semantics of the whole-body keypoints through a graph convolutional network and improves the accuracy of pose estimation. Firstly, we introduce a novel heat-map-based keypoint embedding, which encodes the position and feature information of the keypoints of the human body. Secondly, we propose a novel semantic-structural graph convolutional network consisting of several sets of cascaded structure-based graph layers and data-dependent whole-body non-local layers. Specifically, the proposed method extracts groups of keypoints and constructs a high-level abstract body graph to process the high-level semantic information of the whole-body keypoints. The experimental results showed that our method achieved very promising results on the challenging COCO whole-body dataset.
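As a hedged illustration of the structure-based building block (not the authors' code), the layer below propagates heat-map-based keypoint embeddings along a normalized skeleton adjacency:

```python
# Sketch of a structure-based graph convolution over body key points.
# The adjacency matrix encodes the skeleton; feature sizes are assumptions.
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim, adjacency):
        super().__init__()
        a = adjacency + torch.eye(adjacency.size(0))     # add self-loops
        d = a.sum(dim=1).rsqrt()                         # D^{-1/2}
        self.register_buffer("a_norm", d[:, None] * a * d[None, :])
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):              # x: (B, n_keypoints, in_dim)
        return torch.relu(self.a_norm @ self.proj(x))    # propagate along bones
```

Stacking several such layers, interleaved with non-local (attention-like) layers over all keypoints, mirrors the cascaded design the abstract describes.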
7. Proposed New AV-Type Test-Bed for Accurate and Reliable Fish-Eye Lens Camera Self-Calibration. Sensors (Basel) 2021;21:2776. [PMID: 33920063] [PMCID: PMC8071168] [DOI: 10.3390/s21082776]
Abstract
The fish-eye lens camera has a wide field of view that makes it effective for various applications and sensor systems. However, it incurs strong geometric distortion in the image due to compressive recording of the outer part of the image. Such distortion must be interpreted accurately through a self-calibration procedure. This paper proposes a new type of test-bed (the AV-type test-bed) that can achieve a balanced distribution of image points and a low level of correlation between orientation parameters. The effectiveness of the proposed test-bed for camera self-calibration was verified through the analysis of experimental results from both simulated and real datasets. In the simulation experiments, self-calibration was performed using the proposed test-bed, four different projection models, and five different datasets; in all cases, the root-mean-square residuals (RMS-residuals) were lower than one-half pixel. The real experiments, meanwhile, were carried out using two different cameras and five different datasets and showed high levels of calibration accuracy (RMS-residuals as low as 0.39 pixels). Based on the above analyses, we were able to verify the effectiveness of the proposed AV-type test-bed for camera self-calibration.
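The paper's test-bed and projection models are specific to its setup; as a generic, hedged illustration of a fisheye calibration workflow, here is checkerboard-based calibration with OpenCV's equidistant fisheye model. The board geometry and flags are assumptions, not the paper's configuration.

```python
# Hedged sketch: fisheye calibration from checkerboard detections using
# OpenCV's fisheye module. Returns the RMS reprojection residual in pixels.
import cv2
import numpy as np

def calibrate_fisheye(images, board=(9, 6), square=0.03):
    """images: list of grayscale images observing a checkerboard."""
    objp = np.zeros((1, board[0] * board[1], 3), np.float64)
    objp[0, :, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts = [], []
    for img in images:
        ok, corners = cv2.findChessboardCorners(img, board)
        if ok:
            obj_pts.append(objp)
            img_pts.append(corners.reshape(1, -1, 2).astype(np.float64))
    K = np.eye(3)
    D = np.zeros((4, 1))
    flags = cv2.fisheye.CALIB_RECOMPUTE_EXTRINSIC | cv2.fisheye.CALIB_FIX_SKEW
    rms, K, D, _, _ = cv2.fisheye.calibrate(
        obj_pts, img_pts, images[0].shape[::-1], K, D, flags=flags)
    return rms, K, D
```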
8. Sy LWF, Lovell NH, Redmond SJ. Estimating Lower Limb Kinematics Using a Lie Group Constrained Extended Kalman Filter with a Reduced Wearable IMU Count and Distance Measurements. Sensors (Basel) 2020;20:6829. [PMID: 33260386] [PMCID: PMC7730686] [DOI: 10.3390/s20236829]
Abstract
Tracking the kinematics of human movement usually requires the use of equipment that constrains the user within a room (e.g., optical motion capture systems), or requires the use of a conspicuous body-worn measurement system (e.g., inertial measurement units (IMUs) attached to each body segment). This paper presents a novel Lie group constrained extended Kalman filter to estimate lower limb kinematics using IMU and inter-IMU distance measurements in a reduced sensor count configuration. The algorithm iterates through the prediction (kinematic equations), measurement (pelvis height assumption/inter-IMU distance measurements, zero velocity update for feet/ankles, flat-floor assumption for feet/ankles, and covariance limiter), and constraint update (formulation of hinged knee joints and ball-and-socket hip joints) steps. The knee and hip joint angle root-mean-square errors in the sagittal plane for straight walking were 7.6±2.6° and 6.6±2.7°, respectively, while the correlation coefficients were 0.95±0.03 and 0.87±0.16, respectively. Furthermore, experiments using simulated inter-IMU distance measurements show that performance improved substantially for dynamic movements, even at large noise levels (σ=0.2 m). However, further validation is recommended with actual distance measurement sensors, such as ultra-wideband ranging sensors.
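As a structural sketch only: the loop below is a plain EKF with a constraint-projection step, mirroring the prediction/measurement/constraint iteration described above. It is not the paper's Lie group formulation, and all models (f, h, the Jacobians F and H, and the constraint projection) are placeholders the reader must supply.

```python
# Hedged EKF skeleton illustrating the three-stage iteration of the abstract.
import numpy as np

def ekf_step(x, P, u, z, f, F, h, H, Q, R, project_constraints):
    # 1) Prediction: propagate the state through the kinematic equations.
    x = f(x, u)
    P = F @ P @ F.T + Q
    # 2) Measurement update: pelvis-height, inter-IMU-distance, zero-velocity,
    #    and flat-floor assumptions all enter as pseudo-measurements via h(x).
    y = z - h(x)
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    # 3) Constraint update: project the state onto the hinged-knee and
    #    ball-and-socket-hip joint constraints.
    x = project_constraints(x)
    return x, P
```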
Affiliation(s)
- Luke Wicent F. Sy: Graduate School of Biomedical Engineering, UNSW Sydney, Sydney 2052, Australia
- Nigel H. Lovell: Graduate School of Biomedical Engineering, UNSW Sydney, Sydney 2052, Australia
- Stephen J. Redmond: UCD School of Electrical and Electronic Engineering, University College Dublin, Belfield, Dublin 4, Ireland
9. Chasing Feet in the Wild: A Proposed Egocentric Motion-Aware Gait Assessment Tool. Lecture Notes in Computer Science 2019. [DOI: 10.1007/978-3-030-11024-6_12]