1. Yu BXB, Liu Y, Chan KCC, Chen CW. EGCN++: A New Fusion Strategy for Ensemble Learning in Skeleton-Based Rehabilitation Exercise Assessment. IEEE Trans Pattern Anal Mach Intell 2024; 46:6471-6485. [PMID: 38502632] [DOI: 10.1109/tpami.2024.3378753]
Abstract
Skeleton-based exercise assessment focuses on evaluating the correctness or quality of an exercise performed by a subject. Skeleton data provide two groups of features (i.e., position and orientation), which existing methods have not fully harnessed. We previously proposed an ensemble-based graph convolutional network (EGCN) that considers both position and orientation features to construct a model-based approach. Integrating these two types of features achieved better performance than available methods. However, EGCN lacked a fusion strategy across the data, feature, decision, and model levels. In this paper, we present an advanced framework, EGCN++, for rehabilitation exercise assessment. Building on EGCN, a new fusion strategy called MLE-PO is proposed for EGCN++; this technique considers fusion at both the data and model levels. We conduct extensive cross-validation experiments and investigate the consistency between machine and human evaluations on three datasets: UI-PRMD, KIMORE, and EHE. The results demonstrate that MLE-PO outperforms other EGCN ensemble strategies and representative baselines. Furthermore, MLE-PO's model evaluation scores are more quantitatively consistent with clinical evaluations than those of other ensemble strategies.
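As a point of reference only, the sketch below shows a generic two-stream, model-level ensemble over position and orientation skeleton features; the stream networks, dimensions, and fusion weight are placeholder assumptions and not the EGCN++/MLE-PO design.

```python
# Illustrative sketch only: a generic two-stream ensemble over position and
# orientation skeleton features with score-level (model-level) fusion.
# The stream networks and fusion weight here are placeholders, not EGCN++.
import torch
import torch.nn as nn

class StreamNet(nn.Module):
    """Stand-in for a per-feature-group backbone (e.g., a GCN stream)."""
    def __init__(self, in_dim: int, n_scores: int = 1):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_scores))
    def forward(self, x):
        return self.net(x)

class TwoStreamEnsemble(nn.Module):
    def __init__(self, pos_dim: int, ori_dim: int, w_pos: float = 0.5):
        super().__init__()
        self.pos_stream = StreamNet(pos_dim)
        self.ori_stream = StreamNet(ori_dim)
        self.w_pos = w_pos  # fusion weight (hypothetical)
    def forward(self, pos_feat, ori_feat):
        s_pos = self.pos_stream(pos_feat)
        s_ori = self.ori_stream(ori_feat)
        return self.w_pos * s_pos + (1.0 - self.w_pos) * s_ori  # fused quality score

model = TwoStreamEnsemble(pos_dim=75, ori_dim=100)       # e.g., 25 joints x 3 / x 4
score = model(torch.randn(8, 75), torch.randn(8, 100))   # batch of 8 clips
```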
2. Wang J, Li C, Zhang B, Zhang Y, Shi L, Wang X, Zhou L, Xiong D. Automatic rehabilitation exercise task assessment of stroke patients based on wearable sensors with a lightweight multichannel 1D-CNN model. Sci Rep 2024; 14:19204. [PMID: 39160147] [PMCID: PMC11333737] [DOI: 10.1038/s41598-024-68204-1]
Abstract
Approximately 75% of stroke survivors have movement dysfunction. Rehabilitation exercises can improve physical coordination, but they are mostly performed in the home environment without guidance from therapists, and timely feedback on exercise performance is impossible without suitable devices or a therapist present. Human action quality assessment in the home setting is therefore a challenging research topic. In this paper, a low-cost HREA system is proposed in which wearable sensors collect upper-limb exercise data and a multichannel 1D-CNN framework automatically assesses action quality. The proposed 1D-CNN model is first pretrained on the UCI-HAR dataset, where it achieves a performance of 91.96%. Five typical actions were then selected from the Fugl-Meyer Assessment Scale for the experiment; wearable sensors were used to collect the participants' exercise data while experienced therapists assessed the participants' exercises at the same time. Following this process, a dataset based on the Fugl-Meyer scale was built. A multichannel 1D-CNN model was then constructed on top of the single-channel model, and the variant using Naive Bayes fusion performed best on the dataset (precision: 97.26%, recall: 97.22%, F1-score: 97.23%). This shows that the HREA system provides accurate and timely assessment and can give real-time feedback for stroke survivors' home rehabilitation.
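Purely as an illustration of the general idea, and not the paper's architecture, the following sketch applies a lightweight 1D-CNN per sensor channel and combines per-channel class posteriors with a naive-Bayes-style product rule; all shapes and layer sizes are assumptions.

```python
# Illustrative sketch (not the paper's model): a lightweight 1D-CNN applied per
# sensor channel, with per-channel class posteriors combined by a naive-Bayes-
# style product rule. Window length, channel count, and layer sizes are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Light1DCNN(nn.Module):
    def __init__(self, in_ch: int, n_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_ch, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.fc = nn.Linear(32, n_classes)
    def forward(self, x):                 # x: (batch, in_ch, time)
        return self.fc(self.features(x).squeeze(-1))

def naive_bayes_fusion(logits_per_channel):
    """Combine per-channel posteriors by multiplying them (sum of log-probs)."""
    log_probs = [F.log_softmax(l, dim=-1) for l in logits_per_channel]
    return torch.stack(log_probs).sum(dim=0).argmax(dim=-1)

models = [Light1DCNN(in_ch=3, n_classes=3) for _ in range(2)]  # e.g., 2 IMU channels
x = torch.randn(4, 3, 128)                                     # 4 windows, 3 axes, 128 samples
pred = naive_bayes_fusion([m(x) for m in models])
```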
Affiliation(s)
- Jiping Wang, Chengqi Li, Bochao Zhang, Yunpeng Zhang, Daxi Xiong: School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou 215163, China
- Lei Shi, Xiaojun Wang: Neurology Department, Suzhou Xiangcheng People's Hospital, Suzhou 215163, China
- Linfu Zhou: The First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
3. Shen YY, Xing QJ, Shen YF. Markerless vision-based functional movement screening movements evaluation with deep neural networks. iScience 2024; 27:108705. [PMID: 38222112] [PMCID: PMC10784700] [DOI: 10.1016/j.isci.2023.108705]
Abstract
The functional movement screen (FMS) test is a seven-test battery used to assess the fundamental movement abilities of individuals. It is commonly used to predict sports injuries but relies on clinical expertise and is not suitable for self-examination. This study presents an automatic FMS movement assessment framework using a multi-view deep neural network (MVDNN). The framework combines automatic skeleton extraction with manual feature selection to extract 3D trajectory features of human skeleton joints from two different directions. Three mainstream time-series modeling methods are then used to learn high-level feature representations from the skeleton sequences, and the motion features from the two views are fused to provide complementary information. Results on the public FMS movements dataset demonstrate that our MVDNN outperforms current state-of-the-art methods, with an average miF1 score of 0.857, maF1 score of 0.768, and Kappa score of 0.640 over ten runs.
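The sketch below illustrates, under assumed shapes and encoders, how skeleton-sequence features from two camera views can be encoded and fused before grade classification; it is a generic two-view example, not the MVDNN implementation.

```python
# Illustrative sketch only: encode per-view skeleton sequences and fuse them
# before FMS grade classification. Encoders, feature sizes, and the fusion rule
# are placeholders, not the MVDNN design.
import torch
import torch.nn as nn

class ViewEncoder(nn.Module):
    """Stand-in time-series encoder (a GRU) over flattened joint trajectories."""
    def __init__(self, joint_dim: int = 17 * 3, hidden: int = 64):
        super().__init__()
        self.rnn = nn.GRU(joint_dim, hidden, batch_first=True)
    def forward(self, seq):                # seq: (batch, frames, joint_dim)
        _, h = self.rnn(seq)
        return h[-1]                       # last hidden state as clip feature

class TwoViewClassifier(nn.Module):
    def __init__(self, n_grades: int = 3, hidden: int = 64):
        super().__init__()
        self.enc_front = ViewEncoder(hidden=hidden)
        self.enc_side = ViewEncoder(hidden=hidden)
        self.head = nn.Linear(2 * hidden, n_grades)
    def forward(self, front_seq, side_seq):
        fused = torch.cat([self.enc_front(front_seq), self.enc_side(side_seq)], dim=-1)
        return self.head(fused)            # logits over FMS grades (assumed 3 levels)

clf = TwoViewClassifier()
logits = clf(torch.randn(2, 120, 51), torch.randn(2, 120, 51))  # 2 clips, 120 frames
```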
Affiliation(s)
- Yuan-Yuan Shen, Yan-Fei Shen: School of Sport Engineering, Beijing Sport University, Beijing 100084, China
- Qing-Jun Xing: School of Sport Science, Beijing Sport University, Beijing 100084, China
4. Hirosawa S, Kato T, Yamashita T, Aoki Y. Action Quality Assessment Model Using Specialists' Gaze Location and Kinematics Data: Focusing on Evaluating Figure Skating Jumps. Sensors (Basel) 2023; 23:9282. [PMID: 38005668] [PMCID: PMC10675807] [DOI: 10.3390/s23229282]
Abstract
Action quality assessment (AQA) tasks in computer vision evaluate action quality in videos, and they can be applied to sports for performance evaluation. A typical example of AQA is predicting the final score from a video that captures an entire figure skating program. However, no previous studies have predicted the scores of individual jumps, which are of great interest to competitors because of their heavy weight in competition scoring. Although figure skating videos contain much information that is unnecessary for judging a jump, human specialists can focus their attention and reduce that information when they evaluate jumps. In this study, we clarified the eye movements of figure skating judges and skaters while evaluating jumps and proposed a prediction model for jump performance that utilizes the specialists' gaze locations to reduce information. Kinematic features obtained from a tracking system were input into the model in addition to the videos to improve accuracy. The results showed that skaters focused more on the face, whereas judges focused on the lower extremities. These gaze locations were applied to the model, which demonstrated the highest accuracy when utilizing both specialists' gaze locations. The model outperformed human predictions and the baseline model (RMSE: 0.775), suggesting that combining human specialist knowledge with machine capabilities could yield higher accuracy.
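As a hedged illustration of the general idea (not the authors' model), the following sketch pools spatial video features with a gaze-location heatmap and concatenates the result with kinematic features before score regression; all dimensions and the pooling rule are assumptions.

```python
# Illustrative sketch only: gaze-weighted pooling of visual features, combined
# with kinematic features, feeding a jump-score regressor. Shapes are assumed.
import torch
import torch.nn as nn

class GazeGuidedRegressor(nn.Module):
    def __init__(self, vis_ch: int = 256, kin_dim: int = 12):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(vis_ch + kin_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 1))
    def forward(self, feat_map, gaze_heatmap, kinematics):
        # feat_map: (B, C, H, W); gaze_heatmap: (B, 1, H, W) summing to 1 per sample
        attended = (feat_map * gaze_heatmap).flatten(2).sum(-1)      # gaze-weighted pooling
        return self.head(torch.cat([attended, kinematics], dim=-1))  # predicted jump score

model = GazeGuidedRegressor()
heat = torch.softmax(torch.randn(2, 1, 7 * 7), dim=-1).view(2, 1, 7, 7)
score = model(torch.randn(2, 256, 7, 7), heat, torch.randn(2, 12))
```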
Affiliation(s)
- Seiji Hirosawa: Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan; Faculty of Sport and Health Sciences, Toin University of Yokohama, Yokohama 225-8503, Japan
- Takaaki Kato: Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Japan
- Yoshimitsu Aoki: Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan
5. Xing Q, Hong R, Shen Y, Shen Y. Design and validation of depth camera-based static posture assessment system. iScience 2023; 26:107974. [PMID: 37810248] [PMCID: PMC10551660] [DOI: 10.1016/j.isci.2023.107974]
Abstract
Postural abnormalities have become a prevalent issue affecting individuals of all ages, resulting in a diminished quality of life. Easy-to-use and reliable posture assessment tools can aid in screening for and correcting posture deviations at an early stage. In this study, we present a depth camera-based static posture assessment system that screens for common postural anomalies such as uneven shoulders, pelvic tilt, bowlegs and knock-knees, forward head, scoliosis, and shoulder blade inclination. The system consists of an Azure Kinect camera, a laptop, and evaluation software. Our system accurately measures skeleton and posture indexes and shows favorable agreement with a gold-standard optical infrared motion capture system. The findings indicate that the system is a low-cost posture assessment tool with high precision and accuracy, suitable for the initial screening of postural abnormalities in individuals of all ages.
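For orientation, the snippet below shows how simple posture indexes such as shoulder-line tilt and forward-head offset might be computed from 3D joint positions reported by a depth camera; the joint names, axis conventions, and values are assumptions rather than the paper's definitions.

```python
# Illustrative sketch only: simple posture indexes from 3D joint positions as a
# body-tracking depth camera might report them. Axes and names are assumed.
import numpy as np

def uneven_shoulder_angle(l_shoulder: np.ndarray, r_shoulder: np.ndarray) -> float:
    """Angle (deg) of the shoulder line relative to the horizontal plane."""
    d = r_shoulder - l_shoulder
    horiz = np.linalg.norm(d[[0, 2]])          # assume y is the vertical axis
    return float(np.degrees(np.arctan2(abs(d[1]), horiz)))

def forward_head_offset(head: np.ndarray, shoulder_center: np.ndarray) -> float:
    """Sagittal (assumed z-axis) offset of the head in front of the shoulders, in metres."""
    return float(head[2] - shoulder_center[2])

l_sh, r_sh = np.array([-0.18, 1.45, 0.02]), np.array([0.18, 1.43, 0.01])
head = np.array([0.0, 1.65, 0.07])
print(uneven_shoulder_angle(l_sh, r_sh))              # ~3.2 degrees
print(forward_head_offset(head, (l_sh + r_sh) / 2))   # ~0.055 m
```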
Affiliation(s)
- Qingjun Xing: School of Sport Science, Beijing Sport University, Beijing 100084, China
- Ruiwei Hong, Yuanyuan Shen, Yanfei Shen: School of Sport Engineering, Beijing Sport University, Beijing 100084, China
6. Sardari S, Sharifzadeh S, Daneshkhah A, Nakisa B, Loke SW, Palade V, Duncan MJ. Artificial Intelligence for skeleton-based physical rehabilitation action evaluation: A systematic review. Comput Biol Med 2023; 158:106835. [PMID: 37019012] [DOI: 10.1016/j.compbiomed.2023.106835]
Abstract
Performing prescribed physical exercises during home-based rehabilitation programs plays an important role in regaining muscle strength and improving balance for people with different physical disabilities. However, patients attending these programs are not able to assess their action performance in the absence of a medical expert. Recently, vision-based sensors capable of capturing accurate skeleton data have been deployed in the activity monitoring domain. Together with significant advancements in Computer Vision (CV) and Deep Learning (DL) methodologies, these factors have enabled solutions for automatic patient activity monitoring, and improving the performance of such systems to assist patients and physiotherapists has attracted wide interest from the research community. This paper provides a comprehensive and up-to-date literature review of the different stages of skeleton data acquisition for physical exercise monitoring. The previously reported Artificial Intelligence (AI)-based methodologies for skeleton data analysis are then reviewed, in particular feature learning from skeleton data, evaluation, and feedback generation for rehabilitation monitoring, together with the challenges associated with these processes. Finally, the paper puts forward several suggestions for future research directions in this area.
7. Iannizzotto G, Lo Bello L, Nucita A. Improving BLE-Based Passive Human Sensing with Deep Learning. Sensors (Basel) 2023; 23:2581. [PMID: 36904785] [PMCID: PMC10007112] [DOI: 10.3390/s23052581]
Abstract
Passive Human Sensing (PHS) is an approach to collecting data on human presence, motion, or activities that does not require the sensed human to carry devices or participate actively in the sensing process. In the literature, PHS is generally performed by exploiting variations in dedicated WiFi Channel State Information, which are affected by human bodies obstructing the WiFi signal propagation path. However, the adoption of WiFi for PHS has some drawbacks related to power consumption, large-scale deployment costs, and interference with other networks in nearby areas. Bluetooth technology, and in particular its low-energy version, Bluetooth Low Energy (BLE), is a valid candidate for addressing the drawbacks of WiFi, thanks to its Adaptive Frequency Hopping (AFH) mechanism. This work proposes the application of a Deep Convolutional Neural Network (DNN) to improve the analysis and classification of BLE signal deformations for PHS using commercial standard BLE devices. The proposed approach was applied to reliably detect the presence of human occupants in a large and articulated room with only a few transmitters and receivers, and in conditions where the occupants do not directly occlude the line of sight between transmitters and receivers. This paper shows that the proposed approach significantly outperforms the most accurate technique found in the literature when applied to the same experimental data.
Affiliation(s)
- Giancarlo Iannizzotto, Andrea Nucita: Department of Cognitive Sciences, Psychology, Education and Cultural Studies (COSPECS), University of Messina, 98122 Messina, Italy
- Lucia Lo Bello: Department of Electrical, Electronic and Computer Engineering (DIEEI), University of Catania, 95125 Catania, Italy
8. Gaussian guided frame sequence encoder network for action quality assessment. Complex Intell Syst 2022. [DOI: 10.1007/s40747-022-00892-6]
Abstract
Can a computer evaluate an athlete's performance automatically? Many action quality assessment (AQA) methods have been proposed in recent years, but their performance is still limited by the randomness of video sampling and by simple model-training strategies. To address this, a Gaussian guided frame sequence encoder network is proposed in this paper. In the proposed method, the image feature of each video frame is extracted by a ResNet model. A frame sequence encoder network is then applied to model temporal information and generate an action quality feature, and a fully connected network predicts the action quality score. To train the proposed method effectively, and inspired by the final-score calculation rule in the Olympic Games, a Gaussian loss function is employed to compute the error between the predicted score and the label score. The proposed method is implemented on the AQA-7 and MTL-AQA datasets, and the experimental results confirm that it achieves better performance than state-of-the-art methods. Detailed ablation experiments are conducted to verify the effectiveness of each component of the module.
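The exact loss formulation is not reproduced here; the sketch below shows one plausible Gaussian-shaped regression loss between predicted and label scores, with an assumed sigma, purely to illustrate the idea of a loss that saturates for large errors.

```python
# Illustrative sketch only: one plausible Gaussian-shaped regression loss,
# flat near the label score and saturating for large errors. The exact form
# and sigma used in the cited paper are not given here; sigma is assumed.
import torch

def gaussian_loss(pred: torch.Tensor, label: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """1 - exp(-(pred - label)^2 / (2 sigma^2)), averaged over the batch."""
    return (1.0 - torch.exp(-((pred - label) ** 2) / (2.0 * sigma ** 2))).mean()

pred = torch.tensor([78.5, 60.0], requires_grad=True)
label = torch.tensor([80.0, 61.5])
loss = gaussian_loss(pred, label, sigma=5.0)
loss.backward()   # gradients flow back to the score predictor as usual
```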
9. KFSENet: A Key Frame-Based Skeleton Feature Estimation and Action Recognition Network for Improved Robot Vision with Face and Emotion Recognition. Appl Sci (Basel) 2022. [DOI: 10.3390/app12115455]
Abstract
In this paper, we propose an integrated approach to robot vision: a key frame-based skeleton feature estimation and action recognition network (KFSENet) that incorporates action recognition with face and emotion recognition to enable social robots to engage in more personal interactions. Instead of extracting human skeleton features from the entire video, we propose a key frame-based approach for their extraction using pose estimation models. We select the key frames using the gradient of a proposed total motion metric that is computed using dense optical flow. The human skeleton features extracted from the selected key frames are used to train a deep neural network (the double-feature double-motion network, DDNet) for action recognition. The proposed KFSENet uses a simpler model to learn and differentiate between the different action classes, is computationally cheaper, and yields better action recognition performance than existing methods. The use of key frames eliminates unnecessary and redundant information, which improves classification accuracy and decreases computational cost. The proposed method is tested on both publicly available standard benchmark datasets and self-collected datasets, and its performance is compared to existing state-of-the-art methods. Our results indicate that the proposed method yields better performance than existing methods. Moreover, our framework integrates face and emotion recognition to enable social robots to engage in more personal interaction with humans.
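As an illustrative sketch of key-frame selection from a motion metric (not the authors' exact criterion), the snippet below computes a per-frame total-motion value from dense optical flow and takes candidate key frames where the gradient of that curve changes sign; all parameters are assumptions.

```python
# Illustrative sketch only: a per-frame "total motion" value from dense optical
# flow, with candidate key frames where its gradient changes sign.
import cv2
import numpy as np

def total_motion_curve(gray_frames):
    """Mean optical-flow magnitude between consecutive grayscale frames."""
    motion = []
    for prev, nxt in zip(gray_frames[:-1], gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        motion.append(float(np.linalg.norm(flow, axis=2).mean()))
    return np.array(motion)

def key_frame_indices(motion: np.ndarray):
    """Indices where the discrete gradient of the motion metric changes sign."""
    grad = np.gradient(motion)
    return [i for i in range(1, len(grad)) if np.sign(grad[i]) != np.sign(grad[i - 1])]

frames = [np.random.randint(0, 255, (120, 160), np.uint8) for _ in range(30)]
print(key_frame_indices(total_motion_curve(frames)))
```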
10. A Deep Learning and Clustering Extraction Mechanism for Recognizing the Actions of Athletes in Sports. Comput Intell Neurosci 2022; 2022:2663834. [PMID: 35371202] [PMCID: PMC8970900] [DOI: 10.1155/2022/2663834]
Abstract
In sports, a complete technical action is in essence a complete information structure pattern, and an athlete's judgment of an action is actually the identification of that movement information structure pattern. Action recognition refers to the ability of the human brain to distinguish a perceived action from other actions and to obtain predictive response information when identifying and confirming it from the constantly changing motion information on the field. Action recognition mainly includes two aspects: obtaining the required action information through visual observation, and judging the action based on the obtained information; the neuropsychological mechanism of this process is still unknown. In this paper, a new key frame extraction method based on a clustering algorithm and multi-feature fusion is proposed for sports videos with complex content, many scenes, and rich actions. First, multiple features are fused so that similarity measurement can describe videos with complex content more completely and comprehensively. Second, a clustering algorithm is used to cluster sports video sequences by scene, avoiding the difficult and complicated task of shot segmentation when many scenes are present. Third, key frames are extracted according to a minimum-motion criterion, which represents action-rich video content more accurately. In addition, the clustering algorithm used in this paper is improved to enhance the offline computing efficiency of the key frame extraction system. Based on an analysis of the advantages and disadvantages of classical convolutional and recurrent neural network algorithms in deep learning, this paper proposes an improved convolutional network optimized for recognizing and analyzing human actions in complex scenes with complex actions and fast motion, and compares it with recurrent and hybrid neural network algorithms. Experiments show that the algorithm approaches human observers' assessment of athletes' training execution and completion. Compared with other algorithms, it is verified to achieve a very high learning rate and accuracy for athlete action recognition.
11. Vandevoorde K, Vollenkemper L, Schwan C, Kohlhase M, Schenck W. Using Artificial Intelligence for Assistance Systems to Bring Motor Learning Principles into Real World Motor Tasks. Sensors (Basel) 2022; 22:2481. [PMID: 35408094] [PMCID: PMC9002555] [DOI: 10.3390/s22072481]
Abstract
Humans learn movements naturally, but it takes a lot of time and training to achieve expert performance in motor skills. In this review, we show how modern technologies can support people in learning new motor skills. First, we introduce important concepts in motor control, motor learning, and motor skill learning. We also give an overview of the rapid expansion of machine learning algorithms and sensor technologies for human motion analysis. The integration of motor learning principles, machine learning algorithms, and recent sensor technologies has the potential to produce AI-guided assistance systems for motor skill training. We give our perspective on this integration of different fields for the transition from motor learning research in laboratory settings to real-world environments and real-world motor tasks, and we propose a stepwise approach to facilitate this transition.
Affiliation(s)
- Koenraad Vandevoorde, Wolfram Schenck: Center for Applied Data Science (CfADS), Faculty of Engineering and Mathematics, Bielefeld University of Applied Sciences, 33619 Bielefeld, Germany
12. Detection of Physical Strain and Fatigue in Industrial Environments Using Visual and Non-Visual Low-Cost Sensors. Technologies 2022. [DOI: 10.3390/technologies10020042]
Abstract
The detection and prevention of workers' body-straining postures and other stressful conditions within the work environment support occupational safety and promote well-being and sustainability at work. Methods developed towards this aim typically rely on combining highly ergonomic workplaces with expensive monitoring mechanisms, including wearable devices. In this work, we demonstrate how input from low-cost sensors, specifically passive camera sensors installed in a real manufacturing workplace and smartwatches worn by the workers, can provide useful feedback on the workers' condition and yield key indicators for the prevention of work-related musculoskeletal disorders (WMSDs) and physical fatigue. To this end, we study the online assessment of workers' risk for physical strain during work activities based on the classification of ergonomically sub-optimal working postures from visual information, the correlation and fusion of these estimations with synchronous worker heart rate data, and the prediction of near-future heart rate using deep learning-based techniques. Moreover, a new multi-modal dataset of video and heart rate data captured in a real manufacturing workplace during car door assembly activities is introduced. The experimental results show the efficiency of the proposed approach, which exceeds a 70% classification rate in terms of the F1 score on a set of over 300 annotated video clips of real line workers during work activities. In addition, a time-lagged correlation between the estimated ergonomic risks for physical strain and high heart rate was assessed using a larger dataset of synchronous visual and heart rate data sequences. The statistical analysis revealed that imposing increased strain on body parts results in an increase in heart rate after 100-120 s. This finding is used to improve the short-term forecasting of a worker's cardiovascular activity for the next 10 to 30 s by fusing the heart rate data with the estimated ergonomic risks for physical strain, and ultimately to train better predictive models of worker fatigue.
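To illustrate the kind of lagged analysis described (not the paper's code), the following sketch estimates the lag at which an ergonomic-risk time series correlates most strongly with subsequent heart rate; the 1 Hz sampling rate and synthetic signals are assumptions.

```python
# Illustrative sketch only: find the lag (in seconds) maximising the Pearson
# correlation between an ergonomic-risk series and later heart-rate values.
import numpy as np

def best_lag(risk: np.ndarray, heart_rate: np.ndarray, max_lag_s: int, fs: float = 1.0):
    """Return (lag_seconds, correlation) maximising corr(risk[t], hr[t + lag])."""
    best = (0.0, -1.0)
    for lag in range(0, int(max_lag_s * fs) + 1):
        if lag >= len(risk) - 2:
            break
        r = np.corrcoef(risk[:len(risk) - lag], heart_rate[lag:])[0, 1]
        if r > best[1]:
            best = (lag / fs, float(r))
    return best

t = np.arange(600)                                         # 10 minutes at 1 Hz (assumed)
risk = (np.sin(t / 40.0) > 0.4).astype(float)              # synthetic risk indicator
hr = 70 + 15 * np.roll(risk, 110) + np.random.randn(600)   # HR responds ~110 s later
print(best_lag(risk, hr, max_lag_s=180))                   # lag of roughly 110 s
```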
13. Feng M, Meunier J. Skeleton Graph-Neural-Network-Based Human Action Recognition: A Survey. Sensors (Basel) 2022; 22:2091. [PMID: 35336262] [PMCID: PMC8952863] [DOI: 10.3390/s22062091]
Abstract
Human action recognition has been applied in many fields, such as video surveillance and human-computer interaction, where it helps to improve performance. Numerous reviews of the literature have been conducted, but they have rarely concentrated on skeleton-graph-based approaches. Connecting the skeleton joints as they are connected in the physical body naturally generates a graph. This paper provides an up-to-date review of skeleton graph-neural-network-based human action recognition. After analyzing previous related studies, a new taxonomy for skeleton-GNN-based methods is proposed according to their designs, and their merits and demerits are analyzed. In addition, the datasets and available codes are discussed. Finally, future research directions are suggested.
Affiliation(s)
- Jean Meunier: Department of Computer Science and Operations Research, University of Montreal, Montreal, QC H3C 3J7, Canada
14. A Contemporary Review on Utilizing Semantic Web Technologies in Healthcare, Virtual Communities, and Ontology-Based Information Processing Systems. Electronics 2022. [DOI: 10.3390/electronics11030453]
Abstract
The semantic web is an emerging technology that helps connect different users to create their own content and facilitates the representation of information in a manner that computers can understand. As the world heads towards the fourth industrial revolution, the use of artificial-intelligence-enabled semantic web technologies paves the way for many real-time application developments. The fundamental building blocks for the widespread use of semantic web technologies are ontologies, which allow concepts to be shared and reused in a standardized way, so that data gathered from heterogeneous sources receive a common nomenclature and duplicates can be disambiguated easily. In this context, the right use of ontology capabilities further strengthens their presence in many web-based applications such as e-learning, virtual communities, social media sites, healthcare, and agriculture. In this paper, we give a comprehensive review of the use of the semantic web in the healthcare domain, in virtual communities, and in other information retrieval projects. As the role of the semantic web becomes pervasive across domains, demand for it in healthcare, virtual communities, and information retrieval has gained considerable momentum in recent years. To obtain the correct sense of the words or terms in textual content, it is necessary to apply the right ontology to resolve ambiguity and avoid deviations in the concepts. In this review paper, we highlight the information needed for a good understanding of the semantic web and its ontological frameworks.
15. Wu Y, Lin Q, Yang M, Liu J, Tian J, Kapil D, Vanderbloemen L. A Computer Vision-Based Yoga Pose Grading Approach Using Contrastive Skeleton Feature Representations. Healthcare (Basel) 2021; 10:36. [PMID: 35052200] [PMCID: PMC8775687] [DOI: 10.3390/healthcare10010036]
Abstract
The main objective of yoga pose grading is to assess an input yoga pose and compare it to a standard pose in order to provide a quantitative evaluation as a grade. In this paper, a computer vision-based yoga pose grading approach is proposed using contrastive skeleton feature representations. First, the proposed approach extracts human body skeleton keypoints from the input yoga pose image and feeds their coordinates into a pose feature encoder, which is trained using contrastive triplet examples; finally, similar encoded pose features are compared. Furthermore, to tackle the inherent challenge of composing contrastive examples for pose feature encoding, this paper proposes a new strategy that uses both a coarse triplet example (an anchor, a positive example from the same category, and a negative example from a different category) and a fine triplet example (an anchor, a positive example, and a negative example from the same category with a different pose quality). Extensive experiments are conducted on two benchmark datasets to demonstrate the superior performance of the proposed approach.
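A minimal sketch of the coarse-plus-fine triplet idea is given below; the encoder, margins, and loss weights are assumptions, and the snippet is not the authors' training code.

```python
# Illustrative sketch only: combine a "coarse" triplet (negative from a
# different pose category) and a "fine" triplet (negative from the same
# category but lower quality) using standard triplet margin losses.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(17 * 2, 64), nn.ReLU(), nn.Linear(64, 32))  # 2D keypoints -> embedding
coarse_loss = nn.TripletMarginLoss(margin=1.0)
fine_loss = nn.TripletMarginLoss(margin=0.3)   # smaller margin for same-category pairs (assumed)

def pose_grading_loss(anchor, pos, neg_other_cat, neg_same_cat, w_fine: float = 0.5):
    a, p = encoder(anchor), encoder(pos)
    return (coarse_loss(a, p, encoder(neg_other_cat))
            + w_fine * fine_loss(a, p, encoder(neg_same_cat)))

kp = lambda: torch.randn(8, 34)         # batch of 8 flattened keypoint sets
loss = pose_grading_loss(kp(), kp(), kp(), kp())
loss.backward()
```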
Affiliation(s)
- Yubin Wu, Qianqian Lin, Mingrun Yang, Jing Liu, Jing Tian (corresponding author): Institute of Systems Science, National University of Singapore, Singapore 119615, Singapore
- Dev Kapil: One Wellness Pte Ltd., Singapore 188033, Singapore
- Laura Vanderbloemen: College of Health Sciences, VinUniversity, Hanoi 10000, Vietnam; Faculty of Medicine, Imperial College London, London SW7 2AZ, UK
16. Hernandez-Gomez JC, Restrepo-Martínez A, Valencia-Aguirre J. Descripción del movimiento humano basado en el marco de Frenet Serret y datos tipo MOCAP [Description of human movement based on the Frenet-Serret frame and MOCAP data]. Revista Politécnica 2021. [DOI: 10.33571/rpolitec.v17n34a11]
Abstract
Classifying human movement has become a technological necessity: defining a subject's position requires identifying the trajectory of the limbs and the trunk of the body and being able to differentiate this position from that of other subjects or movements, which creates the need for data and algorithms that facilitate classification. In this work, the discriminant capacity of motion capture data in physical rehabilitation is evaluated, where the subjects' position is acquired with the Microsoft Kinect and with optical markers, and movement attributes are generated with the Frenet-Serret frame. Their discriminant capacity is evaluated with support vector machine, neural network, and k-nearest-neighbors algorithms. The results show a classification accuracy of 93.5% with data obtained from the Kinect, and 100% for movements where the position is defined with optical markers.
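For reference, the snippet below computes standard discrete Frenet-Serret descriptors (curvature and torsion) from a sampled 3D joint trajectory, the kind of attributes that could feed an SVM or k-NN classifier; it is a generic sketch, not the paper's implementation.

```python
# Illustrative sketch only: discrete curvature and torsion of a 3D joint path,
# using the standard Frenet-Serret relations kappa = |r' x r''| / |r'|^3 and
# tau = (r' x r'') . r''' / |r' x r''|^2.
import numpy as np

def frenet_descriptors(traj: np.ndarray, dt: float = 1.0):
    """traj: (T, 3) joint positions. Returns per-sample curvature and torsion."""
    d1 = np.gradient(traj, dt, axis=0)          # velocity
    d2 = np.gradient(d1, dt, axis=0)            # acceleration
    d3 = np.gradient(d2, dt, axis=0)            # jerk
    cross = np.cross(d1, d2)
    speed = np.linalg.norm(d1, axis=1)
    curvature = np.linalg.norm(cross, axis=1) / np.clip(speed ** 3, 1e-8, None)
    torsion = np.einsum('ij,ij->i', cross, d3) / np.clip(
        np.linalg.norm(cross, axis=1) ** 2, 1e-8, None)
    return curvature, torsion

t = np.linspace(0, 4 * np.pi, 200)
helix = np.stack([np.cos(t), np.sin(t), 0.1 * t], axis=1)   # synthetic "joint" path
k, tau = frenet_descriptors(helix, dt=t[1] - t[0])
print(k.mean(), tau.mean())   # roughly constant for a helix
```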
17. Estimation of Motion and Respiratory Characteristics during the Meditation Practice Based on Video Analysis. Sensors (Basel) 2021; 21:3771. [PMID: 34072291] [PMCID: PMC8199391] [DOI: 10.3390/s21113771]
Abstract
Meditation practice is mental health training that helps people reduce stress and suppress negative thoughts. In this paper, we propose a camera-based meditation evaluation system that helps meditators improve their performance. We rely on two main criteria to measure focus: breathing characteristics (respiratory rate, breathing rhythmicity, and stability) and body movement. We introduce a contactless sensor that measures the respiratory rate from a smartphone camera by detecting the chest keypoint in each frame, using an optical-flow-based algorithm to calculate the displacement between frames, filtering and de-noising the chest movement signal, and counting the real peaks in this signal. We also present an approach to detecting the movement of different body parts (head, thorax, shoulders, elbows, wrists, stomach, and knees). We collected a non-annotated dataset of ninety meditation practice videos and an annotated dataset of eight videos. The non-annotated dataset, categorized into beginner and professional meditators, was used for algorithm development and parameter tuning. The annotated dataset was used for evaluation and showed that human activity during meditation practice can be correctly estimated by the presented approach, with a mean absolute error for the respiratory rate of around 1.75 BPM, which can be considered tolerable for the meditation application.
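As a simplified sketch of respiratory-rate estimation from a chest-displacement signal (not the paper's pipeline), the snippet below band-pass filters the signal and counts peaks; the sampling rate, filter band, and peak spacing are assumptions.

```python
# Illustrative sketch only: estimate breaths per minute by smoothing a chest-
# displacement signal and counting its peaks. Parameters are assumed.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def respiratory_rate_bpm(chest_disp: np.ndarray, fs: float = 30.0) -> float:
    """chest_disp: vertical chest keypoint displacement per video frame."""
    b, a = butter(2, [0.1, 0.7], btype='band', fs=fs)   # keep ~6-42 breaths/min
    smooth = filtfilt(b, a, chest_disp)
    peaks, _ = find_peaks(smooth, distance=fs * 1.5)    # >= 1.5 s between breaths
    return len(peaks) * 60.0 / (len(chest_disp) / fs)

fs, dur = 30.0, 60.0
t = np.arange(0, dur, 1 / fs)
signal = np.sin(2 * np.pi * 0.25 * t) + 0.2 * np.random.randn(t.size)  # ~15 breaths/min
print(respiratory_rate_bpm(signal, fs))   # roughly 15
```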
18. Sardari F, Paiement A, Hannuna S, Mirmehdi M. VI-Net: View-Invariant Quality of Human Movement Assessment. Sensors (Basel) 2020; 20:5258. [PMID: 32942561] [PMCID: PMC7570706] [DOI: 10.3390/s20185258]
Abstract
We propose a view-invariant method for assessing the quality of human movements that does not rely on skeleton data. Our end-to-end convolutional neural network consists of two stages: first, a view-invariant trajectory descriptor for each body joint is generated from RGB images, and then the collection of trajectories for all joints is processed by an adapted, pre-trained 2D convolutional neural network (CNN) (e.g., VGG-19 or ResNeXt-50) to learn the relationships among the different body parts and deliver a score for the movement quality. We release QMAR, the only publicly available, multi-view, non-skeleton, non-mocap rehabilitation movement dataset, and provide results for both cross-subject and cross-view scenarios on this dataset. We show that VI-Net achieves an average rank correlation of 0.66 on cross-subject and 0.65 on unseen views when trained on only two views. We also evaluate the proposed method on the single-view rehabilitation dataset KIMORE and obtain a rank correlation of 0.66 against a baseline of 0.62.
Affiliation(s)
- Faegheh Sardari (corresponding author), Sion Hannuna, Majid Mirmehdi: Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK
- Adeline Paiement: Université de Toulon, Aix Marseille Univ, CNRS, LIS, Marseille, France
19. Human Psychophysiological Activity Estimation Based on Smartphone Camera and Wearable Electronics. Future Internet 2020. [DOI: 10.3390/fi12070111]
Abstract
This paper presents a study of human psychophysiological activity estimation based on a smartphone camera and sensors. In recent years, awareness of the human body, as well as of human mental states, has become more and more popular. Yoga and meditation practices have spread from the East to Europe, the USA, Russia, and other countries, and many people are interested in them; however, practitioners would often prefer an objective assessment of their practice. We propose to apply modern methods of computer vision, pattern recognition, competence management, and dynamic motivation to estimate the quality of the meditation process and provide users with objective information about their practice. The proposed approach recognizes images of the practitioner from a smartphone and utilizes wearable electronics to measure the user's heart rate and motions, and we propose a model that builds meditation estimation scores from these parameters. Moreover, we propose a meditation expert network through which users can find the coach most appropriate for them. Finally, we propose a dynamic motivation model that encourages people to perform the practice every day.
20. Learning Effective Skeletal Representations on RGB Video for Fine-Grained Human Action Quality Assessment. Electronics 2020. [DOI: 10.3390/electronics9040568]
Abstract
In this paper, we propose an integrated action classification and regression learning framework for the fine-grained human action quality assessment of RGB videos. On the basis of 2D skeleton data obtained per frame of RGB video sequences, we present an effective representation of joint trajectories to train action classifiers and a class-specific regression model for a fine-grained assessment of the quality of human actions. To manage the challenge of view changes due to camera motion, we develop a self-similarity feature descriptor extracted from joint trajectories and a joint displacement sequence to represent dynamic patterns of the movement and posture of the human body. To weigh the impact of joints for different action categories, a class-specific regression model is developed to obtain effective fine-grained assessment functions. In the testing stage, with the supervision of the action classifier’s output, the regression model of a specific action category is selected to assess the quality of skeleton motion extracted from the action video. We take advantage of the discrimination of the action classifier and the viewpoint invariance of the self-similarity feature to boost the performance of the learning-based quality assessment method in a realistic scene. We evaluate our proposed method using diving and figure skating videos of the publicly available MIT Olympic Scoring dataset, and gymnastic vaulting videos of the recent benchmark University of Nevada Las Vegas (UNLV) Olympic Scoring dataset. The experimental results show that the proposed method achieved an improved performance, which is measured by the mean rank correlation coefficient between the predicted regression scores and the ground truths.
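The following sketch shows a generic self-similarity matrix over a 2D-skeleton sequence, one common way to obtain viewpoint-robust descriptors from joint trajectories; the shapes are assumptions and this is not the paper's descriptor.

```python
# Illustrative sketch only: a self-similarity matrix (SSM) over a 2D-skeleton
# sequence, i.e., pairwise distances between per-frame poses. Such descriptors
# are largely insensitive to viewpoint changes; shapes here are assumed.
import numpy as np

def self_similarity_matrix(skeleton_seq: np.ndarray) -> np.ndarray:
    """skeleton_seq: (T, J, 2) 2D joints per frame. Returns a (T, T) SSM."""
    flat = skeleton_seq.reshape(skeleton_seq.shape[0], -1)      # (T, J*2)
    diff = flat[:, None, :] - flat[None, :, :]                  # (T, T, J*2)
    return np.linalg.norm(diff, axis=-1)

seq = np.random.rand(50, 17, 2)            # 50 frames, 17 joints (e.g., COCO layout)
ssm = self_similarity_matrix(seq)
print(ssm.shape, np.allclose(ssm, ssm.T))  # (50, 50) True
```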
21. Lian KY, Hsu WH, Balram D, Lee CY. A Real-Time Wearable Assist System for Upper Extremity Throwing Action Based on Accelerometers. Sensors (Basel) 2020; 20:1344. [PMID: 32121453] [PMCID: PMC7085600] [DOI: 10.3390/s20051344]
Abstract
This paper focuses on the development of a real-time wearable assist system for the upper extremity throwing action, based on the accelerometers of inertial measurement unit (IMU) sensors. The real-time assist system can be used for the learning, rectification, and rehabilitation of the upper extremity throwing action of baseball players, where incorrect throwing phases are recognized through a detailed action analysis. The throwing action involves not only the posture characteristics of each phase but also the transitions of continuous posture movements, making it more complex than general action recognition with no continuous phase change. In this work, we consider six serial phases in the throwing action recognition process: wind-up, stride, arm cocking, arm acceleration, arm deceleration, and follow-through. The continuous movement of each phase is represented by a one-dimensional data sequence after the three-axis acceleration signals are processed by efficient Kalman-filter-based noise filtering, followed by conversion processes such as leveling and labeling. The longest common subsequence (LCS) method is then used to determine the six serial phases by matching the sequence data against a sample sequence. The proposed assist system incorporates various intelligent action recognition functions, including automatic recognition of the ready status, starting movement, interrupt handling, and detailed posture transitions. Moreover, a liquid crystal display (LCD) panel and a mobile interface make the system more user-friendly. The real-time system provides precise comments that help players improve their throwing action by analyzing their posture during the throw. Various experiments were conducted to analyze the efficiency and practicality of the developed assist system. We obtained average accuracies of 95.14%, 91.42%, and 95.14% for the three users considered in this study, and the throwing action was recognized with good precision; the high accuracy exhibited by the proposed assist system indicates its excellent performance.
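A minimal longest-common-subsequence sketch is shown below to illustrate matching a labelled phase sequence against a six-phase template; the symbol labels are hypothetical and this is not the authors' implementation.

```python
# Illustrative sketch only: LCS between a labelled, accelerometer-derived
# symbol sequence and a template of the six throwing phases.
def lcs_length(observed: str, template: str) -> int:
    m, n = len(observed), len(template)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = (dp[i - 1][j - 1] + 1 if observed[i - 1] == template[j - 1]
                        else max(dp[i - 1][j], dp[i][j - 1]))
    return dp[m][n]

# W=wind-up, S=stride, C=arm cocking, A=acceleration, D=deceleration, F=follow-through
template = "WSCADF"
observed = "WWSSCCAADDF"                # levelled/labelled sensor sequence (assumed)
print(lcs_length(observed, template))   # 6 -> all phases present, in order
```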