1. Pang H, Zheng L, Fang H. Cross-Attention Enhanced Pyramid Multi-Scale Networks for Sensor-Based Human Activity Recognition. IEEE J Biomed Health Inform 2024;28:2733-2744. PMID: 38483804; DOI: 10.1109/jbhi.2024.3377353.
Abstract
Human Activity Recognition (HAR) has recently attracted widespread attention, with the effective application of this technology helping people in areas such as healthcare, smart homes, and gait analysis. Deep learning methods have shown remarkable performance in HAR. A pivotal challenge is the trade-off between recognition accuracy and computational efficiency, especially in resource-constrained mobile devices. This challenge necessitates the development of models that enhance feature representation capabilities without imposing additional computational burdens. Addressing this, we introduce a novel HAR model leveraging deep learning, ingeniously designed to navigate the accuracy-efficiency trade-off. The model comprises two innovative modules: 1) Pyramid Multi-scale Convolutional Network (PMCN), which is designed with a symmetric structure and is capable of obtaining a rich receptive field at a finer level through its multiscale representation capability; 2) Cross-Attention Mechanism, which establishes interrelationships among sensor dimensions, temporal dimensions, and channel dimensions, and effectively enhances useful information while suppressing irrelevant data. The proposed model is rigorously evaluated across four diverse datasets: UCI, WISDM, PAMAP2, and OPPORTUNITY. Additional ablation and comparative studies are conducted to comprehensively assess the performance of the model. Experimental results demonstrate that the proposed model achieves superior activity recognition accuracy while maintaining low computational overhead.
2. Oh Y. Data Augmentation Techniques for Accurate Action Classification in Stroke Patients with Hemiparesis. Sensors (Basel) 2024;24:1618. PMID: 38475154; DOI: 10.3390/s24051618.
Abstract
Stroke survivors with hemiparesis require extensive home-based rehabilitation. Deep learning-based classifiers can detect actions and provide feedback based on patient data; however, this is difficult owing to data sparsity and heterogeneity. In this study, we investigate data augmentation and model training strategies to address this problem. Three transformations are tested with varying data volumes to analyze the changes in the classification performance of individual data. Moreover, the impact of transfer learning relative to a pre-trained one-dimensional convolutional neural network (Conv1D) and training with an advanced InceptionTime model are estimated with data augmentation. In Conv1D, the joint training data of non-disabled (ND) participants and double rotationally augmented data of stroke patients is observed to outperform the baseline in terms of F1-score (60.9% vs. 47.3%). Transfer learning pre-trained with ND data exhibits 60.3% accuracy, whereas joint training with InceptionTime exhibits 67.2% accuracy under the same conditions. Our results indicate that rotational augmentation is more effective for individual data with initially lower performance and subset data with smaller numbers of participants than other techniques, suggesting that joint training on rotationally augmented ND and stroke data enhances classification performance, particularly in cases with sparse data and lower initial performance.
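The rotational augmentation described in this abstract can be sketched as follows. This is an illustrative numpy version only: the rotation axis, angle distribution, and 128-sample window length are assumptions, not the paper's exact settings.

```python
import numpy as np

def rotate_window(window, angle_rad, axis="z"):
    """Rotate a (T, 3) accelerometer window about one body axis.

    Illustrative sketch of rotational data augmentation for IMU signals;
    the paper's exact rotation scheme may differ.
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    if axis == "z":
        R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    elif axis == "y":
        R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    else:  # "x"
        R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    # Apply the rotation to every time step at once
    return window @ R.T

rng = np.random.default_rng(0)
window = rng.standard_normal((128, 3))               # one 128-sample, 3-axis window
augmented = rotate_window(window, rng.uniform(0, 2 * np.pi))
```

Because the transform is a pure rotation, the per-sample acceleration magnitude is preserved, which is why this augmentation leaves the physical plausibility of the signal intact.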
Affiliation(s)
- Youngmin Oh
- School of Computing, Gachon University, Seongnam 13120, Republic of Korea
3. Azadi B, Haslgrübler M, Anzengruber-Tanase B, Sopidis G, Ferscha A. Robust Feature Representation Using Multi-Task Learning for Human Activity Recognition. Sensors (Basel) 2024;24:681. PMID: 38276371; PMCID: PMC10819053; DOI: 10.3390/s24020681.
Abstract
Learning underlying patterns from sensory data is crucial in the Human Activity Recognition (HAR) task to avoid poor generalization when coping with unseen data. A key solution to such an issue is representation learning, which becomes essential when input signals contain activities with similar patterns or when patterns generated by different subjects for the same activity vary. To address these issues, we seek to increase generalization by learning the underlying factors of each sensor signal. We develop a novel multi-channel asymmetric auto-encoder to recreate input signals precisely and extract indicative unsupervised features. Further, we investigate the role of various activation functions in signal reconstruction to ensure the model preserves the patterns of each activity in the output. Our main contribution is a multi-task learning model that enhances representation learning through shared layers between signal reconstruction and the HAR task, improving the robustness of the model in coping with users not included in the training phase. The proposed model learns shared features between different tasks that are indeed the underlying factors of each input signal. We validate our multi-task learning model using several publicly available HAR datasets, UCI-HAR, MHealth, PAMAP2, and USC-HAD, and an in-house alpine skiing dataset collected in the wild, on which our model achieved 99%, 99%, 95%, 88%, and 92% accuracy, respectively. Our proposed method shows consistent performance and good generalization on all the datasets compared to the state of the art.
Affiliation(s)
- Behrooz Azadi
- Pro2Future GmbH, Altenberger Strasse 69, 4040 Linz, Austria
- Michael Haslgrübler
- Pro2Future GmbH, Altenberger Strasse 69, 4040 Linz, Austria
- Georgios Sopidis
- Pro2Future GmbH, Altenberger Strasse 69, 4040 Linz, Austria
- Alois Ferscha
- Institute of Pervasive Computing, Johannes Kepler University, Altenberger Straße 69, 4040 Linz, Austria
4. Vuong TH, Doan T, Takasu A. Deep Wavelet Convolutional Neural Networks for Multimodal Human Activity Recognition Using Wearable Inertial Sensors. Sensors (Basel) 2023;23:9721. PMID: 38139567; PMCID: PMC10747357; DOI: 10.3390/s23249721.
Abstract
Recent advances in wearable systems have made inertial sensors, such as accelerometers and gyroscopes, compact, lightweight, multimodal, low-cost, and highly accurate. Wearable inertial sensor-based multimodal human activity recognition (HAR) methods utilize the rich sensing data from embedded multimodal sensors to infer human activities. However, existing HAR approaches either rely on domain knowledge or fail to address the time-frequency dependencies of multimodal sensor signals. In this paper, we propose a novel method called deep wavelet convolutional neural networks (DWCNN) designed to learn features from the time-frequency domain and improve accuracy for multimodal HAR. DWCNN introduces a framework that combines continuous wavelet transforms (CWT) with enhanced deep convolutional neural networks (DCNN) to capture the dependencies of sensing signals in the time-frequency domain, thereby enhancing the feature representation ability for multiple wearable inertial sensor-based HAR tasks. Within the CWT, we further propose an algorithm to estimate the wavelet scale parameter. This helps enhance the performance of CWT when computing the time-frequency representation of the input signals. The output of the CWT then serves as input for the proposed DCNN, which consists of residual blocks for extracting features from different modalities and attention blocks for fusing these features of multimodal signals. We conducted extensive experiments on five benchmark HAR datasets: WISDM, UCI-HAR, Heterogeneous, PAMAP2, and UniMiB SHAR. The experimental results demonstrate the superior performance of the proposed model over existing competitors.
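A minimal version of the CWT front end described above can be implemented directly in numpy by convolving the signal with scaled wavelets. The real-valued Morlet-style wavelet, the center frequency `w0`, and the scale range are illustrative assumptions; the paper's scale-estimation algorithm is not reproduced here.

```python
import numpy as np

def morlet_cwt(signal, scales, w0=5.0):
    """Continuous wavelet transform of a 1-D signal via direct convolution
    with scaled (real-valued) Morlet-style wavelets.

    Returns an array of shape (len(scales), len(signal)): one row of
    time-frequency coefficients per scale, suitable as CNN input.
    """
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # wavelet support: roughly 4 standard deviations on each side
        t = np.arange(-4 * s, 4 * s + 1)
        wavelet = np.exp(-0.5 * (t / s) ** 2) * np.cos(w0 * t / s)
        wavelet /= np.sqrt(s)                       # scale normalization
        out[i] = np.convolve(signal, wavelet, mode="same")
    return out

t = np.linspace(0, 1, 256)
sig = np.sin(2 * np.pi * 8 * t)                     # 8 Hz test tone
scalogram = morlet_cwt(sig, scales=np.arange(1, 17))
```

The resulting scalogram (scales x time) is the kind of 2-D time-frequency representation that the paper's DCNN stage would consume in place of the raw 1-D signal.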
Affiliation(s)
- Thi Hong Vuong
- Department of Informatics, National Institute of Informatics, Tokyo 101-0003, Japan
- Tung Doan
- Department of Computer Engineering, School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi 11615, Vietnam
- Atsuhiro Takasu
- Department of Informatics, National Institute of Informatics, Tokyo 101-0003, Japan
5. Liu H, Zhao B, Dai C, Sun B, Li A, Wang Z. MAG-Res2Net: a novel deep learning network for human activity recognition. Physiol Meas 2023;44:115007. PMID: 37939391; DOI: 10.1088/1361-6579/ad0ab8.
Abstract
Objective. Human activity recognition (HAR) has become increasingly important in healthcare, sports, and fitness domains due to its wide range of applications. However, existing deep learning based HAR methods often overlook the challenges posed by the diversity of human activities and data quality, which can make feature extraction difficult. To address these issues, we propose a new neural network model called MAG-Res2Net, which incorporates the Borderline-SMOTE data upsampling algorithm, a loss function combination algorithm based on metric learning, and the Lion optimization algorithm. Approach. We evaluated the proposed method on two commonly utilized public datasets, UCI-HAR and WISDM, and leveraged the CSL-SHARE multimodal human activity recognition dataset for comparison with state-of-the-art models. Main results. On the UCI-HAR dataset, our model achieved accuracy, F1-macro, and F1-weighted scores of 94.44%, 94.38%, and 94.26%, respectively. On the WISDM dataset, the corresponding scores were 98.32%, 97.26%, and 98.42%, respectively. Significance. The proposed MAG-Res2Net model demonstrates robust multimodal performance, with each module successfully enhancing model capabilities. Additionally, our model surpasses current human activity recognition neural networks on both evaluation metrics and training efficiency. Source code of this work is available at: https://github.com/LHY1007/MAG-Res2Net.
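The SMOTE-family upsampling step can be illustrated with a plain-SMOTE interpolation sketch: each synthetic sample lies on the segment between a minority sample and one of its k nearest minority neighbours. The Borderline variant used in the paper additionally restricts the seed points to borderline samples, which is omitted here; sizes and names are arbitrary.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by linear interpolation
    between a random minority sample and one of its k nearest minority
    neighbours (plain SMOTE; the Borderline variant filters the seeds)."""
    rng = np.random.default_rng(rng)
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[j], axis=1)
        neighbours = np.argsort(d)[1:k + 1]          # skip the sample itself
        nb = X_min[rng.choice(neighbours)]
        synthetic[i] = X_min[j] + rng.random() * (nb - X_min[j])
    return synthetic

rng = np.random.default_rng(1)
X_min = rng.standard_normal((20, 6))                 # minority-class feature rows
X_new = smote_like_oversample(X_min, n_new=30, rng=2)
```

In practice one would use a maintained implementation such as imbalanced-learn's `BorderlineSMOTE` rather than hand-rolling this loop.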
Affiliation(s)
- Hanyu Liu, Boyang Zhao, Chubo Dai, Boxin Sun, Ang Li, Zhiqiong Wang
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110819, People's Republic of China
6. Cao K, Wang M. Human behavior recognition based on sparse transformer with channel attention mechanism. Front Physiol 2023;14:1239453. PMID: 38028781; PMCID: PMC10653302; DOI: 10.3389/fphys.2023.1239453.
Abstract
Human activity recognition (HAR) has recently become a popular research field in wearable sensor technology. By analyzing human behavior data, disease risks or potential health issues can be detected, and patients' rehabilitation progress can be evaluated. With the excellent performance of the Transformer in natural language processing and visual tasks, researchers have begun to focus on its application to time series. The Transformer models long-term dependencies between sequences through self-attention mechanisms, capturing contextual information over extended periods. In this paper, we propose a hybrid model based on the channel attention mechanism and the Transformer model to improve the feature representation ability of sensor-based HAR tasks. Extensive experiments were conducted on three public HAR datasets, and the results show that our network achieved accuracies of 98.10%, 97.21%, and 98.82% on the HARTH, PAMAP2, and UCI-HAR datasets, respectively. The overall performance is on par with the most advanced methods.
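The self-attention computation at the core of the Transformer component can be sketched in numpy. This is a single-head toy with random stand-in weight matrices (a trained model would learn them); the sparse and channel-attention refinements of the paper are not reproduced.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sensor window.

    X: (T, d) window of features; returns the attended output (T, d)
    and the (T, T) attention weight matrix.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])           # scaled similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # softmax over time steps
    return weights @ V, weights

rng = np.random.default_rng(0)
T, d = 64, 16
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
```

Each output time step is a convex combination of all value vectors, which is what lets the model capture long-range dependencies in the sensor sequence.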
Affiliation(s)
- Keyan Cao
- School of Computer Science and Engineering, Shenyang Jianzhu University, Shenyang, Liaoning, China
7. Takenaka K, Kondo K, Hasegawa T. Segment-Based Unsupervised Learning Method in Sensor-Based Human Activity Recognition. Sensors (Basel) 2023;23:8449. PMID: 37896542; PMCID: PMC10610695; DOI: 10.3390/s23208449.
Abstract
Sensor-based human activity recognition (HAR) is the task of recognizing human activities, and it plays an important role in analyzing human behavior, such as in the healthcare field. HAR is typically implemented using traditional machine learning methods. In contrast, deep learning models can be trained end-to-end with automatic feature extraction from raw sensor data and can therefore adapt to various situations. However, deep learning models require substantial amounts of training data, and annotating activity labels to construct a training dataset is cost-intensive due to the need for human labor. In this study, we focused on the continuity of activities and propose a segment-based unsupervised deep learning method for HAR using accelerometer sensor data. We define segment data as sensor data measured in one continuous recording that includes only a single activity. To collect the segment data, we propose a measurement method in which users only need to annotate the starting, changing, and ending points of their activity rather than the activity label. We developed a new segment-based SimCLR, which uses pairs of segment data, and propose a method that combines segment-based SimCLR with SDFD. We investigated the effectiveness of feature representations obtained by training a linear layer on top of fixed weights obtained by unsupervised learning methods. As a result, we demonstrated that the proposed combined method acquires generalized feature representations. The results of transfer learning on different datasets suggest that the proposed method is robust to the sampling frequency of the sensor data, although it requires more training data than other methods.
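Segment-based SimCLR optimises a contrastive objective over pairs of segments. Below is a numpy sketch of the standard NT-Xent loss that SimCLR builds on; the embedding dimension, batch size, and temperature are illustrative, and the segment-pairing strategy of the paper is not reproduced.

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent contrastive loss over N positive pairs (z1[i], z2[i]).

    Embeddings are L2-normalized so similarities are cosine similarities;
    each sample's positive is the other view of the same segment, and all
    remaining 2N-2 samples act as negatives.
    """
    z = np.concatenate([z1, z2])                        # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)                      # exclude self-pairs
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim[np.arange(2 * n), pos] - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()

rng = np.random.default_rng(0)
z1 = rng.standard_normal((8, 32))                       # view 1 embeddings
z2 = rng.standard_normal((8, 32))                       # view 2 embeddings
loss = nt_xent(z1, z2)
```

Minimizing this loss pulls the two views of each segment together while pushing apart embeddings of different segments.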
Affiliation(s)
- Koki Takenaka, Kei Kondo, Tatsuhito Hasegawa
- Graduate School of Engineering, University of Fukui, Fukui 910-8507, Japan
8. Qiu S, Fan T, Jiang J, Wang Z, Wang Y, Xu J, Sun T, Jiang N. A novel two-level interactive action recognition model based on inertial data fusion. Inf Sci (N Y) 2023. DOI: 10.1016/j.ins.2023.03.058.
9. Agac S, Durmaz Incel O. On the Use of a Convolutional Block Attention Module in Deep Learning-Based Human Activity Recognition with Motion Sensors. Diagnostics (Basel) 2023;13:1861. PMID: 37296713; DOI: 10.3390/diagnostics13111861.
Abstract
Sensor-based human activity recognition with wearable devices has captured the attention of researchers in the last decade. The possibility of collecting large sets of data from various sensors on different body parts, automatic feature extraction, and the aim of recognizing more complex activities have led to a rapid increase in the use of deep learning models in the field. More recently, using attention-based models for dynamically fine-tuning the model features and, in turn, improving the model performance has been investigated. However, the impact of using the channel, spatial, or combined attention methods of the convolutional block attention module (CBAM) on the high-performing DeepConvLSTM model, a hybrid model proposed for sensor-based human activity recognition, has yet to be studied. Additionally, since wearables have limited resources, analysing the parameter requirements of attention modules can serve as an indicator for optimizing resource consumption. In this study, we explored the performance of CBAM on the DeepConvLSTM architecture both in terms of recognition performance and the number of additional parameters required by the attention modules. In this direction, the effects of channel and spatial attention, individually and in combination, were examined. To evaluate the model performance, the PAMAP2 dataset containing 12 daily activities and the Opportunity dataset with its 18 micro activities were utilized. The results showed that the performance for Opportunity increased from 0.74 to 0.77 in the macro f1-score owing to spatial attention, while for PAMAP2, the performance increased from 0.95 to 0.96 owing to the channel attention applied to DeepConvLSTM, with a negligible number of additional parameters. Moreover, when the activity-based results were analysed, it was observed that the attention mechanism increased the performance of the activities with the worst performance in the baseline model without attention. We present a comparison with related studies that use the same datasets and show that we could achieve higher scores on both datasets by combining CBAM and DeepConvLSTM.
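The channel-attention branch of CBAM referenced in this study can be sketched as follows: average- and max-pool the feature map over time, pass both summaries through a shared two-layer MLP, add them, and gate the channels with a sigmoid. This is a numpy toy with random placeholder weights; the reduction ratio and shapes are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, W1, W2):
    """CBAM-style channel attention on a (C, T) feature map.

    Both pooled descriptors go through the same bottleneck MLP
    (C -> C/r -> C with ReLU); the summed result is squashed to (0, 1)
    and used to re-weight the channels.
    """
    avg = feat.mean(axis=1)                          # (C,) average pooling over time
    mx = feat.max(axis=1)                            # (C,) max pooling over time
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)     # shared MLP with ReLU
    gate = sigmoid(mlp(avg) + mlp(mx))               # (C,) channel weights
    return feat * gate[:, None]

rng = np.random.default_rng(0)
C, T, r = 16, 128, 4                                 # channels, time steps, reduction
feat = rng.standard_normal((C, T))
W1 = rng.standard_normal((C // r, C))
W2 = rng.standard_normal((C, C // r))
out = channel_attention(feat, W1, W2)
```

Because the gate lies in (0, 1), the module can only attenuate channels, never amplify them, which keeps the added parameter and compute cost small, matching the "negligible number of additional parameters" observation above.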
Affiliation(s)
- Sumeyye Agac, Ozlem Durmaz Incel
- Department of Computer Engineering, Bogazici University, Istanbul 34342, Turkey
10. Sopidis G, Haslgrübler M, Ferscha A. Counting Activities Using Weakly Labeled Raw Acceleration Data: A Variable-Length Sequence Approach with Deep Learning to Maintain Event Duration Flexibility. Sensors (Basel) 2023;23:5057. PMID: 37299784; DOI: 10.3390/s23115057.
Abstract
This paper presents a novel approach for counting hand-performed activities using deep learning and inertial measurement units (IMUs). The particular challenge in this task is finding the correct window size for capturing activities with different durations. Traditionally, fixed window sizes have been used, which occasionally result in incorrectly represented activities. To address this limitation, we propose segmenting the time series data into variable-length sequences, using ragged tensors to store and process the data. Additionally, our approach utilizes weakly labeled data to simplify the annotation process and reduce the time needed to prepare annotated data for machine learning algorithms; thus, the model receives only partial information about the performed activity. We therefore propose an LSTM-based architecture that takes into account both the ragged tensors and the weak labels. To the best of our knowledge, no prior study has attempted counting with variable-size IMU acceleration data at relatively low computational requirements, using the number of completed repetitions of hand-performed activities as the label. Hence, we present the data segmentation method we employed and the model architecture we implemented to show the effectiveness of our approach. Our results are evaluated using the Skoda public dataset for human activity recognition (HAR) and demonstrate a repetition error of ±1 even in the most challenging cases. The findings of this study have applications in various fields, including healthcare, sports and fitness, human-computer interaction, robotics, and the manufacturing industry.
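A common stand-in for ragged tensors when batching variable-length sequences is zero-padding plus a validity mask, sketched below in numpy. TensorFlow's `tf.RaggedTensor`, which the paper uses, avoids the padding entirely; the sequence lengths and feature dimension here are arbitrary.

```python
import numpy as np

def pad_and_mask(sequences):
    """Batch variable-length activity sequences.

    Zero-pads every (T_i, d) sequence to the longest T and returns a
    boolean mask marking the valid (non-padded) time steps, so downstream
    models can ignore the padding.
    """
    T = max(len(s) for s in sequences)
    d = sequences[0].shape[1]
    batch = np.zeros((len(sequences), T, d))
    mask = np.zeros((len(sequences), T), dtype=bool)
    for i, s in enumerate(sequences):
        batch[i, :len(s)] = s
        mask[i, :len(s)] = True
    return batch, mask

rng = np.random.default_rng(0)
seqs = [rng.standard_normal((n, 3)) for n in (50, 80, 65)]   # 3-axis IMU segments
batch, mask = pad_and_mask(seqs)
```

The mask is what an LSTM (or an attention layer) consumes so that padded steps do not contribute to the hidden state or the loss; ragged tensors make this bookkeeping implicit.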
Affiliation(s)
- Alois Ferscha
- Institute of Pervasive Computing, Johannes Kepler University, Altenberger Straße 69, 4040 Linz, Austria
11. Javeed M, Mudawi NA, Alabduallah BI, Jalal A, Kim W. A Multimodal IoT-Based Locomotion Classification System Using Features Engineering and Recursive Neural Network. Sensors (Basel) 2023;23:4716. PMID: 37430630; DOI: 10.3390/s23104716.
Abstract
Locomotion prediction for human welfare has gained tremendous interest in the past few years. Multimodal locomotion prediction covers the small activities of daily living and is an efficient approach to providing support for healthcare, but the complexity of motion signals, along with video processing, makes it challenging for researchers to achieve a good accuracy rate. Multimodal internet of things (IoT)-based locomotion classification has helped in solving these challenges. In this paper, we propose a novel multimodal IoT-based locomotion classification technique using three benchmark datasets. These datasets contain at least three types of data, such as data from physical motion, ambient, and vision-based sensors. The raw data have been filtered through different techniques for each sensor type. Then, the ambient and physical motion-based sensor data have been windowed, and a skeleton model has been retrieved from the vision-based data. Further, the features have been extracted and optimized using state-of-the-art methodologies. Lastly, the experiments performed verified that the proposed locomotion classification system is superior to other conventional approaches, particularly when considering multimodal data. The novel multimodal IoT-based locomotion classification system achieved accuracy rates of 87.67% and 86.71% over the HWU-USP and Opportunity++ datasets, respectively. The mean accuracy rate of 87.0% is higher than that of the traditional methods proposed in the literature.
Affiliation(s)
- Madiha Javeed
- Department of Computer Science, Air University, Islamabad 44000, Pakistan
- Naif Al Mudawi
- Department of Computer Science, College of Computer Science and Information System, Najran University, Najran 55461, Saudi Arabia
- Bayan Ibrahimm Alabduallah
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Ahmad Jalal
- Department of Computer Science, Air University, Islamabad 44000, Pakistan
- Wooseong Kim
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
12. Jameer S, Syed H. Deep SE-BiLSTM with IFPOA Fine-Tuning for Human Activity Recognition Using Mobile and Wearable Sensors. Sensors (Basel) 2023;23:4319. PMID: 37177523; PMCID: PMC10181789; DOI: 10.3390/s23094319.
Abstract
Pervasive computing, human-computer interaction, human behavior analysis, and human activity recognition (HAR) have grown significantly as fields. Deep learning (DL)-based techniques have recently been effectively used to predict various human actions using time series data from wearable sensors and mobile devices. The management of time series data remains difficult for DL-based techniques, despite their excellent performance in activity detection; time series data still poses several problems, such as heavily biased data and difficult feature extraction. For HAR, an ensemble of Deep SqueezeNet (SE) and bidirectional long short-term memory (BiLSTM) with an improved flower pollination optimization algorithm (IFPOA) is designed in this research to construct a reliable classification model utilizing wearable sensor data. The significant features are extracted automatically from the raw sensor data by the multi-branch SE-BiLSTM. The model can learn both short-term dependencies and long-term features in sequential data due to SqueezeNet and BiLSTM. The different temporal local dependencies are captured effectively by the proposed model, enhancing the feature extraction process. The hyperparameters of the BiLSTM network are optimized by the IFPOA. The model performance is analyzed using three benchmark datasets: MHEALTH, KU-HAR, and PAMAP2. The proposed model achieved accuracies of 99.98%, 99.76%, and 99.54% on the MHEALTH, KU-HAR, and PAMAP2 datasets, respectively, and delivers competitive results compared to state-of-the-art techniques.
Affiliation(s)
- Shaik Jameer, Hussain Syed
- School of Computer Science and Engineering, VIT AP University, Amaravati 522237, India
13. Kumar P, Suresh S. Deep-HAR: an ensemble deep learning model for recognizing the simple, complex, and heterogeneous human activities. Multimed Tools Appl 2023;82:1-28. PMID: 36851913; PMCID: PMC9946874; DOI: 10.1007/s11042-023-14492-0.
Abstract
The recognition of human activities has become a prominent research problem with widely covered application areas in surveillance, wellness management, healthcare, and many more. In real life, activity recognition is a challenging issue because the activities human beings perform are often not only simple but also complex and heterogeneous in nature. Most existing approaches address the problem of recognizing only simple, straightforward activities (e.g. walking, running, standing, sitting, etc.). Recognizing complex and heterogeneous human activities is a challenging research problem that only a limited number of existing works address. In this paper, we propose a novel Deep-HAR model that ensembles Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for recognizing simple, complex, and heterogeneous activities. Here, the CNNs are used for extracting features, whereas the RNNs are used for finding useful patterns in time-series sequential data. The activity recognition performance of the proposed model was evaluated using three publicly available datasets, namely WISDM, PAMAP2, and KU-HAR. Through extensive experiments, we demonstrate that the proposed model performs well in recognizing all types of activities, achieving accuracies of 99.98%, 99.64%, and 99.98% for simple, complex, and heterogeneous activities, respectively.
Affiliation(s)
- Prabhat Kumar, S Suresh
- Department of Computer Science, Institute of Science, Banaras Hindu University, Varanasi 221005, India
14. Bi-STAN: bilinear spatial-temporal attention network for wearable human activity recognition. Int J Mach Learn Cybern 2023. DOI: 10.1007/s13042-023-01781-1.
15. Ige AO, Mohd Noor MH. A lightweight deep learning with feature weighting for activity recognition. Comput Intell 2022. DOI: 10.1111/coin.12565.
16. Al-qaness MAA, Helmi AM, Dahou A, Elaziz MA. The Applications of Metaheuristics for Human Activity Recognition and Fall Detection Using Wearable Sensors: A Comprehensive Analysis. Biosensors (Basel) 2022;12:821. PMID: 36290958; PMCID: PMC9599938; DOI: 10.3390/bios12100821.
Abstract
In this paper, we study the applications of metaheuristic (MH) optimization algorithms in human activity recognition (HAR) and fall detection based on sensor data. MH algorithms have been widely utilized in complex engineering and optimization problems, including feature selection (FS). In this regard, this paper used nine MH algorithms as FS methods to boost the classification accuracy of HAR and fall detection applications. The applied MH algorithms were the Aquila optimizer (AO), arithmetic optimization algorithm (AOA), marine predators algorithm (MPA), artificial bee colony (ABC) algorithm, genetic algorithm (GA), slime mold algorithm (SMA), grey wolf optimizer (GWO), whale optimization algorithm (WOA), and particle swarm optimization (PSO) algorithm. First, we applied efficient preprocessing and segmentation methods to reveal the motion patterns and reduce the time complexity. Second, we developed a lightweight feature extraction technique using advanced deep learning approaches. The developed model, ResRNN, was composed of several building blocks from deep learning networks, including convolutional neural networks (CNNs), residual networks, and bidirectional recurrent neural networks (BiRNNs). Third, we applied the mentioned MH algorithms to select the optimal features and boost classification accuracy. Finally, support vector machine and random forest classifiers were employed to classify each activity in the multi-classification case and to detect fall and non-fall actions in the binary classification case. We used seven different and complex datasets for the multi-classification case: the PAMAP2, SisFall, UniMiB SHAR, OPPORTUNITY, WISDM, UCI-HAR, and KU-HAR datasets. In addition, we used the SisFall dataset for binary classification (fall detection). We compared the results of the nine MH optimization methods using different performance indicators, and concluded that MH optimization algorithms show promising performance in HAR and fall detection applications.
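The wrapper-style feature-selection loop this abstract describes (a metaheuristic proposes a binary feature mask, a classifier-style score evaluates it) can be sketched with a minimal genetic algorithm. This is an illustrative sketch only: the class-mean-separation fitness, population size, and mutation rate below are assumptions standing in for the paper's nine MH variants and its SVM/Random Forest evaluation.

```python
import random

def fitness(mask, X, y):
    """Score a binary feature mask: separation between the two class
    means on the selected features, with a small sparsity penalty.
    (A stand-in for evaluating a real classifier on the subset.)"""
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    classes = sorted(set(y))  # assumes exactly two classes for this toy
    means = {c: [sum(X[r][i] for r in range(len(X)) if y[r] == c) /
                 sum(1 for r in range(len(X)) if y[r] == c) for i in idx]
             for c in classes}
    sep = sum(abs(means[classes[0]][k] - means[classes[1]][k])
              for k in range(len(idx)))
    return sep - 0.01 * len(idx)

def ga_select(X, y, n_feat, pop=20, gens=30, p_mut=0.1, seed=0):
    """Minimal GA: keep the top half each generation, refill by
    one-point crossover plus bit-flip mutation."""
    rng = random.Random(seed)
    popn = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(popn, key=lambda m: fitness(m, X, y), reverse=True)
        elite = scored[: pop // 2]
        children = []
        while len(elite) + len(children) < pop:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_feat)
            child = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < p_mut else g
                             for g in child])
        popn = elite + children
    return max(popn, key=lambda m: fitness(m, X, y))

# toy data: feature 0 separates the classes, features 1-3 are noise
rng = random.Random(1)
X = [[c * 5 + rng.gauss(0, 0.3)] + [rng.gauss(0, 1) for _ in range(3)]
     for c in (0, 1) for _ in range(20)]
y = [c for c in (0, 1) for _ in range(20)]
mask = ga_select(X, y, n_feat=4)
print(mask[0])  # the informative feature should be kept
```

Any of the other listed metaheuristics (PSO, GWO, WOA, ...) would slot in by replacing only the population-update rule; the mask encoding and fitness evaluation stay the same.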
Affiliation(s)
- Mohammed A. A. Al-qaness: College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321004, China
- Ahmed M. Helmi: College of Engineering and Information Technology, Buraydah Private Colleges, Buraydah 51418, Saudi Arabia; Computer and Systems Engineering Department, Faculty of Engineering, Zagazig University, Zagazig 44519, Egypt
- Abdelghani Dahou: Mathematics and Computer Science Department, University of Ahmed DRAIA, Adrar 01000, Algeria; LDDI Laboratory, Faculty of Science and Technology, University of Ahmed DRAIA, Adrar 01000, Algeria
- Mohamed Abd Elaziz: Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt; Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman 346, United Arab Emirates; Faculty of Computer Science and Engineering, Galala University, Suez 435611, Egypt; Department of Electrical and Computer Engineering, Lebanese American University, Byblos 13-5053, Lebanon
17
Zhou B, Wang C, Huan Z, Li Z, Chen Y, Gao G, Li H, Dong C, Liang J. A Novel Segmentation Scheme with Multi-Probability Threshold for Human Activity Recognition Using Wearable Sensors. Sensors (Basel) 2022; 22:7446. [PMID: 36236542] [PMCID: PMC9571277] [DOI: 10.3390/s22197446]
Abstract
In recent years, much research has been conducted on time-series-based human activity recognition (HAR) using wearable sensors. Most existing work on HAR relies on manual labeling. However, complete time-series signals not only contain different types of activities but also include many transitional and atypical ones, so effectively filtering out these activities has become a significant problem. In this paper, a novel machine-learning-based segmentation scheme with a multi-probability threshold is proposed for HAR. Threshold segmentation (TS) and slope-area (SA) approaches are employed according to the characteristics of the small fluctuations of static activity signals and the typical peaks and troughs of periodic-like ones. In addition, a multi-label weighted probability (MLWP) model is proposed to estimate the probability of each activity. The HAR error can be significantly decreased, as the proposed model solves the problem that a fixed window usually contains multiple kinds of activities, while unknown activities can be accurately rejected to reduce their impact. Compared with other existing schemes, computer simulation reveals that the proposed model maintains high performance on the UCI and PAMAP2 datasets, with average HAR accuracies reaching 97.71% and 95.93%, respectively.
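The threshold-segmentation (TS) idea above, flagging low-fluctuation windows as static activities and high-fluctuation windows as periodic-like ones, can be illustrated with a toy variance threshold. The window length and threshold value here are illustrative assumptions, not the paper's multi-probability settings.

```python
import numpy as np

def threshold_segment(signal, win=50, var_thresh=0.05):
    """Label each fixed-size window of a 1-D accelerometer signal as
    'static' (small fluctuation, e.g. sitting) or 'dynamic' (pronounced
    peaks and troughs, e.g. walking), mimicking the TS approach."""
    labels = []
    for start in range(0, len(signal) - win + 1, win):
        w = signal[start:start + win]
        labels.append("static" if np.var(w) < var_thresh else "dynamic")
    return labels

# synthetic trace: 100 samples of near-constant gravity, then 100 of a
# walking-like oscillation
t = np.linspace(0, 4, 200)
sitting = 9.8 + 0.01 * np.random.default_rng(0).standard_normal(100)
walking = 9.8 + 2.0 * np.sin(2 * np.pi * 2 * t[:100])
labels = threshold_segment(np.concatenate([sitting, walking]))
print(labels)  # ['static', 'static', 'dynamic', 'dynamic']
```

The paper's MLWP model goes further by assigning each window a probability per activity rather than a single hard label, which is what allows mixed or unknown windows to be rejected.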
Affiliation(s)
- Bangwen Zhou: School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou 213000, China
- Cheng Wang: School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou 213000, China
- Zhan Huan: School of Microelectronics and Control Engineering, Changzhou University, Changzhou 213000, China
- Zhixin Li: School of Microelectronics and Control Engineering, Changzhou University, Changzhou 213000, China
- Ying Chen: School of Microelectronics and Control Engineering, Changzhou University, Changzhou 213000, China
- Ge Gao: School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou 213000, China
- Huahao Li: School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou 213000, China
- Chenhui Dong: School of Microelectronics and Control Engineering, Changzhou University, Changzhou 213000, China
- Jiuzhen Liang: School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou 213000, China
18
Zan H, Zhao G. Human Action Recognition Research Based on Fusion TS-CNN and LSTM Networks. Arabian Journal for Science and Engineering 2022. [DOI: 10.1007/s13369-022-07236-z]
19
ConvNet-based performers attention and supervised contrastive learning for activity recognition. Applied Intelligence 2022. [DOI: 10.1007/s10489-022-03937-y]
Abstract
Human activity recognition based on generated sensor data plays a major role in a large number of applications such as healthcare monitoring and surveillance systems. Yet accurately recognizing human activities remains a challenging and active research area, owing to people's tendency to perform daily activities in varied, multitasking ways. Existing approaches based on recurrent architectures for human activity recognition have some issues, such as the inability to process data in parallel, higher memory requirements, and high computational cost, even though they achieve reasonable results. Convolutional neural networks process data in parallel, but they break the ordering of the input data, which is significant for building an effective model for human activity recognition. To overcome these challenges, this study proposes causal convolution based on performer attention and supervised contrastive learning to entirely forego recurrent architectures, efficiently maintain the ordering of human daily activities, and focus on the important time steps of the sensor data. Supervised contrastive learning is integrated to learn a discriminative representation of human activities and enhance predictive performance. The proposed network is extensively evaluated on multiple datasets, including wearable sensor data and smart-home environment data. Experiments on three wearable sensor datasets and five public smart-home datasets reveal that the proposed network achieves better results and reduces training time compared with existing state-of-the-art methods and basic temporal models.
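The causal-convolution building block that lets such a network process a sequence in parallel while still preserving its time ordering can be sketched in a few lines: the input is padded only on the left, so each output depends exclusively on current and past samples. The two-tap averaging kernel is an arbitrary illustrative choice.

```python
import numpy as np

def causal_conv1d(x, kernel):
    """1-D causal convolution: pad on the left only, so y[t] depends
    solely on x[t], x[t-1], ... (no future leakage), unlike a
    'same'-padded convolution that peeks ahead."""
    k = len(kernel)
    padded = np.concatenate([np.zeros(k - 1), x])
    return np.array([padded[t:t + k] @ kernel[::-1] for t in range(len(x))])

x = np.array([1.0, 2.0, 3.0, 4.0])
y = causal_conv1d(x, np.array([0.5, 0.5]))  # 2-tap causal moving average
print(y)  # [0.5 1.5 2.5 3.5]
```

Every output position is computed independently of the others, which is exactly what makes this layer parallelizable where an LSTM step is not.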
20

21
Inception-LSTM Human Motion Recognition with Channel Attention Mechanism. Computational and Mathematical Methods in Medicine 2022; 2022:9173504. [PMID: 35734775] [PMCID: PMC9208947] [DOI: 10.1155/2022/9173504]
Abstract
An Inception-LSTM human motion recognition algorithm with an improved channel attention mechanism is proposed for inertial sensor signals, addressing the high cost, many blind spots, and susceptibility to environmental effects of traditional video-based human motion recognition algorithms. The proposed algorithm takes the inertial sensor signal as input. It first extracts the spatial features of the sensor signal at multiple scales into feature maps using the Inception parallel convolution structure, then uses an improved ECA (Efficient Channel Attention) module to extract the critical details of these feature maps, and finally uses an LSTM network to extract the temporal features of the inertial sensor signals, achieving classification and recognition of human motion postures. The experimental results demonstrate that the model reaches 95.04% recognition accuracy on the public PAMAP2 dataset and 98.81% on a self-built dataset, indicating a superior recognition effect. In addition, a visual analysis of the channel attention weights shows that the proposed model is interpretable for human motion recognition and consistent with everyday intuition.
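The ECA module mentioned above replaces squeeze-and-excitation's fully connected layers with a cheap 1-D convolution across channel descriptors. A minimal numpy sketch follows; the fixed 3-tap kernel is an assumption for brevity, whereas ECA normally sizes its kernel adaptively from the channel count.

```python
import numpy as np

def eca(feature_map, kernel=np.array([0.25, 0.5, 0.25])):
    """Efficient Channel Attention sketch for a (C, T) feature map:
    1) squeeze each channel to a scalar by global average pooling,
    2) let nearby channels interact via a small 1-D convolution
       (instead of SE's fully connected layers),
    3) gate each channel with a sigmoid of the result."""
    desc = feature_map.mean(axis=1)            # (C,) channel descriptors
    pad = len(kernel) // 2
    padded = np.concatenate([np.zeros(pad), desc, np.zeros(pad)])
    mixed = np.array([padded[c:c + len(kernel)] @ kernel
                      for c in range(len(desc))])
    weights = 1.0 / (1.0 + np.exp(-mixed))     # sigmoid gate per channel
    return feature_map * weights[:, None], weights

fmap = np.vstack([np.full(10, 4.0),    # channel with strong activation
                  np.full(10, -4.0)])  # channel with weak activation
out, w = eca(fmap)
print(w[0] > w[1])  # True: the informative channel is up-weighted
```

Because the gate is a single k-tap convolution over C scalars, the module adds almost no parameters or compute relative to the backbone, which is why it suits the LSTM pipeline described in the abstract.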
22
Issa ME, Helmi AM, Al-Qaness MAA, Dahou A, Abd Elaziz M, Damaševičius R. Human Activity Recognition Based on Embedded Sensor Data Fusion for the Internet of Healthcare Things. Healthcare (Basel) 2022; 10:1084. [PMID: 35742136] [PMCID: PMC9222808] [DOI: 10.3390/healthcare10061084]
Abstract
Nowadays, the emerging information technologies in smart handheld devices are motivating the research community to make use of the embedded sensors in such devices for healthcare purposes. In particular, inertial measurement sensors such as accelerometers and gyroscopes embedded in smartphones and smartwatches can provide sensory data fusion for human activities and gestures. Thus, the concepts of the Internet of Healthcare Things (IoHT) paradigm can be applied to handle such sensory data and maximize the benefits of collecting and analyzing them. The application areas include, but are not restricted to, the rehabilitation of elderly people, fall detection, smoking control, sporting exercises, and monitoring of daily life activities. In this work, a public dataset collected using two smartphones (in pocket and wrist positions) is considered for IoHT applications. Three-dimensional inertial signals of thirteen timestamped human activities, such as Walking, Walking Upstairs, Walking Downstairs, Writing, Smoking, and others, are registered. An efficient human activity recognition (HAR) model is presented based on efficient handcrafted features and a Random Forest classifier. Simulation results show the superiority of the applied model over others introduced in the literature for the same dataset. Moreover, different approaches to evaluating such models are considered, as well as implementation issues. The accuracy of the current model reaches 98.7% on average. The model's performance is also verified on the WISDM v1 dataset.
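A handcrafted-features pipeline of this kind can be illustrated with a toy per-window extractor. The exact feature set used in the paper is richer; the per-axis means, standard deviations, and mean magnitude below are a representative assumption of the time-domain statistics typically fed to a Random Forest.

```python
import numpy as np

def window_features(window):
    """Handcrafted time-domain features for one fixed-length window of
    a tri-axial accelerometer (shape (N, 3)): per-axis mean and standard
    deviation plus the mean magnitude of the acceleration vector. Each
    window becomes one fixed-size vector for the classifier."""
    means = window.mean(axis=0)                  # 3 values
    stds = window.std(axis=0)                    # 3 values
    mag = np.linalg.norm(window, axis=1).mean()  # 1 value
    return np.concatenate([means, stds, [mag]])

# toy window: sensor noise around gravity on the z axis
rng = np.random.default_rng(0)
win = rng.standard_normal((128, 3)) + np.array([0.0, 0.0, 9.8])
feats = window_features(win)
print(feats.shape)  # (7,)
```

Stacking one such vector per window (with its activity label) yields the tabular training matrix that tree-based classifiers like Random Forest consume directly, with no sequence modeling required.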
Affiliation(s)
- Mohamed E. Issa: Computer and Systems Engineering Department, Faculty of Engineering, Zagazig University, Zagazig 44519, Egypt
- Ahmed M. Helmi: Computer and Systems Engineering Department, Faculty of Engineering, Zagazig University, Zagazig 44519, Egypt; College of Engineering and Information Technology, Buraydah Private Colleges, Buraydah 51418, Saudi Arabia
- Mohammed A. A. Al-Qaness (corresponding author): State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; Faculty of Engineering, Sana’a University, Sana’a 12544, Yemen
- Abdelghani Dahou: LDDI Laboratory, Faculty of Science and Technology, University of Ahmed DRAIA, Adrar 01000, Algeria
- Mohamed Abd Elaziz: Faculty of Computer Science and Engineering, Galala University, Suez 435611, Egypt; Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman 346, United Arab Emirates; Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt
- Robertas Damaševičius (corresponding author): Department of Applied Informatics, Vytautas Magnus University, 44404 Kaunas, Lithuania
23
Ergonomics Risk Assessment for Manual Material Handling of Warehouse Activities Involving High Shelf and Low Shelf Binning Processes: Application of Marker-Based Motion Capture. Sustainability 2022. [DOI: 10.3390/su14105767]
Abstract
Lower back pain is a musculoskeletal disorder that is commonly reported among warehouse workers due to the nature of the work environment and manual handling activities. The objective of this study was to assess the ergonomic risks among warehouse workers carrying out high shelf (HS) and low shelf (LS) binning processes. A questionnaire was used to determine the prevalence of musculoskeletal symptoms, while a marker-based motion capture (MoCap) system worksheet was used to record the participants' motion and determine the action risk level. A total of 33% of the participants reported lower back pain in the past seven days, based on the Cornell Musculoskeletal Discomfort Questionnaire (CMDQ) results. Analysis of the body velocities showed that the HS binning process had four major velocity peaks, defined as the initial, lowering, lifting, and final phases, while the LS binning process had two major peaks, defined as the crouching and rising phases. There were significant differences between the workers' mean velocities for the HS binning process, indicating that the workers have different movement patterns with varying velocities.
24

25
Mathematical Formula Image Screening Based on Feature Correlation Enhancement. Electronics 2022. [DOI: 10.3390/electronics11050799]
Abstract
Scientific and technical documents and web pages contain mathematical formula images alongside other images; formula images may contain only mathematical formulas, or formulas interspersed with other elements such as text and coordinate diagrams. To screen and collect images containing mathematical formulas for study or further research, a model for screening mathematical formula images based on feature correlation enhancement is proposed. First, the Feature Correlation Enhancement (FCE) module was designed to strengthen the correlation of mathematical formula features and weaken other features. Then, the strip multi-scale pooling (SMP) module was designed to handle non-uniform image sizes while enhancing the focus on horizontal formula features. Finally, the loss function was improved to balance the dataset. The model achieved an accuracy of 89.50% in experiments, outperforming existing models. Screening out images that contain mathematical formulas with this model helps speed up the construction of mathematical formula image databases.
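Strip pooling of the kind the SMP module uses can be sketched as row-band averaging: each band is pooled across the full image width, emphasizing the left-to-right layout of formulas, and the band count fixes the output size so images of different dimensions map to equal-length vectors. The band count and single-channel input below are illustrative assumptions, not the paper's SMP configuration.

```python
import numpy as np

def strip_pool(feature_map, out_rows=4):
    """Horizontal strip pooling sketch: split the rows of an (H, W)
    feature map into out_rows bands and average each band over the full
    width. Output length is fixed at out_rows regardless of H and W,
    which sidesteps the non-uniform image size problem."""
    h, _ = feature_map.shape
    bands = np.array_split(np.arange(h), out_rows)  # handles H % out_rows != 0
    return np.array([feature_map[b].mean() for b in bands])

small = strip_pool(np.ones((8, 20)))
large = strip_pool(np.ones((17, 90)))
print(small.shape, large.shape)  # (4,) (4,)
```

Because every band averages over the entire width, a long horizontal formula contributes to its band no matter where it sits on the line, which is the property that makes row-wise strips preferable to square pooling windows here.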
26
Zhang S, Li Y, Zhang S, Shahabi F, Xia S, Deng Y, Alshurafa N. Deep Learning in Human Activity Recognition with Wearable Sensors: A Review on Advances. Sensors (Basel) 2022; 22:1476. [PMID: 35214377] [PMCID: PMC8879042] [DOI: 10.3390/s22041476]
Abstract
Mobile and wearable devices have enabled numerous applications, including activity tracking, wellness monitoring, and human-computer interaction, that measure and improve our daily lives. Many of these applications are made possible by leveraging the rich collection of low-power sensors found in many mobile and wearable devices to perform human activity recognition (HAR). Recently, deep learning has greatly pushed the boundaries of HAR on mobile and wearable devices. This paper systematically categorizes and summarizes existing work that introduces deep learning methods for wearables-based HAR and provides a comprehensive analysis of the current advancements, developing trends, and major challenges. We also present cutting-edge frontiers and future directions for deep learning-based HAR.
Affiliation(s)
- Shibo Zhang: Department of Computer Science, McCormick School of Engineering, Northwestern University, Mudd Hall, 2233 Tech Drive, Evanston, IL 60208, USA; Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 680 N. Lakeshore Dr., Suite 1400, Chicago, IL 60611, USA
- Yaxuan Li: Electrical and Computer Engineering Department, McGill University, McConnell Engineering Building, 3480 Rue University, Montréal, QC H3A 0E9, Canada
- Shen Zhang: School of Electrical and Computer Engineering, Georgia Institute of Technology, 777 Atlantic Drive, Atlanta, GA 30332, USA
- Farzad Shahabi: Department of Computer Science, McCormick School of Engineering, Northwestern University, Mudd Hall, 2233 Tech Drive, Evanston, IL 60208, USA; Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 680 N. Lakeshore Dr., Suite 1400, Chicago, IL 60611, USA
- Stephen Xia: Department of Electrical Engineering, Columbia University, Mudd 1310, 500 W. 120th Street, New York, NY 10027, USA
- Yu Deng: Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern University, 625 N Michigan Ave, Chicago, IL 60611, USA
- Nabil Alshurafa: Department of Computer Science, McCormick School of Engineering, Northwestern University, Mudd Hall, 2233 Tech Drive, Evanston, IL 60208, USA; Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 680 N. Lakeshore Dr., Suite 1400, Chicago, IL 60611, USA
27
Li Y, Wang L. Human Activity Recognition Based on Residual Network and BiLSTM. Sensors (Basel) 2022; 22:635. [PMID: 35062604] [PMCID: PMC8778132] [DOI: 10.3390/s22020635]
Abstract
Due to the wide application of human activity recognition (HAR) in sports and health, a large number of HAR models based on deep learning have been proposed. However, many existing models ignore the effective extraction of the spatial and temporal features of human activity data. This paper proposes a deep learning model based on residual blocks and bi-directional LSTM (BiLSTM). The model first automatically extracts spatial features from the multidimensional signals of MEMS inertial sensors using the residual block, and then obtains the forward and backward dependencies of the feature sequence using BiLSTM. Finally, the obtained features are fed into a softmax layer to complete the human activity recognition. The optimal parameters of the model are obtained experimentally. A homemade dataset containing six common human activities (sitting, standing, walking, running, going upstairs, and going downstairs) was developed. The proposed model is evaluated on this dataset and on two public datasets, WISDM and PAMAP2. The experimental results show that it achieves accuracies of 96.95%, 97.32%, and 97.15% on our dataset, WISDM, and PAMAP2, respectively. Compared with some existing models, the proposed model achieves better performance with fewer parameters.
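The residual-block half of such a model (the BiLSTM half is omitted here) can be sketched as two same-padded convolutions with an identity shortcut. The single-channel signal and hand-picked kernels below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def conv1d_same(x, kernel):
    """'Same'-padded 1-D convolution, so the output length matches the
    input and the residual shortcut can be added elementwise."""
    pad = len(kernel) // 2
    padded = np.concatenate([np.zeros(pad), x, np.zeros(pad)])
    return np.array([padded[t:t + len(kernel)] @ kernel[::-1]
                     for t in range(len(x))])

def residual_block(x, k1, k2):
    """Residual block sketch: out = F(x) + x, where F is two
    convolutions with a ReLU in between. The identity shortcut eases
    gradient flow in deep spatial feature extractors."""
    h = np.maximum(conv1d_same(x, k1), 0.0)  # ReLU
    return conv1d_same(h, k2) + x            # identity shortcut

x = np.array([1.0, -1.0, 2.0, 0.5])
# with identity kernels [0, 1, 0], F(x) reduces to relu(x)
out = residual_block(x, np.array([0.0, 1.0, 0.0]), np.array([0.0, 1.0, 0.0]))
print(out)  # relu(x) + x
```

In the full model, the sequence of feature vectors produced by blocks like this would be handed to a BiLSTM, whose forward and backward passes are concatenated per time step before the softmax layer.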
Affiliation(s)
- Yong Li: School of Biomedical Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Luping Wang (corresponding author): School of Electronics and Communication Engineering, Sun Yat-sen University, Guangzhou 510006, China
28
Tang Y, Zhang L, Teng Q, Min F, Song A. Triple Cross-Domain Attention on Human Activity Recognition Using Wearable Sensors. IEEE Transactions on Emerging Topics in Computational Intelligence 2022. [DOI: 10.1109/tetci.2021.3136642]
29
Sensor-Based Human Activity Recognition Using Adaptive Class Hierarchy. Sensors (Basel) 2021; 21:7743. [PMID: 34833819] [PMCID: PMC8623838] [DOI: 10.3390/s21227743]
Abstract
In sensor-based human activity recognition, many methods based on convolutional neural networks (CNNs) have been proposed. In a typical CNN-based activity recognition model, each class is treated independently of the others. However, actual activity classes often have hierarchical relationships, and it is important to consider an activity recognition model that uses the hierarchical relationships among classes to improve recognition performance. In image recognition, branch CNNs (B-CNNs) have been proposed for classification using class hierarchies. B-CNNs can easily perform classification using hand-crafted class hierarchies, but it is difficult to manually design an appropriate class hierarchy when the number of classes is large or there is little prior knowledge. Therefore, we propose a class-hierarchy-adaptive B-CNN, which extends the B-CNN with a method for automatically constructing class hierarchies. Our method constructs the class hierarchy from the training data automatically, so the B-CNN can be trained effectively without prior knowledge. We evaluated our method on several benchmark datasets for activity recognition. Our method outperformed standard CNN models that do not consider the hierarchical relationships among classes, and achieved performance comparable to a B-CNN with a class hierarchy based on human prior knowledge.
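Automatic class-hierarchy construction can be illustrated by greedily merging the closest class centroids until a target number of coarse classes remains. This agglomerative sketch is an assumption about the general flavor of data-driven hierarchy building, not the paper's exact procedure.

```python
import numpy as np

def build_hierarchy(X, y, n_coarse=2):
    """Build a coarse class hierarchy from training data: compute each
    class's mean feature vector, then greedily merge the two closest
    groups until n_coarse groups remain. The resulting coarse-to-fine
    mapping is what a branch CNN's coarse output head could be trained
    against."""
    classes = sorted(set(y))
    groups = [[c] for c in classes]
    centroid = {c: X[np.array(y) == c].mean(axis=0) for c in classes}

    def gmean(g):  # centroid of a group of classes
        return np.mean([centroid[c] for c in g], axis=0)

    while len(groups) > n_coarse:
        i, j = min(((i, j) for i in range(len(groups))
                    for j in range(i + 1, len(groups))),
                   key=lambda ij: np.linalg.norm(gmean(groups[ij[0]]) -
                                                 gmean(groups[ij[1]])))
        groups[i] += groups.pop(j)
    return groups

# toy: classes 0/1 are low-intensity activities, 2/3 are high-intensity
X = np.array([[0.1], [0.2], [0.15], [0.25], [5.0], [5.2], [5.1], [5.3]])
y = [0, 0, 1, 1, 2, 2, 3, 3]
hier = build_hierarchy(X, y)
print(hier)  # [[0, 1], [2, 3]]
```

Repeating the merge at several granularities would yield the multi-level coarse/fine label sets that a B-CNN's successive branch outputs are supervised with.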
30
Recognition of Fine-Grained Walking Patterns Using a Smartwatch with Deep Attentive Neural Networks. Sensors (Basel) 2021; 21:6393. [PMID: 34640712] [PMCID: PMC8511983] [DOI: 10.3390/s21196393]
Abstract
Generally, people do various things while walking; for example, people frequently walk while looking at their smartphones. Sometimes we walk differently than usual, such as waddling when walking on ice or snow. Understanding walking patterns could provide users with contextual information tailored to the current situation. To formulate this as a machine learning problem, we defined 18 different everyday walking styles. Noting that walking strategies significantly affect the spatiotemporal features of hand motions, e.g., the speed and intensity of the swinging arm, we propose a smartwatch-based wearable system that can recognize these predefined walking styles. We developed a wearable system, suitable for use with a commercial smartwatch, that captures hand motions in the form of multivariate time-series (MTS) signals. We then employed a set of machine learning algorithms, including feature-based and recent deep learning algorithms, to learn from the MTS data in a supervised fashion. Experimental results demonstrated that, with recent deep learning algorithms, the proposed approach successfully recognized a variety of walking patterns from the smartwatch measurements. We analyzed the results with recent attention-based recurrent neural networks to understand the relative contributions of the MTS signals in the classification process.
31
Hamad RA, Kimura M, Yang L, Woo WL, Wei B. Dilated causal convolution with multi-head self attention for sensor human activity recognition. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06007-5]
Abstract
Systems for sensor-based human activity recognition are becoming increasingly popular in diverse fields such as healthcare and security. Yet developing such systems poses inherent challenges due to the variations and complexity of human behaviors during the performance of physical activities. Recurrent neural networks, particularly long short-term memory (LSTM), have achieved promising results on numerous sequential learning problems, including sensor-based human activity recognition. However, parallelization is inhibited in recurrent networks because their sequential operation and computation lead to slow training, higher memory consumption, and difficult convergence. A one-dimensional convolutional neural network processes temporal input batches independently, so its operations execute efficiently in parallel; despite that, it is not sensitive to the order of the time steps, which is crucial for accurate and robust sensor-based human activity recognition systems. To address this problem, we propose a network architecture based on dilated causal convolution and multi-head self-attention mechanisms that entirely dispenses with recurrent architectures, making computation efficient while maintaining the ordering of the time steps. The proposed method is evaluated for human activities using smart-home binary sensor data and wearable sensor data. Results of extensive experiments on eight public benchmark HAR datasets show that the proposed network outperforms state-of-the-art models based on recurrent settings and temporal models.
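The dilated causal convolution at the core of this architecture can be sketched directly; a dilation of 2 and a two-tap kernel are illustrative choices showing how each output mixes the current sample with one d steps in the past, never touching future steps. Stacking layers with dilations 1, 2, 4, ... grows the receptive field exponentially while keeping every layer parallelizable.

```python
import numpy as np

def dilated_causal_conv(x, kernel, dilation):
    """Dilated causal 1-D convolution: y[t] combines x[t], x[t-d],
    x[t-2d], ... Left padding of (k-1)*d zeros keeps the output the
    same length as the input and prevents any future leakage."""
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])
    return np.array([sum(kernel[j] * padded[pad + t - j * dilation]
                         for j in range(k))
                     for t in range(len(x))])

x = np.arange(1.0, 9.0)  # [1, 2, ..., 8]
y = dilated_causal_conv(x, np.array([1.0, 1.0]), dilation=2)
print(y)  # y[t] = x[t] + x[t-2] -> [1, 2, 4, 6, 8, 10, 12, 14]
```

In the full model, the self-attention heads sit on top of outputs like these, letting the network weight the important time steps that the abstract mentions while the convolution supplies the ordering guarantee.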