1
Ahmedt-Aristizabal D, Armin MA, Hayder Z, Garcia-Cairasco N, Petersson L, Fookes C, Denman S, McGonigal A. Deep learning approaches for seizure video analysis: A review. Epilepsy Behav 2024; 154:109735. [PMID: 38522192] [DOI: 10.1016/j.yebeh.2024.109735]
Abstract
Seizure events can manifest as transient disruptions in the control of movements, which may be organized in distinct behavioral sequences and accompanied or not by other observable features such as altered facial expressions. The analysis of these clinical signs, referred to as semiology, is subject to observer variation when specialists evaluate video-recorded events in the clinical setting. To enhance the accuracy and consistency of evaluations, computer-aided video analysis of seizures has emerged as a natural avenue. In the field of medical applications, deep learning and computer vision approaches have driven substantial advancements. Historically, these approaches have been used for disease detection, classification, and prediction using diagnostic data; however, there has been limited exploration of their application to video-based motion detection in the clinical epileptology setting. While vision-based technologies do not aim to replace clinical expertise, they can significantly contribute to medical decision-making and patient care by providing quantitative evidence and decision support. Behavior monitoring tools offer several advantages, such as providing objective information, detecting challenging-to-observe events, reducing documentation efforts, and extending assessment capabilities to areas with limited expertise. The main applications of these tools are (1) improved seizure detection methods and (2) refined semiology analysis for predicting seizure type and cerebral localization. In this paper, we detail the foundation technologies used in vision-based systems for the analysis of seizure videos, highlighting their success in semiology detection and analysis, focusing on work published in the last 7 years. We systematically present these methods and indicate how the adoption of deep learning for the analysis of video recordings of seizures could be approached. Additionally, we illustrate how existing technologies can be interconnected through an integrated system for video-based semiology analysis. Each module can be customized and improved by adapting more accurate and robust deep learning approaches as these evolve. Finally, we discuss challenges and research directions for future studies.
Affiliation(s)
- David Ahmedt-Aristizabal
- Imaging and Computer Vision Group, CSIRO Data61, Australia; SAIVT Laboratory, Queensland University of Technology, Australia.
- Zeeshan Hayder
- Imaging and Computer Vision Group, CSIRO Data61, Australia.
- Norberto Garcia-Cairasco
- Physiology Department and Neuroscience and Behavioral Sciences Department, Ribeirão Preto Medical School, University of São Paulo, Brazil.
- Lars Petersson
- Imaging and Computer Vision Group, CSIRO Data61, Australia.
- Clinton Fookes
- SAIVT Laboratory, Queensland University of Technology, Australia.
- Simon Denman
- SAIVT Laboratory, Queensland University of Technology, Australia.
- Aileen McGonigal
- Neurosciences Centre, Mater Hospital, Australia; Queensland Brain Institute, The University of Queensland, Australia.
2
Camarena F, Gonzalez-Mendoza M, Chang L. Knowledge Distillation in Video-Based Human Action Recognition: An Intuitive Approach to Efficient and Flexible Model Training. J Imaging 2024; 10:85. [PMID: 38667983] [PMCID: PMC11051277] [DOI: 10.3390/jimaging10040085]
Abstract
Training a model to recognize human actions in videos is computationally intensive. While modern strategies employ transfer learning to make the process more efficient, they still face challenges regarding flexibility and efficiency. Existing solutions are limited in functionality and rely heavily on pretrained architectures, which can restrict their applicability to diverse scenarios. Our work explores knowledge distillation (KD) for enhancing the training of self-supervised video models in three aspects: improving classification accuracy, accelerating model convergence, and increasing model flexibility under regular and limited-data scenarios. We tested our method on the UCF101 dataset using differently balanced proportions: 100%, 50%, 25%, and 2%. We found that using knowledge distillation to guide the model's training outperforms traditional training, preserving classification accuracy while speeding up convergence in both standard and data-scarce settings. Additionally, knowledge distillation enables cross-architecture flexibility, allowing model customization for various applications, from resource-limited to high-performance scenarios.
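The soft-target objective that such distillation approaches build on can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the temperature value and toy logits are assumptions:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; a higher T softens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened teacher and student
    outputs, scaled by T^2 (the classic Hinton-style soft-target loss)."""
    p = softmax(teacher_logits, T)  # teacher's soft targets
    q = softmax(student_logits, T)  # student's softened predictions
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# The loss vanishes when the student matches the teacher exactly
# and grows as their predictions diverge.
```

In practice this term is usually mixed with the ordinary cross-entropy on hard labels; the mixing weight, like T, is a tunable hyperparameter.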
Affiliation(s)
- Fernando Camarena
- School of Engineering and Science, Tecnologico de Monterrey, Nuevo León 64700, Mexico
3
Gao M, Ju B. Attention-enhanced gated recurrent unit for action recognition in tennis. PeerJ Comput Sci 2024; 10:e1804. [PMID: 38259901] [PMCID: PMC10803087] [DOI: 10.7717/peerj-cs.1804]
Abstract
Human Action Recognition (HAR) is an essential topic in computer vision and artificial intelligence, focused on the automatic identification and categorization of human actions or activities from video sequences or sensor data. The goal of HAR is to teach machines to comprehend and interpret human movements, gestures, and behaviors, enabling a wide range of applications in areas such as surveillance, healthcare, sports analysis, and human-computer interaction. HAR systems utilize a variety of techniques, including deep learning, motion analysis, and feature extraction, to capture and analyze the spatiotemporal characteristics of human actions. These systems can distinguish between various actions, whether simple ones like walking and waving or more complex activities such as playing a musical instrument or performing sports maneuvers. HAR continues to be an active area of research and development, with the potential to enhance numerous real-world applications by providing machines with the ability to understand and respond to human actions effectively. In our study, we developed a HAR system to recognize actions in tennis using an attention-based gated recurrent unit (GRU), a prevalent recurrent neural network. The combination of the GRU architecture and an attention mechanism showed a significant improvement in predictive power compared to two other deep learning models. Our models were trained on the THETIS dataset, one of the standard medium-sized datasets for fine-grained tennis actions. The effectiveness of the proposed model was confirmed with three different image encoders: InceptionV3, DenseNet, and EfficientNetB5. The models developed with InceptionV3, DenseNet, and EfficientNetB5 achieved average ROC-AUC values of 0.97, 0.98, and 0.81, respectively, and average PR-AUC values of 0.84, 0.87, and 0.49, respectively. The experimental results confirmed the applicability of our proposed method to recognizing actions in tennis, and it may be applied to other HAR problems.
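The attention mechanism paired with the GRU can be illustrated by a simple dot-product attention pool over per-frame feature vectors. This is a sketch with assumed toy features; the paper's exact attention formulation may differ:

```python
import math

def attention_pool(frames, query):
    """Score each frame feature against a query vector, softmax the
    scores into weights, and return the weighted average feature."""
    scores = [sum(q * x for q, x in zip(query, f)) for f in frames]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * f[i] for w, f in zip(weights, frames))
              for i in range(len(query))]
    return pooled, weights

# Frames that align with the query dominate the pooled representation,
# letting the classifier focus on the discriminative part of a stroke.
```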
Affiliation(s)
- Meng Gao
- College of Sports and Health Management, Henan Finance University, Zhengzhou, China
- Bingchun Ju
- College of Sports, Zhengzhou University of Light Industry, Zhengzhou, China
4
Guerra BMV, Torti E, Marenzi E, Schmid M, Ramat S, Leporati F, Danese G. Ambient assisted living for frail people through human activity recognition: state-of-the-art, challenges and future directions. Front Neurosci 2023; 17:1256682. [PMID: 37849892] [PMCID: PMC10577184] [DOI: 10.3389/fnins.2023.1256682]
Abstract
Ambient Assisted Living is a concept that focuses on using technology to support and enhance the quality of life and well-being of frail or elderly individuals in both indoor and outdoor environments. It aims to empower individuals to maintain their independence and autonomy while ensuring their safety and providing assistance when needed. Human Activity Recognition is widely regarded as the most popular methodology within the field of Ambient Assisted Living. Human Activity Recognition involves automatically detecting and classifying the activities performed by individuals using sensor-based systems. Researchers have employed various methodologies, utilizing wearable and/or non-wearable sensors, and applying algorithms ranging from simple threshold-based techniques to more advanced deep learning approaches. In this review, literature from the past decade is critically examined, specifically exploring the technological aspects of Human Activity Recognition in Ambient Assisted Living. An exhaustive analysis of the methodologies adopted is provided, highlighting their strengths and weaknesses. Finally, challenges encountered in the field of Human Activity Recognition for Ambient Assisted Living are thoroughly discussed. These challenges encompass issues related to data collection, model training, real-time performance, generalizability, and user acceptance. Miniaturization, unobtrusiveness, energy harvesting and communication efficiency will be the crucial factors for new wearable solutions.
Affiliation(s)
- Bruna Maria Vittoria Guerra
- Bioengineering Laboratory, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Emanuele Torti
- Custom Computing and Programmable Systems Laboratory, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Elisa Marenzi
- Custom Computing and Programmable Systems Laboratory, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Micaela Schmid
- Bioengineering Laboratory, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Stefano Ramat
- Bioengineering Laboratory, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Francesco Leporati
- Custom Computing and Programmable Systems Laboratory, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Giovanni Danese
- Custom Computing and Programmable Systems Laboratory, Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
5
Maudsley-Barton S, Yap MH. KINECAL: A Dataset for Falls-Risk Assessment and Balance Impairment Analysis. Sci Data 2023; 10:633. [PMID: 37723189] [PMCID: PMC10507078] [DOI: 10.1038/s41597-023-02375-w]
Abstract
The field of human action recognition has made great strides in recent years, helped greatly by the availability of a wide variety of datasets that use Kinect to record human movement. Conversely, progress towards the use of Kinect in clinical practice has been hampered by the lack of appropriate data, in particular datasets that contain clinically significant movements and appropriate metadata. This paper proposes a dataset to address this issue, namely KINECAL. It contains recordings of 90 individuals carrying out 11 movements commonly used in the clinical assessment of balance. The dataset contains relevant metadata, including clinical labelling, falls-history labelling and postural sway metrics. KINECAL should be of interest to researchers working on the clinical use of motion capture and motion analysis.
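One of the postural sway metrics such a dataset enables is sway path length, the total distance travelled by a tracked point across frames. A minimal sketch follows; the choice of joint and the units are assumptions, not KINECAL's definition:

```python
import math

def sway_path_length(points):
    """Total distance travelled by a tracked point (e.g. a skeleton
    joint projected onto the ground plane) over consecutive frames."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))
```

Larger path lengths over a fixed recording period generally indicate poorer postural control.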
Affiliation(s)
- Sean Maudsley-Barton
- Department of Computing and Mathematics, Manchester Metropolitan University, Faculty of Science and Engineering, Manchester, M1 5GD, UK.
- Moi Hoon Yap
- Department of Computing and Mathematics, Manchester Metropolitan University, Faculty of Science and Engineering, Manchester, M1 5GD, UK.
6
Muaaz M, Waqar S, Pätzold M. Orientation-Independent Human Activity Recognition Using Complementary Radio Frequency Sensing. Sensors (Basel) 2023; 23:5810. [PMID: 37447660] [DOI: 10.3390/s23135810]
Abstract
RF sensing offers an unobtrusive, user-friendly, and privacy-preserving method for detecting accidental falls and recognizing human activities. Contemporary RF-based HAR systems generally employ a single monostatic radar to recognize human activities. However, a single monostatic radar cannot detect the motion of a target, e.g., a moving person, orthogonal to the boresight axis of the radar. Owing to this inherent physical limitation, a single monostatic radar fails to efficiently recognize orientation-independent human activities. In this work, we present a complementary RF sensing approach that overcomes the limitation of existing single monostatic radar-based HAR systems to robustly recognize orientation-independent human activities and falls. Our approach used a distributed mmWave MIMO radar system that was set up as two separate monostatic radars placed orthogonal to each other in an indoor environment. These two radars illuminated the moving person from two different aspect angles and consequently produced two time-variant micro-Doppler signatures. We first computed the mean Doppler shifts (MDSs) from the micro-Doppler signatures and then extracted statistical and time- and frequency-domain features. We adopted feature-level fusion techniques to fuse the extracted features and a support vector machine to classify orientation-independent human activities. To evaluate our approach, we used an orientation-independent human activity dataset, which was collected from six volunteers. The dataset consisted of more than 1350 activity trials of five different activities that were performed in different orientations. The proposed complementary RF sensing approach achieved an overall classification accuracy ranging from 98.31 to 98.54%. It overcame the inherent limitations of a conventional single monostatic radar-based HAR and outperformed it by 6%.
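Feature-level fusion of the two radars' mean-Doppler-shift traces can be sketched as computing per-radar statistical features and concatenating them for the downstream classifier. This is an illustration only; the paper's actual feature set is larger and includes time- and frequency-domain features:

```python
import math

def stat_features(mds):
    """A few statistical features of one radar's mean-Doppler-shift trace."""
    mu = sum(mds) / len(mds)
    var = sum((x - mu) ** 2 for x in mds) / len(mds)
    return [mu, math.sqrt(var), max(mds) - min(mds)]  # mean, std, peak-to-peak

def fuse_features(mds_radar_a, mds_radar_b):
    """Feature-level fusion: concatenate the per-radar feature vectors
    into one vector for the downstream SVM."""
    return stat_features(mds_radar_a) + stat_features(mds_radar_b)
```

Because the two radars view the person from orthogonal aspect angles, at least one of the concatenated halves carries a strong Doppler signature regardless of the person's orientation.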
Affiliation(s)
- Muhammad Muaaz
- Faculty of Engineering and Science, University of Agder, 4898 Grimstad, Norway
- Sahil Waqar
- Faculty of Engineering and Science, University of Agder, 4898 Grimstad, Norway
- Matthias Pätzold
- Faculty of Engineering and Science, University of Agder, 4898 Grimstad, Norway
7
Kulbacki M, Segen J, Chaczko Z, Rozenblit JW, Kulbacki M, Klempous R, Wojciechowski K. Intelligent Video Analytics for Human Action Recognition: The State of Knowledge. Sensors (Basel) 2023; 23:4258. [PMID: 37177461] [PMCID: PMC10181781] [DOI: 10.3390/s23094258]
Abstract
The paper presents a comprehensive overview of intelligent video analytics and human action recognition methods, surveying the current state of knowledge in human activity recognition across techniques such as pose-based, tracking-based, spatio-temporal, and deep learning-based approaches, including visual transformers. We also discuss the challenges and limitations of these techniques and the potential of modern edge AI architectures to enable real-time human action recognition in resource-constrained environments.
Affiliation(s)
- Marek Kulbacki
- Polish-Japanese Academy of Information Technology, 02-008 Warsaw, Poland
- DIVE IN AI, 53-307 Wroclaw, Poland
- Jakub Segen
- Polish-Japanese Academy of Information Technology, 02-008 Warsaw, Poland
- DIVE IN AI, 53-307 Wroclaw, Poland
- Zenon Chaczko
- DIVE IN AI, 53-307 Wroclaw, Poland
- School of Electrical and Data Engineering, University of Technology Sydney, Ultimo 2007, Australia
- Jerzy W Rozenblit
- Department of Electrical and Computer Engineering, The University of Arizona, Tucson, AZ 85721, USA
- Ryszard Klempous
- Wrocław University of Science and Technology, 50-370 Wroclaw, Poland
8
Jamshed A, Mallick B, Kumar Bharti R. Grey wolf optimization (GWO) with the convolution neural network (CNN)-based pattern recognition system. Imaging Sci J 2023. [DOI: 10.1080/13682199.2023.2166193]
Affiliation(s)
- Aatif Jamshed
- Department of Computer Science and Engineering, Veer Madho Singh Bhandari Uttarakhand Technical University, Dehradun, Uttarakhand, India
- Bhawna Mallick
- Meerut Institute of Engineering and Technology, Meerut, Uttar Pradesh, India
9
Toward human activity recognition: a survey. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07937-4]
10
A Comprehensive Review of Recent Deep Learning Techniques for Human Activity Recognition. Comput Intell Neurosci 2022; 2022:8323962. [PMID: 35498187] [PMCID: PMC9045967] [DOI: 10.1155/2022/8323962]
Abstract
Human action recognition is an important field in computer vision that has attracted remarkable attention from researchers. This survey aims to provide a comprehensive overview of recent human action recognition approaches based on deep learning using RGB video data. Our work divides recent deep learning-based methods into five different categories to provide a comprehensive overview for researchers who are interested in this field of computer vision. Moreover, pure-transformer (convolution-free) architectures have recently outperformed their convolutional counterparts in many fields of computer vision. Our work also covers recent convolution-free methods, which replace convolutional networks with transformers and have achieved state-of-the-art results on many human action recognition datasets. Firstly, we discuss proposed methods based on 2D convolutional neural networks. Then, methods based on recurrent neural networks, used to capture motion information, are discussed. 3D convolutional neural network-based methods are used in many recent approaches to capture both spatial and temporal information in videos. For long action videos, multistream approaches that use separate streams to encode different features are reviewed. We also compare the performance of recently proposed methods on four popular benchmark datasets and review 26 benchmark datasets for human action recognition. Some potential research directions are discussed to conclude this survey.
11
Neural Networks for Automatic Posture Recognition in Ambient-Assisted Living. Sensors (Basel) 2022; 22:2609. [PMID: 35408224] [PMCID: PMC9003043] [DOI: 10.3390/s22072609]
Abstract
Human Action Recognition (HAR) is a rapidly evolving field impacting numerous domains, among which is Ambient Assisted Living (AAL). In such a context, the aim of HAR is to meet the needs of frail individuals, whether elderly or disabled, and to promote autonomous, safe and secure living. To this end, we propose a monitoring system that detects dangerous situations by classifying human postures through Artificial Intelligence (AI) solutions. The developed algorithm works on a set of features computed from the skeleton data provided by four Kinect One systems simultaneously recording the scene from different angles, identifying the posture of the subject in an ecological context within each recorded frame. Here, we compare the recognition abilities of Multi-Layer Perceptron (MLP) and Long Short-Term Memory (LSTM) sequence networks. Starting from the set of previously selected features, we performed a further feature selection based on an SVM algorithm for the optimization of the MLP network, and used a genetic algorithm for selecting the features for the LSTM sequence model. We then optimized the architecture and hyperparameters of both models before comparing their performances. The best MLP model (3 hidden layers and a Softmax output layer) achieved 78.4%, while the best LSTM (2 bidirectional LSTM layers, 2 dropout layers and a fully connected layer) reached 85.7%. The analysis of the performances on individual classes highlights the better suitability of the LSTM approach.
12
Implementation of Sequence-Based Classification Methods for Motion Assessment and Recognition in a Traditional Chinese Sport (Baduanjin). Int J Environ Res Public Health 2022; 19:1744. [PMID: 35162767] [PMCID: PMC8834705] [DOI: 10.3390/ijerph19031744]
Abstract
This study aimed to assess the motion accuracy of Baduanjin and recognise the motions of Baduanjin based on sequence-based methods. Motion data of Baduanjin were measured by an inertial measurement unit (IMU) system. Fifty-four participants were recruited to capture motion data. Based on the motion data, various sequence-based methods, namely dynamic time warping (DTW) combined with classifiers, hidden Markov models (HMM), and recurrent neural networks (RNNs), were applied to assess motion accuracy and recognise the motions of Baduanjin. To assess motion accuracy, the scores for motion accuracy given by teachers were used as the standard to train the models with the different sequence-based methods. The effectiveness of Baduanjin motion recognition with different sequence-based methods was verified. Among the methods, DTW + k-NN had the highest average accuracy (83.03%) and the shortest average processing time (3.810 s) during assessment. In terms of motion recognition, three methods (DTW + k-NN, DTW + SVM, and HMM) achieved the highest accuracies (over 99%), which were not significantly different from each other. However, the processing time of DTW + k-NN was the shortest (3.823 s) of the three. The results show that the motions of Baduanjin can be recognised, and their accuracy assessed, through an appropriate sequence-based method applied to the motion data captured by the IMU.
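The best-performing combination, DTW + k-NN, reduces to a warping-invariant distance plus nearest-neighbour voting. The sketch below is one-dimensional with hypothetical templates; real IMU data are multi-channel:

```python
def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

def knn_classify(query, labelled, k=1):
    """Label a motion sequence by majority vote of its k nearest
    DTW neighbours among labelled template sequences."""
    dists = sorted((dtw(query, seq), lbl) for seq, lbl in labelled)
    votes = [lbl for _, lbl in dists[:k]]
    return max(set(votes), key=votes.count)
```

DTW tolerates the tempo differences between practitioners that would defeat a plain Euclidean frame-by-frame comparison.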
13
Bossavit B, Arnedillo-Sánchez I. Using motion capture technology to assess locomotor development in children. Digit Health 2022; 8:20552076221144201. [PMID: 36532118] [PMCID: PMC9756361] [DOI: 10.1177/20552076221144201]
Abstract
OBJECTIVE Motor and cognitive development share a biological background within the prefrontal cortex and cerebellum. Monitoring motor development is relevant to identify children at risk of developmental delays. However, access to timely assessment is limited by availability and cost. Affordable motion capture technology may provide an alternative to human assessment. METHODS MotorSense uses this technology to guide and assess children executing age-related developmental motor tasks. It incorporates advanced heuristics informed by pattern recognition principles based on the developmental sequences of motor skills. MotorSense was evaluated with 16 children aged 4-6 years from a rural primary school. RESULTS A total of 506 jumps, 2415 steps and 831 hops were analysed. The analysis shows that MotorSense accuracy (MA) in recognising jump forward (89.96%), jump high (83.34%), jump sideways (85.63%), hop (74.58%) and jog (92.34%) is as good as the sensor's precision. The analysis of the tasks' execution shows a high level of agreement between human and MotorSense assessment on jump forward (91%), jump high (99%), jump sideways (93%), hop (94%) and jog (92%). CONCLUSIONS MotorSense helps address the shortage of affordable technologies to support the assessment of motor development using graded age-related developmental motor tasks. Furthermore, it could contribute towards the tele-detection of motor developmental delays.
Affiliation(s)
- Benoit Bossavit
- School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
- School of Computer Science and Languages, Universidad de Malaga, Malaga, Spain
14
Ramos RG, Domingo JD, Zalama E, Gómez-García-Bermejo J. Daily Human Activity Recognition Using Non-Intrusive Sensors. Sensors (Basel) 2021; 21:5270. [PMID: 34450709] [PMCID: PMC8401661] [DOI: 10.3390/s21165270]
Abstract
In recent years, Artificial Intelligence Technologies (AIT) have been developed to improve the quality of life of the elderly and their safety in the home. This work focuses on developing a system capable of recognising the most usual activities in the daily life of an elderly person in real time, enabling a specialist to monitor the person's habits, such as taking medication or eating the correct meals of the day. To this end, a prediction model has been developed based on recurrent neural networks, specifically on bidirectional LSTM networks, to obtain in real time the activity being carried out by individuals in their homes, based on the information provided by a set of different sensors installed at each person's home. The prediction model developed in this paper achieves a 95.42% accuracy rate, improving on the results of similar models currently in use. In order to obtain a reliable model with a high accuracy rate, a series of processing and filtering steps have been applied to the data, obtained from the public CASAS database, such as a sliding-window method and a stacking and re-ordering algorithm, before the data are used to train the neural network.
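The sliding-window preprocessing mentioned above can be sketched as splitting the time-ordered sensor event stream into fixed-length, overlapping windows before they are fed to the bidirectional LSTM. The window size and step below are illustrative choices, not the paper's values:

```python
def sliding_windows(events, size, step):
    """Split a time-ordered event stream into fixed-length,
    overlapping windows; each window becomes one training sample."""
    return [events[i:i + size]
            for i in range(0, len(events) - size + 1, step)]
```

Overlapping windows (step smaller than size) multiply the number of training samples and let the network see each transition between activities in several contexts.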
Affiliation(s)
- Raúl Gómez Ramos
- CARTIF Technological Center, 47151 Valladolid, Spain
- ITAP-DISA, University of Valladolid, 47002 Valladolid, Spain
- Jaime Duque Domingo
- CARTIF Technological Center, 47151 Valladolid, Spain
- Eduardo Zalama
- CARTIF Technological Center, 47151 Valladolid, Spain
- ITAP-DISA, University of Valladolid, 47002 Valladolid, Spain
- Jaime Gómez-García-Bermejo
- CARTIF Technological Center, 47151 Valladolid, Spain
- ITAP-DISA, University of Valladolid, 47002 Valladolid, Spain
15
Evaluating the Performance of Eigenface, Fisherface, and Local Binary Pattern Histogram-Based Facial Recognition Methods under Various Weather Conditions. Technologies 2021. [DOI: 10.3390/technologies9020031]
Abstract
Facial recognition (FR) in unconstrained weather is still challenging and has been surprisingly overlooked by many researchers and practitioners over the past few decades. Therefore, this paper aims to evaluate the performance of three existing popular facial recognition methods under different weather conditions. To this end, a new face dataset, the Lamar University database (LUDB), was developed, containing face images captured under various weather conditions such as foggy, cloudy, rainy, and sunny. Three very popular FR methods, Eigenface (EF), Fisherface (FF), and Local binary pattern histogram (LBPH), were evaluated on two other face datasets, AT&T and 5_Celebrity, along with LUDB, in terms of accuracy, precision, recall, and F1 score with 95% confidence intervals (CI). Computational results show a significant difference among the three FR techniques in terms of overall time complexity and accuracy. LBPH outperforms the other two FR algorithms on both the LUDB and 5_Celebrity datasets, achieving 40% and 95% accuracy, respectively. On the other hand, with minimum execution times of 1.37, 1.37, and 1.44 s per image on AT&T, 5_Celebrity, and LUDB, respectively, Fisherface was the fastest.
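The LBP operator at the heart of the best-performing method can be sketched as follows. This is a minimal grayscale version; practical LBPH implementations (e.g. OpenCV's) add radius, neighbour-count, and spatial-grid parameters:

```python
def lbp_code(img, r, c):
    """8-neighbour local binary pattern code for pixel (r, c):
    each neighbour >= centre contributes one bit."""
    centre = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over interior pixels; faces are
    matched by comparing such histograms (e.g. chi-square distance)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Because each code depends only on intensity ordering around a pixel, LBPH is relatively robust to the monotonic illumination changes that foggy or cloudy weather introduces, which is consistent with its strong showing here.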
16
Liang JM, Chung PL, Ye YJ, Mishra S. Applying Machine Learning Technologies Based on Historical Activity Features for Multi-Resident Activity Recognition. Sensors (Basel) 2021; 21:2520. [PMID: 33916549] [PMCID: PMC8038457] [DOI: 10.3390/s21072520]
Abstract
Due to the aging population, home care for the elderly has become very important. Many current studies focus on deploying various sensors in the house to recognize the home activities of the elderly, especially those living alone. Through these, the home situation of a person living alone can be detected and their living safety ensured. However, the living environment of the elderly includes not only people living alone but also multiple people living together. Traditional methods applied to a multi-resident environment cannot accurately identify the "individual" activities of each person; they fail to distinguish which person was involved in which activities and thus cannot provide personal care. Therefore, this research investigates how to recognize home activities in multi-resident living environments, in order to accurately associate residents with home activities. Specifically, we propose to use characteristics of residents' historical activity in a multi-person environment, including activity interaction, activity frequency, activity period length, and residential behaviors, and then apply a suite of machine learning methods for training and testing. Five traditional supervised learning models and two deep learning methods are explored to tackle this problem. Through experiments with real datasets, the proposed methods achieved higher precision, recall and accuracy with less training time. The best accuracy reaches up to 91% and 95%, by J48DT and LSTM, respectively, in different living environments.
Affiliation(s)
- Jia-Ming Liang
- Department of Electrical Engineering, National University of Tainan, Tainan 70005, Taiwan
- Ping-Lin Chung
- Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan 33302, Taiwan
- Yi-Jyun Ye
- Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan 33302, Taiwan
- Shashank Mishra
- Department of Electrical Engineering, National University of Tainan, Tainan 70005, Taiwan
17
Abstract
A reliable environment perception is a crucial task for autonomous driving, especially in dense traffic areas. Recent improvements and breakthroughs in scene understanding for intelligent transportation systems are mainly based on deep learning and the fusion of different modalities. In this context, we introduce OLIMP: A heterOgeneous Multimodal Dataset for Advanced EnvIronMent Perception. This is the first public, multimodal and synchronized dataset to include UWB radar data, acoustic data, narrow-band radar data and images. OLIMP comprises 407 scenes and 47,354 synchronized frames covering four categories: pedestrian, cyclist, car and tram. The dataset includes various challenges related to dense urban traffic, such as cluttered environments and different weather conditions. To demonstrate the usefulness of the introduced dataset, we propose a fusion framework that combines the four modalities for multi-object detection. The obtained results are promising and encourage future research.