1
Zaher M, Ghoneim AS, Abdelhamid L, Atia A. Fusing CNNs and attention-mechanisms to improve real-time indoor Human Activity Recognition for classifying home-based physical rehabilitation exercises. Comput Biol Med 2025; 184:109399. [PMID: 39591669 DOI: 10.1016/j.compbiomed.2024.109399] [Received: 05/14/2024] [Revised: 10/20/2024] [Accepted: 11/07/2024] [Indexed: 11/28/2024]
Abstract
Physical rehabilitation plays a critical role in enhancing health outcomes globally. However, the shortage of physiotherapists, particularly in developing countries where the ratio is approximately ten physiotherapists per million people, poses a significant challenge to effective rehabilitation services. The existing literature on rehabilitation often falls short in data representation and the employment of diverse modalities, limiting the potential for advanced therapeutic interventions. To address this gap, this study integrates Computer Vision and Human Activity Recognition (HAR) technologies to support home-based rehabilitation, exploring various modalities and proposing a framework for data representation. We introduce a novel framework that leverages both Continuous Wavelet Transform (CWT) and Mel-Frequency Cepstral Coefficients (MFCC) for skeletal data representation. CWT is particularly valuable for capturing the time-frequency characteristics of dynamic movements involved in rehabilitation exercises, enabling a comprehensive depiction of both temporal and spectral features. This dual capability is crucial for accurately modelling the complex and variable nature of rehabilitation exercises. In our analysis, we evaluate 20 CNN-based models and one Vision Transformer (ViT) model. Additionally, we propose 12 hybrid architectures that combine CNN-based models with ViT in bi-model and tri-model configurations. These models are rigorously tested on the UI-PRMD and KIMORE benchmark datasets using key evaluation metrics, including accuracy, precision, recall, and F1-score, with 5-fold cross-validation. Our evaluation also considers real-time performance, model size, and efficiency on low-power devices, emphasising practical applicability. The proposed fused tri-model architectures outperform both single-architecture and bi-model configurations, demonstrating robust performance across both datasets and making the fused models the preferred choice for rehabilitation tasks. Our proposed hybrid model, DenMobVit, consistently surpasses state-of-the-art methods, achieving accuracy improvements of 2.9% and 1.97% on the UI-PRMD and KIMORE datasets, respectively. These findings highlight the effectiveness of our approach in advancing rehabilitation technologies and bridging the gap in physiotherapy services.
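As a rough illustration of the time-frequency representation described in this abstract, the sketch below computes a Morlet-wavelet CWT scalogram for a single 1-D series, such as one coordinate of one skeletal joint over time. The abstract does not say which wavelet, scales, or implementation the authors used; the Morlet wavelet, the centre frequency `w0 = 6`, the truncation at four envelope widths, and the function names `morlet`, `cwt_scalogram`, and `dominant_scale` are all assumptions made for this example.

```python
import cmath
import math


def morlet(u, w0=6.0):
    """Complex Morlet wavelet; w0 is an assumed centre frequency."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * u) * math.exp(-0.5 * u * u)


def cwt_scalogram(signal, scales, w0=6.0):
    """Return |W(a, b)| as a list of rows, one row per scale a (in samples)."""
    n = len(signal)
    rows = []
    for a in scales:
        half = int(4 * a)  # truncate the Gaussian envelope at +/- 4 widths
        offsets = range(-half, half + 1)
        # Conjugated, scale-normalised wavelet sampled on the integer grid.
        kernel = [morlet(k / a, w0).conjugate() / math.sqrt(a) for k in offsets]
        row = []
        for b in range(n):
            acc = 0j
            for k, weight in zip(offsets, kernel):
                t = b + k
                if 0 <= t < n:  # samples outside the signal count as zero
                    acc += signal[t] * weight
            row.append(abs(acc))
        rows.append(row)
    return rows


def dominant_scale(signal, scales, w0=6.0):
    """Scale with the largest response at the middle of the signal."""
    scalogram = cwt_scalogram(signal, scales, w0)
    mid = len(signal) // 2
    return max(zip(scales, scalogram), key=lambda pair: pair[1][mid])[0]


if __name__ == "__main__":
    # Synthetic stand-in for a joint trajectory: a sinusoid, period 32 samples.
    series = [math.sin(2 * math.pi * t / 32) for t in range(512)]
    print(dominant_scale(series, [4, 8, 16, 32]))
```

For a pure sinusoid the response peaks near the scale a ≈ w0 · P / (2π), where P is the period in samples, so slower movements show up at larger scales. Stacking the rows of `cwt_scalogram` gives a 2-D array that could be rendered as an image for a CNN, which is one plausible reading of "skeletal data representation" here rather than a confirmed detail of the paper.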
Affiliation(s)
- Moamen Zaher
- Faculty of Computer Science, October University for Modern Sciences and Arts (MSA), Egypt; Human-Computer Interaction (HCI-LAB), Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
- Amr S Ghoneim
- Computer Science Department, Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
- Laila Abdelhamid
- Information Systems Department, Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
- Ayman Atia
- Faculty of Computer Science, October University for Modern Sciences and Arts (MSA), Egypt; Human-Computer Interaction (HCI-LAB), Faculty of Computing and Artificial Intelligence, Helwan University, Egypt.
2
Khalifa NEM, Hamed N. Taha M, M. Khalil H, Malik MH. ONDL: An optimized Neutrosophic Deep Learning model for classifying waste for sustainability. PLoS One 2024; 19:e0313327. [PMID: 39514594 PMCID: PMC11548776 DOI: 10.1371/journal.pone.0313327] [Received: 03/11/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]
Abstract
Sustainability has become a key factor on our planet. If this concept is applied correctly, our planet will be greener and more eco-friendly. Waste classification and management practices are now more prominent than ever and play a crucial role in the sustainability ecosystem. Computer algorithms and deep learning can help with this sustainability challenge. In this paper, an Optimized Neutrosophic Deep Learning (ONDL) model is proposed to classify waste objects. Two datasets were tested: Dataset for Waste Management 1 (DSWM1) and Dataset for Waste Management 2 (DSWM2). DSWM1 contains objects of two classes (Organic or Recycled), and DSWM2 contains objects of three classes (Organic, Recycled, or Non-Recyclable). Both datasets are publicly available on the internet. The ONDL model architecture is built on AlexNet as a Deep Transfer Learning (DTL) model, the conversion of images to the True (T) neutrosophic domain, and Grey Wolf Optimization (GWO) for image feature selection. The selection of the ONDL model's building components is comprehensive: different DTL models (AlexNet, GoogLeNet, and ResNet18) are tested, and three neutrosophic domains (T, I, and F) are included. The ONDL model proved its efficiency against all the tested models; moreover, it achieves competitive results with related works in terms of testing accuracy and performance metrics. On DSWM1, the ONDL model achieved 0.9189, 0.9177, 0.9176, and 0.9177 in Testing Accuracy (TA), Precision (P), Recall (R), and F1 score, respectively. On DSWM2, it achieved 0.8532, 0.7728, 0.7944, and 0.7835 in TA, P, R, and F1 score, respectively.
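The scores reported in this abstract can be cross-checked against each other. The sketch below recomputes F1 as the harmonic mean of the reported precision and recall. Whether the authors computed F1 this way, rather than (for example) averaging per-class F1 scores, is an assumption, and the function name `f1_from_pr` is hypothetical.

```python
def f1_from_pr(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F1) exactly as reported in the abstract.
reported = {
    "DSWM1": (0.9177, 0.9176, 0.9177),
    "DSWM2": (0.7728, 0.7944, 0.7835),
}

for name, (p, r, f1) in reported.items():
    recomputed = f1_from_pr(p, r)
    print(f"{name}: recomputed F1 = {recomputed:.5f}, reported F1 = {f1}")
```

Under that assumption, both recomputed values land within about 0.0001 of the reported F1 scores, which is the level of disagreement expected from rounding the inputs to four decimals.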
Affiliation(s)
- Nour Eldeen Mahmoud Khalifa
- Information Technology Department, Faculty of Computers and Artificial Intelligence, Cairo University, Giza, Egypt
- Mohamed Hamed N. Taha
- Information Technology Department, Faculty of Computers and Artificial Intelligence, Cairo University, Giza, Egypt
- Heba M. Khalil
- Computer Science Department, Faculty of Computers & Artificial Intelligence, Benha University, Benha, Egypt
- Mazhar Hussain Malik
- School of Computing and Creative Technologies, College of Arts, Technology and Environment (CATE), University of the West of England, Frenchay Campus, Bristol, United Kingdom
3
Boyd L, Nnamoko N, Lopes R. Fine-Grained Food Image Recognition: A Study on Optimising Convolutional Neural Networks for Improved Performance. J Imaging 2024; 10:126. [PMID: 38921603 PMCID: PMC11205013 DOI: 10.3390/jimaging10060126] [Received: 02/29/2024] [Revised: 05/06/2024] [Accepted: 05/15/2024] [Indexed: 06/27/2024]
Abstract
Addressing the pressing issue of food waste is vital for environmental sustainability and resource conservation. While computer vision has been widely used in food waste reduction research, existing food image datasets are typically aggregated into broad categories (e.g., fruits, meat, dairy) rather than the fine-grained singular food items required for this research. The aim of this study is to develop a model capable of identifying individual food items, to be integrated into a mobile application that allows users to photograph their food items, identify them, and receive recipe suggestions. This research bridges the gap in available datasets and contributes to a more fine-grained approach to utilising existing technology for food waste reduction, emphasising both environmental and research significance. This study evaluates seven convolutional neural network architectures for multi-class food image classification, emphasising the nuanced impact of parameter tuning to identify the most effective configurations. The experiments were conducted with a custom dataset comprising 41,949 food images categorised into 20 food item classes. Performance evaluation was based on accuracy and loss. The DenseNet architecture emerged as the top performer of the seven examined, establishing a baseline performance (training accuracy = 0.74, training loss = 1.25, validation accuracy = 0.68, and validation loss = 2.89) on a predetermined set of parameters, including the RMSProp optimiser, the ReLU activation function, a dropout rate of 0.5, and a 160×160 image size. Subsequent parameter tuning involved a comprehensive exploration, considering six optimisers, four image sizes, two dropout rates, and five activation functions. The results show the superior generalisation capabilities of the optimised DenseNet, with performance improvements over the established baseline across key metrics. Specifically, the optimised model achieved a training accuracy of 0.99, a training loss of 0.01, a validation accuracy of 0.79, and a validation loss of 0.92. The optimal DenseNet has been integrated into a mobile application called FridgeSnap, designed to recognise food items and suggest possible recipes to users, thus contributing to the broader mission of minimising food waste.
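The size of the tuning search described in this abstract follows from simple counting: if every combination of the six optimisers, four image sizes, two dropout rates, and five activation functions were tried, there would be 6 × 4 × 2 × 5 = 240 configurations. The abstract does not say the search was exhaustive, and it names only the baseline values (RMSProp, ReLU, 0.5, 160×160), so the sketch below uses those plus clearly labelled placeholders. `build_grid` is a hypothetical helper.

```python
from itertools import product


def build_grid(optimisers, image_sizes, dropout_rates, activations):
    """Enumerate every combination of the four hyperparameters."""
    return [
        {"optimiser": o, "image_size": s, "dropout": d, "activation": a}
        for o, s, d, a in product(optimisers, image_sizes, dropout_rates, activations)
    ]


# Only the first value in each list comes from the abstract (the baseline).
# The remaining entries are placeholders, not the values the authors tested.
optimisers = ["RMSProp"] + [f"optimiser_{i}" for i in range(2, 7)]
image_sizes = [160] + [f"image_size_{i}" for i in range(2, 5)]
dropout_rates = [0.5, "dropout_2"]
activations = ["ReLU"] + [f"activation_{i}" for i in range(2, 6)]

grid = build_grid(optimisers, image_sizes, dropout_rates, activations)
print(len(grid))  # 6 * 4 * 2 * 5 = 240
print(grid[0])    # the baseline configuration
```

If the authors instead tuned one parameter at a time while holding the others fixed, the number of runs would be far smaller than the full grid; the abstract does not settle which strategy was used.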
Affiliation(s)
- Liam Boyd
- Apadmi Ltd., Anchorage 2 Salford, The Quays, Manchester M50 3XE, UK
- Nonso Nnamoko
- Department of Computer Science, Edge Hill University, St Helens Road, Ormskirk L39 4QP, UK
- Ricardo Lopes
- Department of Computer Science, Edge Hill University, St Helens Road, Ormskirk L39 4QP, UK
4
Rahman W, Akter M, Sultana N, Farjana M, Uddin A, Mazrur MB, Rahman MM. BDWaste: A comprehensive image dataset of digestible and indigestible waste in Bangladesh. Data Brief 2024; 53:110153. [PMID: 38384312 PMCID: PMC10879810 DOI: 10.1016/j.dib.2024.110153] [Received: 01/07/2024] [Revised: 01/26/2024] [Accepted: 01/30/2024] [Indexed: 02/23/2024]
Abstract
The "BDWaste" dataset contains two significant categories of waste found in Bangladesh, namely digestible and indigestible. Each category comprises 10 distinct species of waste. The digestible species are sugarcane husk, fish ash, potato peel, paper, mango peel, rice, shell of malta, lemon peel, banana peel, and egg shell. The indigestible species are polythene, cans, plastic, glass, wire, gloves, empty medicine packets, chip packets, bottles, and masks. The authors uploaded the primarily collected dataset to Mendeley; it contains a total of 2497 raw images, of which 1234 belong to digestible and 1263 to indigestible species. Each species is stored in a fixed file based on its name and category. Each species also includes indoor (with a visible surface) and outdoor (with a surface that can be seen generally) images. The dataset is free from blurry, dark, noisy, or invisible images. The authors also performed waste classification with pre-trained convolutional neural network models such as MobileNetV2 and InceptionV3, obtaining a highest accuracy of 96.70% for indigestible waste classification and 99.70% for digestible waste classification. The authors expect that this data can be used in future research on sustainable development, sustainable environments, agricultural development, recycling processes, and other computer vision-based applications.
Affiliation(s)
- Wahidur Rahman
- Department of Computer Science and Engineering, Uttara University, Dhaka, Bangladesh
- Department of Computer Science and Engineering, Mawlana Bhashani Science & Technology University, Tangail, Bangladesh
- Mohona Akter
- Department of Computer Science and Engineering, Uttara University, Dhaka, Bangladesh
- Nahida Sultana
- Department of Computer Science and Engineering, Uttara University, Dhaka, Bangladesh
- Maisha Farjana
- Department of Computer Science and Engineering, Uttara University, Dhaka, Bangladesh
- Arfan Uddin
- Department of Computer Science and Engineering, Uttara University, Dhaka, Bangladesh
- Md. Bakhtiar Mazrur
- Department of Computer Science and Engineering, Uttara University, Dhaka, Bangladesh
- Mohammad Motiur Rahman
- Department of Computer Science and Engineering, Mawlana Bhashani Science & Technology University, Tangail, Bangladesh