1
Al Mudawi N, Ansar H, Alazeb A, Aljuaid H, AlQahtani Y, Algarni A, Jalal A, Liu H. Innovative healthcare solutions: robust hand gesture recognition of daily life routines using 1D CNN. Front Bioeng Biotechnol 2024; 12:1401803. [PMID: 39144478 PMCID: PMC11322365 DOI: 10.3389/fbioe.2024.1401803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Accepted: 06/26/2024] [Indexed: 08/16/2024]
Abstract
Introduction: Hand gestures are an effective communication tool that can convey a wealth of information in a variety of sectors, including medicine and education. E-learning has grown significantly in the last several years and is now an essential resource for many businesses. Still, little research has been conducted on the use of hand gestures in e-learning. Similarly, medical professionals frequently use gestures to help with diagnosis and treatment. Method: We aim to improve the way instructors, students, and medical professionals receive information by introducing a dynamic method for hand gesture monitoring and recognition. Six modules make up our approach: video-to-frame conversion, preprocessing for quality enhancement, hand skeleton mapping with single shot multibox detector (SSMD) tracking, hand detection using background modeling and a convolutional neural network (CNN) bounding box technique, feature extraction using point-based and full-hand coverage techniques, and optimization using a population-based incremental learning algorithm. Next, a 1D CNN classifier is used to identify hand motions. Results: After extensive experimentation, we obtained hand tracking accuracies of 83.71% and 85.71% on the Indian Sign Language and WLASL datasets, respectively. Our findings show how well our method recognizes hand motions. Discussion: Teachers, students, and medical professionals can all efficiently transmit and comprehend information by using our suggested system. The obtained accuracy rates highlight how our method might improve communication and make information exchange easier in various domains.
Affiliation(s)
- Naif Al Mudawi: Department of Computer Science, College of Computer Science and Information System, Najran University, Najran, Saudi Arabia
- Hira Ansar: Department of Computer Science, Air University, Islamabad, Pakistan
- Abdulwahab Alazeb: Department of Computer Science, College of Computer Science and Information System, Najran University, Najran, Saudi Arabia
- Hanan Aljuaid: Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
- Yahay AlQahtani: Department of Computer Science, King Khalid University, Abha, Saudi Arabia
- Asaad Algarni: Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
- Ahmad Jalal: Department of Computer Science, Air University, Islamabad, Pakistan
- Hui Liu: Cognitive Systems Lab, University of Bremen, Bremen, Germany
2
Lian D, Chen X, Li J, Luo W, Gao S. Locating and Counting Heads in Crowds With a Depth Prior. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:9056-9072. [PMID: 34735337 DOI: 10.1109/tpami.2021.3124956] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
To simultaneously estimate the number of heads and locate heads with bounding boxes, we resort to detection-based crowd counting by leveraging RGB-D data and design a dual-path guided detection network (DPDNet). Specifically, to improve the performance of detection-based approaches for dense/tiny heads, we propose a density map guided detection module, which leverages density map to improve the head/non-head classification in detection network where the density implies the probability of a pixel being a head, and a depth-adaptive kernel that considers the variances in head sizes is also introduced to generate high-fidelity density map for more robust density map regression. In order to prevent dense heads from being filtered out during post-processing, we utilize such a density map for post-processing of head detection and propose a density map guided NMS strategy. Meanwhile, to improve the ability of detecting small heads, we also propose a depth-guided detection module to generate a dynamic dilated convolution to extract features of heads of different scales, and a depth-aware anchor is further designed for better initialization of anchor sizes in the detection framework. Then we use the bounding boxes whose sizes are generated with depth to train our DPDNet. Considering that existing RGB-D datasets are too small and not suitable for performance evaluation of data-driven based approaches, we collect two large-scale RGB-D crowd counting datasets, which comprise a synthetic dataset and a real-world dataset, respectively. Since the depth value at long-distance positions cannot be obtained in the real-world dataset, we further propose a depth completion method with meta learning, which fully utilizes the synthetic depth data to complete the depth value at long-distance positions. Extensive experiments on our proposed two RGB-D datasets and the MICC RGB-D counting dataset show that our method achieves the best performance for RGB-D crowd counting and localization. 
Further, our method can be easily extended to RGB image based crowd counting and achieves comparable or even better performance on the RGB datasets for both head counting and localization.
3
Mo H, Ren W, Zhang X, Yan F, Zhou Z, Cao X, Wu W. Attention-Guided Collaborative Counting. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2022; 31:6306-6319. [PMID: 36178989 DOI: 10.1109/tip.2022.3207584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Existing crowd counting designs usually exploit multi-branch structures to address the scale diversity problem. However, branches in these structures work in a competitive rather than collaborative way. In this paper, we focus on promoting collaboration between branches. Specifically, we propose an attention-guided collaborative counting module (AGCCM) comprising an attention-guided module (AGM) and a collaborative counting module (CCM). The CCM promotes collaboration among branches by recombining each branch's output into an independent count and joint counts with other branches. The AGM, which captures the global attention map through a transformer structure with a pair of foreground-background related loss functions, can distinguish the advantages of different branches. The loss functions require neither additional labels nor crowd division. In addition, we design two kinds of bidirectional transformers (Bi-Transformers) to decouple the global attention into row attention and column attention. The proposed Bi-Transformers reduce the computational complexity and handle images of any resolution without cropping the image into small patches. Extensive experiments on several public datasets demonstrate that the proposed algorithm performs favorably against the state-of-the-art crowd counting methods.
4
Semantic Segmentation Based Crowd Tracking and Anomaly Detection via Neuro-fuzzy Classifier in Smart Surveillance System. Arabian Journal for Science and Engineering 2022. [DOI: 10.1007/s13369-022-07092-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
5
|
|
6
Zhou Y, Yang J, Li H, Cao T, Kung SY. Adversarial Learning for Multiscale Crowd Counting Under Complex Scenes. IEEE Transactions on Cybernetics 2021; 51:5423-5432. [PMID: 31905157 DOI: 10.1109/tcyb.2019.2956091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
In this article, a multiscale generative adversarial network (MS-GAN) is proposed for generating high-quality crowd density maps of arbitrary crowd density scenes. The task of crowd counting has many challenges, such as severe occlusions in extremely dense crowd scenes, perspective distortion, and high visual similarity between the pedestrians and background elements. To address these problems, the proposed MS-GAN combines a multiscale convolutional neural network (generator) and an adversarial network (discriminator) to generate a high-quality density map and accurately estimate the crowd count in complex crowd scenes. The multiscale generator utilizes the fusion features from multiple hierarchical layers to detect people with large-scale variation. The resulting density map produced by the multiscale generator is processed by a discriminator network trained to solve a binary classification task between a poor quality density map and real ground-truth ones. The additional adversarial loss can improve the quality of the density map, which is critical to accurately estimate the crowd counts. The experiments were conducted on multiple datasets with different crowd scenes and densities. The results showed that the proposed method provided better performance compared to current state-of-the-art methods.
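As a rough illustration of why a density map yields a crowd count, the sketch below builds a map by placing one unit-mass Gaussian at each annotated head position and then sums the map. This is a common convention in density-based counting, not a description of MS-GAN itself; the abstract does not say how its ground-truth maps are built, and the function names, the `sigma` value, and the head coordinates are all hypothetical.

```python
import math


def gaussian_density_map(points, height, width, sigma=2.0):
    """Sum one unit-mass Gaussian per annotated head position (row, col).

    Assumes every point lies inside the image, so each kernel has nonzero mass.
    """
    density = [[0.0] * width for _ in range(height)]
    for pr, pc in points:
        # Unnormalized Gaussian evaluated at every pixel of the image.
        kernel = [
            [math.exp(-((r - pr) ** 2 + (c - pc) ** 2) / (2 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        total = sum(map(sum, kernel))
        # Normalize so each head contributes exactly 1 to the map's sum,
        # even when the kernel is clipped by the image border.
        for r in range(height):
            for c in range(width):
                density[r][c] += kernel[r][c] / total
    return density


def count_from_density(density):
    """The estimated crowd count is the sum over all pixels of the map."""
    return sum(map(sum, density))


# Hypothetical head annotations on a 16x16 image.
heads = [(2, 3), (10, 12), (15, 1)]
density = gaussian_density_map(heads, height=16, width=16)
print(round(count_from_density(density), 6))
```

Because each kernel is normalized separately, the map's sum matches the number of annotations regardless of where the heads fall, which is why a network that regresses such a map can be read out as a counter.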
7
Liu L, Zhang H, Liu J, Liu S, Chen W, Man J. Visual exploration of urban functional zones based on augmented nonnegative tensor factorization. J Vis (Tokyo) 2021. [DOI: 10.1007/s12650-020-00713-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]
8
Mo H, Ren W, Xiong Y, Pan X, Zhou Z, Cao X, Wu W. Background Noise Filtering and Distribution Dividing for Crowd Counting. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2020; PP:8199-8212. [PMID: 32759083 DOI: 10.1109/tip.2020.3009030] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Crowd counting is a challenging problem due to diverse crowd distributions and background interference. In this paper, we propose a new approach for head size estimation to reduce the impact of varying crowd scale and background noise. Rather than using only the local information of distances between human heads, we also consider the global information of the people distribution across the whole image. We propagate head size in far-to-near region order (small to large) and ensure that the propagation is uninterrupted by inserting dummy head points. The estimated head size is further exploited, for example to divide the crowd into parts of different densities and to generate a high-fidelity head mask. In addition, we design three different head mask usage mechanisms and the corresponding head masks to analyze where and which mask leads to better background filtering. Based on the learned masks, two competitive models are proposed that perform robust crowd estimation against background noise and diverse crowd scale. We evaluate the proposed method on three public crowd counting datasets: ShanghaiTech [2], UCF-QNRF [3], and UCF_CC_50 [4]. Experimental results demonstrate that the proposed algorithm performs favorably against the state-of-the-art crowd counting approaches.
9
Jiang X, Zhang L, Lv P, Guo Y, Zhu R, Li Y, Pang Y, Li X, Zhou B, Xu M. Learning Multi-Level Density Maps for Crowd Counting. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:2705-2715. [PMID: 31562106 DOI: 10.1109/tnnls.2019.2933920] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
People in crowd scenes often exhibit an imbalanced distribution. On the one hand, the apparent size of people varies widely due to the camera perspective. People far away from the camera look smaller and are likely to occlude each other, whereas people near the camera look larger and are relatively sparse. On the other hand, the number of people also varies greatly within the same scene or across different scenes. This article aims to develop a novel model that can accurately estimate the crowd count from a given scene with an imbalanced people distribution. To this end, we propose an effective multi-level convolutional neural network (MLCNN) architecture that first adaptively learns multi-level density maps and then fuses them to predict the final output. The density map of each level focuses on people of certain sizes. As a result, the fusion of multi-level density maps is able to tackle the large variation in people size. In addition, we introduce a new loss function named balanced loss (BL) to impose relatively balanced feedback during training, which helps further improve the performance of the proposed network. Furthermore, we introduce a new data set including 1111 images with a total of 49,061 head annotations. MLCNN is easy to train with only one end-to-end training stage. Experimental results demonstrate that our MLCNN achieves state-of-the-art performance. In particular, our MLCNN reaches a mean absolute error (MAE) of 242.4 on the UCF_CC_50 data set, which is 37.2 lower than the second-best result.
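For readers unfamiliar with the metric quoted above, MAE in crowd counting is the average, over test images, of the absolute difference between the predicted and the annotated count. The stated figures also imply a second-best MAE of about 242.4 + 37.2 = 279.6. The sketch below is a minimal illustration; the function name and the sample counts are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true per-image counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical per-image counts for three test images.
predicted_counts = [1200.0, 310.0, 95.0]
true_counts = [1000, 350, 100]
print(mean_absolute_error(predicted_counts, true_counts))
```

Because MAE is expressed in people per image, it grows with crowd size, so values in the hundreds are plausible for a data set of very dense scenes such as UCF_CC_50.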
10
Crowd Monitoring and Localization Using Deep Convolutional Neural Network: A Review. Applied Sciences-Basel 2020. [DOI: 10.3390/app10144781] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/16/2022]
Abstract
Crowd management and monitoring is crucial for maintaining public safety and is an important research topic. Developing a robust crowd monitoring system (CMS) is a challenging task, as it involves addressing many key issues such as density variation, irregular distribution of objects, occlusions, and pose estimation. Crowds gathering at places such as hospitals, parks, stadiums, airports, and cultural and religious sites are usually monitored by closed-circuit television (CCTV) cameras. The drawbacks of CCTV cameras are limited area coverage, installation problems, limited movability, high power consumption, and the need for constant monitoring by operators. Therefore, many researchers have turned towards computer vision and machine learning, which have overcome these issues by minimizing the need for human involvement. This review aims to categorize and analyze crowd monitoring methods, and to present the latest developments and performance evolution of the machine learning techniques published in journals and conferences over the past five years.
11
Wu X, Zheng Y, Ye H, Hu W, Ma T, Yang J, He L. Counting crowds with varying densities via adaptive scenario discovery framework. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.02.045] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
12
Multi-level feature fusion based Locality-Constrained Spatial Transformer network for video crowd counting. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.087] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 11/23/2022]
13
Ilyas N, Shahzad A, Kim K. Convolutional-Neural Network-Based Image Crowd Counting: Review, Categorization, Analysis, and Performance Evaluation. Sensors 2019; 20:s20010043. [PMID: 31861734 PMCID: PMC6983207 DOI: 10.3390/s20010043] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 11/23/2019] [Revised: 12/10/2019] [Accepted: 12/13/2019] [Indexed: 12/18/2022]
Abstract
Traditional handcrafted techniques for counting crowds in an image are currently being transformed, via machine-learning and artificial-intelligence techniques, into intelligent crowd-counting techniques. This paradigm shift offers many advanced features in terms of adaptive monitoring and the control of dynamic crowd gatherings. Adaptive monitoring, identification/recognition, and the management of diverse crowd gatherings can improve many crowd-management-related tasks in terms of efficiency, capacity, reliability, and safety. Despite many challenges, such as occlusion, clutter, irregular object distribution, and nonuniform object scale, convolutional neural networks are a promising technology for intelligent image crowd counting and analysis. In this article, we review, categorize, analyze (limitations and distinctive features), and provide a detailed performance evaluation of the latest convolutional-neural-network-based crowd-counting techniques. We also highlight the potential applications of convolutional-neural-network-based crowd-counting techniques. Finally, we conclude this article by presenting our key observations, providing a strong foundation for future research directions in designing convolutional-neural-network-based crowd-counting techniques. Further, the article discusses new advancements toward understanding crowd counting in smart cities using the Internet of Things (IoT).
Affiliation(s)
- Naveed Ilyas: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
- Ahsan Shahzad: Department of Computer and Software Engineering (DCSE), College of Electrical and Mechanical Engineering (EME), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
- Kiseon Kim: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
- Correspondence: