1
Pereira R, Mendes C, Ribeiro J, Ribeiro R, Miragaia R, Rodrigues N, Costa N, Pereira A. Systematic Review of Emotion Detection with Computer Vision and Deep Learning. Sensors (Basel, Switzerland) 2024; 24:3484. [PMID: 38894274 PMCID: PMC11175284 DOI: 10.3390/s24113484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 05/20/2024] [Accepted: 05/24/2024] [Indexed: 06/21/2024]
Abstract
Emotion recognition has become increasingly important in Deep Learning (DL) and computer vision because of its broad applicability to human-computer interaction (HCI) in areas such as psychology, healthcare, and entertainment. In this paper, we conduct a systematic review of facial and pose emotion recognition using DL and computer vision, analyzing and evaluating 77 papers from different sources under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Our review covers several topics, including the scope and purpose of the studies, the methods employed, and the datasets used. The studies were categorized according to a proposed taxonomy that describes the type of expressions used for emotion detection, the testing environment, the currently relevant DL methods, and the datasets used. The taxonomy of methods in our review includes Convolutional Neural Network (CNN), Faster Region-based Convolutional Neural Network (Faster R-CNN), Vision Transformer (ViT), and "Other NNs", which are the most commonly used models in the analyzed studies, indicating their prevalence in the field. Hybrid and augmented models are not explicitly categorized within this taxonomy, but they are still important to the field. This review offers an understanding of state-of-the-art computer vision algorithms and datasets for emotion recognition through facial expressions and body poses, allowing researchers to understand the field's fundamental components and trends.
Affiliation(s)
- Rafael Pereira
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Carla Mendes
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- José Ribeiro
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Roberto Ribeiro
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Rolando Miragaia
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Rodrigues
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Costa
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- António Pereira
  - Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
  - INOV INESC Inovação, Institute of New Technologies, Leiria Office, 2411-901 Leiria, Portugal
2
Cîrneanu AL, Popescu D, Iordache D. New Trends in Emotion Recognition Using Image Analysis by Neural Networks, A Systematic Review. Sensors (Basel, Switzerland) 2023; 23:7092. [PMID: 37631629 PMCID: PMC10458371 DOI: 10.3390/s23167092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 07/29/2023] [Accepted: 08/02/2023] [Indexed: 08/27/2023]
Abstract
Facial emotion recognition (FER) is a computer vision process aimed at detecting and classifying human emotional expressions. FER systems are currently used in a wide range of applications in areas such as education, healthcare, and public safety; detection and recognition accuracy is therefore very important. Like any computer vision task based on image analysis, FER solutions are suitable for integration with artificial intelligence solutions built on different kinds of neural networks, especially deep neural networks, which have shown great potential in recent years due to their feature extraction capabilities and computational efficiency over large datasets. In this context, this paper reviews the latest developments in the FER area, with a focus on recent neural network models that implement specific facial image analysis algorithms to detect and recognize facial emotions. The paper presents, from historical and conceptual perspectives, the evolution of the neural network architectures that have produced significant results in the FER area. It endorses convolutional neural network (CNN)-based architectures over other neural network architectures, such as recurrent neural networks or generative adversarial networks, highlighting the key elements and performance of each architecture, and the advantages and limitations of the models proposed in the analyzed papers. Additionally, this paper presents the available datasets that are currently used for emotion recognition from facial expressions and micro-expressions. The use of FER systems is also highlighted in various domains such as healthcare, education, security, and social IoT. Finally, open issues and possible future developments in the FER area are identified.
Affiliation(s)
- Andrada-Livia Cîrneanu
  - Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
- Dan Popescu
  - Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
- Dragoș Iordache
  - The National Institute for Research & Development in Informatics-ICI Bucharest, 011455 Bucharest, Romania
3
Shah SM, Sun Z, Zaman K, Hussain A, Ullah I, Ghadi YY, Khan MA, Nasimov R. Advancements in Neighboring-Based Energy-Efficient Routing Protocol (NBEER) for Underwater Wireless Sensor Networks. Sensors (Basel, Switzerland) 2023; 23:6025. [PMID: 37447872 DOI: 10.3390/s23136025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 06/27/2023] [Accepted: 06/28/2023] [Indexed: 07/15/2023]
Abstract
Underwater wireless sensor networks (UWSNs) have gained prominence in wireless sensor technology, featuring resource-limited sensor nodes deployed in challenging underwater environments. To address challenges such as power consumption, network lifetime, node deployment, topology, and propagation delays, cooperative transmission protocols such as cooperative UWSN (Co-UWSN) and cooperative energy-efficient routing (CEER) have been proposed. These protocols use broadcast capabilities and neighbor head node selection (NHNS) for cooperative routing. This research introduces NBEER, a novel neighbor-based energy-efficient routing protocol tailored for UWSNs. NBEER aims to overcome the limitations of Co-UWSN and CEER by optimizing NHNS and the cooperative mechanisms to achieve load balancing and enhance network performance. Through comprehensive MATLAB simulations, we evaluated NBEER against Co-UWSN and CEER, demonstrating its superior performance across various metrics. Compared with the existing protocols, NBEER significantly reduces end-to-end delay and energy consumption, improves the packet delivery ratio, extends network lifetime, and increases the total number of received packets.
Affiliation(s)
- Zhaoyun Sun
  - Information Engineering School, Chang'an University, Xi'an 710061, China
- Khalid Zaman
  - Information Engineering School, Chang'an University, Xi'an 710061, China
- Altaf Hussain
  - School of Computer Science & Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Inam Ullah
  - Department of Computer Engineering, Gachon University, Seongnam 13120, Sujeong-gu, Republic of Korea
- Yazeed Yasin Ghadi
  - Department of Computer Science, Al Ain University, Abu Dhabi P.O. Box 112612, United Arab Emirates
- Muhammad Abbas Khan
  - Department of Electrical Engineering, Balochistan University of Information Technology, Engineering and Management Sciences, Quetta 87300, Pakistan
- Rashid Nasimov
  - Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
4
Deep Transfer Learning Enabled Intelligent Object Detection for Crowd Density Analysis on Video Surveillance Systems. Applied Sciences-Basel 2022. [DOI: 10.3390/app12136665] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/04/2022]
Abstract
Object detection is a computer vision technique used to detect instances of semantic objects of a particular class in digital images and videos. Crowd density analysis is one of its most common applications. Because crowd density classification techniques face challenges such as non-uniform density, occlusion, and inter-scene and intra-scene deviations, convolutional neural network (CNN) models are useful. This paper presents a Metaheuristics with Deep Transfer Learning Enabled Intelligent Crowd Density Detection and Classification (MDTL-ICDDC) model for video surveillance systems. The proposed MDTL-ICDDC technique concentrates on the effective identification and classification of crowd density in video surveillance systems. To achieve this, the MDTL-ICDDC model uses the NASNetLarge model as a feature extractor, with its hyperparameters tuned by the Salp Swarm Algorithm (SSA). A weighted extreme learning machine (WELM) is then used for the crowd density detection and classification process. Finally, the krill swarm algorithm (KSA) is applied to optimize the WELM parameters and thereby improve the classification results. The MDTL-ICDDC approach was validated experimentally on a benchmark dataset, and the outcomes were examined under several aspects. The experimental values indicate that the MDTL-ICDDC system achieves better performance than other models such as Gabor, BoW-SRP, BoW-LBP, GLCM-SVM, GoogleNet, and VGGNet.