1. Alsubai S, Alqahtani A, Alanazi A, Sha M, Gumaei A. Facial emotion recognition using deep quantum and advanced transfer learning mechanism. Front Comput Neurosci 2024;18:1435956. PMID: 39539995; PMCID: PMC11557492; DOI: 10.3389/fncom.2024.1435956.
Abstract
Introduction: Facial expressions are a common means of interaction among humans, yet the emotions or expressions of individuals cannot be reliably comprehended or predicted through simple visual observation. In psychology, detecting facial expressions or analyzing emotions therefore demands systematic assessment and evaluation to identify the emotions of a person or group during communication. With the recent evolution of technology, Artificial Intelligence (AI) has gained significant traction, and Deep Learning (DL) based algorithms are employed for detecting facial expressions.
Methods: The study proposes a system that detects facial expressions by extracting relevant features with a Modified ResNet model. The proposed system stacks building blocks with residual connections and employs an advanced extraction method based on quantum computing, which significantly reduces computation time compared with conventional methods. The backbone stem uses a quantum convolutional layer composed of several parameterized quantum filters. Additionally, the residual connections of the ResNet-18 model are integrated with the Modified up Sampled Bottle Neck Process (MuS-BNP), retaining computational efficiency while benefiting from residual connections.
Results: The proposed model demonstrates superior performance by overcoming the problem of high similarity among varied facial expressions. The system's ability to accurately detect and differentiate between expressions is measured with performance metrics such as accuracy, F1-score, recall, and precision.
Discussion: The performance analysis confirms the efficacy of the proposed system, highlighting the advantages of quantum computing in feature extraction and of integrating residual connections. The model achieves quantum superiority, providing faster and more accurate computations than existing methodologies. The results suggest that the proposed approach offers a promising solution for facial expression recognition, significantly improving both speed and accuracy.
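The abstract above describes a quantum convolutional stem feeding a residual backbone. As a rough illustration of that idea only (not the authors' implementation), the sketch below assumes a PennyLane + PyTorch stack; the 2x2 patch size, four-qubit filter, entangler depth, and 7-class head are all illustrative assumptions rather than details from the paper.

```python
# A minimal sketch, assuming PennyLane + PyTorch: a "quanvolutional" stem of
# parameterized quantum filters feeding small ResNet-style residual blocks.
import torch
import torch.nn as nn
import torch.nn.functional as F
import pennylane as qml

N_WIRES = 4                                   # one qubit per pixel of a 2x2 patch (assumed)
dev = qml.device("default.qubit", wires=N_WIRES)

@qml.qnode(dev, interface="torch")
def quantum_filter(inputs, weights):
    # Encode the four patch pixels as rotation angles, entangle, and read out
    # one Pauli-Z expectation per wire -> four output channels per patch.
    qml.AngleEmbedding(inputs, wires=range(N_WIRES))
    qml.BasicEntanglerLayers(weights, wires=range(N_WIRES))
    return [qml.expval(qml.PauliZ(w)) for w in range(N_WIRES)]

class QuanvStem(nn.Module):
    """Applies the parameterized quantum filter to non-overlapping 2x2 patches."""
    def __init__(self, n_layers=2):
        super().__init__()
        self.qfilter = qml.qnn.TorchLayer(quantum_filter,
                                          {"weights": (n_layers, N_WIRES)})

    def forward(self, x):                               # x: (B, 1, H, W) grayscale faces
        B, _, H, W = x.shape
        patches = F.unfold(x, kernel_size=2, stride=2)  # (B, 4, L), L = (H/2)*(W/2)
        out = self.qfilter(patches.permute(0, 2, 1).reshape(-1, N_WIRES))
        return out.reshape(B, -1, N_WIRES).permute(0, 2, 1).reshape(B, N_WIRES, H // 2, W // 2)

class ResidualBlock(nn.Module):
    """ResNet-18-style basic block with an identity shortcut."""
    def __init__(self, ch):
        super().__init__()
        self.conv1, self.bn1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch)
        self.conv2, self.bn2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False), nn.BatchNorm2d(ch)

    def forward(self, x):
        out = self.bn2(self.conv2(F.relu(self.bn1(self.conv1(x)))))
        return F.relu(out + x)                          # residual connection

class QuantumFER(nn.Module):
    """Quantum stem -> residual blocks -> global average pool -> emotion logits."""
    def __init__(self, n_classes=7):
        super().__init__()
        self.stem = QuanvStem()
        self.body = nn.Sequential(ResidualBlock(N_WIRES), ResidualBlock(N_WIRES))
        self.head = nn.Linear(N_WIRES, n_classes)

    def forward(self, x):
        return self.head(self.body(self.stem(x)).mean(dim=(2, 3)))
```

Simulating the quantum filter on classical hardware is slow for realistic image sizes, so a sketch like this is usually run on small, downsampled inputs.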
Affiliation(s)
- Shtwai Alsubai: Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Abdullah Alqahtani: Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Abed Alanazi: Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Mohemmed Sha: Department of Software Engineering, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Abdu Gumaei: Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
2. Pereira R, Mendes C, Ribeiro J, Ribeiro R, Miragaia R, Rodrigues N, Costa N, Pereira A. Systematic review of emotion detection with computer vision and deep learning. Sensors (Basel) 2024;24:3484. PMID: 38894274; PMCID: PMC11175284; DOI: 10.3390/s24113484.
Abstract
Emotion recognition has become increasingly important in Deep Learning (DL) and computer vision because of its broad applicability to human-computer interaction (HCI) in areas such as psychology, healthcare, and entertainment. In this paper, we conduct a systematic review of facial and pose emotion recognition using DL and computer vision, analyzing and evaluating 77 papers from different sources under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The review covers the scope and purpose of the studies, the methods employed, and the datasets used. The studies were categorized according to a proposed taxonomy that describes the type of expressions used for emotion detection, the testing environment, the currently relevant DL methods, and the datasets used. The taxonomy of methods includes Convolutional Neural Networks (CNN), Faster Region-based Convolutional Neural Networks (Faster R-CNN), Vision Transformers (ViT), and "Other NNs", which are the most commonly used models in the analyzed studies, indicating their prevalence in the field. Hybrid and augmented models are not explicitly categorized within this taxonomy but remain important to the field. This review offers an overview of state-of-the-art computer vision algorithms and datasets for emotion recognition through facial expressions and body poses, allowing researchers to understand the field's fundamental components and trends.
Affiliation(s)
- Rafael Pereira: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Carla Mendes: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- José Ribeiro: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Roberto Ribeiro: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Rolando Miragaia: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Rodrigues: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Costa: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- António Pereira: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal; INOV INESC Inovação, Institute of New Technologies, Leiria Office, 2411-901 Leiria, Portugal
3. Ding J, Hou C, Zhao Y, Liu H, Hu Z, Meng F, Liang S. Virtual draw of microstructured optical fiber based on physics-informed neural networks. Opt Express 2024;32:9316-9331. PMID: 38571169; DOI: 10.1364/oe.518238.
Abstract
The implementation of microstructured optical fibers (MOFs) with novel microstructures and excellent performance is challenging due to the complexity of the fabrication process. Physics-informed neural networks (PINNs) offer what we believe to be a new approach to solving the complex partial differential equations involved in a virtual fabrication model of MOFs. This study, for what appears to be the first time, integrates the partial differential equations and boundary conditions describing the fiber drawing process into the loss function of a neural network. To solve the free boundaries at the fiber's inner and outer diameters more accurately, we additionally construct a neural network that describes the free-boundary conditions. The model not only captures the evolution of the fiber's inner and outer diameters but also provides the velocity and pressure distributions within the molten glass, laying the foundation for a quantitative analysis of capillary collapse. Furthermore, the results indicate that the trends in the effects of temperature, feed speed, and draw speed on the fiber drawing process align with actual fabrication conditions, validating the feasibility of the model. The methodology proposed in this study offers what we believe to be a novel approach to simulating the fiber drawing process and holds promise for advancing the practical applications of MOFs.
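As a rough sketch of the general technique described above (embedding draw equations and boundary conditions in the training loss), the snippet below uses a heavily simplified, isothermal quasi-1D draw model with a single area/velocity network; the equations, constants, boundary values, and network size are illustrative assumptions, not the authors' model, and the separate free-boundary network mentioned in the abstract is omitted.

```python
# Minimal PINN sketch in PyTorch for a simplified quasi-1D fiber draw:
#   mass:      d(A*v)/dz = 0
#   momentum:  d(3*mu*A*dv/dz)/dz = 0     (Trouton viscosity; inertia, gravity, surface tension ignored)
# mu, lengths, and speeds below are made-up values for illustration only.
import torch
import torch.nn as nn

MU, LEN = 1.0e4, 0.5                          # viscosity [Pa s], neck-down length [m] (assumed)
V_FEED, V_DRAW, A0 = 1e-4, 1e-1, 1e-4         # feed speed, draw speed, preform area (assumed)

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 2))         # outputs mapped to [A(z), v(z)]

def predict(z):
    out = torch.nn.functional.softplus(net(z))     # softplus keeps area and velocity positive
    return out[:, :1], out[:, 1:]

def grad(y, x):
    return torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    z = torch.rand(256, 1, requires_grad=True) * LEN    # collocation points in (0, LEN)
    A, v = predict(z)
    mass_res = grad(A * v, z)                            # d(Av)/dz should vanish
    mom_res = grad(3.0 * MU * A * grad(v, z), z)         # d(3*mu*A*dv/dz)/dz should vanish
    # Boundary conditions: preform area and feed speed at z=0, draw speed at z=LEN.
    A_in, v_in = predict(torch.zeros(1, 1))
    _, v_out = predict(torch.full((1, 1), LEN))
    loss = (mass_res.pow(2).mean() + mom_res.pow(2).mean()
            + (A_in - A0).pow(2).mean() + (v_in - V_FEED).pow(2).mean()
            + (v_out - V_DRAW).pow(2).mean())
    opt.zero_grad(); loss.backward(); opt.step()
```

The PDE residuals and the boundary penalties are summed into one loss, which is the essential move the abstract describes; the paper's full model additionally treats the free inner/outer boundaries with a dedicated network.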
4. Cross MP, Acevedo AM, Hunter JF. A critique of automated approaches to code facial expressions: what do researchers need to know? Affect Sci 2023;4:500-505. PMID: 37744972; PMCID: PMC10514002; DOI: 10.1007/s42761-023-00195-0.
Abstract
Facial expression recognition software is increasingly used by affective scientists to measure facial expressions. Although this software has exciting implications, there are persistent and concerning issues regarding its validity and reliability. In this paper, we highlight three of these issues: biases of the programs against certain skin colors and genders; the common inability of these programs to capture facial expressions made under non-idealized conditions (e.g., "in the wild"); and the fact that each program adopts the underlying assumptions of the specific theory of emotion on which it is based. We then discuss three directions for the future of affective science in the area of automated facial coding. First, researchers need to be cognizant of exactly how, and on which datasets, the machine learning algorithms underlying these programs are trained. Second, there are several ethical considerations, such as privacy and data storage, surrounding the use of facial expression recognition programs. Finally, researchers should consider collecting additional emotion data, such as body language, and combining these data with facial expression data to achieve a more comprehensive picture of complex human emotions. Facial expression recognition programs are an excellent method of collecting facial expression data, but affective scientists should ensure that they recognize the limitations and ethical implications of these programs.
Affiliation(s)
- Marie P. Cross: Department of Biobehavioral Health, Pennsylvania State University, University Park, PA, USA
- Amanda M. Acevedo: Basic Biobehavioral and Psychological Sciences Branch, National Cancer Institute, Rockville, MD, USA
- John F. Hunter: Department of Psychology, Chapman University, Orange, CA, USA
5. Pushpalatha MN, Meherishi H, Vaishnav A, Anurag Pillai R, Gupta A. Facial emotion recognition and encoding application for the visually impaired. Neural Comput Appl 2022. DOI: 10.1007/s00521-022-07807-z.
6. Saurav S, Saini R, Singh S. Fast facial expression recognition using Boosted Histogram of Oriented Gradient (BHOG) features. Pattern Anal Appl 2022. DOI: 10.1007/s10044-022-01112-0.
7. A cascaded spatiotemporal attention network for dynamic facial expression recognition. Appl Intell 2022. DOI: 10.1007/s10489-022-03781-0.
8. Mehta NK, Prasad SS, Saurav S, Saini R, Singh S. Three-dimensional DenseNet self-attention neural network for automatic detection of student's engagement. Appl Intell 2022;52:13803-13823. PMID: 35340984; PMCID: PMC8932470; DOI: 10.1007/s10489-022-03200-4.
Abstract
Today, due to the widespread outbreak of the deadly coronavirus, popularly known as COVID-19, traditional classroom education has shifted to computer-based learning. Students of various cognitive and psychological abilities participate in the learning process, but most are hesitant to provide regular and honest feedback on the comprehensiveness of a course, making it difficult for the instructor to ensure that all students are grasping the material at the same rate. Students' understanding of a course and their emotional engagement, as indicated by facial expressions, are intertwined. This paper presents a three-dimensional DenseNet self-attention neural network (DenseAttNet) used to identify and evaluate student participation in modern and traditional educational programs. On the Dataset for Affective States in E-Environments (DAiSEE), the proposed DenseAttNet model outperformed all other existing methods, achieving baseline accuracies of 63.59% for engagement classification and 54.27% for boredom classification. In addition, DenseAttNet trained on all four labels, namely boredom, engagement, confusion, and frustration, registered accuracies of 81.17%, 94.85%, 90.96%, and 95.85%, respectively. We also performed a regression experiment on DAiSEE and obtained the lowest Mean Square Error (MSE) of 0.0347. Finally, the proposed approach achieves a competitive MSE of 0.0877 when validated on the Emotion Recognition in the Wild Engagement Prediction (EmotiW-EP) dataset.
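As a rough, illustrative sketch of the kind of architecture described above (spatiotemporal features re-weighted by self-attention before classification), the snippet below uses a small generic 3D CNN rather than a true 3D DenseNet; the layer sizes, clip length, and the use of nn.MultiheadAttention are assumptions, not details from the paper.

```python
# Minimal sketch: 3D-CNN clip features -> temporal self-attention -> engagement logits.
# NOT the DenseAttNet implementation; it only illustrates the overall pattern.
import torch
import torch.nn as nn

class EngagementNet(nn.Module):
    def __init__(self, n_levels=4):             # DAiSEE rates each affective state on 4 levels
        super().__init__()
        self.features = nn.Sequential(           # tiny stand-in for a 3D DenseNet backbone
            nn.Conv3d(3, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.BatchNorm3d(32), nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.BatchNorm3d(64), nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # keep the temporal axis, pool space away
        )
        self.attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
        self.head = nn.Linear(64, n_levels)      # one such head per affective state in practice

    def forward(self, clip):                     # clip: (B, 3, T, H, W)
        f = self.features(clip).squeeze(-1).squeeze(-1)   # (B, 64, T)
        tokens = f.permute(0, 2, 1)                       # (B, T, 64) temporal tokens
        attended, _ = self.attn(tokens, tokens, tokens)   # self-attention over frames
        return self.head(attended.mean(dim=1))            # pooled logits per level

# Example: score a batch of two 16-frame 112x112 clips.
logits = EngagementNet()(torch.randn(2, 3, 16, 112, 112))
print(logits.shape)                              # torch.Size([2, 4])
```

Replacing the final classification head with a single regression output and an MSE loss gives the regression variant the abstract also reports.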
Affiliation(s)
- Naval Kishore Mehta: Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India; CSIR-Central Electronics Engineering Research Institute (CSIR-CEERI), Pilani, India
- Shyam Sunder Prasad: Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India; CSIR-Central Electronics Engineering Research Institute (CSIR-CEERI), Pilani, India
- Sumeet Saurav: Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India; CSIR-Central Electronics Engineering Research Institute (CSIR-CEERI), Pilani, India
- Ravi Saini: Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India; CSIR-Central Electronics Engineering Research Institute (CSIR-CEERI), Pilani, India
- Sanjay Singh: Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India; CSIR-Central Electronics Engineering Research Institute (CSIR-CEERI), Pilani, India
9. Wu P, Pan K, Ji L, Gong S, Feng W, Yuan W, Pain C. Navier-Stokes Generative Adversarial Network: a physics-informed deep learning model for fluid flow generation. Neural Comput Appl 2022. DOI: 10.1007/s00521-022-07042-6.