1. Zhang D, Zhang T, Sun H, Tang Y, Liu Q. MCCA-VNet: A Vit-Based Deep Learning Approach for Micro-Expression Recognition Based on Facial Coding. Sensors (Basel). 2024;24:7549. doi: 10.3390/s24237549. PMID: 39686086.
Abstract
In terms of facial expressions, micro-expressions are more genuine than macro-expressions and provide more valuable information, which can be widely used in psychological counseling and clinical diagnosis. In the past few years, deep learning methods based on optical flow and the Transformer have achieved excellent results in this field, but most current algorithms concentrate on establishing a serialized token sequence through the self-attention model and do not take into account the spatial relationships between facial landmarks. To address the locality and subtle changes of micro-expressions themselves, we propose the Transformer-based deep learning model MCCA-VNet. We extract the changing features as the input of the model and fuse channel attention and spatial attention into the Vision Transformer to capture correlations between features in different dimensions, which enhances the accuracy of micro-expression identification. To verify the effectiveness of the algorithm, we conduct experiments on the SAMM, CASME II, and SMIC datasets and compare the results with previous state-of-the-art algorithms. Our algorithm achieves UF1 and UAR scores of 0.8676 and 0.8622, respectively, on the composite dataset, outperforming the other algorithms on multiple metrics and achieving the best overall performance.
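The abstract gives no code; as a rough illustration of the general idea of fusing channel and spatial attention into a ViT token stream, a minimal PyTorch sketch follows. The module name, dimensions, and reduction ratio are invented for illustration and this is not the authors' MCCA-VNet.

```python
# Hypothetical sketch of channel + spatial attention over ViT patch tokens.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, dim: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze token positions, re-weight feature channels.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(),
            nn.Linear(dim // reduction, dim), nn.Sigmoid())
        # Spatial attention: one scalar weight per token (patch) position.
        self.spatial_proj = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim), e.g. ViT patch embeddings
        ch_w = self.channel_mlp(tokens.mean(dim=1))   # (batch, dim)
        tokens = tokens * ch_w.unsqueeze(1)           # channel re-weighting
        sp_w = self.spatial_proj(tokens)              # (batch, num_tokens, 1)
        return tokens * sp_w                          # spatial re-weighting

x = torch.randn(2, 196, 768)                  # dummy batch of patch tokens
print(ChannelSpatialAttention(768)(x).shape)  # torch.Size([2, 196, 768])
```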
Affiliation(s)
- Dehao Zhang: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Tao Zhang: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Haijiang Sun: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Yanhui Tang: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Qiaoyuan Liu: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
2. Mnif M, Chikh S, Watelain E, Jarraya M. Sex of an Observer Effects on Adults' Motor, Cognitive, and Affective Dart-Shooting Performance. Percept Mot Skills. 2024;131:1788-1813. doi: 10.1177/00315125241272509. PMID: 39129218.
Abstract
Men and women are characterized by specific physiological, cerebral, and emotional characteristics, as well as by the differing nature of their gestures and behaviors. Here, we examined the effects of an observer's sex on motor, cognitive, and affective behaviors during dart-shooting. We compared men's and women's kinematic and affective parameters when performing alone or in the presence of an opposite-sex observer. We found a sex effect on motor and cognitive performance in interaction with participants' emotional states. We observed better accuracy and reaction time in men compared to women, which we attributed to (a) differences in emotional sensitivity between the two sexes and (b) men's superiority on precision tasks linked to their higher proportion of cerebral white matter. Our findings also suggested a sex difference in the social effect of an observer's sex on motor and cognitive performance. Although there was no effect on affective aspects of performance, emotional state seemed to interact strongly with this social effect.
Affiliation(s)
- Maha Mnif: Research Unit: Education, Motor Skills, Sport and Health (EM2S), UR15JS01, High Institute of Sport and Physical Education, University of Sfax, Sfax, Tunisia
- Soufien Chikh: Research Unit: Education, Motor Skills, Sport and Health (EM2S), UR15JS01, High Institute of Sport and Physical Education, University of Sfax, Sfax, Tunisia
- Eric Watelain: University of Toulon, Laboratory UR J-AP2S 201723207F, Toulon, France
- Mohamed Jarraya: Research Unit: Education, Motor Skills, Sport and Health (EM2S), UR15JS01, High Institute of Sport and Physical Education, University of Sfax, Sfax, Tunisia
3. Wu F, Xia Y, Hu T, Ma B, Yang J, Li H. Facial micro-expression recognition based on motion magnification network and graph attention mechanism. Heliyon. 2024;10:e35964. doi: 10.1016/j.heliyon.2024.e35964. PMID: 39224303; PMCID: PMC11367415.
Abstract
Micro-expressions are extensively studied due to their ability to fully reflect individuals' genuine emotions. However, accurate micro-expression recognition is a challenging task because of the subtle motion of facial muscles. This paper therefore introduces a Graph Attention Mechanism-based Motion Magnification Guided Micro-Expression Recognition Network (GAM-MM-MER) to amplify delicate muscle motions and focus on key facial landmarks. First, we propose a Swin Transformer-based network for micro-expression motion magnification (ST-MEMM) to enhance the subtle motions in micro-expression videos, thereby unveiling imperceptible facial muscle movements. Then, we propose a graph attention mechanism-based network for micro-expression recognition (GAM-MER), which optimizes facial key-area maps and prioritizes the adjacent nodes crucial for mitigating the influence of noisy neighbors, while attending to key feature information. Finally, experimental evaluations on the CASME II and SAMM datasets demonstrate the high accuracy and effectiveness of the proposed network compared to state-of-the-art approaches; its results are significantly superior to those of existing methods. Furthermore, ablation studies provide compelling evidence of the robustness of the proposed network, substantiating its efficacy in micro-expression recognition.
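To make the graph-attention idea concrete, here is a minimal single-head attention layer over facial-landmark nodes in the spirit of GAT; the layer, the 68-landmark adjacency, and all dimensions are illustrative assumptions, not the paper's GAM-MER network.

```python
# Minimal single-head graph attention over landmark nodes (GAT-style sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LandmarkGATLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) landmark features; adj: (N, N) 0/1 adjacency
        h = self.proj(x)                                   # (N, out_dim)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))     # (N, N) raw scores
        e = e.masked_fill(adj == 0, float('-inf'))         # keep graph edges only
        alpha = torch.softmax(e, dim=-1)                   # per-neighbor weights
        return F.elu(alpha @ h)                            # aggregated features

adj = torch.eye(68)                        # 68 landmarks, self-loops only here
out = LandmarkGATLayer(2, 16)(torch.randn(68, 2), adj)
print(out.shape)                           # torch.Size([68, 16])
```

Attending only over graph edges is what lets such a layer down-weight noisy neighbors, the property the paper exploits.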
Affiliation(s)
- Falin Wu, Yu Xia, Tiangyang Hu, Boyi Ma, Jingyao Yang, Haoxin Li: SNARS Laboratory, School of Instrumentation and Optoelectronic Engineering, Beihang University, No. 37, XueYuan Road, HaiDian District, Beijing 100191, China
4. Yang P, Liu N, Liu X, Shu Y, Ji W, Ren Z, Sheng J, Yu M, Yi R, Zhang D, Liu YJ. A Multimodal Dataset for Mixed Emotion Recognition. Sci Data. 2024;11:847. doi: 10.1038/s41597-024-03676-4. PMID: 39103399.
Abstract
Mixed emotions have attracted increasing interest recently, but existing datasets rarely focus on mixed emotion recognition from multimodal signals, hindering affective computing research on mixed emotions. We therefore present a multimodal dataset with four kinds of signals recorded while participants watched mixed and non-mixed emotion videos. To ensure effective emotion induction, we first implemented a rule-based video filtering step to select the videos that could elicit stronger positive, negative, and mixed emotions. Then, an experiment with 80 participants was conducted, in which EEG, GSR, PPG, and frontal face videos were recorded while they watched the selected video clips. We also recorded subjective emotional ratings on the PANAS, VAD, and amusement-disgust dimensions. In total, the dataset consists of multimodal signal data and self-assessment data from 73 participants. We also present technical validations for emotion induction and mixed emotion classification from physiological signals and face videos. The average accuracy of the 3-class classification (i.e., positive, negative, and mixed) reaches 80.96% when using an SVM and features from all modalities, which indicates the feasibility of identifying mixed emotional states.
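As a hedged sketch of the reported 3-class baseline (positive/negative/mixed) with an SVM over fused multimodal features, the snippet below uses synthetic stand-in data; the feature dimensionality, sample count, and SVM hyperparameters are placeholders, not the paper's pipeline.

```python
# SVM over concatenated multimodal features; synthetic data stands in for
# the EEG/GSR/PPG/face features, which are not reproduced here.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(219, 64))        # placeholder fused feature vectors
y = rng.integers(0, 3, size=219)      # 0=positive, 1=negative, 2=mixed

clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean 5-fold accuracy: {scores.mean():.3f}")   # ~chance on random data
```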
Affiliation(s)
- Pei Yang, Niqi Liu, Xinge Liu, Yezhi Shu, Wenqi Ji, Ziqi Ren, Jenny Sheng: Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
- Minjing Yu: College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Ran Yi: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Dan Zhang: Department of Psychology, Tsinghua University, Beijing 100084, China
- Yong-Jin Liu: Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
5. Crawford MT, Maymon C, Miles NL, Blackburne K, Tooley M, Grimshaw GM. Emotion in motion: perceiving fear in the behaviour of individuals from minimal motion capture displays. Cogn Emot. 2024;38:451-462. doi: 10.1080/02699931.2023.2300748. PMID: 38354068.
Abstract
The ability to quickly and accurately recognise emotional states is adaptive for numerous social functions. Although body movements are a potentially crucial cue for inferring emotions, few studies have examined the perception of body movements made in naturalistic emotional states. The current research focuses on the use of body movement information in the perception of fear expressed by targets in a virtual heights paradigm. Across three studies, participants made judgments about the emotional states of others based on motion-capture body movement recordings of individuals actively engaged in walking a virtual plank, either at ground level or 80 stories above a city street. Results indicated that participants were reliably able to differentiate between height and non-height conditions (Studies 1 & 2), were more likely to spontaneously describe target behaviour in the height condition as fearful (Study 2), and gave fear estimates that were highly calibrated with the fear ratings from the targets (Studies 1-3). The findings show that VR height scenarios can induce fearful behaviour and that people can perceive fear in minimal representations of body movement.
Affiliation(s)
- Matthew T Crawford, Christopher Maymon, Nicola L Miles, Katie Blackburne, Michael Tooley, Gina M Grimshaw: School of Psychology, Victoria University of Wellington, Wellington, New Zealand
6. Tian H, Gong W, Li W, Qian Y. PASTFNet: a paralleled attention spatio-temporal fusion network for micro-expression recognition. Med Biol Eng Comput. 2024;62:1911-1924. doi: 10.1007/s11517-024-03041-y. PMID: 38413518.
Abstract
Micro-expressions (MEs) play an important role in revealing a person's genuine emotions, which has made micro-expression recognition a major research focus in recent years. Most recent work attempts to recognize MEs using the spatial and temporal information of video clips. However, because of their short duration and subtle intensity, capturing the spatio-temporal features of micro-expressions remains challenging. To improve recognition performance, this paper presents a novel paralleled dual-branch attention-based spatio-temporal fusion network (PASTFNet). We jointly extract short- and long-range spatial relationships in the spatial branch. Inspired by the composite convolutional neural network (CNN) and long short-term memory (LSTM) architecture for temporal modeling, we propose a novel attention-based multi-scale feature fusion network (AMFNet) to encode the features of sequential frames; it learns more expressive facial detail features by integrating attention with multi-scale feature fusion, and an aggregation block is then designed to aggregate and acquire temporal features. Finally, the features learned by the two branches are fused to accomplish expression recognition. Experiments on two MER datasets (CASME II and SAMM) show that the PASTFNet model achieves promising ME recognition performance compared with other methods.
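The CNN + LSTM composite that inspired AMFNet can be sketched generically: a per-frame convolutional encoder followed by an LSTM temporal summary. Everything below (layer sizes, class count) is an illustrative assumption, not the PASTFNet implementation.

```python
# Generic per-frame CNN encoder + LSTM temporal model for a video clip.
import torch
import torch.nn as nn

class CnnLstmClassifier(nn.Module):
    def __init__(self, num_classes: int = 5, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(                # per-frame spatial encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, channels, height, width)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)  # per-frame features
        _, (h_n, _) = self.lstm(feats)                       # temporal summary
        return self.head(h_n[-1])                            # class logits

logits = CnnLstmClassifier()(torch.randn(2, 16, 3, 112, 112))
print(logits.shape)                          # torch.Size([2, 5])
```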
Affiliation(s)
- Haichen Tian, Weijun Gong: School of Information Science and Engineering, Xinjiang University, Urumqi, China
- Wei Li: School of Software, Xinjiang University, Urumqi, China
- Yurong Qian: School of Information Science and Engineering, Xinjiang University, Urumqi, China; School of Software, Xinjiang University, Urumqi, China; Key Laboratory of Signal Detection and Processing in Xinjiang Uygur Autonomous Region, Urumqi, China
7. Pereira R, Mendes C, Ribeiro J, Ribeiro R, Miragaia R, Rodrigues N, Costa N, Pereira A. Systematic Review of Emotion Detection with Computer Vision and Deep Learning. Sensors (Basel). 2024;24:3484. doi: 10.3390/s24113484. PMID: 38894274; PMCID: PMC11175284.
Abstract
Emotion recognition has become increasingly important in the fields of Deep Learning (DL) and computer vision due to its broad applicability to human-computer interaction (HCI) in areas such as psychology, healthcare, and entertainment. In this paper, we conduct a systematic review of facial and pose emotion recognition using DL and computer vision, analyzing and evaluating 77 papers from different sources under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Our review covers several topics, including the scope and purpose of the studies, the methods employed, and the datasets used. The studies were categorized based on a proposed taxonomy that describes the type of expressions used for emotion detection, the testing environment, the currently relevant DL methods, and the datasets used. The taxonomy of methods in our review includes Convolutional Neural Network (CNN), Faster Region-based Convolutional Neural Network (R-CNN), Vision Transformer (ViT), and "Other NNs", the models most commonly used in the analyzed studies, indicating their prevalence in the field. Hybrid and augmented models are not explicitly categorized within this taxonomy, but they remain important to the field. This review offers an understanding of state-of-the-art computer vision algorithms and datasets for emotion recognition through facial expressions and body poses, allowing researchers to understand the field's fundamental components and trends.
Affiliation(s)
- Rafael Pereira, Carla Mendes, José Ribeiro, Roberto Ribeiro, Rolando Miragaia, Nuno Rodrigues, Nuno Costa: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- António Pereira: Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal; INOV INESC Inovação, Institute of New Technologies, Leiria Office, 2411-901 Leiria, Portugal
8. Zhang Q, Zhang Y, Liu N, Sun X. Understanding of facial features in face perception: insights from deep convolutional neural networks. Front Comput Neurosci. 2024;18:1209082. doi: 10.3389/fncom.2024.1209082. PMID: 38655070; PMCID: PMC11035738.
Abstract
Introduction: Face recognition has been a longstanding subject of interest in the fields of cognitive neuroscience and computer vision research. One key focus has been to understand the relative importance of different facial features in identifying individuals. Previous studies in humans have demonstrated the crucial role of eyebrows in face recognition, potentially even surpassing the importance of the eyes. However, eyebrows are not only vital for face recognition but also play a significant role in recognizing facial expressions and intentions, which might occur simultaneously and influence the face recognition process.
Methods: To address these challenges, our current study aimed to leverage the power of deep convolutional neural networks (DCNNs), an artificial face recognition system, which can be specifically tailored for face recognition tasks. In this study, we investigated the relative importance of various facial features in face recognition by selectively blocking feature information from the input to the DCNN. Additionally, we conducted experiments in which we systematically blurred the information related to eyebrows to varying degrees.
Results: Our findings aligned with previous human research, revealing that eyebrows are the most critical feature for face recognition, followed by eyes, mouth, and nose, in that order. The results demonstrated that the presence of eyebrows was more crucial than their specific high-frequency details, such as edges and textures, compared to other facial features, where the details also played a significant role. Furthermore, our results revealed that, unlike other facial features, the activation map indicated that the significance of eyebrow areas could not be readily adjusted to compensate for the absence of eyebrow information. This finding explains why masking eyebrows led to more significant deficits in face recognition performance. Additionally, we observed a synergistic relationship among facial features, providing evidence for holistic processing of faces within the DCNN.
Discussion: Overall, our study sheds light on the underlying mechanisms of face recognition and underscores the potential of using DCNNs as valuable tools for further exploration in this field.
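The feature-blocking methodology can be illustrated with a generic occlusion probe: mask a facial region in the input and measure the drop in the model's output confidence. The tiny network and row-band regions below are hypothetical placeholders, not the paper's DCNN or landmark definitions.

```python
# Occlusion-style probe of facial-feature importance (illustrative only).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, 2, 1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model.eval()

face = torch.randn(1, 3, 112, 112)                # stand-in face image
regions = {"eyebrows": (28, 40), "eyes": (44, 58),
           "nose": (60, 78), "mouth": (80, 96)}   # hypothetical row bands

with torch.no_grad():
    base = model(face).softmax(-1).max()          # baseline top-class confidence
    for name, (top, bottom) in regions.items():
        occluded = face.clone()
        occluded[:, :, top:bottom, :] = 0         # block out the feature band
        drop = base - model(occluded).softmax(-1).max()
        print(f"{name}: confidence drop {drop.item():+.4f}")
```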
Affiliation(s)
- Qianqian Zhang, Yueyi Zhang: MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Ning Liu: Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China; State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Xiaoyan Sun: MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
9. Ahmad A, Li Z, Iqbal S, Aurangzeb M, Tariq I, Flah A, Blazek V, Prokop L. A comprehensive bibliometric survey of micro-expression recognition system based on deep learning. Heliyon. 2024;10:e27392. doi: 10.1016/j.heliyon.2024.e27392. PMID: 38495163; PMCID: PMC10943397.
Abstract
Micro-expressions (MEs) are rapidly occurring expressions that reveal the true emotions a human being is trying to hide, cover, or suppress. These expressions, which reveal a person's actual feelings, have a broad spectrum of applications in public safety and clinical diagnosis. This study provides a comprehensive review of the area of ME recognition. Bibliometric and network analysis techniques are used to compile all the available literature related to ME recognition. A total of 735 publications from the Web of Science (WOS) and Scopus databases, retrieved with all relevant keywords, were evaluated for the period December 2012 to December 2022. The first round of data screening produced basic information, which was further extracted for citation, coupling, co-authorship, co-occurrence, bibliographic, and co-citation analysis. Additionally, a thematic and descriptive analysis was executed to investigate the content of prior research findings and the research techniques used in the literature. Year-wise publication counts show that output between 2012 and 2017 was relatively low; by 2021, however, a nearly 24-fold increase had brought it to 154 publications. The three most productive journals and conferences were IEEE Transactions on Affective Computing (n = 20 publications), followed by Neurocomputing (n = 17) and Multimedia Tools and Applications (n = 15). Zhao G was the most prolific author with 48 publications, and the most influential country was China (620 publications). Citation counts showed that the top authors acquired between 100 and 1225 citations each, while publications by organization indicated that the University of Oulu had the most published papers (n = 51). Deep learning, facial expression recognition, and emotion recognition were among the most frequently used terms. ME research was found to fall primarily within the discipline of engineering, with comparatively more contributions from China and Malaysia.
Affiliation(s)
- Adnan Ahmad, Zhao Li: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, School of Information Science and Engineering, Southeast University, Nanjing 210096, China
- Sheeraz Iqbal: Department of Electrical Engineering, University of Azad Jammu and Kashmir, Muzaffarabad 13100, AJK, Pakistan
- Muhammad Aurangzeb: School of Electrical Engineering, Southeast University, Nanjing 210096, China
- Irfan Tariq: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, School of Information Science and Engineering, Southeast University, Nanjing 210096, China
- Ayman Flah: College of Engineering, University of Business and Technology (UBT), Jeddah 21448, Saudi Arabia; MEU Research Unit, Middle East University, Amman, Jordan; The Private Higher School of Applied Sciences and Technology of Gabes, University of Gabes, Gabes, Tunisia; National Engineering School of Gabes, University of Gabes, Gabes 6029, Tunisia
- Vojtech Blazek, Lukas Prokop: ENET Centre, VSB—Technical University of Ostrava, Ostrava, Czech Republic
10. Wang Z, Yang M, Jiao Q, Xu L, Han B, Li Y, Tan X. Two-Level Spatio-Temporal Feature Fused Two-Stream Network for Micro-Expression Recognition. Sensors (Basel). 2024;24:1574. doi: 10.3390/s24051574. PMID: 38475109.
Abstract
Micro-expressions, which are spontaneous and difficult to suppress, reveal a person's true emotions. They are characterized by short duration and low intensity, making micro-expression recognition a challenging task in the field of emotion computing. In recent years, deep learning-based feature extraction and fusion techniques have been widely used for micro-expression recognition, and methods based on the Vision Transformer have gained particular popularity. However, the Vision Transformer-based architectures used in micro-expression recognition involve a significant amount of invalid computation. Additionally, in the traditional two-stream architecture, although separate streams are combined through late fusion, only the output features from the deepest level of the network are used for classification, limiting the network's ability to capture subtle details due to the lack of fine-grained information. To address these issues, we propose a new two-stream architecture with two-level spatio-temporal feature fusion. It includes a spatial encoder (a modified ResNet) for learning facial texture features, a temporal encoder (Swin Transformer) for learning facial muscle motion features, a feature fusion algorithm for integrating multi-level spatio-temporal features, a classification head, and a weighted average operator for temporal aggregation. The two-stream architecture extracts richer features than a single-stream architecture, leading to improved performance. The shifted-window scheme of the Swin Transformer restricts self-attention computation to non-overlapping local windows while allowing cross-window connections, significantly improving performance and reducing computation compared to the Vision Transformer, and the modified ResNet is computationally light. Our feature fusion algorithm leverages the similarity in output feature shapes at each stage of the two streams, enabling the effective fusion of multi-level spatio-temporal features, and yields an improvement of approximately 4% in both the F1 score and the UAR. Comprehensive evaluations on three widely used spontaneous micro-expression datasets (SMIC-HS, CASME II, and SAMM) consistently demonstrate the superiority of our approach over comparative methods. Notably, our approach achieves a UAR exceeding 0.905 on CASME II, making it one of the few frameworks in the published micro-expression recognition literature to reach such high performance.
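The multi-level fusion idea (merging same-shaped stage outputs from both streams rather than classifying only the deepest features) can be sketched as follows; the stage dimensions and 1x1-conv fusion are assumptions, with stand-in tensors in place of the modified-ResNet/Swin encoders.

```python
# Stage-wise fusion of two streams' feature pyramids (illustrative sketch).
import torch
import torch.nn as nn

class TwoStreamFusion(nn.Module):
    def __init__(self, dims=(64, 128, 256), num_classes: int = 3):
        super().__init__()
        # 1x1 convs merge same-shaped stage outputs from the two streams.
        self.fuse = nn.ModuleList([nn.Conv2d(2 * d, d, 1) for d in dims])
        self.head = nn.Linear(sum(dims), num_classes)

    def forward(self, spatial_feats, temporal_feats):
        pooled = []
        for f, s, t in zip(self.fuse, spatial_feats, temporal_feats):
            fused = f(torch.cat([s, t], dim=1))        # stage-wise fusion
            pooled.append(fused.mean(dim=(2, 3)))      # global average pool
        # Classify from all stages, not just the deepest one.
        return self.head(torch.cat(pooled, dim=1))

spatial = [torch.randn(2, d, s, s) for d, s in [(64, 28), (128, 14), (256, 7)]]
temporal = [torch.randn(2, d, s, s) for d, s in [(64, 28), (128, 14), (256, 7)]]
print(TwoStreamFusion()(spatial, temporal).shape)      # torch.Size([2, 3])
```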
Affiliation(s)
- Zebiao Wang, Mingyu Yang, Qingbin Jiao, Liang Xu: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Bing Han: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
- Yuhang Li, Xin Tan: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China; University of Chinese Academy of Sciences, Beijing 100049, China
11. Zhao X, Chen J, Chen T, Liu Y, Wang S, Zeng X, Yan J, Liu G. Micro-Expression Recognition Based on Nodal Efficiency in the EEG Functional Networks. IEEE Trans Neural Syst Rehabil Eng. 2024;32:887-894. doi: 10.1109/tnsre.2023.3347601. PMID: 38190663.
Abstract
Micro-expression recognition based on images has made some progress, yet limitations persist. For instance, image-based recognition of micro-expressions is affected by factors such as ambient light, changes in head posture, and facial occlusion. The high temporal resolution of electroencephalogram (EEG) technology can record brain activity associated with micro-expressions and identify them objectively from a neurophysiological standpoint. Accordingly, this study introduces a novel method for recognizing micro-expressions using nodal efficiency features of brain networks derived from EEG signals. We designed a real-time Supervision and Emotional Expression Suppression (SEES) experimental paradigm to collect video and EEG data reflecting micro- and macro-expression states from 70 participants experiencing positive emotions. By constructing functional brain networks based on graph theory, we analyzed the network efficiencies at both macro- and micro-levels. The participants exhibited lower connection density, global efficiency, and nodal efficiency in the alpha, beta, and gamma networks during micro-expressions compared to macro-expressions. We then selected the optimal subset of nodal efficiency features using a random forest algorithm and applied them to various classifiers, including Support Vector Machine (SVM), Gradient-Boosted Decision Tree (GBDT), Logistic Regression (LR), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost). These classifiers achieved promising accuracy in micro-expression recognition, with SVM exhibiting the highest accuracy of 92.6% when 15 channels were selected. This study provides a new neuroscientific indicator for recognizing micro-expressions based on EEG signals, thereby broadening the potential applications for micro-expression recognition.
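Global and nodal efficiency are standard graph-theory measures; the sketch below computes them with networkx on a random stand-in connectivity matrix. The 15-channel size echoes the paper's channel selection, but the data and the binarization threshold are invented.

```python
# Global and nodal efficiency of a (synthetic) EEG functional network.
import numpy as np
import networkx as nx

rng = np.random.default_rng(1)
conn = rng.random((15, 15))                 # stand-in 15-channel connectivity
conn = (conn + conn.T) / 2                  # symmetrize
np.fill_diagonal(conn, 0)
graph = nx.from_numpy_array((conn > 0.6).astype(int))  # binarize by threshold

def nodal_efficiency(g: nx.Graph, node) -> float:
    # Average inverse shortest-path length from `node` to every other node;
    # unreachable nodes contribute zero by convention.
    lengths = nx.single_source_shortest_path_length(g, node)
    inv = [1.0 / d for n, d in lengths.items() if n != node]
    return sum(inv) / (g.number_of_nodes() - 1)

print("global efficiency:", round(nx.global_efficiency(graph), 3))
print("nodal efficiencies:",
      [round(nodal_efficiency(graph, n), 3) for n in graph])
```

In the paper's setting, per-channel nodal efficiencies like these would be the features ranked by the random forest and fed to the classifiers.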
12. Sharma D, Singh J, Shah B, Ali F, AlZubi AA, AlZubi MA. Public mental health through social media in the post COVID-19 era. Front Public Health. 2023;11:1323922. doi: 10.3389/fpubh.2023.1323922. PMID: 38146469; PMCID: PMC10749364.
Abstract
Social media is a powerful communication tool and a reflection of our digital environment, and it acted as an augmenter and influencer during and after COVID-19. Many of the people sharing social media posts were not actually aware of their mental health status, a situation that warrants automating the detection of mental disorders. This paper presents a methodology for detecting mental disorders using facial micro-expressions. Micro-expressions are momentary, involuntary facial expressions that can be indicative of deeper feelings and mental states; nevertheless, manually detecting and interpreting them can be rather challenging. A deep learning HybridMicroNet model, based on convolutional neural networks, is proposed for emotion recognition from micro-expressions, and a case study on the detection of mental health conditions has been undertaken. The findings demonstrate that the proposed model achieved high accuracy when diagnosing mental health disorders from micro-expressions: 99.08% on the CASME dataset and 97.62% on the SAMM dataset. Based on these findings, deep learning may prove to be an effective method for diagnosing mental health conditions by analyzing micro-expressions.
Affiliation(s)
- Deepika Sharma, Jaiteg Singh: Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
- Babar Shah: College of Technological Innovation, Zayed University, Dubai, United Arab Emirates
- Farman Ali: Department of Computer Science and Engineering, School of Convergence, College of Computing and Informatics, Sungkyunkwan University, Seoul, Republic of Korea
- Ahmad Ali AlZubi: Department of Computer Science, Community College, King Saud University, Riyadh, Saudi Arabia
- Mallak Ahmad AlZubi: Faculty of Medicine, Jordan University of Science and Technology, Irbid, Jordan
13. Yang H, Xie L, Pan H, Li C, Wang Z, Zhong J. Multimodal Attention Dynamic Fusion Network for Facial Micro-Expression Recognition. Entropy (Basel). 2023;25:1246. doi: 10.3390/e25091246. PMID: 37761545; PMCID: PMC10528512.
Abstract
The emotional changes in facial micro-expressions are combinations of action units, and researchers have shown that action units can serve as additional auxiliary data to improve facial micro-expression recognition. Most existing work attempts to fuse image features with action unit information but ignores the impact of action units on the facial image feature extraction process. This paper therefore proposes a local detail feature enhancement model based on a multimodal attention dynamic fusion network (MADFN) for micro-expression recognition. The method uses a masked autoencoder based on learnable class tokens to remove local areas with low emotional expressiveness in micro-expression images. We then utilize an action unit dynamic fusion module to fuse the action unit representation and improve the latent representation ability of the image features. The state-of-the-art performance of the proposed model is evaluated and verified on SMIC, CASME II, SAMM, and their combined 3DB-Combined dataset. The experimental results demonstrate that the proposed model achieves competitive performance, with accuracy rates of 81.71%, 82.11%, and 77.21% on the SMIC, CASME II, and SAMM datasets, respectively, showing that the MADFN model can help improve the discrimination of emotional features in facial images.
Affiliation(s)
- Hongling Yang: Department of Computer Science, Changzhi University, Changzhi 046011, China
- Lun Xie: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Hang Pan: Department of Computer Science, Changzhi University, Changzhi 046011, China
- Chiqin Li, Zhiliang Wang: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Jialiang Zhong: School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China
14. Cîrneanu AL, Popescu D, Iordache D. New Trends in Emotion Recognition Using Image Analysis by Neural Networks, A Systematic Review. Sensors (Basel). 2023;23:7092. doi: 10.3390/s23167092. PMID: 37631629; PMCID: PMC10458371.
Abstract
Facial emotion recognition (FER) is a computer vision task aimed at detecting and classifying human emotional expressions. FER systems are currently used in a vast range of applications in areas such as education, healthcare, and public safety; therefore, detection and recognition accuracy is very important. Like any computer vision task based on image analysis, FER solutions are suitable for integration with artificial intelligence in the form of different neural network varieties, especially deep neural networks, which have shown great potential in recent years due to their feature extraction capabilities and computational efficiency over large datasets. In this context, this paper reviews the latest developments in the FER area, with a focus on recent neural network models that implement specific facial image analysis algorithms to detect and recognize facial emotions. Its scope is to present, from historical and conceptual perspectives, the evolution of the neural network architectures that have produced significant results in the FER area. The paper favors convolutional neural network (CNN)-based architectures over other neural network architectures, such as recurrent neural networks or generative adversarial networks, highlighting the key elements and performance of each architecture and the advantages and limitations of the proposed models in the analyzed papers. Additionally, it presents the datasets currently available for emotion recognition from facial expressions and micro-expressions. The usage of FER systems is also highlighted in various domains such as healthcare, education, security, and the social IoT. Finally, open issues and possible future developments in the FER area are identified.
Affiliation(s)
- Andrada-Livia Cîrneanu, Dan Popescu: Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
- Dragoș Iordache: The National Institute for Research & Development in Informatics-ICI Bucharest, 011455 Bucharest, Romania
15. Pan H, Yang H, Xie L, Wang Z. Multi-scale fusion visual attention network for facial micro-expression recognition. Front Neurosci. 2023;17:1216181. doi: 10.3389/fnins.2023.1216181. PMID: 37575295; PMCID: PMC10412924.
Abstract
Introduction: Micro-expressions are facial muscle movements that hide genuine emotions. In response to the low intensity of micro-expressions, recent studies have attempted to locate localized areas of facial muscle movement. However, this ignores the feature redundancy caused by inaccurate locating of the regions of interest.
Methods: This paper proposes a novel multi-scale fusion visual attention network (MFVAN), which learns multi-scale local attention weights to mask regions of redundant features. Specifically, the model extracts multi-scale features of the apex frame in micro-expression video clips with convolutional neural networks. The attention mechanism focuses on the weights of local region features in the multi-scale feature maps. We then mask redundant regions in the multi-scale features and fuse the local features with high attention weights for micro-expression recognition. Self-supervision and transfer learning reduce the influence of individual identity attributes and increase the robustness of the multi-scale feature maps. Finally, the multi-scale classification loss, the mask loss, and the identity-attribute-removal loss are combined to jointly optimize the model.
Results: The proposed MFVAN method achieves state-of-the-art performance on the SMIC, CASME II, SAMM, and 3DB-Combined datasets. The experimental results show that focusing on local regions at multiple scales contributes to micro-expression recognition.
Discussion: The proposed MFVAN model is the first to combine image generation with visual attention mechanisms to address the combined challenge of individual identity attribute interference and low-intensity facial muscle movements. The MFVAN model also reveals the impact of individual attributes on the localization of local ROIs. The experimental results show that a multi-scale fusion visual attention network contributes to micro-expression recognition.
Affiliation(s)
- Hang Pan, Hongling Yang: Department of Computer Science, Changzhi University, Changzhi, China
- Lun Xie, Zhiliang Wang: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
16. Fu C, Yang W, Chen D, Wei F. AM3F-FlowNet: Attention-Based Multi-Scale Multi-Branch Flow Network. Entropy (Basel). 2023;25:1064. doi: 10.3390/e25071064. PMID: 37510012; PMCID: PMC10378207.
Abstract
Micro-expressions are the small, brief facial expression changes that humans momentarily show during emotional experiences, and their data annotation is complicated, which leads to a scarcity of micro-expression data. To extract salient and distinguishing features from a limited dataset, we propose an attention-based multi-scale, multi-modal, multi-branch flow network that thoroughly learns the motion information of micro-expressions by exploiting the attention mechanism and the complementary properties of different optical flow information. First, we extract optical flow information (horizontal optical flow, vertical optical flow, and optical strain) based on the onset and apex frames of micro-expression videos, and each branch learns one kind of optical flow information separately. Second, we propose a multi-scale fusion module that extracts richer and more stable feature representations, using spatial attention to focus on locally important information at each scale. Then, we design a multi-optical-flow feature reweighting module that adaptively selects features for each optical flow separately via channel attention. Finally, to better integrate the information of the three branches and to alleviate the uneven distribution of micro-expression samples, we introduce a logarithmically adjusted prior-knowledge weighting loss. This loss function weights the prediction scores of samples from different categories to mitigate the negative impact of category imbalance during classification. The effectiveness of the proposed model is demonstrated through extensive experiments and feature visualization on three benchmark datasets (CASME II, SAMM, and SMIC), and its performance is comparable to that of state-of-the-art methods.
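The logarithmically adjusted prior-knowledge weighting can be read as a logit-adjusted cross-entropy; the sketch below shows that general technique with made-up class counts and is not claimed to match the paper's exact loss.

```python
# Logit-adjusted cross-entropy for class-imbalanced training (sketch).
import torch
import torch.nn.functional as F

def logit_adjusted_loss(logits: torch.Tensor, targets: torch.Tensor,
                        class_counts: torch.Tensor,
                        tau: float = 1.0) -> torch.Tensor:
    # Shift each logit by the log prior of its class so rare classes are
    # not drowned out by frequent ones during training.
    log_prior = torch.log(class_counts.float() / class_counts.sum())
    return F.cross_entropy(logits + tau * log_prior, targets)

logits = torch.randn(8, 3)               # batch of 8, 3 ME classes
targets = torch.randint(0, 3, (8,))
counts = torch.tensor([120, 40, 15])     # hypothetical per-class sample sizes
print(logit_adjusted_loss(logits, targets, counts))
```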
Affiliation(s)
- Chenghao Fu: School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China
- Wenzhong Yang: School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China; Xinjiang Key Laboratory of Multilingual Information Technology, Xinjiang University, Urumqi 830017, China
- Danny Chen, Fuyuan Wei: School of Information Science and Engineering, Xinjiang University, Urumqi 830017, China
17. Zheng Y, Blasch E. Facial Micro-Expression Recognition Enhanced by Score Fusion and a Hybrid Model from Convolutional LSTM and Vision Transformer. Sensors (Basel). 2023;23:5650. doi: 10.3390/s23125650. PMID: 37420815; PMCID: PMC10303532.
Abstract
In the billions of faces that are shaped by thousands of different cultures and ethnicities, one thing remains universal: the way emotions are expressed. To take the next step in human-machine interactions, a machine (e.g., a humanoid robot) must be able to classify facial emotions. Allowing systems to recognize micro-expressions affords the machine a deeper dive into a person's true feelings, so that it can take human emotion into account while making optimal decisions. For instance, such machines will be able to detect dangerous situations, alert caregivers to challenges, and provide appropriate responses. Micro-expressions are involuntary and transient facial expressions capable of revealing genuine emotions. We propose a new hybrid neural network (NN) model capable of micro-expression recognition in real-time applications. Several NN models are first compared in this study. Then, a hybrid NN model is created by combining a convolutional neural network (CNN), a recurrent neural network (RNN, e.g., long short-term memory (LSTM)), and a vision transformer. The CNN extracts spatial features (within a neighborhood of an image), whereas the LSTM summarizes temporal features. In addition, a transformer with an attention mechanism can capture sparse spatial relations residing in an image or between frames in a video clip. The inputs of the model are short facial videos, while the outputs are the micro-expressions recognized from the videos. The NN models are trained and tested with publicly available facial micro-expression datasets to recognize different micro-expressions (e.g., happiness, fear, anger, surprise, disgust, sadness). Score fusion and improvement metrics are also presented in our experiments. The results of our proposed models are compared with those of literature-reported methods tested on the same datasets. The proposed hybrid model performs the best, and score fusion can dramatically increase recognition performance.
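Score fusion at its simplest is a weighted average of per-model class probabilities; the sketch below illustrates that generic scheme with arbitrary weights, not the paper's tuned fusion rule.

```python
# Weighted-average score fusion of several models' softmax outputs.
import torch

def fuse_scores(score_list, weights):
    # score_list: per-model probability tensors of shape (batch, classes)
    stacked = torch.stack(score_list)               # (models, batch, classes)
    w = torch.tensor(weights).view(-1, 1, 1)
    return (w * stacked).sum(dim=0) / w.sum()       # fused probabilities

cnn_p = torch.softmax(torch.randn(4, 6), dim=-1)    # e.g. 6 micro-expressions
lstm_p = torch.softmax(torch.randn(4, 6), dim=-1)
vit_p = torch.softmax(torch.randn(4, 6), dim=-1)
fused = fuse_scores([cnn_p, lstm_p, vit_p], [0.3, 0.3, 0.4])
print(fused.argmax(dim=-1))                         # fused class predictions
```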
Affiliation(s)
- Yufeng Zheng: Department of Data Science, University of Mississippi Medical Center, Jackson, MS 39216, USA
18. Li Z, Zhang Y, Xing H, Chan KL. Facial Micro-Expression Recognition Using Double-Stream 3D Convolutional Neural Network with Domain Adaptation. Sensors (Basel). 2023;23:3577. doi: 10.3390/s23073577. PMID: 37050637; PMCID: PMC10098639.
Abstract
Humans show micro-expressions (MEs) under some circumstances. MEs are a display of emotions that a human wants to conceal. The recognition of MEs has been applied in various fields. However, automatic ME recognition remains a challenging problem due to two major obstacles. As MEs are typically of short duration and low intensity, it is hard to extract discriminative features from ME videos. Moreover, it is tedious to collect ME data. Existing ME datasets usually contain insufficient video samples. In this paper, we propose a deep learning model, double-stream 3D convolutional neural network (DS-3DCNN), for recognizing MEs captured in video. The recognition framework contains two streams of 3D-CNN. The first extracts spatiotemporal features from the raw ME videos. The second extracts variations of the facial motions within the spatiotemporal domain. To facilitate feature extraction, the subtle motion embedded in a ME is amplified. To address the insufficient ME data, a macro-expression dataset is employed to expand the training sample size. Supervised domain adaptation is adopted in model training in order to bridge the difference between ME and macro-expression datasets. The DS-3DCNN model is evaluated on two publicly available ME datasets. The results show that the model outperforms various state-of-the-art models; in particular, the model outperformed the best model presented in MEGC2019 by more than 6%.
Affiliation(s)
- Zhengdao Li: Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
- Yupei Zhang: Centre for Intelligent Multidimensional Data Analysis Limited, Hong Kong, China
- Hanwen Xing, Kwok-Leung Chan: Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
19. Zhou H, Huang S, Li J, Wang SJ. Dual-ATME: Dual-Branch Attention Network for Micro-Expression Recognition. Entropy (Basel). 2023;25:460. doi: 10.3390/e25030460. PMID: 36981348; PMCID: PMC10048169.
Abstract
Micro-expression recognition (MER) is challenging due to the difficulty of capturing the instantaneous and subtle motion changes of micro-expressions (MEs). Early works based on hand-crafted features extracted from prior knowledge showed some promising results but have recently been replaced by deep learning methods based on the attention mechanism. However, with limited ME sample sizes, the features extracted by these methods lack discriminative ME representations, leaving MER performance in need of improvement. This paper proposes the Dual-branch Attention Network (Dual-ATME) for MER to address the problem of single-scale features representing MEs ineffectively. Specifically, Dual-ATME consists of two components: Hand-crafted Attention Region Selection (HARS) and Automated Attention Region Selection (AARS). HARS uses prior knowledge to manually extract features from regions of interest (ROIs), while AARS is based on attention mechanisms and extracts hidden information from the data automatically. Finally, through similarity comparison and feature fusion, the dual-scale features are used to learn ME representations effectively. Experiments on spontaneous ME datasets (including CASME II, SAMM, and SMIC) and their composite dataset, MEGC2019-CD, show that Dual-ATME achieves better, or more competitive, performance than state-of-the-art MER methods.
Affiliation(s)
- Haoliang Zhou: School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China; Key Laboratory of Behavior Sciences, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China
- Shucheng Huang: School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China
- Jingting Li, Su-Jing Wang: Key Laboratory of Behavior Sciences, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of the Chinese Academy of Sciences, Beijing 100049, China
20. Li J, Dong Z, Lu S, Wang SJ, Yan WJ, Ma Y, Liu Y, Huang C, Fu X. CAS(ME)³: A Third Generation Facial Spontaneous Micro-Expression Database With Depth Information and High Ecological Validity. IEEE Trans Pattern Anal Mach Intell. 2023;45:2782-2800. doi: 10.1109/tpami.2022.3174895. PMID: 35560102.
Abstract
Micro-expression (ME) is a significant non-verbal communication clue that reveals one person's genuine emotional state. The development of micro-expression analysis (MEA) has just gained attention in the last decade. However, the small sample size problem constrains the use of deep learning on MEA. Besides, ME samples are distributed across six different databases, leading to database bias. Moreover, ME database development is complicated. In this article, we introduce a large-scale spontaneous ME database: CAS(ME)³. The contribution of this article is summarized as follows: (1) CAS(ME)³ offers around 80 hours of videos with over 8,000,000 frames, including 1,109 manually labeled MEs and 3,490 macro-expressions. Such a large sample size allows effective MEA method validation while avoiding database bias. (2) Inspired by psychological experiments, CAS(ME)³ provides depth information as an additional modality unprecedentedly, contributing to multi-modal MEA. (3) For the first time, CAS(ME)³ elicits ME with high ecological validity using the mock crime paradigm, along with physiological and voice signals, contributing to practical MEA. (4) Besides, CAS(ME)³ provides 1,508 unlabeled videos with more than 4,000,000 frames, i.e., a data platform for unsupervised MEA methods. (5) Finally, we demonstrate the effectiveness of depth information by the proposed depth flow algorithm and RGB-D information.
21
Temporal augmented contrastive learning for micro-expression recognition. Pattern Recognit Lett 2023. [DOI: 10.1016/j.patrec.2023.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
22
Li C, Wen C, Qiu Y. A Video Sequence Face Expression Recognition Method Based on Squeeze-and-Excitation and 3DPCA Network. SENSORS (BASEL, SWITZERLAND) 2023; 23:823. [PMID: 36679620 PMCID: PMC9861482 DOI: 10.3390/s23020823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/07/2022] [Revised: 12/26/2022] [Accepted: 01/07/2023] [Indexed: 06/17/2023]
Abstract
Expression recognition is a very important direction for computers to understand human emotions and for human-computer interaction. However, for 3D data such as video sequences, traditional convolutional neural networks stretch the input 3D data into vectors, which not only leads to a dimensional explosion but also fails to retain structural information in 3D space, simultaneously increasing computational cost and lowering the accuracy of expression recognition. This paper proposes a video sequence face expression recognition method based on a Squeeze-and-Excitation and 3DPCA Network (SE-3DPCANet). The introduction of a 3DPCA algorithm in the convolution layer directly constructs tensor convolution kernels to extract the dynamic expression features of video sequences in the spatial and temporal dimensions, without weighting the convolution kernels of adjacent frames by shared weights. A Squeeze-and-Excitation network is introduced in the feature encoding layer to automatically learn the weights of local channel features in the tensor features, thus increasing the representation capability of the model and further improving recognition accuracy. The proposed method is validated on three video face expression datasets. Comparisons were made with other common expression recognition methods, achieving higher recognition rates while significantly reducing the time required for training.
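For reference, a minimal Squeeze-and-Excitation block in its standard 2D form (the paper applies the same channel-reweighting idea to tensor features from its 3DPCA layer; this sketch is not the authors' implementation):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                           # x: (N, C, H, W)
        w = x.mean(dim=(2, 3))                      # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)  # excitation: channel weights
        return x * w                                # reweight channels
```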
Affiliation(s)
- Chang Li
- School of Automation, Guangdong University of Petrochemical Technology, Maoming 525000, China
| | - Chenglin Wen
- School of Automation, Guangdong University of Petrochemical Technology, Maoming 525000, China
| | - Yiting Qiu
- School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
23
Yang C, You X, Xie X, Duan Y, Wang B, Zhou Y, Feng H, Wang W, Fan L, Huang G, Shen X. Development of a Chinese werewolf deception database. Front Psychol 2023; 13:1047427. [PMID: 36698609 PMCID: PMC9869050 DOI: 10.3389/fpsyg.2022.1047427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2022] [Accepted: 12/15/2022] [Indexed: 01/11/2023] Open
Abstract
Although it is important to accurately detect deception, limited research in this area has involved Asian people. We aim to address this gap by studying the identification of deception in Asians in realistic environments. In this study, we develop a Chinese Werewolf Deception Database (C2W2D), which consists of 168 video clips (84 deception videos and 84 honest videos). A total of 1,738,760 frames of facial data are recorded. Fifty-eight healthy undergraduates (24 men and 34 women) and 26 drug addicts (all men) participated in a werewolf game. C2W2D is built on a "werewolf" deception game paradigm in which the participants spontaneously tell the truth or lie. Two synced high-speed cameras are used to capture the game process. To explore the differences between lying and truth-telling in the database, descriptive statistics (e.g., duration and quantity) and hypothesis tests (e.g., t-tests) are conducted on the action units (AUs) of facial expressions. C2W2D contributes a relatively sizable number of deceptive and honest samples with high ecological validity. These samples can be used to study the individual differences and underlying mechanisms of lying and truth-telling in drug addicts and healthy people.
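A hedged illustration of the kind of hypothesis test described, comparing action-unit activation durations between deceptive and honest clips; the arrays here are made-up stand-ins, not data from C2W2D:

```python
import numpy as np
from scipy import stats

# Hypothetical per-clip AU12 activation durations in seconds (made up)
deceptive_au12 = np.array([0.42, 0.55, 0.31, 0.60, 0.48])
honest_au12    = np.array([0.25, 0.33, 0.29, 0.40, 0.22])

# Welch's t-test (no equal-variance assumption)
t, p = stats.ttest_ind(deceptive_au12, honest_au12, equal_var=False)
print(f"AU12 duration: t = {t:.2f}, p = {p:.3f}")
```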
Affiliation(s)
- Chaocao Yang
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China; School of Psychology, Shaanxi Normal University, Xi’an, China; Shaanxi Provincial Key Laboratory of Behavior and Cognitive Neuroscience, Shaanxi Normal University, Xi’an, China
| | - Xuqun You
- School of Psychology, Shaanxi Normal University, Xi’an, China; Shaanxi Provincial Key Laboratory of Behavior and Cognitive Neuroscience, Shaanxi Normal University, Xi’an, China
| | - Xudong Xie
- School of Psychology, Shaanxi Normal University, Xi’an, China; Shaanxi Provincial Key Laboratory of Behavior and Cognitive Neuroscience, Shaanxi Normal University, Xi’an, China
| | - Yuanyuan Duan
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Buxue Wang
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Yuxi Zhou
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Hong Feng
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Wenjing Wang
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Ling Fan
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Genying Huang
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Xunbing Shen
- Key Laboratory of Psychology of TCM and Brain Science, Jiangxi Administration of Traditional Chinese Medicine, Jiangxi University of Chinese Medicine, Nanchang, China; *Correspondence: Xunbing Shen
24
A Survey of Micro-expression Recognition Methods Based on LBP, Optical Flow and Deep Learning. Neural Process Lett 2023. [DOI: 10.1007/s11063-022-11123-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
25
Dong Z, Wang G, Lu S, Dai L, Huang S, Liu Y. Intentional-Deception Detection Based on Facial Muscle Movements in an Interactive Social Context. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
26
Wu Q, Peng K, Xie Y, Lai Y, Liu X, Zhao Z. An ingroup disadvantage in recognizing micro-expressions. Front Psychol 2022; 13:1050068. [PMID: 36507018 PMCID: PMC9732534 DOI: 10.3389/fpsyg.2022.1050068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2022] [Accepted: 11/08/2022] [Indexed: 11/27/2022] Open
Abstract
Micro-expression is a fleeting facial expression of emotion that usually occurs in high-stake situations and reveals the true emotion that a person tries to conceal. Due to its unique nature, recognizing micro-expressions has great applications in fields like law enforcement, medical treatment, and national security. However, the psychological mechanism of micro-expression recognition is still poorly understood. In the present research, we sought to expand upon previous research to investigate whether the group membership of the expresser influences the recognition process of micro-expressions. In two behavioral studies, we found that, contrary to the widespread ingroup advantage found in macro-expression recognition, there was a robust ingroup disadvantage in micro-expression recognition. Specifically, in Studies 1A and 1B, we found that participants were more accurate at recognizing the intense and subtle micro-expressions of their racial outgroups than those of their racial ingroups, and neither training experience nor the duration of micro-expressions moderated this ingroup disadvantage. In Studies 2A and 2B, we further found that mere social categorization alone was sufficient to elicit the ingroup disadvantage for the recognition of intense and subtle micro-expressions, and this effect was also unaffected by the duration of micro-expressions. These results suggest that individuals spontaneously employ the social category information of others to recognize micro-expressions, and that the ingroup disadvantage in micro-expression recognition stems partly from motivated differential processing of ingroup micro-expressions.
Affiliation(s)
- Qi Wu
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China; Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China; *Correspondence: Qi Wu
| | - Kunling Peng
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China; Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
| | - Yanni Xie
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China; Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
| | - Yeying Lai
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China; Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
| | - Xuanchen Liu
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China; Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
| | - Ziwei Zhao
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China; Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
27
Xie T, Sun G, Sun H, Lin Q, Ben X. Decoupling facial motion features and identity features for micro-expression recognition. PeerJ Comput Sci 2022; 8:e1140. [PMID: 36426264 PMCID: PMC9680898 DOI: 10.7717/peerj-cs.1140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2022] [Accepted: 10/10/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND: A micro-expression is a kind of expression produced by people spontaneously and unconsciously when receiving a stimulus. It has the characteristics of low intensity and short duration, and it cannot be controlled or disguised. Thus, micro-expressions can objectively reflect people's real emotional states, and their automatic recognition can help machines better understand users' emotions, which promotes human-computer interaction. Moreover, micro-expression recognition has a wide range of applications in fields like security systems and psychological treatment. Nowadays, thanks to the development of artificial intelligence, most micro-expression recognition algorithms are based on deep learning. The features that a deep learning model extracts from micro-expression video sequences mainly contain facial motion feature information and identity feature information. However, in micro-expression recognition tasks, the motions of facial muscles are subtle, so recognition can easily be interfered with by identity feature information. METHODS: To solve the above problem, a micro-expression recognition algorithm that decouples facial motion features and identity features is proposed in this paper. A Micro-Expression Motion Information Features Extraction Network (MENet) and an Identity Information Features Extraction Network (IDNet) are designed. By adding a Diverse Attention Operation (DAO) module and constructing a divergence loss function in MENet, facial motion features can be effectively extracted. Global attention operations are used in IDNet to extract identity features. A Mutual Information Neural Estimator (MINE) is utilized to decouple facial motion features and identity features, which helps the model obtain more discriminative micro-expression features. RESULTS: Experiments on the SDU, MMEW, SAMM and CASME II datasets achieved competitive results and proved the superiority of the proposed algorithm.
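A sketch of the MINE component mentioned above, using the Donsker-Varadhan lower bound on mutual information; in the decoupling setting described, the estimated MI between motion and identity features would be minimized. Network shapes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MINE(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Statistics network T(x, y) scoring (motion, identity) pairs
        self.T = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def mi_lower_bound(self, motion, identity):
        n = identity.size(0)
        joint = self.T(torch.cat([motion, identity], dim=1))             # paired samples
        shuffled = identity[torch.randperm(n)]
        marginal = self.T(torch.cat([motion, shuffled], dim=1))          # product of marginals
        # Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)]
        return joint.mean() - (torch.logsumexp(marginal, dim=0).squeeze()
                               - torch.log(torch.tensor(float(n))))
```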
28
A fixed-point rotation-based feature selection method for micro-expression recognition. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.10.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
29
Verma M, Reddy MSK, Meedimale YR, Mandal M, Vipparthi SK. AutoMER: Spatiotemporal Neural Architecture Search for Microexpression Recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6116-6128. [PMID: 33886480 DOI: 10.1109/tnnls.2021.3072290] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Facial microexpressions offer useful insights into subtle human emotions. This unpremeditated emotional leakage exhibits the true emotions of a person. However, the minute temporal changes in the video sequences are very difficult to model for accurate classification. In this article, we propose a novel spatiotemporal architecture search algorithm, AutoMER, for microexpression recognition (MER). Our main contribution is a new parallelogram design-based search space for efficient architecture search. We introduce a spatiotemporal feature module named 3-D singleton convolution for cell-level analysis. Furthermore, we present four such candidate operators and two 3-D dilated convolution operators to encode the raw video sequences in an end-to-end manner. To the best of our knowledge, this is the first attempt to discover 3-D convolutional neural network (CNN) architectures with a network-level search for MER. The models discovered by the proposed AutoMER algorithm are evaluated on five microexpression datasets: CASME-I, SMIC, CASME-II, CAS(ME)^2, and SAMM. The generated models quantitatively outperform the existing state-of-the-art approaches. AutoMER is further validated with different configurations, such as the downsampling rate factor, multiscale singleton 3-D convolution, the parallelogram design, and multiscale kernels. Overall, five ablation experiments were conducted to analyze the operational insights of the proposed AutoMER.
30
Concordance between Facial Micro-expressions and Physiological Signals under Emotion Elicitation. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
31
He Y, Xu Z, Ma L, Li H. Micro-expression spotting based on optical flow features. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2022]
32
Micro-expression recognition with supervised contrastive learning. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
33
Zhao X, Chen J, Chen T, Wang S, Liu Y, Zeng X, Liu G. Responses of functional brain networks in micro-expressions: An EEG study. Front Psychol 2022; 13:996905. [DOI: 10.3389/fpsyg.2022.996905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 10/04/2022] [Indexed: 11/13/2022] Open
Abstract
Micro-expressions (MEs) can reflect an individual’s subjective emotions and true mental state, and they are widely used in the fields of mental health, justice, law enforcement, intelligence, and security. However, one of the major challenges of working with MEs is that their neural mechanism is not entirely understood. To the best of our knowledge, the present study is the first to use electroencephalography (EEG) to investigate the reorganization of functional brain networks involved in MEs. We aimed to reveal the underlying neural mechanisms that can provide electrophysiological indicators for ME recognition. A real-time supervision and emotional expression suppression experimental paradigm was designed to collect video and EEG data of MEs and no expressions (NEs) of 70 participants expressing positive emotions. Based on graph theory, we analyzed the efficiency of the functional brain network at the scalp level on both macro and micro scales. The results revealed that in the presence of MEs compared with NEs, the participants exhibited higher global efficiency and higher nodal efficiency in the frontal, occipital, and temporal regions. Additionally, using the random forest algorithm to select a subset of functional connectivity features as input, a support vector machine classifier achieved a classification accuracy of 0.81 for MEs and NEs, with an area under the curve of 0.85. This finding demonstrates the possibility of using EEG to recognize MEs, with a wide range of application scenarios, such as persons wearing face masks or patients with expression disorders.
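A hedged sketch of the graph-theoretic step: threshold a functional connectivity matrix into a graph, then compute global efficiency and a per-node efficiency. The connectivity matrix below is a random stand-in, and the 80th-percentile threshold is an assumption.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
conn = np.abs(rng.standard_normal((32, 32)))   # stand-in channels x channels connectivity
conn = (conn + conn.T) / 2                     # symmetrize
np.fill_diagonal(conn, 0)

# Keep only the strongest 20% of edges, then build the graph
adj = (conn > np.percentile(conn, 80)).astype(int)
G = nx.from_numpy_array(adj)

print("global efficiency:", nx.global_efficiency(G))

# Nodal efficiency of node 0: mean inverse shortest-path length to all others
lengths = dict(nx.all_pairs_shortest_path_length(G))
n = G.number_of_nodes()
nodal_eff0 = sum(1.0 / d for j, d in lengths[0].items() if j != 0) / (n - 1)
print("nodal efficiency (node 0):", nodal_eff0)
```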
34
Micro-expression recognition model based on TV-L1 optical flow method and improved ShuffleNet. Sci Rep 2022; 12:17522. [PMID: 36266408 PMCID: PMC9585088 DOI: 10.1038/s41598-022-21738-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2022] [Accepted: 09/30/2022] [Indexed: 01/13/2023] Open
Abstract
Micro-expression is a kind of facial action that reflects the real emotional state of a person and has high objectivity in emotion detection. Therefore, micro-expression recognition has become one of the research hotspots in the field of computer vision in recent years. Neural networks with convolutional structure remain one of the main recognition methods; they offer high operational efficiency and low computational complexity, but their feature extraction is localized. In recent years, more and more plug-and-play self-attention modules have been used in convolutional neural networks to improve a model's ability to extract global features from the samples. In this paper, we propose a ShuffleNet model combined with a miniature self-attention module, which has only 1.53 million training parameters. First, the onset (start) frame and apex frame of each sample are taken out and their TV-L1 optical flow features are extracted. After that, the optical flow features are fed into the model for pre-training. Finally, the weights obtained from pre-training are used as initialization weights for the model to train on the complete micro-expression samples, which are classified by an SVM classifier. To evaluate the effectiveness of the method, it was trained and tested on a composite dataset consisting of CASME II, SMIC, and SAMM, and the model achieved competitive results compared to state-of-the-art methods under leave-one-subject-out cross-validation.
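A minimal sketch of the TV-L1 optical-flow step between onset and apex frames, assuming grayscale inputs and the `opencv-contrib-python` package (which provides `cv2.optflow`); file paths are hypothetical.

```python
import cv2

# Hypothetical pre-extracted onset/apex frames of one sample
onset = cv2.imread("onset.jpg", cv2.IMREAD_GRAYSCALE)
apex  = cv2.imread("apex.jpg", cv2.IMREAD_GRAYSCALE)

# Dense TV-L1 optical flow (requires the opencv contrib modules)
tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()
flow = tvl1.calc(onset, apex, None)   # (H, W, 2) horizontal/vertical displacement
print("flow shape:", flow.shape)
```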
35
Zhao X, Liu Y, Wang S, Chen J, Chen T, Liu G. Electrophysiological evidence for inhibition hypothesis of micro-expressions based on tensor component analysis and Physarum network algorithm. Neurosci Lett 2022; 790:136897. [PMID: 36195299 DOI: 10.1016/j.neulet.2022.136897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2022] [Revised: 09/01/2022] [Accepted: 09/29/2022] [Indexed: 11/30/2022]
Abstract
The inhibition hypothesis advocated by Ekman (1985) states that when an emotion is concealed or masked, the true emotion is manifested as a micro-expression (ME), a fleeting expression lasting 40 to 500 ms. However, research on the inhibition hypothesis of MEs from the perspective of electrophysiology is lacking. Here, we report electrophysiological evidence obtained from an electroencephalography (EEG) data analysis method. Specifically, we designed an ME elicitation paradigm to collect ME and EEG data of positive emotions from 70 subjects, and we propose a method based on tensor component analysis (TCA) combined with the Physarum network (PN) algorithm to characterize the spatial, temporal, and spectral signatures of dynamic EEG data of MEs. The proposed TCA-PN method revealed two pathways involving dorsal and ventral streams in the functional brain networks of MEs, reflecting the inhibition processing and emotion arousal of MEs. The results provide evidence for the inhibition hypothesis from an electrophysiological standpoint, which allows us to better understand the neural mechanism of MEs.
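For orientation, tensor component analysis on EEG is commonly realized as a CP/PARAFAC decomposition of a channel x frequency x time tensor; below is a hedged sketch with `tensorly` on random stand-in data (the Physarum-network step is not shown, and all dimensions are assumptions).

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Stand-in EEG tensor: channels x frequency bins x time samples
eeg = tl.tensor(np.random.rand(62, 30, 200))

# Rank-3 CP decomposition -> one spatial, spectral, temporal factor per component
weights, (spatial, spectral, temporal) = parafac(eeg, rank=3, normalize_factors=True)
print(spatial.shape, spectral.shape, temporal.shape)   # (62, 3) (30, 3) (200, 3)
```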
Affiliation(s)
- Xingcong Zhao
- School of Electronic and Information Engineering, Southwest University, 400715, China
| | - Ying Liu
- School of Music, Southwest University, 400715, China
| | - Shiyuan Wang
- School of Electronic and Information Engineering, Southwest University, 400715, China
| | - Jiejia Chen
- School of Electronic and Information Engineering, Southwest University, 400715, China
| | - Tong Chen
- School of Electronic and Information Engineering, Southwest University, 400715, China
| | - Guangyuan Liu
- School of Electronic and Information Engineering, Southwest University, 400715, China; Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, 400715, China.
36
Pan H, Xie L, Wang Z. Spatio-temporal convolutional emotional attention network for spotting macro- and micro-expression intervals in long video sequences. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
37
Fan X, Shahid AR, Yan H. Edge-aware motion based facial micro-expression generation with attention mechanism. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
38
Zhao X, Liu Y, Chen T, Wang S, Chen J, Wang L, Liu G. Differences in brain activations between micro- and macro-expressions based on electroencephalography. Front Neurosci 2022; 16:903448. [PMID: 36172039 PMCID: PMC9511965 DOI: 10.3389/fnins.2022.903448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2022] [Accepted: 08/23/2022] [Indexed: 12/04/2022] Open
Abstract
Micro-expressions can reflect an individual's subjective emotions and true mental state and are widely used in the fields of mental health, justice, law enforcement, intelligence, and security. However, the current approach based on image and expert assessment-based micro-expression recognition technology has limitations such as limited application scenarios and time consumption. Therefore, to overcome these limitations, this study is the first to explore the brain mechanisms of micro-expressions and their differences from macro-expressions from a neuroscientific perspective. This can be a foundation for micro-expression recognition based on EEG signals. We designed a real-time supervision and emotional expression suppression (SEES) experimental paradigm to synchronously collect facial expressions and electroencephalograms. Electroencephalogram signals were analyzed at the scalp and source levels to determine the temporal and spatial neural patterns of micro- and macro-expressions. We found that micro-expressions were more strongly activated in the premotor cortex, supplementary motor cortex, and middle frontal gyrus in frontal regions under positive emotions than macro-expressions. Under negative emotions, micro-expressions were more weakly activated in the somatosensory cortex and corneal gyrus regions than macro-expressions. The activation of the right temporoparietal junction (rTPJ) was stronger in micro-expressions under positive than negative emotions. The reason for this difference is that the pathways of facial control are different; the production of micro-expressions under positive emotion is dependent on the control of the face, while micro-expressions under negative emotions are more dependent on the intensity of the emotion.
Affiliation(s)
- Xingcong Zhao
- School of Electronic and Information Engineering, Southwest University, Chongqing, China
- Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, Chongqing, China
| | - Ying Liu
- Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, Chongqing, China
- School of Music, Southwest University, Chongqing, China
| | - Tong Chen
- School of Electronic and Information Engineering, Southwest University, Chongqing, China
- Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, Chongqing, China
| | - Shiyuan Wang
- School of Electronic and Information Engineering, Southwest University, Chongqing, China
- Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, Chongqing, China
| | - Jiejia Chen
- School of Electronic and Information Engineering, Southwest University, Chongqing, China
- Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, Chongqing, China
| | - Linwei Wang
- Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, Chongqing, China
| | - Guangyuan Liu
- School of Electronic and Information Engineering, Southwest University, Chongqing, China
- Key Laboratory of Cognition and Personality, Ministry of Education, Southwest University, Chongqing, China
39
Wei M, Zong Y, Jiang X, Lu C, Liu J. Micro-Expression Recognition Using Uncertainty-Aware Magnification-Robust Networks. ENTROPY (BASEL, SWITZERLAND) 2022; 24:1271. [PMID: 36141156 PMCID: PMC9498083 DOI: 10.3390/e24091271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/17/2022] [Revised: 09/02/2022] [Accepted: 09/06/2022] [Indexed: 06/16/2023]
Abstract
A micro-expression (ME) is a kind of involuntary facial expression that commonly occurs with subtle intensity. Accurately recognizing MEs, a.k.a. micro-expression recognition (MER), has a number of potential applications, e.g., interrogation and clinical diagnosis. Therefore, the subject has received a high level of attention among researchers in the affective computing and pattern recognition communities. In this paper, we propose a straightforward and effective deep learning method called uncertainty-aware magnification-robust networks (UAMRN) for MER, which attempts to address two key issues in MER: the low intensity of MEs and the imbalance of ME samples. Specifically, to better distinguish subtle ME movements, we reconstruct a new sequence by magnifying the ME intensity. Furthermore, a sparse self-attention (SSA) block is implemented that rectifies standard self-attention with locality-sensitive hashing (LSH), resulting in the suppression of artefacts generated during magnification. On the other hand, for the class imbalance problem, we guide the network optimization based on the confidence of the estimation, through which samples from rare classes are allotted greater uncertainty and thus trained more carefully. We conducted experiments on three public ME databases, i.e., CASME II, SAMM and SMIC-HS, the results of which demonstrate improvement compared to recent state-of-the-art MER methods.
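One common way to realize uncertainty-guided optimization of this flavor is Kendall-style learned loss attenuation, where the model predicts a per-sample log-variance alongside the logits; the sketch below is only an assumed stand-in for the paper's scheme.

```python
import torch
import torch.nn.functional as F

def uncertainty_ce(logits, log_var, targets):
    # Per-sample cross-entropy, attenuated by predicted uncertainty:
    # confident samples (low log_var) are weighted up, uncertain ones down,
    # with a log_var penalty so the model cannot claim infinite uncertainty.
    ce = F.cross_entropy(logits, targets, reduction="none")
    return (torch.exp(-log_var) * ce + 0.5 * log_var).mean()

logits = torch.randn(8, 3, requires_grad=True)    # stand-in network outputs
log_var = torch.zeros(8, requires_grad=True)      # stand-in uncertainty head outputs
targets = torch.randint(0, 3, (8,))
loss = uncertainty_ce(logits, log_var, targets)
loss.backward()
```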
Affiliation(s)
- Mengting Wei
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| | - Yuan Zong
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| | - Xingxun Jiang
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
| | - Cheng Lu
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Information Science and Engineering, Southeast University, Nanjing 210096, China
| | - Jiateng Liu
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
40
Lee C, Hong J, Jung H. N-Step Pre-Training and Décalcomanie Data Augmentation for Micro-Expression Recognition. SENSORS (BASEL, SWITZERLAND) 2022; 22:6671. [PMID: 36081132 PMCID: PMC9460268 DOI: 10.3390/s22176671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/11/2022] [Revised: 08/19/2022] [Accepted: 08/28/2022] [Indexed: 06/15/2023]
Abstract
Facial expressions are divided into micro- and macro-expressions. Micro-expressions are low-intensity emotional expressions presented for a short moment of about 0.25 s, whereas macro-expressions last up to 4 s. To elicit micro-expressions, participants are asked to suppress their emotions as much as possible while watching emotion-inducing videos. However, this is a challenging process, and the number of samples collected tends to be smaller than for macro-expressions. Because training models with insufficient data may lead to decreased performance, this study proposes two ways to solve the problem of insufficient data for micro-expression training. The first method involves N-step pre-training, which performs multiple transfer-learning steps from action recognition datasets to those in the facial domain. Second, we propose Décalcomanie data augmentation, which is based on facial symmetry, to create a composite image by cutting and pasting both halves of the face around its center line. The results show that the proposed methods can successfully overcome the data shortage problem and achieve high performance.
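A minimal sketch of the symmetry-based augmentation idea: cut an (assumed pre-aligned) face image at its vertical center line and mirror one half onto the other. This is an illustration of the concept, not the authors' implementation.

```python
import numpy as np

def decalcomanie(face: np.ndarray, use_left: bool = True) -> np.ndarray:
    """Compose a symmetric face from one half of an aligned face image."""
    h, w = face.shape[:2]
    half = face[:, : w // 2] if use_left else face[:, w - w // 2 :]
    mirrored = half[:, ::-1]                 # horizontal flip of the chosen half
    return (np.concatenate([half, mirrored], axis=1) if use_left
            else np.concatenate([mirrored, half], axis=1))
```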
41
Ben X, Ren Y, Zhang J, Wang SJ, Kpalma K, Meng W, Liu YJ. Video-Based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:5826-5846. [PMID: 33739920 DOI: 10.1109/tpami.2021.3067464] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Unlike conventional facial expressions, micro-expressions are involuntary and transient facial expressions capable of revealing the genuine emotions that people attempt to hide. Therefore, they can provide important information in a broad range of applications such as lie detection, criminal detection, etc. Since micro-expressions are transient and of low intensity, however, their detection and recognition are difficult and rely heavily on expert experience. Due to its intrinsic particularity and complexity, video-based micro-expression analysis is attractive but challenging, and has recently become an active area of research. Although there have been numerous developments in this area, thus far there has been no comprehensive survey that provides researchers with a systematic overview of these developments with a unified evaluation. Accordingly, in this survey paper, we first highlight the key differences between macro- and micro-expressions, then use these differences to guide our research survey of video-based micro-expression analysis in a cascaded structure, encompassing the neuropsychological basis, datasets, features, spotting algorithms, recognition algorithms, applications and evaluation of state-of-the-art approaches. For each aspect, the basic techniques, advanced developments and major challenges are addressed and discussed. Furthermore, after considering the limitations of existing micro-expression datasets, we present and release a new dataset - called micro-and-macro expression warehouse (MMEW) - containing more video samples and more labeled emotion types. We then perform a unified comparison of representative methods on CAS(ME)^2 for spotting, and on MMEW and SAMM for recognition, respectively. Finally, some potential future research directions are explored and outlined.
42
Gan Y, See J, Khor HQ, Liu KH, Liong ST. Needle in a Haystack: Spotting and recognising micro-expressions “in the wild”. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.06.101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
43
Ge Y, Su R, Liang Z, Luo J, Tian S, Shen X, Wu H, Liu C. Transcranial Direct Current Stimulation Over the Right Temporal Parietal Junction Facilitates Spontaneous Micro-Expression Recognition. Front Hum Neurosci 2022; 16:933831. [PMID: 35874155 PMCID: PMC9305610 DOI: 10.3389/fnhum.2022.933831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2022] [Accepted: 06/21/2022] [Indexed: 11/19/2022] Open
Abstract
Micro-expressions are fleeting and subtle emotional expressions. As they are spontaneous and uncontrollable by one's mind, micro-expressions are considered an indicator of genuine emotions. Their accurate recognition and interpretation promote interpersonal interaction and social communication. Therefore, enhancing the ability to recognize micro-expressions has captured much attention. In the current study, we investigated the effects of training on micro-expression recognition with a Chinese version of the Micro-Expression Training Tool (METT). Our goal was to confirm whether the recognition accuracy of spontaneous micro-expressions could be improved through training and brain stimulation. Since the right temporal parietal junction (rTPJ) has been shown to be involved in the explicit process of facial emotion recognition, we hypothesized that the rTPJ would play a role in facilitating the recognition of micro-expressions. The results showed that anodal transcranial direct-current stimulation (tDCS) of the rTPJ indeed improved the recognition of spontaneous micro-expressions, especially for those associated with fear. The improved accuracy of recognizing fear spontaneous micro-expressions was positively correlated with personal distress in the anodal group but not in the sham group. Our study supports that the combined use of tDCS and METT can be a viable way to train and enhance micro-expression recognition.
Affiliation(s)
- Yue Ge
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- Beijing Institute of Biomedicine, Beijing, China
| | - Rui Su
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
| | - Zilu Liang
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
| | - Jing Luo
- Beijing Institute of Biomedicine, Beijing, China
| | - Suizi Tian
- School of Psychology, Beijing Normal University, Beijing, China
| | - Xunbing Shen
- College of Humanities, Jiangxi University of Chinese Medicine, Nanchang, China
| | - Haiyan Wu
- Centre for Cognitive and Brain Sciences and Department of Psychology, University of Macau, Taipa, China
| | - Chao Liu
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China
- Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
44
Flotho P, Heiss C, Steidl G, Strauss DJ. Lagrangian Motion Magnification with Landmark-Prior and Sparse PCA for Facial Microexpressions and Micromovements. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022; 2022:2215-2218. [PMID: 36086177 DOI: 10.1109/embc48229.2022.9871549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Video motion magnification methods are motion visualization techniques that aim to magnify subtle and imperceptibly small motions in videos. They fall into two main groups: Eulerian methods work on the pixel grid with implicit motion information, while Lagrangian methods use explicitly estimated motion and modify point trajectories. The motion in high-framerate videos of faces can contain a wide variety of information, ranging from microexpressions over pulse or respiratory rate to cues on speech and affective state. In this work, we propose a novel strategy for Lagrangian motion magnification that integrates landmark information from the face as well as an approach to decompose facial motions in an unsupervised manner using sparse PCA. We decompose the estimated displacements into different movement components that are subsequently amplified selectively. We propose two approaches: a landmark-based decomposition into global and local movements, and a decomposition into multiple coherent motion components based on sparse PCA. Optical flow estimation is performed using a state-of-the-art deep learning-based method that we retrain on a microexpression database. Clinical relevance: This method could be applied to the annotation and analysis of micromovements for neurocognitive assessment and even novel medical applications where micro-motions of the face might play a role.
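A hedged sketch of the sparse-PCA decomposition idea: flatten per-point displacement trajectories into a frames x coordinates matrix, extract a few sparse motion components, and amplify one selectively. All shapes, the amplification factor, and the data are assumptions.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

T, P = 120, 68                        # frames, tracked points (stand-ins)
disp = np.random.randn(T, 2 * P)      # per-frame (dx, dy) displacement per point

spca = SparsePCA(n_components=4, alpha=1.0, random_state=0)
scores = spca.fit_transform(disp)     # (T, 4) temporal activations per component
components = spca.components_         # (4, 2P) sparse spatial motion patterns

# Selective Lagrangian-style magnification: amplify only component 0
magnified = disp + 10.0 * np.outer(scores[:, 0], components[0])
```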
45
Liu Y, Li Y, Yi X, Hu Z, Zhang H, Liu Y. Lightweight ViT Model for Micro-Expression Recognition Enhanced by Transfer Learning. Front Neurorobot 2022; 16:922761. [PMID: 35845761 PMCID: PMC9280988 DOI: 10.3389/fnbot.2022.922761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2022] [Accepted: 05/16/2022] [Indexed: 11/23/2022] Open
Abstract
As opposed to macro-expressions, micro-expressions are subtle, not easily detectable emotional expressions that often contain rich information about mental activities. The practical recognition of micro-expressions is essential in interrogation and healthcare. Neural networks are currently one of the most common approaches to micro-expression recognition, but they often increase in complexity as accuracy improves, and overly large networks impose extremely high hardware requirements on the running equipment. In recent years, vision transformers based on self-attention mechanisms have achieved accuracy in image recognition and classification that is no less than that of neural networks; the drawback is that, without the image-specific inductive biases inherent to convolutional networks, the cost of improving accuracy is an exponential increase in the number of parameters. This paper describes training a facial expression feature extractor by transfer learning and then fine-tuning and optimizing the MobileViT model to perform the micro-expression recognition task. First, the CASME II, SAMM, and SMIC datasets are combined into a compound dataset, and macro-expression samples are extracted from three macro-expression datasets. Each macro-expression sample and micro-expression sample is pre-processed identically to make them similar. Second, the macro-expression samples are used to train the MobileNetV2 block in MobileViT as a facial expression feature extractor, saving the weights when accuracy is highest. Finally, some of the hyperparameters of the MobileViT model are determined by grid search, and the micro-expression samples are then fed in for training. The samples are classified using an SVM classifier. In the experiments, the proposed method obtained an accuracy of 84.27%, and the time to process an individual sample was only 35.4 ms. Comparative experiments show that the proposed method is comparable to state-of-the-art methods in terms of accuracy while improving recognition efficiency.
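A minimal sketch of the final stage, grid-searching an SVM over features produced by a pre-trained extractor; the feature and label arrays here are random stand-ins, and the hyperparameter grid is an assumption rather than the paper's.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

features = np.random.randn(200, 256)       # stand-in extractor outputs
labels = np.random.randint(0, 3, 200)      # stand-in emotion classes

grid = GridSearchCV(SVC(),
                    {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
                    cv=5)
grid.fit(features, labels)
print(grid.best_params_, grid.best_score_)
```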
Affiliation(s)
- Yanju Liu
- School of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing, China
| | - Yange Li
- School of Computer and Control Engineering, Qiqihar University, Qiqihar, China
| | - Xinhai Yi
- School of Computer and Control Engineering, Qiqihar University, Qiqihar, China
| | - Zuojin Hu
- School of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing, China
| | - Huiyu Zhang
- School of Computer and Control Engineering, Qiqihar University, Qiqihar, China
| | - Yanzhong Liu
- School of Computer and Control Engineering, Qiqihar University, Qiqihar, China
- *Correspondence: Yanzhong Liu
46
Wu Q, Xie Y, Liu X, Liu Y. Oxytocin Impairs the Recognition of Micro-Expressions of Surprise and Disgust. Front Psychol 2022; 13:947418. [PMID: 35846599 PMCID: PMC9277341 DOI: 10.3389/fpsyg.2022.947418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022] Open
Abstract
As fleeting facial expressions which reveal the emotion that a person tries to conceal, micro-expressions have great application potential in fields like security, national defense and medical treatment. However, the physiological basis for the recognition of these facial expressions is poorly understood. In the present research, we utilized a double-blind, placebo-controlled, mixed-model experimental design to investigate the effects of oxytocin on the recognition of micro-expressions in three behavioral studies. Specifically, in Studies 1 and 2, participants were asked to perform a laboratory-based standardized micro-expression recognition task after self-administration of a single dose of intranasal oxytocin (40 IU) or placebo (containing all ingredients except for the neuropeptide). In Study 3, we further examined the effects of oxytocin on the recognition of natural micro-expressions. The results showed that intranasal oxytocin decreased the recognition speed for standardized intense micro-expressions of surprise (Study 1) and decreased the recognition accuracy for standardized subtle micro-expressions of disgust (Study 2). The results of Study 3 further revealed that intranasal oxytocin administration significantly reduced the recognition accuracy for natural micro-expressions of surprise and disgust. The present research is the first to investigate the effects of oxytocin on micro-expression recognition. It suggests that oxytocin mainly plays an inhibitory role in the recognition of micro-expressions and that there are fundamental differences in the neurophysiological basis for the recognition of micro-expressions and macro-expressions.
Affiliation(s)
- Qi Wu
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China
- Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
- *Correspondence: Qi Wu
| | - Yanni Xie
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China
- Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
| | - Xuanchen Liu
- Department of Psychology, School of Educational Science, Hunan Normal University, Changsha, China
- Cognition and Human Behavior Key Laboratory of Hunan Province, Hunan Normal University, Changsha, China
| | - Yulong Liu
- School of Finance and Management, Changsha Social Work College, Changsha, China
47
Saffaryazdi N, Wasim ST, Dileep K, Nia AF, Nanayakkara S, Broadbent E, Billinghurst M. Using Facial Micro-Expressions in Combination With EEG and Physiological Signals for Emotion Recognition. Front Psychol 2022; 13:864047. [PMID: 35837650 PMCID: PMC9275379 DOI: 10.3389/fpsyg.2022.864047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2022] [Accepted: 05/30/2022] [Indexed: 11/13/2022] Open
Abstract
Emotions are multimodal processes that play a crucial role in our everyday lives. Recognizing emotions is becoming more critical in a wide range of application domains such as healthcare, education, human-computer interaction, Virtual Reality, intelligent agents, entertainment, and more. Facial macro-expressions or intense facial expressions are the most common modalities in recognizing emotional states. However, since facial expressions can be voluntarily controlled, they may not accurately represent emotional states. Earlier studies have shown that facial micro-expressions are more reliable than facial macro-expressions for revealing emotions. They are subtle, involuntary movements responding to external stimuli that cannot be controlled. This paper proposes using facial micro-expressions combined with brain and physiological signals to more reliably detect underlying emotions. We describe our models for measuring arousal and valence levels from a combination of facial micro-expressions, Electroencephalography (EEG) signals, galvanic skin responses (GSR), and Photoplethysmography (PPG) signals. We then evaluate our model using the DEAP dataset and our own dataset based on a subject-independent approach. Lastly, we discuss our results, the limitations of our work, and how these limitations could be overcome. We also discuss future directions for using facial micro-expressions and physiological signals in emotion recognition.
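A hedged sketch of simple feature-level fusion in the spirit described: concatenate per-trial micro-expression, EEG, GSR, and PPG feature vectors and train one classifier for arousal. All arrays, dimensions, and the classifier choice are stand-in assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

n = 120                                              # stand-in number of trials
face, eeg, gsr, ppg = (np.random.randn(n, d) for d in (64, 160, 8, 8))
X = np.hstack([face, eeg, gsr, ppg])                 # feature-level fusion
y = np.random.randint(0, 2, n)                       # low/high arousal (stand-in)

score = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print("mean CV accuracy:", score)
```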
Affiliation(s)
- Nastaran Saffaryazdi
- Empathic Computing Laboratory, Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
| | - Syed Talal Wasim
- Empathic Computing Laboratory, Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
| | - Kuldeep Dileep
- Empathic Computing Laboratory, Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
| | - Alireza Farrokhi Nia
- Empathic Computing Laboratory, Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
| | - Suranga Nanayakkara
- Augmented Human Laboratory, Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
| | - Elizabeth Broadbent
- Department of Psychological Medicine, The University of Auckland, Auckland, New Zealand
| | - Mark Billinghurst
- Empathic Computing Laboratory, Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
48
Micro-Expression-Based Emotion Recognition Using Waterfall Atrous Spatial Pyramid Pooling Networks. SENSORS 2022; 22:s22124634. [PMID: 35746417 PMCID: PMC9227116 DOI: 10.3390/s22124634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/21/2022] [Revised: 06/04/2022] [Accepted: 06/16/2022] [Indexed: 02/04/2023]
Abstract
Understanding a person’s attitude or sentiment from their facial expressions has long been a straightforward task for humans. Numerous methods and techniques have been used to classify and interpret human emotions that are commonly communicated through facial expressions, whether macro- or micro-expressions. However, performing this task with computer-based techniques or algorithms has proven to be extremely difficult, and annotating the data manually is time-consuming. Compared to macro-expressions, micro-expressions manifest the real emotional cues of a human, which they try to suppress and hide. Different methods and algorithms for recognizing emotions using micro-expressions are examined in this research, and the results are presented in a comparative approach. The proposed technique is based on a multi-scale deep learning approach that aims to extract facial cues of various subjects under various conditions. Two popular multi-scale approaches are explored, Spatial Pyramid Pooling (SPP) and Atrous Spatial Pyramid Pooling (ASPP), and then optimized to suit the purpose of emotion recognition using micro-expression cues. Four new architectures are introduced in this paper, based on multi-layer multi-scale convolutional networks using both direct and waterfall network flows. The experimental results show that the ASPP module with waterfall network flow, which we coined WASPP-Net, outperforms state-of-the-art benchmark techniques with an accuracy of 80.5%. For future work, high-resolution variants of the multi-scale approaches can be explored to further improve recognition performance.
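To make the "waterfall" flow concrete, here is a hedged sketch in which each atrous branch feeds the next (cascade) instead of all branches reading the same input in parallel as in standard ASPP; channel sizes and dilation rates are assumptions.

```python
import torch
import torch.nn as nn

class WaterfallASPP(nn.Module):
    def __init__(self, ch=64, rates=(1, 2, 4, 8)):
        super().__init__()
        # One atrous (dilated) conv per rate; padding=r keeps spatial size
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in rates)
        self.project = nn.Conv2d(ch * len(rates), ch, 1)

    def forward(self, x):
        outs, h = [], x
        for conv in self.branches:     # waterfall: cascade through the rates
            h = torch.relu(conv(h))
            outs.append(h)
        return self.project(torch.cat(outs, dim=1))

y = WaterfallASPP()(torch.randn(1, 64, 56, 56))   # -> (1, 64, 56, 56)
```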
49
Micro-Expression Recognition Based on Optical Flow and PCANet+. SENSORS 2022; 22:s22114296. [PMID: 35684917 PMCID: PMC9185295 DOI: 10.3390/s22114296] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/31/2021] [Revised: 05/10/2022] [Accepted: 05/31/2022] [Indexed: 11/27/2022]
Abstract
Micro-expressions are rapid and subtle facial movements. Unlike the ordinary facial expressions of daily life, micro-expressions are very difficult to detect and recognize. In recent years, due to a wide range of potential applications in many domains, micro-expression recognition has attracted extensive attention from the computer vision community. Because available micro-expression datasets are very small, deep neural network models with huge numbers of parameters are prone to over-fitting. In this article, we propose an OF-PCANet+ method for micro-expression recognition, in which we design a spatiotemporal feature learning strategy based on a shallow PCANet+ model and incorporate optical flow sequence stacking with the PCANet+ network to learn discriminative spatiotemporal features. We conduct comprehensive experiments on the publicly available SMIC and CASME2 datasets. The results show that our lightweight model clearly outperforms popular hand-crafted methods and achieves performance comparable to deep learning based methods, such as 3D-FCNN and ELRCN.
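For background, the first stage of a PCANet-style model learns its convolution filters as the top principal components of local patches; below is a hedged sketch on a stand-in optical-flow map (patch and filter counts are assumptions, and this is not the OF-PCANet+ code).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_extraction.image import extract_patches_2d

flow_x = np.random.randn(64, 64)                        # stand-in flow component
patches = extract_patches_2d(flow_x, (7, 7), max_patches=2000, random_state=0)
patches = patches.reshape(len(patches), -1)
patches -= patches.mean(axis=1, keepdims=True)          # remove patch mean, as in PCANet

# Top-8 principal components, reshaped into 7x7 convolution filters
filters = PCA(n_components=8).fit(patches).components_.reshape(8, 7, 7)
print(filters.shape)                                    # (8, 7, 7)
```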
50
Zhao S, Tang H, Liu S, Zhang Y, Wang H, Xu T, Chen E, Guan C. ME-PLAN: A deep prototypical learning with local attention network for dynamic micro-expression recognition. Neural Netw 2022; 153:427-443. [DOI: 10.1016/j.neunet.2022.06.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2021] [Revised: 05/09/2022] [Accepted: 06/20/2022] [Indexed: 10/17/2022]