1
Ahmad A, Li Z, Iqbal S, Aurangzeb M, Tariq I, Flah A, Blazek V, Prokop L. A comprehensive bibliometric survey of micro-expression recognition system based on deep learning. Heliyon 2024; 10:e27392. [PMID: 38495163] [PMCID: PMC10943397] [DOI: 10.1016/j.heliyon.2024.e27392]
Abstract
Micro-expressions (MEs) are rapidly occurring expressions that reveal the true emotions a person is trying to hide, cover, or suppress. Because they expose a person's actual feelings, they have a broad spectrum of applications in public safety and clinical diagnosis. This study provides a comprehensive review of the field of ME recognition. Bibliometric and network analysis techniques were used to compile all the available literature related to ME recognition. A total of 735 publications from the Web of Science (WOS) and Scopus databases, published between December 2012 and December 2022, were evaluated using all relevant keywords. The first round of data screening produced basic information, which was further extracted for citation, coupling, co-authorship, co-occurrence, bibliographic, and co-citation analysis. Additionally, a thematic and descriptive analysis was carried out to investigate the content of prior research findings and the research techniques used in the literature. The year-wise publication counts indicated that output between 2012 and 2017 was relatively low, but by 2021 a nearly 24-fold increase had brought it to 154 publications. The three most productive journals and conferences were IEEE Transactions on Affective Computing (n = 20 publications), followed by Neurocomputing (n = 17) and Multimedia Tools and Applications (n = 15). Zhao G was the most prolific author with 48 publications, and the most influential country was China (620 publications). Publications by citations showed that each of the top authors acquired between 100 and 1225 citations, while publications by organization indicated that the University of Oulu had the most published papers (n = 51). Deep learning, facial expression recognition, and emotion recognition were among the most frequently used terms. ME research was found to be classified primarily in the discipline of engineering, with China and Malaysia contributing the most.
Affiliation(s)
- Adnan Ahmad: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, School of Information Science and Engineering, Southeast University, Nanjing, 210096, China
- Zhao Li: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, School of Information Science and Engineering, Southeast University, Nanjing, 210096, China
- Sheeraz Iqbal: Department of Electrical Engineering, University of Azad Jammu and Kashmir, Muzaffarabad, 13100, AJK, Pakistan
- Muhammad Aurangzeb: School of Electrical Engineering, Southeast University, Nanjing, 210096, China
- Irfan Tariq: Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, School of Information Science and Engineering, Southeast University, Nanjing, 210096, China
- Ayman Flah: College of Engineering, University of Business and Technology (UBT), Jeddah, 21448, Saudi Arabia; MEU Research Unit, Middle East University, Amman, Jordan; The Private Higher School of Applied Sciences and Technology of Gabes, University of Gabes, Gabes, Tunisia; National Engineering School of Gabes, University of Gabes, Gabes, 6029, Tunisia
- Vojtech Blazek: ENET Centre, VSB—Technical University of Ostrava, Ostrava, Czech Republic
- Lukas Prokop: ENET Centre, VSB—Technical University of Ostrava, Ostrava, Czech Republic
2
Bai J, Yang X, Li Q, Zhao J, Guo S. S3DCN-OLSR: A shallow 3D CNN method for online learning state recognition. Heliyon 2023; 9:e20508. [PMID: 37867877] [PMCID: PMC10589783] [DOI: 10.1016/j.heliyon.2023.e20508]
Abstract
The repeated resurgence of COVID-19 has significantly disrupted learning for students in face-to-face instructional settings. While moving from offline to online instruction has proven to be one of the best solutions, classroom management and capturing students' learning states have emerged as important challenges with the increasing popularity of online instruction. To address these challenges, this paper proposes an online learning state recognition method based on shallow 3D convolution (S3DCN-OLSR) that identifies students' online learning states by analysing their micro-expressions. Specifically, we first use the proposed data augmentation method to decompose each student's online video into three features: the horizontal component of optical flow, the vertical component of optical flow, and the optical-flow magnitude. Next, the student's online learning state is recognised by feeding the processed data into a shallow 3D convolutional neural network. To test the performance of our method, we conduct extensive experiments on the CASME II and SMIC datasets; the results indicate that our method outperforms the other state-of-the-art methods considered in terms of recognition accuracy, UF1, and UAR, demonstrating its superiority in identifying students' online learning states.
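The three-feature decomposition described in this abstract can be sketched as follows. This is an editorial illustration, not the authors' code: the dense flow field itself is assumed to come from some external estimator (e.g. Farneback or TV-L1), and `flow_to_features` is a hypothetical name.

```python
import numpy as np

def flow_to_features(flow):
    """Split a dense optical-flow field of shape (H, W, 2) into the three
    channels the abstract describes: horizontal component, vertical
    component, and flow magnitude, stacked as an (H, W, 3) input."""
    u = flow[..., 0]                 # horizontal displacement per pixel
    v = flow[..., 1]                 # vertical displacement per pixel
    mag = np.sqrt(u ** 2 + v ** 2)   # optical-flow magnitude
    return np.stack([u, v, mag], axis=-1)

# Toy flow field: uniform motion of (3, 4) pixels everywhere,
# so the magnitude channel should be 5.0 at every pixel.
flow = np.zeros((4, 4, 2))
flow[..., 0], flow[..., 1] = 3.0, 4.0
features = flow_to_features(flow)
```

Per-frame feature maps like this would then be stacked along time before being fed to the shallow 3D CNN.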
Affiliation(s)
- Jing Bai: Northwest Normal University, Lanzhou, 730070, China
- Qi Li: Northwest Normal University, Lanzhou, 730070, China
- Jinxiong Zhao: State Grid Gansu Electric Power Research Institute, Lanzhou, 730070, China; School of Cybersecurity, Northwestern Polytechnical University, Xi'an, Shaanxi, 710072, China
- Sensen Guo: School of Cybersecurity, Northwestern Polytechnical University, Xi'an, Shaanxi, 710072, China
3
Yang H, Xie L, Pan H, Li C, Wang Z, Zhong J. Multimodal Attention Dynamic Fusion Network for Facial Micro-Expression Recognition. Entropy (Basel) 2023; 25:1246. [PMID: 37761545] [PMCID: PMC10528512] [DOI: 10.3390/e25091246]
Abstract
The emotional changes in facial micro-expressions are combinations of action units. Research has shown that action units can serve as auxiliary data to improve facial micro-expression recognition, and most existing work attempts to fuse image features with action unit information. However, these works ignore the impact of action units on the facial image feature extraction process. This paper therefore proposes a local detail feature enhancement model based on a multimodal attention dynamic fusion network (MADFN) for micro-expression recognition. The method uses a masked autoencoder based on learnable class tokens to remove local areas with low emotional expressiveness in micro-expression images. An action unit dynamic fusion module then fuses the action unit representation to improve the latent representation ability of the image features. The proposed model is evaluated on the SMIC, CASME II, and SAMM datasets and their combined 3DB-Combined dataset. The experimental results demonstrate that it achieves competitive accuracy rates of 81.71%, 82.11%, and 77.21% on SMIC, CASME II, and SAMM, respectively, showing that the MADFN model can help improve the discrimination of emotional features in facial images.
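The "remove weakly expressive local areas" step can be illustrated with a toy patch-masking sketch. This is a stand-in for the paper's learnable class-token masking: in MADFN the per-patch scores are learned, whereas here they are passed in as fixed numbers purely for illustration.

```python
import numpy as np

def mask_low_score_patches(img, patch, scores, keep_ratio=0.5):
    """Zero out the image patches with the lowest expressiveness scores,
    keeping only the top `keep_ratio` fraction. `scores` is a
    (H/patch, W/patch) grid of per-patch scores (learned in the paper,
    supplied by hand here)."""
    gh, gw = img.shape[0] // patch, img.shape[1] // patch
    k = int(np.ceil(keep_ratio * gh * gw))
    keep = np.argsort(scores.ravel())[::-1][:k]   # indices of top-k patches
    out = np.zeros_like(img)
    for idx in keep:
        r, c = divmod(idx, gw)
        out[r*patch:(r+1)*patch, c*patch:(c+1)*patch] = \
            img[r*patch:(r+1)*patch, c*patch:(c+1)*patch]
    return out

# 4x4 image of ones, 2x2 patches; only the top-left patch scores high.
img = np.ones((4, 4))
scores = np.array([[1.0, 0.0], [0.0, 0.0]])
masked = mask_low_score_patches(img, patch=2, scores=scores, keep_ratio=0.25)
```

Only the surviving patches would then be passed on for feature extraction and action-unit fusion.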
Affiliation(s)
- Hongling Yang: Department of Computer Science, Changzhi University, Changzhi 046011, China
- Lun Xie: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Hang Pan: Department of Computer Science, Changzhi University, Changzhi 046011, China
- Chiqin Li: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Zhiliang Wang: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Jialiang Zhong: School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China
4
Li J, Dong Z, Lu S, Wang SJ, Yan WJ, Ma Y, Liu Y, Huang C, Fu X. CAS(ME)³: A Third Generation Facial Spontaneous Micro-Expression Database With Depth Information and High Ecological Validity. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:2782-2800. [PMID: 35560102] [DOI: 10.1109/tpami.2022.3174895]
Abstract
Micro-expression (ME) is a significant non-verbal communication cue that reveals a person's genuine emotional state. Micro-expression analysis (MEA) has gained attention only in the last decade, and the small sample size problem still constrains the use of deep learning for MEA. Moreover, ME samples are spread across six different databases, leading to database bias, and developing a new ME database is complicated. In this article, we introduce a large-scale spontaneous ME database, CAS(ME)³. The contributions of this article are summarized as follows: (1) CAS(ME)³ offers around 80 hours of video with over 8,000,000 frames, including 1,109 manually labeled MEs and 3,490 macro-expressions; such a large sample size allows effective MEA method validation while avoiding database bias. (2) Inspired by psychological experiments, CAS(ME)³ is the first ME database to provide depth information as an additional modality, contributing to multi-modal MEA. (3) For the first time, CAS(ME)³ elicits MEs with high ecological validity using the mock crime paradigm, along with physiological and voice signals, contributing to practical MEA. (4) CAS(ME)³ also provides 1,508 unlabeled videos with more than 4,000,000 frames, i.e., a data platform for unsupervised MEA methods. (5) Finally, we demonstrate the effectiveness of depth information through the proposed depth flow algorithm and RGB-D information.
5
Verma M, Reddy MSK, Meedimale YR, Mandal M, Vipparthi SK. AutoMER: Spatiotemporal Neural Architecture Search for Microexpression Recognition. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:6116-6128. [PMID: 33886480] [DOI: 10.1109/tnnls.2021.3072290]
Abstract
Facial microexpressions offer useful insights into subtle human emotions, as this unpremeditated emotional leakage exhibits a person's true feelings. However, the minute temporal changes in video sequences are very difficult to model for accurate classification. In this article, we propose AutoMER, a novel spatiotemporal architecture search algorithm for microexpression recognition (MER). Our main contribution is a new parallelogram-design-based search space for efficient architecture search. We introduce a spatiotemporal feature module named 3D singleton convolution for cell-level analysis, and present four such candidate operators along with two 3D dilated convolution operators to encode raw video sequences in an end-to-end manner. To the best of our knowledge, this is the first attempt to discover 3D convolutional neural network (CNN) architectures with a network-level search for MER. The models discovered by AutoMER are evaluated on five microexpression datasets: CASME-I, SMIC, CASME-II, CAS(ME)², and SAMM, and quantitatively outperform existing state-of-the-art approaches. AutoMER is further validated with different configurations, such as the downsampling rate factor, multiscale singleton 3D convolution, parallelogram design, and multiscale kernels; overall, five ablation experiments were conducted to analyze the operational insights of the proposed algorithm.
6
Fan X, Shahid AR, Yan H. Edge-aware motion based facial micro-expression generation with attention mechanism. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.010]
7
Using Computer Vision to Track Facial Color Changes and Predict Heart Rate. J Imaging 2022; 8:245. [PMID: 36135410] [PMCID: PMC9503443] [DOI: 10.3390/jimaging8090245]
Abstract
Current technological advances have pushed the quantification of exercise intensity into a new era of physical exercise science. Monitoring physical exercise is essential in planning, applying, and controlling loads for performance optimization and health. Although many studies have applied statistical approaches to estimate various physiological indices, to our knowledge none has investigated the relationship between facial color changes and increasing exercise intensity. The aim of this study was to develop a non-contact, computer-vision-based method to determine heart rate and, ultimately, exercise intensity. The method analyzes facial color changes during exercise using the RGB, HSV, YCbCr, Lab, and YUV color models. Nine university students participated in the study (mean age = 26.88 ± 6.01 years, mean weight = 72.56 ± 14.27 kg, mean height = 172.88 ± 12.04 cm; six males and three females, all white Caucasian). The data were analyzed separately for each participant (personalized model) as well as for all participants at once (universal model). Multiple auto-regression models and a multiple polynomial regression model were designed to predict the maximum heart rate percentage (maxHR%) from each color model, and the results were evaluated using root mean square error (RMSE), F-values, and R-squared. The multiple polynomial regression using all participants exhibited the best accuracy, with an RMSE of 6.75 (R-squared = 0.78). Exercise prescription and monitoring can benefit from such methods, for example to optimize online monitoring without the need for additional instrumentation.
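The pipeline sketched in this abstract (average a face ROI's color, then fit a polynomial regression to maxHR%) can be illustrated in a few lines. This is an editorial sketch with hypothetical function names, shown for a single color feature rather than the study's full multi-channel models.

```python
import numpy as np

def mean_color(frame, roi):
    """Average R, G, B over a rectangular face ROI of an (H, W, 3) frame.
    `roi` is (top, bottom, left, right)."""
    top, bottom, left, right = roi
    return frame[top:bottom, left:right].reshape(-1, 3).mean(axis=0)

def fit_poly(x, y, degree=2):
    """Least-squares polynomial regression from a 1-D color feature `x`
    to a target `y` (e.g. maxHR%): solves for the coefficients of
    1, x, ..., x**degree."""
    X = np.column_stack([x ** d for d in range(degree + 1)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic check: the fit recovers y = 1 + 2x + 3x^2 exactly.
x = np.linspace(0.0, 1.0, 20)
y = 1 + 2 * x + 3 * x ** 2
coef = fit_poly(x, y)

# ROI averaging on a uniform gray frame returns the frame's color.
frame = np.full((4, 4, 3), 7.0)
color = mean_color(frame, (0, 2, 0, 2))
```

In the study, several such color features (one or more per color model) would feed a multivariate version of the same least-squares fit.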
8
Ben X, Ren Y, Zhang J, Wang SJ, Kpalma K, Meng W, Liu YJ. Video-Based Facial Micro-Expression Analysis: A Survey of Datasets, Features and Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:5826-5846. [PMID: 33739920] [DOI: 10.1109/tpami.2021.3067464]
Abstract
Unlike the conventional facial expressions, micro-expressions are involuntary and transient facial expressions capable of revealing the genuine emotions that people attempt to hide. Therefore, they can provide important information in a broad range of applications such as lie detection, criminal detection, etc. Since micro-expressions are transient and of low intensity, however, their detection and recognition is difficult and relies heavily on expert experiences. Due to its intrinsic particularity and complexity, video-based micro-expression analysis is attractive but challenging, and has recently become an active area of research. Although there have been numerous developments in this area, thus far there has been no comprehensive survey that provides researchers with a systematic overview of these developments with a unified evaluation. Accordingly, in this survey paper, we first highlight the key differences between macro- and micro-expressions, then use these differences to guide our research survey of video-based micro-expression analysis in a cascaded structure, encompassing the neuropsychological basis, datasets, features, spotting algorithms, recognition algorithms, applications and evaluation of state-of-the-art approaches. For each aspect, the basic techniques, advanced developments and major challenges are addressed and discussed. Furthermore, after considering the limitations of existing micro-expression datasets, we present and release a new dataset - called micro-and-macro expression warehouse (MMEW) - containing more video samples and more labeled emotion types. We then perform a unified comparison of representative methods on CAS(ME)² for spotting, and on MMEW and SAMM for recognition, respectively. Finally, some potential future research directions are explored and outlined.
9
Zhao S, Tang H, Liu S, Zhang Y, Wang H, Xu T, Chen E, Guan C. ME-PLAN: A deep prototypical learning with local attention network for dynamic micro-expression recognition. Neural Netw 2022; 153:427-443. [DOI: 10.1016/j.neunet.2022.06.024]
10
Investigating the significance of color space for abnormality detection in wireless capsule endoscopy images. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103624]
11
Cen S, Yu Y, Yan G, Yu M, Kong Y. Micro-expression recognition based on facial action learning with muscle movement constraints. Journal of Intelligent & Fuzzy Systems 2021. [DOI: 10.3233/jifs-202962]
Abstract
As a spontaneous facial expression, a micro-expression reveals a person's underlying psychological responses. However, micro-expression recognition (MER) is highly susceptible to noise interference because facial actions are short-lived and of low intensity. Research on facial action coding systems explores the correlation between emotional states and facial actions, which provides more discriminative features. Building on this correlation information, the goal of our work is a spatiotemporal network that is robust to low-intensity muscle movements for the MER task. First, a multi-scale weighted module is proposed to encode the spatial global context, obtained by merging features of different resolutions preserved from the backbone network. Second, we propose a multi-task facial action learning module that uses the correlation between muscle movements and micro-expressions as a constraint to encode local action features. A clustering constraint term is also introduced to restrict the feature distribution of similar actions, improving category separability in the feature space. Finally, the global context and local action features are stacked as high-quality spatial descriptions and passed through a Convolutional Long Short-Term Memory (ConvLSTM) network to predict micro-expressions. Comparative experiments on the SMIC, CASME-I, and CASME-II datasets show that the proposed method outperforms other mainstream methods.
Affiliation(s)
- Shixin Cen: School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, P.R. China
- Yang Yu: School of Artificial Intelligence, Hebei University of Technology, Tianjin, P.R. China
- Gang Yan: School of Artificial Intelligence, Hebei University of Technology, Tianjin, P.R. China
- Ming Yu: School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, P.R. China; School of Artificial Intelligence, Hebei University of Technology, Tianjin, P.R. China
- Yanlei Kong: School of Artificial Intelligence, Hebei University of Technology, Tianjin, P.R. China
12
Lai X, Huang Q, Xin J, Yu H, Wen J, Huang S, Zhang H, Shen H, Tang Y. Identifying Methamphetamine Abstainers With Convolutional Neural Networks and Short-Time Fourier Transform. Front Psychol 2021; 12:684001. [PMID: 34456796] [PMCID: PMC8385271] [DOI: 10.3389/fpsyg.2021.684001]
Abstract
Few studies have investigated the functional patterns of methamphetamine abstainers. A better understanding of the underlying neurobiological mechanisms in the brains of methamphetamine abstainers will help to explain their abnormal behaviors. Forty-two male methamphetamine abstainers, in long-term abstinence (at least 14 months), and 32 male healthy controls were recruited. All subjects underwent functional MRI while responding to drug-associated cues. This study combines a convolutional neural network with a short-time Fourier transform to identify differing brain patterns between methamphetamine abstainers and controls: the short-time Fourier transform provides time-localized frequency information, while the convolutional neural network extracts structural features of the time-frequency spectrograms. The results showed that the classifier achieved a satisfactory performance (98.9% accuracy) and could extract robust brain-voxel information. The most discriminative voxels were concentrated in the left inferior orbital frontal gyrus, the bilateral postcentral gyri, and the bilateral paracentral lobules. This study provides novel insight into the differing functional patterns of methamphetamine abstainers and healthy controls, and elucidates the pathological mechanism of methamphetamine abstinence from the perspective of time-frequency spectrograms.
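The time-frequency front end described here can be sketched with a plain NumPy short-time Fourier transform. This is an illustrative sketch (window length, hop, and sampling rate are assumed values, not the study's parameters); the resulting spectrograms are what a CNN would consume.

```python
import numpy as np

def stft_magnitude(signal, win=64, hop=32):
    """Magnitude spectrogram via a short-time Fourier transform:
    Hann-windowed overlapping frames, real FFT per frame.
    Returns an array of shape (n_frames, win // 2 + 1)."""
    window = np.hanning(win)
    frames = [signal[i:i + win] * window
              for i in range(0, len(signal) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

fs = 128                               # assumed sampling rate (Hz)
t = np.arange(4 * fs) / fs
sig = np.sin(2 * np.pi * 8 * t)        # pure 8 Hz oscillation
spec = stft_magnitude(sig)
peak_bin = int(spec.mean(axis=0).argmax())  # bin width is fs/win = 2 Hz
```

The 8 Hz tone lands in frequency bin 4 (4 × 2 Hz), confirming the time-localized frequency picture the abstract relies on.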
Affiliation(s)
- Xin Lai: School of Computer Science and Engineering, Central South University, Changsha, China
- Qiuping Huang: National Clinical Research Center for Mental Disorders, Department of Psychiatry, The Second Xiangya Hospital of Central South University, Changsha, China; Institute of Mental Health of Central South University, Chinese National Technology Institute on Mental Disorders, Hunan Key Laboratory of Psychiatry and Mental Health, Hunan Medical Center for Mental Health, Changsha, China
- Jiang Xin: School of Computer Science and Engineering, Central South University, Changsha, China
- Hufei Yu: School of Computer Science and Engineering, Central South University, Changsha, China
- Jingxi Wen: School of Computer Science and Engineering, Central South University, Changsha, China
- Shucai Huang: National Clinical Research Center for Mental Disorders, Department of Psychiatry, The Second Xiangya Hospital of Central South University, Changsha, China; Institute of Mental Health of Central South University, Chinese National Technology Institute on Mental Disorders, Hunan Key Laboratory of Psychiatry and Mental Health, Hunan Medical Center for Mental Health, Changsha, China; The Fourth People's Hospital of Wuhu, Wuhu, China
- Hao Zhang: School of Computer Science and Engineering, Central South University, Changsha, China
- Hongxian Shen: National Clinical Research Center for Mental Disorders, Department of Psychiatry, The Second Xiangya Hospital of Central South University, Changsha, China; Institute of Mental Health of Central South University, Chinese National Technology Institute on Mental Disorders, Hunan Key Laboratory of Psychiatry and Mental Health, Hunan Medical Center for Mental Health, Changsha, China
- Yan Tang: School of Computer Science and Engineering, Central South University, Changsha, China
13
Review of Automatic Microexpression Recognition in the Past Decade. Machine Learning and Knowledge Extraction 2021. [DOI: 10.3390/make3020021]
Abstract
Facial expressions provide important information concerning one's emotional state. Unlike regular facial expressions, microexpressions are small, quick facial movements that generally last only 0.05 to 0.2 s. They reflect individuals' subjective emotions and real psychological states more accurately than regular expressions, which can be acted. However, the small range and short duration of the facial movements involved make microexpressions challenging for humans and machines alike to recognize. In the past decade, automatic microexpression recognition has attracted the attention of researchers in psychology, computer science, and security, among others, and a number of specialized microexpression databases have been collected and made publicly available. The purpose of this article is to provide a comprehensive overview of the current state of the art in automatic facial microexpression recognition. Specifically, the features and learning methods used, the existing microexpression datasets, the major outstanding challenges, and possible future development directions are all discussed.
14
Zontone P, Affanni A, Bernardini R, Piras A, Rinaldo R, Formaggia F, Minen D, Minen M, Savorgnan C. Car Driver's Sympathetic Reaction Detection Through Electrodermal Activity and Electrocardiogram Measurements. IEEE Trans Biomed Eng 2020; 67:3413-3424. [PMID: 32305889] [DOI: 10.1109/tbme.2020.2987168]
Abstract
OBJECTIVE: In this paper we propose a system to detect a subject's sympathetic reaction, which is related to unexpected or challenging events during a car drive. METHODS: We use the electrocardiogram (ECG) signal and the skin potential response (SPR) signal, which has several advantages over other electrodermal activity (EDA) signals. We record one SPR signal for each hand and use an algorithm that, by selecting the smoother signal, is able to remove motion artifacts. We extract statistical features from the ECG and SPR signals in order to classify signal segments and identify the presence or absence of emotional events via a supervised learning algorithm. The experiments were carried out at a company specializing in driving-simulator equipment, using a motorized platform and a driving simulator. Different subjects were tested with this setup, with challenging events occurring at predetermined locations on the track. RESULTS: We obtain an accuracy as high as 79.10% for signal blocks and as high as 91.27% for events. CONCLUSION: The results demonstrate the good performance of the presented system in detecting sympathetic reactions and the effectiveness of the motion-artifact removal procedure. SIGNIFICANCE: Our work demonstrates the possibility of classifying the emotional state of a driver using ECG and EDA signals with a minimally invasive setup. In particular, the proposed use of SPR and the motion-artifact removal procedure are crucial to the effectiveness of the system.
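The "statistical features from signal segments" step can be sketched as follows. The specific feature set here (mean, standard deviation, peak-to-peak range per fixed-length segment) is illustrative, not necessarily the paper's exact choice.

```python
import numpy as np

def segment_features(sig, seg_len):
    """Split a 1-D physiological signal (e.g. SPR, or an ECG-derived
    series) into fixed-length segments and compute simple statistics
    per segment: mean, standard deviation, and peak-to-peak range.
    Returns an (n_segments, 3) feature matrix for a classifier."""
    n = len(sig) // seg_len
    segs = np.asarray(sig[:n * seg_len], dtype=float).reshape(n, seg_len)
    return np.column_stack([segs.mean(axis=1),
                            segs.std(axis=1),
                            np.ptp(segs, axis=1)])

# A ramp of 8 samples split into two 4-sample segments.
feats = segment_features(np.arange(8.0), seg_len=4)
```

Each row of such a matrix would then be labeled (event / no event) and fed to the supervised classifier.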
15
Li Y, Huang X, Zhao G. Joint Local and Global Information Learning With Single Apex Frame Detection for Micro-Expression Recognition. IEEE Transactions on Image Processing 2020; 30:249-263. [PMID: 33156789] [DOI: 10.1109/tip.2020.3035042]
Abstract
Micro-expressions (MEs) are rapid and subtle facial movements that are difficult to detect and recognize. Most recent works attempt to recognize MEs using spatial and temporal information from video clips. According to psychological studies, the apex frame conveys the most emotional information expressed in a facial expression, yet it is not clear how a single apex frame contributes to micro-expression recognition. To address this, the paper first proposes a new method to detect the apex frame by estimating pixel-level change rates in the frequency domain. Using frequency information, it spots the apex frame more effectively than existing methods based on spatio-temporal change information. Second, given the apex frame, the paper proposes a joint feature-learning architecture coupling local and global information to recognize MEs, since not all facial regions contribute equally to ME recognition and some contain no emotional information at all. More specifically, the proposed model combines local information learned from the facial regions carrying the major emotional information with global information learned from the whole face. Leveraging both enables the model to learn discriminative ME representations and suppress the negative influence of regions unrelated to MEs. The method is extensively evaluated on the CASME, CASME II, SAMM, SMIC, and composite databases. Experimental results demonstrate that our method with the detected apex frame achieves considerably promising ME recognition performance compared with state-of-the-art methods that use the whole ME sequence, indicating that the apex frame can contribute significantly to micro-expression recognition.
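The apex-spotting idea (pick the frame where facial change peaks) can be illustrated with a deliberately crude sketch. Note the hedge: this uses a plain spatial difference from the onset frame as a stand-in, whereas the paper estimates pixel-level change rates in the frequency domain.

```python
import numpy as np

def spot_apex(frames):
    """Pick the apex as the frame whose mean absolute difference from
    the onset (first) frame is largest. A crude spatial stand-in for
    the paper's frequency-domain change-rate estimate, but it captures
    the same intuition: the apex is where facial change peaks."""
    onset = frames[0].astype(float)
    change = [np.abs(f.astype(float) - onset).mean() for f in frames]
    return int(np.argmax(change))

# Synthetic clip: intensity rises to a peak at frame index 3, then decays.
clip = np.array([np.full((8, 8), v) for v in [0, 2, 5, 9, 6, 3, 1]],
                dtype=float)
apex = spot_apex(clip)
```

In the paper's pipeline, the spotted apex frame (rather than the whole sequence) then feeds the joint local/global recognition network.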
16
Xia Z, Peng W, Khor HQ, Feng X, Zhao G. Revealing the Invisible with Model and Data Shrinking for Composite-database Micro-expression Recognition. IEEE Transactions on Image Processing 2020; PP:8590-8605. [PMID: 32845838] [DOI: 10.1109/tip.2020.3018222]
Abstract
Composite-database micro-expression recognition is attracting increasing attention as it is more practical for real-world applications. Though a composite database provides more sample diversity for learning good representation models, the important subtle dynamics are prone to disappearing under domain shift, and models, especially deep ones, degrade greatly in performance. In this paper, we analyze the influence of learning complexity, including input complexity and model complexity, and find that lower-resolution input data and a shallower model architecture help ease the degradation of deep models on the composite-database task. Based on this, we propose a recurrent convolutional network (RCN) that explores a shallower architecture and lower-resolution input data, shrinking model and input complexity simultaneously. Furthermore, we develop three parameter-free modules (wide expansion, shortcut connection, and attention unit) that integrate with the RCN without adding any learnable parameters. These modules enhance representation ability from various perspectives while preserving a not-very-deep architecture for lower-resolution data. The three modules can further be combined by an automatic neural architecture search strategy, and the searched architecture is more robust. Extensive experiments on the MEGC2019 dataset (composed of the existing SMIC, CASME II, and SAMM datasets) verify the influence of learning complexity and show that RCNs with the three modules and the searched combination outperform state-of-the-art approaches.
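The "lower-resolution input" half of the shrinking strategy can be sketched with a simple average-pooling downsampler. This is an illustrative sketch of input shrinking in general; the paper's actual resizing method is not specified here.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a single-channel (H, W) frame by an integer `factor`,
    producing the kind of lower-resolution input the paper reports
    helps composite-database MER. Trailing rows/columns that do not
    fill a full block are dropped."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor
    blocks = img[:h2, :w2].reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))

# 4x4 ramp pooled by 2 gives the 2x2 block means.
img = np.arange(16.0).reshape(4, 4)
small = downsample(img, 2)
```

Each pooled pixel is the mean of its 2x2 block (e.g. the top-left block {0, 1, 4, 5} averages to 2.5), so fine detail is traded for a smaller, easier-to-fit input.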
|
17
|
Abstract
Micro-Expression (ME) recognition is a hot topic in computer vision, as it presents a gateway to capturing and understanding everyday human emotions. It is nonetheless a challenging problem because MEs are typically transient (lasting less than 200 ms) and subtle. Recent advances in machine learning enable new and effective methods for diverse computer vision tasks. In particular, deep learning techniques on large datasets outperform classical approaches that rely on hand-crafted features. Even though available datasets of spontaneous MEs are scarce and much smaller, off-the-shelf Convolutional Neural Networks (CNNs) still achieve satisfactory classification results. However, these networks are demanding in terms of memory consumption and computational resources. This poses great challenges when deploying CNN-based solutions in applications such as driver monitoring and comprehension recognition in virtual classrooms, which demand fast and accurate recognition. As these networks were initially designed for tasks in other domains, they are over-parameterized and need to be optimized for ME recognition. In this paper, we propose a new network based on the well-known ResNet18, which we optimize for ME classification in two ways. First, we reduce the depth of the network by removing residual layers. Second, we introduce a more compact representation of the optical flow used as input to the network. We present extensive experiments and demonstrate that the proposed network obtains accuracies comparable to state-of-the-art methods while significantly reducing the required memory. Our best classification accuracy was 60.17% on the challenging composite dataset containing five objective classes. Our method takes only 24.6 ms to classify an ME video clip (less than the duration of the shortest ME, which lasts 40 ms). Our CNN design is thus suitable for real-time embedded applications with limited memory and computing resources.
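As a back-of-the-envelope illustration of why removing late residual layers shrinks such a network, one can count the 3x3 convolution weights in a ResNet18-style body; the "trimmed" configuration below is hypothetical, not the paper's exact architecture:

```python
def basic_block_params(c_in, c_out):
    """Weights in one ResNet 'basic' residual block: two 3x3 convolutions
    (biases, batch-norm and projection shortcuts omitted for brevity)."""
    return c_in * c_out * 9 + c_out * c_out * 9

def body_params(stages):
    """Total conv weights over stages given as (in_ch, out_ch, num_blocks)."""
    total = 0
    for c_in, c_out, n in stages:
        total += basic_block_params(c_in, c_out)             # width-changing block
        total += (n - 1) * basic_block_params(c_out, c_out)  # remaining blocks
    return total

full = [(64, 64, 2), (64, 128, 2), (128, 256, 2), (256, 512, 2)]  # ResNet18 body
trimmed = full[:2]   # hypothetical: drop the two widest (deepest) stages

print(body_params(full), body_params(trimmed))
```

The late stages dominate because parameter count grows with the square of channel width, so pruning them yields an outsized memory saving.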
|
18
|
Verma M, Vipparthi SK, Singh G, Murala S. LEARNet: Dynamic Imaging Network for Micro Expression Recognition. IEEE Transactions on Image Processing 2019; 29:1618-1627. [PMID: 31545721 DOI: 10.1109/tip.2019.2912358] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Unlike prevalent facial expressions, micro-expressions are subtle, involuntary muscle movements that are short-lived in nature. These minute muscle movements reflect the true emotions of a person. Owing to their short duration and low intensity, micro-expressions are very difficult to perceive and interpret correctly. In this paper, we propose a dynamic representation of micro-expressions that preserves the facial movement information of a video in a single frame. We also propose a Lateral Accretive Hybrid Network (LEARNet) to capture micro-level features of an expression in the facial region. LEARNet refines salient expression features in an accretive manner by incorporating accretion layers (AL) in the network. The response of an AL holds the hybrid feature maps generated by prior laterally connected convolution layers. Moreover, the LEARNet architecture incorporates a cross-decoupled relationship between convolution layers, which helps preserve tiny but influential facial muscle change information. The visual responses of the proposed LEARNet depict the effectiveness of the system by preserving both high- and micro-level edge features of facial expressions. The effectiveness of LEARNet is evaluated on four benchmark datasets: CASME-I, CASME-II, CAS(ME)² and SMIC. The experimental results show significant improvements of 4.03%, 1.90%, 1.79% and 2.82% over ResNet on the CASME-I, CASME-II, CAS(ME)² and SMIC datasets, respectively.
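The idea of condensing a video's motion into a single frame can be sketched with approximate rank pooling, a common dynamic-imaging recipe (the paper's exact construction may differ):

```python
import numpy as np

def dynamic_image(frames):
    """Approximate rank pooling: weight frame t (1-indexed, T frames) by
    (2t - T - 1), so later frames count positively and earlier ones
    negatively; a static video therefore collapses to an all-zero image."""
    T = len(frames)
    weights = np.array([2.0 * t - T - 1 for t in range(1, T + 1)])
    return np.tensordot(weights, np.stack(frames), axes=1)

# Toy "video" whose brightness ramps up over five frames
frames = [np.full((2, 2), float(i)) for i in range(5)]
di = dynamic_image(frames)   # every pixel summarises the upward trend
```

Only temporal change survives the pooling, which is exactly the property a micro-expression representation needs.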
|
19
|
Wang SJ, Lin B, Wang Y, Yi T, Zou B, Lyu XW. Action Units recognition based on Deep Spatial-Convolutional and Multi-label Residual network. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.05.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
20
|
Zhao G, Li X. Automatic Micro-Expression Analysis: Open Challenges. Front Psychol 2019; 10:1833. [PMID: 31447752 PMCID: PMC6692451 DOI: 10.3389/fpsyg.2019.01833] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2019] [Accepted: 07/24/2019] [Indexed: 11/13/2022] Open
Affiliation(s)
- Guoying Zhao
- School of Information and Technology, Northwest University, Xi'an, China
- Center for Machine Vision and Signal Analysis, University of Oulu, Oulu, Finland
- Xiaobai Li
- Center for Machine Vision and Signal Analysis, University of Oulu, Oulu, Finland
|
21
|
|
22
|
Yu Y, Duan H, Yu M. Spatiotemporal features selection for spontaneous micro-expression recognition. Journal of Intelligent & Fuzzy Systems 2018. [DOI: 10.3233/jifs-172307] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Affiliation(s)
- Yang Yu
- School of Computer Science and Engineering, Hebei University of Technology, Tianjin, China
- Huiyan Duan
- School of Computer Science and Engineering, Hebei University of Technology, Tianjin, China
- Ming Yu
- School of Computer Science and Engineering, Hebei University of Technology, Tianjin, China
|
23
|
Abstract
Micro-expressions are brief spontaneous facial expressions that appear on the face when a person conceals an emotion, making them different from normal facial expressions in subtlety and duration. Currently, the emotion classes within the CASME II dataset (Chinese Academy of Sciences Micro-expression II) are based on Action Units and self-reports, creating conflicts during machine learning training. We show that classifying expressions by Action Units, instead of predicted emotion, removes the potential bias of human reporting. The proposed classes are tested using LBP-TOP (Local Binary Patterns from Three Orthogonal Planes), HOOF (Histograms of Oriented Optical Flow) and HOG 3D (3D Histogram of Oriented Gradient) feature descriptors. The experiments are evaluated on two benchmark FACS (Facial Action Coding System) coded datasets: CASME II and SAMM (A Spontaneous Micro-Facial Movement dataset). The best result achieves 86.35% accuracy when classifying the proposed 5 classes on CASME II using HOG 3D, outperforming the state-of-the-art 5-class emotion-based classification on CASME II. The results indicate that classification based on Action Units provides an objective method to improve micro-expression recognition.
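LBP-TOP builds on the plain Local Binary Pattern operator, applied separately on the XY, XT and YT planes of the video volume; the 2D building block can be sketched as follows (a minimal illustration, not the exact descriptor configuration used in the paper):

```python
def lbp_code(patch):
    """8-neighbour LBP code for the centre pixel of a 3x3 patch: each
    neighbour >= centre contributes one bit, clockwise from top-left."""
    center = patch[1][1]
    neighbors = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                 patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum((1 << i) for i, n in enumerate(neighbors) if n >= center)

# A patch brighter on its right side sets only the bits for those neighbours
patch = [[10, 10, 90],
         [10, 50, 90],
         [10, 10, 90]]
print(lbp_code(patch))
```

Histogramming these codes over each plane, then concatenating the three histograms, yields the LBP-TOP descriptor.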
|
24
|
Micro-expression recognition with small sample size by transferring long-term convolutional neural network. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.05.107] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
25
|
Oh YH, See J, Le Ngo AC, Phan RCW, Baskaran VM. A Survey of Automatic Facial Micro-Expression Analysis: Databases, Methods, and Challenges. Front Psychol 2018; 9:1128. [PMID: 30042706 PMCID: PMC6049018 DOI: 10.3389/fpsyg.2018.01128] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2017] [Accepted: 06/12/2018] [Indexed: 11/13/2022] Open
Abstract
Over the last few years, automatic facial micro-expression analysis has garnered increasing attention from experts across different disciplines because of its potential applications in various fields such as clinical diagnosis, forensic investigation and security systems. Advances in computer algorithms and video acquisition technology have rendered machine analysis of facial micro-expressions possible today, in contrast to decades ago, when it was primarily the domain of psychiatrists and analysis was largely manual. Indeed, although the study of facial micro-expressions is a well-established field in psychology, it is still relatively new from the computational perspective, with many interesting problems. In this survey, we present a comprehensive review of state-of-the-art databases and methods for micro-expression spotting and recognition. The individual stages involved in automating these tasks are also described and reviewed at length. In addition, we deliberate on the challenges and future directions of this growing field of automatic facial micro-expression analysis.
Affiliation(s)
- Yee-Hui Oh
- Faculty of Engineering, Multimedia University, Cyberjaya, Malaysia
- John See
- Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia
- Anh Cat Le Ngo
- School of Psychology, University of Nottingham, Nottingham, United Kingdom
- Raphael C-W Phan
- Faculty of Engineering, Multimedia University, Cyberjaya, Malaysia; Research Institute for Digital Security, Multimedia University, Cyberjaya, Malaysia
- Vishnu M Baskaran
- School of Information Technology, Monash University Malaysia, Bandar Sunway, Malaysia
|
26
|
Zong Y, Zheng W, Huang X, Shi J, Cui Z, Zhao G. Domain Regeneration for Cross-Database Micro-Expression Recognition. IEEE Transactions on Image Processing 2018; 27:2484-2498. [PMID: 29994602 DOI: 10.1109/tip.2018.2797479] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Micro-expression recognition has recently attracted considerable attention from researchers because of its potential value in many practical applications, e.g., lie detection. In this paper, we investigate an interesting and challenging problem in micro-expression recognition, i.e., cross-database micro-expression recognition, in which the training and testing samples come from different micro-expression databases. Under this setting, the consistent feature distribution between training and testing samples that exists in conventional micro-expression recognition is seriously broken, and hence the performance of most current well-performing methods may drop sharply. To overcome this, we propose a simple yet effective framework called Domain Regeneration (DR). The DR framework learns a domain regenerator that regenerates the micro-expression samples from the source and target databases so that they follow the same or similar feature distributions. We are thus able to use a classifier learned on the labeled source micro-expression samples to predict the labels of the unlabeled target samples. To evaluate the proposed DR framework, we conduct extensive cross-database micro-expression recognition experiments based on the SMIC and CASME II databases. Experimental results show that, compared with recent state-of-the-art cross-database emotion recognition methods, the proposed DR framework achieves more promising performance.
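The distribution-matching idea behind DR can be illustrated, in a much cruder form than the learned regenerator of the paper, by simply re-standardising source features to the target domain's per-dimension statistics:

```python
import numpy as np

def align_to_target(source, target):
    """Shift/rescale each source feature dimension so its mean and standard
    deviation match the target domain (a crude stand-in for the learned
    domain regenerator; the feature matrices of shape (n, d) are hypothetical)."""
    s_mu, s_sd = source.mean(axis=0), source.std(axis=0) + 1e-8
    t_mu, t_sd = target.mean(axis=0), target.std(axis=0) + 1e-8
    return (source - s_mu) / s_sd * t_sd + t_mu

rng = np.random.default_rng(0)
src = rng.normal(5.0, 2.0, size=(100, 3))   # e.g., features from one database
tgt = rng.normal(0.0, 1.0, size=(80, 3))    # e.g., features from another
aligned = align_to_target(src, tgt)
```

After alignment a classifier trained on the labeled (aligned) source features sees inputs distributed like the target domain, which is the premise the DR framework builds on with a far richer regeneration model.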
|
27
|
Wang SH, Phillips P, Dong ZC, Zhang YD. Intelligent facial emotion recognition based on stationary wavelet entropy and Jaya algorithm. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.08.015] [Citation(s) in RCA: 109] [Impact Index Per Article: 18.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
28
|
A main directional maximal difference analysis for spotting facial movements from long-term videos. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.12.034] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
29
|
Hu W, Yang Y, Zhang W, Xie Y. Moving Object Detection Using Tensor-Based Low-Rank and Saliently Fused-Sparse Decomposition. IEEE Transactions on Image Processing 2017; 26:724-737. [PMID: 27849530 DOI: 10.1109/tip.2016.2627803] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
In this paper, we propose a new low-rank and sparse representation model for moving object detection. The model preserves the natural space-time structure of video sequences by representing them as three-way tensors. It then performs the low-rank background and sparse foreground decomposition in the tensor framework. On the one hand, we use the tensor nuclear norm to exploit the spatio-temporal redundancy of the background based on the circulant algebra. On the other hand, we use the newly designed saliently fused-sparse regularizer (SFS) to adaptively constrain the foreground with spatio-temporal smoothness. To refine existing foreground smoothness regularizers, the SFS incorporates local spatio-temporal geometric structure information into the tensor total variation by using the 3D locally adaptive regression kernel (3D-LARK). Moreover, the SFS uses the 3D-LARK to compute the space-time motion saliency of the foreground, which is combined with the l1 norm and improves the robustness of foreground extraction. Finally, we solve the proposed model with a global optimality guarantee. Extensive experiments on challenging, well-known datasets demonstrate that our method significantly outperforms state-of-the-art approaches and works effectively in a wide range of complex scenarios.
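The low-rank-plus-sparse split at the heart of such models can be illustrated in its simplest matrix (rather than tensor) form: a truncated SVD recovers the smooth background, and soft-thresholding the residual keeps only salient foreground deviations. This one-shot sketch stands in for the paper's iterative tensor optimisation:

```python
import numpy as np

def low_rank_sparse_split(D, rank=1, lam=2.0):
    """Split D into a rank-`rank` 'background' L (truncated SVD) and a
    sparse 'foreground' S (residual soft-thresholded at lam)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]          # low-rank background
    R = D - L
    S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0) # sparse foreground
    return L, S

D = np.full((5, 5), 3.0)    # static "background"
D[2, 3] += 10.0             # one moving-object "pixel"
L, S = low_rank_sparse_split(D)
```

The soft threshold zeroes out the small SVD residual everywhere except at the injected outlier, which survives as the sparse foreground.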
|
30
|
Yang Y, Liu C, Yu H, Shao D, Tsow F, Tao N. Motion robust remote photoplethysmography in CIELab color space. Journal of Biomedical Optics 2016; 21:117001. [PMID: 27812695 PMCID: PMC5995145 DOI: 10.1117/1.jbo.21.11.117001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/20/2016] [Accepted: 10/18/2016] [Indexed: 06/06/2023]
Abstract
Remote photoplethysmography (rPPG) is attractive for tracking a subject's physiological parameters without a wearable device. However, rPPG is known to be prone to body-movement-induced artifacts, making it unreliable in realistic situations. Here we report a method to minimize movement-induced artifacts. The method selects an optimal region of interest (ROI) automatically, prunes frames in which the ROI is not clearly captured (e.g., when the subject moves out of view), and analyzes the rPPG signal in the CIELab color space rather than the widely used RGB color space. We show that body movement primarily affects image intensity rather than chromaticity, so separating chromaticity from intensity in CIELab color space helps achieve effective reduction of movement-induced artifacts. We validate the method in a pilot study of 17 people with diverse skin tones.
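The separation the authors exploit can be seen directly from the standard sRGB-to-CIELab conversion (D65 white point): for a neutral (gray) pixel the chromaticity channels a and b stay near zero, while brightness changes move only L. A self-contained sketch of the conversion:

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert one sRGB pixel (components in 0..1) to CIELab (D65)."""
    rgb = np.asarray(rgb, dtype=float)
    # Undo sRGB gamma, then go through XYZ to Lab
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    M = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = M @ lin / np.array([0.95047, 1.0, 1.08883])   # normalise by D65 white
    f = np.where(xyz > (6 / 29) ** 3,
                 np.cbrt(xyz),
                 xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    return 116 * f[1] - 16, 500 * (f[0] - f[1]), 200 * (f[1] - f[2])

L_dim, a_dim, b_dim = rgb_to_lab([0.4, 0.4, 0.4])           # dim gray pixel
L_bright, a_bright, b_bright = rgb_to_lab([0.7, 0.7, 0.7])  # same hue, brighter
```

An intensity change moves L substantially while a and b remain essentially unchanged, which is why chromaticity-based pulse extraction is more robust to motion-induced brightness variation.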
Affiliation(s)
- Yuting Yang
- Arizona State University, Biodesign Institute, Tempe, Arizona 85287-5801, United States
- Nanjing University, School of Chemistry and Chemical Engineering, State Key Lab of Analytical Chemistry for Life Science, Nanjing, Jiangsu 210093, China
- Chenbin Liu
- Arizona State University, Biodesign Institute, Tempe, Arizona 85287-5801, United States
- Nanjing University, School of Chemistry and Chemical Engineering, State Key Lab of Analytical Chemistry for Life Science, Nanjing, Jiangsu 210093, China
- Hui Yu
- Arizona State University, Biodesign Institute, Tempe, Arizona 85287-5801, United States
- Nanjing University, School of Chemistry and Chemical Engineering, State Key Lab of Analytical Chemistry for Life Science, Nanjing, Jiangsu 210093, China
- Dangdang Shao
- Arizona State University, Biodesign Institute, Tempe, Arizona 85287-5801, United States
- Francis Tsow
- Arizona State University, Biodesign Institute, Tempe, Arizona 85287-5801, United States
- Nongjian Tao
- Arizona State University, Biodesign Institute, Tempe, Arizona 85287-5801, United States
- Nanjing University, School of Chemistry and Chemical Engineering, State Key Lab of Analytical Chemistry for Life Science, Nanjing, Jiangsu 210093, China
|