1. Chen L, Li M, Wu M, Pedrycz W, Hirota K. Coupled Multimodal Emotional Feature Analysis Based on Broad-Deep Fusion Networks in Human-Robot Interaction. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:9663-9673. PMID: 37021991. DOI: 10.1109/tnnls.2023.3236320.
Abstract
A coupled multimodal emotional feature analysis (CMEFA) method based on broad-deep fusion networks, which divides multimodal emotion recognition into two layers, is proposed. First, facial and gesture emotional features are extracted using the broad and deep learning fusion network (BDFN). Because the two modalities are not completely independent of each other, canonical correlation analysis (CCA) is used to analyze and extract the correlation between the emotional features, and a coupling network is established to recognize emotion from the extracted bi-modal features. Both simulation and application experiments were completed. In the simulation experiments on the bimodal face and body gesture database (FABO), the recognition rate of the proposed method increased by 1.15% over that of support vector machine recursive feature elimination (SVMRFE) (without considering the unbalanced contribution of features). Moreover, the multimodal recognition rate of the proposed method is 21.22%, 2.65%, 1.61%, 1.54%, and 0.20% higher than those of the fuzzy deep neural network with sparse autoencoder (FDNNSA), ResNet-101 + GFK, C3D + MCB + DBN, the hierarchical classification fusion strategy (HCFS), and the cross-channel convolutional neural network (CCCNN), respectively. In addition, preliminary application experiments were carried out on our emotional social robot system, in which the robot recognizes the emotions of eight volunteers from their facial expressions and body gestures.
2. Hinduja S, Nourivandi T, Cohn JF, Canavan S. Time to retire F1-binary score for action unit detection. Pattern Recognit Lett 2024; 182:111-117. PMID: 39086494. PMCID: PMC11290352. DOI: 10.1016/j.patrec.2024.04.016.
Abstract
Detecting action units is an important task in face analysis, especially in facial expression recognition, due in part to the idea that expressions can be decomposed into multiple action units. To evaluate systems that detect action units, the F1-binary score is often used as the evaluation metric. In this paper, we argue that the F1-binary score does not reliably evaluate these models, largely because of class imbalance, and that it should therefore be retired in favor of a suitable replacement. We justify this argument through a detailed evaluation of the negative influence of class imbalance on action unit detection, including an investigation of the influence of class imbalance in training and test sets and in new data (i.e., generalizability). We empirically show that F1-micro should be used as the replacement for F1-binary.
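The imbalance sensitivity the paper argues about is easy to reproduce with a toy example. The labels below are fabricated for illustration (5% positive frames, roughly mimicking a rare action unit); note that for single-label binary classification F1-micro reduces to accuracy.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical, heavily imbalanced AU labels: 5 positive frames out of 100
y_true = np.array([1] * 5 + [0] * 95)
# A detector that finds 2 of the 5 positives and raises 2 false alarms
y_pred = np.array([1] * 2 + [0] * 3 + [0] * 93 + [1] * 2)

f1_bin = f1_score(y_true, y_pred, average='binary')  # scores only the rare class
f1_mic = f1_score(y_true, y_pred, average='micro')   # pools TP/FP/FN over all classes
print(f1_bin, f1_mic)  # ~0.444 vs 0.95
```

The same predictions look poor under F1-binary (precision 0.5, recall 0.4) yet strong under F1-micro, which is exactly the kind of divergence class imbalance induces; the paper's argument is about which of these views is the reliable one.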
Affiliation(s)
- Saurabh Hinduja
- Department of Psychology, University of Pittsburgh, Pittsburgh, USA
- Tara Nourivandi
- Department of Computer Science and Engineering, University of South Florida, USA
- Jeffrey F. Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, USA
- Shaun Canavan
- Department of Computer Science and Engineering, University of South Florida, USA
3. Yang B, Wu J, Ikeda K, Hattori G, Sugano M, Iwasawa Y, Matsuo Y. Deep Learning Pipeline for Spotting Macro- and Micro-expressions in Long Video Sequences Based on Action Units and Optical Flow. Pattern Recognit Lett 2023. DOI: 10.1016/j.patrec.2022.12.001.
4. Jia X, Xu S, Zhou Y, Wang L, Li W. A Novel Dual-channel Graph Convolutional Neural Network for Facial Action Unit Recognition. Pattern Recognit Lett 2023. DOI: 10.1016/j.patrec.2023.01.001.
5. Zhou Y, Jin L, Ma G, Xu X. Quaternion Capsule Neural Network With Region Attention for Facial Expression Recognition in Color Images. IEEE Transactions on Emerging Topics in Computational Intelligence 2022. DOI: 10.1109/tetci.2021.3120513.
Affiliation(s)
- Yu Zhou
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Lianghai Jin
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Guangzhi Ma
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Xiangyang Xu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
6. Facial Motion Analysis beyond Emotional Expressions. Sensors 2022; 22:3839. PMID: 35632248. PMCID: PMC9144218. DOI: 10.3390/s22103839. Received April 23, 2022; revised May 11, 2022; accepted May 14, 2022.
Abstract
Facial motion analysis is a research field with many practical applications and has developed rapidly in recent years. However, most effort has focused on recognizing basic facial expressions of emotion, neglecting the analysis of facial motions related to non-verbal communication signals. This paper focuses on the classification of facial expressions that are of the utmost importance in sign languages (Grammatical Facial Expressions) but are also present in expressive spoken language. We collected a dataset of Spanish Sign Language sentences and extracted the intervals for three types of Grammatical Facial Expressions: negation, closed queries, and open queries. A study of several deep learning models using different input features on the collected dataset (LSE_GFE) and an external dataset (BUHMAP) shows that GFEs can be learned reliably with Graph Convolutional Networks fed only with face landmarks.
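The landmark-fed graph convolution idea above can be sketched with a single propagation layer. Everything here is illustrative: the toy 5-node landmark chain, the random weights, and the symmetric-normalized update rule H' = ReLU(D^-1/2 (A+I) D^-1/2 H W) are a generic GCN formulation, not the models evaluated in the paper.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])                    # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ a_hat @ d_inv_sqrt @ feats @ weight, 0.0)

rng = np.random.default_rng(0)
# Hypothetical toy graph: 5 face landmarks connected in a chain
adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[i, j] = adj[j, i] = 1.0
coords = rng.standard_normal((5, 2))   # (x, y) landmark coordinates as input features
w = rng.standard_normal((2, 8))        # learnable projection to 8 hidden channels
hidden = gcn_layer(adj, coords, w)
print(hidden.shape)  # (5, 8)
```

A full classifier would stack a few such layers over the real face-landmark graph and pool node features into a per-sequence GFE prediction.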
7. Jia X, Zhou Y, Li W, Li J, Yin B. Data-aware relation learning-based graph convolution neural network for facial action unit recognition. Pattern Recognit Lett 2022. DOI: 10.1016/j.patrec.2022.02.010.
8. Zhou Y, Jin L, Liu H, Song E. Color Facial Expression Recognition by Quaternion Convolutional Neural Network With Gabor Attention. IEEE Trans Cogn Dev Syst 2021. DOI: 10.1109/tcds.2020.3041642.
10. Liang X, Xu L, Liu J, Liu Z, Cheng G, Xu J, Liu L. Patch Attention Layer of Embedding Handcrafted Features in CNN for Facial Expression Recognition. Sensors (Basel, Switzerland) 2021; 21:833. PMID: 33513723. PMCID: PMC7865259. DOI: 10.3390/s21030833. Received December 25, 2020; revised January 13, 2021; accepted January 21, 2021.
Abstract
Recognizing facial expressions has attracted increasing attention due to its broad range of applications in human-computer interaction systems. Although facial representation is crucial to final recognition accuracy, traditional handcrafted representations reflect only shallow characteristics, and it is uncertain whether convolutional layers can extract better ones. In addition, the policy of sharing weights across a whole image is improper for structured face images. To overcome these limitations, a novel method based on patches of interest, the Patch Attention Layer (PAL) of embedding handcrafted features, is proposed to learn the local shallow facial features of each patch on face images. First, a handcrafted feature, the Gabor surface feature (GSF), is extracted by convolving the input face image with a set of predefined Gabor filters. Second, the generated feature is segmented into non-overlapping patches, capturing local shallow features by applying different filters to different local patches. Then, the weighted shallow features are fed into the remaining convolutional layers to capture high-level features. Our method can be applied directly to a static image without facial landmark information, and the preprocessing step is very simple. Experiments on four databases show that our method achieves very competitive performance (Extended Cohn-Kanade database (CK+): 98.93%; Oulu-CASIA: 97.57%; Japanese Female Facial Expressions database (JAFFE): 93.38%; RAF-DB: 86.8%) compared with other state-of-the-art methods.
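The Gabor filter-bank convolution underlying the GSF step can be sketched as below. The kernel parameters (size, wavelength, sigma, aspect ratio), the 32x32 random "face patch", and the 4-orientation bank are illustrative assumptions; the paper's exact filter set and the subsequent patch-attention weighting are not reproduced here.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size, theta, lam=4.0, sigma=2.0, gamma=0.5):
    """Real part of a Gabor filter at orientation theta (minimal sketch)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)     # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / lam)
    return envelope * carrier

rng = np.random.default_rng(0)
patch = rng.random((32, 32))  # stand-in for a grayscale face image
# Filter bank at 4 orientations, as in typical Gabor-feature pipelines
bank = [gabor_kernel(7, t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
responses = np.stack([convolve2d(patch, k, mode='same') for k in bank])
print(responses.shape)  # (4, 32, 32)
```

In the PAL method, response maps like these would be split into non-overlapping patches, each patch weighted by its learned attention before the remaining convolutional layers.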
Affiliation(s)
- Xingcan Liang
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; (X.L.); (J.L.); (Z.L.); (G.C.)
- University of Science and Technology of China, Hefei 230026, China; (J.X.); (L.L.)
- Linsen Xu
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; (X.L.); (J.L.); (Z.L.); (G.C.)
- Anhui Province Key Laboratory of Biomimetic Sensing and Advanced Robot Technology, Hefei 230031, China
- Jinfu Liu
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; (X.L.); (J.L.); (Z.L.); (G.C.)
- Zhipeng Liu
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; (X.L.); (J.L.); (Z.L.); (G.C.)
- University of Science and Technology of China, Hefei 230026, China; (J.X.); (L.L.)
- Gaoxin Cheng
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; (X.L.); (J.L.); (Z.L.); (G.C.)
- University of Science and Technology of China, Hefei 230026, China; (J.X.); (L.L.)
- Jiajun Xu
- University of Science and Technology of China, Hefei 230026, China; (J.X.); (L.L.)
- Lei Liu
- University of Science and Technology of China, Hefei 230026, China; (J.X.); (L.L.)
11. Wan J, Lai Z, Shen L, Zhou J, Gao C, Xiao G, Hou X. Robust facial landmark detection by cross-order cross-semantic deep network. Neural Netw 2020; 136:233-243. PMID: 33257223. DOI: 10.1016/j.neunet.2020.11.001. Received July 18, 2020; revised September 22, 2020; accepted November 2, 2020.
Abstract
Recently, convolutional neural network (CNN)-based facial landmark detection methods have achieved great success. However, most existing CNN-based methods do not attempt to activate multiple correlated facial parts and learn different semantic features from them, so they cannot accurately model the relationships among local details or fully explore more discriminative and fine-grained semantic features; as a result, they suffer from partial occlusions and large pose variations. To address these problems, we propose a cross-order cross-semantic deep network (CCDN) to boost semantic feature learning for robust facial landmark detection. Specifically, a cross-order two-squeeze multi-excitation (CTM) module is proposed to introduce cross-order channel correlations for learning more discriminative representations and activating multiple attention-specific parts. Moreover, a novel cross-order cross-semantic (COCS) regularizer is designed to drive the network to learn cross-order cross-semantic features from different activations for facial landmark detection. By integrating the CTM module and the COCS regularizer, the proposed CCDN can effectively activate and learn finer and complementary cross-order cross-semantic features, improving the accuracy of facial landmark detection under extremely challenging scenarios. Experimental results on challenging benchmark datasets demonstrate the superiority of CCDN over state-of-the-art facial landmark detection methods.
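The squeeze-excitation mechanism that the CTM module builds its "two-squeeze multi-excitation" on can be sketched generically. This is not the paper's CTM module, only the standard channel-attention pattern it extends; shapes, weights, and the reduction ratio are illustrative assumptions.

```python
import numpy as np

def squeeze_excite(feat_map, w1, w2):
    """Generic squeeze-and-excitation channel attention (underlying mechanism only)."""
    # Squeeze: global average pool over spatial dims -> one scalar per channel
    z = feat_map.mean(axis=(1, 2))
    # Excite: bottleneck MLP + sigmoid yields per-channel gates in (0, 1)
    s = 1.0 / (1.0 + np.exp(-(np.maximum(z @ w1, 0.0) @ w2)))
    # Reweight: scale each channel by its gate
    return feat_map * s[:, None, None]

rng = np.random.default_rng(0)
fmap = rng.standard_normal((16, 8, 8))  # 16 channels over an 8x8 spatial grid
w1 = rng.standard_normal((16, 4))       # squeeze to 4 hidden units (reduction 4x)
w2 = rng.standard_normal((4, 16))       # excite back to 16 channel gates
out = squeeze_excite(fmap, w1, w2)
print(out.shape)  # (16, 8, 8)
```

The CTM module differs in applying such gating across cross-order (e.g. second-order) channel statistics with multiple excitation branches, one per attention-specific facial part.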
Affiliation(s)
- Jun Wan
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China; School of Mathematics and Statistics, Hanshan Normal University, Chaozhou 521041, China
- Zhihui Lai
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518060, China
- Linlin Shen
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518060, China
- Jie Zhou
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518060, China
- Can Gao
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518060, China
- Gang Xiao
- School of Mathematics and Statistics, Hanshan Normal University, Chaozhou 521041, China
- Xianxu Hou
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China