1. Borgalli RA, Surve S. Review on learning framework for facial expression recognition. The Imaging Science Journal 2023. doi:10.1080/13682199.2023.2172526
Affiliation(s)
- Rohan Appasaheb Borgalli, Department of Electronics Engineering, Fr. Conceicao Rodrigues College of Engineering, Bandra, University of Mumbai, Mumbai, Maharashtra, India
- Sunil Surve, Department of Computer Engineering, Fr. Conceicao Rodrigues College of Engineering, Bandra, University of Mumbai, Mumbai, Maharashtra, India
2. Heterogeneous Spatio-Temporal Relation Learning Network for Facial Action Unit Detection. Pattern Recognit Lett 2022. doi:10.1016/j.patrec.2022.11.010
3
|
Hu Y, Wen G, Luo M, Dai D, Cao W, Yu Z, Hall W. Inner-Imaging Networks: Put Lenses Into Convolutional Structure. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:8547-8560. [PMID: 34398768 DOI: 10.1109/tcyb.2020.3034605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Despite their tremendous success in computer vision, deep convolutional networks suffer from serious computational costs and redundancy. Although previous works address this by enhancing the diversity of filters, they have not considered the complementarity and completeness of the internal convolutional structure. To address this problem, we propose a novel inner-imaging (InI) architecture that allows relationships between channels to meet these requirements. Specifically, we organize the channel signal points into groups and use convolutional kernels to model the intragroup and intergroup relationships simultaneously. A convolutional filter is a powerful tool for modeling spatial relations and organizing grouped signals, so the proposed method maps the channel signals onto a pseudoimage, like putting a lens into the internal convolutional structure. Consequently, not only is the diversity of channels increased, but their complementarity and completeness are also explicitly enhanced. The proposed architecture is lightweight and easy to implement, and it provides an efficient self-organization strategy for convolutional networks to improve their performance. Extensive experiments on multiple benchmark datasets, including CIFAR, SVHN, and ImageNet, verify the effectiveness of the InI mechanism with the most popular convolutional networks as backbones.
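The pseudoimage idea lends itself to a compact illustration. Below is a minimal PyTorch sketch: pooled channel descriptors are laid out on a 2D grid and a small convolution relates neighbouring channel groups before gating the feature maps. The grid size, the two-layer conv stack, and the sigmoid gating are illustrative assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn

class InnerImagingGate(nn.Module):
    # Reshape pooled channel descriptors into a 2D "pseudoimage" and
    # convolve over it, so intra- and inter-group channel relations are
    # modeled spatially before recalibrating the feature maps.
    def __init__(self, channels, grid=(16, 16)):
        super().__init__()
        assert grid[0] * grid[1] == channels
        self.grid = grid
        self.relate = nn.Sequential(
            nn.Conv2d(1, 4, kernel_size=3, padding=1),  # relate channel groups
            nn.ReLU(inplace=True),
            nn.Conv2d(4, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))                 # global average pool -> (B, C)
        m = s.view(b, 1, *self.grid)           # channels laid out as an image
        w = torch.sigmoid(self.relate(m))      # learned per-channel gates
        return x * w.view(b, c, 1, 1)

A module like this drops into a residual block the same way a squeeze-and-excitation gate would, e.g. y = InnerImagingGate(256)(features).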
4. Jia X, Zhou Y, Li W, Li J, Yin B. Data-aware relation learning-based graph convolution neural network for facial action unit recognition. Pattern Recognit Lett 2022. doi:10.1016/j.patrec.2022.02.010
5. Chen J, Wang C, Wang K, Liu M. Lightweight network architecture using difference saliency maps for facial action unit detection. Appl Intell 2021. doi:10.1007/s10489-021-02755-y
6. Facial action unit detection methodology with application in Brazilian sign language recognition. Pattern Anal Appl 2021. doi:10.1007/s10044-021-01024-5
7. Li Y, Huang X, Zhao G. Micro-expression action unit detection with spatial and channel attention. Neurocomputing 2021. doi:10.1016/j.neucom.2021.01.032
8. Niinuma K, Onal Ertugrul I, Cohn JF, Jeni LA. Systematic Evaluation of Design Choices for Deep Facial Action Coding Across Pose. Frontiers in Computer Science 2021. doi:10.3389/fcomp.2021.636094
Abstract
The performance of automated facial expression coding is improving steadily, and advances in deep learning techniques have been key to this success. While the advantage of modern deep learning techniques is clear, the contribution of critical design choices remains largely unknown, especially for facial action unit occurrence and intensity across pose. Using the Facial Expression Recognition and Analysis 2017 (FERA 2017) database, which provides a common protocol for evaluating robustness to pose variation, we systematically evaluated design choices in pre-training, feature alignment, model size selection, and optimizer details. Informed by the findings, we developed an architecture that exceeds the state of the art on FERA 2017, achieving a 3.5% increase in F1 score for occurrence detection and a 5.8% increase in intraclass correlation (ICC) for intensity estimation. To evaluate the generalizability of the architecture to unseen poses and new dataset domains, we performed experiments across pose in FERA 2017 and across domains in the Denver Intensity of Spontaneous Facial Action (DISFA) database and the UNBC Pain Archive.
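On the intensity metric: AU-intensity benchmarks in this literature typically score with the two-way, single-measure consistency form ICC(3,1). A minimal numpy sketch under that assumption (the function name and the choice of variant are ours, not taken from the paper):

import numpy as np

def icc_3_1(pred, truth):
    # ICC(3,1): two-way mixed, single-rater consistency between predicted
    # and ground-truth AU intensities over frames (Shrout & Fleiss).
    x = np.stack([pred, truth], axis=1).astype(float)  # n frames x k=2 raters
    n, k = x.shape
    grand = x.mean()
    bms = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)  # between frames
    jms = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)  # between raters
    ems = (((x - grand) ** 2).sum()
           - (n - 1) * bms - (k - 1) * jms) / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems)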
9. Liang X, Xu L, Liu J, Liu Z, Cheng G, Xu J, Liu L. Patch Attention Layer of Embedding Handcrafted Features in CNN for Facial Expression Recognition. Sensors (Basel) 2021; 21:833. PMID: 33513723; PMCID: PMC7865259. doi:10.3390/s21030833
Abstract
Facial expression recognition has attracted increasing attention due to its broad range of applications in human-computer interaction systems. Although facial representation is crucial to final recognition accuracy, traditional handcrafted representations reflect only shallow characteristics, and it is uncertain whether convolutional layers can extract better ones. In addition, the policy of sharing weights across a whole image is improper for structured face images. To overcome these limitations, a novel method based on patches of interest, the Patch Attention Layer (PAL) of embedding handcrafted features, is proposed to learn the local shallow facial features of each patch on face images. First, a handcrafted feature, the Gabor surface feature (GSF), is extracted by convolving the input face image with a set of predefined Gabor filters. Second, the generated feature is segmented into non-overlapping patches, so that local shallow features are captured by applying different filters to different local patches. The weighted shallow features are then fed into the remaining convolutional layers to capture high-level features. Our method can be applied directly to a static image without facial landmark information, and the preprocessing step is very simple. Experiments on four databases show that our method achieves very competitive performance (Extended Cohn-Kanade database (CK+): 98.93%; Oulu-CASIA: 97.57%; Japanese Female Facial Expressions database (JAFFE): 93.38%; and RAF-DB: 86.8%) compared to other state-of-the-art methods.
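The first two steps are easy to reproduce. The sketch below extracts a GSF with a small predefined Gabor bank and cuts it into non-overlapping patches; the bank parameters and patch size are illustrative guesses, and PAL's learned per-patch attention weights are not shown:

import cv2
import numpy as np

def gabor_surface_feature(gray, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4)):
    # Convolve a grayscale face image with a bank of predefined Gabor filters.
    bank = [cv2.getGaborKernel((9, 9), sigma=2.0, theta=t,
                               lambd=8.0, gamma=0.5, psi=0.0) for t in thetas]
    return np.stack([cv2.filter2D(gray, cv2.CV_32F, k) for k in bank])  # (F,H,W)

def split_patches(feat, patch=16):
    # Segment each filter response into non-overlapping patch x patch tiles.
    f, h, w = feat.shape
    feat = feat[:, :h - h % patch, :w - w % patch]      # crop to a multiple
    tiles = feat.reshape(f, h // patch, patch, w // patch, patch)
    return tiles.transpose(1, 3, 0, 2, 4)               # (rows, cols, F, p, p)

In PAL each tile would then receive a learned attention weight before the weighted features enter the remaining convolutional layers.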
Affiliation(s)
- Xingcan Liang, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China
- Linsen Xu, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; Anhui Province Key Laboratory of Biomimetic Sensing and Advanced Robot Technology, Hefei 230031, China
- Jinfu Liu, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
- Zhipeng Liu, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China
- Gaoxin Cheng, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China
- Jiajun Xu, University of Science and Technology of China, Hefei 230026, China
- Lei Liu, University of Science and Technology of China, Hefei 230026, China
11. Ertugrul IO, Cohn JF, Jeni LA, Zhang Z, Yin L, Ji Q. Crossing Domains for AU Coding: Perspectives, Approaches, and Measures. IEEE Transactions on Biometrics, Behavior, and Identity Science 2020; 2:158-171. PMID: 32377637; PMCID: PMC7202467. doi:10.1109/tbiom.2020.2977225
Abstract
Facial action unit (AU) detectors have performed well when trained and tested within the same domain. How well do AU detectors transfer to domains on which they have not been trained? We review the literature on cross-domain transfer and conduct experiments to address the limitations of prior research. We evaluate generalizability across four publicly available databases: EB+ (an expanded version of BP4D+), Sayette GFT, DISFA, and UNBC Shoulder Pain (SP). The databases differ in observational scenario, context, participant diversity, range of head pose, video resolution, and AU base rates. In most cases performance decreased with a change in domain, often to below the threshold needed for behavioral research, although exceptions were noted. Deep and shallow approaches generally performed similarly, with average results slightly better for the deep models than the shallow ones. Occlusion sensitivity maps revealed that local specificity was greater for AU detection within domains than across them. The findings suggest that more varied domains and deep learning approaches may be better suited to generalizability, and that more attention is needed to the characteristics that vary between domains. Until further improvement is realized, caution is warranted when applying AU classifiers from one domain to another.
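Occlusion sensitivity maps are model-agnostic and simple to compute: slide an occluder over the face and record how much the detector's score for a given AU drops. A minimal PyTorch sketch, assuming a model that maps a (1, C, H, W) image batch to per-AU scores (the interface is hypothetical, not the authors' code):

import torch

def occlusion_sensitivity(model, img, au_index, patch=16, stride=8, fill=0.0):
    # img: (C, H, W) tensor; returns a heatmap where large values mark
    # regions whose occlusion most reduces the score for one AU.
    model.eval()
    _, h, w = img.shape
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heat = torch.zeros(rows, cols)
    with torch.no_grad():
        base = model(img[None])[0, au_index].item()
        for i in range(rows):
            for j in range(cols):
                occluded = img.clone()
                y, x = i * stride, j * stride
                occluded[:, y:y + patch, x:x + patch] = fill
                heat[i, j] = base - model(occluded[None])[0, au_index].item()
    return heat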
Affiliation(s)
- Jeffrey F Cohn, Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
- László A Jeni, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Zheng Zhang, Department of Computer Science, State University of New York at Binghamton, USA
- Lijun Yin, Department of Computer Science, State University of New York at Binghamton, USA
- Qiang Ji, Rensselaer Polytechnic Institute, Troy, NY, USA
12. Ertugrul IO, Yang L, Jeni LA, Cohn JF. D-PAttNet: Dynamic Patch-Attentive Deep Network for Action Unit Detection. Frontiers in Computer Science 2019; 1:11. PMID: 31930192; PMCID: PMC6953909. doi:10.3389/fcomp.2019.00011
Abstract
Facial action units (AUs) relate to specific local facial regions. Recent efforts in automated AU detection have focused on learning facial patch representations to detect specific AUs. These efforts have encountered three hurdles. First, they implicitly assume that facial patches are robust to head rotation, yet non-frontal rotation is common. Second, mappings between AUs and patches are defined a priori, which ignores co-occurrences among AUs. Third, the dynamics of AUs are either ignored or modeled sequentially rather than simultaneously, as in human perception. Inspired by recent advances in human perception, we propose a dynamic patch-attentive deep network, called D-PAttNet, for AU detection that (i) controls for 3D head and face rotation, (ii) learns mappings of patches to AUs, and (iii) models spatiotemporal dynamics. The D-PAttNet approach significantly improves upon the existing state of the art.
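To make the patch-attention idea concrete, here is a toy PyTorch module that encodes fixed face patches, weights them with learned attention, and models dynamics with a GRU. It is a deliberately simplified stand-in: the published D-PAttNet registers the 3D face, uses CNN patch encoders, and models dynamics differently.

import torch
import torch.nn as nn

class PatchAttentiveAU(nn.Module):
    # Toy patch-attentive AU detector over a sequence of cropped patches.
    # patch_dim must equal ph * pw of the incoming patches.
    def __init__(self, patch_dim=24 * 24, hid=64, n_aus=12):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(patch_dim, hid), nn.ReLU())
        self.att = nn.Linear(hid, 1)                   # one score per patch
        self.gru = nn.GRU(hid, hid, batch_first=True)  # temporal dynamics
        self.head = nn.Linear(hid, n_aus)

    def forward(self, patches):                        # (B, T, P, ph, pw)
        b, t, p, ph, pw = patches.shape
        h = self.enc(patches.reshape(b * t, p, ph * pw))  # (B*T, P, hid)
        a = torch.softmax(self.att(h), dim=1)             # patch attention
        frame = (a * h).sum(dim=1).view(b, t, -1)         # (B, T, hid)
        out, _ = self.gru(frame)
        return torch.sigmoid(self.head(out))              # per-frame AU probs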
Affiliation(s)
- Itir Onal Ertugrul, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, United States
- Le Yang, School of Computer Science, Northwestern Polytechnical University, Xi'an, China
- László A. Jeni, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, United States
- Jeffrey F. Cohn, Department of Psychology, University of Pittsburgh, Pittsburgh, PA, United States
13. Ertugrul IO, Jeni LA, Ding W, Cohn JF. AFAR: A Deep Learning Based Tool for Automated Facial Affect Recognition. Proceedings of the IEEE International Conference on Automatic Face & Gesture Recognition 2019. PMID: 31762712. doi:10.1109/fg.2019.8756623
Affiliation(s)
- László A Jeni, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Wanqiao Ding, Department of Psychology, University of Pittsburgh, PA, USA
- Jeffrey F Cohn, Department of Psychology, University of Pittsburgh, PA, USA
14. Leo M, Carcagnì P, Distante C, Spagnolo P, Mazzeo PL, Rosato AC, Petrocchi S, Pellegrino C, Levante A, De Lumè F, Lecciso F. Computational Assessment of Facial Expression Production in ASD Children. Sensors (Basel) 2018; 18:E3993. PMID: 30453518; PMCID: PMC6263710. doi:10.3390/s18113993
Abstract
In this paper, a computational approach is proposed and put into practice to assess the ability of children diagnosed with Autism Spectrum Disorder (ASD) to produce facial expressions. The proposed approach is based on computer vision components that work on sequences of images acquired by an off-the-shelf camera in unconstrained conditions. Action unit intensities are estimated by analyzing local appearance, and the gathered estimates are then regularized using temporal and geometrical relationships learned by convolutional neural networks. To cope with stereotyped movements and to highlight even subtle voluntary movements of the facial muscles, a personalized, contextual statistical model of the non-emotional face is formulated and used as a reference. Experimental results demonstrate how the proposed pipeline improves the analysis of facial expressions produced by ASD children. A comparison of the system's outputs with evaluations performed by psychologists on the same group of ASD children makes evident how this quantitative analysis of the children's abilities goes beyond traditional qualitative ASD assessment/diagnosis protocols, whose outcomes are affected by human limitations in observing and understanding multi-cue behaviors such as facial expressions.
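The personalized reference can be illustrated very simply: estimate each child's AU statistics on neutral ("non-emotional") frames, then score expression frames as deviations from that baseline. A minimal numpy sketch under that reading (the array layout and function name are ours, not the paper's):

import numpy as np

def personalized_deviation(neutral_au, expr_au):
    # neutral_au, expr_au: (n_frames, n_aus) AU-intensity arrays for one child.
    # Z-scoring against the child's own neutral statistics highlights subtle
    # voluntary movements while discounting idiosyncratic resting activity.
    mu = neutral_au.mean(axis=0)            # per-AU mean at rest
    sd = neutral_au.std(axis=0) + 1e-6      # per-AU spread at rest
    return (expr_au - mu) / sd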
Affiliation(s)
- Marco Leo, Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Pierluigi Carcagnì, Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Cosimo Distante, Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Paolo Spagnolo, Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Pier Luigi Mazzeo, Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Serena Petrocchi, Institute of Communication and Health, USI, Via Buffi 6, 6900 Lugano, Switzerland
- Annalisa Levante, Dipartimento di Storia, Società e Studi sull'Uomo, University of Salento, Studium 2000, Edificio 5, Via di Valesio, 73100 Lecce, Italy
- Filomena De Lumè, Dipartimento di Storia, Società e Studi sull'Uomo, University of Salento, Studium 2000, Edificio 5, Via di Valesio, 73100 Lecce, Italy
- Flavia Lecciso, Dipartimento di Storia, Società e Studi sull'Uomo, University of Salento, Studium 2000, Edificio 5, Via di Valesio, 73100 Lecce, Italy