1. Zaharieva MS, Salvadori EA, Messinger DS, Visser I, Colonnesi C. Automated facial expression measurement in a longitudinal sample of 4- and 8-month-olds: Baby FaceReader 9 and manual coding of affective expressions. Behav Res Methods 2024. PMID: 38273072. DOI: 10.3758/s13428-023-02301-3.
Abstract
Facial expressions are among the earliest behaviors infants use to express emotional states, and are crucial to preverbal social interaction. Manual coding of infant facial expressions, however, is laborious and poses limitations to replicability. Recent developments in computer vision have advanced automated facial expression analyses in adults, providing reproducible results at lower time investment. Baby FaceReader 9 is commercially available software for automated measurement of infant facial expressions, but has received little validation. We compared Baby FaceReader 9 output to manual micro-coding of positive, negative, or neutral facial expressions in a longitudinal dataset of 58 infants at 4 and 8 months of age during naturalistic face-to-face interactions with the mother, father, and an unfamiliar adult. Baby FaceReader 9's global emotional valence formula yielded reasonable classification accuracy (AUC = .81) for discriminating manually coded positive from negative/neutral facial expressions; however, the discrimination of negative from neutral facial expressions was not reliable (AUC = .58). Automatically detected a priori action unit (AU) configurations for distinguishing positive from negative facial expressions based on existing literature were also not reliable. A parsimonious approach using only automatically detected smiling (AU12) yielded good performance for discriminating positive from negative/neutral facial expressions (AUC = .86). Likewise, automatically detected brow lowering (AU3+AU4) reliably distinguished neutral from negative facial expressions (AUC = .79). These results provide initial support for the use of selected automatically detected individual facial actions to index positive and negative affect in young infants, but shed doubt on the accuracy of complex a priori formulas.
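The ROC analyses reported above reduce, in their simplest form, to scoring each frame by an automatically detected AU intensity and computing AUC against the manual codes. The sketch below illustrates that idea only; it is not the authors' pipeline, and the file name and column names ("infant_frames.csv", "AU12", "manual_code") are hypothetical placeholders.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

# Hypothetical per-frame export: automated AU intensities plus manual expression codes.
frames = pd.read_csv("infant_frames.csv")
y_true = (frames["manual_code"] == "positive").astype(int)   # positive vs. negative/neutral
auc = roc_auc_score(y_true, frames["AU12"])                   # AU12 (smile) intensity as the score
print(f"AUC, positive vs. negative/neutral, AU12 only: {auc:.2f}")
```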
Affiliation(s)
- Martina S Zaharieva
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Nieuwe Achtergracht 129b, 1001 NK, Amsterdam, The Netherlands
- Developmental Psychopathology Unit, Research Institute of Child Development and Education, Faculty of Social and Behavioural Sciences, University of Amsterdam, Nieuwe Achtergracht 129b, 1001 NK, Amsterdam, The Netherlands
- Yield, Research Priority Area, University of Amsterdam, Amsterdam, The Netherlands
- Eliala A Salvadori
- Developmental Psychopathology Unit, Research Institute of Child Development and Education, Faculty of Social and Behavioural Sciences, University of Amsterdam, Nieuwe Achtergracht 129b, 1001 NK, Amsterdam, The Netherlands
- Yield, Research Priority Area, University of Amsterdam, Amsterdam, The Netherlands
- Daniel S Messinger
- Department of Psychology, University of Miami, Coral Gables, FL, USA
- Department of Pediatrics, University of Miami, Coral Gables, FL, USA
- Department of Music Engineering, University of Miami, Coral Gables, FL, USA
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA
- Ingmar Visser
- Department of Developmental Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, Nieuwe Achtergracht 129b, 1001 NK, Amsterdam, The Netherlands
- Yield, Research Priority Area, University of Amsterdam, Amsterdam, The Netherlands
- Cristina Colonnesi
- Developmental Psychopathology Unit, Research Institute of Child Development and Education, Faculty of Social and Behavioural Sciences, University of Amsterdam, Nieuwe Achtergracht 129b, 1001 NK, Amsterdam, The Netherlands
- Yield, Research Priority Area, University of Amsterdam, Amsterdam, The Netherlands
2. Gwak S, Park K. Designing Effective Visual Feedback for Facial Rehabilitation Exercises: Investigating the Role of Shape, Transparency, and Age on User Experience. Healthcare (Basel) 2023; 11:1835. PMID: 37444669. DOI: 10.3390/healthcare11131835.
Abstract
Facial expression recognition technology has been utilized both for entertainment purposes and as a valuable aid in rehabilitation and facial exercise assistance. This technology leverages artificial intelligence models to predict facial landmark points and provide visual feedback, thereby facilitating users' facial movements. However, feedback designs that disregard user preferences may cause discomfort and diminish the benefits of exercise. This study aimed to develop a feedback design guide for facial rehabilitation exercises by investigating user responses to various feedback design methods. We created a facial recognition mobile application and designed six feedback variations based on shape and transparency. To evaluate user experience, we conducted a usability test involving 48 participants (24 in their 20s and 24 over 60 years of age), assessing factors such as feedback, assistance, disturbance, aesthetics, cognitive ease, and appropriateness. The experimental results revealed significant effects of transparency, age, and the transparency-by-age interaction. Consequently, it is essential to consider both transparency and user age when designing facial recognition feedback. The findings of this study could potentially inform the design of more effective and personalized visual feedback for facial motion, ultimately benefiting users in rehabilitation and exercise contexts.
Affiliation(s)
- Sojung Gwak
- Department of Artificial Intelligence Applications, Kwangwoon University, Seoul 01897, Republic of Korea
- Kyudong Park
- Department of Artificial Intelligence Applications, Kwangwoon University, Seoul 01897, Republic of Korea
- School of Information Convergence, Kwangwoon University, Seoul 01897, Republic of Korea
3. Lukowicz P, Nijholt A, Siddiqi K, Pelillo M, Laerhoven KV, Viganò L, Zannone N. Editorial: 2021 editors' pick: Computer science. Frontiers in Computer Science 2022. DOI: 10.3389/fcomp.2022.1062066.
4. Wörtwein T, Sheeber LB, Allen N, Cohn JF, Morency LP. Human-Guided Modality Informativeness for Affective States. Proceedings of the ACM International Conference on Multimodal Interaction (ICMI) 2021; 2021:728-734. PMID: 35128550. PMCID: PMC8812829. DOI: 10.1145/3462244.3481004.
Abstract
This paper studies the hypothesis that not all modalities are always needed to predict affective states. We explore this hypothesis in the context of recognizing three affective states that have been linked to future onset of depression: positive, aggressive, and dysphoric. In particular, we investigate three important modalities for face-to-face conversations: the visual, language, and acoustic modalities. We first perform a human study to better understand which subset of modalities people find informative when recognizing the three affective states. As a second contribution, we explore how these human annotations can guide automatic affect recognition systems to be more interpretable while not degrading their predictive performance. Our studies show that humans can reliably annotate modality informativeness. Further, we observe that guided models significantly improve interpretability, i.e., they attend to modalities similarly to how humans rate the modality informativeness, while at the same time showing a slight increase in predictive performance.
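One way to read the guidance idea described above is as an auxiliary loss that pulls a model's modality-attention weights toward the human informativeness ratings. The following is a hedged sketch under that assumption, not the paper's implementation; the function name, arguments, and weighting are hypothetical.

```python
import torch
import torch.nn.functional as F

def guided_objective(affect_logits, affect_target, modality_attention, human_ratings,
                     guide_weight=0.1):
    """modality_attention: softmax weights over (vision, language, acoustics), shape (batch, 3).
    human_ratings: non-negative human informativeness ratings for the same modalities."""
    task_loss = F.cross_entropy(affect_logits, affect_target)     # positive / aggressive / dysphoric
    human_dist = F.normalize(human_ratings, p=1, dim=-1)          # ratings rescaled to a distribution
    guide_loss = F.kl_div(modality_attention.log(), human_dist, reduction="batchmean")
    return task_loss + guide_weight * guide_loss
```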
Affiliation(s)
- Torsten Wörtwein
- Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Nicholas Allen
- Department of Psychology, University of Oregon, Eugene, OR, USA
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
5. Niinuma K, Ertugrul IO, Cohn JF, Jeni LA. Synthetic Expressions are Better Than Real for Learning to Detect Facial Actions. IEEE Winter Conference on Applications of Computer Vision (WACV) 2021; 2021:1247-1256. PMID: 38250021. PMCID: PMC10798354. DOI: 10.1109/wacv48630.2021.00129.
Abstract
Critical obstacles in training classifiers to detect facial actions are the limited sizes of annotated video databases and the relatively low frequencies of occurrence of many actions. To address these problems, we propose an approach that makes use of facial expression generation. Our approach reconstructs the 3D shape of the face from each video frame, aligns the 3D mesh to a canonical view, and then trains a GAN-based network to synthesize novel images with facial action units of interest. To evaluate this approach, a deep neural network was trained on two separate datasets: One network was trained on video of synthesized facial expressions generated from FERA17; the other network was trained on unaltered video from the same database. Both networks used the same train and validation partitions and were tested on the test partition of actual video from FERA17. The network trained on synthesized facial expressions outperformed the one trained on actual facial expressions and surpassed current state-of-the-art approaches.
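The comparison protocol above amounts to training two identical AU detectors, one on synthesized and one on unaltered frames, and scoring both on the same real test partition. A minimal sketch of that protocol follows; the ResNet-18 backbone and the number of AU outputs are stand-in assumptions, not the paper's network or its GAN-based synthesis pipeline.

```python
import torch
import torch.nn as nn
from torchvision import models

def make_au_detector(num_aus: int = 10) -> nn.Module:
    """Multi-label AU occurrence detector; backbone and output count are illustrative."""
    net = models.resnet18(weights=None)
    net.fc = nn.Linear(net.fc.in_features, num_aus)
    return net

detector_synthetic = make_au_detector()   # to be trained on GAN-synthesized FERA17 frames
detector_real = make_au_detector()        # to be trained on unaltered FERA17 frames
criterion = nn.BCEWithLogitsLoss()        # per-AU occurrence loss
# Train both with identical train/validation partitions, then compare F1 on the real test partition.
```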
6.
7. ML-DCNNet: Multi-level Deep Convolutional Neural Network for Facial Expression Recognition and Intensity Estimation. Arabian Journal for Science and Engineering 2020. DOI: 10.1007/s13369-020-04811-0.
8. Krithika LB, Priya GGL. Graph based feature extraction and hybrid classification approach for facial expression recognition. Journal of Ambient Intelligence and Humanized Computing 2020; 12:2131-2147. PMID: 32837594. PMCID: PMC7359439. DOI: 10.1007/s12652-020-02311-5.
Abstract
Facial expression recognition has attracted considerable research interest in recent years. Several algorithms have been proposed for recognizing facial expressions, but many suffer from inaccurate recognition. To overcome this issue, a Graph-based Feature Extraction and Hybrid Classification Approach (GFE-HCA) is proposed for recognizing facial expressions, with the aim of recognizing human emotions effectively. Initially, the face is detected using the Viola-Jones algorithm. Subsequently, the facial parts (right eye, left eye, nose, and mouth) are extracted from the detected face image, and edge-based invariant transform features are computed from these parts. The dimensionality of the edge-based invariant features is then reduced using a Weighted Visibility Graph, which produces graph-based features; shape- and appearance-based features are also extracted from the facial parts. From these features, facial expressions are recognized and classified using a Self-Organizing-Map-based neural network classifier. The performance of the GFE-HCA approach is evaluated and compared with existing techniques, and its superiority is demonstrated by an increased recognition rate.
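The first stages of the pipeline (Viola-Jones face detection followed by extraction of facial parts) can be sketched with OpenCV's bundled Haar cascades, as below. This is only an illustration of those stages under assumptions; the graph-based features, the Weighted Visibility Graph, and the SOM classifier of the paper are not reproduced, and the input file name is hypothetical.

```python
import cv2

# OpenCV's Haar cascades implement Viola-Jones style detection.
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

img = cv2.imread("face.jpg")                              # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
    face = gray[y:y + h, x:x + w]                         # detected face region
    eyes = eye_cascade.detectMultiScale(face)             # candidate eye regions within the face
    mouth = face[int(0.6 * h):, :]                        # rough lower-face crop as a mouth proxy
```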
Affiliation(s)
- L. B. Krithika
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
- G. G. Lakshmi Priya
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
9. Cheng D, Liu D, Philpotts LL, Turner DP, Houle TT, Chen L, Zhang M, Yang J, Zhang W, Deng H. Current state of science in machine learning methods for automatic infant pain evaluation using facial expression information: study protocol of a systematic review and meta-analysis. BMJ Open 2019; 9:e030482. PMID: 31831532. PMCID: PMC6924806. DOI: 10.1136/bmjopen-2019-030482.
Abstract
INTRODUCTION: Infants can experience pain similarly to adults, and improperly controlled pain stimuli could have a long-term adverse impact on their cognitive and neurological development. The biggest challenge in achieving good infant pain control is obtaining an objective pain assessment when direct communication is lacking. For years, computer scientists have developed many different facial expression-centred machine learning (ML) methods for automatic infant pain assessment. Many of these ML algorithms showed rather satisfactory performance and have demonstrated good potential to be further enhanced for implementation in real-world clinical settings. To date, no prior research has systematically summarised and compared the performance of these ML algorithms. Our proposed meta-analysis will provide the first comprehensive evidence on this topic to guide further ML algorithm development and clinical implementation.
METHODS AND ANALYSIS: We will search four major public electronic medical and computer science databases, including Web of Science, PubMed, Embase and IEEE Xplore Digital Library, from January 2008 to present. All articles will be imported into the Covidence platform for study eligibility screening and inclusion. Study-level extracted data will be stored in the Systematic Review Data Repository online platform. The primary outcome will be the prediction accuracy of the ML model. The secondary outcomes will be model utility measures including generalisability, interpretability and computational efficiency. All extracted outcome data will be imported into RevMan V.5.2.1 and R V3.3.2 for analysis. Risk of bias will be summarised using the latest Prediction Model Study Risk of Bias Assessment Tool.
ETHICS AND DISSEMINATION: This systematic review and meta-analysis will only use study-level data from public databases; thus, formal ethical approval is not required. The results will be disseminated in the form of an official publication in a peer-reviewed journal and/or presentation at relevant conferences.
PROSPERO REGISTRATION NUMBER: CRD42019118784.
Affiliation(s)
- Dan Cheng
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Dianbo Liu
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Dana P Turner
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Timothy T Houle
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Lucy Chen
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Miaomiao Zhang
- Department of Engineering, University of Virginia, Charlottesville, Virginia, USA
- Jianjun Yang
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Wei Zhang
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Hao Deng
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- DrPH Program, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
10. Niinuma K, Jeni LA, Ertugrul IO, Cohn JF. Unmasking the Devil in the Details: What Works for Deep Facial Action Coding? Proceedings of the British Machine Vision Conference (BMVC) 2019; 2019:4. PMID: 32510058. PMCID: PMC7274256.
Abstract
The performance of automated facial expression coding has improved steadily, as evidenced by the results of the latest Facial Expression Recognition and Analysis (FERA 2017) Challenge. Advances in deep learning techniques have been key to this success. Yet the contribution of critical design choices remains largely unknown. Using the FERA 2017 database, we systematically evaluated design choices in pre-training, feature alignment, model size selection, and optimizer details. Our findings range from the counter-intuitive (e.g., generic pre-training outperformed face-specific models) to best practices in tuning optimizers. Informed by what we found, we developed an architecture that exceeded the state of the art on FERA 2017: a 3.5% increase in F1 score for occurrence detection and a 5.8% increase in ICC for intensity estimation.
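One of the design choices examined above, generic rather than face-specific pre-training, boils down to starting from an ImageNet-pre-trained backbone and fine-tuning it for AU coding. The sketch below illustrates that choice only; the backbone, the number of AU outputs, and the optimizer settings are assumptions, not the architecture that exceeded the state of the art on FERA 2017.

```python
import torch
import torch.nn as nn
from torchvision import models

# Generic (ImageNet) pre-training; replace the classification head with AU outputs.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = nn.Linear(backbone.fc.in_features, 10)       # 10 AU occurrence outputs (illustrative)
optimizer = torch.optim.AdamW(backbone.parameters(), lr=1e-4, weight_decay=1e-2)
loss_fn = nn.BCEWithLogitsLoss()                            # multi-label AU occurrence detection
```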
Affiliation(s)
- Laszlo A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Jeffrey F Cohn
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
11. Leo M, Carcagnì P, Distante C, Spagnolo P, Mazzeo PL, Rosato AC, Petrocchi S, Pellegrino C, Levante A, De Lumè F, Lecciso F. Computational Assessment of Facial Expression Production in ASD Children. Sensors (Basel) 2018; 18:E3993. PMID: 30453518. PMCID: PMC6263710. DOI: 10.3390/s18113993.
Abstract
In this paper, a computational approach is proposed and put into practice to assess the capability of children diagnosed with Autism Spectrum Disorder (ASD) to produce facial expressions. The proposed approach is based on computer vision components working on sequences of images acquired by an off-the-shelf camera in unconstrained conditions. Action unit intensities are estimated by analyzing local appearance, and then both temporal and geometrical relationships, learned by convolutional neural networks, are exploited to regularize the gathered estimates. To cope with stereotyped movements and to highlight even subtle voluntary movements of facial muscles, a personalized and contextual statistical model of the non-emotional face is formulated and used as a reference. Experimental results demonstrate how the proposed pipeline can improve the analysis of facial expressions produced by ASD children. A comparison of the system's outputs with evaluations performed by psychologists on the same group of children shows how quantitative analysis of children's abilities can go beyond traditional qualitative ASD assessment and diagnosis protocols, whose outcomes are limited by human observers' capacity to perceive and interpret multi-cue behaviors such as facial expressions.
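The personalized non-emotional reference described above can be pictured, in its simplest form, as standardizing each child's AU intensities against that child's own neutral-face statistics, so that even subtle voluntary movements stand out. The sketch below shows only that simple form; the array shapes, names, and threshold are hypothetical and do not reproduce the paper's statistical model.

```python
import numpy as np

def au_activations_vs_baseline(neutral_frames, trial_frames, z_thresh=2.0):
    """neutral_frames, trial_frames: arrays of shape (n_frames, n_action_units) of AU intensities."""
    mu = neutral_frames.mean(axis=0)
    sigma = neutral_frames.std(axis=0) + 1e-6              # avoid division by zero
    z = (trial_frames - mu) / sigma                        # deviation from the child's own neutral face
    return z > z_thresh                                    # per-frame, per-AU mask of activations
```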
Affiliation(s)
- Marco Leo
- Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Pierluigi Carcagnì
- Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Cosimo Distante
- Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Paolo Spagnolo
- Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Pier Luigi Mazzeo
- Institute of Applied Sciences and Intelligent Systems, National Research Council of Italy, via Monteroni, 73100 Lecce, Italy
- Serena Petrocchi
- Institute of Communication and Health, USI, Via Buffi 6, 6900 Lugano, Switzerland
- Annalisa Levante
- Dipartimento di Storia, Società e Studi sull'Uomo, University of Salento, Studium 2000, Edificio 5, Via di Valesio, 73100 Lecce, Italy
- Filomena De Lumè
- Dipartimento di Storia, Società e Studi sull'Uomo, University of Salento, Studium 2000, Edificio 5, Via di Valesio, 73100 Lecce, Italy
- Flavia Lecciso
- Dipartimento di Storia, Società e Studi sull'Uomo, University of Salento, Studium 2000, Edificio 5, Via di Valesio, 73100 Lecce, Italy
12. Li W, Abtahi F, Zhu Z, Yin L. EAC-Net: Deep Nets with Enhancing and Cropping for Facial Action Unit Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 2018; 40:2583-2596. PMID: 29994168. DOI: 10.1109/tpami.2018.2791608.
Abstract
In this paper, we propose a deep learning based approach for facial action unit (AU) detection by enhancing and cropping regions of interest of face images. The approach is implemented by adding two novel nets (a.k.a. layers), the enhancing layers and the cropping layers, to a pretrained convolutional neural network (CNN) model. For the enhancing layers (noted as E-Net), we design an attention map based on facial landmark features and apply it to a pretrained neural network to conduct enhanced learning. For the cropping layers (noted as C-Net), we crop facial regions around the detected landmarks and design individual convolutional layers to learn deeper features for each facial region. We then combine the E-Net and the C-Net to construct the Enhancing and Cropping Net (EAC-Net), which can learn both the feature-enhancing and region-cropping functions effectively. The EAC-Net integrates three important elements, i.e., learning transfer, attention coding, and region-of-interest processing, making our AU detection approach more efficient and more robust to changes in facial position and orientation. Our approach shows a significant performance improvement over state-of-the-art methods when tested on the BP4D and DISFA AU datasets. With a slight modification, the EAC-Net also shows potential for accurate AU intensity estimation. We have also studied the performance of the proposed EAC-Net under two very challenging conditions: (1) faces with partial occlusion and (2) faces with large head pose variations. Experimental results show that (1) the EAC-Net learns facial AU correlations effectively and predicts AUs reliably even with only half of a face visible, especially the lower half; and (2) the EAC-Net also works well under very large head poses, significantly outperforming a baseline approach. It further shows that the EAC-Net works much better without face frontalization than with frontalization through image warping as pre-processing, in terms of both computational efficiency and AU detection accuracy.
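The enhancing-layer idea, an attention map built from facial landmarks and applied to pretrained feature maps, can be sketched as below. This is only an illustration under assumptions: the map size, the Gaussian decay, and the way the map is multiplied into the features are hypothetical choices, not the paper's exact E-Net design.

```python
import torch

def landmark_attention_map(landmarks, size=28, sigma=3.0):
    """landmarks: iterable of (x, y) points already scaled to the feature-map grid."""
    ys, xs = torch.meshgrid(torch.arange(size), torch.arange(size), indexing="ij")
    attn = torch.zeros(size, size)
    for lx, ly in landmarks:
        bump = torch.exp(-((xs - lx) ** 2 + (ys - ly) ** 2) / (2 * sigma ** 2))
        attn = torch.maximum(attn, bump)                    # strongest response near any landmark
    return attn                                             # values in (0, 1]

# enhanced = feature_maps * (1 + landmark_attention_map(landmarks))   # element-wise enhancement
```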
13. Calvo MG, Fernández-Martín A, Recio G, Lundqvist D. Human Observers and Automated Assessment of Dynamic Emotional Facial Expressions: KDEF-dyn Database Validation. Front Psychol 2018; 9:2052. PMID: 30416473. PMCID: PMC6212581. DOI: 10.3389/fpsyg.2018.02052.
Abstract
Most experimental studies of facial expression processing have used static stimuli (photographs), yet facial expressions in daily life are generally dynamic. In its original photographic format, the Karolinska Directed Emotional Faces (KDEF) has been frequently utilized. In the current study, we validate a dynamic version of this database, the KDEF-dyn. To this end, we applied animation between neutral and emotional expressions (happy, sad, angry, fearful, disgusted, and surprised; 1,033-ms unfolding) to 40 KDEF models, with morphing software. Ninety-six human observers categorized the expressions of the resulting 240 video-clip stimuli, and automated face analysis assessed the evidence for 6 expressions and 20 facial action units (AUs) at 31 intensities. Low-level image properties (luminance, signal-to-noise ratio, etc.) and other purely perceptual factors (e.g., size, unfolding speed) were controlled. Human recognition performance (accuracy, efficiency, and confusions) patterns were consistent with prior research using static and other dynamic expressions. Automated assessment of expressions and AUs was sensitive to intensity manipulations. Significant correlations emerged between human observers' categorization and automated classification. The KDEF-dyn database aims to provide a balance between experimental control and ecological validity for research on emotional facial expression processing. The stimuli and the validation data are available to the scientific community.
Affiliation(s)
- Manuel G. Calvo
- Department of Cognitive Psychology, Universidad de La Laguna, San Cristóbal de La Laguna, Spain
- Instituto Universitario de Neurociencia (IUNE), Universidad de La Laguna, Santa Cruz de Tenerife, Spain
- Guillermo Recio
- Institute of Psychology, Universität Hamburg, Hamburg, Germany
- Daniel Lundqvist
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
14. Cohn JF, Okun MS, Jeni LA, Ertugrul IO, Borton D, Malone D, Goodman WK. Automated Affect Detection in Deep Brain Stimulation for Obsessive-Compulsive Disorder: A Pilot Study. Proceedings of the ACM International Conference on Multimodal Interaction (ICMI) 2018; 2018:40-44. PMID: 30511050. PMCID: PMC6271416. DOI: 10.1145/3242969.3243023.
Abstract
Automated measurement of affective behavior in psychopathology has been limited primarily to screening and diagnosis. While such measures are useful, clinicians are more often concerned with whether patients are improving in response to treatment. Are symptoms abating, is affect becoming more positive, are unanticipated side effects emerging? When treatment includes neural implants, the need for objective, repeatable biometrics tied to neurophysiology becomes especially pressing. We used automated face analysis to assess treatment response to deep brain stimulation (DBS) in two patients with intractable obsessive-compulsive disorder (OCD). One was assessed intraoperatively following implantation and activation of the DBS device. The other was assessed three months post-implantation. Both were assessed during DBS on and off conditions. Positive and negative valence were quantified using a CNN trained on normative data from 160 non-OCD participants; thus, a secondary goal was domain transfer of the classifiers. In both contexts, DBS-on resulted in marked positive affect. In response to DBS-off, affect flattened in both contexts and alternated with increased negative affect in the outpatient setting. Mean AUC for domain transfer was 0.87. These findings suggest that parametric variation of DBS is strongly related to affective behavior and may introduce vulnerability to negative affect in the event that DBS is discontinued.
15.
16. Ertugrul IO, Jeni LA, Cohn JF. FACSCaps: Pose-Independent Facial Action Coding with Capsules. IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2018; 2018:2211-2220. PMID: 30944768. PMCID: PMC6443417. DOI: 10.1109/cvprw.2018.00287.
Abstract
Most automated facial expression analysis methods treat the face as a 2D object, flat like a sheet of paper. That works well provided images are frontal or nearly so. In real-world conditions, moderate to large head rotation is common and system performance in recognizing expressions degrades. Multi-view convolutional neural networks (CNNs) have been proposed to increase robustness to pose, but they require larger model sizes and may generalize poorly to views not included in the training set. We propose the FACSCaps architecture to handle multi-view and multi-label facial action unit (AU) detection within a single model that can generalize to novel views. Additionally, FACSCaps's ability to synthesize faces enables insights into what is learned by the model. FACSCaps models video frames using matrix capsules, in which hierarchical pose relationships between face parts are built into the internal representations. The model is trained by jointly optimizing a multi-label loss and the reconstruction accuracy. FACSCaps was evaluated using the FERA 2017 facial expression dataset, which includes spontaneous facial expressions in a wide range of head orientations. FACSCaps outperformed both state-of-the-art CNNs and their temporal extensions.
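The joint objective mentioned above, a multi-label AU loss plus a reconstruction term, can be written compactly as below. The capsule routing itself is not reproduced here, and the function name and reconstruction weighting are assumed values rather than the paper's.

```python
import torch
import torch.nn.functional as F

def joint_objective(au_logits, au_targets, reconstruction, frame, recon_weight=0.0005):
    """Multi-label AU detection loss plus frame-reconstruction accuracy."""
    au_loss = F.binary_cross_entropy_with_logits(au_logits, au_targets)   # per-AU occurrence
    recon_loss = F.mse_loss(reconstruction, frame)                        # reconstruction accuracy
    return au_loss + recon_weight * recon_loss
```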
Affiliation(s)
- László A Jeni
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
17. Corneanu CA, Simon MO, Cohn JF, Guerrero SE. Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016; 38:1548-1568. PMID: 26761193. PMCID: PMC7426891. DOI: 10.1109/tpami.2016.2515606.
Abstract
Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging, and much research is needed on how they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal, and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state-of-the-art methods accordingly. We also present the important datasets and the benchmarking of the most influential methods. We conclude with a general discussion of trends, important questions, and future lines of research.