1. Bian Y, Küster D, Liu H, Krumhuber EG. Understanding Naturalistic Facial Expressions with Deep Learning and Multimodal Large Language Models. Sensors (Basel) 2023; 24:126. [PMID: 38202988] [PMCID: PMC10781259] [DOI: 10.3390/s24010126] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0]
Abstract
This paper provides a comprehensive overview of affective computing systems for facial expression recognition (FER) research in naturalistic contexts. The first section presents an updated account of user-friendly FER toolboxes incorporating state-of-the-art deep learning models and elaborates on their neural architectures, datasets, and performances across domains. These sophisticated FER toolboxes can robustly address a variety of challenges encountered in the wild such as variations in illumination and head pose, which may otherwise impact recognition accuracy. The second section of this paper discusses multimodal large language models (MLLMs) and their potential applications in affective science. MLLMs exhibit human-level capabilities for FER and enable the quantification of various contextual variables to provide context-aware emotion inferences. These advancements have the potential to revolutionize current methodological approaches for studying the contextual influences on emotions, leading to the development of contextualized emotion models.
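To make the MLLM workflow concrete, here is a minimal sketch of requesting a context-aware emotion inference for a single image from a vision-capable chat model. This is illustrative only and not a method from the paper: the OpenAI client, the model name, and the prompt wording are all assumptions.

```python
import base64
from openai import OpenAI  # assumes the `openai` SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def infer_emotion(image_path: str) -> str:
    """Ask a multimodal LLM for a context-aware emotion inference."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable model would do
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Label the facial expression (e.g., happy, sad, angry) "
                         "and note any contextual cues that inform your inference."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content


print(infer_emotion("face_in_context.jpg"))  # hypothetical file name
```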
Affiliation(s)
- Yifan Bian
- Department of Experimental Psychology, University College London, London WC1H 0AP, UK
- Dennis Küster
- Department of Mathematics and Computer Science, University of Bremen, 28359 Bremen, Germany
- Hui Liu
- Department of Mathematics and Computer Science, University of Bremen, 28359 Bremen, Germany
- Eva G. Krumhuber
- Department of Experimental Psychology, University College London, London WC1H 0AP, UK
2. Cheong JH, Jolly E, Xie T, Byrne S, Kenney M, Chang LJ. Py-Feat: Python Facial Expression Analysis Toolbox. Affective Science 2023; 4:781-796. [PMID: 38156250] [PMCID: PMC10751270] [DOI: 10.1007/s42761-023-00191-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0]
Abstract
Studying facial expressions is a notoriously difficult endeavor. Recent advances in the field of affective computing have yielded impressive progress in automatically detecting facial expressions from pictures and videos. However, much of this work has yet to be widely disseminated in social science domains such as psychology. Current state-of-the-art models require considerable domain expertise that is not traditionally incorporated into social science training programs. Furthermore, there is a notable absence of user-friendly and open-source software that provides a comprehensive set of tools and functions to support facial expression research. In this paper, we introduce Py-Feat, an open-source Python toolbox that provides support for detecting, preprocessing, analyzing, and visualizing facial expression data. Py-Feat makes it easy for domain experts to disseminate and benchmark computer vision models, and for end users to quickly process, analyze, and visualize facial expression data. We hope this platform will facilitate increased use of facial expression data in human behavior research. Supplementary Information: The online version contains supplementary material available at 10.1007/s42761-023-00191-4.
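For readers who want to try the toolbox, a minimal sketch of the detection workflow follows. It assumes `pip install py-feat`; method names (`Detector`, `detect_image`) follow the Py-Feat documentation but may differ across versions, and the file name is a placeholder.

```python
from feat import Detector  # pip install py-feat

# Default models are used here; Py-Feat lets you swap in alternative
# face, landmark, AU, and emotion models at construction time.
detector = Detector()

# Run the full pipeline (face detection, landmarks, AUs, emotions) on one
# image. The result is a Fex data frame with one row per detected face.
result = detector.detect_image("participant_frame.jpg")

print(result.emotions)  # emotion probabilities per detected face
print(result.aus)       # action-unit activations per detected face
```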
Affiliation(s)
- Jin Hyun Cheong
- Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755 USA
- Eshin Jolly
- Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755 USA
- Tiankang Xie
- Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755 USA
- Department of Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755 USA
- Sophie Byrne
- Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755 USA
- Matthew Kenney
- Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755 USA
- Luke J. Chang
- Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755 USA
- Department of Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755 USA
3. Kim H, Küster D, Girard JM, Krumhuber EG. Human and machine recognition of dynamic and static facial expressions: prototypicality, ambiguity, and complexity. Front Psychol 2023; 14:1221081. [PMID: 37794914] [PMCID: PMC10546417] [DOI: 10.3389/fpsyg.2023.1221081] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0]
Abstract
A growing body of research suggests that movement aids facial expression recognition. However, less is known about the conditions under which the dynamic advantage occurs. The aim of this research was to test emotion recognition in static and dynamic facial expressions, thereby exploring the role of three featural parameters (prototypicality, ambiguity, and complexity) in human and machine analysis. In two studies, facial expression videos and corresponding images depicting the peak of the target and non-target emotion were presented to human observers and the machine classifier (FACET). Results revealed higher recognition rates for dynamic stimuli compared to non-target images. This benefit disappeared for target-emotion images, which were recognised as well as (or even better than) videos and were more prototypical, less ambiguous, and more complex in appearance than non-target images. While prototypicality and ambiguity exerted more predictive power in machine performance, complexity was more indicative of human emotion recognition. Interestingly, recognition performance by the machine was found to be superior to that of humans for both target and non-target images. Together, the findings point towards a compensatory role of dynamic information, particularly when static-based stimuli lack relevant features of the target emotion. Implications for research using automatic facial expression analysis (AFEA) are discussed.
Affiliation(s)
- Hyunwoo Kim
- Department of Experimental Psychology, University College London, London, United Kingdom
- Dennis Küster
- Cognitive Systems Lab, Department of Mathematics and Computer Science, University of Bremen, Bremen, Germany
- Jeffrey M. Girard
- Department of Psychology, University of Kansas, Lawrence, KS, United States
- Eva G. Krumhuber
- Department of Experimental Psychology, University College London, London, United Kingdom
4. Hama T, Koeda M. Characteristics of healthy Japanese young adults with respect to recognition of facial expressions: a preliminary study. BMC Psychol 2023; 11:237. [PMID: 37592360] [PMCID: PMC10436396] [DOI: 10.1186/s40359-023-01281-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
BACKGROUND
Emotional cognitive impairment is a core phenotype of the clinical symptoms of psychiatric disorders. The ability to measure emotional cognition is useful for assessing neurodegenerative conditions and treatment responses. However, factors such as culture, gender, and generation influence emotional recognition, and these differences require examination. We investigated the characteristics of healthy young Japanese adults with respect to facial expression recognition.
METHODS
We generated 17 models of facial expressions for each of the six basic emotions (happiness, sadness, anger, fear, disgust, and surprise) at three levels of emotional intensity using the Facial Action Coding System (FACS). Thirty healthy young Japanese adults evaluated the type of emotion and the emotional intensity that the models represented.
RESULTS
Assessment accuracy for all emotions except fear exceeded 60% in approximately half of the videos; facial expressions of fear were rarely recognized accurately. Gender differences were observed for both faces and participants: expressions on female faces were more recognizable than those on male faces, and female participants perceived facial emotions more accurately than male participants.
CONCLUSION
The videos used may constitute a dataset, with the possible exception of those representing fear. The ability to recognize the type and intensity of emotions was affected by the gender of the portrayed face and by the evaluator's gender. These gender differences must be considered when developing a scale of facial expression recognition.
Affiliation(s)
- Tomoko Hama
- Department of Medical Technology, Ehime Prefectural University of Health Sciences, 543 Takoda, Tobe-Cho, Iyo-Gun, Ehime, 791-2101, Japan
- Michihiko Koeda
- Department of Neuropsychiatry, Graduate School of Medicine, Nippon Medical School, 1-1-5, Sendagi, Bunkyo-Ku, Tokyo, 113-8603, Japan
- Department of Neuropsychiatry, Nippon Medical School Tama Nagayama Hospital, 1-7-1, Nagayama, Tama, Tokyo, 206-8512, Japan
5. Pasqualette L, Klinger S, Kulke L. Development and validation of a natural dynamic facial expression stimulus set. PLoS One 2023; 18:e0287049. [PMID: 37379278] [DOI: 10.1371/journal.pone.0287049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Emotion research commonly uses either controlled, standardised pictures or natural video stimuli to measure participants' reactions to emotional content. Natural stimulus materials can be beneficial; however, certain measures, such as neuroscientific methods, require temporally and visually controlled stimulus material. The current study aimed to create and validate video stimuli in which a model displays positive, neutral and negative expressions. These stimuli were kept as natural as possible while their timing and visual features were edited to make them suitable for neuroscientific research (e.g. EEG). The stimuli were successfully controlled with regard to their features, and the validation studies show that participants reliably classify the displayed expression correctly and perceive it as genuine. In conclusion, we present a motion stimulus set that is perceived as natural and is suitable for neuroscientific research, together with a pipeline describing successful editing methods for controlling natural stimuli.
Affiliation(s)
- Laura Pasqualette
- Neurocognitive Developmental Psychology, Psychology Department, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Bavaria, Germany
- Developmental and Educational Psychology Department, University of Bremen, Bremen, Germany
- Sara Klinger
- Neurocognitive Developmental Psychology, Psychology Department, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Bavaria, Germany
- Louisa Kulke
- Neurocognitive Developmental Psychology, Psychology Department, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Bavaria, Germany
- Developmental and Educational Psychology Department, University of Bremen, Bremen, Germany
6. Höfling TTA, Alpers GW. Automatic facial coding predicts self-report of emotion, advertisement and brand effects elicited by video commercials. Front Neurosci 2023; 17:1125983. [PMID: 37205049] [PMCID: PMC10185761] [DOI: 10.3389/fnins.2023.1125983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Introduction
Consumers' emotional responses are the prime target of marketing commercials. Facial expressions provide information about a person's emotional state, and technological advances have enabled machines to decode them automatically.
Method
Using automatic facial coding, we investigated the relationships between facial movements (i.e., action unit activity) and self-report of emotion as well as advertisement and brand effects. To that end, we recorded and analyzed the facial responses of 219 participants while they watched a broad array of video commercials.
Results
Facial expressions significantly predicted self-report of emotion as well as advertisement and brand effects. Interestingly, facial expressions had incremental value beyond self-report of emotion in the prediction of advertisement and brand effects. Hence, automatic facial coding appears useful as a non-verbal quantification of advertisement effects beyond self-report.
Discussion
This is the first study to measure a broad spectrum of automatically scored facial responses to video commercials. Automatic facial coding is a promising non-invasive and non-verbal method for measuring emotional responses in marketing.
7. Küster D, Baker M, Krumhuber EG. PDSTD - The Portsmouth Dynamic Spontaneous Tears Database. Behav Res Methods 2022; 54:2678-2692. [PMID: 34918224] [PMCID: PMC9729121] [DOI: 10.3758/s13428-021-01752-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0]
Abstract
The vast majority of research on human emotional tears has relied on posed and static stimulus materials. In this paper, we introduce the Portsmouth Dynamic Spontaneous Tears Database (PDSTD), a free resource comprising video recordings of 24 female encoders depicting a balanced representation of sadness stimuli with and without tears. Encoders watched a neutral film and a self-selected sad film and reported their emotional experience for 9 emotions. Extending this initial validation, we obtained norming data from an independent sample of naïve observers (N = 91, 45 females) who watched videos of the encoders during three time phases (neutral, pre-sadness, sadness), yielding a total of 72 validated recordings. Observers rated the expressions during each phase on 7 discrete emotions, negative and positive valence, arousal, and genuineness. All data were analyzed by means of general linear mixed modelling (GLMM) to account for sources of random variance. Our results confirm the successful elicitation of sadness, and demonstrate the presence of a tear effect, i.e., a substantial increase in perceived sadness for spontaneous dynamic weeping. To our knowledge, the PDSTD is the first database of spontaneously elicited dynamic tears and sadness that is openly available to researchers. The stimuli can be accessed free of charge via OSF from https://osf.io/uyjeg/?view_only=24474ec8d75949ccb9a8243651db0abf .
Affiliation(s)
- Dennis Küster
- Department of Mathematics and Computer Science, University of Bremen, Enrique-Schmidt Str. 5, 28359, Bremen, Germany
- Marc Baker
- Department of Psychology, University of Portsmouth, Portsmouth, UK
- Eva G Krumhuber
- Department of Experimental Psychology, University College London, London, UK
8. Automatic Identification of a Depressive State in Primary Care. Healthcare (Basel) 2022; 10:2347. [PMID: 36553871] [PMCID: PMC9777617] [DOI: 10.3390/healthcare10122347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
The Center for Epidemiologic Studies Depression Scale (CES-D) performs well in screening for depression in primary care, but its length has prompted the search for alternatives. With the popularity of social media platforms, facial movement can be recorded ecologically. Given that nonverbal behaviors, including facial movement, are associated with a depressive state, this study aimed to establish an automatic depression recognition model that can be easily used in primary healthcare. We integrated facial activities and gaze behaviors to build a machine learning model based on kernel ridge regression (KRR), and compared different algorithms and feature sets to achieve the best model. The results showed that combining facial and gaze features yielded better predictions than facial features alone. Of all the models tested, the ridge model with a periodic kernel showed the best performance, with an R-squared (R2) value of 0.43 and a Pearson correlation coefficient (r) of 0.69 (p < 0.001). The most relevant variables (e.g., gaze directions and facial action units) are also reported in the present study.
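As a rough illustration of the modelling approach described above, the sketch below fits a kernel ridge regression with a periodic (ExpSineSquared) kernel in scikit-learn and evaluates it with cross-validated predictions. The feature matrix and scores are synthetic placeholders, not the study's data.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.gaussian_process.kernels import ExpSineSquared
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_predict

# Placeholders: X would hold facial action-unit activations concatenated
# with gaze-direction features; y would hold CES-D depression scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = rng.normal(size=100)

# Kernel ridge regression with a periodic kernel, mirroring the
# "ridge model with a periodic kernel" named in the abstract.
model = KernelRidge(kernel=ExpSineSquared(length_scale=1.0, periodicity=3.0),
                    alpha=1.0)
y_pred = cross_val_predict(model, X, y, cv=5)

r, p = pearsonr(y, y_pred)
print(f"r = {r:.2f}, p = {p:.3f}")
```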
9. Dawel A, Miller EJ, Horsburgh A, Ford P. A systematic survey of face stimuli used in psychological research 2000-2020. Behav Res Methods 2022; 54:1889-1901. [PMID: 34731426] [DOI: 10.3758/s13428-021-01705-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0]
Abstract
For decades, psychology has relied on highly standardized images to understand how people respond to faces. Many of these stimuli are rigorously generated and supported by excellent normative data; as such, they have played an important role in the development of face science. However, there is now clear evidence that testing with ambient images (i.e., naturalistic images "in the wild") and including expressions that are spontaneous can lead to new and important insights. To precisely quantify the extent to which our current knowledge base has relied on standardized and posed stimuli, we systematically surveyed the face stimuli used in 12 key journals in this field across 2000-2020 (N = 3374 articles). Although a small number of posed expression databases continue to dominate the literature, the use of spontaneous expressions seems to be increasing. However, there has been no increase in the use of ambient or dynamic stimuli over time. The vast majority of articles have used highly standardized and nonmoving pictures of faces. An emerging trend is that virtual faces are being used as stand-ins for human faces in research. Overall, the results of the present survey highlight that there has been a significant imbalance in favor of standardized face stimuli. We argue that psychology would benefit from a more balanced approach because ambient and spontaneous stimuli have much to offer. We advocate a cognitive ethological approach that involves studying face processing in natural settings as well as the lab, incorporating more stimuli from "the wild".
Affiliation(s)
- Amy Dawel
- Research School of Psychology (building 39), The Australian National University, Canberra, ACT 2600, Australia
- Elizabeth J Miller
- Research School of Psychology (building 39), The Australian National University, Canberra, ACT 2600, Australia
- Annabel Horsburgh
- Research School of Psychology (building 39), The Australian National University, Canberra, ACT 2600, Australia
- Patrice Ford
- Research School of Psychology (building 39), The Australian National University, Canberra, ACT 2600, Australia
10. Monteith S, Glenn T, Geddes J, Whybrow PC, Bauer M. Commercial Use of Emotion Artificial Intelligence (AI): Implications for Psychiatry. Curr Psychiatry Rep 2022; 24:203-211. [PMID: 35212918] [DOI: 10.1007/s11920-022-01330-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5]
Abstract
PURPOSE OF REVIEW
Emotion artificial intelligence (AI) is technology for emotion detection and recognition. Emotion AI is expanding rapidly in commercial and government settings outside of medicine, and will increasingly become a routine part of daily life. The goal of this narrative review is to increase awareness both of the widespread use of emotion AI, and of the concerns with commercial use of emotion AI in relation to people with mental illness.
RECENT FINDINGS
This paper discusses emotion AI fundamentals, a general overview of commercial emotion AI outside of medicine, and examples of the use of emotion AI in employee hiring and workplace monitoring. The successful re-integration of patients with mental illness into society must recognize the increasing commercial use of emotion AI. There are concerns that commercial use of emotion AI will increase stigma and discrimination, and have negative consequences in daily life for people with mental illness. Commercial emotion AI algorithm predictions about mental illness should not be treated as medical fact.
Affiliation(s)
- Scott Monteith
- Michigan State University College of Human Medicine, Traverse City Campus, 1400 Medical Campus Drive, Traverse City, MI, 49684, USA
- Tasha Glenn
- ChronoRecord Association, Fullerton, CA, USA
- John Geddes
- Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, UK
- Peter C Whybrow
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles (UCLA), Los Angeles, CA, USA
- Michael Bauer
- Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus Medical Faculty, Technische Universität Dresden, Dresden, Germany
11. Sato W, Namba S, Yang D, Nishida S, Ishi C, Minato T. An Android for Emotional Interaction: Spatiotemporal Validation of Its Facial Expressions. Front Psychol 2022; 12:800657. [PMID: 35185697] [PMCID: PMC8855677] [DOI: 10.3389/fpsyg.2021.800657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Android robots capable of emotional interactions with humans have considerable potential for application to research. While several studies developed androids that can exhibit human-like emotional facial expressions, few have empirically validated androids' facial expressions. To investigate this issue, we developed an android head called Nikola based on human psychology and conducted three studies to test the validity of its facial expressions. In Study 1, Nikola produced single facial actions, which were evaluated in accordance with the Facial Action Coding System. The results showed that 17 action units were appropriately produced. In Study 2, Nikola produced the prototypical facial expressions for six basic emotions (anger, disgust, fear, happiness, sadness, and surprise), and naïve participants labeled photographs of the expressions. The recognition accuracy of all emotions was higher than chance level. In Study 3, Nikola produced dynamic facial expressions for six basic emotions at four different speeds, and naïve participants evaluated the naturalness of the speed of each expression. The effect of speed differed across emotions, as in previous studies of human expressions. These data validate the spatial and temporal patterns of Nikola's emotional facial expressions, and suggest that it may be useful for future psychological studies and real-life applications.
Affiliation(s)
- Wataru Sato
- Psychological Process Research Team, Guardian Robot Project, RIKEN, Kyoto, Japan
- Field Science Education and Research Center, Kyoto University, Kyoto, Japan
- Shushi Namba
- Psychological Process Research Team, Guardian Robot Project, RIKEN, Kyoto, Japan
- Dongsheng Yang
- Graduate School of Informatics, Kyoto University, Kyoto, Japan
- Shin’ya Nishida
- Graduate School of Informatics, Kyoto University, Kyoto, Japan
- NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Atsugi, Japan
- Carlos Ishi
- Interactive Robot Research Team, Guardian Robot Project, RIKEN, Kyoto, Japan
- Takashi Minato
- Interactive Robot Research Team, Guardian Robot Project, RIKEN, Kyoto, Japan
12. Comparing self-reported emotions and facial expressions of joy in heterosexual romantic couples. Personality and Individual Differences 2022. [DOI: 10.1016/j.paid.2021.111182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
13. Fernández Carbonell M, Boman M, Laukka P. Comparing supervised and unsupervised approaches to multimodal emotion recognition. PeerJ Comput Sci 2021; 7:e804. [PMID: 35036530] [PMCID: PMC8725659] [DOI: 10.7717/peerj-cs.804] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3]
Abstract
We investigated emotion classification from brief video recordings from the GEMEP database wherein actors portrayed 18 emotions. Vocal features consisted of acoustic parameters related to frequency, intensity, spectral distribution, and durations. Facial features consisted of facial action units. We first performed a series of person-independent supervised classification experiments. Best performance (AUC = 0.88) was obtained by merging the output from the best unimodal vocal (Elastic Net, AUC = 0.82) and facial (Random Forest, AUC = 0.80) classifiers using a late fusion approach and the product rule method. All 18 emotions were recognized with above-chance recall, although recognition rates varied widely across emotions (e.g., high for amusement, anger, and disgust; and low for shame). Multimodal feature patterns for each emotion are described in terms of the vocal and facial features that contributed most to classifier performance. Next, a series of exploratory unsupervised classification experiments were performed to gain more insight into how emotion expressions are organized. Solutions from traditional clustering techniques were interpreted using decision trees in order to explore which features underlie clustering. Another approach utilized various dimensionality reduction techniques paired with inspection of data visualizations. Unsupervised methods did not cluster stimuli in terms of emotion categories, but several explanatory patterns were observed. Some could be interpreted in terms of valence and arousal, but actor and gender specific aspects also contributed to clustering. Identifying explanatory patterns holds great potential as a meta-heuristic when unsupervised methods are used in complex classification tasks.
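The late-fusion step with the product rule can be illustrated in a few lines of scikit-learn. In this sketch, an elastic-net logistic regression stands in for the vocal classifier and a random forest for the facial classifier; the unimodal class probabilities are multiplied and renormalised. The synthetic feature matrices are placeholders for the GEMEP acoustic and action-unit features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for vocal and facial feature sets (same samples, same labels).
X_voc, y = make_classification(n_samples=400, n_features=30, random_state=0)
X_fac, _ = make_classification(n_samples=400, n_features=20, random_state=1)

Xv_tr, Xv_te, Xf_tr, Xf_te, y_tr, y_te = train_test_split(
    X_voc, X_fac, y, test_size=0.3, random_state=0)

vocal = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, max_iter=5000).fit(Xv_tr, y_tr)
facial = RandomForestClassifier(random_state=0).fit(Xf_tr, y_tr)

# Late fusion via the product rule: multiply unimodal class probabilities,
# then renormalise so each row sums to one.
proba = vocal.predict_proba(Xv_te) * facial.predict_proba(Xf_te)
proba /= proba.sum(axis=1, keepdims=True)

print("fused AUC:", roc_auc_score(y_te, proba[:, 1]))
```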
Affiliation(s)
- Marcos Fernández Carbonell
- Department of Software and Computer Systems, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Magnus Boman
- Department of Software and Computer Systems, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Learning, Informatics, Management and Ethics (LIME), Karolinska Institutet, Stockholm, Sweden
- Petri Laukka
- Department of Psychology, Stockholm University, Stockholm, Sweden
14. Tejada J, Freitag RMK, Pinheiro BFM, Cardoso PB, Souza VRA, Silva LS. Building and validation of a set of facial expression images to detect emotions: a transcultural study. Psychological Research 2021; 86:1996-2006. [PMID: 34652530] [DOI: 10.1007/s00426-021-01605-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3]
Abstract
Automatic emotion recognition from facial expressions has become an exceptional tool in research involving human subjects, making it possible to obtain objective measurements of participants' emotional states. Various software and commercial solutions are offered for this task. However, adaptation to the cultural context and the recognition of complex expressions and/or emotions are two of the main challenges these solutions face. Here, we describe the construction and validation of a set of facial expression images suitable for training a recognition system. Our datasets consist of images of people with no acting experience who were recorded with a webcam as they performed a computer-assisted task in a room with a light background and overhead illumination. The six basic emotions and mockery were included, and a combination of the OpenCV, Dlib and Scikit-learn Python libraries was used to develop a support vector machine classifier. The code is available on GitHub and the images will be provided upon request. Since transcultural facial expressions for evaluating complex emotions and open-source solutions were used in this study, we strongly believe that our dataset will be useful in different research contexts.
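A minimal sketch of the classifier pipeline named in the abstract (Dlib-style landmark features fed to a scikit-learn support vector machine) is shown below. The landmark coordinates and labels are randomly generated placeholders; the authors' actual code is on GitHub.

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder features: flattened (x, y) coordinates of 68 Dlib face
# landmarks per image; labels cover the six basic emotions plus mockery.
rng = np.random.default_rng(0)
X = rng.normal(size=(700, 68 * 2))
y = rng.choice(["happiness", "sadness", "anger", "fear",
                "disgust", "surprise", "mockery"], size=700)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)  # support vector machine classifier
print(classification_report(y_te, clf.predict(X_te)))
```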
Affiliation(s)
- Julian Tejada
- Departamento de Psicologia, Universidade Federal de Sergipe, São Cristóvão, Brazil
- Facultad de Psicología, Fundación Universitaria Konrad Lorenz, Bogotá, Colombia
- Lucas Santos Silva
- Departamento de Letras, Universidade Federal de Sergipe, São Cristóvão, Brazil
15. Müller SR, Chen XL, Peters H, Chaintreau A, Matz SC. Depression predictions from GPS-based mobility do not generalize well to large demographically heterogeneous samples. Sci Rep 2021; 11:14007. [PMID: 34234186] [PMCID: PMC8263566] [DOI: 10.1038/s41598-021-93087-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7]
Abstract
Depression is one of the most common mental health issues in the United States, affecting the lives of millions of people suffering from it as well as those close to them. Recent advances in research on mobile sensing technologies and machine learning have suggested that a person's depression can be passively measured by observing patterns in people's mobility behaviors. However, the majority of work in this area has relied on highly homogeneous samples, most frequently college students. In this study, we analyse over 57 million GPS data points to show that the same procedure that leads to high prediction accuracy in a homogeneous student sample (N = 57; AUC = 0.82), leads to accuracies only slightly higher than chance in a U.S.-wide sample that is heterogeneous in its socio-demographic composition as well as mobility patterns (N = 5,262; AUC = 0.57). This pattern holds across three different modelling approaches which consider both linear and non-linear relationships. Further analyses suggest that the prediction accuracy is low across different socio-demographic groups, and that training the models on more homogeneous subsamples does not substantially improve prediction accuracy. Overall, the findings highlight the challenge of applying mobility-based predictions of depression at scale.
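The evaluation logic (training on a homogeneous sample and testing on a heterogeneous one) can be sketched as follows. The mobility features and depression labels are synthetic placeholders, and logistic regression stands in for the study's three modelling approaches.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Placeholder mobility features (e.g., location entropy, distance travelled)
# for a homogeneous student sample and a demographically diverse sample.
X_students, y_students = rng.normal(size=(57, 10)), rng.integers(0, 2, 57)
X_us, y_us = rng.normal(size=(5262, 10)), rng.integers(0, 2, 5262)

model = LogisticRegression(max_iter=1000).fit(X_students, y_students)

# Within-sample AUC (optimistic) vs. AUC on the heterogeneous sample:
print("student AUC:", roc_auc_score(y_students, model.predict_proba(X_students)[:, 1]))
print("US-wide AUC:", roc_auc_score(y_us, model.predict_proba(X_us)[:, 1]))
```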
Affiliation(s)
- Sandrine R Müller
- Data Science Institute, Columbia University, New York, USA
- Department of Psychology, Bielefeld University, Bielefeld, Germany
- Xi Leslie Chen
- Computer Science Department, Columbia University, New York, USA
- Heinrich Peters
- Columbia Business School, Columbia University, New York, USA
- Sandra C Matz
- Columbia Business School, Columbia University, New York, USA
16. Namba S, Sato W, Osumi M, Shimokawa K. Assessing Automated Facial Action Unit Detection Systems for Analyzing Cross-Domain Facial Expression Databases. Sensors (Basel) 2021; 21:4222. [PMID: 34203007] [PMCID: PMC8235167] [DOI: 10.3390/s21124222] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7]
Abstract
In the field of affective computing, accurate automatic detection of facial movements is an important issue, and great progress has already been made. However, a systematic evaluation of systems on dynamic facial databases remains an unmet need. This study compared the performance of three systems (FaceReader, OpenFace, AFAR toolbox) in detecting facial movements corresponding to action units (AUs) derived from the Facial Action Coding System. All three systems detected the presence of AUs in the dynamic facial database at above-chance levels. Moreover, OpenFace and AFAR yielded higher values for the area under the receiver operating characteristic curve than FaceReader. In addition, several confusion biases between facial components (e.g., AU12 and AU14) were observed for each automated AU detection system, and static mode was superior to dynamic mode for analyzing the posed facial database. These findings characterize the prediction patterns of each system and provide guidance for research on facial expressions.
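Benchmarking detectors per action unit reduces to computing an AUC for each AU column, as in the sketch below. The ground-truth codes and detector scores are simulated placeholders rather than outputs of the three systems.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Per-AU evaluation of an automated detector against manual FACS coding.
# Rows = video frames, columns = action units; values are placeholders.
aus = ["AU01", "AU04", "AU06", "AU12", "AU14"]
rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=(500, len(aus)))          # manual binary codes
scores = truth * 0.6 + rng.random((500, len(aus))) * 0.4  # detector confidences

for i, au in enumerate(aus):
    print(au, f"AUC = {roc_auc_score(truth[:, i], scores[:, i]):.2f}")
```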
Affiliation(s)
- Shushi Namba
- Psychological Process Team, BZP, Robotics Project, RIKEN, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 6190288, Japan
- Wataru Sato
- Psychological Process Team, BZP, Robotics Project, RIKEN, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 6190288, Japan
- Masaki Osumi
- KOHINATA Limited Liability Company, 2-7-3, Tateba, Naniwa-ku, Osaka 5560020, Japan
- Koh Shimokawa
- KOHINATA Limited Liability Company, 2-7-3, Tateba, Naniwa-ku, Osaka 5560020, Japan
17. Guan H, Wei H, Hauer RJ, Liu P. Facial expressions of Asian people exposed to constructed urban forests: Accuracy validation and variation assessment. PLoS One 2021; 16:e0253141. [PMID: 34138924] [PMCID: PMC8211262] [DOI: 10.1371/journal.pone.0253141] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0]
Abstract
An outcome of building sustainable urban forests is that people's well-being improves when they are exposed to trees. Facial expressions directly represent one's inner emotions and can be used to assess real-time perception. The emergence of, and change in, the facial expressions of forest visitors is an implicit process; accordingly, the reserved character of Asian visitors requires a validated instrument to recognize expressions accurately. In this study, a dataset was established with 2,886 randomly photographed faces of visitors at a constructed urban forest park and at a promenade during summertime in Shenyang City, Northeast China. Six experts were invited to choose 160 photos in total, with 20 images representing each of eight typical expressions: angry, contempt, disgusted, happy, neutral, sad, scared, and surprised. The FireFACE ver. 3.0 software was used for hit-ratio validation as an accuracy measurement (ac.), matching machine-recognized photos against those identified by experts. According to a Kruskal-Wallis test against averaged scores from 20 recently published papers, the contempt (ac. = 0.40%, P = 0.0038) and scared (ac. = 25.23%, P = 0.0018) expressions did not pass the validation test. Both happy and sad expression scores were higher in forests than in promenades, but there was no difference in net positive response (happy minus sad) between locations. Men had a higher happy score but a lower disgusted score in forests than in promenades; men also had a higher angry score in forests. We conclude that FireFACE can be used for analyzing facial expressions of Asian people within urban forests. Women are encouraged to visit urban forests rather than promenades to elicit more positive emotions.
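The validation test described above amounts to a Kruskal-Wallis comparison between FireFACE hit ratios and accuracy scores reported in the literature. The SciPy sketch below uses made-up values for the "scared" category purely to show the mechanics.

```python
from scipy.stats import kruskal

# Hit ratios (proportion of machine labels matching expert labels) for one
# expression category vs. accuracy scores from published studies.
# All values here are illustrative placeholders, not the study's data.
fireface_scared = [0.30, 0.22, 0.25, 0.28, 0.21]
published_scared = [0.55, 0.60, 0.48, 0.52, 0.58]

stat, p = kruskal(fireface_scared, published_scared)
print(f"H = {stat:.2f}, p = {p:.4f}")  # p < 0.05 -> validation fails for this category
```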
Affiliation(s)
- Haoming Guan
- School of Geographical Sciences, Northeast Normal University, Changchun, China
- Hongxu Wei
- Key Laboratory of Wetland Ecology and Environment, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
- University of Chinese Academy of Sciences, Beijing, China
- Richard J. Hauer
- College of Natural Resources, University of Wisconsin-Stevens Point, Stevens Point, Wisconsin, United States of America
- Ping Liu
- College of Forestry, Shenyang Agricultural University, Shenyang, China
18. Automatic Detection of a Student's Affective States for Intelligent Teaching Systems. Brain Sci 2021; 11:331. [PMID: 33808032] [PMCID: PMC7998267] [DOI: 10.3390/brainsci11030331] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0]
Abstract
AutoTutor is an automated computer tutor that simulates human tutors and holds conversations with students in natural language. Using data collected from AutoTutor, we asked two questions: Can affective states be classified automatically from intelligent teaching systems to aid the detection of a learner's emotional state? And, given frequency patterns in series of AutoTutor feedback and user-emotion pairs, can the next feedback/emotion pair be predicted? Using Apriori data-mining approaches, we found dominant frequent item sets that predict the next set of responses. Thirty-four participants provided 200 conversational turns with AutoTutor. Two series of attributes and emotions were concatenated into one row to create a record of the previous and next sets of emotions. Classification techniques, such as the multilayer perceptron and naive Bayes, were applied to the dataset to label affective states. The emotions ‘Flow’ and ‘Frustration’ had the highest classification accuracy when measured against the other emotions and their respective attributes, and the most common frequent item sets were ‘Flow’ and ‘Confusion’.
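Frequent item-set mining of feedback/emotion pairs can be reproduced with the Apriori implementation in the mlxtend library, as sketched below. The one-hot table of turns is an invented toy example, not AutoTutor data.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules  # pip install mlxtend

# One-hot encoded tutor-feedback/emotion pairs across dialogue turns
# (illustrative rows; the study concatenated consecutive pairs per record).
df = pd.DataFrame({
    "feedback_positive": [1, 0, 1, 1, 0, 1],
    "feedback_negative": [0, 1, 0, 0, 1, 0],
    "emotion_flow":      [1, 0, 1, 1, 0, 1],
    "emotion_confusion": [0, 1, 0, 0, 1, 0],
}).astype(bool)

# Mine frequent item sets, then derive rules that predict the next responses.
itemsets = apriori(df, min_support=0.3, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```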
19. Zloteanu M, Krumhuber EG. Expression Authenticity: The Role of Genuine and Deliberate Displays in Emotion Perception. Front Psychol 2021; 11:611248. [PMID: 33519624] [PMCID: PMC7840656] [DOI: 10.3389/fpsyg.2020.611248] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7]
Abstract
People dedicate significant attention to others' facial expressions and to deciphering their meaning. Hence, knowing whether such expressions are genuine or deliberate is important. Early research proposed that authenticity could be discerned based on reliable facial muscle activations unique to genuine emotional experiences that are impossible to produce voluntarily. With an increasing body of research, such claims may no longer hold up to empirical scrutiny. In this article, expression authenticity is considered within the context of senders' ability to produce convincing facial displays that resemble genuine affect and human decoders' judgments of expression authenticity. This includes a discussion of spontaneous vs. posed expressions, as well as appearance- vs. elicitation-based approaches for defining emotion recognition accuracy. We further expand on the functional role of facial displays as neurophysiological states and communicative signals, thereby drawing upon the encoding-decoding and affect-induction perspectives of emotion expressions. Theoretical and methodological issues are addressed with the aim to instigate greater conceptual and operational clarity in future investigations of expression authenticity.
Affiliation(s)
- Mircea Zloteanu
- Department of Criminology and Sociology, Kingston University London, Kingston, United Kingdom
- Department of Psychology, Kingston University London, Kingston, United Kingdom
- Eva G Krumhuber
- Department of Experimental Psychology, University College London, London, United Kingdom
20. Etcoff N, Stock S, Krumhuber EG, Reed LI. A Novel Test of the Duchenne Marker: Smiles After Botulinum Toxin Treatment for Crow's Feet Wrinkles. Front Psychol 2021; 11:612654. [PMID: 33510690] [PMCID: PMC7835207] [DOI: 10.3389/fpsyg.2020.612654] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0]
Abstract
Smiles that vary in muscular configuration also vary in how they are perceived. Previous research suggests that “Duchenne smiles,” produced by the combined actions of the orbicularis oculi (cheek raiser) and zygomaticus major (lip corner puller) muscles, signal enjoyment. This research has compared perceptions of Duchenne smiles with non-Duchenne smiles among individuals voluntarily innervating or inhibiting the orbicularis oculi muscle. Here we used a novel set of highly controlled stimuli: photographs of patients taken before and after receiving botulinum toxin treatment for crow’s feet lines, which selectively paralyzed the lateral orbicularis oculi muscle and removed visible lateral eye wrinkles. Smiles in which the orbicularis muscle was active (prior to treatment) were rated as more felt, spontaneous, intense, and happier. Post-treatment, patients looked younger, although not more attractive. We discuss the potential implications of these findings within the context of emotion science and clinical research on botulinum toxin.
Affiliation(s)
- Nancy Etcoff
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
- Shannon Stock
- Department of Mathematics and Computer Science, College of the Holy Cross, Worcester, MA, United States
- Eva G Krumhuber
- Department of Experimental Psychology, University College London, London, United Kingdom
- Lawrence Ian Reed
- Department of Psychology, New York University, New York, NY, United States
21. Tcherkassof A, Dupré D. The emotion-facial expression link: evidence from human and automatic expression recognition. Psychological Research 2020; 85:2954-2969. [PMID: 33236175] [DOI: 10.1007/s00426-020-01448-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0]
Abstract
While it has been taken for granted in the development of several automatic facial expression recognition tools, the question of the coherence between subjective feelings and facial expressions is still a subject of debate. On one hand, the "Basic Emotion View" conceives emotions as genetically hardwired and, therefore, as genuinely displayed through facial expressions; emotion recognition is thus perceiver-independent. On the other hand, the constructivist approach conceives emotions as socially constructed, the emotional meaning of a facial expression being inferred by the perceiver; emotion recognition is thus perceiver-dependent. In order (1) to evaluate the coherence between the subjective feeling of emotions and their spontaneous facial displays, and (2) to compare the recognition of such displays by human perceivers and by an automatic facial expression classifier, 232 videos of expressers recruited to carry out an emotion elicitation task were annotated by 1383 human perceivers as well as by Affdex, an automatic classifier. Results show a weak consistency between the emotional states self-reported by expressers and their facial emotional displays. They also show low accuracy for both human perceivers and the automatic classifier in inferring subjective feeling from the spontaneous facial expressions displayed by expressers. Overall, the results favor a perceiver-dependent view. The hypothesis of genetically hardwired emotions that are genuinely displayed is therefore difficult to support, whereas the idea that emotion and facial expression are socially constructed appears more likely. Accordingly, automatic emotion recognition tools based on facial expressions should be questioned.
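One simple way to quantify the feeling-display (in)coherence reported above is chance-corrected agreement between self-reported labels and inferred labels, e.g., Cohen's kappa. The sketch below uses invented labels and is not the study's analysis (which annotated 232 videos with 1383 perceivers and Affdex).

```python
from sklearn.metrics import cohen_kappa_score

# Agreement between expressers' self-reported emotions and the labels
# inferred from their facial displays (by perceivers or a classifier).
self_report = ["joy", "anger", "fear", "joy", "sadness", "joy"]
inferred    = ["joy", "neutral", "surprise", "joy", "neutral", "anger"]

kappa = cohen_kappa_score(self_report, inferred)
print(f"Cohen's kappa = {kappa:.2f}")  # low values ~ weak feeling-display coherence
```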
Affiliation(s)
- Anna Tcherkassof
- Psychology Department, Université Grenoble Alpes, Bâtiment Michel Dubois, 1251 Avenue Centrale, Saint-Martin-d'Hères, 38400, France
- Damien Dupré
- Business School, Dublin City University, DCU Glasnevin Campus, Dublin, D09, Ireland