1. Cheong JH, Jolly E, Xie T, Byrne S, Kenney M, Chang LJ. Py-Feat: Python Facial Expression Analysis Toolbox. Affective Science 2023; 4:781-796. PMID: 38156250; PMCID: PMC10751270; DOI: 10.1007/s42761-023-00191-4.
Abstract
Studying facial expressions is a notoriously difficult endeavor. Recent advances in the field of affective computing have yielded impressive progress in automatically detecting facial expressions from pictures and videos. However, much of this work has yet to be widely disseminated in social science domains such as psychology. Current state-of-the-art models require considerable domain expertise that is not traditionally incorporated into social science training programs. Furthermore, there is a notable absence of user-friendly, open-source software that provides a comprehensive set of tools and functions to support facial expression research. In this paper, we introduce Py-Feat, an open-source Python toolbox that provides support for detecting, preprocessing, analyzing, and visualizing facial expression data. Py-Feat makes it easy for domain experts to disseminate and benchmark computer vision models and for end users to quickly process, analyze, and visualize facial expression data. We hope this platform will facilitate increased use of facial expression data in human behavior research. Supplementary information: the online version contains supplementary material available at 10.1007/s42761-023-00191-4.
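As a quick illustration of the workflow the abstract describes, the following is a minimal sketch using Py-Feat's documented high-level API; model defaults and output columns can vary across toolbox versions, and the image file name is a placeholder:

```python
from feat import Detector

# Instantiate with the toolbox's default face, landmark, AU, and
# emotion models (the specific defaults depend on the installed version).
detector = Detector()

# Detect faces, landmarks, action units, and emotions in one image.
# The result is a Fex object, a pandas-like dataframe of per-face rows.
result = detector.detect_image("face.jpg")

print(result.aus)       # action unit activations
print(result.emotions)  # emotion probabilities
```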
Affiliation(s)
- Jin Hyun Cheong
  - Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Eshin Jolly
  - Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Tiankang Xie
  - Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
  - Department of Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
- Sophie Byrne
  - Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Matthew Kenney
  - Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Luke J. Chang
  - Computational Social and Affective Neuroscience Laboratory, Department of Psychological & Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
  - Department of Quantitative Biomedical Sciences, Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
2. Coffman M, Di Martino JM, Aiello R, Carpenter KL, Chang Z, Compton S, Eichner B, Espinosa S, Flowers J, Franz L, Perochon S, Krishnappa Babu PR, Sapiro G, Dawson G. Relationship between quantitative digital behavioral features and clinical profiles in young autistic children. Autism Res 2023; 16:1360-1374. PMID: 37259909; PMCID: PMC10524806; DOI: 10.1002/aur.2955.
Abstract
Early behavioral markers for autism include differences in social attention and orienting in response to one's name when called, and differences in body movements and motor abilities. More efficient, scalable, objective, and reliable measures of these behaviors could improve early screening for autism. This study evaluated whether objective and quantitative measures of autism-related behaviors elicited from an app (SenseToKnow) administered on a smartphone or tablet and measured via computer vision analysis (CVA) are correlated with standardized caregiver-report and clinician-administered measures of autism-related behaviors and cognitive, language, and motor abilities. This is an essential step in establishing the concurrent validity of a digital phenotyping approach. In a sample of 485 toddlers, 43 of whom were diagnosed with autism, we found that CVA-based gaze variables related to social attention were associated with the level of autism-related behaviors. Two language-related behaviors measured via the app, attention to people during a conversation and responding to one's name being called, were associated with children's language skills. Finally, performance during a bubble-popping game was associated with fine motor skills. These findings provide initial support for the concurrent validity of the SenseToKnow app and its potential utility in identifying clinical profiles associated with autism. Future research is needed to determine whether the app can be used as an autism screening tool, can reliably stratify autism-related behaviors, and can measure changes in autism-related behaviors over time.
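Concurrent validity of this kind is typically quantified with simple rank correlations between app-derived features and clinical scores. The sketch below uses hypothetical variable names and random placeholder data; the study's actual features, instruments, and statistics are described in the paper:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Placeholder stand-ins: one CVA gaze feature and one clinical score
# per child (n = 485 in the study described above).
gaze_to_people = rng.random(485)   # hypothetical app-derived feature
language_score = rng.random(485)   # hypothetical standardized measure

rho, p = spearmanr(gaze_to_people, language_score)
print(f"Spearman rho = {rho:.2f}, p = {p:.3g}")
```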
Affiliation(s)
- Marika Coffman
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
- J. Matias Di Martino
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Rachel Aiello
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
- Kimberly L.H. Carpenter
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
- Zhuoqing Chang
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Scott Compton
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
- Brian Eichner
  - Department of Pediatrics, Duke University, Durham, NC, USA
- Steve Espinosa
  - Office of Information Technology, Duke University, Durham, NC, USA
- Jacqueline Flowers
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
- Lauren Franz
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Global Health Institute, Duke University, Durham, NC, USA
- Sam Perochon
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
  - École Normale Supérieure Paris-Saclay, Gif-sur-Yvette, France
- Guillermo Sapiro
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
  - Departments of Biomedical Engineering, Mathematics, and Computer Science, Duke University, Durham, NC, USA
- Geraldine Dawson
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
3
|
Krishnappa Babu PR, Aikat V, Di Martino JM, Chang Z, Perochon S, Espinosa S, Aiello R, L H Carpenter K, Compton S, Davis N, Eichner B, Flowers J, Franz L, Dawson G, Sapiro G. Blink rate and facial orientation reveal distinctive patterns of attentional engagement in autistic toddlers: a digital phenotyping approach. Sci Rep 2023; 13:7158. [PMID: 37137954 PMCID: PMC10156751 DOI: 10.1038/s41598-023-34293-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2022] [Accepted: 04/27/2023] [Indexed: 05/05/2023] Open
Abstract
Differences in social attention are well-documented in autistic individuals, representing one of the earliest signs of autism. Spontaneous blink rate has been used to index attentional engagement, with lower blink rates reflecting increased engagement. We evaluated novel methods using computer vision analysis (CVA) for automatically quantifying patterns of attentional engagement in young autistic children, based on facial orientation and blink rate, which were captured via mobile devices. Participants were 474 children (17-36 months old), 43 of whom were diagnosed with autism. Movies containing social or nonsocial content were presented via an iPad app, and simultaneously, the device's camera recorded the children's behavior while they watched the movies. CVA was used to extract the duration of time the child oriented towards the screen and their blink rate as indices of attentional engagement. Overall, autistic children spent less time facing the screen and had a higher mean blink rate compared to neurotypical children. Neurotypical children faced the screen more often and blinked at a lower rate during the social movies compared to the nonsocial movies. In contrast, autistic children faced the screen less often during social movies than during nonsocial movies and showed no differential blink rate to social versus nonsocial movies.
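For intuition, blink events are often recovered from facial landmarks via the eye aspect ratio (EAR; Soukupová and Čech, 2016), with blinks counted as open-to-closed transitions. This is a generic sketch of that common approach, not necessarily the exact CVA method used in the study:

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from six eye landmarks ordered as in the 68-point scheme:
    ratio of vertical eyelid distances to horizontal eye width."""
    v1 = np.linalg.norm(eye[1] - eye[5])
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])
    return (v1 + v2) / (2.0 * h)

def blinks_per_minute(ear_series, fps, threshold=0.2):
    """Count open-to-closed transitions in a per-frame EAR series."""
    closed = np.asarray(ear_series) < threshold
    onsets = np.count_nonzero(closed[1:] & ~closed[:-1])
    minutes = len(ear_series) / fps / 60.0
    return onsets / minutes
```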
Affiliation(s)
- Vikram Aikat
  - Department of Computer Science, Duke University, Durham, NC, USA
- J. Matias Di Martino
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Zhuoqing Chang
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Sam Perochon
  - École Normale Supérieure Paris-Saclay, Gif-sur-Yvette, France
- Steven Espinosa
  - Office of Information Technology, Duke University, Durham, NC, USA
- Rachel Aiello
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
- Kimberly L.H. Carpenter
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
- Scott Compton
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
- Naomi Davis
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
- Brian Eichner
  - Department of Pediatrics, Duke University, Durham, NC, USA
- Jacqueline Flowers
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
- Lauren Franz
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
  - Duke Global Health Institute, Duke University, Durham, NC, USA
- Geraldine Dawson
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
  - Duke Center for Autism and Brain Development, Duke University, Durham, NC, USA
- Guillermo Sapiro
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
  - Departments of Biomedical Engineering, Mathematics, and Computer Science, Duke University, Durham, NC, USA
4. Wang S, Ding H, Peng G. Dual Learning for Facial Action Unit Detection Under Nonfull Annotation. IEEE Transactions on Cybernetics 2022; 52:2225-2237. PMID: 32881700; DOI: 10.1109/tcyb.2020.3003502.
Abstract
Most methods for facial action unit (AU) recognition typically require training images that are fully AU-labeled, and manual AU annotation is time-intensive. To alleviate this, we propose a novel dual learning framework and apply it to AU detection under two scenarios: semisupervised AU detection with partially AU-labeled and fully expression-labeled samples, and weakly supervised AU detection with fully expression-labeled samples alone. We leverage two forms of auxiliary information. The first is the probabilistic duality between the AU detection task and its dual task, in this case the face synthesis task given AU labels. We also take advantage of the dependencies among multiple AUs, the dependencies between expression and AUs, and the dependencies between facial features and AUs. Specifically, the proposed method consists of a classifier, an image generator, and a discriminator. The classifier and generator yield face-AU-expression tuples, which are forced to converge to the ground-truth distribution. This joint distribution also captures three kinds of inherent dependencies: 1) the dependencies among multiple AUs; 2) the dependencies between expression and AUs; and 3) the dependencies between facial features and AUs. We reconstruct the input face and AU labels and introduce two reconstruction losses. In the semisupervised scenario, the supervised loss is also incorporated into the full objective for AU-labeled samples. In the weakly supervised scenario, we generate pseudo-paired data according to the domain knowledge about expression and AUs. Semisupervised and weakly supervised experiments on three widely used datasets demonstrate the superiority of the proposed method over existing work for AU detection and facial synthesis tasks.
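To make the objective concrete, here is a hypothetical PyTorch sketch of how such terms might be combined: a supervised AU loss masked to labeled samples, two reconstruction losses, and an adversarial term. All names and weights are illustrative; the paper defines its own formulation:

```python
import torch
import torch.nn.functional as F

def dual_learning_loss(au_logits, au_labels, labeled_mask,
                       face_recon, face, au_recon_logits,
                       disc_fake_logits, w_sup=1.0, w_rec=1.0, w_adv=0.1):
    """Illustrative combination of the losses sketched in the abstract."""
    # Supervised multi-label AU loss, applied only to AU-labeled samples.
    sup = F.binary_cross_entropy_with_logits(
        au_logits[labeled_mask], au_labels[labeled_mask])
    # Reconstruction of the input face and of the AU labels.
    rec = F.l1_loss(face_recon, face) + F.binary_cross_entropy_with_logits(
        au_recon_logits[labeled_mask], au_labels[labeled_mask])
    # Non-saturating generator loss: fool the discriminator on
    # generated face-AU-expression tuples.
    adv = F.binary_cross_entropy_with_logits(
        disc_fake_logits, torch.ones_like(disc_fake_logits))
    return w_sup * sup + w_rec * rec + w_adv * adv
```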
5. Perochon S, Di Martino M, Aiello R, Baker J, Carpenter K, Chang Z, Compton S, Davis N, Eichner B, Espinosa S, Flowers J, Franz L, Gagliano M, Harris A, Howard J, Kollins SH, Perrin EM, Raj P, Spanos M, Walter B, Sapiro G, Dawson G. A scalable computational approach to assessing response to name in toddlers with autism. J Child Psychol Psychiatry 2021; 62:1120-1131. PMID: 33641216; PMCID: PMC8397798; DOI: 10.1111/jcpp.13381.
Abstract
BACKGROUND: This study is part of a larger research program focused on developing objective, scalable tools for digital behavioral phenotyping. We evaluated whether a digital app delivered on a smartphone or tablet using computer vision analysis (CVA) can elicit and accurately measure one of the most common early autism symptoms, namely failure to respond to a name call.
METHODS: During a pediatric primary care well-child visit, 910 toddlers, 17-37 months old, were administered an app on an iPhone or iPad consisting of brief movies during which the child's name was called three times by an examiner standing behind them. Thirty-seven toddlers were subsequently diagnosed with autism spectrum disorder (ASD). Name calls and children's behavior were recorded by the camera embedded in the device, and children's head turns were coded by both CVA and a human.
RESULTS: CVA coding of response to name was found to be comparable to human coding. Based on CVA, children with ASD responded to their name significantly less frequently than children without ASD. CVA also revealed that children with ASD who did orient to their name exhibited a longer latency before turning their head. Combining information about both the frequency of and the delay in response to name improved the ability to distinguish toddlers with and without ASD.
CONCLUSIONS: A digital app delivered on an iPhone or iPad in real-world settings, using computer vision analysis to quantify behavior, can reliably detect a key early autism symptom: failure to respond to name. Moreover, the higher resolution offered by CVA identified a delay in head turn in toddlers with ASD who did respond to their name. Digital phenotyping is a promising methodology for early assessment of ASD symptoms.
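The latency measure described in the results can be illustrated with a simple computation over a per-frame head-yaw time series. This is a hypothetical sketch; the study's CVA coding criteria (angle threshold, response window) are defined in the paper:

```python
import numpy as np

def response_latency_s(yaw_deg, call_frame, fps,
                       turn_thresh=30.0, window_s=3.0):
    """Latency (seconds) of the first head turn whose absolute yaw
    exceeds `turn_thresh` degrees within `window_s` seconds of a name
    call; returns None if the child does not turn (a non-response)."""
    end = call_frame + int(window_s * fps)
    segment = np.abs(np.asarray(yaw_deg)[call_frame:end])
    turned = np.flatnonzero(segment > turn_thresh)
    return None if turned.size == 0 else turned[0] / fps
```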
Affiliation(s)
- Sam Perochon
  - Department of Electrical and Computer Engineering, Duke University
- Rachel Aiello
  - Department of Psychiatry and Behavioral Sciences, Duke University
- Zhuoqing Chang
  - Department of Electrical and Computer Engineering, Duke University
- Scott Compton
  - Department of Psychiatry and Behavioral Sciences, Duke University
- Naomi Davis
  - Department of Psychiatry and Behavioral Sciences, Duke University
- Lauren Franz
  - Department of Psychiatry and Behavioral Sciences, Duke University
- Adrianne Harris
  - Department of Psychiatry and Behavioral Sciences, Duke University
  - Department of Psychology & Neuroscience, Duke University
- Jill Howard
  - Department of Psychiatry and Behavioral Sciences, Duke University
- Scott H. Kollins
  - Department of Psychiatry and Behavioral Sciences, Duke University
- Eliana M. Perrin
  - Department of Pediatrics, Duke University
  - Duke Center for Childhood Obesity Research
- Pradeep Raj
  - Department of Electrical and Computer Engineering, Duke University
- Marina Spanos
  - Department of Psychiatry and Behavioral Sciences, Duke University
- Barbara Walter
  - Department of Psychiatry and Behavioral Sciences, Duke University
6. Bovery M, Dawson G, Hashemi J, Sapiro G. A Scalable Off-the-Shelf Framework for Measuring Patterns of Attention in Young Children and its Application in Autism Spectrum Disorder. IEEE Transactions on Affective Computing 2021; 12:722-731. PMID: 35450132; PMCID: PMC9017594; DOI: 10.1109/taffc.2018.2890610.
Abstract
Autism spectrum disorder (ASD) is associated with deficits in the processing of social information and difficulties in social interaction, and individuals with ASD exhibit atypical attention and gaze. Traditionally, gaze studies have relied upon precise and constrained means of monitoring attention using expensive equipment in laboratories. In this work we develop a low-cost, off-the-shelf alternative for measuring attention that can be used in natural settings. The head and iris positions of 104 children aged 16-31 months (an age range appropriate for ASD screening and diagnosis), 22 of them diagnosed with ASD, were recorded using the front-facing camera of an iPad while the children watched, on the device screen, a movie displaying dynamic stimuli: social stimuli on the left and nonsocial stimuli on the right. The head and iris positions were then automatically analyzed via computer vision algorithms to detect the direction of attention. Children in the ASD group paid less attention to the movie, showed less attention to the social as compared to the nonsocial stimuli, and often fixated their attention on one side of the screen. The proposed method provides a low-cost means of monitoring attention to properly designed stimuli, demonstrating that the integration of stimulus design and automatic response analysis creates the opportunity to use off-the-shelf cameras to assess behavioral biomarkers.
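Because the stimuli are spatially separated by design (social left, nonsocial right), per-frame attention can in principle be reduced to a left/right decision on an estimated gaze position. Below is a simplified sketch with hypothetical inputs; the study's actual mapping from head and iris positions to attention direction is more involved:

```python
import numpy as np

def attention_side(gaze_x, screen_w, dead_zone=0.05):
    """Label each frame 'social' (left half), 'nonsocial' (right half),
    or 'center' (within a small dead zone around the midline).
    gaze_x: per-frame horizontal gaze estimate in screen coordinates."""
    gaze_x = np.asarray(gaze_x, dtype=float)
    mid, band = screen_w / 2.0, dead_zone * screen_w
    labels = np.full(gaze_x.shape, "center", dtype=object)
    labels[gaze_x < mid - band] = "social"      # left half of the screen
    labels[gaze_x > mid + band] = "nonsocial"   # right half of the screen
    return labels
```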
Affiliation(s)
- Matthieu Bovery
  - EEA Department, ENS Paris-Saclay, Cachan, France (performed this work while visiting Duke University)
- Geraldine Dawson
  - Department of Psychiatry and Behavioral Sciences, Duke Center for Autism and Brain Development, and the Duke Institute for Brain Sciences, Durham, NC
- Jordan Hashemi
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Guillermo Sapiro
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
  - Departments of Biomedical Engineering, Computer Science, and Mathematics, Duke University
7. Enhanced convolutional LSTM with spatial and temporal skip connections and temporal gates for facial expression recognition from video. Neural Comput Appl 2021. DOI: 10.1007/s00521-020-05557-4.
8. Bhatti YK, Jamil A, Nida N, Yousaf MH, Viriri S, Velastin SA. Facial Expression Recognition of Instructor Using Deep Features and Extreme Learning Machine. Computational Intelligence and Neuroscience 2021; 2021:5570870. PMID: 34007266; PMCID: PMC8110428; DOI: 10.1155/2021/5570870.
Abstract
Classroom communication involves teachers' behavior and students' responses. Extensive research has been done on the analysis of students' facial expressions, but the impact of instructors' facial expressions is as yet an unexplored area of research. Facial expression recognition has the potential to predict the impact of teachers' emotions in a classroom environment. Intelligent assessment of instructor behavior during lecture delivery not only might improve the learning environment but could also save the time and resources spent on manual assessment strategies. To address the issue of manual assessment, we propose an instructor facial expression recognition approach within a classroom using a feedforward learning model. First, the face is detected from the acquired lecture videos and key frames are selected, discarding all redundant frames for effective high-level feature extraction. Then, deep features are extracted using multiple convolutional neural networks, along with parameter tuning, and are fed to a classifier. For fast learning and good generalization, a regularized extreme learning machine (RELM) classifier is employed, which classifies five different expressions of the instructor within the classroom. Experiments are conducted on a newly created instructor facial expression dataset in classroom environments plus three benchmark facial datasets: the Cohn-Kanade dataset, the Japanese Female Facial Expression (JAFFE) dataset, and the Facial Expression Recognition 2013 (FER2013) dataset. Furthermore, the proposed method is compared with state-of-the-art techniques, traditional classifiers, and convolutional neural models. Experimental results indicate significant performance gains on measures such as accuracy, F1-score, and recall.
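The RELM readout itself is compact enough to show: a fixed random hidden layer followed by a closed-form ridge-regression solve for the output weights. This is a generic RELM sketch with illustrative sizes, not the paper's exact configuration:

```python
import numpy as np

def train_relm(X, T, n_hidden=1000, C=1.0, seed=0):
    """Regularized extreme learning machine.
    X: (n, d) deep features; T: (n, k) one-hot expression labels."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # random (untrained) feature map
    # Closed-form ridge solution for the output weights:
    # beta = (H^T H + I/C)^{-1} H^T T
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def predict_relm(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```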
Affiliation(s)
- Yusra Khalid Bhatti
  - Department of Computer Engineering, University of Engineering and Technology, Taxila, Pakistan
- Afshan Jamil
  - Department of Computer Engineering, University of Engineering and Technology, Taxila, Pakistan
- Nudrat Nida
  - Department of Computer Engineering, University of Engineering and Technology, Taxila, Pakistan
- Muhammad Haroon Yousaf
  - Department of Computer Engineering, University of Engineering and Technology, Taxila, Pakistan
  - Swarm Robotics Lab, National Centre for Robotics and Automation (NCRA), Rawalpindi, Pakistan
- Serestina Viriri
  - Department of Computer Science, University of KwaZulu-Natal, Durban, South Africa
- Sergio A. Velastin
  - School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK
  - Department of Computer Science and Engineering, Universidad Carlos III de Madrid, Leganés, Madrid 28911, Spain
9. Carpenter KLH, Hashemi J, Campbell K, Lippmann SJ, Baker JP, Egger HL, Espinosa S, Vermeer S, Sapiro G, Dawson G. Digital Behavioral Phenotyping Detects Atypical Pattern of Facial Expression in Toddlers with Autism. Autism Res 2021; 14:488-499. PMID: 32924332; PMCID: PMC7920907; DOI: 10.1002/aur.2391.
Abstract
Commonly used screening tools for autism spectrum disorder (ASD) generally rely on subjective caregiver questionnaires. While behavioral observation is more objective, it is also expensive, time-consuming, and requires significant expertise to perform. As such, there remains a critical need to develop feasible, scalable, and reliable tools that can characterize ASD risk behaviors. This study assessed the utility of a tablet-based behavioral assessment for eliciting and detecting one type of risk behavior, namely patterns of facial expression, in 104 toddlers (ASD N = 22) and evaluated whether such patterns differentiated toddlers with and without ASD. The assessment consisted of the child sitting on his/her caregiver's lap and watching brief movies shown on a smart tablet while the embedded camera recorded the child's facial expressions. Computer vision analysis (CVA) automatically detected and tracked facial landmarks, which were used to estimate head position and facial expressions (Positive, Neutral, All Other). Using CVA, specific points throughout the movies were identified that reliably differentiated between children with and without ASD based on their patterns of facial movement and expression (areas under the curve for individual movies ranging from 0.62 to 0.73). During these instances, children with ASD more frequently displayed Neutral expressions compared to children without ASD, who had more All Other expressions. The frequency of All Other expressions was driven by non-ASD children more often displaying raised eyebrows and an open mouth, characteristic of engagement/interest. Preliminary results suggest that computational coding of facial movements and expressions via a tablet-based assessment can detect differences in affective expression, one of the early, core features of ASD.
LAY SUMMARY: This study tested the use of a tablet in the behavioral assessment of young children with autism. Children watched a series of developmentally appropriate movies and their facial expressions were recorded using the camera embedded in the tablet. Results suggest that computational assessments of facial expressions may be useful in early detection of symptoms of autism.
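The per-movie discrimination figures (AUCs of 0.62-0.73) correspond to a standard ROC analysis over a scalar expression feature. A sketch with placeholder data follows; the study's actual features and modeling are its own:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Placeholder data: proportion of frames coded Neutral during one movie,
# for 104 toddlers of whom 22 have an ASD diagnosis.
is_asd = np.array([1] * 22 + [0] * 82)
neutral_rate = np.clip(rng.normal(0.5 + 0.1 * is_asd, 0.15), 0, 1)

print("AUC:", roc_auc_score(is_asd, neutral_rate))
```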
Affiliation(s)
- Kimberly L.H. Carpenter
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina, USA
- Jordan Hashemi
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina, USA
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina, USA
- Kathleen Campbell
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina, USA
  - Department of Pediatrics, University of Utah, Salt Lake City, Utah, USA
- Steven J. Lippmann
  - Department of Population Health Sciences, Duke University School of Medicine, Durham, North Carolina, USA
- Jeffrey P. Baker
  - Department of Pediatrics, Duke University School of Medicine, Durham, North Carolina, USA
- Helen L. Egger
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina, USA
  - NYU Langone Child Study Center, New York University, New York, New York, USA
- Steven Espinosa
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina, USA
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina, USA
- Saritha Vermeer
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina, USA
- Guillermo Sapiro
  - Departments of Biomedical Engineering, Computer Science, and Mathematics, Duke University, Durham, North Carolina, USA
- Geraldine Dawson
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina, USA
  - Duke Institute for Brain Sciences, Duke University, Durham, North Carolina, USA
10. Predicting individual emotion from perception-based non-contact sensor big data. Sci Rep 2021; 11:2317. PMID: 33504868; PMCID: PMC7840765; DOI: 10.1038/s41598-021-81958-2.
Abstract
This study proposes a system for estimating individual emotions based on indoor environment data collected for human participants. As a first step, we develop wireless sensor nodes, which collect indoor environment data relevant to human perception, for monitoring working environments. The developed system collects, as big data, the indoor environment data obtained from the developed sensor nodes together with emotion data derived from pulse and skin temperature. The proposed system then estimates individual emotions from the collected indoor environment data; this study also investigates whether such sensory data are effective for estimating individual emotions. Indoor environmental data obtained by the developed sensors and emotion data obtained from vital signs were logged over a period of 60 days, and emotions were estimated from the environmental data by machine learning. The experimental results show that the proposed system achieves roughly 80% or higher estimation correspondence when multiple types of sensors are used, demonstrating its effectiveness. The finding that emotions can be determined with high accuracy from environmental data is useful for future research.
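As an illustration of the final estimation step, a generic classifier over windowed sensor channels might look like the following. The variable names and data are placeholders; the paper's actual features, labels, and learner are described therein:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder data: rows = time windows, columns = indoor-environment
# channels (e.g., temperature, humidity, illuminance, noise level);
# labels = discretized emotion states derived from vital data.
X = rng.random((1000, 8))
y = rng.integers(0, 3, size=1000)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```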
11. Jia S, Wang S, Hu C, Webster PJ, Li X. Detection of Genuine and Posed Facial Expressions of Emotion: Databases and Methods. Front Psychol 2021; 11:580287. PMID: 33519600; PMCID: PMC7844089; DOI: 10.3389/fpsyg.2020.580287.
Abstract
Facial expressions of emotion play an important role in human social interactions. However, posed expressions of emotion are not always the same as genuine feelings; recent research has found that facial expressions are increasingly used as a tool in social interactions rather than as reflections of personal emotions. Therefore, the credibility assessment of facial expressions, namely the discrimination of genuine (spontaneous) expressions from posed (deliberate/volitional/deceptive) ones, is a crucial yet challenging task in facial expression understanding. With recent advances in computer vision and machine learning techniques, rapid progress has been made in the automatic detection of genuine and posed facial expressions. This paper presents a general review of the relevant research, including several spontaneous vs. posed (SVP) facial expression databases and various computer vision based detection methods. In addition, a variety of factors that influence the performance of SVP detection methods are discussed, along with open issues and technical challenges in this nascent field.
Affiliation(s)
- Shan Jia
  - State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
  - Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, United States
- Shuo Wang
  - Department of Chemical and Biomedical Engineering, West Virginia University, Morgantown, WV, United States
- Chuanbo Hu
  - Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, United States
- Paula J. Webster
  - Department of Chemical and Biomedical Engineering, West Virginia University, Morgantown, WV, United States
- Xin Li
  - Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, United States
12. Hashemi J, Dawson G, Carpenter KLH, Campbell K, Qiu Q, Espinosa S, Marsan S, Baker JP, Egger HL, Sapiro G. Computer Vision Analysis for Quantification of Autism Risk Behaviors. IEEE Transactions on Affective Computing 2021; 12:215-226. PMID: 35401938; PMCID: PMC8993160; DOI: 10.1109/taffc.2018.2868196.
Abstract
Observational behavior analysis plays a key role in the discovery and evaluation of risk markers for many neurodevelopmental disorders. Research on autism spectrum disorder (ASD) suggests that behavioral risk markers can be observed at 12 months of age or earlier, with diagnosis possible at 18 months. To date, studies and evaluations involving observational analysis have tended to rely heavily on clinical practitioners and specialists who have undergone intensive training to reliably administer carefully designed behavior-eliciting tasks, code the resulting behaviors, and interpret those behaviors. These methods are therefore extremely expensive, time-intensive, and not easily scalable for large-population or longitudinal observational analysis. We developed a self-contained, closed-loop mobile application with movie stimuli designed to engage the child's attention and elicit specific behavioral and social responses, which are recorded with a mobile device camera and then analyzed via computer vision algorithms. Here, in addition to presenting this paradigm, we validate the system's ability to measure engagement, name-call responses, and emotional responses of toddlers with and without ASD who were presented with the application. Additionally, we show examples of how the proposed framework can further risk-marker research with fine-grained quantification of behaviors. The results suggest that these objective, automatic methods can aid behavioral analysis and are well suited for future studies.
Affiliation(s)
- Jordan Hashemi
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Geraldine Dawson
  - Department of Psychiatry and Behavioral Sciences, Duke Center for Autism and Brain Development, and the Duke Institute for Brain Sciences, Durham, NC
- Kimberly L.H. Carpenter
  - Department of Psychiatry and Behavioral Sciences, Duke Center for Autism and Brain Development, and the Duke Institute for Brain Sciences, Durham, NC
- Qiang Qiu
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Steven Espinosa
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
- Samuel Marsan
  - Department of Psychiatry and Behavioral Sciences, Durham, NC
- Helen L. Egger
  - Department of Child and Adolescent Psychiatry, NYU Langone Health, New York, NY (performed this work while at Duke University)
- Guillermo Sapiro
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC
13. Hsu GSJ, Xie RC, Ambikapathi A, Chou KJ. A deep learning framework for heart rate estimation from facial videos. Neurocomputing 2020. DOI: 10.1016/j.neucom.2020.07.012.
14. Robustness comparison between the capsule network and the convolutional network for facial expression recognition. Appl Intell 2020. DOI: 10.1007/s10489-020-01895-x.
15. Chang Z, Chen Z, Stephen CD, Schmahmann JD, Wu HT, Sapiro G, Gupta AS. Accurate detection of cerebellar smooth pursuit eye movement abnormalities via mobile phone video and machine learning. Sci Rep 2020; 10:18641. PMID: 33122811; PMCID: PMC7596555; DOI: 10.1038/s41598-020-75661-x.
Abstract
Eye movements are disrupted in many neurodegenerative diseases and are frequent and early features in conditions affecting the cerebellum. Characterizing eye movements is important for diagnosis and may be useful for tracking disease progression and response to therapies. Assessments are limited as they require an in-person evaluation by a neurology subspecialist or specialized and expensive equipment. We tested the hypothesis that important eye movement abnormalities in cerebellar disorders (i.e., ataxias) could be captured from iPhone video. Videos of the face were collected from individuals with ataxia (n = 102) and from a comparative population (Parkinson's disease or healthy participants, n = 61). Computer vision algorithms were used to track the position of the eye which was transformed into high temporal resolution spectral features. Machine learning models trained on eye movement features were able to identify abnormalities in smooth pursuit (a key eye behavior) and accurately distinguish individuals with abnormal pursuit from controls (sensitivity = 0.84, specificity = 0.77). A novel machine learning approach generated severity estimates that correlated well with the clinician scores. We demonstrate the feasibility of capturing eye movement information using an inexpensive and widely accessible technology. This may be a useful approach for disease screening and for measuring severity in clinical trials.
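The transformation from tracked eye position to "high temporal resolution spectral features" can be approximated generically with a short-time spectrogram. This is a sketch under that assumption; the paper's exact transform and feature set may differ:

```python
import numpy as np
from scipy.signal import spectrogram

def eye_spectral_features(eye_x, fs):
    """Time-averaged log-power spectrum of horizontal eye position,
    computed with one-second windows and 50% overlap. Returns the
    frequency bins and one feature per band."""
    f, t, S = spectrogram(eye_x, fs=fs,
                          nperseg=int(fs), noverlap=int(fs) // 2)
    return f, np.log(S + 1e-12).mean(axis=1)
```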
Affiliation(s)
- Zhuoqing Chang
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
- Ziyu Chen
  - Department of Mathematics, Duke University, Durham, NC, USA
- Christopher D. Stephen
  - Ataxia Center and Department of Neurology, Massachusetts General Hospital, Harvard Medical School, 100 Cambridge St, Boston, MA, USA
- Jeremy D. Schmahmann
  - Ataxia Center and Department of Neurology, Massachusetts General Hospital, Harvard Medical School, 100 Cambridge St, Boston, MA, USA
- Hau-Tieng Wu
  - Department of Mathematics, Duke University, Durham, NC, USA
  - Department of Statistical Science, Duke University, Durham, NC, USA
- Guillermo Sapiro
  - Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
  - Department of Mathematics, Duke University, Durham, NC, USA
  - Department of Computer Science and Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Anoopum S. Gupta
  - Ataxia Center and Department of Neurology, Massachusetts General Hospital, Harvard Medical School, 100 Cambridge St, Boston, MA, USA
16. Mirjalili V, Raschka S, Ross A. PrivacyNet: Semi-Adversarial Networks for Multi-attribute Face Privacy. IEEE Transactions on Image Processing 2020; PP:9400-9412. PMID: 32956058; DOI: 10.1109/tip.2020.3024026.
Abstract
Recent research has established the possibility of deducing soft-biometric attributes such as age, gender and race from an individual's face image with high accuracy. However, this raises privacy concerns, especially when face images collected for biometric recognition purposes are used for attribute analysis without the person's consent. To address this problem, we develop a technique for imparting soft biometric privacy to face images via an image perturbation methodology. The image perturbation is undertaken using a GAN-based Semi-Adversarial Network (SAN) - referred to as PrivacyNet - that modifies an input face image such that it can be used by a face matcher for matching purposes but cannot be reliably used by an attribute classifier. Further, PrivacyNet allows a person to choose specific attributes that have to be obfuscated in the input face images (e.g., age and race), while allowing for other types of attributes to be extracted (e.g., gender). Extensive experiments using multiple face matchers, multiple age/gender/race classifiers, and multiple face datasets demonstrate the generalizability of the proposed multi-attribute privacy enhancing method across multiple face and attribute classifiers.
18. Liu X, Xia Y, Yu H, Dong J, Jian M, Pham TD. Region Based Parallel Hierarchy Convolutional Neural Network for Automatic Facial Nerve Paralysis Evaluation. IEEE Trans Neural Syst Rehabil Eng 2020; 28:2325-2332. PMID: 32881689; DOI: 10.1109/tnsre.2020.3021410.
Abstract
In this article, we propose a parallel hierarchy convolutional neural network (PHCNN) combined with a Long Short-Term Memory (LSTM) network to quantitatively assess the grading of facial nerve paralysis (FNP) by considering region-based asymmetric facial features and the temporal variation of image sequences. FNP, such as Bell's palsy, is the most common facial symptom of neuromotor dysfunction; it weakens the facial muscles, impairing normal emotional expression and movement. Subjective judgment by clinicians depends entirely on individual experience, which may not yield a uniform evaluation. Existing computer-aided methods mainly rely on specialized imaging equipment, which is complicated and expensive for facial functional rehabilitation; compared with subjective judgment and complex imaging, objective and intelligent measurement can potentially avoid these issues. By considering dynamic variation in both global and regional facial areas, the proposed hierarchical network with an LSTM structure can effectively improve diagnostic accuracy and extract paralysis detail at levels ranging from low-level shape and contour to semantic features. By segmenting the facial area into two palsy regions, the proposed method can discriminate FNP from normal faces accurately and significantly reduce the effect on feature learning of age wrinkles and unrepresentative facial organs with shape and position variations. Experiments on the YouTube Facial Palsy Database and the Extended Cohn-Kanade Database show that the proposed method is superior to state-of-the-art deep learning methods.
19. Chu WS, De la Torre F, Cohn JF. Learning Facial Action Units with Spatiotemporal Cues and Multi-label Sampling. Image and Vision Computing 2019; 81:1-14. PMID: 30524157; PMCID: PMC6277040; DOI: 10.1016/j.imavis.2018.10.002.
Abstract
Facial action units (AUs) may be represented spatially, temporally, and in terms of their correlation. Previous research focuses on one or another of these aspects or addresses them disjointly. We propose a hybrid network architecture that jointly models spatial and temporal representations and their correlation. In particular, we use a Convolutional Neural Network (CNN) to learn spatial representations and a Long Short-Term Memory (LSTM) network to model temporal dependencies among them. The outputs of the CNNs and LSTMs are aggregated into a fusion network to produce per-frame predictions of multiple AUs. The hybrid network was compared to previous state-of-the-art approaches on two large FACS-coded video databases, GFT and BP4D, with over 400,000 AU-coded frames of spontaneous facial behavior in varied social contexts. Relative to standard multi-label CNNs and feature-based state-of-the-art approaches, the hybrid system reduced person-specific biases and obtained increased accuracy for AU detection. To address class imbalance within and between batches during training, we introduce multi-label sampling strategies that further increase accuracy when AUs are relatively sparse. Finally, we provide visualizations of the learned AU models, which, to the best of our knowledge, reveal for the first time how machines see AUs.
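A minimal sketch of this kind of hybrid architecture follows, assuming illustrative layer sizes; the paper's actual backbone, fusion network, and hyperparameters differ:

```python
import torch
import torch.nn as nn

class CNNLSTMAUDetector(nn.Module):
    """A CNN encodes each frame, an LSTM models temporal dependencies,
    and a fusion head emits per-frame multi-label AU logits."""
    def __init__(self, n_aus=12, feat_dim=256, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU())
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        # Fusion: concatenate spatial and temporal representations.
        self.head = nn.Linear(feat_dim + hidden, n_aus)

    def forward(self, clips):            # clips: (B, T, 3, H, W)
        B, T = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(B, T, -1)
        temporal, _ = self.lstm(feats)
        return self.head(torch.cat([feats, temporal], dim=-1))  # (B, T, n_aus)
```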
Affiliation(s)
- Wen-Sheng Chu
  - Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
- Jeffrey F. Cohn
  - Department of Psychology, University of Pittsburgh, Pittsburgh, USA
20. Dawson G, Campbell K, Hashemi J, Lippmann SJ, Smith V, Carpenter K, Egger H, Espinosa S, Vermeer S, Baker J, Sapiro G. Atypical postural control can be detected via computer vision analysis in toddlers with autism spectrum disorder. Sci Rep 2018; 8:17008. PMID: 30451886; PMCID: PMC6242931; DOI: 10.1038/s41598-018-35215-8.
Abstract
Evidence suggests that differences in motor function are an early feature of autism spectrum disorder (ASD). One aspect of motor ability that develops during childhood is postural control, reflected in the ability to maintain a steady head and body position without excessive sway. Observational studies have documented differences in postural control in older children with ASD. The present study used computer vision analysis to assess midline head postural control, as reflected in the rate of spontaneous head movements during states of active attention, in 104 toddlers between 16-31 months of age (Mean = 22 months), 22 of whom were diagnosed with ASD. Time-series data revealed robust group differences in the rate of head movements while the toddlers watched movies depicting social and nonsocial stimuli. Toddlers with ASD exhibited a significantly higher rate of head movement as compared to non-ASD toddlers, suggesting difficulties in maintaining midline position of the head while engaging attentional systems. The use of digital phenotyping approaches, such as computer vision analysis, to quantify variation in early motor behaviors will allow for more precise, objective, and quantitative characterization of early motor signatures and potentially provide new automated methods for early autism risk identification.
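A rate-of-head-movement measure of this kind can be sketched from frame-to-frame displacement of a tracked head landmark. This is a simplified stand-in for the study's computer vision pipeline, with an illustrative threshold:

```python
import numpy as np

def head_movements_per_minute(head_xy, fps, disp_thresh=2.0):
    """Count movement onsets: frames where inter-frame displacement of
    a tracked head landmark (in pixels) first exceeds `disp_thresh`.
    head_xy: (n_frames, 2) array of head positions."""
    disp = np.linalg.norm(np.diff(np.asarray(head_xy, float), axis=0), axis=1)
    moving = disp > disp_thresh
    onsets = np.count_nonzero(moving[1:] & ~moving[:-1]) + int(moving[0])
    return onsets / (len(head_xy) / fps / 60.0)
```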
Affiliation(s)
- Geraldine Dawson
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University, Durham, North Carolina, USA
- Jordan Hashemi
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University, Durham, North Carolina, USA
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina, USA
- Steven J. Lippmann
  - Department of Population Health Sciences, Duke University, Durham, North Carolina, USA
- Valerie Smith
  - Department of Population Health Sciences, Duke University, Durham, North Carolina, USA
- Kimberly Carpenter
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University, Durham, North Carolina, USA
- Helen Egger
  - NYU Langone Child Study Center, New York University, New York, New York, USA
- Steven Espinosa
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina, USA
- Saritha Vermeer
  - Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University, Durham, North Carolina, USA
- Jeffrey Baker
  - Department of Pediatrics, Duke University, Durham, NC, USA
- Guillermo Sapiro
  - Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina, USA
  - Departments of Biomedical Engineering, Computer Science, and Mathematics, Duke University, Durham, NC, USA
21. Wang S, Peng G, Chen S, Ji Q. Weakly Supervised Facial Action Unit Recognition With Domain Knowledge. IEEE Transactions on Cybernetics 2018; 48:3265-3276. PMID: 30273163; DOI: 10.1109/tcyb.2018.2868194.
Abstract
Current facial action unit (AU) recognition typically involves supervised training, for which fully AU-annotated training images are required. Due to the nuances of facial appearance and individual differences, AU annotation is a time-consuming, expensive, and error-prone process. Facial expressions are relatively simple to label, since they describe facial behavior globally and the number of expressions appearing on a face is much smaller than the number of AUs. Furthermore, there exist strong dependencies between AUs and expressions, referred to as domain knowledge; such domain knowledge is inherent in facial anatomy and facial behavior. Therefore, in this paper we propose a novel weakly supervised AU recognition method to jointly learn multiple AU classifiers with expression annotations but without any AU annotations, by leveraging domain knowledge. Specifically, we first summarize the expression-dependent AU ranking from the domain knowledge of conditional probabilities of AUs given expressions. Then, we formulate weakly supervised AU recognition as a multilabel ranking problem and propose an efficient learning algorithm to solve it. Furthermore, we extend the proposed weakly supervised AU recognition method to a semi-supervised learning scenario in which partially AU-labeled samples are available. Experimental results on three benchmark databases demonstrate that the proposed method can successfully exploit domain knowledge for multiple AU recognition and thus outperforms both state-of-the-art weakly supervised and semi-supervised AU recognition methods.
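The multilabel ranking formulation can be illustrated with a pairwise hinge surrogate: given AU index pairs (hi, lo) ranked by expression-conditional AU probabilities, the classifier is encouraged to score AU `hi` above AU `lo`. This is a generic sketch, not the paper's learning algorithm:

```python
import torch

def expression_au_ranking_loss(au_scores, ranked_pairs, margin=0.5):
    """au_scores: (B, n_aus) classifier scores for one expression class.
    ranked_pairs: list of (hi, lo) AU index pairs derived from the
    domain knowledge P(AU | expression)."""
    loss = au_scores.new_zeros(())
    for hi, lo in ranked_pairs:
        # Hinge: penalize whenever score(hi) does not beat score(lo)
        # by at least the margin.
        loss = loss + torch.clamp(
            margin - (au_scores[:, hi] - au_scores[:, lo]), min=0).mean()
    return loss / len(ranked_pairs)
```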
22. Nelson BW, Allen NB. Extending the Passive-Sensing Toolbox: Using Smart-Home Technology in Psychological Science. Perspectives on Psychological Science 2018; 13:718-733. PMID: 30217132; DOI: 10.1177/1745691618776008.
Abstract
New smart-home devices provide the opportunity to advance psychological science and theory through novel research opportunities in home environments. These technologies extend the in vivo research and intervention capabilities afforded by other assessment techniques such as Ecological Momentary Assessment methods as well as mobile and wearable devices. Smart-home devices contain a multitude of sensors capable of continuously and unobtrusively collecting multimodal data within home contexts. These devices have some complementary strengths and limitations compared with other assessment methods. This article (a) briefly reviews data collection methods in home environments, (b) discusses the unique advantages of smart-home devices, (c) describes the extant smart-home literature, (d) explores how these devices may advance evaluation and refinement of psychological theories, (e) describes examples of psychological processes that are potential targets for smart-home assessment and intervention, (f) considers methodological challenges and barriers, (g) discusses ethical considerations, and (h) concludes with a discussion of future directions for research and the merging of passive-sensing technologies with active self-report methods. This article aims to highlight the potential utility of smart-home devices within psychological research to evaluate psychological theories related to behavior within the home context.
Affiliation(s)
- Benjamin W. Nelson
  - Department of Psychology and The Center for Digital Mental Health, University of Oregon
- Nicholas B. Allen
  - Department of Psychology and The Center for Digital Mental Health, University of Oregon
23. Wang C, Pun T, Chanel G. A Comparative Survey of Methods for Remote Heart Rate Detection From Frontal Face Videos. Front Bioeng Biotechnol 2018; 6:33. PMID: 29765940; PMCID: PMC5938474; DOI: 10.3389/fbioe.2018.00033.
Abstract
Remotely measuring physiological activity can provide substantial benefits for both medical and affective computing applications. Recent research has proposed different methodologies for the unobtrusive detection of heart rate (HR) using recordings of the human face. These methods are based on subtle color changes or motions of the face due to cardiovascular activity, which are invisible to human eyes but can be captured by digital cameras. Several approaches, based on signal processing and machine learning, have been proposed; however, these methods have been compared on different datasets, and there is consequently no consensus on their performance. In this article, we describe and evaluate several methods from the literature, from 2008 until the present day, for the remote detection of HR using human face recordings. The general HR processing pipeline is divided into three stages: face video processing, face blood volume pulse (BVP) signal extraction, and HR computation. Approaches presented in the paper are classified and grouped according to each stage. At each stage, algorithms are analyzed and compared based on their performance on the public MAHNOB-HCI database; the results reported here are therefore limited to that dataset. Results show that the extracted facial skin area contains more BVP information, and that blind source separation and peak detection methods are more robust to head motions when estimating HR.
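A classic baseline instance of this three-stage pipeline, assuming the face-video stage has already produced a per-frame mean of the green channel over a skin region of interest (many surveyed variants substitute blind source separation, e.g. ICA, for the band-pass step):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_hr_bpm(green_means, fs):
    """Estimate heart rate (beats per minute) from a per-frame mean
    green-channel trace sampled at `fs` frames per second."""
    # Band-pass to plausible cardiac frequencies: 0.7-4.0 Hz (42-240 bpm).
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fs)
    bvp = filtfilt(b, a, green_means - np.mean(green_means))
    # HR is read off the dominant spectral peak of the BVP signal.
    spectrum = np.abs(np.fft.rfft(bvp))
    freqs = np.fft.rfftfreq(len(bvp), d=1.0 / fs)
    return 60.0 * freqs[np.argmax(spectrum)]
```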
Affiliation(s)
- Chen Wang
  - Computer Vision and Multimedia Laboratory, Computer Science Department, University of Geneva, Geneva, Switzerland
- Thierry Pun
  - Computer Vision and Multimedia Laboratory, Computer Science Department, University of Geneva, Geneva, Switzerland
  - Swiss Center for Affective Sciences, Campus Biotech, University of Geneva, Geneva, Switzerland
- Guillaume Chanel
  - Computer Vision and Multimedia Laboratory, Computer Science Department, University of Geneva, Geneva, Switzerland
  - Swiss Center for Affective Sciences, Campus Biotech, University of Geneva, Geneva, Switzerland
24. Simon T, Valmadre J, Matthews I, Sheikh Y. Kronecker-Markov Prior for Dynamic 3D Reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence 2017; 39:2201-2214. PMID: 27992328; DOI: 10.1109/tpami.2016.2638904.
Abstract
Recovering dynamic 3D structures from 2D image observations is highly under-constrained because of projection and missing data, motivating the use of strong priors to constrain shape deformation. In this paper, we empirically show that the spatiotemporal covariance of natural deformations is dominated by a Kronecker pattern. We demonstrate that this pattern arises as the limit of a spatiotemporal autoregressive process, and derive a Kronecker Markov Random Field as a prior distribution over dynamic structures. This distribution unifies shape and trajectory models of prior art and has the individual models as its marginals. The key assumption of the Kronecker MRF is that the spatiotemporal covariance is separable into the product of a temporal and a shape covariance, and can therefore be modeled using the matrix normal distribution. Analysis on motion capture data validates that this distribution is an accurate approximation with significantly fewer free parameters. Using the trace-norm, we present a convex method to estimate missing data from a single sequence when the marginal shape distribution is unknown. The Kronecker-Markov distribution, fit to a single sequence, outperforms state-of-the-art methods at inferring missing 3D data, and additionally provides covariance estimates of the uncertainty.
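The core separability assumption reads compactly in matrix-normal notation. The following restates the abstract's claim, with Σ_T the temporal and Σ_S the shape covariance (the notation is mine, not the paper's):

```latex
% Kronecker separability of the spatiotemporal covariance:
\Sigma_{\mathrm{st}} \;=\; \Sigma_T \otimes \Sigma_S .
% Equivalently, the dynamic structure X (time-by-shape matrix) follows
% a matrix normal distribution, whose vectorization is Gaussian:
X \sim \mathcal{MN}\!\left(M,\ \Sigma_T,\ \Sigma_S\right)
\;\Longleftrightarrow\;
\operatorname{vec}(X^{\top}) \sim
\mathcal{N}\!\big(\operatorname{vec}(M^{\top}),\ \Sigma_T \otimes \Sigma_S\big).
```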
Collapse
|
25
|
Liu P, Guo JM, Tseng SH, Wong K, Lee JD, Yao CC, Zhu D. Ocular Recognition for Blinking Eyes. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2017; 26:5070-5081. [PMID: 28600245 DOI: 10.1109/tip.2017.2713041] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Ocular recognition is expected to provide greater flexibility in practical applications than iris recognition, which only works in the ideal open-eye case. However, the accuracy of recent efforts remains far from satisfactory under uncontrolled conditions, such as eye blinking, which can leave the eye in any intermediate pose. To address these issues, skin texture, eyelids, and additional geometrical features are employed. In addition, to achieve higher accuracy, sequential forward floating selection is used to select the best feature combinations. Finally, a non-linear support vector machine is applied for identification. Experimental results demonstrate that the proposed algorithm achieves the best accuracy in both the open-eye and blinking-eye scenarios. As a result, it offers greater flexibility for prospective subjects during recognition, as well as higher reliability for security.
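A hedged sketch of the selection-plus-classification stage is below. Note that scikit-learn's SequentialFeatureSelector performs plain forward selection, so it is only a simplified stand-in for sequential forward floating selection (which also conditionally discards previously chosen features); the feature matrix is synthetic.
```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for concatenated skin-texture, eyelid, and
# geometrical features extracted from ocular images.
X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           random_state=0)

svm = SVC(kernel="rbf", gamma="scale")  # the non-linear classifier

# Greedy forward selection of a feature subset (plain, non-floating).
selector = SequentialFeatureSelector(svm, n_features_to_select=10,
                                     direction="forward", cv=5)
model = make_pipeline(StandardScaler(), selector, svm)
model.fit(X, y)
print("selected features:", np.flatnonzero(
    model.named_steps["sequentialfeatureselector"].get_support()))
```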
Collapse
|
26
|
Chu WS, De la Torre F, Cohn JF, Messinger DS. A Branch-and-Bound Framework for Unsupervised Common Event Discovery. Int J Comput Vis 2017; 123:372-391. [PMID: 28943718 DOI: 10.1007/s11263-017-0989-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Event discovery aims to find a temporal segment of interest, such as a human behavior, action, or activity. Most approaches to event discovery within or between time series use supervised learning. This becomes problematic when relevant event labels are unknown, events are difficult to detect, or not all possible combinations of events have been anticipated. To overcome these problems, this paper explores Common Event Discovery (CED), a new problem that aims to discover common events of variable-length segments in an unsupervised manner. A naive solution to CED searches over all possible pairs of segments, which incurs a prohibitive quartic cost. In this paper, we propose an efficient branch-and-bound (B&B) framework that avoids exhaustive search while guaranteeing a globally optimal solution. To this end, we derive novel bounding functions for various commonality measures and provide extensions to multiple commonality discovery and accelerated search. The B&B framework takes as input any multidimensional signal that can be quantified into histograms. A generalization of the framework can be readily applied to discover events at the same or different times (synchrony and event commonality, respectively). We also consider extensions to video search and supervised event detection. The effectiveness of the B&B framework is evaluated on motion capture of deliberate behavior and on video of spontaneous facial behavior in diverse interpersonal contexts: interviews, small groups of young adults, and parent-infant face-to-face interaction.
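To make the quartic cost concrete, the sketch below performs the exhaustive search over all segment pairs that the paper's bounding functions are designed to prune, scoring each pair with histogram intersection. Segment lengths, the similarity measure, and the data are illustrative.
```python
import numpy as np

def hist_intersection(h1, h2):
    """Commonality between two L1-normalized histograms."""
    return np.minimum(h1, h2).sum()

def naive_ced(H1, H2, min_len=5):
    """Exhaustive Common Event Discovery over all segment pairs.

    H1, H2: (T, B) per-frame histograms. Enumerating every
    (begin, end) pair in both series costs O(T^4) -- exactly the
    search that B&B avoids by bounding whole blocks of candidates.
    """
    best = (-np.inf, None)
    for b1 in range(len(H1) - min_len + 1):
        for e1 in range(b1 + min_len, len(H1) + 1):
            h1 = H1[b1:e1].sum(0); h1 /= h1.sum()
            for b2 in range(len(H2) - min_len + 1):
                for e2 in range(b2 + min_len, len(H2) + 1):
                    h2 = H2[b2:e2].sum(0); h2 /= h2.sum()
                    s = hist_intersection(h1, h2)
                    if s > best[0]:
                        best = (s, (b1, e1, b2, e2))
    return best

rng = np.random.default_rng(0)
H1, H2 = rng.random((20, 8)), rng.random((20, 8))
print(naive_ced(H1, H2))
```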
Collapse
Affiliation(s)
| | | | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, USA
- Department of Psychology, University of Pittsburgh, USA
| | | |
Collapse
|
27
|
Abstract
This work focuses on developing a 2D Canny edge-based deformable image registration (Canny DIR) algorithm to register in vivo white light images taken at various time points. The method uses a sparse interpolation deformation algorithm to sparsely register regions of the image with strong edge information. A stability criterion is enforced that removes edge regions that do not deform in a smooth, uniform manner. Using a synthetic mouse surface ground truth model, the accuracy of the Canny DIR algorithm was evaluated under axial rotation in the presence of deformation. The accuracy was also tested using fluorescent dye injections, which were then used for gamma analysis to establish a second ground truth. The results indicate that the Canny DIR algorithm performs better than rigid registration, intensity-corrected Demons, and distinctive-feature methods for all evaluation metrics and ground truth scenarios. In conclusion, Canny DIR performs well in the presence of the unique lighting and shading variations associated with white-light-based image registration.
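A rough sketch of the edge-driven idea, under stated assumptions: Canny edges in the fixed image supply sparse points, pyramidal Lucas-Kanade tracking stands in for the paper's edge matching and stability criterion, and a thin-plate-spline interpolator densifies the sparse displacements. None of these components is claimed to be the paper's exact algorithm.
```python
import cv2
import numpy as np
from scipy.interpolate import RBFInterpolator

def canny_dir_sketch(fixed, moving, max_pts=400):
    """Edge-driven deformable registration sketch (grayscale uint8 images)."""
    # Sparse points: Canny edges in the fixed image.
    edges = cv2.Canny(fixed, 50, 150)
    pts = np.argwhere(edges > 0)[:, ::-1].astype(np.float32)  # (x, y)
    pts = pts[np.random.default_rng(0).permutation(len(pts))[:max_pts]]

    # Track edge points into the moving image; the LK status flag acts
    # as a crude stand-in for the paper's stability criterion.
    matched, status, _ = cv2.calcOpticalFlowPyrLK(
        fixed, moving, pts.reshape(-1, 1, 2), None)
    ok = status.ravel() == 1
    src, dst = pts[ok], matched.reshape(-1, 2)[ok]

    # Interpolate sparse displacements into a dense deformation field
    # (RBFInterpolator defaults to a thin-plate spline).
    interp = RBFInterpolator(src, dst - src, smoothing=1.0)
    h, w = moving.shape
    grid = np.stack(np.meshgrid(np.arange(w), np.arange(h)), -1)
    field = interp(grid.reshape(-1, 2)).reshape(h, w, 2)

    # Resample the moving image into the fixed frame.
    map_xy = (grid + field).astype(np.float32)
    return cv2.remap(moving, map_xy[..., 0], map_xy[..., 1], cv2.INTER_LINEAR)
```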
Collapse
Affiliation(s)
- Vasant Kearney
- Department of Radiation Oncology, University of California, San Francisco, CA, USA
- Department of Bioengineering, University of Texas Arlington, Arlington, TX, USA
| | | | | | | | | |
Collapse
|
28
|
De la Torre F, Cohn JF. Confidence Preserving Machine for Facial Action Unit Detection. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2016; 25:4753-4767. [PMID: 27479964 PMCID: PMC5272912 DOI: 10.1109/tip.2016.2594486] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
Facial action unit (AU) detection from video has been a long-standing problem in automated facial expression analysis. While progress has been made, accurate detection of facial AUs remains challenging due to ubiquitous sources of error, such as inter-personal variability, pose, and low-intensity AUs. In this paper, we refer to samples causing such errors as hard samples and to the remainder as easy samples. To address learning with hard samples, we propose the confidence preserving machine (CPM), a novel two-stage learning framework that combines multiple classifiers following an "easy-to-hard" strategy. During the training stage, CPM learns two confident classifiers. Each classifier focuses on separating the easy samples of one class from all else, and thus preserves confidence in predicting its class. During the test stage, the confident classifiers provide "virtual labels" for easy test samples. Given the virtual labels, we propose a quasi-semi-supervised (QSS) learning strategy to learn a person-specific classifier. The QSS strategy employs a spatio-temporal smoothness term that encourages similar predictions for samples within a spatio-temporal neighborhood. In addition, to further improve detection performance, we introduce two CPM extensions: iterative CPM, which iteratively augments the training samples used to train the confident classifiers, and kernel CPM, which kernelizes the original CPM model to introduce nonlinearity. Experiments on four spontaneous datasets, GFT, BP4D, DISFA, and RU-FACS, illustrate the benefits of the proposed CPM models over baseline methods and state-of-the-art semi-supervised and transfer learning methods.
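The easy-to-hard strategy can be caricatured in a few lines: a generic classifier supplies "virtual labels" for confidently scored test frames, and those labels train a person-specific classifier for the remainder. This collapses the paper's pair of per-class confident classifiers into one model and omits the QSS spatio-temporal smoothness term; the thresholds are illustrative.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cpm_sketch(X_train, y_train, X_test, conf=0.9):
    """Two-stage 'easy-to-hard' sketch of confidence-preserving learning.

    Assumes both classes appear among the confidently scored frames.
    """
    # Stage 1: generic classifier scores the unseen test subject.
    generic = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    proba = generic.predict_proba(X_test)[:, 1]

    # Confident ('easy') frames receive virtual labels.
    easy = (proba > conf) | (proba < 1 - conf)
    virtual = (proba[easy] > 0.5).astype(int)

    # Stage 2: person-specific classifier trained on the virtual labels,
    # used to predict the remaining 'hard' frames.
    person = LogisticRegression(max_iter=1000).fit(X_test[easy], virtual)
    return np.where(easy, (proba > 0.5).astype(int), person.predict(X_test))
```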
Collapse
|
29
|
De la Torre F, Cohn JF. Joint Patch and Multi-label Learning for Facial Action Unit and Holistic Expression Recognition. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2016; 25:3931-3946. [PMID: 28113424 DOI: 10.1109/tip.2016.2570550] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Most action unit (AU) detection methods use one-versus-all classifiers without considering dependencies between features or AUs. In this paper, we introduce a joint patch and multi-label learning (JPML) framework that models the structured joint dependence among features, AUs, and their interplay. In particular, JPML leverages group sparsity to identify important facial patches and learns a multi-label classifier constrained by the likelihood of co-occurring AUs. To describe this likelihood, we derive two AU relations, positive correlation and negative competition, by statistically analyzing more than 350,000 video frames annotated with multiple AUs. To the best of our knowledge, this is the first work that jointly addresses patch learning and multi-label learning for AU detection. In addition, we show that JPML can be extended to recognize holistic expressions by learning common and specific patches, which afford a more compact representation than standard expression recognition methods. We evaluate JPML on three benchmark datasets, CK+, BP4D, and GFT, using within- and cross-dataset scenarios. In four of five experiments, JPML achieved the highest average F1 scores in comparison with baseline and alternative methods that use either patch learning or multi-label learning alone.
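JPML's combination of joint patch selection and multi-label coupling can be loosely imitated with a multi-task sparse linear model: MultiTaskLasso zeroes out whole feature rows jointly across all AU targets, a rough analogue of selecting informative patches shared by the labels. This is an analogy on synthetic data (with AUs treated as continuous targets), not the paper's optimization, and the AU co-occurrence constraints are not modeled here.
```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n, n_patches, dim, n_aus = 500, 20, 5, 8

# Synthetic stand-in: features from 20 facial patches, 5 dims each,
# where only the first 3 patches carry signal for all AUs.
X = rng.standard_normal((n, n_patches * dim))
W_true = np.zeros((n_patches * dim, n_aus))
W_true[:3 * dim] = rng.standard_normal((3 * dim, n_aus))
Y = X @ W_true + 0.1 * rng.standard_normal((n, n_aus))

# Row-sparse coefficients: each feature dimension is kept or dropped
# for ALL AUs at once, loosely mirroring joint patch selection.
model = MultiTaskLasso(alpha=0.5).fit(X, Y)
active = np.flatnonzero(np.abs(model.coef_).sum(axis=0))
print("active feature dims:", active)  # concentrated in the 3 true patches
```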
Collapse
|
30
|
Chu WS, Zeng J, De la Torre F, Cohn JF, Messinger DS. Unsupervised Synchrony Discovery in Human Interaction. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION 2015; 2015:3146-3154. [PMID: 27346988 PMCID: PMC4918688 DOI: 10.1109/iccv.2015.360] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
People are inherently social, and social interaction plays an important and natural role in human behavior. Most computational methods, however, focus on individuals in isolation rather than in a social context, and they also require labeled training data. We present an unsupervised approach to discovering interpersonal synchrony, defined as two or more persons performing common actions in overlapping video frames or segments. For computational efficiency, we develop a branch-and-bound (B&B) approach that avoids exhaustive search while guaranteeing a globally optimal solution. The proposed method is entirely general: it takes from two or more videos any multi-dimensional signal that can be represented as a histogram. We derive three novel bounding functions and provide efficient extensions, including multi-synchrony detection and accelerated search using a warm-start strategy and parallelism. We evaluate the effectiveness of our approach on multiple databases, including human actions in the CMU Mocap dataset [1], spontaneous facial behaviors in a group-formation task dataset [37], and a parent-infant interaction dataset [28].
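A naive stand-in for the synchrony objective is sketched below: slide a fixed-length window over two per-frame histogram streams and score aligned or lagged window pairs with histogram intersection. The B&B framework instead searches variable-length segments with bounding functions; the window length, lag range, and commonality measure here are illustrative.
```python
import numpy as np

def synchrony_scores(H1, H2, win=10, max_lag=5):
    """Score candidate synchronous windows between two people.

    H1, H2: (T, B) per-frame histograms (e.g., quantized motion or AU
    features). Returns the best (score, start, lag) under histogram
    intersection -- a naive stand-in for the B&B segment search.
    """
    T = min(len(H1), len(H2))
    best = (-np.inf, None, None)
    for lag in range(-max_lag, max_lag + 1):
        for s in range(max(0, -lag), T - win - abs(lag) + 1):
            h1 = H1[s:s + win].sum(0)
            h2 = H2[s + lag:s + lag + win].sum(0)
            h1, h2 = h1 / h1.sum(), h2 / h2.sum()
            score = np.minimum(h1, h2).sum()
            if score > best[0]:
                best = (score, s, lag)
    return best

rng = np.random.default_rng(1)
print(synchrony_scores(rng.random((60, 12)), rng.random((60, 12))))
```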
Collapse
Affiliation(s)
| | | | | | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, USA
- University of Pittsburgh, USA
| | | |
Collapse
|
31
|
Zhao K, Chu WS, De la Torre F, Cohn JF, Zhang H. Joint Patch and Multi-label Learning for Facial Action Unit Detection. PROCEEDINGS. IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION 2015; 2015:2207-2216. [PMID: 27382243 PMCID: PMC4930865 DOI: 10.1109/cvpr.2015.7298833] [Citation(s) in RCA: 108] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
The face is one of the most powerful channels of nonverbal communication. The most commonly used taxonomy to describe facial behavior is the Facial Action Coding System (FACS). FACS segments the visible effects of facial muscle activation into more than 30 action units (AUs). AUs, which may occur alone or in thousands of combinations, can describe nearly all possible facial expressions. Most existing methods for automatic AU detection treat the problem with one-vs-all classifiers and fail to exploit dependencies among AUs and facial features. We introduce joint patch and multi-label learning (JPML) to address these issues. JPML leverages group sparsity by selecting a sparse subset of facial patches while learning a multi-label classifier. In four of five comparisons on three diverse datasets, CK+, GFT, and BP4D, JPML produced the highest average F1 scores in comparison with the state of the art.
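The AU relations that constrain JPML's multi-label classifier (positive correlation and negative competition, derived in the journal version above from annotated frames) can be approximated from binary per-frame annotations with simple pairwise statistics, as in the hedged sketch below; both thresholds are invented for illustration.
```python
import numpy as np

def au_relations(labels, pos_thr=0.4, neg_thr=0.05):
    """Derive candidate AU relations from binary per-frame annotations.

    labels: (n_frames, n_aus) 0/1 matrix. Pairs with strongly correlated
    activations are flagged as positively correlated; pairs that almost
    never co-occur relative to chance are flagged as competing.
    """
    L = labels.astype(float)
    corr = np.corrcoef(L, rowvar=False)          # pairwise Pearson
    p = L.mean(0)                                # per-AU base rates
    co = (L.T @ L) / len(L)                      # joint frequency
    lift = co / np.outer(p, p).clip(1e-9)        # co-occurrence vs chance

    iu = np.triu_indices(L.shape[1], k=1)
    positive = [(i, j) for i, j in zip(*iu) if corr[i, j] > pos_thr]
    competing = [(i, j) for i, j in zip(*iu) if lift[i, j] < neg_thr]
    return positive, competing
```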
Collapse
Affiliation(s)
- Kaili Zhao
- School of Comm. and Info. Engineering, Beijing University of Posts and Telecom., Beijing China
| | - Wen-Sheng Chu
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213
| | | | - Jeffrey F Cohn
- Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213
- Department of Psychology, University of Pittsburgh, Pittsburgh, PA 15260
| | - Honggang Zhang
- School of Comm. and Info. Engineering, Beijing University of Posts and Telecom., Beijing China
| |
Collapse
|