1
Bamman D, Samberg R, So RJ, Zhou N. Measuring diversity in Hollywood through the large-scale computational analysis of film. Proc Natl Acad Sci U S A 2024; 121:e2409770121. [PMID: 39495931] [DOI: 10.1073/pnas.2409770121]
Abstract
Movies are a massively popular and influential form of media, but their computational study at scale has largely been off-limits to researchers in the United States due to the Digital Millennium Copyright Act. In this work, we illustrate use of a new regulatory framework to enable computational research on film that permits circumvention of technological protection measures on digital video discs (DVDs). We use this exemption to legally digitize a collection of 2,307 films representing the top 50 movies by U.S. box office over the period 1980 to 2022, along with award nominees. We design a computational pipeline for measuring the representation of gender and race/ethnicity in film, drawing on computer vision models for recognizing actors and human perceptions of gender and race/ethnicity. Doing so allows us to learn substantive facts about representation and diversity in Hollywood over this period, confirming earlier studies that see an increase in diversity over the past decade, while allowing us to use computational methods to uncover a range of ad hoc analytical findings. Our work illustrates the affordances of the data-driven analysis of film at a large scale.
Affiliation(s)
- David Bamman
- School of Information, University of California, Berkeley, CA 94704
- Richard Jean So
- Department of English, McGill University, Montreal, H3A 0G4, QC, Canada
- Naitian Zhou
- School of Information, University of California, Berkeley, CA 94704
2
DePaola NF, Wang KE, Frageau J, Huston TL. Racial Diversity of Patient Population Represented on United States Plastic Surgeons' Webpages. Ann Plast Surg 2024; 92:S210-S217. [PMID: 38556676] [DOI: 10.1097/sap.0000000000003855]
Abstract
Current literature demonstrates a lack of racial diversity in plastic surgery media. However, to our knowledge, no study has yet examined the racial diversity of Webpage content as if from a patient-search perspective. The objective of this study is to determine if there is a racial discrepancy between the US Census, American Society of Plastic Surgeons (ASPS) statistics, and the media featuring implied patients on US plastic surgeons' Webpages from a patient-focused approach. A Google search was completed using the term "(state) plastic surgeon." The first 10 relevant Web sites were collected for each state, and homepages were analyzed. In line with previous studies, the implied patients in media were classified into 1 of 6 skin tone categories: I, ivory; II, beige; III, light brown; IV, olive; V, brown; and VI, dark brown. These correlate to Fitzpatrick phototypes; however, the Fitzpatrick scale measures skin's response to UV exposure. Skin tone was used as a guide to measure racial representation in the media, with the caveat that skin tone does not absolutely correlate to racial identity. Categories I-III were further classified as "white" and IV-VI as "nonwhite." These data were compared with the 2020 ASPS demographics report and US Census. Four thousand eighty individuals were analyzed from 504 Webpages, the majority of which were those of private practice physicians. A total of 91.62% of individuals were classified as "white" and 8.38% "nonwhite." The distribution by category was as follows: I = 265, II = 847, III = 2626, IV = 266, V = 71, and VI = 5. Using χ2 analyses, a statistically significant difference was found between the racial representation within this sample and that of the 2020 US Census nationally (P < 0.001), regionally (P < 0.001), and subregionally (P < 0.001); the 2020 ASPS Cosmetic Summary Data (P < 0.001); and the 2020 ASPS Reconstructive Summary Data (P < 0.001). 
This study highlights the significant difference between racial representation on plastic surgeons' Webpages and the demographics of patients they serve. Further analyses should identify the impact of these representational disparities on patient care and clinical outcomes, as well as examine how best to measure racial diversity and disparities in patient-oriented media.
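The χ2 comparison described above can be sketched with the paper's reported category counts. The census proportions below are invented purely for illustration (the paper compares against actual 2020 US Census figures), and the critical value is the standard χ2 cutoff for P < .001 with one degree of freedom.

```python
# Chi-square goodness-of-fit sketch: observed skin-tone counts from
# webpages vs. expected counts under (hypothetical) census proportions.

def chi_square_gof(observed, expected):
    """Return the chi-square statistic for observed vs. expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Reported counts, collapsed to "white" (I-III) and "nonwhite" (IV-VI).
observed = [265 + 847 + 2626, 266 + 71 + 5]   # [3738, 342]
n = sum(observed)                              # 4080 individuals analyzed

# Hypothetical census proportions, for illustration only.
census = [0.60, 0.40]
expected = [p * n for p in census]

stat = chi_square_gof(observed, expected)
# With 1 degree of freedom, the P < .001 critical value is 10.828, so
# any statistic above that is significant at the paper's threshold.
print(stat > 10.828)   # True
```

With a two-category collapse the test has a single degree of freedom; the full six-category comparison works the same way with five degrees of freedom and a different critical value.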
Affiliation(s)
- Nicole F DePaola
- From the Renaissance School of Medicine at Stony Brook University
- Katherine E Wang
- From the Renaissance School of Medicine at Stony Brook University
- James Frageau
- From the Renaissance School of Medicine at Stony Brook University
- Tara L Huston
- Division of Plastic and Reconstructive Surgery, Stony Brook University Hospital, Stony Brook, NY
3
Zhang X, Wang J, Lane JM, Xu X, Sörensen S. Investigating Racial Disparities in Cancer Crowdfunding: A Comprehensive Study of Medical GoFundMe Campaigns. J Med Internet Res 2023; 25:e51089. [PMID: 38085562] [PMCID: PMC10751626] [DOI: 10.2196/51089]
Abstract
Background: In recent years, there has been growing concern about prejudice in crowdfunding; however, empirical research remains limited, particularly in the context of medical crowdfunding. This study addresses the pressing issue of racial disparities in medical crowdfunding, with a specific focus on cancer crowdfunding on the GoFundMe platform.
Objective: This study aims to investigate racial disparities in cancer crowdfunding using average donation amount, number of donations, and success of the fundraising campaign as outcomes.
Methods: Drawing from a substantial data set of 104,809 campaigns in the United States, we used DeepFace facial recognition technology to determine racial identities and used regression models to examine racial factors in crowdfunding performance. We also examined the moderating effect of the proportion of White residents on crowdfunding bias and used 2-tailed t tests to measure the influence of racial anonymity on crowdfunding success. Owing to the large sample size, we set the cutoff for significance at P<.001.
Results: In the regression and supplementary analyses, the racial identity of the fundraiser significantly predicted average donations (P<.001), indicating that implicit bias may play a role in donor behavior. Gender (P=.04) and campaign description length (P=.62) did not significantly predict the average donation amounts. The race of the fundraiser was not significantly associated with the number of donations (P=.42). The success rate of cancer crowdfunding campaigns, although generally low (11.77%), showed a significant association with the race of the fundraiser (P<.001). After controlling for the covariates of the fundraiser gender, fundraiser age, local White proportion, length of campaign description, and fundraising goal, the average donation amount to White individuals was 17.68% higher than for Black individuals. Moreover, campaigns that did not disclose racial information demonstrated a marginally higher average donation amount (3.92%) than those identified as persons of color. Furthermore, the racial composition of the fundraiser's county of residence was found to exert influence (P<.001); counties with a higher proportion of White residents exhibited reduced racial disparities in crowdfunding outcomes.
Conclusions: This study contributes to a deeper understanding of racial disparities in cancer crowdfunding. It highlights the impact of racial identity, geographic context, and the potential for implicit bias in donor behavior. As web-based platforms evolve, addressing racial inequality and promoting fairness in health care financing remain critical goals. Insights from this research suggest strategies such as maintaining racial anonymity and ensuring that campaigns provide strong evidence of deservingness. Moreover, broader societal changes are necessary to eliminate the financial distress that drives individuals to seek crowdfunding support.
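The regression design described above can be sketched on synthetic data: a binary race indicator plus covariates predicting log average donation, fit by ordinary least squares. Everything here is simulated (the variable names, the 0.16 "true" effect, and the covariate are illustrative inventions); the 17.68% gap is the paper's finding, not something this toy reproduces.

```python
# OLS sketch: log average donation regressed on a fundraiser-race
# indicator plus a covariate, on fully synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
race_white = rng.integers(0, 2, n)     # 1 = White, 0 = Black (toy coding)
goal = rng.normal(9.0, 1.0, n)         # log fundraising goal (covariate)

# Simulate log average donation with a true race effect of 0.16.
log_donation = 3.0 + 0.16 * race_white + 0.05 * goal + rng.normal(0, 0.3, n)

X = np.column_stack([np.ones(n), race_white, goal])
beta, *_ = np.linalg.lstsq(X, log_donation, rcond=None)

# beta[1] estimates the race coefficient; exp(beta[1]) - 1 is the
# implied percentage gap in average donations between groups.
gap_pct = (np.exp(beta[1]) - 1) * 100
print(round(gap_pct, 1))
```

On a log-scale outcome, a coefficient of 0.16 corresponds to roughly a 17% multiplicative gap, which is how a percentage difference like the one reported above maps onto a linear model.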
Affiliation(s)
- Xupin Zhang
- School of Economics and Management, East China Normal University, Shanghai, China
- Jingjing Wang
- School of Economics and Management, East China Normal University, Shanghai, China
- Jamil M Lane
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Xin Xu
- School of Economics and Management, East China Normal University, Shanghai, China
- Silvia Sörensen
- Warner School for Education and Human Development, University of Rochester, Rochester, NY, United States
4
Martinez JE. Facecraft: Race Reification in Psychological Research With Faces. Perspect Psychol Sci 2023:17456916231194953. [PMID: 37819250] [DOI: 10.1177/17456916231194953]
Abstract
Faces are socially important surfaces of the body on which various meanings are attached. The widespread physiognomic belief that faces inherently contain socially predictive value is why they make a generative stimulus for perception research. However, critical problems arise in studies that simultaneously investigate faces and race. Researchers studying race and racism inadvertently engage in various research practices that transform faces with specific phenotypes into straightforward representatives of their presumed race category, thereby taking race and its phenotypic associations for granted. I argue that research practices that map race categories onto faces using bioessentialist ideas of racial phenotypes constitute a form of racecraft ideology, the dubious reasoning of which presupposes the reality of race and mystifies the causal relation between race and racism. In considering how to study racism without reifying race in face studies, this article places these practices in context, describes how they reproduce racecraft ideology and impair theoretical inferences, and then suggests counterpractices for minimizing this problem.
Affiliation(s)
- Joel E Martinez
- Data Science Initiative, Harvard University
- Department of Psychology, Harvard University
5
Robinson JP, Qin C, Henon Y, Timoner S, Fu Y. Balancing Biases and Preserving Privacy on Balanced Faces in the Wild. IEEE Trans Image Process 2023; 32:4365-4377. [PMID: 37467097] [DOI: 10.1109/tip.2023.3282837]
Abstract
There are demographic biases present in current facial recognition (FR) models. To measure these biases across different ethnic and gender subgroups, we introduce our Balanced Faces in the Wild (BFW) dataset. This dataset allows for the characterization of FR performance per subgroup. We found that relying on a single score threshold to differentiate between genuine and imposters sample pairs leads to suboptimal results. Additionally, performance within subgroups often varies significantly from the global average. Therefore, specific error rates only hold for populations that match the validation data. To mitigate imbalanced performances, we propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks. This scheme boosts the average performance and preserves identity information while removing demographic knowledge. Removing demographic knowledge prevents potential biases from affecting decision-making and protects privacy by eliminating demographic information. We explore the proposed method and demonstrate that subgroup classifiers can no longer learn from features projected using our domain adaptation scheme. For access to the source code and data, please visit https://github.com/visionjo/facerec-bias-bfw.
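The single-threshold problem described above can be sketched with synthetic similarity scores: when one subgroup's imposter pairs score systematically higher, a global threshold yields unequal false-match rates. The score distributions and shift below are invented for illustration, not taken from BFW.

```python
# One global decision threshold, two subgroups with different imposter
# score distributions -> different false-match rates per subgroup.
import numpy as np

rng = np.random.default_rng(1)

def false_match_rate(imposter_scores, threshold):
    """Fraction of imposter pairs wrongly accepted at this threshold."""
    return float(np.mean(imposter_scores >= threshold))

# Synthetic imposter similarity scores; subgroup B's imposters score
# systematically higher (a common demographic-bias pattern in FR).
imposters_a = rng.normal(0.30, 0.10, 10000)
imposters_b = rng.normal(0.40, 0.10, 10000)

threshold = 0.55   # one global operating point for both subgroups
fmr_a = false_match_rate(imposters_a, threshold)
fmr_b = false_match_rate(imposters_b, threshold)
print(fmr_b > fmr_a)   # True: the global error rate misstates both groups
```

This is why the abstract notes that "specific error rates only hold for populations that match the validation data": per-subgroup calibration (or the paper's debiasing scheme) is needed to equalize operating points.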
6
Eyes versus Eyebrows: A Comprehensive Evaluation Using the Multiscale Analysis and Curvature-Based Combination Methods in Partial Face Recognition. Algorithms 2022. [DOI: 10.3390/a15060208]
Abstract
This work aimed to find the most discriminative facial regions between the eyes and eyebrows for periocular biometric features in a partial face recognition system. We propose multiscale analysis methods combined with curvature-based methods. The goal of this combination was to capture the details of these features at finer scales and offer them in-depth characteristics using curvature. The eye and eyebrow images cropped from four face 2D image datasets were evaluated. The recognition performance was calculated using the nearest neighbor and support vector machine classifiers. Our proposed method successfully produced richer details in finer scales, yielding high recognition performance. The highest accuracy results were 76.04% and 98.61% for the limited dataset and 96.88% and 93.22% for the larger dataset for the eye and eyebrow images, respectively. Moreover, we compared the results between our proposed methods and other works, and we achieved similar high accuracy results using only eye and eyebrow images.
7
Zeng D, Wu Z, Ding C, Ren Z, Yang Q, Xie S. Labeled-Robust Regression: Simultaneous Data Recovery and Classification. IEEE Trans Cybern 2022; 52:5026-5039. [PMID: 33151887] [DOI: 10.1109/tcyb.2020.3026101]
Abstract
Rank minimization is widely used to extract low-dimensional subspaces. As a convex relaxation of the rank minimization, the problem of nuclear norm minimization has been attracting widespread attention. However, the standard nuclear norm minimization usually results in overcompression of data in all subspaces and eliminates the discrimination information between different categories of data. To overcome these drawbacks, in this article, we introduce the label information into the nuclear norm minimization problem and propose a labeled-robust principal component analysis (L-RPCA) to realize nuclear norm minimization on multisubspace data. Compared with the standard nuclear norm minimization, our method can effectively utilize the discriminant information in multisubspace rank minimization and avoid excessive elimination of local information and multisubspace characteristics of the data. Then, an effective labeled-robust regression (L-RR) method is proposed to simultaneously recover the data and labels of the observed data. Experiments on real datasets show that our proposed methods are superior to other state-of-the-art methods.
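The nuclear-norm minimization discussed above is typically solved with singular value soft-thresholding, the proximal operator of the nuclear norm. A minimal sketch of that operator (not the paper's labeled L-RPCA/L-RR formulation, just the standard building block):

```python
# Singular value thresholding: prox of tau * ||X||_* at M.
import numpy as np

def svt(M, tau):
    """Soft-threshold the singular values of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

rng = np.random.default_rng(0)
# Rank-2 ground truth plus small dense noise.
L = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 50))
M = L + 0.01 * rng.normal(size=(50, 50))

X = svt(M, tau=1.0)
# Thresholding zeroes the small noise singular values, so the result
# is numerically low rank while the two dominant directions survive.
rank = int(np.sum(np.linalg.svd(X, compute_uv=False) > 1e-8))
print(rank)   # 2
```

The paper's contribution is to add label information so that this shrinkage does not "overcompress" all subspaces into one and erase between-class discrimination; the operator itself is unchanged.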
8
Golder S, Stevens R, O'Connor K, James R, Gonzalez-Hernandez G. Methods to Establish Race or Ethnicity of Twitter Users: Scoping Review. J Med Internet Res 2022; 24:e35788. [PMID: 35486433] [PMCID: PMC9107046] [DOI: 10.2196/35788]
Abstract
Background: A growing amount of health research uses social media data. Those critical of social media research often cite that it may be unrepresentative of the population; however, the suitability of social media data in digital epidemiology is more nuanced. Identifying the demographics of social media users can help establish representativeness.
Objective: This study aims to identify the different approaches or combination of approaches to extract race or ethnicity from social media and report on the challenges of using these methods.
Methods: We present a scoping review to identify methods used to extract the race or ethnicity of Twitter users from Twitter data sets. We searched 17 electronic databases from the date of inception to May 15, 2021, and carried out reference checking and hand searching to identify relevant studies. Sifting of each record was performed independently by at least two researchers, with any disagreement discussed. Studies were required to extract the race or ethnicity of Twitter users using either manual or computational methods or a combination of both.
Results: Of the 1249 records sifted, we identified 67 (5.36%) that met our inclusion criteria. Most studies (51/67, 76%) have focused on US-based users and English language tweets (52/67, 78%). A range of data was used, including Twitter profile metadata, such as names, pictures, information from bios (including self-declarations), or location or content of the tweets. A range of methodologies was used, including manual inference, linkage to census data, commercial software, language or dialect recognition, or machine learning or natural language processing. However, not all studies have evaluated these methods. Those that evaluated these methods found accuracy to vary from 45% to 93%, with significantly lower accuracy in identifying categories of people of color. The inference of race or ethnicity raises important ethical questions, which can be exacerbated by the data and methods used. The comparative accuracies of the different methods are also largely unknown.
Conclusions: There is no standard accepted approach or current guidelines for extracting or inferring the race or ethnicity of Twitter users. Social media researchers must carefully interpret race or ethnicity and not overpromise what can be achieved, as even manual screening is a subjective, imperfect method. Future research should establish the accuracy of methods to inform evidence-based best practice guidelines for social media researchers and be guided by concerns of equity and social justice.
Affiliation(s)
- Su Golder
- Department of Health Sciences, University of York, York, United Kingdom
- Robin Stevens
- School of Communication and Journalism, University of Southern California, Los Angeles, CA, United States
- Karen O'Connor
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Richard James
- School of Nursing Liaison and Clinical Outreach Coordinator, University of Pennsylvania, Philadelphia, PA, United States
- Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
9
Face Recognition Based on Deep Learning and FPGA for Ethnicity Identification. Appl Sci (Basel) 2022. [DOI: 10.3390/app12052605]
Abstract
In the last decade, there has been a surge of interest in addressing complex Computer Vision (CV) problems in the field of face recognition (FR). In particular, one of the most difficult ones is based on the accurate determination of the ethnicity of mankind. In this regard, a new classification method using Machine Learning (ML) tools is proposed in this paper. Specifically, a new Deep Learning (DL) approach based on a Deep Convolutional Neural Network (DCNN) model is developed, which achieves a reliable determination of the ethnicity of people based on their facial features. However, it is necessary to make use of specialized high-performance computing (HPC) hardware to build a workable DCNN-based FR system due to the low computation power given by the current central processing units (CPUs). Recently, the latter approach has increased the efficiency of the network in terms of power usage and execution time. Then, the usage of field-programmable gate arrays (FPGAs) was considered in this work. The performance of the new DCNN-based FR method using FPGA was compared against that using graphics processing units (GPUs). The experimental results considered an image dataset composed of 3141 photographs of citizens from three distinct countries. To our knowledge, this is the first image collection gathered specifically to address the ethnicity identification problem. Additionally, the ethnicity dataset was made publicly available as a novel contribution to this work. Finally, the experimental results proved the high performance provided by the proposed DCNN model using FPGAs, achieving an accuracy level of 96.9 percent and an F1 score of 94.6 percent while using a reasonable amount of energy and hardware resources.
10
Automatic Ethnicity Classification from Middle Part of the Face Using Convolutional Neural Networks. Informatics 2022. [DOI: 10.3390/informatics9010018]
Abstract
In the field of face biometrics, finding the identity of a person in an image is most researched, but there are other, soft biometric information that are equally as important, such as age, gender, ethnicity or emotion. Nowadays, ethnicity classification has a wide application area and is a prolific area of research. This paper gives an overview of recent advances in ethnicity classification with focus on convolutional neural networks (CNNs) and proposes a new ethnicity classification method using only the middle part of the face and CNN. The paper also compares the differences in results of CNN with and without plotted landmarks. The proposed model was tested using holdout testing method on UTKFace dataset and FairFace dataset. The accuracy of the model was 80.34% for classification into five classes and 61.74% for classification into seven classes, which is slightly better than state-of-the-art, but it is also important to note that results in this paper are obtained by using only the middle part of the face which reduces the time and resources necessary.
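The "middle part of the face" input described above can be sketched as a crop of the central horizontal band of an aligned face image. The band fractions below are illustrative assumptions, not the paper's exact crop geometry.

```python
# Crop the central band (roughly the eyes/nose region) from an
# aligned face image stored as an H x W x 3 array.
import numpy as np

def middle_band(face, top=0.25, bottom=0.65):
    """Return the central horizontal band of an aligned face array."""
    h = face.shape[0]
    return face[int(h * top):int(h * bottom), :]

face = np.zeros((200, 200, 3), dtype=np.uint8)   # stand-in aligned face
crop = middle_band(face)
print(crop.shape)   # (80, 200, 3)
```

Feeding only this band to a CNN shrinks the input (and thus training time and resources) relative to the whole face, which is the efficiency argument the abstract makes.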
11
Duan M, Li K, Li K, Tian Q. A Novel Multi-task Tensor Correlation Neural Network for Facial Attribute Prediction. ACM Trans Intell Syst Technol 2021. [DOI: 10.1145/3418285]
Abstract
Multi-task learning plays an important role in face multi-attribute prediction. At present, most researches excavate the shared information between attributes by sharing all convolutional layers. However, it is not appropriate to treat the low-level and high-level features of the face multi-attribute equally, because the high-level features are more biased toward the specific content of the category. In this article, a novel multi-attribute tensor correlation neural network (MTCN) is used to predict face attributes. MTCN shares all attribute features at the low-level layers, and then distinguishes each attribute feature at the high-level layers. To better excavate the correlations among high-level attribute features, each sub-network explores useful information from other networks to enhance its original information. Then a tensor canonical correlation analysis method is used to seek the correlations among the highest-level attributes, which enhances the original information of each attribute. After that, these features are mapped into a highly correlated space through the correlation matrix. Finally, we use sufficient experiments to verify the performance of MTCN on the CelebA and LFWA datasets and our MTCN achieves the best performance compared with the latest multi-attribute recognition algorithms under the same settings.
Affiliation(s)
- Keqin Li
- State University of New York, USA
12
Recognizing Human Races through Machine Learning—A Multi-Network, Multi-Features Study. Mathematics 2021. [DOI: 10.3390/math9020195]
Abstract
The human face holds a privileged position in multi-disciplinary research as it conveys much information—demographical attributes (age, race, gender, ethnicity), social signals, emotion expression, and so forth. Studies have shown that due to the distribution of ethnicity/race in training datasets, biometric algorithms suffer from “cross race effect”—their performance is better on subjects closer to the “country of origin” of the algorithm. The contributions of this paper are two-fold: (a) first, we gathered, annotated and made public a large-scale database of (over 175,000) facial images by automatically crawling the Internet for celebrities’ images belonging to various ethnicity/races, and (b) we trained and compared four state of the art convolutional neural networks on the problem of race and ethnicity classification. To the best of our knowledge, this is the largest, data-balanced, publicly-available face database annotated with race and ethnicity information. We also studied the impact of various face traits and image characteristics on the race/ethnicity deep learning classification methods and compared the obtained results with the ones extracted from psychological studies and anthropomorphic studies. Extensive tests were performed in order to determine the facial features to which the networks are sensitive to. These tests and a recognition rate of 96.64% on the problem of human race classification demonstrate the effectiveness of the proposed solution.
13
Carletti V, Greco A, Percannella G, Vento M. Age from Faces in the Deep Learning Revolution. IEEE Trans Pattern Anal Mach Intell 2020; 42:2113-2132. [PMID: 30990174] [DOI: 10.1109/tpami.2019.2910522]
Abstract
Face analysis includes a variety of specific problems as face detection, person identification, gender and ethnicity recognition, just to name the most common ones; in the last two decades, significant research efforts have been devoted to the challenging task of age estimation from faces, as witnessed by the high number of published papers. The explosion of the deep learning paradigm, that is determining a spectacular increasing of the performance, is in the public eye; consequently, the number of approaches based on deep learning is impressively growing and this also happened for age estimation. The exciting results obtained have been recently surveyed on almost all the specific face analysis problems; the only exception stands for age estimation, whose last survey dates back to 2010 and does not include any deep learning based approach to the problem. This paper provides an analysis of the deep methods proposed in the last six years; these are analysed from different points of view: the network architecture together with the learning procedure, the used datasets, data preprocessing and augmentation, and the exploitation of additional data coming from gender, race and face expression. The review is completed by discussing the results obtained on public datasets, so as the impact of different aspects on system performance, together with still open issues.
14
Stereotypes and Structure in the Interaction between Facial Emotional Expression and Sex Characteristics. Adapt Hum Behav Physiol 2020. [DOI: 10.1007/s40750-020-00141-5]
15
Katti H, Arun SP. Are you from North or South India? A hard face-classification task reveals systematic representational differences between humans and machines. J Vis 2020; 19:1. [PMID: 31260515] [PMCID: PMC6607925] [DOI: 10.1167/19.7.1]
Abstract
We make a rich variety of judgments on faces, but the underlying features are poorly understood. Here we describe a challenging geographical-origin classification problem that elucidates feature representations in both humans and machine algorithms. In Experiment 1, we collected a diverse set of 1,647 faces from India labeled with their fine-grained geographical origin (North vs. South India), characterized the categorization performance of 129 human subjects on these faces, and compared this with the performance of machine vision algorithms. Our main finding is that while many machine algorithms achieved an overall performance comparable to that of humans (64%), their error patterns across faces were qualitatively different despite training. To elucidate the face parts used by humans for classification, we trained linear classifiers on overcomplete sets of features derived from each face part. This revealed mouth shape to be the most discriminative part compared to eyes, nose, or external contour. In Experiment 2, we confirmed that humans relied the most on mouth shape for classification using an additional experiment in which subjects classified faces with occluded parts. In Experiment 3, we compared human performance for briefly viewed faces and for inverted faces. Interestingly, human performance on inverted faces was predicted better by computational models compared to upright faces, suggesting that humans use relatively more generic features on inverted faces. Taken together, our results show that studying hard classification tasks can lead to useful insights into both machine and human vision.
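The part-wise analysis described above — training a linear classifier on features from each face part and comparing accuracies — can be sketched on synthetic data. The Gaussian features below are invented, with the class separation deliberately made largest for the "mouth" part to mimic the paper's finding; nothing here uses the actual face features.

```python
# Per-part linear classification sketch: the part with the most
# separable (synthetic) features wins, as mouth shape did in the study.
import numpy as np

rng = np.random.default_rng(0)

def make_part(separation, n=400, d=20):
    """Two-class synthetic features with a given per-dimension mean gap."""
    X0 = rng.normal(0.0, 1.0, (n, d))
    X1 = rng.normal(separation, 1.0, (n, d))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

def centroid_accuracy(X, y):
    """Accuracy of a nearest-class-centroid (linear) classifier."""
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    pred = np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)
    return float(np.mean(pred == y))

accs = {part: centroid_accuracy(*make_part(sep))
        for part, sep in [("mouth", 0.6), ("eyes", 0.3), ("contour", 0.15)]}
print(max(accs, key=accs.get))   # mouth
```

The occlusion experiment in the paper is the behavioral analogue of this comparison: removing the most discriminative part should hurt human accuracy the most.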
Affiliation(s)
- Harish Katti
- Centre for Neuroscience, Indian Institute of Science, Bangalore, India
- S P Arun
- Centre for Neuroscience, Indian Institute of Science, Bangalore, India
16
Detection of Emotion Using Multi-Block Deep Learning in a Self-Management Interview App. Appl Sci (Basel) 2019. [DOI: 10.3390/app9224830]
Abstract
Recently, domestic universities have constructed and operated online mock interview systems for students’ preparation for employment. Students can have a mock interview anywhere and at any time through the online mock interview system, and can improve any problems during the interviews via images stored in real time. For such practice, it is necessary to analyze the emotional state of the student based on the situation, and to provide coaching through accurate analysis of the interview. In this paper, we propose detection of user emotions using multi-block deep learning in a self-management interview application. Unlike the basic structure for learning about whole-face images, the multi-block deep learning method helps the user learn after sampling the core facial areas (eyes, nose, mouth, etc.), which are important factors for emotion analysis from face detection. Through the multi-block process, sampling is carried out using multiple AdaBoost learning. For optimal block image screening and verification, similarity measurement is also performed during this process. A performance evaluation of the proposed model compares the proposed system with AlexNet, which has mainly been used for facial recognition in the past. As comparison items, the recognition rate and extraction time of the specific area are compared. The extraction time of the specific area decreased by 2.61%, and the recognition rate increased by 3.75%, indicating that the proposed facial recognition method is excellent. It is expected to provide good-quality, customized interview education for job seekers by establishing a systematic interview system using the proposed deep learning method.
|
17
|
Becerra-Riera F, Morales-González A, Méndez-Vázquez H. A survey on facial soft biometrics for video surveillance and forensic applications. Artif Intell Rev 2019. [DOI: 10.1007/s10462-019-09689-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
18
|
Abstract
Race recognition (RR), which has many applications such as surveillance systems and image/video understanding and analysis, is a difficult problem to solve completely. To contribute towards solving it, this article investigates a deep learning approach. An efficient Race Recognition Framework (RRF) is proposed, comprising information collector (IC), face detection and preprocessing (FD&P), and RR modules. For the RR module, this study proposes two independent models. The first is RR using a deep convolutional neural network (CNN) (the RR-CNN model). The second (the RR-VGG model) is a fine-tuned model for RR based on VGG, the well-known pre-trained model for object recognition. To examine the performance of the proposed framework, we run an experiment on our dataset, VNFaces, composed specifically of images collected from Facebook pages of Vietnamese people, and compare the accuracy of RR-CNN and RR-VGG. The results show that on VNFaces the RR-VGG model with augmented input images yields the best accuracy at 88.87%, while RR-CNN, an independent and lightweight model, reaches 88.64%. Extension experiments show that the proposed models can be applied to other datasets, such as Japanese, Chinese, or Brazilian faces, with over 90% accuracy; the fine-tuned RR-VGG model achieved the best accuracy and is recommended for most scenarios.
|
19
|
Sun Y, Zhang M, Sun Z, Tan T. Demographic Analysis from Biometric Data: Achievements, Challenges, and New Frontiers. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2018; 40:332-351. [PMID: 28212078 DOI: 10.1109/tpami.2017.2669035] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Biometrics is the technique of automatically recognizing individuals based on their biological or behavioral characteristics. Various biometric traits have been introduced and widely investigated, including fingerprint, iris, face, voice, palmprint, gait, and so forth. Apart from identity, biometric data may convey various other personal information, covering affect, age, gender, race, accent, handedness, height, weight, etc. Among these, the analysis of demographics (age, gender, and race) has received tremendous attention owing to its wide real-world applications, with significant efforts devoted and great progress achieved. This survey first presents biometric demographic analysis from the standpoint of human perception, then provides a comprehensive overview of state-of-the-art advances in automated estimation from both academia and industry. Despite these advances, a number of challenging issues continue to inhibit its full potential. We then discuss these open problems, and finally provide an outlook on the future of this very active field of research by sharing some promising opportunities.
|
20
|
|
21
|
Swearingen T, Ross A. Label propagation approach for predicting missing biographic labels in face‐based biometric records. IET BIOMETRICS 2017. [DOI: 10.1049/iet-bmt.2017.0117] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Affiliation(s)
- Thomas Swearingen
- Computer Science and Engineering Department, Michigan State University, East Lansing, MI, USA
| | - Arun Ross
- Computer Science and Engineering Department, Michigan State University, East Lansing, MI, USA
| |
|
22
|
Learned Features are Better for Ethnicity Classification. CYBERNETICS AND INFORMATION TECHNOLOGIES 2017. [DOI: 10.1515/cait-2017-0036] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Ethnicity is a key demographic attribute of human beings; it plays a vital role in automatic facial recognition and has extensive real-world applications such as human-computer interaction (HCI), demographics-based classification, biometric recognition, and security and defense, to name a few. In this paper, we present a novel approach for extracting ethnicity from facial images. The proposed method uses a pre-trained Convolutional Neural Network (CNN) to extract features, and a Support Vector Machine (SVM) with a linear kernel as the classifier. This technique uses translation-invariant hierarchical features learned by the network, in contrast to previous works, which use hand-crafted features such as Local Binary Patterns (LBP), Gabor filters, etc. Thorough experiments on ten different facial databases strongly suggest that the approach is robust to different expression and illumination conditions. Ethnicity classification is treated here as a three-class problem covering Asian, African-American, and Caucasian. Average classification accuracy over all databases is 98.28%, 99.66%, and 99.05% for Asian, African-American, and Caucasian, respectively. All code is available on request for reproducing the results.
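The pipeline this abstract describes — frozen pre-trained CNN features followed by a linear SVM — can be sketched as below. The 512-dimensional random cluster vectors are a stand-in assumption for CNN activations (which cannot be reproduced here), so only the classifier stage of the method is illustrated.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Stand-ins for pre-trained CNN activations: one 512-d vector per face image.
# Three well-separated synthetic clusters imitate the three ethnicity classes.
n_per_class, dim = 50, 512
centers = rng.normal(size=(3, dim))
X = np.vstack([c + 0.1 * rng.normal(size=(n_per_class, dim)) for c in centers])
y = np.repeat(["Asian", "African-American", "Caucasian"], n_per_class)

# Linear-kernel SVM on the deep features, matching the paper's classifier choice.
clf = LinearSVC(C=1.0).fit(X, y)
train_acc = clf.score(X, y)
```

In practice X would come from a forward pass through the pre-trained network (e.g. activations of a late layer), with the SVM trained on those fixed features.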
|
23
|
Facial semantic representation for ethnical Chinese minorities based on geometric similarity. INT J MACH LEARN CYB 2017. [DOI: 10.1007/s13042-017-0726-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
24
|
Chen H, Gao M, Ricanek K, Xu W, Fang B. A Novel Race Classification Method Based on Periocular Features Fusion. INT J PATTERN RECOGN 2017. [DOI: 10.1142/s0218001417500264] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Race identification is a natural ability of the human visual system, and race classification by machine from face images has several practical applications. Many race classification methods have been introduced, employing holistic face analysis, local feature extraction, and 3D models. In this paper, we propose a novel fused feature based on periocular region features for distinguishing East Asian from Caucasian faces. Using periocular landmarks, we extract five local texture or geometric features from regions of interest that carry discriminative race information. These features are then fused into a single discriminative feature by AdaBoost training. On the composite OFD-FERET face database, the method achieves excellent average accuracy. We also conduct extensive additional experiments to examine how gender, landmark detection, glasses, and image size affect performance.
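The fusion step — boosting over concatenated local descriptors so that the most discriminative components carry the most weight — can be sketched with scikit-learn's AdaBoostClassifier. The synthetic "texture" and "geometry" blocks and the toy labels are assumptions for illustration; the paper's actual descriptors are computed around periocular landmarks.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(1)

def make_features(shift, n=80):
    """Hypothetical local descriptors: a texture block and a geometry block."""
    texture = rng.normal(loc=shift, size=(n, 10))
    geometry = rng.normal(loc=-shift, size=(n, 4))
    return np.hstack([texture, geometry])  # concatenation = feature-level fusion

X = np.vstack([make_features(0.0), make_features(0.8)])
y = np.array([0] * 80 + [1] * 80)  # toy labels: 0 = East Asian, 1 = Caucasian

# Boosting over decision stumps effectively selects and weights the most
# discriminative components of the fused feature vector.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
train_acc = clf.score(X, y)
```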
Affiliation(s)
- Hengxin Chen
- College of Computer Science, Chongqing University, Chongqing, 400044, P. R. China
| | - Mingqi Gao
- College of Computer Science, Chongqing University, Chongqing, 400044, P. R. China
| | - Karl Ricanek
- Department of Computer Science, University of North Carolina at Wilmington, Wilmington, NC 28403, USA
| | - Weiliang Xu
- College of Computer Science, Chongqing University, Chongqing, 400044, P. R. China
| | - Bin Fang
- College of Computer Science, Chongqing University, Chongqing, 400044, P. R. China
| |
|
25
|
Attamimi M, Ando Y, Nakamura T, Nagai T, Mochihashi D, Kobayashi I, Asoh H. Learning word meanings and grammar for verbalization of daily life activities using multilayered multimodal latent Dirichlet allocation and Bayesian hidden Markov models. Adv Robot 2016. [DOI: 10.1080/01691864.2016.1172507] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
26
|
Nixon MS, Correia PL, Nasrollahi K, Moeslund TB, Hadid A, Tistarelli M. On soft biometrics. Pattern Recognit Lett 2015. [DOI: 10.1016/j.patrec.2015.08.006] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
27
|
Automatic facial attribute analysis via adaptive sparse representation of random patches. Pattern Recognit Lett 2015. [DOI: 10.1016/j.patrec.2015.05.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
28
|
Espinoza-Cuadros F, Fernández-Pozo R, Toledano DT, Alcázar-Ramírez JD, López-Gonzalo E, Hernández-Gómez LA. Speech Signal and Facial Image Processing for Obstructive Sleep Apnea Assessment. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2015; 2015:489761. [PMID: 26664493 PMCID: PMC4664800 DOI: 10.1155/2015/489761] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/13/2015] [Revised: 10/15/2015] [Accepted: 10/20/2015] [Indexed: 11/17/2022]
Abstract
Obstructive sleep apnea (OSA) is a common sleep disorder characterized by recurring breathing pauses during sleep caused by a blockage of the upper airway (UA). OSA is generally diagnosed through a costly procedure requiring an overnight stay at the hospital. This has motivated less costly procedures based on the analysis of patients' facial images and voice recordings to help in OSA detection and severity assessment. In this paper we investigate the use of both image and speech processing to estimate the apnea-hypopnea index (AHI), which describes the severity of the condition, over a population of 285 male Spanish subjects suspected of suffering from OSA and referred to a Sleep Disorders Unit. Photographs and voice recordings were collected in a supervised but not highly controlled way, to approximate a scenario in which an OSA assessment application runs on a mobile device (i.e., smartphones or tablets). Spectral information in speech utterances is modeled by a state-of-the-art low-dimensional acoustic representation called the i-vector. A set of local craniofacial features related to OSA is extracted from images after detecting facial landmarks using Active Appearance Models (AAMs). Support vector regression (SVR) is applied to the facial features and i-vectors to estimate the AHI.
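The regression step can be sketched as follows: concatenate the speech i-vector with the craniofacial measurements and fit an SVR to the AHI. All data below is synthetic (random stand-ins for i-vectors, facial features, and AHI targets), so only the wiring of the method is illustrated, not its clinical performance.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
n = 100

ivectors = rng.normal(size=(n, 20))  # stand-in for low-dimensional speech i-vectors
facial = rng.normal(size=(n, 5))     # stand-in for local craniofacial features
X = np.hstack([ivectors, facial])    # feature-level fusion of the two modalities

w = rng.normal(size=25)
ahi = X @ w + 0.1 * rng.normal(size=n)  # synthetic apnea-hypopnea index targets

# Support vector regression on the fused features, as in the paper's setup.
reg = SVR(kernel="linear", C=10.0).fit(X, ahi)
train_r2 = reg.score(X, ahi)
```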
Affiliation(s)
| | - Rubén Fernández-Pozo
- GAPS Signal Processing Applications Group, Universidad Politécnica de Madrid, 28040 Madrid, Spain
| | - Doroteo T. Toledano
- ATVS Biometric Recognition Group, Universidad Autónoma de Madrid, Madrid, Spain
| | | | - Eduardo López-Gonzalo
- GAPS Signal Processing Applications Group, Universidad Politécnica de Madrid, 28040 Madrid, Spain
| | - Luis A. Hernández-Gómez
- GAPS Signal Processing Applications Group, Universidad Politécnica de Madrid, 28040 Madrid, Spain
| |
|