1. Ahmad I, Rashid J, Faheem M, Akram A, Khan NA, Amin RU. Autism spectrum disorder detection using facial images: A performance comparison of pretrained convolutional neural networks. Healthc Technol Lett 2024; 11:227-239. [PMID: 39100502] [PMCID: PMC11294932] [DOI: 10.1049/htl2.12073]
Abstract
Autism spectrum disorder (ASD) is a complex psychological syndrome characterized by persistent difficulties in social interaction, restricted behaviours, speech, and nonverbal communication. The impacts of this disorder and the severity of symptoms vary from person to person. In most cases, symptoms of ASD appear between the ages of 2 and 5 and continue throughout adolescence and into adulthood. While this disorder cannot be cured completely, studies have shown that early detection of this syndrome can assist in maintaining the behavioural and psychological development of children. Experts are currently studying various machine learning methods, particularly convolutional neural networks, to expedite the screening process. Convolutional neural networks are considered promising frameworks for the diagnosis of ASD. This study employs different pre-trained convolutional neural networks such as ResNet34, ResNet50, AlexNet, MobileNetV2, VGG16, and VGG19 to diagnose ASD and compares their performance. Transfer learning was applied to every model included in the study to achieve better results than the original models. The proposed ResNet50 model achieved the highest accuracy, 92%, compared to the other transfer learning models. The proposed method also outperformed state-of-the-art models in terms of accuracy and computational cost.
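The transfer-learning setup summarized in this abstract can be illustrated with a minimal PyTorch sketch: load an ImageNet-pretrained ResNet50 from torchvision, freeze the backbone, and replace the classifier head with a two-class output (ASD versus typically developing). The optimizer, learning rate, layer-freezing scheme, and dummy batch below are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

# Minimal transfer-learning sketch (assumed settings, not the paper's setup).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze the pretrained backbone and attach a new two-class head.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch of 224x224 face images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```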
Affiliation(s)
- Israr Ahmad: Department of Automation Science, Beihang University, Beijing, China
- Javed Rashid: Department of IT Services, University of Okara, Okara, Punjab, Pakistan; MLC Lab, Okara, Punjab, Pakistan
- Muhammad Faheem: Department of Computing Sciences, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
- Arslan Akram: MLC Lab, Okara, Punjab, Pakistan; Department of Computer Science, University of Okara, Okara, Punjab, Pakistan
- Nafees Ahmad Khan: MLC Lab, Okara, Punjab, Pakistan; Department of Computer Science, University of Okara, Okara, Punjab, Pakistan
- Riaz ul Amin: MLC Lab, Okara, Punjab, Pakistan; Department of Computer Science, University of Okara, Okara, Punjab, Pakistan
2. Saurav S, Saini R, Singh S. Fast facial expression recognition using Boosted Histogram of Oriented Gradient (BHOG) features. Pattern Anal Appl 2022. [DOI: 10.1007/s10044-022-01112-0]
3. Methods for Facial Expression Recognition with Applications in Challenging Situations. Comput Intell Neurosci 2022; 2022:9261438. [PMID: 35665283] [PMCID: PMC9159845] [DOI: 10.1155/2022/9261438]
Abstract
In the last few years, a great deal of interesting research has been conducted on automatic facial emotion recognition (FER). FER has been used in a number of ways to improve human-machine interaction, including human-centered computing and the new trend of emotional artificial intelligence (EAI). Researchers in the EAI field aim to make computers better at predicting and analyzing the facial expressions and behavior of humans under different scenarios and cases. Deep learning has had the greatest influence on this field, since neural networks have evolved significantly in recent years and, accordingly, different architectures are being developed to solve increasingly difficult problems. This article addresses the latest advances in computational-intelligence-related automated emotion recognition using recent deep learning models. We show that both deep learning-based FER and models that use architecture-related methods, such as databases, can collaborate well in delivering highly accurate results.
4. Ouafa C, Tayeb LM. Facial Expression Recognition Using Convolution Neural Network Fusion and Texture Descriptors Representation. Int J Comput Intell Appl 2022. [DOI: 10.1142/s146902682250002x]
Abstract
Facial expression recognition is an active research direction in pattern recognition and computer vision. It has been increasingly used in artificial intelligence, human–computer interaction, and security monitoring. In recent years, Convolutional Neural Networks (CNNs), as a deep learning technique, and multiple-classifier combination methods have been applied to obtain accurate results in classifying facial expressions. In this paper, we propose a multimodal classification approach based on local texture descriptor representations and a combination of CNNs to recognize facial expressions. Initially, in order to reduce the influence of redundant information, a preprocessing stage is performed using face detection, face image cropping, and calculation of the texture descriptors Local Binary Pattern (LBP), Local Gradient Code (LGC), Local Directional Pattern (LDP), and Gradient Direction Pattern (GDP). Second, we construct a cascade CNN architecture using the multimodal data of each descriptor (CNNLBP, CNNLGC, CNNGDP and CNNLDP) to extract facial features and classify emotions. Finally, we apply aggregation techniques (sum and product rules) to combine the four multimodal outputs and thus obtain the final decision of our system. Experimental results on the CK+ and JAFFE databases show that the proposed multimodal classification system achieves superior recognition performance compared to existing studies, with classification accuracies of 97.93% and 94.45%, respectively.
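The decision-level fusion step (sum and product rules over the four descriptor-specific networks) can be sketched in a few lines; the per-class probabilities below are random placeholders standing in for the CNNLBP, CNNLGC, CNNGDP, and CNNLDP outputs, which are not reproduced here.

```python
import numpy as np

def fuse(prob_matrix: np.ndarray, rule: str = "sum") -> int:
    """Combine per-model class probabilities; prob_matrix has shape (n_models, n_classes)."""
    if rule == "sum":
        scores = prob_matrix.sum(axis=0)
    elif rule == "product":
        scores = prob_matrix.prod(axis=0)
    else:
        raise ValueError("rule must be 'sum' or 'product'")
    return int(np.argmax(scores))

# Four modality-specific classifiers voting over seven expression classes
# (random stand-ins for the actual CNN outputs).
probs = np.random.dirichlet(np.ones(7), size=4)
print(fuse(probs, "sum"), fuse(probs, "product"))
```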
Affiliation(s)
- Chebah Ouafa: Laboratory of Research in Informatics (LRI), Department of Computer Science, Badji Mokhtar University, BP 12 Annaba 23000, Algeria
- Laskri Mohamed Tayeb: Laboratory of Research in Informatics (LRI), Department of Computer Science, Badji Mokhtar University, BP 12 Annaba 23000, Algeria
5. Parra-Dominguez GS, Sanchez-Yanez RE, Garcia-Capulin CH. Towards Facial Gesture Recognition in Photographs of Patients with Facial Palsy. Healthcare (Basel) 2022; 10:659. [PMID: 35455835] [PMCID: PMC9031481] [DOI: 10.3390/healthcare10040659]
Abstract
Humans express their emotions verbally and through actions, and hence emotions play a fundamental role in facial expressions and body gestures. Facial expression recognition is a popular topic in security, healthcare, entertainment, advertisement, education, and robotics. Detecting facial expressions via gesture recognition is a complex and challenging problem, especially in persons who suffer face impairments, such as patients with facial paralysis. Facial palsy or paralysis refers to the incapacity to move the facial muscles on one or both sides of the face. This work proposes a methodology based on neural networks and handcrafted features to recognize six gestures in patients with facial palsy. The proposed facial palsy gesture recognition system is designed and evaluated on a publicly available database with good results as a first attempt to perform this task in the medical field. We conclude that, to recognize facial gestures in patients with facial paralysis, the severity of the damage has to be considered because paralyzed organs exhibit different behavior than do healthy ones, and any recognition system must be capable of discerning these behaviors.
6. Alsaade FW, Alzahrani MS. Classification and Detection of Autism Spectrum Disorder Based on Deep Learning Algorithms. Comput Intell Neurosci 2022; 2022:8709145. [PMID: 35265118] [PMCID: PMC8901307] [DOI: 10.1155/2022/8709145]
Abstract
Autism spectrum disorder (ASD) is a type of mental illness that can be detected using social media data and biomedical images. It is a neurological disorder correlated with brain growth that later affects the physical appearance of the face. Children with ASD have dissimilar facial landmarks, which set them noticeably apart from typically developed (TD) children. The novelty of the proposed research lies in designing a system for ASD detection based on social media data and face recognition. To identify such landmarks, deep learning techniques may be used, but they require a precise technology for extracting and producing the proper patterns of the facial features. This study assists communities and psychiatrists in experimentally detecting autism based on facial features, using an uncomplicated web application built on a deep learning system, that is, a convolutional neural network with transfer learning and the Flask framework. Xception, Visual Geometry Group Network (VGG19), and NASNetMobile are the pretrained models that were used for the classification task. The dataset used to test these models was collected from the Kaggle platform and consisted of 2,940 face images. Standard evaluation metrics such as accuracy, specificity, and sensitivity were used to evaluate the results of the three deep learning models. The Xception model achieved the highest accuracy of 91%, followed by VGG19 (80%) and NASNetMobile (78%).
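The metrics named above follow directly from the binary confusion matrix; a short sketch with synthetic labels (not the Kaggle ASD data) is shown below.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic placeholder labels for an ASD / non-ASD classification task.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
print(accuracy, sensitivity, specificity)
```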
Affiliation(s)
- Fawaz Waselallah Alsaade: College of Computer Science and Information Technology, King Faisal University, P.O. Box 4000, Al-Ahsa, Saudi Arabia
- Mohammed Saeed Alzahrani: College of Computer Science and Information Technology, King Faisal University, P.O. Box 4000, Al-Ahsa, Saudi Arabia
7. Chen LY, Tsai TH, Ho A, Li CH, Ke LJ, Peng LN, Lin MH, Hsiao FY, Chen LK. Predicting neuropsychiatric symptoms of persons with dementia in a day care center using a facial expression recognition system. Aging (Albany NY) 2022; 14:1280-1291. [PMID: 35113806] [PMCID: PMC8876896] [DOI: 10.18632/aging.203869]
Abstract
BACKGROUND Behavioral and psychological symptoms of dementia (BPSD) affect 90% of persons with dementia (PwD), resulting in various adverse outcomes and aggravating care burdens among their caretakers. This study aimed to explore the potential of artificial intelligence-based facial expression recognition systems (FERS) in predicting BPSDs among PwD. METHODS A hybrid of human labeling and a preconstructed deep learning model was used to differentiate basic facial expressions of individuals to predict the results of Neuropsychiatric Inventory (NPI) assessments by stepwise linear regression (LR), random forest (RF) with importance ranking, and ensemble method (EM) of equal importance, while the accuracy was determined by mean absolute error (MAE) and root-mean-square error (RMSE) methods. RESULTS Twenty-three PwD from an adult day care center were enrolled with ≥ 11,500 FERS data series and 38 comparative NPI scores. The overall accuracy was 86% on facial expression recognition. Negative facial expressions and variance in emotional switches were important features of BPSDs. A strong positive correlation was identified in each model (EM: r = 0.834, LR: r = 0.821, RF: r = 0.798 by the patientwise method; EM: r = 0.891, LR: r = 0.870, RF: r = 0.886 by the MinimPy method), and EM exhibited the lowest MAE and RMSE. CONCLUSIONS FERS successfully predicted the BPSD of PwD by negative emotions and the variance in emotional switches. This finding enables early detection and management of BPSDs, thus improving the quality of dementia care.
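The MAE and RMSE accuracy measures used in this study can be reproduced on placeholder data with scikit-learn; the random-forest features and NPI scores below are synthetic stand-ins, not the FERS series or clinical assessments described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Synthetic stand-ins: 38 observations of 8 facial-expression features
# predicting an NPI-like score.
rng = np.random.default_rng(0)
X = rng.random((38, 8))
y = rng.integers(0, 40, 38).astype(float)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
pred = model.predict(X)

mae = mean_absolute_error(y, pred)
rmse = np.sqrt(mean_squared_error(y, pred))
print(mae, rmse)
```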
Affiliation(s)
- Liang-Yu Chen: Aging and Health Research Center, Taipei, Taiwan; Institute of Public Health, National Yang-Ming Chiao-Tung University, Taipei, Taiwan; Center for Geriatrics and Gerontology, Taipei, Taiwan; uAge Day Care Center, Taipei Veterans General Hospital, Taipei, Taiwan
- Andy Ho: Value Lab, Acer Incorporated, New Taipei City, Taiwan
- Chun-Hsien Li: Value Lab, Acer Incorporated, New Taipei City, Taiwan
- Li-Ju Ke: uAge Day Care Center, Taipei Veterans General Hospital, Taipei, Taiwan
- Li-Ning Peng: Aging and Health Research Center, Taipei, Taiwan; Center for Geriatrics and Gerontology, Taipei, Taiwan
- Ming-Hsien Lin: Aging and Health Research Center, Taipei, Taiwan; Center for Geriatrics and Gerontology, Taipei, Taiwan
- Fei-Yuan Hsiao: Graduate Institute of Clinical Pharmacy, National Taiwan University, Taipei, Taiwan; School of Pharmacy, National Taiwan University, Taipei, Taiwan; Department of Pharmacy, National Taiwan University Hospital, Taipei, Taiwan
- Liang-Kung Chen: Aging and Health Research Center, Taipei, Taiwan; Center for Geriatrics and Gerontology, Taipei, Taiwan; Taipei Municipal Gan-Dau Hospital, Taipei, Taiwan
8. A new multi-feature fusion based convolutional neural network for facial expression recognition. Appl Intell 2022. [DOI: 10.1007/s10489-021-02575-0]
9. Anomaly localization in regular textures based on deep convolutional generative adversarial networks. Appl Intell 2022. [DOI: 10.1007/s10489-021-02475-3]
10. Expression Recognition Method Using Improved VGG16 Network Model in Robot Interaction. J Robot 2021. [DOI: 10.1155/2021/9326695]
Abstract
To address the poor representation ability and limited feature data that arise when traditional expression recognition methods are applied to intelligent applications, an expression recognition method based on an improved VGG16 network is proposed. First, the VGG16 network is improved by using larger convolution kernels in place of small ones and by reducing some fully connected layers, lowering the complexity and parameter count of the model. Then, the high-dimensional abstract feature data output by the improved VGG16 are fed into a convolutional neural network (CNN) for training, so as to output expression types with high accuracy. Finally, the expression recognition method combining the improved VGG16 and the CNN model is applied to human-computer interaction with the NAO robot, which performs different interactive actions according to different expressions. Experimental results on the CK+ dataset show that the improved VGG16 network has strong supervised learning ability: it extracts features well for different expression types, and its overall recognition accuracy is close to 90%. Across multiple tests, the interaction results show that the robot can stably recognize emotions and make the corresponding action responses.
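A minimal PyTorch sketch of one part of the modification described above, trimming the fully connected head of a torchvision VGG16, is given below; the enlarged convolution kernels are omitted, and the hidden width and seven-class output are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained VGG16 and shrink its classifier head.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
vgg.classifier = nn.Sequential(
    nn.Linear(512 * 7 * 7, 256),  # one small hidden layer instead of two 4096-unit layers
    nn.ReLU(inplace=True),
    nn.Dropout(0.5),
    nn.Linear(256, 7),            # seven expression classes (CK+ style)
)

# Sanity check on a dummy 224x224 input.
x = torch.randn(1, 3, 224, 224)
print(vgg(x).shape)  # torch.Size([1, 7])
```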
11. Akter T, Ali MH, Khan MI, Satu MS, Uddin MJ, Alyami SA, Ali S, Azad AKM, Moni MA. Improved Transfer-Learning-Based Facial Recognition Framework to Detect Autistic Children at an Early Stage. Brain Sci 2021; 11:734. [PMID: 34073085] [PMCID: PMC8230000] [DOI: 10.3390/brainsci11060734]
Abstract
Autism spectrum disorder (ASD) is a complex neuro-developmental disorder that affects social skills, language, speech, and communication. Early detection of ASD individuals, especially children, could help to devise the right therapeutic plan at the right time. Human faces encode important markers that can be used to identify ASD by analyzing facial features, eye contact, and so on. In this work, an improved transfer-learning-based autism face recognition framework is proposed to identify children with ASD at an early stage more precisely. Face images of children with ASD were collected from the Kaggle data repository, and various machine learning and deep learning classifiers, as well as other transfer-learning-based pre-trained models, were applied. We observed that our improved MobileNet-V1 model demonstrates the best accuracy of 90.67% and the lowest fall-out and miss rates of 9.33%, compared with the other classifiers and pre-trained models. Furthermore, this classifier was used to identify different ASD groups from the autism image data alone using the k-means clustering technique. The improved MobileNet-V1 model showed the highest accuracy (92.10%) for k = 2 autism sub-types. We hope this model will be useful for physicians to detect autistic children more reliably at an early stage.
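The subgroup analysis described above amounts to running k-means on feature vectors extracted from the autism images; the embeddings below are random stand-ins for the fine-tuned MobileNet-V1 features, which are not available here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Random placeholder embeddings (200 images x 128 features).
rng = np.random.default_rng(42)
embeddings = rng.random((200, 128))

# Cluster into candidate ASD sub-types and report silhouette scores.
for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
    print(k, silhouette_score(embeddings, labels))
```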
Affiliation(s)
- Tania Akter: Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka 1342, Bangladesh; Department of Computer Science and Engineering, Gono Bishwabidyalay, Savar, Dhaka 1344, Bangladesh
- Mohammad Hanif Ali: Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka 1342, Bangladesh
- Md. Imran Khan: Department of Computer Science and Engineering, Gono Bishwabidyalay, Savar, Dhaka 1344, Bangladesh
- Md. Shahriare Satu: Department of Management Information Systems, Noakhali Science and Technology University, Sonapur, Noakhali 3814, Bangladesh
- Md. Jamal Uddin: Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj Town Road, Gopalgonj 8100, Bangladesh
- Salem A. Alyami: Department of Mathematics and Statistics, Imam Mohammad Ibn Saud Islamic University, Riyadh 13318, Saudi Arabia
- Sarwar Ali: Department of Electrical and Electronics Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
- AKM Azad: School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Mohammad Ali Moni: WHO Collaborating Centre on eHealth, UNSW Digital Health, Faculty of Medicine, University of New South Wales, Sydney, NSW 2052, Australia; Healthy Aging Theme, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
12. Lu L. Multi-angle face expression recognition based on generative adversarial networks. Comput Intell 2021. [DOI: 10.1111/coin.12437]
Affiliation(s)
- Lihua Lu: School of Computer Science, Northwestern Polytechnical University, Xi'an, China
13. Abrami A, Gunzler S, Kilbane C, Ostrand R, Ho B, Cecchi G. Automated Computer Vision Assessment of Hypomimia in Parkinson Disease: Proof-of-Principle Pilot Study. J Med Internet Res 2021; 23:e21037. [PMID: 33616535] [PMCID: PMC7939934] [DOI: 10.2196/21037]
Abstract
BACKGROUND Facial expressions require the complex coordination of 43 different facial muscles. Parkinson disease (PD) affects the facial musculature, leading to "hypomimia" or "masked facies." OBJECTIVE We aimed to determine whether modern computer vision techniques can be applied to detect masked facies and quantify drug states in PD. METHODS We trained a convolutional neural network on images extracted from videos of 107 self-identified people with PD, along with 1595 videos of controls, in order to detect PD hypomimia cues. This trained model was applied to clinical interviews of 35 PD patients in their on and off drug motor states, and to seven journalist interviews of the actor Alan Alda obtained before and after he was diagnosed with PD. RESULTS The algorithm achieved a test-set area under the receiver operating characteristic curve of 0.71 on 54 subjects for detecting PD hypomimia, compared to a value of 0.75 for trained neurologists using the Unified Parkinson Disease Rating Scale-III Facial Expression score. Additionally, the model's accuracy in classifying the on and off drug states in the clinical samples was 63% (22/35), in contrast to an accuracy of 46% (16/35) when using clinical rater scores. Finally, each of Alan Alda's seven interviews was correctly classified as occurring before (versus after) his diagnosis, with 100% accuracy (7/7). CONCLUSIONS This proof-of-principle pilot study demonstrated that computer vision holds promise as a valuable tool for detecting PD hypomimia and for monitoring a patient's motor state in an objective and noninvasive way, particularly given the increasing importance of telemedicine.
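The area under the receiver operating characteristic curve quoted above is a standard metric; a brief sketch of how it is computed with scikit-learn on synthetic per-subject hypomimia scores (not the study's data) follows.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Synthetic per-subject PD labels and model-assigned hypomimia scores.
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.3, 0.9, 0.5])

auc = roc_auc_score(y_true, y_score)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(f"AUC = {auc:.2f}")
```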
Affiliation(s)
- Avner Abrami: IBM Research - Computational Biology Center, Yorktown Heights, NY, United States
- Steven Gunzler: Parkinson's and Movement Disorders Center, Neurological Institute, University Hospitals Cleveland Medical Center, Cleveland, OH, United States
- Camilla Kilbane: Parkinson's and Movement Disorders Center, Neurological Institute, University Hospitals Cleveland Medical Center, Cleveland, OH, United States
- Rachel Ostrand: IBM Research - Computational Biology Center, Yorktown Heights, NY, United States
- Bryan Ho: Department of Neurology, Tufts Medical Center, Boston, MA, United States
- Guillermo Cecchi: IBM Research - Computational Biology Center, Yorktown Heights, NY, United States
14. An Artificial Intelligence Based Approach Towards Inclusive Healthcare Provisioning in Society 5.0: A Perspective on Brain Disorder. Brain Inform 2021. [DOI: 10.1007/978-3-030-86993-9_15]
15.