1. Akinpelu S, Viriri S, Adegun A. An enhanced speech emotion recognition using vision transformer. Sci Rep 2024;14:13126. PMID: 38849422; PMCID: PMC11161461; DOI: 10.1038/s41598-024-63776-4.
Abstract
In human-computer interaction systems, speech emotion recognition (SER) plays a crucial role because it enables computers to understand and react to users' emotions. In the past, SER relied heavily on acoustic properties extracted from speech signals. Recent developments in deep learning and computer vision, however, have made it possible to use visual representations to enhance SER performance. This work proposes a novel method for improving speech emotion recognition using a lightweight Vision Transformer (ViT) model. We leverage the ViT model's ability to capture spatial dependencies and high-level features from mel-spectrogram inputs, which are adequate indicators of emotional states. To determine the efficiency of the proposed approach, we conducted comprehensive experiments on two benchmark speech emotion datasets, the Toronto Emotional Speech Set (TESS) and the Berlin Emotional Database (EMODB). The results demonstrate a considerable improvement in speech emotion recognition accuracy and attest to the method's generalizability: it achieved 98% on TESS, 91% on EMODB, and 93% on the combined TESS-EMODB set. A comparative experiment shows that the non-overlapping patch-based feature extraction method substantially improves on other state-of-the-art techniques. Our research indicates the potential of integrating vision transformer models into SER systems, opening up fresh opportunities for real-world applications requiring accurate emotion recognition from speech.
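For readers wanting a concrete starting point, the sketch below illustrates the general mel-spectrogram-to-ViT pipeline the abstract describes. It is a minimal sketch, not the authors' implementation: the label set, sample rate, `timm` model choice (`vit_tiny_patch16_224`), the resizing strategy, and the file path are all assumptions.

```python
# Minimal sketch: speech -> mel spectrogram -> ViT classifier (all settings assumed).
import librosa
import numpy as np
import timm
import torch

EMOTIONS = ["angry", "happy", "sad", "neutral", "fear", "disgust", "surprise"]  # assumed labels

def wav_to_mel_image(path, sr=16000, n_mels=224):
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Normalise to [0, 1], resize the time axis to 224 frames, and fake 3 channels.
    mel_db = (mel_db - mel_db.min()) / (mel_db.max() - mel_db.min() + 1e-8)
    img = torch.tensor(mel_db, dtype=torch.float32).unsqueeze(0)            # (1, 224, T)
    img = torch.nn.functional.interpolate(img.unsqueeze(0), size=(224, 224))[0]
    return img.repeat(3, 1, 1)                                              # (3, 224, 224)

# A small off-the-shelf ViT stands in for the paper's lightweight model.
model = timm.create_model("vit_tiny_patch16_224", pretrained=False, num_classes=len(EMOTIONS))
logits = model(wav_to_mel_image("sample.wav").unsqueeze(0))  # "sample.wav" is a placeholder
print(logits.softmax(dim=-1))
```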
Affiliation(s)
- Samson Akinpelu
  School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, 4001, South Africa
- Serestina Viriri
  School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, 4001, South Africa
- Adekanmi Adegun
  School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, 4001, South Africa
2. Claret AF, Casali KR, Cunha TS, Moraes MC. Automatic Classification of Emotions Based on Cardiac Signals: A Systematic Literature Review. Ann Biomed Eng 2023;51:2393-2414. PMID: 37543539; DOI: 10.1007/s10439-023-03341-8.
Abstract
Emotions play a pivotal role in human cognition, exerting influence across diverse domains of individuals' lives. The widespread adoption of artificial intelligence and machine learning has spurred interest in systems capable of automatically recognizing and classifying emotions and affective states. However, the accurate identification of human emotions remains a formidable challenge, as they are influenced by various factors and accompanied by physiological changes. Numerous solutions have emerged to enable emotion recognition, leveraging the characterization of biological signals, including cardiac signals acquired from low-cost and wearable sensors. The objective of this work was to investigate current trends in the field by conducting a Systematic Literature Review (SLR) focused specifically on the detection, recognition, and classification of emotions based on cardiac signals, to gain insight into the prevailing techniques for signal acquisition, the extracted features, the elicitation process, and the classification methods employed. An SLR was conducted using four research databases, and articles were assessed against the proposed research questions. Twenty-seven articles met the selection criteria and were assessed for the feasibility of using cardiac signals, acquired from low-cost and wearable devices, for emotion recognition. The review identified several emotion elicitation methods in the literature, the algorithms applied for automatic classification, and the key challenges associated with emotion recognition relying solely on cardiac signals. This study extends the current body of knowledge and enables future research by providing insights into suitable techniques for designing automatic emotion recognition applications. It emphasizes the importance of utilizing low-cost, wearable, and unobtrusive devices to acquire cardiac signals for accurate and accessible emotion recognition.
Affiliation(s)
- Anderson Faria Claret
  Institute of Science and Technology, Federal University of São Paulo, São José dos Campos, Brazil
- Karina Rabello Casali
  Institute of Science and Technology, Federal University of São Paulo, São José dos Campos, Brazil
- Tatiana Sousa Cunha
  Institute of Science and Technology, Federal University of São Paulo, São José dos Campos, Brazil
- Matheus Cardoso Moraes
  Institute of Science and Technology, Federal University of São Paulo, São José dos Campos, Brazil
3. Cheng CF, Lin CJ. Building a Low-Cost Wireless Biofeedback Solution: Applying Design Science Research Methodology. Sensors (Basel) 2023;23:2920. PMID: 36991630; PMCID: PMC10052076; DOI: 10.3390/s23062920.
Abstract
In recent years, affective computing has emerged as a promising approach to studying user experience, replacing subjective methods that rely on participants' self-evaluation. Affective computing uses biometrics to recognize people's emotional states as they interact with a product. However, the cost of medical-grade biofeedback systems is prohibitive for researchers with limited budgets. An alternative solution is to use consumer-grade devices, which are more affordable. However, these devices require proprietary software to collect data, complicating data processing, synchronization, and integration. Additionally, researchers need multiple computers to control the biofeedback system, increasing equipment costs and complexity. To address these challenges, we developed a low-cost biofeedback platform using inexpensive hardware and open-source libraries. Our software can serve as a system development kit for future studies. We conducted a simple experiment with one participant to validate the platform's effectiveness, using one baseline and two tasks that elicited distinct responses. Our low-cost biofeedback platform provides a reference architecture for researchers with limited budgets who wish to incorporate biometrics into their studies. This platform can be used to develop affective computing models in various domains, including ergonomics, human factors engineering, user experience, human behavioral studies, and human-robot interaction.
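As a rough illustration of the kind of low-cost acquisition loop such a platform needs, the sketch below logs timestamped samples from a hobby-grade sensor board over a serial link using the open-source pyserial library. The port name, baud rate, and one-ASCII-sample-per-line protocol are assumptions, not details taken from the paper.

```python
# Minimal sketch: timestamped logging from a consumer-grade sensor over serial.
import csv
import time
import serial  # pyserial

# Port name and baud rate are assumed; adjust for the actual board.
with serial.Serial("/dev/ttyUSB0", baudrate=115200, timeout=1) as port, \
     open("session.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["t_unix", "raw_value"])
    t_end = time.time() + 60  # one-minute recording window
    while time.time() < t_end:
        line = port.readline().decode(errors="ignore").strip()
        if line:  # assumes one ASCII sample per line, as many hobby boards emit
            writer.writerow([time.time(), line])
```

Host-side timestamps like these are what make later synchronization across multiple cheap devices possible without proprietary software.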
4. Muhammad F, Hussain M, Aboalsamh H. A Bimodal Emotion Recognition Approach through the Fusion of Electroencephalography and Facial Sequences. Diagnostics (Basel) 2023;13:977. PMID: 36900121; PMCID: PMC10000366; DOI: 10.3390/diagnostics13050977.
Abstract
In recent years, human-computer interaction (HCI) systems have become increasingly popular. Some of these systems demand multimodal approaches that can reliably discriminate actual emotions. In this work, a deep canonical correlation analysis (DCCA)-based multimodal emotion recognition method is presented through the fusion of electroencephalography (EEG) and facial video clips. A two-stage framework is implemented: the first stage extracts relevant features for emotion recognition from each single modality, while the second stage merges the highly correlated features from the two modalities and performs classification. A convolutional neural network (CNN)-based ResNet50 and a one-dimensional CNN (1D-CNN) were utilized to extract features from the facial video clips and the EEG signals, respectively. A DCCA-based approach was used to fuse the highly correlated features, and three basic human emotion categories (happy, neutral, and sad) were classified using a softmax classifier. The proposed approach was evaluated on the publicly available MAHNOB-HCI and DEAP datasets. Experimental results revealed an average accuracy of 93.86% and 91.54% on the MAHNOB-HCI and DEAP datasets, respectively. The competitiveness of the proposed framework was assessed by comparison with existing work.
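A minimal sketch of the fusion idea follows, with linear CCA from scikit-learn standing in for the paper's deep CCA and random arrays standing in for the ResNet50 and 1D-CNN features; all shapes and the logistic-regression head are assumptions made for illustration.

```python
# Minimal sketch: CCA-style fusion of two feature modalities, then classification.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
eeg_feats = rng.normal(size=(300, 128))   # placeholder for 1D-CNN EEG features
face_feats = rng.normal(size=(300, 256))  # placeholder for ResNet50 face features
labels = rng.integers(0, 3, size=300)     # happy / neutral / sad

# Linear CCA learns projections that maximise cross-modal correlation;
# DCCA replaces these linear maps with deep networks.
cca = CCA(n_components=32)
eeg_c, face_c = cca.fit_transform(eeg_feats, face_feats)
fused = np.hstack([eeg_c, face_c])        # concatenate the correlated projections

clf = LogisticRegression(max_iter=1000).fit(fused, labels)  # softmax-style head
print("train accuracy:", clf.score(fused, labels))
```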
Affiliation(s)
- Farah Muhammad
  Department of Computer Science, College of Computer Science and Information, King Saud University, Riyadh 11451, Saudi Arabia
5. Błażejowska G, Gruba Ł, Indurkhya B, Gunia A. A Study on the Role of Affective Feedback in Robot-Assisted Learning. Sensors (Basel) 2023;23:1181. PMID: 36772223; PMCID: PMC9918924; DOI: 10.3390/s23031181.
Abstract
In recent years, there have been many approaches to using robots to teach computer programming. In intelligent tutoring systems and computer-aided learning, there is also some research showing that affective feedback to the student increases learning efficiency. However, the few studies that have examined incorporating an emotional personality into the robot in robot-assisted learning have reported mixed results. To explore this issue further, we conducted a pilot study to investigate the effect of positive verbal encouragement and non-verbal emotive behaviour of the Miro-E robot during a robot-assisted programming session. The participants were tasked with programming the robot's behaviour. In the experimental group, the robot monitored the participants' emotional state via their facial expressions and provided affective feedback after each completed task. In the control group, the robot responded in a neutral way. The participants filled out a questionnaire before and after the programming session. The results show a positive reaction of the participants to the robot and the exercise. Because the experiment was conducted during the pandemic, the number of participants was small, so a qualitative analysis of the data was carried out. We found that the greatest affective outcome of the session was for students who had little prior experience of or interest in programming. We also found that the affective expressions of the robot had a negative impact on its likeability, revealing vestiges of the uncanny valley effect.
Affiliation(s)
- Bipin Indurkhya
  Cognitive Science Department, Institute of Philosophy, Jagiellonian University, 31-007 Krakow, Poland
- Artur Gunia
  Cognitive Science Department, Institute of Philosophy, Jagiellonian University, 31-007 Krakow, Poland
6. Saurav S, Saini R, Singh S. Fast facial expression recognition using Boosted Histogram of Oriented Gradient (BHOG) features. Pattern Anal Appl 2022. DOI: 10.1007/s10044-022-01112-0.
7. Ebrahimian S, Nahvi A, Tashakori M, Salmanzadeh H, Mohseni O, Leppänen T. Multi-Level Classification of Driver Drowsiness by Simultaneous Analysis of ECG and Respiration Signals Using Deep Neural Networks. Int J Environ Res Public Health 2022;19:10736. PMID: 36078452; PMCID: PMC9518416; DOI: 10.3390/ijerph191710736.
Abstract
The high number of fatal crashes caused by driver drowsiness highlights the need for reliable drowsiness detection methods. An ideal driver drowsiness detection system should estimate multiple levels of drowsiness accurately without intervening in the driving task. This paper proposes a multi-level drowsiness detection system based on deep neural network classification of a combination of electrocardiogram and respiration signals. The proposed method combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to classify drowsiness, concurrently using heart rate variability (HRV), the power spectral density of HRV, and the respiration rate signal as inputs. Two models, a CNN-based model and a hybrid CNN-LSTM model, were used for the multi-level classifications. The performance of the proposed method was evaluated on experimental data collected from 30 subjects in a simulated driving environment, and the results of both models are presented and compared. The best performance for both three-level and five-level drowsiness classification was achieved by the CNN-LSTM model. The results indicate that three-level and five-level classification of drowsiness can be achieved with 91% and 67% accuracy, respectively.
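A minimal Keras sketch of a hybrid CNN-LSTM of the kind described here, for the three-level case; the window length, channel layout, and layer sizes are assumptions rather than the paper's configuration.

```python
# Minimal sketch: Conv1D feature extraction followed by LSTM temporal aggregation.
import tensorflow as tf
from tensorflow.keras import layers

N_STEPS, N_CHANNELS, N_LEVELS = 240, 3, 3  # assumed: HRV, HRV-PSD summary, respiration rate

model = tf.keras.Sequential([
    layers.Input(shape=(N_STEPS, N_CHANNELS)),
    layers.Conv1D(32, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(2),
    layers.LSTM(64),                                  # aggregates the conv features over time
    layers.Dense(N_LEVELS, activation="softmax"),     # three drowsiness levels
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```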
Affiliation(s)
- Serajeddin Ebrahimian
  Department of Applied Physics, University of Eastern Finland, 70210 Kuopio, Finland
  Virtual Reality Laboratory, K. N. Toosi University of Technology, Tehran 19697-6449, Iran
  Diagnostic Imaging Center, Kuopio University Hospital, 70210 Kuopio, Finland
- Ali Nahvi
  Virtual Reality Laboratory, K. N. Toosi University of Technology, Tehran 19697-6449, Iran
- Masoumeh Tashakori
  Department of Applied Physics, University of Eastern Finland, 70210 Kuopio, Finland
  Virtual Reality Laboratory, K. N. Toosi University of Technology, Tehran 19697-6449, Iran
- Hamed Salmanzadeh
  Department of Industrial Engineering, K. N. Toosi University of Technology, Tehran 19697-6449, Iran
- Omid Mohseni
  Lauflabor Locomotion Lab, Institute of Sports Science, Centre for Cognitive Science, Technische Universität Darmstadt, 64283 Darmstadt, Germany
- Timo Leppänen
  Department of Applied Physics, University of Eastern Finland, 70210 Kuopio, Finland
  Diagnostic Imaging Center, Kuopio University Hospital, 70210 Kuopio, Finland
  School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane 4072, Australia
8. Jemioło P, Storman D, Mamica M, Szymkowski M, Żabicka W, Wojtaszek-Główka M, Ligęza A. Datasets for Automated Affect and Emotion Recognition from Cardiovascular Signals Using Artificial Intelligence: A Systematic Review. Sensors (Basel) 2022;22:2538. PMID: 35408149; PMCID: PMC9002643; DOI: 10.3390/s22072538.
Abstract
Our review aimed to assess the current state and quality of publicly available datasets used for automated affect and emotion recognition (AAER) with artificial intelligence (AI), with an emphasis on cardiovascular (CV) signals. The quality of such datasets is essential for future work to build replicable systems. Using a developed search strategy, we investigated nine sources up to 31 August 2020, including studies that consider the use of AI in AAER based on CV signals. Two independent reviewers performed the screening of identified records, full-text assessment, data extraction, and credibility assessment. All discrepancies were resolved by discussion. We descriptively synthesised the results and assessed their credibility. The protocol was registered on the Open Science Framework (OSF) platform. Of 4649 identified records, 195 were assessed and eighteen were finally selected, focusing on datasets containing CV signals for AAER. The included papers analysed and shared data of 812 participants aged 17 to 47. Electrocardiography was the most explored signal (83.33% of datasets), and video stimulation was the most frequently used elicitation method (52.38% of experiments). Despite these results, much information was not reported by researchers, and the quality of the analysed papers was mainly low. Researchers in the field should concentrate more on methodology.
Affiliation(s)
- Paweł Jemioło
  AGH University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, al. A. Mickiewicza 30, 30-059 Krakow, Poland
- Dawid Storman
  Chair of Epidemiology and Preventive Medicine, Department of Hygiene and Dietetics, Jagiellonian University Medical College, ul. M. Kopernika 7, 31-034 Krakow, Poland
- Maria Mamica
  AGH University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, al. A. Mickiewicza 30, 30-059 Krakow, Poland
- Mateusz Szymkowski
  AGH University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, al. A. Mickiewicza 30, 30-059 Krakow, Poland
- Wioletta Żabicka
  Students' Scientific Research Group of Systematic Reviews, Jagiellonian University Medical College, ul. M. Kopernika 7, 31-034 Krakow, Poland
- Magdalena Wojtaszek-Główka
  Students' Scientific Research Group of Systematic Reviews, Jagiellonian University Medical College, ul. M. Kopernika 7, 31-034 Krakow, Poland
- Antoni Ligęza
  AGH University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, al. A. Mickiewicza 30, 30-059 Krakow, Poland
9. Comparative Analysis of Emotion Classification Based on Facial Expression and Physiological Signals Using Deep Learning. Appl Sci (Basel) 2022. DOI: 10.3390/app12031286.
Abstract
This study aimed to classify emotion based on facial expressions and physiological signals using deep learning and to compare the results of the two approaches. We asked 53 subjects to make facial expressions conveying four types of emotion; each subject then watched an emotion-inducing video for 1 min while physiological signals were acquired. The four emotions were grouped into positive and negative emotions, and three types of deep-learning models were designed to classify them: one using facial expressions as input, one using physiological signals, and one using both inputs simultaneously. The accuracy of the model was 81.54% when physiological signals were used, 99.9% when facial expressions were used, and 86.2% when both were used. The model built with only facial expressions thus showed the best performance. These results suggest that, in terms of accuracy alone, facial expressions are the best single input for classifying emotion. However, this conclusion does not consider computational cost, and physiological signals or multiple inputs may be preferable depending on the situation and research purpose.
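A minimal sketch of the dual-input variant (facial expressions plus physiological signals) in Keras follows; the input shapes, layer choices, and four-class head are assumptions for illustration, not the study's architecture.

```python
# Minimal sketch: one branch per modality, merged before the classification head.
import tensorflow as tf
from tensorflow.keras import layers

face_in = layers.Input(shape=(48, 48, 1), name="face")    # assumed grayscale face crop
phys_in = layers.Input(shape=(600, 4), name="physio")     # assumed 4 physiological channels

x = layers.Conv2D(32, 3, activation="relu")(face_in)      # facial-expression branch
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)

y = layers.Conv1D(32, 5, activation="relu")(phys_in)      # physiological-signal branch
y = layers.GlobalAveragePooling1D()(y)

z = layers.concatenate([x, y])                            # simultaneous use of both inputs
out = layers.Dense(4, activation="softmax")(z)            # four emotion classes

model = tf.keras.Model([face_in, phys_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```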
10. Yan M, Deng Z, He B, Zou C, Wu J, Zhu Z. Emotion classification with multichannel physiological signals using hybrid feature and adaptive decision fusion. Biomed Signal Process Control 2022. DOI: 10.1016/j.bspc.2021.103235.
11. Idrobo-Ávila E, Loaiza-Correa H, Muñoz-Bolaños F, van Noorden L, Vargas-Cañas R. A Proposal for a Data-Driven Approach to the Influence of Music on Heart Dynamics. Front Cardiovasc Med 2021;8:699145. PMID: 34490368; PMCID: PMC8417899; DOI: 10.3389/fcvm.2021.699145.
Abstract
Electrocardiographic (ECG) signals and heart rate variability (HRV) measurements provide information in a range of specialist fields, extending to musical perception. The ECG signal records the heart's electrical activity, while HRV reflects the state of the autonomic nervous system. HRV has been studied as a marker of diverse psychological and physical diseases, including coronary heart disease, myocardial infarction, and stroke. It has also been used to observe the effects of medicines, the impact of exercise, and emotional responses, and to evaluate the effects of various quantifiable elements of sound and music on the human body. Variations in blood pressure, levels of stress or anxiety, subjective sensations, and even changes in emotion are all aspects that may react or respond to musical stimuli. Although both ECG and HRV continue to feature extensively in research on health and perception, methodologies vary substantially, which makes it difficult to compare studies and has led researchers to recommend improvements in experiment planning and in the analysis and reporting of data. The present work provides a methodological framework for examining the effect of sound on ECG and HRV, with the aim of associating musical structures and noise with the signals by means of artificial intelligence (AI). It first presents a way to select experimental study subjects in light of the research aims, then offers possibilities for selecting and producing suitable sound stimuli, and, once sounds have been selected, proposes a guide for optimal experimental design. Finally, a framework is introduced for the analysis of data and signals, based on both conventional and data-driven AI tools. AI can process large volumes of data at once, can be applied to different types of data, and is capable of generalisation, and so is considered the main analysis tool.
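As a small concrete example of the conventional side of such an analysis framework, the sketch below computes two standard time-domain HRV markers (SDNN and RMSSD) from a series of RR intervals; the framework itself covers a much broader feature set, and the sample intervals here are made up.

```python
# Minimal sketch: standard time-domain HRV markers from RR intervals in milliseconds.
import numpy as np

def hrv_time_domain(rr_ms):
    rr = np.asarray(rr_ms, dtype=float)
    sdnn = rr.std(ddof=1)                       # SDNN: overall variability of RR intervals
    rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))  # RMSSD: beat-to-beat (vagally mediated) variability
    return {"SDNN": sdnn, "RMSSD": rmssd}

# Placeholder RR series for demonstration only.
print(hrv_time_domain([812, 845, 790, 860, 830, 805]))
```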
Affiliation(s)
- Ennio Idrobo-Ávila
  Escuela de Ingeniería Eléctrica y Electrónica, PSI - Percepción y Sistemas Inteligentes, Universidad del Valle, Cali, Colombia
- Humberto Loaiza-Correa
  Escuela de Ingeniería Eléctrica y Electrónica, PSI - Percepción y Sistemas Inteligentes, Universidad del Valle, Cali, Colombia
- Flavio Muñoz-Bolaños
  Departamento de Ciencias Fisiológicas, CIFIEX - Ciencias Fisiológicas Experimentales, Universidad del Cauca, Popayán, Colombia
- Leon van Noorden
  Department of Art, Music, and Theatre Sciences, IPEM - Institute for Systematic Musicology, Ghent University, Ghent, Belgium
- Rubiel Vargas-Cañas
  Departamento de Física, SIDICO - Sistemas Dinámicos, Instrumentación y Control, Universidad del Cauca, Popayán, Colombia
12. Oh S, Kim DK. Machine-Deep-Ensemble Learning Model for Classifying Cybersickness Caused by Virtual Reality Immersion. Cyberpsychol Behav Soc Netw 2021;24:729-736. PMID: 34375142; DOI: 10.1089/cyber.2020.0613.
Abstract
This study aims to classify cybersickness (CS) caused by virtual reality (VR) immersion using a machine-deep-ensemble learning model. Heart rate variability and respiratory signal parameters of 20 subjects were measured while they watched a VR video for ∼5 minutes. After the experiment, the subjects were examined for CS and questioned to determine their CS states. Based on the results, we constructed a machine-deep-ensemble learning model that could identify and classify CS from VR immersion among subjects. The ensemble model comprised four stacked machine learning models (support vector machine [SVM], k-nearest neighbor [KNN], random forest, and AdaBoost) used to derive prediction data, which were then classified by a convolutional neural network. This multiclass model allowed us to classify subjects' CS into three states (neutral, non-CS, and CS). The accuracy of the SVM, KNN, random forest, and AdaBoost models was 94.23 percent, 92.44 percent, 93.20 percent, and 90.33 percent, respectively, and the ensemble model classified the three states with an accuracy of 96.48 percent, implying that the ensemble has higher classification performance than any model used individually. Our results confirm that CS caused by VR immersion can be detected from physiological signal data with high accuracy. Moreover, our proposed model can determine the presence or absence of CS as well as the neutral state. Clinical Trial Registration Number: 20-2021-1.
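A minimal sketch of the stacking idea with scikit-learn, using the four base learners named in the abstract; a logistic-regression meta-learner stands in for the paper's CNN stage, and the synthetic features are placeholders for the HRV and respiration parameters.

```python
# Minimal sketch: stacked ensemble of SVM, KNN, random forest, and AdaBoost.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for HRV/respiration features with three classes
# (neutral, non-CS, CS).
X, y = make_classification(n_samples=600, n_features=20, n_classes=3,
                           n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("knn", KNeighborsClassifier()),
                ("rf", RandomForestClassifier()),
                ("ada", AdaBoostClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),  # stand-in for the CNN meta-stage
)
print("test accuracy:", stack.fit(X_tr, y_tr).score(X_te, y_te))
```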
Affiliation(s)
- SeungJun Oh
  Department of Sports ICT Convergence, Sangmyung University Graduate School, Seoul, Republic of Korea
- Dong-Keun Kim
  Department of Human-Centered Artificial Intelligence, Institute of Intelligence Informatics Technology, Sangmyung University, Seoul, Republic of Korea
13. EmNet: a deep integrated convolutional neural network for facial emotion recognition in the wild. Appl Intell 2021. DOI: 10.1007/s10489-020-02125-0.
14. Petrescu L, Petrescu C, Oprea A, Mitruț O, Moise G, Moldoveanu A, Moldoveanu F. Machine Learning Methods for Fear Classification Based on Physiological Features. Sensors (Basel) 2021;21:4519. PMID: 34282759; PMCID: PMC8271969; DOI: 10.3390/s21134519.
Abstract
This paper focuses on the binary classification of the emotion of fear, based on the physiological data and subjective responses stored in the DEAP dataset. We performed a mapping between discrete and dimensional emotional information based on the participants' ratings and extracted a substantial set of 40 types of features from the physiological data. These served as input to various machine learning algorithms (Decision Trees, k-Nearest Neighbors, Support Vector Machines, and artificial neural networks), accompanied by dimensionality reduction, feature selection, and tuning of the most relevant hyperparameters to boost classification accuracy. Our methodology addressed several practical issues: resolving the imbalanced dataset through data augmentation, reducing overfitting, computing various metrics to obtain the most reliable classification scores, and applying the Local Interpretable Model-Agnostic Explanations (LIME) method to explain predictions in a human-understandable manner. The results show that fear can be predicted very well (with accuracies ranging from 91.7% using Gradient Boosting Trees to 93.5% using dimensionality reduction and a Support Vector Machine) by extracting the most relevant features from the physiological data and searching for the parameters that maximize the algorithms' classification scores.
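A minimal sketch of the best-scoring route reported here (dimensionality reduction followed by an SVM with tuned hyperparameters), using a synthetic imbalanced dataset as a stand-in for the 40 physiological feature types; the grid values are assumptions.

```python
# Minimal sketch: scaling -> PCA -> SVM, tuned with cross-validated grid search.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic imbalanced binary problem (fear vs. no fear) with 40 features.
X, y = make_classification(n_samples=400, n_features=40, weights=[0.7, 0.3],
                           random_state=0)

pipe = Pipeline([("scale", StandardScaler()),
                 ("pca", PCA()),
                 ("svm", SVC())])
grid = GridSearchCV(pipe, {"pca__n_components": [10, 20, 30],
                           "svm__C": [0.1, 1, 10],
                           "svm__gamma": ["scale", "auto"]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```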
Affiliation(s)
- Livia Petrescu (corresponding author)
  Faculty of Biology, University of Bucharest, 050095 Bucharest, Romania
- Cătălin Petrescu
  Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
- Ana Oprea
  Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
- Oana Mitruț
  Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
- Gabriela Moise
  Faculty of Letters and Sciences, Petroleum-Gas University of Ploiesti, 100680 Ploiesti, Romania
- Alin Moldoveanu
  Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
- Florica Moldoveanu
  Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
15. Jaramillo-Quintanar D, Cruz-Albarran IA, Guzman-Sandoval VM, Morales-Hernandez LA. Smart Sensor Based on Biofeedback to Measure Child Relaxation in Out-of-Home Care. Sensors (Basel) 2020;20:4194. PMID: 32731523; PMCID: PMC7435878; DOI: 10.3390/s20154194.
Abstract
Children in out-of-home care are a vulnerable population who face high stress and anxiety levels due to stressful experiences such as abuse, rape, and violence. This can have negative effects on their bio-psycho-social well-being if they are not provided with comprehensive psychological treatment. Numerous methods have been developed to help them relax, but there are no current approaches for assessing the level of relaxation they reach. Based on this, a novel smart sensor that can evaluate the level of relaxation a child experiences is developed in this paper. It evaluates changes in thermal biomarkers (forehead, right and left cheek, chin, and maxillary) and heart rate (HR). Then, through a k-nearest neighbors (K-NN) intelligent classifier, four possible levels of relaxation are obtained: no-relax, low-relax, relax, and very-relax. Additionally, an application for anxiety management (called i-CARE), based on biofeedback diaphragmatic breathing, guided imagery, and video games, is evaluated. Testing of the developed smart sensor yielded 89.7% accuracy. The smart sensor provides a reliable measurement of relaxation levels, and the i-CARE application is effective for anxiety management, both focused on children exposed to out-of-home care conditions.
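A minimal sketch of the K-NN stage: six features (the five thermal biomarkers plus heart rate) mapped to the four relaxation levels named in the abstract. The data here are random placeholders, and k=5 is an assumption.

```python
# Minimal sketch: K-NN over thermal biomarkers + HR, four relaxation levels.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

LEVELS = ["no-relax", "low-relax", "relax", "very-relax"]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))      # forehead, right/left cheek, chin, maxillary temps + HR
y = rng.integers(0, 4, size=200)   # placeholder labels for the four levels

scaler = StandardScaler().fit(X)   # put temperatures and HR on a common scale
knn = KNeighborsClassifier(n_neighbors=5).fit(scaler.transform(X), y)

sample = scaler.transform(rng.normal(size=(1, 6)))
print(LEVELS[knn.predict(sample)[0]])
```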
Affiliation(s)
- Daniel Jaramillo-Quintanar
  Mechatronics, Engineering Faculty, Campus San Juan del Rio, University Autonomous of Queretaro, San Juan del Rio, Queretaro 76803, Mexico
- Irving A. Cruz-Albarran
  Mechatronics, Engineering Faculty, Campus San Juan del Rio, University Autonomous of Queretaro, San Juan del Rio, Queretaro 76803, Mexico
- Luis A. Morales-Hernandez (corresponding author)
  Mechatronics, Engineering Faculty, Campus San Juan del Rio, University Autonomous of Queretaro, San Juan del Rio, Queretaro 76803, Mexico
16. Raheel A, Majid M, Alnowami M, Anwar SM. Physiological Sensors Based Emotion Recognition While Experiencing Tactile Enhanced Multimedia. Sensors (Basel) 2020;20:4037. PMID: 32708056; PMCID: PMC7411620; DOI: 10.3390/s20144037.
Abstract
Emotion recognition has increased the potential of affective computing by providing instant feedback from users and, thereby, a better understanding of their behavior. Physiological sensors have been used to recognize human emotions in response to audio and video content that engages one human sense (auditory) or two (auditory and vision), respectively. In this study, human emotions were recognized using physiological signals observed in response to tactile enhanced multimedia content that engages three human senses (tactile, vision, and auditory). The aim was to give users an enhanced real-world sensation while engaging with multimedia content. To this end, four videos were selected and synchronized with an electric fan and a heater, based on timestamps within the scenes, to generate tactile enhanced content with cold and hot air effects, respectively. Physiological signals, i.e., electroencephalography (EEG), photoplethysmography (PPG), and galvanic skin response (GSR), were recorded using commercially available sensors while participants experienced these tactile enhanced videos. The precision of the acquired physiological signals is enhanced by pre-processing with a Savitzky-Golay smoothing filter. Frequency-domain features (rational asymmetry, differential asymmetry, and correlation) are extracted from EEG, time-domain features (variance, entropy, kurtosis, and skewness) from GSR, and heart rate and heart rate variability from PPG data. A k-nearest neighbor classifier is applied to the extracted features to classify four emotions (happy, relaxed, angry, and sad). Our experimental results show that, among individual modalities, PPG-based features give the highest accuracy, 78.57%, compared to EEG- and GSR-based features. The fusion of EEG, GSR, and PPG features further improved the classification accuracy to 79.76% for the four emotions when interacting with tactile enhanced multimedia.
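A minimal sketch of the GSR branch of this pipeline: Savitzky-Golay smoothing followed by the four time-domain features named in the abstract. The window length, polynomial order, and histogram-based entropy estimator are assumptions; the paper does not specify these details.

```python
# Minimal sketch: Savitzky-Golay smoothing of a raw GSR trace, then the four
# time-domain features (variance, entropy, kurtosis, skewness).
import numpy as np
from scipy.signal import savgol_filter
from scipy.stats import entropy, kurtosis, skew

def gsr_features(raw, window=31, polyorder=3):
    smooth = savgol_filter(raw, window_length=window, polyorder=polyorder)
    # Shannon entropy of a 16-bin histogram; an assumed estimator, not the paper's.
    hist, _ = np.histogram(smooth, bins=16, density=True)
    return {"variance": np.var(smooth),
            "entropy": entropy(hist + 1e-12),
            "kurtosis": kurtosis(smooth),
            "skewness": skew(smooth)}

# Random placeholder trace; a real GSR signal would come from the sensor.
print(gsr_features(np.random.default_rng(0).normal(size=512)))
```

These per-signal feature vectors are what a k-nearest neighbor classifier would then consume, alone or fused with the EEG and PPG features.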
Affiliation(s)
- Aasim Raheel
  Department of Computer Engineering, University of Engineering and Technology, Taxila 47050, Pakistan
- Muhammad Majid (corresponding author)
  Department of Computer Engineering, University of Engineering and Technology, Taxila 47050, Pakistan
- Majdi Alnowami
  Department of Nuclear Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Syed Muhammad Anwar
  Department of Software Engineering, University of Engineering and Technology, Taxila 47050, Pakistan