1. Das S, Catterall J, Stone R, Clough AR. "The reasons you believe …": An exploratory study of text driven evidence gathering and prediction from first responder records justifying state authorised intervention for mental health episodes. Comput Methods Programs Biomed 2024; 254:108257. PMID: 38901271; DOI: 10.1016/j.cmpb.2024.108257.
Abstract
Objective: First responders' mandatory reports of mental health episodes requiring emergency hospital care contain rich information about patients and their needs. In Queensland (Australia), much of the information contained in Emergency Examination Authorities (EEAs) remains unused. We propose and demonstrate a methodology to extract and translate vital information embedded in reports like EEAs and to use it to investigate the extreme propensity of incidence of serious mental health episodes.
Methods: The proposed method integrates clinical, demographic, spatial, and free-text information into a single data collection. The data are subjected to exploratory analysis for spatial pattern recognition, leading to an observational epidemiology model for the association of maximum spatial recurrence of EEA episodes.
Results: Sentiment analysis revealed that, among EEA presentations, hospital and health service (HHS) region #4 had the lowest proportion of positive sentiments (18%) compared with 33% for HHS region #1, pointing to spatial differentiation of sentiments immanent in mandated free text that required more detailed analysis. At the postcode geographical level, we found that variation in maximum spatial recurrence of EEAs was significantly positively associated with the spatial range of sentiments (0.29, p < 0.001) and the postcode-referenced sex ratio (0.45, p = 0.01). The volatility of sentiments significantly correlated with extremes of recurrence of EEA episodes. The predicted (probabilistic) incidence rate, when mapped, reflected this correlation.
Conclusions: The paper demonstrates the efficacy of integrating machine-extracted human sentiments (as potential surrogates) with conventional exposure variables for evidence-based methods in mental health spatial epidemiology. Such insights from informatics-driven epidemiological observations may inform the strategic allocation of health system resources to address the highest levels of need and to improve the standard of care for mental health patients while also enhancing their safe and humane treatment and management.
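For readers who want the shape of such a pipeline, here is a minimal sketch (not the authors' code) of scoring free-text sentiment per region and correlating a per-postcode sentiment statistic with episode recurrence; the VADER lexicon, the toy records, and all values are illustrative assumptions.

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
from scipy.stats import pearsonr

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

# Hypothetical (region, free_text) pairs standing in for mandated report text
records = [("HHS1", "patient was calm and cooperative"),
           ("HHS1", "reported feeling hopeful after receiving support"),
           ("HHS4", "extremely distressed, threats of self harm"),
           ("HHS4", "agitated and refusing assistance")]

# Proportion of positive-sentiment notes per region (compound score > 0.05)
by_region = {}
for region, text in records:
    score = sia.polarity_scores(text)["compound"]
    by_region.setdefault(region, []).append(score > 0.05)
print({r: sum(v) / len(v) for r, v in by_region.items()})

# Correlate a per-postcode sentiment statistic with maximum spatial recurrence
sentiment_range = [0.8, 0.3, 0.6, 0.9, 0.2]   # hypothetical per-postcode values
max_recurrence = [12, 4, 9, 15, 3]
r, p = pearsonr(sentiment_range, max_recurrence)
print(f"r = {r:.2f}, p = {p:.3f}")
```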
Affiliation(s)
- Sourav Das: School of Electrical Engineering, Computing, and Mathematical Sciences, Curtin University, Perth, WA, Australia.
- Janet Catterall: Liaison Librarian, Library and Information Services, Division of Student Life, James Cook University, PO Box 6811, Cairns, QLD, Australia.
- Richard Stone: Director of Emergency Medicine, Cairns Hospital, Cairns and Hinterland Hospital and Health Service, Cairns, QLD, Australia.
- Alan R Clough: Professorial Research Fellow, College of Public Health, Medical and Veterinary Sciences, and Australian Institute of Tropical Health and Medicine, James Cook University, PO Box 6811, Cairns, QLD, Australia.
2. Martin EA, Lian W, Oltmanns JR, Jonas KG, Samaras D, Hallquist MN, Ruggero CJ, Clouston SAP, Kotov R. Behavioral measures of psychotic disorders: Using automatic facial coding to detect nonverbal expressions in video. J Psychiatr Res 2024; 176:9-17. PMID: 38830297; DOI: 10.1016/j.jpsychires.2024.05.056.
Abstract
Emotional deficits in psychosis are prevalent and difficult to treat. In particular, much remains unknown about facial expression abnormalities, and a key reason is that expressions are very labor-intensive to code. Automatic facial coding (AFC) can remove this barrier. The current study sought both to provide evidence for the utility of AFC in psychosis for research purposes and to provide evidence that AFC yields valid measures of clinical constructs. Changes in facial expressions and head position of participants (39 with schizophrenia/schizoaffective disorder (SZ), 46 with other psychotic disorders (OP), and 108 never-psychotic individuals (NP)) were assessed via FaceReader, a commercially available automated facial expression analysis software, using video recorded during a clinical interview. We first examined the behavioral measures of the psychotic disorder groups and tested whether they can discriminate between the groups. Next, we evaluated links between behavioral measures and clinical symptoms, controlling for group membership. We found the SZ group was characterized by significantly less variation in neutral expressions, happy expressions, arousal, and head movements compared with NP. These measures discriminated SZ from NP well (AUC = 0.79, sensitivity = 0.79, specificity = 0.67) but discriminated SZ from OP less well (AUC = 0.66, sensitivity = 0.77, specificity = 0.46). We also found significant correlations between clinician-rated symptoms and most behavioral measures (particularly happy expressions, arousal, and head movements). Taken together, these results suggest that AFC can provide useful behavioral measures of psychosis, which could improve research on non-verbal expressions in psychosis and, ultimately, enhance treatment.
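As a hedged illustration of the discrimination analysis reported above (AUC, sensitivity, specificity from per-subject behavioral summaries), the following sketch fits a simple classifier on synthetic features; the feature values are illustrative, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(0)
# Toy per-subject summaries (e.g., variability of happy expressions, arousal,
# head movement); SZ assumed to show less variation than never-psychotic (NP)
X_sz = rng.normal(loc=0.3, scale=0.1, size=(39, 4))
X_np = rng.normal(loc=0.5, scale=0.1, size=(108, 4))
X = np.vstack([X_sz, X_np])
y = np.array([1] * 39 + [0] * 108)

clf = LogisticRegression().fit(X, y)
prob = clf.predict_proba(X)[:, 1]
auc = roc_auc_score(y, prob)
tn, fp, fn, tp = confusion_matrix(y, (prob > 0.5).astype(int)).ravel()
sensitivity, specificity = tp / (tp + fn), tn / (tn + fp)
print(f"AUC={auc:.2f} sens={sensitivity:.2f} spec={specificity:.2f}")
```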
Affiliation(s)
- Elizabeth A Martin: Department of Psychological Science, University of California, Irvine, CA, USA.
- Wenxuan Lian: Department of Materials Science and Engineering and Department of Applied Math and Statistics, Stony Brook University, Stony Brook, NY, USA.
- Joshua R Oltmanns: Department of Psychiatry, Stony Brook University, Stony Brook, NY, USA.
- Katherine G Jonas: Department of Psychiatry, Stony Brook University, Stony Brook, NY, USA.
- Dimitris Samaras: Department of Computer Science, Stony Brook University, Stony Brook, NY, USA.
- Michael N Hallquist: Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- Camilo J Ruggero: Department of Psychology, University of Texas at Dallas, Richardson, TX, USA.
- Sean A P Clouston: Program in Public Health and Department of Family, Population, and Preventive Medicine, Renaissance School of Medicine, Stony Brook University, Stony Brook, NY, USA.
- Roman Kotov: Department of Psychiatry, Stony Brook University, Stony Brook, NY, USA.
3. Kumar A, Vishwakarma A, Bajaj V. ML3CNet: Non-local means-assisted automatic framework for lung cancer subtypes classification using histopathological images. Comput Methods Programs Biomed 2024; 251:108207. PMID: 38723437; DOI: 10.1016/j.cmpb.2024.108207.
Abstract
BACKGROUND AND OBJECTIVE: Lung cancer (LC) has a high fatality rate that continuously affects human lives all over the world. Early detection of LC prolongs human life and helps to prevent the disease. Histopathological inspection is a common method to diagnose LC, but visual inspection requires considerable time and the decision depends on the subjective perception of clinicians. Machine learning techniques mostly depend on traditional feature extraction, which is labor-intensive and may not be appropriate for enormous data. In this work, a convolutional neural network (CNN)-based architecture is proposed for more effective classification of lung tissue subtypes using histopathological images.
METHODS: The authors utilized, for the first time, a non-local means (NLM) filter to suppress the effect of noise in histopathological images; the NLM filter efficiently eliminated noise while preserving the edges of images. The denoised images are then given as input to the proposed multi-headed lung cancer classification convolutional neural network (ML3CNet). Furthermore, a model quantization technique is utilized to reduce the size of the proposed model for data storage; the reduction in model size requires less memory and speeds up processing.
RESULTS: The effectiveness of the proposed model is compared with other existing state-of-the-art methods. The proposed ML3CNet achieved an average classification accuracy of 99.72%, sensitivity of 99.66%, precision of 99.64%, specificity of 99.84%, F1-score of 0.9965, and area under the curve of 0.9978. A quantized accuracy of 98.92% was attained by the proposed model. To validate its applicability, ML3CNet has also been tested on a colon cancer dataset.
CONCLUSION: The findings reveal that the proposed approach can automatically classify LC subtypes, which might assist healthcare workers in making decisions more precisely. The proposed model can be implemented on hardware using a Raspberry Pi for practical realization.
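The NLM preprocessing step can be sketched with scikit-image as below; the placeholder tile and parameter values are assumptions, not the paper's settings.

```python
import numpy as np
from skimage import img_as_float
from skimage.restoration import denoise_nl_means, estimate_sigma

image = img_as_float(np.random.rand(128, 128, 3))     # placeholder RGB tile
sigma = np.mean(estimate_sigma(image, channel_axis=-1))
# Non-local means: average similar patches to suppress noise, preserve edges
denoised = denoise_nl_means(image, h=1.15 * sigma, fast_mode=True,
                            patch_size=5, patch_distance=6, channel_axis=-1)
print(denoised.shape)
```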
Affiliation(s)
- Anurodh Kumar: PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, 482005, India.
- Amit Vishwakarma: PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, 482005, India.
- Varun Bajaj: PDPM Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, 482005, India; Maulana Azad National Institute of Technology Bhopal, Bhopal, 462003, India.
4. Nia AF, Tang V, Talou GM, Billinghurst M. Synthesizing affective neurophysiological signals using generative models: A review paper. J Neurosci Methods 2024; 406:110129. PMID: 38614286; DOI: 10.1016/j.jneumeth.2024.110129.
Abstract
The integration of emotional intelligence in machines is an important step in advancing human-computer interaction, and it demands the development of reliable end-to-end emotion recognition systems. However, the scarcity of public affective datasets presents a challenge. In this literature review, we emphasize the use of generative models to address this issue in neurophysiological signals, particularly electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS). We provide a comprehensive analysis of the different generative models used in the field, examining their input formulation, deployment strategies, and methodologies for evaluating the quality of synthesized data. The review offers insights into the advantages, challenges, and promising future directions in the application of generative models to emotion recognition systems. Through it, we aim to facilitate the progression of neurophysiological data augmentation, thereby supporting the development of more efficient and reliable emotion recognition systems.
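As a toy illustration of the generative-augmentation idea the review surveys, the sketch below trains a minimal GAN on placeholder 1-D windows; the architecture, sizes, and data are all illustrative assumptions, not any model from the reviewed literature.

```python
import torch
import torch.nn as nn

# Generator maps noise to synthetic 256-sample "EEG-like" windows;
# discriminator tries to tell them from real windows.
G = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 256), nn.Tanh())
D = nn.Sequential(nn.Linear(256, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(512, 256)              # placeholder pool of real windows
for step in range(200):
    fake = G(torch.randn(64, 32))
    batch = real[torch.randint(0, 512, (64,))]
    # Discriminator update: real -> 1, fake -> 0 (generator detached)
    d_loss = (bce(D(batch), torch.ones(64, 1)) +
              bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator update: try to fool the discriminator
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
print(f"final d_loss={d_loss.item():.3f} g_loss={g_loss.item():.3f}")
```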
Affiliation(s)
- Alireza F Nia: Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand.
- Vanessa Tang: Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand.
- Gonzalo Maso Talou: Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand.
- Mark Billinghurst: Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand.
5. Meng L, Liang X, Zhang B, Liang J. Development of a scale for the impact of emotion management on young athletes' training efficiency. Heliyon 2024; 10:e30069. PMID: 38699037; PMCID: PMC11064430; DOI: 10.1016/j.heliyon.2024.e30069.
Abstract
In this study, we developed a scale to evaluate emotion management and its benefits for young athletes in China, and analyzed the impact of emotion management on their training efficiency. Following an extensive literature review, we used AMOS structural equation modeling (SEM) software to develop a scale for evaluating the effects and benefits of emotion management on young athletes' training efficiency. Results showed that young athletes' emotion management training and its benefits can be divided into five dimensions: benefit evaluation, emotional cognition, emotion influence, emotion control, and emotion regulation. The internal consistency reliability of the formal scale was 0.895, and the internal consistency reliability of each subscale was between 0.734 and 0.901. The split-half reliability was 0.769, and the split-half reliability of each subscale was between 0.623 and 0.864. The KMO value was 0.904 (p < 0.05), and the cumulative variance explained was 61.782% of the total variance. The lowest factor loading of a scale item was 0.436 and the highest was 0.846; the communality of all items was between 0.402 and 0.762, indicating that the scale has good validity. An SEM analysis verified that the scale has good construct validity, and significant correlational differences were observed among the dimensions. The SEM analysis showed that the model's NC = 2.660 (1 < NC < 3 indicates a parsimonious fit), PGFI = 0.722, PNFI = 0.699, IFI = 0.851, PRA = 0.927, RMR = 0.006, and RMSEA = 0.07; these indices reached the standard of excellent model fit. The strongest correlation was found between emotional cognition and benefit evaluation (R = 0.690), and the weakest between emotion influence and benefit evaluation (R = 0.079). These findings provide a basis for measuring the effect of emotion management on training efficiency in the training of young athletes and offer a theoretical reference for their emotional development while in training.
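For reference, the internal consistency reported above is Cronbach's alpha: for a k-item scale, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A small sketch with synthetic responses (the data are illustrative only):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, k_items) matrix of scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))                     # shared trait
responses = latent + rng.normal(scale=0.7, size=(200, 8))
print(f"alpha = {cronbach_alpha(responses):.3f}")
```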
Affiliation(s)
- Lingfei Meng: Beijing Sports University, Beijing, 100084, China.
- Xiao Liang: Southwest University of Political Science and Law, Chongqing, 401120, China.
- Biyu Zhang: Beijing Sports University, Beijing, 100084, China.
6. Guo R, Guo H, Wang L, Chen M, Yang D, Li B. Development and application of emotion recognition technology - a systematic literature review. BMC Psychol 2024; 12:95. PMID: 38402398; PMCID: PMC10894494; DOI: 10.1186/s40359-024-01581-4.
Abstract
BACKGROUND: There is a mutual influence between emotions and diseases; thus, the subject of emotions has gained increasing attention.
OBJECTIVE: The primary objective of this study was to conduct a comprehensive review of the developments in emotion recognition technology over the past decade, gaining insight into its trends and real-world effects by examining practical applications in different settings, including hospitals and home environments.
METHODS: This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and included a search of four electronic databases (PubMed, Web of Science, Google Scholar, and IEEE Xplore) to identify eligible studies published between 2013 and 2023. The quality of the studies was assessed using the Critical Appraisal Skills Programme (CASP) criteria. Key information from the studies, including study populations, application scenarios, and the technological methods employed, was summarized and analyzed.
RESULTS: In a systematic review of 44 studies, we analyzed the development and impact of emotion recognition technology in medicine from three perspectives: application scenarios, techniques of multiple modalities, and clinical applications. Three impacts were identified: (i) emotion recognition technology has enabled healthcare professionals to recognize and treat emotions remotely in hospital and home environments; (ii) there has been a shift from traditional subjective emotion assessment to multimodal emotion recognition grounded in objective physiological signals, a progression expected to enhance the accuracy of medical diagnosis; and (iii) the evolving relationship between emotions and disease throughout diagnosis, intervention, and treatment holds clinical significance for real-time emotion monitoring.
CONCLUSION: These findings indicate that the integration of emotion recognition technology with intelligent devices has led to application systems and models that provide technological support for recognizing and intervening in emotions. However, continuous recognition of emotional changes in dynamic or complex environments remains a focal point for future research.
Affiliation(s)
- Runfang Guo: The First Affiliated Hospital of Bengbu Medical University, Bengbu Medical University, 287 Changhuai Road, Bengbu, China; School of Public Health, Bengbu Medical University, Bengbu, China.
- Hongfei Guo: School of Humanities, Southeast University, Nanjing, China.
- Liwen Wang: School of Public Health, Bengbu Medical University, Bengbu, China.
- Mengmeng Chen: School of Health Management, Bengbu Medical University, Bengbu, China.
- Dong Yang: School of Public Health, Bengbu Medical University, Bengbu, China.
- Bin Li: The First Affiliated Hospital of Bengbu Medical University, Bengbu Medical University, 287 Changhuai Road, Bengbu, China; School of Public Health, Bengbu Medical University, Bengbu, China.
7. Olmez Y, Koca GO, Sengur A, Acharya UR. PS-VTS: particle swarm with visit table strategy for automated emotion recognition with EEG signals. Health Inf Sci Syst 2023; 11:22. PMID: 37151916; PMCID: PMC10160266; DOI: 10.1007/s13755-023-00224-z.
Abstract
Recognizing emotions accurately in real life is crucial for human-computer interaction (HCI) systems, and electroencephalogram (EEG) signals have been extensively employed to identify emotions. In this paper, we employ a novel metaheuristic optimization approach for accurate emotion classification by applying it to select both the channels and rhythms of EEG data. We propose the particle swarm with visit table strategy (PS-VTS) metaheuristic technique to improve the effectiveness of EEG-based human emotion identification. First, the EEG signals are denoised using a low-pass filter, and rhythm extraction is performed using the discrete wavelet transform (DWT). The continuous wavelet transform (CWT) then converts each rhythm signal into a rhythm image. A pre-trained MobileNetV2 model is used for deep feature extraction, and a support vector machine (SVM) classifies the emotions. Two models are developed for optimizing channel and rhythm sets. In Model 1, optimal channels are selected separately for each rhythm, and global optima are determined in the optimization process according to the best channel sets of the rhythms. In Model 2, the best rhythms are first determined for each channel, and then the optimal channel-rhythm set is selected. Our proposed model obtained accuracies of 99.2871% and 97.8571% for the classification of high arousal (HA) versus low arousal (LA) and high valence (HV) versus low valence (LV), respectively, on the DEAP dataset, the highest classification accuracy compared with previously reported methods.
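The rhythm-extraction step can be sketched with PyWavelets as below; the sampling rate, wavelet choice, and band labels are approximate assumptions rather than the paper's exact configuration.

```python
import numpy as np
import pywt

fs = 128
signal = np.random.randn(fs * 60)              # 60 s of one EEG channel (toy data)
# Multilevel DWT: with fs = 128 Hz and 4 levels, detail bands roughly align
# with gamma (32-64), beta (16-32), alpha (8-16), theta (4-8) Hz
coeffs = pywt.wavedec(signal, "db4", level=4)  # [cA4, cD4, cD3, cD2, cD1]
names = ["approx <4 Hz", "theta ~4-8", "alpha ~8-16", "beta ~16-32", "gamma ~32-64"]
for name, c in zip(names, coeffs):
    print(f"{name:>13}: {len(c)} coefficients")
```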
Affiliation(s)
- Yagmur Olmez: Department of Mechatronics Engineering, University of Firat, 23119 Elazig, Turkey.
- Gonca Ozmen Koca: Department of Mechatronics Engineering, University of Firat, 23119 Elazig, Turkey.
- Abdulkadir Sengur: Department of Electrical and Electronics Engineering, University of Firat, 23119 Elazig, Turkey.
- U. Rajendra Acharya: School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, Australia.
8. Li N, Ross R. Invoking and identifying task-oriented interlocutor confusion in human-robot interaction. Front Robot AI 2023; 10:1244381. PMID: 38054199; PMCID: PMC10694506; DOI: 10.3389/frobt.2023.1244381.
Abstract
Successful conversational interaction with a social robot requires not only an assessment of a user's contribution to an interaction, but also awareness of their emotional and attitudinal states as the interaction unfolds. To this end, our research aims to systematically trigger and then interpret human behaviors in order to track different states of potential user confusion, so that systems can be primed to adjust their policies when users enter confusion states. In this paper, we present a detailed human-robot interaction study to prompt, investigate, and eventually detect confusion states in users. The study employs a Wizard-of-Oz (WoZ) design with a Pepper robot to prompt confusion states in task-oriented dialogues in a well-defined manner. The data collected from 81 participants include audio and visual data, from both the robot's perspective and the environment, as well as participant survey data. From these data, we evaluated the correlations of induced confusion conditions with multimodal signals, including eye gaze estimation, head pose estimation, facial emotion detection, silence duration, and user speech analysis, including emotion and pitch analysis. The analysis shows significant differences in participants' behaviors across confusion states based on these signals, as well as a strong correlation between confusion conditions and participants' own self-reported confusion scores. The paper establishes strong correlations between confusion levels and these observable features and lays the groundwork for a more complete social and affect-oriented strategy for task-oriented human-robot interaction. The contributions of this paper include the methodology applied, the dataset, and our systematic analysis.
Affiliation(s)
- Na Li: School of Computer Science, Technological University Dublin, Dublin, Ireland.
9. Kiprijanovska I, Stankoski S, Broulidakis MJ, Archer J, Fatoorechi M, Gjoreski M, Nduka C, Gjoreski H. Towards smart glasses for facial expression recognition using OMG and machine learning. Sci Rep 2023; 13:16043. PMID: 37749176; PMCID: PMC10520037; DOI: 10.1038/s41598-023-43135-5.
Abstract
This study aimed to evaluate the use of novel optomyography (OMG)-based smart glasses, OCOsense, for the monitoring and recognition of facial expressions. Experiments were conducted on data gathered from 27 young adult participants, who performed facial expressions varying in intensity, duration, and head movement. The facial expressions included smiling, frowning, raising the eyebrows, and squeezing the eyes. The statistical analysis demonstrated that: (i) OCO sensors based on the principles of OMG can capture distinct variations in cheek and brow movements with a high degree of accuracy and specificity; and (ii) head movement does not have a significant impact on how well these facial expressions are detected. The collected data were also used to train a machine learning model to recognize the four facial expressions and a neutral state. We evaluated this model in conditions intended to simulate real-world use, including variations in expression intensity, head movement, and glasses position relative to the face. The model demonstrated an overall accuracy of 93% (F1-score 0.90), evaluated using a leave-one-subject-out cross-validation technique.
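The leave-one-subject-out evaluation can be sketched with scikit-learn's LeaveOneGroupOut as below; the feature matrix, label set, and classifier choice are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(270, 12))            # OMG-derived feature windows (toy)
y = rng.integers(0, 5, size=270)          # 4 expressions + neutral
subjects = np.repeat(np.arange(27), 10)   # 27 participants, 10 windows each

# Each fold holds out all windows from one subject, so test subjects are
# never seen during training
logo = LeaveOneGroupOut()
scores = cross_val_score(RandomForestClassifier(n_estimators=100),
                         X, y, cv=logo, groups=subjects)
print(f"LOSO accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```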
Affiliation(s)
- Martin Gjoreski: Faculty of Informatics, Università della Svizzera Italiana, 6900 Lugano, Switzerland.
- Hristijan Gjoreski: Emteq Ltd., Brighton, BN1 9SB, UK; Faculty of Electrical Engineering and Information Technologies, Ss. Cyril and Methodius University in Skopje, 1000 Skopje, North Macedonia.
10. Zhu M, Jin H, Bai Z, Li Z, Song Y. Image-Evoked Emotion Recognition for Hearing-Impaired Subjects with EEG Signals. Sensors (Basel) 2023; 23:5461. PMID: 37420628; DOI: 10.3390/s23125461.
Abstract
In recent years, there has been growing interest in the study of emotion recognition through electroencephalogram (EEG) signals. One group of particular interest is individuals with hearing impairments, who may have a bias towards certain types of information when communicating with those in their environment. To address this, our study collected EEG signals from both hearing-impaired and non-hearing-impaired subjects while they viewed pictures of emotional faces for emotion recognition. Four kinds of feature matrices (symmetry difference and symmetry quotient, each based on the original signal and on differential entropy (DE)) were constructed to extract spatial-domain information. A multi-axis self-attention classification model was proposed, which consists of local attention and global attention, combining the attention model with convolution through a novel architectural element for feature classification. Three-class (positive, neutral, negative) and five-class (happy, neutral, sad, angry, fearful) emotion recognition tasks were carried out. The experimental results show that the proposed method is superior to the original feature method, and multi-feature fusion achieved a good effect in both hearing-impaired and non-hearing-impaired subjects. The average classification accuracy was 70.2% (three-class) and 50.15% (five-class) for hearing-impaired subjects, and 72.05% (three-class) and 51.53% (five-class) for non-hearing-impaired subjects. In addition, by exploring the brain topography of different emotions, we found that the discriminative brain regions of the hearing-impaired subjects were also distributed in the parietal lobe, unlike those of the non-hearing-impaired subjects.
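The DE feature used above has a closed form for band-filtered, approximately Gaussian EEG: DE = 0.5 * ln(2*pi*e*sigma^2). A small sketch, with band edges and filter settings as assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def band_de(x: np.ndarray, fs: float, lo: float, hi: float) -> float:
    # Band-pass filter, then apply the Gaussian differential-entropy formula
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    xf = filtfilt(b, a, x)
    return 0.5 * np.log(2 * np.pi * np.e * np.var(xf))

fs = 250
eeg = np.random.randn(fs * 4)               # 4 s of one channel (toy data)
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}
features = {name: band_de(eeg, fs, lo, hi) for name, (lo, hi) in bands.items()}
print(features)
```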
Affiliation(s)
- Mu Zhu: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Haonan Jin: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Zhongli Bai: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Zhiwei Li: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Yu Song: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
11. Huang ZY, Chiang CC, Chen JH, Chen YC, Chung HL, Cai YP, Hsu HC. A study on computer vision for facial emotion recognition. Sci Rep 2023; 13:8425. PMID: 37225755; DOI: 10.1038/s41598-023-35446-4.
Abstract
Artificial intelligence has been successfully applied in various fields, one of which is computer vision. In this study, a deep neural network (DNN) was adopted for facial emotion recognition (FER). One objective of this study is to identify the critical facial features on which the DNN model focuses for FER. In particular, we utilized a convolutional neural network (CNN) combining a squeeze-and-excitation network with a residual neural network for the task of FER. We used AffectNet and the Real-World Affective Faces Database (RAF-DB) as the facial expression databases that provide learning samples for the CNN, and extracted feature maps from the residual blocks for further analysis. Our analysis shows that the features around the nose and mouth are critical facial landmarks for the neural networks. In cross-database validation, the network model trained on AffectNet achieved 77.37% accuracy when validated on RAF-DB, while the network model pre-trained on AffectNet and then transfer-learned on RAF-DB achieved a validation accuracy of 83.37%. The outcomes of this study improve the understanding of neural networks and can assist in improving computer vision accuracy.
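A compact sketch of a squeeze-and-excitation residual block of the kind the study combines with a ResNet is shown below; channel sizes and the reduction ratio are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SEResidualBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        # Squeeze: global average pool; excitation: bottleneck MLP -> gates
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        gates = self.se(out).unsqueeze(-1).unsqueeze(-1)   # (N, C, 1, 1)
        return self.relu(x + out * gates)                  # recalibrated residual

block = SEResidualBlock(64)
print(block(torch.randn(2, 64, 48, 48)).shape)             # (2, 64, 48, 48)
```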
Affiliation(s)
- Zi-Yu Huang: Department of Mechanical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan.
- Chia-Chin Chiang: Department of Mechanical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan.
- Jian-Hao Chen: Graduate Institute of Applied Physics, National Chengchi University, Taipei, Taiwan.
- Yi-Chian Chen: Department of Occupational Safety and Hygiene, Fooyin University, Kaohsiung, Taiwan.
- Hsin-Lung Chung: Department of Mechanical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan.
- Yu-Ping Cai: Department of Nursing, Hsin Sheng Junior College of Medical Care and Management, Taoyuan, Taiwan.
- Hsiu-Chuan Hsu: Graduate Institute of Applied Physics, National Chengchi University, Taipei, Taiwan; Department of Computer Science, National Chengchi University, Taipei, Taiwan.
12. Huang CW, Wu BCY, Nguyen PA, Wang HH, Kao CC, Lee PC, Rahmanti AR, Hsu JC, Yang HC, Li YCJ. Emotion recognition in doctor-patient interactions from real-world clinical video database: Initial development of artificial empathy. Comput Methods Programs Biomed 2023; 233:107480. PMID: 36965299; DOI: 10.1016/j.cmpb.2023.107480.
Abstract
BACKGROUND AND OBJECTIVE: The promising use of artificial intelligence (AI) to emulate human empathy may help a physician engage in a more empathic doctor-patient relationship. This study demonstrates the application of artificial empathy based on facial emotion recognition to evaluate doctor-patient relationships in clinical practice.
METHODS: A prospective study used recorded video of doctor-patient clinical encounters in dermatology outpatient clinics at Taipei Municipal Wanfang Hospital and Taipei Medical University Hospital, collected from March to December 2019. Two cameras recorded the facial expressions of four doctors and 348 adult patients during regular clinical practice. Facial emotion recognition was used to analyze the basic emotions of doctors and patients with a temporal resolution of one second. In addition, a physician-patient satisfaction questionnaire was administered after each clinical session, and two standardized patients gave impartial feedback to avoid bias.
RESULTS: Data from 326 clinical session videos showed that (1) doctors expressed more emotions than patients (t(326) ≥ 2.998, p ≤ 0.003), including anger, happiness, disgust, and sadness; the only emotion patients showed more than doctors was surprise (t(326) = -4.428, p < 0.001); and (2) patients felt happier during the latter half of the session (t(326) = -2.860, p = 0.005), indicating a good doctor-patient relationship.
CONCLUSIONS: Artificial empathy can offer objective observations of how doctors' and patients' emotions change. With the ability to detect emotions in three-quarter-view and profile images, artificial empathy could be an accessible evaluation tool for studying doctor-patient relationships in practical clinical settings.
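The kind of doctor-versus-patient comparison reported above can be sketched with a paired t-test over per-session emotion proportions; the synthetic data below are purely illustrative.

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
# Per-session proportion of time each party showed happiness (toy values)
doctor_happiness = rng.beta(4, 6, size=326)
patient_happiness = rng.beta(3, 7, size=326)

# Paired t-test: sessions are the paired units
t, p = ttest_rel(doctor_happiness, patient_happiness)
print(f"t({len(doctor_happiness) - 1}) = {t:.3f}, p = {p:.4f}")
```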
Affiliation(s)
- Chih-Wei Huang: International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Center for Simulation in Medical Education, Taipei Medical University, Taipei 116, Taiwan.
- Bethany C Y Wu: National Taiwan University Children and Family Research Center Sponsored by CTBC Charity Foundation, Taipei, Taiwan.
- Phung Anh Nguyen: Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan.
- Hsiao-Han Wang: Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, TMU Da'an Campus 15F, No. 172-1, Keelung Road, Section 2, Da'an District, Taipei, Taiwan; Research Center of Big Data and Meta-analysis, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Dermatology, Wanfang Hospital, Taipei Medical University, Taiwan; Department of Dermatology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan.
- Pei-Chen Lee: International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan.
- Annisa Ristya Rahmanti: International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Department of Health Policy and Management, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta, Indonesia.
- Jason C Hsu: Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan; International PhD Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan; Research Center of Data Science on Healthcare Industry, College of Management, Taipei Medical University, Taipei, Taiwan.
- Hsuan-Chia Yang: International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Research Center of Big Data and Meta-analysis, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan.
- Yu-Chuan Jack Li: International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Research Center of Big Data and Meta-analysis, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Dermatology, Wanfang Hospital, Taipei Medical University, Taiwan.
13. Yuvaraj R, Baranwal A, Prince AA, Murugappan M, Mohammed JS. Emotion Recognition from Spatio-Temporal Representation of EEG Signals via 3D-CNN with Ensemble Learning Techniques. Brain Sci 2023; 13:685. PMID: 37190650; DOI: 10.3390/brainsci13040685.
Abstract
The recognition of emotions is one of the most challenging issues in human-computer interaction (HCI). EEG signals are widely adopted for recognizing emotions because of their ease of acquisition, mobility, and convenience. Deep neural networks (DNNs) have provided excellent results in emotion recognition studies. Most studies, however, extract handcrafted features using other methods, such as the Pearson correlation coefficient (PCC), principal component analysis, and the Higuchi fractal dimension (HFD), even though DNNs are capable of generating meaningful features. Furthermore, most earlier studies largely ignored spatial information between the different channels, focusing mainly on time-domain and frequency-domain representations. This study utilizes a pre-trained 3D-CNN MobileNet model with transfer learning on a spatio-temporal representation of EEG signals to extract features for emotion recognition. In addition to fully connected layers, hybrid models were explored using other decision layers such as a multilayer perceptron (MLP), k-nearest neighbors (KNN), extreme learning machine (ELM), XGBoost (XGB), random forest (RF), and support vector machine (SVM). This study also investigates the effects of post-processing (filtering) of output labels. Extensive experiments were conducted on the SJTU Emotion EEG Dataset (SEED) (three classes) and SEED-IV (four classes), and the results were comparable to the state of the art. With the conventional 3D-CNN and an ELM classifier, the SEED and SEED-IV datasets showed maximum accuracies of 89.18% and 81.60%, respectively. Post-filtering improved the classification performance of the hybrid 3D-CNN with ELM to 90.85% and 83.71% on SEED and SEED-IV, respectively. Accordingly, spatio-temporal features extracted from the EEG, along with ensemble classifiers, were found to be the most effective in recognizing emotions compared with state-of-the-art methods.
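The ELM decision layer named above reduces to a fixed random hidden projection plus a least-squares readout; a minimal sketch with synthetic CNN features (dimensions are assumptions):

```python
import numpy as np

class ELM:
    def __init__(self, n_hidden: int = 256, seed: int = 0):
        self.rng = np.random.default_rng(seed)
        self.n_hidden = n_hidden

    def fit(self, X, y_onehot):
        # Random, untrained hidden layer followed by a closed-form readout
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)
        self.beta = np.linalg.pinv(H) @ y_onehot   # least-squares solution
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return (H @ self.beta).argmax(axis=1)

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 128))                    # toy CNN feature vectors
labels = rng.integers(0, 3, 300)                   # three emotion classes
elm = ELM().fit(X, np.eye(3)[labels])
print("train accuracy:", (elm.predict(X) == labels).mean())
```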
Affiliation(s)
- Rajamanickam Yuvaraj: National Institute of Education, Nanyang Technological University, Singapore 637616, Singapore.
- Arapan Baranwal: Department of Computer Science and Information Systems, BITS Pilani, Sancoale 403726, Goa, India.
- A Amalin Prince: Department of Electrical and Electronics Engineering, BITS Pilani, Sancoale 403726, Goa, India.
- M Murugappan: Intelligent Signal Processing (ISP) Research Lab, Department of Electronics and Communication Engineering, Kuwait College of Science and Technology, Block 4, Doha 13133, Kuwait; Department of Electronics and Communication Engineering, Faculty of Engineering, Vels Institute of Sciences, Technology, and Advanced Studies, Chennai 600117, Tamilnadu, India; Centre for Excellence in Unmanned Aerial Systems (CoEUAS), Universiti Malaysia Perlis, Kangar 02600, Perlis, Malaysia.
- Javeed Shaikh Mohammed: Department of Biomedical Technology, College of Applied Medical Sciences, Prince Sattam bin Abdulaziz University, Al Kharj 11942, Saudi Arabia.
|
14. Kshirsagar S, Pendyala A, Falk TH. Task-specific speech enhancement and data augmentation for improved multimodal emotion recognition under noisy conditions. Front Comput Sci 2023. DOI: 10.3389/fcomp.2023.1039261.
Abstract
Automatic emotion recognition (AER) systems are burgeoning, and systems based on audio, video, text, or physiological signals have emerged. Multimodal systems, in turn, have been shown to improve overall AER accuracy and to provide some robustness against artifacts and missing data. Collecting multiple signal modalities, however, can be very intrusive, time-consuming, and expensive. Recent advances in deep learning-based speech-to-text and natural language processing systems have enabled the development of reliable multimodal systems based on speech and text while only requiring the collection of audio data. Audio data, however, are extremely sensitive to environmental disturbances, such as additive noise, and thus face challenges when deployed “in the wild.” To overcome this issue, speech enhancement algorithms have been deployed at the input signal level to improve testing accuracy in noisy conditions. Speech enhancement algorithms come in different flavors and can be optimized for different tasks (e.g., for human perception vs. machine performance). Data augmentation, in turn, has been deployed at the model level during training to improve accuracy in noisy testing conditions. In this paper, we explore the combination of task-specific speech enhancement and data augmentation as a strategy to improve overall multimodal emotion recognition in noisy conditions. We show that AER accuracy under noisy conditions can be improved to levels close to those seen in clean conditions; compared against a system without speech enhancement or data augmentation, an increase in AER accuracy of 40% was seen in a cross-corpus test, showing promising results for “in the wild” AER.
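The noise-augmentation idea discussed above amounts to mixing noise into clean speech at a target signal-to-noise ratio during training; a small sketch with placeholder audio:

```python
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    noise = noise[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12
    # Scale noise so that 10*log10(p_clean / p_scaled_noise) == snr_db
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

fs = 16000
clean = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)   # 1 s placeholder "speech"
noise = np.random.randn(fs)
for snr in (20, 10, 0):
    noisy = mix_at_snr(clean, noise, snr)
    print(f"SNR {snr:>2} dB -> mixed RMS {np.sqrt(np.mean(noisy**2)):.3f}")
```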
15. Muhammad F, Hussain M, Aboalsamh H. A Bimodal Emotion Recognition Approach through the Fusion of Electroencephalography and Facial Sequences. Diagnostics (Basel) 2023; 13:977. PMID: 36900121; PMCID: PMC10000366; DOI: 10.3390/diagnostics13050977.
Abstract
In recent years, human-computer interaction (HCI) systems have become increasingly popular, and some demand particular approaches for discriminating actual emotions through better multimodal methods. In this work, a deep canonical correlation analysis (DCCA)-based multimodal emotion recognition method is presented through the fusion of electroencephalography (EEG) and facial video clips. A two-stage framework is implemented, where the first stage extracts relevant features for emotion recognition from a single modality, and the second stage merges the highly correlated features from the two modalities and performs classification. A CNN-based ResNet-50 and a one-dimensional CNN (1D-CNN) were utilized to extract features from facial video clips and EEG modalities, respectively. A DCCA-based approach was used to fuse the highly correlated features, and three basic human emotion categories (happy, neutral, and sad) were classified using a softmax classifier. The proposed approach was investigated on the publicly available MAHNOB-HCI and DEAP datasets. Experimental results revealed average accuracies of 93.86% and 91.54% on MAHNOB-HCI and DEAP, respectively. The competitiveness of the proposed framework and the justification for exclusivity in achieving this accuracy were evaluated by comparison with existing work.
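The paper uses deep CCA, but plain linear CCA from scikit-learn illustrates the core fusion idea: project EEG and facial features into a shared, maximally correlated space before classification. The feature matrices below are synthetic.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 5))                      # latent emotional state
eeg_feats = shared @ rng.normal(size=(5, 64)) + 0.1 * rng.normal(size=(200, 64))
face_feats = shared @ rng.normal(size=(5, 128)) + 0.1 * rng.normal(size=(200, 128))

cca = CCA(n_components=5).fit(eeg_feats, face_feats)
eeg_c, face_c = cca.transform(eeg_feats, face_feats)
fused = np.hstack([eeg_c, face_c])                      # input to a classifier
corr = [np.corrcoef(eeg_c[:, i], face_c[:, i])[0, 1] for i in range(5)]
print("canonical correlations:", np.round(corr, 2))
```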
16. Pan J, Fang W, Zhang Z, Chen B, Zhang Z, Wang S. Multimodal Emotion Recognition Based on Facial Expressions, Speech, and EEG. IEEE Open J Eng Med Biol 2023; 5:396-403. PMID: 38899017; PMCID: PMC11186647; DOI: 10.1109/ojemb.2023.3240280.
Abstract
Goal: As an essential human-machine interaction task, emotion recognition has become an emerging area over the decades. Although previous attempts to classify emotions have achieved high performance, several challenges remain open: (1) how to effectively recognize emotions using different modalities, and (2) given the increasing amount of computing power required for deep learning, how to provide real-time detection and improve the robustness of deep neural networks.
Methods: In this paper, we propose a deep learning-based multimodal emotion recognition (MER) framework called Deep-Emotion, which can adaptively integrate the most discriminating features from facial expressions, speech, and electroencephalogram (EEG) signals to improve MER performance. The proposed Deep-Emotion framework consists of three branches: a facial branch, a speech branch, and an EEG branch. The facial branch uses the improved GhostNet neural network proposed in this paper for feature extraction, which effectively alleviates overfitting during training and improves classification accuracy compared with the original GhostNet. For the speech branch, we propose a lightweight fully convolutional neural network (LFCNN) for the efficient extraction of speech emotion features. For the EEG branch, we propose a tree-like LSTM (tLSTM) model capable of fusing multi-stage features for EEG emotion feature extraction. Finally, we adopt decision-level fusion to integrate the recognition results of the three modalities, resulting in more comprehensive and accurate performance.
Results and Conclusions: Extensive experiments on the CK+, EMO-DB, and MAHNOB-HCI datasets demonstrate the advanced nature of the proposed Deep-Emotion method, as well as the feasibility and superiority of the MER approach.
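The decision-level fusion step can be sketched as a (possibly weighted) average of per-modality class probabilities before the final argmax; the probabilities and weights below are illustrative assumptions, not the paper's learned values.

```python
import numpy as np

# Softmax outputs from each modality branch for one sample (toy values)
p_face = np.array([0.70, 0.20, 0.10])
p_speech = np.array([0.40, 0.45, 0.15])
p_eeg = np.array([0.55, 0.30, 0.15])

weights = np.array([0.4, 0.3, 0.3])        # assumed modality weights, sum to 1
fused = weights @ np.vstack([p_face, p_speech, p_eeg])
print("fused probabilities:", fused, "-> class", fused.argmax())
```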
Affiliation(s)
- Jiahui Pan: School of Software, South China Normal University, Guangzhou 510631, China.
- Weijie Fang: School of Software, South China Normal University, Guangzhou 510631, China.
- Zhihang Zhang: School of Software, South China Normal University, Guangzhou 510631, China.
- Bingzhi Chen: School of Software, South China Normal University, Guangzhou 510631, China.
- Zheng Zhang: Shenzhen Medical Biometrics Perception and Analysis Engineering Laboratory, Harbin Institute of Technology, Shenzhen 518055, China.
- Shuihua Wang: School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK.
17. Mukhiddinov M, Djuraev O, Akhmedov F, Mukhamadiyev A, Cho J. Masked Face Emotion Recognition Based on Facial Landmarks and Deep Learning Approaches for Visually Impaired People. Sensors (Basel) 2023; 23:1080. PMID: 36772117; PMCID: PMC9921901; DOI: 10.3390/s23031080.
Abstract
Current artificial intelligence systems for determining a person's emotions rely heavily on lip and mouth movement and other facial features such as the eyebrows, eyes, and forehead. Furthermore, low-light images are typically classified incorrectly because of the dark region around the eyes and eyebrows. In this work, we propose a facial emotion recognition method for masked facial images using low-light image enhancement and feature analysis of the upper part of the face with a convolutional neural network. The proposed approach employs the AffectNet image dataset, which includes eight types of facial expressions and 420,299 images. Initially, the lower part of the facial input image is covered with a synthetic mask, and boundary and regional representation methods are used to indicate the head and upper facial features. Secondly, we adopt a feature extraction strategy based on facial landmark detection applied to the partially covered masked face. Finally, the features, the coordinates of the identified landmarks, and histograms of oriented gradients are incorporated into the classification procedure using a convolutional neural network. An experimental evaluation shows that the proposed method surpasses others, achieving an accuracy of 69.3% on the AffectNet dataset.
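The HOG step named above can be sketched with scikit-image as below; the upper-face crop and HOG parameters are illustrative assumptions.

```python
import numpy as np
from skimage.feature import hog

upper_face = np.random.rand(64, 128)          # placeholder eyes/brow crop
# HOG: gradient-orientation histograms over cells, block-normalized
features = hog(upper_face, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)
print("HOG descriptor length:", features.shape[0])
```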
Affiliation(s)
- Mukhriddin Mukhiddinov: Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea.
- Oybek Djuraev: Department of Hardware and Software of Control Systems in Telecommunication, Tashkent University of Information Technologies Named after Muhammad al-Khwarizmi, Tashkent 100084, Uzbekistan.
- Farkhod Akhmedov: Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea.
- Abdinabi Mukhamadiyev: Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea.
- Jinsoo Cho: Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea.
18. Bai Z, Liu J, Hou F, Chen Y, Cheng M, Mao Z, Song Y, Gao Q. Emotion recognition with residual network driven by spatial-frequency characteristics of EEG recorded from hearing-impaired adults in response to video clips. Comput Biol Med 2023; 152:106344. PMID: 36470142; DOI: 10.1016/j.compbiomed.2022.106344.
Abstract
In recent years, emotion recognition based on electroencephalography (EEG) signals has attracted plenty of attention, but most existing work has focused on normal or depressed people. Due to the lack of hearing ability, it is difficult for hearing-impaired people to express their emotions through language in their social activities. In this work, we collected the EEG signals of hearing-impaired subjects while they watched six kinds of emotional video clips (happiness, inspiration, neutral, anger, fear, and sadness) for emotion recognition. The biharmonic spline interpolation method was utilized to convert traditional frequency-domain features, differential entropy (DE), power spectral density (PSD), and wavelet entropy (WE), into the spatial domain. The patch embedding (PE) method was used to segment the feature map into patches of equal size to obtain the differences in the distribution of emotional information among brain regions. For feature classification, a compact residual network with depthwise convolution (DC) and pointwise convolution (PC) is proposed to separate the spatial and channel mixing dimensions and better extract information between channels. Subject-dependent experiments with 70% training and 30% testing sets were performed. The results showed that the average classification accuracies with PE (DE), PE (PSD), and PE (WE) were 91.75%, 85.53%, and 75.68%, respectively, improvements of 11.77%, 23.54%, and 16.61% over DE, PSD, and WE alone. Moreover, comparison experiments were carried out on the SEED and DEAP datasets with PE (DE), which achieved average accuracies of 90.04% (positive, neutral, and negative) and 88.75% (high valence and low valence). By exploring the emotional brain regions, we found that the frontal, parietal, and temporal lobes of hearing-impaired people were associated with emotional activity, whereas in normal people the main emotional brain area is the frontal lobe.
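The depthwise-plus-pointwise pairing described above factorizes a standard convolution so that spatial mixing and channel mixing happen separately; a minimal PyTorch sketch (channel counts are assumptions):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        # Pointwise: 1x1 convolution mixes information across channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

layer = DepthwiseSeparableConv(32, 64)
x = torch.randn(2, 32, 16, 16)        # batch of interpolated feature maps (toy)
print(layer(x).shape)                 # torch.Size([2, 64, 16, 16])
```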
Affiliation(s)
- Zhongli Bai: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Junjie Liu: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Fazheng Hou: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Yirui Chen: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Meiyi Cheng: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Zemin Mao: Technical College for the Deaf, Tianjin University of Technology, Tianjin 300384, China.
- Yu Song: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China.
- Qiang Gao: Tianjin Key Laboratory for Control Theory and Applications in Complicated Systems, TUT Maritime College, Tianjin University of Technology, Tianjin 300384, China.
19. de Lope J, Graña M. An ongoing review of speech emotion recognition. Neurocomputing 2023. DOI: 10.1016/j.neucom.2023.01.002.
20. Paccotacya-Yanque RYG, Huanca-Anquise CA, Escalante-Calcina J, Ramos-Lovón WR, Cuno-Parari ÁE. A speech corpus of Quechua Collao for automatic dimensional emotion recognition. Sci Data 2022; 9:778. PMID: 36566260; PMCID: PMC9789950; DOI: 10.1038/s41597-022-01855-9.
Abstract
Automatic speech emotion recognition is an important research topic for human-computer interaction and affective computing. Over ten million people speak the Quechua language throughout South America, and one of the best-known variants is Quechua Collao. However, this language can be considered low-resource for machine emotion recognition, creating a barrier for Quechua speakers who want to use this technology. The contribution of this work is therefore a 15-hour speech corpus in Quechua Collao, which is made publicly available to the research community. The corpus was created from a set of words and sentences explicitly collected for this task, divided into nine categorical emotions: happy, sad, bored, fear, sleepy, calm, excited, angry, and neutral. Annotation was performed on a 5-value discrete scale along three dimensions: valence, arousal, and dominance. To demonstrate the usefulness of the corpus, we performed speech emotion recognition using machine learning methods and neural networks.
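A baseline dimensional regressor of the kind such a corpus enables can be sketched as MFCC summary features regressed onto the 5-value valence scale; the waveforms and labels below are synthetic placeholders, not corpus data.

```python
import numpy as np
import librosa
from sklearn.svm import SVR

def utterance_features(y: np.ndarray, sr: int = 16000) -> np.ndarray:
    # Summarize an utterance by the mean and std of its MFCC trajectories
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

rng = np.random.default_rng(0)
utterances = [rng.normal(scale=0.1, size=16000) for _ in range(12)]  # toy audio
valence = rng.integers(1, 6, size=12)                                # 1..5 scale
X = np.vstack([utterance_features(u) for u in utterances])
model = SVR().fit(X, valence)
print("predicted valence:", np.round(model.predict(X[:3]), 2))
```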
Collapse
Affiliation(s)
- Rosa Y. G. Paccotacya-Yanque
- Universidad Nacional de San Agustín de Arequipa, School of Computer Science, Arequipa, Peru
| | - Candy A. Huanca-Anquise
- Universidad Nacional de San Agustín de Arequipa, School of Computer Science, Arequipa, Peru
| | - Judith Escalante-Calcina
- Universidad Nacional de San Agustín de Arequipa, School of Computer Science, Arequipa, Peru
| | - Wilber R. Ramos-Lovón
- Universidad Nacional de San Agustín de Arequipa, School of Computer Science, Arequipa, Peru
| | - Álvaro E. Cuno-Parari
- Universidad Nacional de San Agustín de Arequipa, School of Computer Science, Arequipa, Peru
| |
Collapse
|
21
|
Deniz E, Sobahi N, Omar N, Sengur A, Acharya UR. Automated robust human emotion classification system using hybrid EEG features with ICBrainDB dataset. Health Inf Sci Syst 2022; 10:31. [PMID: 36387749 PMCID: PMC9649575 DOI: 10.1007/s13755-022-00201-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Accepted: 10/23/2022] [Indexed: 11/11/2022] Open
Abstract
Emotion identification is an essential task for human-computer interaction systems, and electroencephalogram (EEG) signals have been widely used in emotion recognition. Several EEG-based emotion recognition datasets are already available for validating models; in this work we use the new ICBrainDB EEG dataset to classify angry, neutral, happy, and sad emotions. Signal processing-based features, via the wavelet transform (WT) and tunable Q-factor wavelet transform (TQWT), and image processing-based features, via the histogram of oriented gradients (HOG), local binary pattern (LBP), and convolutional neural network (CNN), are extracted from the EEG signals. The WT is used to extract the rhythms from each channel of the EEG signal, and the instantaneous frequency and spectral entropy are computed for each rhythm; the average and standard deviation of the instantaneous frequency and spectral entropy of each rhythm form the final feature vector. In the second signal-side method, the spectral entropy of each EEG channel after the TQWT is used to create the feature vectors. For the image-side methods, each EEG channel is transformed into a time-frequency plot using the synchrosqueezed wavelet transform, and feature vectors are constructed using windowed HOG and LBP features; each channel is also fed to a pretrained CNN to extract features. The ReliefF feature selector is employed for feature selection. Several classification algorithms, namely k-nearest neighbor (KNN), support vector machines, and neural networks, are used for the automated classification of angry, neutral, happy, and sad emotions. Our model obtained an average accuracy of 90.7% using HOG features and a KNN classifier with a tenfold cross-validation strategy.
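The best-performing pipeline, windowed HOG descriptors fed to a KNN classifier under tenfold cross-validation, can be sketched as below. The image size, HOG parameters, and the synthetic stand-in data are assumptions; in the paper the images are synchrosqueezed-wavelet time-frequency plots of each EEG channel.

```python
import numpy as np
from skimage.feature import hog
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
tf_images = rng.random((120, 64, 64))   # stand-in for time-frequency plots
y = rng.integers(0, 4, size=120)        # angry / neutral / happy / sad

# HOG: histograms of gradient orientations over cells, normalised per block.
X = np.array([hog(img, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2)) for img in tf_images])

knn = KNeighborsClassifier(n_neighbors=5)
scores = cross_val_score(knn, X, y, cv=10)  # tenfold cross-validation, as reported
print(f"mean accuracy: {scores.mean():.3f}")
```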
Collapse
Affiliation(s)
- Erkan Deniz
- Electrical and Electronics Engineering Department, Technology Faculty, Firat University, Elazig, Turkey
| | - Nebras Sobahi
- Department of Electrical and Computer Engineering, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Naaman Omar
- Department of Information Technology Management, Administration Technical College, Duhok Polytechnic University, Duhok, Iraq
| | - Abdulkadir Sengur
- Electrical and Electronics Engineering Department, Technology Faculty, Firat University, Elazig, Turkey
| | - U. Rajendra Acharya
- Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore, 599489 Singapore
- Biomedical Engineering, School of Science and Technology, SUSS University, Singapore, Singapore
- Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan
| |
Collapse
|
22
|
Dogan A, Barua PD, Baygin M, Tuncer T, Dogan S, Yaman O, Dogru AH, Acharya RU. Automated accurate emotion classification using Clefia pattern-based features with EEG signals. INTERNATIONAL JOURNAL OF HEALTHCARE MANAGEMENT 2022. [DOI: 10.1080/20479700.2022.2141694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Affiliation(s)
- Abdullah Dogan
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
| | - Prabal Datta Barua
- School of Business (Information System), University of Southern Queensland, Toowoomba, Australia
- Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, Australia
| | - Mehmet Baygin
- Department of Computer Engineering, College of Engineering, Ardahan University, Ardahan, Turkey
| | - Turker Tuncer
- Department of Digital Forensics Engineering, College of Technology, Firat University, Elazig, Turkey
| | - Sengul Dogan
- Department of Digital Forensics Engineering, College of Technology, Firat University, Elazig, Turkey
| | - Orhan Yaman
- Department of Digital Forensics Engineering, College of Technology, Firat University, Elazig, Turkey
| | - Ali Hikmet Dogru
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
| | - Rajendra U. Acharya
- Ngee Ann Polytechnic, Department of Electronics and Computer Engineering, Singapore
- Department of Biomedical Engineering, School of Science and Technology, SUSS University, Singapore
- Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan
| |
Collapse
|
23
|
Establishing an Intelligent Emotion Analysis System for Long-Term Care Application Based on LabVIEW. SUSTAINABILITY 2022. [DOI: 10.3390/su14148932] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In this study, the authors implemented an intelligent long-term care system based on deep learning techniques, using an AI model integrated with the Laboratory Virtual Instrument Engineering Workbench (LabVIEW) for sentiment analysis. The input is a database of facial features and environmental variables that have been processed and analyzed; the outputs are the corresponding control decisions for sentiment analysis and prediction. A convolutional neural network (CNN) handles the deep learning stage: after the convolutional layers reduce the image matrix, the results are computed by fully connected layers. Furthermore, a Multilayer Perceptron (MLP) model embedded in LabVIEW performs numerical transformation, analysis, and predictive control, predicting the control actions corresponding to the emotional and environmental variables. LabVIEW is also used to design the sensor components, data displays, and control interfaces, and remote sensing and control are achieved with LabVIEW's built-in web publishing tools.
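The MLP stage, mapping detected emotion scores and environmental readings to a control decision, can be sketched as follows. The feature layout, class meanings, and synthetic data are assumptions, and the original model runs inside LabVIEW rather than Python.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
# Assumed inputs: four CNN emotion probabilities plus temperature, humidity, light.
X = rng.random((500, 7))
# Assumed outputs: 0 = no action, 1 = alert carer, 2 = adjust environment.
y = rng.integers(0, 3, size=500)

mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=1)
mlp.fit(X, y)
print(mlp.predict(X[:5]))  # predicted control actions for the first five samples
```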
Collapse
|
24
|
A Recognition Method of Athletes’ Mental State in Sports Training Based on Support Vector Machine Model. JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING 2022. [DOI: 10.1155/2022/1566664] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]
Abstract
Athletes enter competitive events with the ultimate goal of displaying their personal competitive level and defeating their opponents. In most competitions the decisive moments are instantaneous and opportunities are fleeting, so participants need high psychological quality; the quality of an athlete's mental state directly determines performance in training and competition. If real-time changes in athletes' mental states can be obtained as they encounter various situations, more targeted and effective training or competition strategies can be formulated, and by analyzing an opponent's psychological state during exercise, game strategy can be adjusted in real time to improve the probability of winning. Against this background, this paper proposes using a support vector machine (SVM) to identify the mental state of athletes during exercise. We first collect data on athletes' body movements and facial expressions during training or competition, train an SVM model on the multimodal data, and output the athletes' emotional states at different stages on test data. To verify the applicability of the method to athlete subjects, several comparison models were evaluated in the experiments. The experimental results show that the emotion recognition accuracy of this method exceeds 80%, indicating that the research has practical application value.
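A minimal sketch of the SVM step, early fusion of body-movement and facial-expression feature vectors followed by an RBF-kernel classifier, appears below. Feature dimensions, class labels, and the synthetic data are illustrative assumptions rather than the paper's actual features.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
motion = rng.random((300, 30))    # body-movement features per time window
face = rng.random((300, 20))      # facial-expression features per window
X = np.hstack([motion, face])     # simple early fusion of the two modalities
y = rng.integers(0, 3, size=300)  # e.g. calm / tense / anxious

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
print(cross_val_score(svm, X, y, cv=5).mean())
```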
Collapse
|
25
|
Zhang W. Intelligent Recognition and Analysis of Negative Emotions of Undergraduates Under COVID-19. Front Public Health 2022; 10:913255. [PMID: 35664114 PMCID: PMC9157568 DOI: 10.3389/fpubh.2022.913255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2022] [Accepted: 04/25/2022] [Indexed: 11/30/2022] Open
Abstract
Background The outbreak and spread of COVID-19 has had a tremendous impact on undergraduates' study and life, causing anxiety, depression, fear, and loneliness. If these individual negative emotions are not guided and treated in time, they can amplify collective negative emotion, produce individual and group irrational behavior, and ultimately undermine social stability and trust. Strengthening the analysis and guidance of undergraduates' negative emotions has therefore become an urgent issue in undergraduate education. Method This paper presents a weight and structure double-determination method. Based on this method, a Radial Basis Function Neural Network (RBFNN) classifier is constructed to recognize undergraduates' negative emotions. After the RBFNN classifier processes the input psychological crisis intervention scale samples, the recognized negative emotions are classified as normal, mild depression, moderate depression, or severe depression. Experiments We then analyze undergraduates' negative emotions and suggest psychological adjustment strategies. In addition, the experimental results demonstrate that the proposed method performs well in terms of classification accuracy, classification time, and recognition rate of negative emotions among undergraduates.
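A generic RBF-network classifier of the kind described, Gaussian hidden units around learned centres followed by a linear output layer, can be sketched as below. The centre count, kernel width, and synthetic data are assumptions; the paper's weight-and-structure double-determination procedure for choosing them is not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.random((400, 10))         # stand-in for psychological-scale item scores
y = rng.integers(0, 4, size=400)  # normal / mild / moderate / severe depression

# Hidden layer: Gaussian responses to 20 centres found by k-means.
centres = KMeans(n_clusters=20, n_init=10, random_state=3).fit(X).cluster_centers_
width = 1.0  # shared kernel width (a free parameter here)

def rbf_layer(data: np.ndarray) -> np.ndarray:
    d2 = ((data[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * width ** 2))

# Output layer: linear classifier on the RBF activations.
clf = LogisticRegression(max_iter=1000).fit(rbf_layer(X), y)
print(f"training accuracy: {clf.score(rbf_layer(X), y):.3f}")
```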
Collapse
Affiliation(s)
- Weifeng Zhang
- School of Educational Science, Xinxiang University, Xinxiang, China
| |
Collapse
|