1
Alroobaea R. Cross-corpus speech emotion recognition with transformers: Leveraging handcrafted features and data augmentation. Comput Biol Med 2024; 179:108841. [PMID: 39002317 DOI: 10.1016/j.compbiomed.2024.108841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Revised: 04/16/2024] [Accepted: 05/06/2024] [Indexed: 07/15/2024]
Abstract
Speech emotion recognition (SER) is a prominent and dynamic research field in data science owing to its extensive applications in domains such as psychological assessment, mobile services, and computer games. In previous research, numerous studies used manually engineered features for emotion classification and achieved commendable accuracy. However, these features tend to underperform in complex scenarios, leading to reduced classification accuracy. These scenarios include: (1) datasets that contain diverse speech patterns, dialects, accents, or variations in emotional expression; (2) data with background noise; (3) cases where the distribution of emotions varies significantly across datasets; and (4) combined datasets from different sources, which introduce complexities due to variations in recording conditions, data quality, and emotional expression. Consequently, there is a need to improve the classification performance of SER techniques. To address this, a novel SER framework was introduced in this study. Prior to feature extraction, signal preprocessing and data augmentation methods were applied to augment the available data, and 18 informative features were then derived from each signal. A discriminative feature set was obtained using feature selection techniques and then used as input for emotion recognition on the SAVEE, RAVDESS, and EMO-DB datasets. Furthermore, this research implemented a cross-corpus model that incorporated all speech files related to common emotions from the three datasets. The experimental outcomes demonstrated the superior performance of the proposed SER framework compared to existing frameworks in the field. Specifically, the proposed model obtained accuracies of 95%, 94%, 97%, and 97% on the SAVEE, RAVDESS, EMO-DB, and cross-corpus datasets, respectively. These results underscore the significant contribution of the proposed framework to the field of SER.
Affiliation(s)
- Roobaea Alroobaea
  - Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia.
2
Ju X, Li M, Tian W, Hu D. EEG-based emotion recognition using a temporal-difference minimizing neural network. Cogn Neurodyn 2024; 18:405-416. [PMID: 38699602 PMCID: PMC11061074 DOI: 10.1007/s11571-023-10004-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 07/25/2023] [Accepted: 08/21/2023] [Indexed: 05/05/2024] Open
Abstract
Electroencephalogram (EEG) emotion recognition plays an important role in human-computer interaction. An increasing number of algorithms for emotion recognition have been proposed recently. However, it is still challenging to make efficient use of emotional activity knowledge. In this paper, based on prior knowledge that emotion varies slowly across time, we propose a temporal-difference minimizing neural network (TDMNN) for EEG emotion recognition. We use maximum mean discrepancy (MMD) technology to evaluate the difference in EEG features across time and minimize the difference by a multibranch convolutional recurrent network. State-of-the-art performances are achieved using the proposed method on the SEED, SEED-IV, DEAP and DREAMER datasets, demonstrating the effectiveness of including prior knowledge in EEG emotion recognition.
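As a rough illustration of the maximum mean discrepancy (MMD) measure named in the abstract above, the following sketch computes a biased squared-MMD estimate with a Gaussian kernel between feature vectors from two time windows. It is not the TDMNN implementation: the function names, the kernel choice, the bandwidth `sigma`, and the synthetic "windows" are all assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    return (mean_kernel(xs, xs, sigma)
            + mean_kernel(ys, ys, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma))


def make_window(rng, n, dim, shift):
    """Synthetic stand-in for features of one time window (not real EEG)."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
window_t = make_window(rng, 100, 4, 0.0)     # features at time t
window_next = make_window(rng, 100, 4, 0.0)  # same distribution as window_t
window_far = make_window(rng, 100, 4, 2.0)   # shifted distribution

near = mmd_squared(window_t, window_next)
far = mmd_squared(window_t, window_far)
print(f"MMD^2 similar windows: {near:.4f}")
print(f"MMD^2 shifted windows: {far:.4f}")
```

Under the abstract's premise that emotion varies slowly across time, a quantity like `near` would be the kind of term a network is trained to keep small; how the authors combine it with the classification loss is not stated in the abstract.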
Affiliation(s)
- Xiangyu Ju
  - College of Intelligence Science and Technology, National University of Defense Technology, Changsha, China
- Ming Li
  - College of Intelligence Science and Technology, National University of Defense Technology, Changsha, China
- Wenli Tian
  - College of Intelligence Science and Technology, National University of Defense Technology, Changsha, China
- Dewen Hu
  - College of Intelligence Science and Technology, National University of Defense Technology, Changsha, China
3
Farashi S, Sarihi A, Ramezani M, Shahidi S, Mazdeh M. Parkinson's disease tremor prediction using EEG data analysis-A preliminary and feasibility study. BMC Neurol 2023; 23:420. [PMID: 38001410 PMCID: PMC10668446 DOI: 10.1186/s12883-023-03468-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 11/14/2023] [Indexed: 11/26/2023] Open
Abstract
PURPOSE: Tremor is one of the hallmarks of Parkinson's disease (PD) that does not respond effectively to conventional medications. In this regard, as a complementary solution, methods such as deep brain stimulation have been proposed. To apply the intervention with minimal side effects, it is necessary to predict tremor initiation. The purpose of the current study was to propose a novel methodology for predicting resting tremors using analysis of EEG time-series. METHODS: A modified algorithm for tremor onset detection from accelerometer data was proposed. Furthermore, a machine learning methodology for predicting PD hand tremors from EEG time-series was proposed. The most discriminative features extracted from EEG data, selected on the basis of statistical analyses and post-hoc tests, were used to train the classifier for distinguishing pre-tremor conditions. RESULTS: Statistical analyses with post-hoc tests showed that features such as form factor and statistical features were the most discriminative. Furthermore, a limited number of EEG channels (F3, F7, P4, CP2, FC6, and C4) and EEG bands (Delta and Gamma) were sufficient for accurate tremor prediction based on EEG data. Based on the selected feature set, a KNN classifier obtained the best pre-tremor prediction performance, with an accuracy of 73.67%. CONCLUSION: This feasibility study was the first attempt to show the predictive ability of EEG time-series for PD hand tremor prediction. Considering the limitations of this study, future research with longer data and different brain dynamics is needed for clinical applications.
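To make the classification step in the abstract above concrete, here is a minimal k-nearest-neighbours (KNN) sketch in plain Python. It only illustrates the general technique: the value of `k`, the Euclidean distance, the two-feature vectors, and the labels are hypothetical placeholders, not the study's features, data, or settings.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training points by Euclidean distance to the query vector
    ranked = sorted(zip((math.dist(x, query) for x in train_x), train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature vectors per EEG window (placeholder values only)
train_x = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
           [0.9, 1.0], [1.1, 0.8], [1.0, 1.2]]
train_y = ["baseline", "baseline", "baseline",
           "pre-tremor", "pre-tremor", "pre-tremor"]

print(knn_predict(train_x, train_y, [0.15, 0.25]))
print(knn_predict(train_x, train_y, [1.0, 0.9]))
```

With an odd `k` and two classes, the vote cannot tie, which is one common reason to choose values such as 3 or 5.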
Affiliation(s)
- Sajjad Farashi
  - Neurophysiology Research Center, Hamadan University of Medical Sciences, Hamadan, Iran.
- Abdolrahman Sarihi
  - Neurophysiology Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
  - Department of Physiology, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Mahdi Ramezani
  - Department of Anatomical Sciences, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Siamak Shahidi
  - Neurophysiology Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
  - Department of Physiology, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Mehrdokht Mazdeh
  - Department of Neurology, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
4
Gosala B, Dindayal Kapgate P, Jain P, Nath Chaurasia R, Gupta M. Wavelet transforms for feature engineering in EEG data processing: An application on Schizophrenia. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023]
5
Li JW, Lin D, Che Y, Lv JJ, Chen RJ, Wang LJ, Zeng XX, Ren JC, Zhao HM, Lu X. An innovative EEG-based emotion recognition using a single channel-specific feature from the brain rhythm code method. Front Neurosci 2023; 17:1221512. [PMID: 37547144 PMCID: PMC10397731 DOI: 10.3389/fnins.2023.1221512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 06/30/2023] [Indexed: 08/08/2023] Open
Abstract
Introduction: Efficiently recognizing emotions is a critical pursuit in brain-computer interface (BCI), as it has many applications for intelligent healthcare services. In this work, an innovative approach inspired by the genetic code in bioinformatics, which utilizes brain rhythm code features consisting of δ, θ, α, β, or γ, is proposed for electroencephalography (EEG)-based emotion recognition. Methods: These features are first extracted from the sequencing technique. After evaluating them using four conventional machine learning classifiers, an optimal channel-specific feature that produces the highest accuracy in each emotional case is identified, so emotion recognition through minimal data is realized. By doing so, the complexity of emotion recognition can be significantly reduced, making it more achievable for practical hardware setups. Results: The best classification accuracies achieved for the DEAP and MAHNOB datasets range from 83% to 92%, and the best accuracy for the SEED dataset is 78%. The experimental results are impressive, considering the minimal data employed. Further investigation of the optimal features shows that their representative channels are primarily in the frontal region, and the associated rhythmic characteristics are of multiple kinds. Additionally, individual differences are found, as the optimal feature varies across subjects. Discussion: Compared to previous studies, this work provides insights into designing portable devices, as a single electrode is sufficient to generate satisfactory performance. Consequently, it would advance the understanding of brain rhythms and offers an innovative solution for classifying EEG signals in diverse BCI applications, including emotion recognition.
Affiliation(s)
- Jia Wen Li
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, China
  - Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan University of Science and Technology, Wuhan, China
  - Guangxi Key Lab of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin, China
- Di Lin
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, China
  - New Engineering Industry College, Putian University, Putian, China
- Yan Che
  - Engineering Research Center of Big Data Application in Private Health Medicine, Fujian Province University, Putian, China
  - New Engineering Industry College, Putian University, Putian, China
- Ju Jian Lv
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Rong Jun Chen
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Lei Jun Wang
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Xian Xian Zeng
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Jin Chang Ren
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
  - National Subsea Centre, Robert Gordon University, Aberdeen, United Kingdom
- Hui Min Zhao
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
- Xu Lu
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
6
EEG channel selection-based binary particle swarm optimization with recurrent convolutional autoencoder for emotion recognition. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
7
Quan J, Li Y, Wang L, He R, Yang S, Guo L. EEG-based cross-subject emotion recognition using multi-source domain transfer learning. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]