1. Li X, Zhang Y, Peng Y, Kong W. Enhanced performance of EEG-based brain-computer interfaces by joint sample and feature importance assessment. Health Inf Sci Syst 2024; 12:9. PMID: 38375134; PMCID: PMC10874355; DOI: 10.1007/s13755-024-00271-0.
Abstract
Electroencephalography (EEG) has been a reliable data source for building brain-computer interface (BCI) systems; however, it is not reasonable to perform recognition directly on the feature vector extracted from multiple EEG channels and frequency bands, due to two deficiencies. One is that EEG signals are weak and non-stationary, so different EEG samples can easily have different quality. The other is that feature dimensions corresponding to different brain regions and frequency bands correlate differently with a given mental task, which has not been sufficiently investigated. To this end, a Joint Sample and Feature importance Assessment (JSFA) model was proposed to simultaneously explore the different impacts of EEG samples and features in mental state recognition, in which the former is based on the self-paced learning technique while the latter is completed by the feature self-weighting technique. The efficacy of JSFA is extensively evaluated on two EEG data sets, SEED-IV and SEED-VIG: one a classification task for emotion recognition, the other a regression task for driving fatigue detection. Experimental results demonstrate that JSFA can effectively identify the importance of different EEG samples and features, leading to enhanced recognition performance of the corresponding BCI systems.
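The self-paced learning idea behind the sample-importance side of JSFA can be sketched in a few lines. The hard thresholding rule and the growing "age" parameter `lam` below are standard SPL conventions used purely for illustration, not details taken from the paper:

```python
def spl_weights(losses, lam):
    # Hard self-paced weights: a sample is "easy" (weight 1) if its
    # current loss is below the age parameter lam, otherwise it is
    # excluded (weight 0) from the current training round.
    return [1.0 if loss < lam else 0.0 for loss in losses]

def spl_schedule(losses, lam=0.5, growth=2.0, rounds=3):
    # As lam grows between rounds, progressively harder samples
    # are admitted into training.
    history = []
    for _ in range(rounds):
        history.append(spl_weights(losses, lam))
        lam *= growth
    return history

losses = [0.10, 0.90, 0.35, 2.40]   # toy per-sample losses
for weights in spl_schedule(losses):
    print(weights)
```

In a full JSFA-style model these weights would be re-estimated after each optimization step, so low-quality EEG samples contribute little until (or unless) the model matures enough to absorb them.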
2. Cheng C, Liu W, Feng L, Jia Z. Emotion recognition using hierarchical spatial-temporal learning transformer from regional to global brain. Neural Netw 2024; 179:106624. PMID: 39163821; DOI: 10.1016/j.neunet.2024.106624.
Abstract
Emotion recognition is an essential but challenging task in human-computer interaction systems due to the distinctive spatial structures and dynamic temporal dependencies associated with each emotion. However, current approaches fail to accurately capture the intricate effects of electroencephalogram (EEG) signals across different brain regions on emotion recognition. Therefore, this paper designs a transformer-based method, denoted by R2G-STLT, which relies on a spatial-temporal transformer encoder with regional to global hierarchical learning that learns the representative spatiotemporal features from the electrode level to the brain-region level. The regional spatial-temporal transformer (RST-Trans) encoder is designed to obtain spatial information and context dependence at the electrode level aiming to learn the regional spatiotemporal features. Then, the global spatial-temporal transformer (GST-Trans) encoder is utilized to extract reliable global spatiotemporal features, reflecting the impact of various brain regions on emotion recognition tasks. Moreover, the multi-head attention mechanism is placed into the GST-Trans encoder to empower it to capture the long-range spatial-temporal information among the brain regions. Finally, subject-independent experiments are conducted on each frequency band of the DEAP, SEED, and SEED-IV datasets to assess the performance of the proposed model. Results indicate that the R2G-STLT model surpasses several state-of-the-art approaches.
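Both the RST-Trans and GST-Trans encoders are built on the same transformer primitive: scaled dot-product attention over token embeddings (electrode-level or region-level). A toy single-head version, not the paper's implementation, looks like this:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention: each query attends to all keys,
    # and the output is the attention-weighted mix of the values.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

# Two electrode-level tokens with 2-d embeddings (toy numbers).
tokens = [[1.0, 0.0], [0.0, 1.0]]
print(attention(tokens, tokens, tokens))
```

A multi-head variant would run several such maps over learned projections of the tokens and concatenate the results, which is what lets the GST-Trans encoder capture long-range spatial-temporal relations among brain regions.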
3. Chen D, Wen G, Li H, Yang P, Chen C, Wang B. CDGT: Constructing diverse graph transformers for emotion recognition from facial videos. Neural Netw 2024; 179:106573. PMID: 39096753; DOI: 10.1016/j.neunet.2024.106573.
Abstract
Recognizing expressions from dynamic facial videos can reveal more natural affective states of humans, but it becomes a more challenging task in real-world scenes due to pose variations of the face, partial occlusions and subtle dynamic changes of emotion sequences. Existing transformer-based methods often focus on self-attention to model the global relations among spatial or temporal features, and cannot adequately attend to important expression-related locality structures in both the spatial and temporal features of in-the-wild expression videos. To this end, we incorporate diverse graph structures into transformers and propose CDGT, a method that constructs diverse graph transformers for efficient emotion recognition from in-the-wild videos. Specifically, our method contains a spatial dual-graph transformer and a temporal hyperbolic-graph transformer. The former deploys a dual-graph constrained attention to capture latent emotion-related graph geometry structures among local spatial tokens for efficient feature representation, especially for video frames with pose variations and partial occlusions. The latter adopts a hyperbolic-graph constrained self-attention that explores important temporal graph structure information in hyperbolic space to model the more subtle changes of dynamic emotion. Extensive experimental results on in-the-wild video-based facial expression databases show that the proposed CDGT outperforms other state-of-the-art methods.
4. Chen S, Wang Y, Lin X, Sun X, Li W, Ma W. Cross-subject emotion recognition in brain-computer interface based on frequency band attention graph convolutional adversarial neural networks. J Neurosci Methods 2024; 411:110276. PMID: 39237038; DOI: 10.1016/j.jneumeth.2024.110276.
Abstract
BACKGROUND Emotion is an important area in neuroscience. Cross-subject emotion recognition based on electroencephalogram (EEG) data is challenging due to physiological differences between subjects. The domain gap, i.e., the different distributions of EEG data across subjects, has attracted great attention in cross-subject emotion recognition. COMPARISON WITH EXISTING METHODS This study focuses on narrowing the domain gap between subjects through emotional frequency bands and the relationship information between EEG channels. Emotional frequency band features represent the energy distribution of EEG data in different frequency ranges, while relationship information between EEG channels provides spatial distribution information about EEG data. NEW METHOD To achieve this, this paper proposes a model called the Frequency Band Attention Graph convolutional Adversarial neural Network (FBAGAN). The model includes three components: a feature extractor, a classifier, and a discriminator. The feature extractor consists of a layer with a frequency band attention mechanism and a graph convolutional neural network. The attention mechanism extracts frequency band information by assigning weights, while the graph convolutional network extracts relationship information between EEG channels by modeling their graph structure. The discriminator then helps minimize the gap in frequency information and relationship information between the source and target domains, improving the model's ability to generalize. RESULTS The FBAGAN model is extensively tested on the SEED, SEED-IV, and DEAP datasets. The accuracy and standard deviation are 88.17% and 4.88 on the SEED dataset, and 77.35% and 3.72 on the SEED-IV dataset. On the DEAP dataset, the model achieves 69.64% for arousal and 65.18% for valence. These results outperform most existing models.
CONCLUSIONS The experiments indicate that FBAGAN effectively addresses the challenges of transferring across EEG channel and frequency band domains, leading to improved performance.
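The frequency-band attention idea (reweighting per-band features by learned importance) can be illustrated with a minimal sketch. The mean-magnitude scoring rule below is an illustrative stand-in; FBAGAN's actual attention weights are learned end-to-end:

```python
import math

def band_attention(band_features):
    # Score each frequency band by its mean feature magnitude,
    # softmax the scores into attention weights, and reweight
    # each band's feature vector accordingly.
    scores = [sum(abs(x) for x in band) / len(band) for band in band_features]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    reweighted = [[w * x for x in band] for w, band in zip(weights, band_features)]
    return reweighted, weights

# Toy features for, e.g., theta/alpha/beta bands of one channel.
bands = [[0.2, 0.1], [1.5, 1.2], [0.4, 0.3]]
weighted, w = band_attention(bands)
print([round(x, 3) for x in w])
```

The band with the strongest features receives the largest weight, so downstream layers (the graph convolution, in FBAGAN's case) see an emphasis on the most emotion-relevant frequency ranges.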
5. Si X, Huang D, Liang Z, Sun Y, Huang H, Liu Q, Yang Z, Ming D. Temporal aware Mixed Attention-based Convolution and Transformer Network for cross-subject EEG emotion recognition. Comput Biol Med 2024; 181:108973. PMID: 39213709; DOI: 10.1016/j.compbiomed.2024.108973.
Abstract
Emotion recognition is crucial for human-computer interaction, and electroencephalography (EEG) stands out as a valuable tool for capturing and reflecting human emotions. In this study, we propose a hierarchical hybrid model called the Mixed Attention-based Convolution and Transformer Network (MACTN). The model is designed to capture both local and global temporal information and is inspired by insights from neuroscientific research on the temporal dynamics of emotions. First, we introduce depth-wise temporal convolution and separable convolution to extract local temporal features. Then, a self-attention-based transformer is used to integrate the sparse global emotional features. In addition, a channel attention mechanism is designed to identify the most task-relevant channels, facilitating the capture of relationships between different channels and emotional states. Extensive experiments are conducted on three public datasets under both offline and online evaluation modes. In the multi-class cross-subject online evaluation on the THU-EP dataset, MACTN demonstrates an approximately 8% improvement in 9-class emotion recognition accuracy over state-of-the-art methods. In the multi-class cross-subject offline evaluation on the DEAP and SEED datasets, comparable performance is achieved solely from the raw EEG signals, without prior knowledge or transfer learning during feature extraction and learning. Furthermore, ablation studies show that integrating the self-attention and channel-attention mechanisms improves classification performance. This method won the final championship of the Emotional BCI Competition at the World Robot Contest. The source code is available at https://github.com/ThreePoundUniverse/MACTN.
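Depth-wise temporal convolution, the first stage above, filters each EEG channel with its own kernel and never mixes channels. A toy 'valid'-padding version (the real model would learn the kernels):

```python
def depthwise_conv1d(signals, kernels):
    # Depthwise temporal convolution: each channel is convolved with
    # its own kernel, with no mixing across channels.
    out = []
    for signal, kernel in zip(signals, kernels):
        k = len(kernel)
        out.append([sum(signal[t + j] * kernel[j] for j in range(k))
                    for t in range(len(signal) - k + 1)])
    return out

# Two EEG channels; one gets a 3-tap smoother, the other a differencer.
x = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
k = [[1 / 3, 1 / 3, 1 / 3], [1.0, 0.0, -1.0]]
print(depthwise_conv1d(x, k))
```

A separable convolution then follows the depthwise stage with a 1x1 (pointwise) convolution that mixes channels, which is how such models keep parameter counts low compared to full convolutions.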
6. Wang J, Ning X, Xu W, Li Y, Jia Z, Lin Y. Multi-source Selective Graph Domain Adaptation Network for cross-subject EEG emotion recognition. Neural Netw 2024; 180:106742. PMID: 39342695; DOI: 10.1016/j.neunet.2024.106742.
Abstract
The affective brain-computer interface is an important part of realizing emotional human-computer interaction. However, objective individual differences among subjects significantly hinder the application of electroencephalography (EEG) emotion recognition. Existing methods still lack complete extraction of subject-invariant EEG representations and the ability to fuse valuable information from multiple subjects to facilitate emotion recognition for the target subject. To address these challenges, we propose a Multi-source Selective Graph Domain Adaptation Network (MSGDAN), which can better utilize data from different source subjects and perform more robust emotion recognition on the target subject. The proposed network separates the individual information specific to each subject from the public information, i.e., the subject-invariant components shared across multi-source subjects. Moreover, the graph domain adaptation network captures both functional connectivity and regional states of the brain via a dynamic graph network and then integrates graph domain adaptation to ensure the invariance of both functional connectivity and regional states. To evaluate our method, we conduct cross-subject emotion recognition experiments on the SEED, SEED-IV, and DEAP datasets. The results demonstrate that MSGDAN has superior classification performance.
7. Bao Y, Xue M, Gohumpu J, Cao Y, Weng S, Fang P, Wu J, Yu B. Prenatal anxiety recognition model integrating multimodal physiological signal. Sci Rep 2024; 14:21767. PMID: 39294387; PMCID: PMC11410974; DOI: 10.1038/s41598-024-72507-8.
Abstract
Anxiety among pregnant women can significantly impact their overall well-being. However, the development of data-driven HCI interventions for this demographic is often hindered by data scarcity and collection challenges. In this study, we leverage the Empatica E4 wristband to gather physiological data from pregnant women in both resting and relaxed states, together with subjective reports on their anxiety levels. We integrate features from signals including Blood Volume Pulse (BVP), Skin Temperature (SKT), and Inter-Beat Interval (IBI). Employing a Support Vector Machine (SVM) algorithm, we construct a model capable of evaluating anxiety levels in pregnant women. Our model attains an emotion recognition accuracy of 69.3%, an advance in HCI technology tailored to this specific user group. Furthermore, we introduce conceptual ideas for biofeedback on maternal emotions and its interactive mechanism, shedding light on improved monitoring and timely intervention strategies to enhance the emotional health of pregnant women.
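Two standard heart-rate-variability features computable from the wristband's inter-beat intervals are SDNN and RMSSD; these illustrate the kind of scalar inputs such an SVM would consume, though the paper's exact feature set is not specified here:

```python
import math
import statistics

def ibi_features(ibis_ms):
    # SDNN: overall variability of the inter-beat intervals.
    sdnn = statistics.pstdev(ibis_ms)
    # RMSSD: root mean square of successive differences, a common
    # beat-to-beat (parasympathetic) variability marker.
    diffs = [b - a for a, b in zip(ibis_ms, ibis_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"sdnn": sdnn, "rmssd": rmssd}

print(ibi_features([800, 810, 790, 805, 795]))  # IBIs in milliseconds
```

Feature vectors like this (concatenated with BVP- and SKT-derived statistics) would then be standardized and passed to the SVM classifier.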
8. Castelnovo V, Canu E, Aiello EN, Curti B, Sibilla E, Torre S, Freri F, Tripodi C, Lumaca L, Spinelli EG, Schito P, Russo T, Falzone Y, Verde F, Silani V, Ticozzi N, Sturm VE, Rankin KP, Gorno-Tempini ML, Poletti B, Filippi M, Agosta F. How to detect affect recognition alterations in amyotrophic lateral sclerosis. J Neurol 2024. PMID: 39287680; DOI: 10.1007/s00415-024-12686-6.
Abstract
OBJECTIVE To define the clinical usability of an affect recognition (AR) battery-the Comprehensive Affect Testing System (CATS)-in an Italian sample of patients with amyotrophic lateral sclerosis (ALS). METHODS 96 ALS patients and 116 healthy controls underwent a neuropsychological assessment including the AR subtests of the abbreviated version of the CATS (CATS-A). CATS-A AR subtests and their global score (CATS-A AR Quotient, ARQ) were assessed for their factorial, convergent, and divergent validity. The diagnostic accuracy of each CATS-A AR measure in discriminating ALS patients with cognitive impairment from cognitively normal controls and patients was tested via receiver-operating characteristics analyses. Optimal cut-offs were identified for CATS-A AR measures yielding an acceptable AUC value (≥ .70). The ability of CATS-A ARQ to discriminate between different ALS cognitive phenotypes was also tested. Gray-matter (GM) volumes of controls, ALS patients with normal ARQ scores (ALS-nARQ), and ALS patients with impaired ARQ scores (ALS-iARQ) were compared using ANCOVA models. RESULTS CATS-A AR subtests and ARQ proved to have moderate-to-strong convergent and divergent validity. Almost all considered CATS-A measures reached acceptable accuracy and diagnostic power (AUC range = .79-.83). ARQ was the best diagnostic measure (sensitivity = .80; specificity = .75) and discriminated between different ALS cognitive phenotypes. Compared to ALS-nARQ, ALS-iARQ patients showed reduced GM volumes in the right anterior cingulate, right middle frontal, left inferior temporal, and superior occipital regions. CONCLUSIONS The AR subtests of the CATS-A, and in particular the CATS-A ARQ, are sound measures of AR in ALS. AR deficits may be a valid marker of frontotemporal involvement in these patients.
9. Gao Y, Zhu Z, Fang F, Zhang Y, Meng M. EEG emotion recognition based on data-driven signal auto-segmentation and feature fusion. J Affect Disord 2024; 361:356-366. PMID: 38885847; DOI: 10.1016/j.jad.2024.06.042.
Abstract
Pattern recognition based on network connections has recently been applied to brain-computer interface (BCI) research, offering new ideas for emotion recognition using electroencephalogram (EEG) signals. However, unified standards are currently lacking for selecting emotional signals in emotion recognition research, and potential associations between activation differences in brain regions and network connectivity patterns are often overlooked. To bridge this gap, a data-driven signal auto-segmentation and feature fusion algorithm (DASF) is proposed in this paper. First, the Phase Locking Value (PLV) method was used to construct the brain functional adjacency matrix of each subject, and the dynamic brain functional network across subjects was then constructed. Next, Tucker decomposition was performed and the Grassmann distance of the connectivity submatrix was calculated. Subsequently, different brain network states were distinguished and signal segments under emotional states were automatically extracted using data-driven methods. Then, tensor sparse representation was applied to the intercepted EEG signals to effectively extract functional connections under different emotional states. Finally, power-distribution-related features (differential entropy and energy features) and brain functional connection features were combined for classification using a support vector machine (SVM) classifier. The proposed method was validated on the ERN and DEAP datasets. Single-feature emotion classification accuracies of 86.57% and 87.74% were achieved on the valence and arousal dimensions, respectively; the proposed feature fusion method reached 89.14% and 89.65%, demonstrating an improvement in emotion recognition accuracy. These results show the superior classification performance of the proposed algorithm compared to state-of-the-art methods.
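The PLV that seeds the DASF pipeline is just the magnitude of the average unit phasor of the phase difference between two channels. A minimal sketch with synthetic phases (in practice the instantaneous phases would come from a Hilbert transform or wavelet filtering, which this toy skips):

```python
import cmath
import math

def plv(phases_a, phases_b):
    # Phase Locking Value: |mean of exp(i * phase difference)|.
    # 1 = perfectly phase-locked, near 0 = no consistent relationship.
    n = len(phases_a)
    s = sum(cmath.exp(1j * (pa - pb)) for pa, pb in zip(phases_a, phases_b))
    return abs(s) / n

t = [k / 100 for k in range(200)]                 # 2 s sampled at 100 Hz
phase1 = [2 * math.pi * 10 * tk for tk in t]      # 10 Hz oscillation
locked = [p + 0.5 for p in phase1]                # same rhythm, fixed lag
drift = [2 * math.pi * 11.3 * tk for tk in t]     # unrelated frequency
print(round(plv(phase1, locked), 3), round(plv(phase1, drift), 3))
```

Computing this for every channel pair fills one subject's functional adjacency matrix; stacking those matrices over time windows gives the dynamic network that the Tucker decomposition step then factorizes.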
10. Li J, Wang L, Zhang Z, Feng Y, Huang M, Liang D. Analysis and recognition of a novel experimental paradigm for musical emotion brain-computer interface. Brain Res 2024; 1839:149039. PMID: 38815645; DOI: 10.1016/j.brainres.2024.149039.
Abstract
Musical emotions have received increasing attention over the years. To better recognize emotions through a brain-computer interface (BCI), random music-playing and sequential music-playing experimental paradigms are proposed and compared in this paper. Both paradigms consist of three positive, three neutral and three negative pieces of music, and ten subjects participated in each. The features of the electroencephalography (EEG) signals are first analyzed in the time, frequency and spatial domains. To improve emotion recognition, a recognition model is proposed in which the optimal channels are selected by Pearson's correlation coefficient and the features are fused by combining differential entropy and wavelet packet energy. According to the analysis results, the features of the sequential music-playing paradigm differ more clearly among the three emotions. Its classification results are also better, with average accuracies for positive, neutral and negative emotions of 78.53%, 72.81% and 77.35%, respectively. The more pronounced the EEG changes induced by the emotions, the higher the classification accuracy. Comparing the two paradigms thus suggests a better way for music to induce emotions, and our research offers a novel perspective on affective BCIs.
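Channel selection via Pearson's correlation coefficient amounts to scoring each channel's feature series against the emotion labels and keeping the highest-scoring channels. A minimal sketch (the feature values and integer emotion codes below are hypothetical):

```python
import math

def pearson_r(x, y):
    # Pearson's correlation coefficient between two equal-length series.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# One channel's feature across five trials vs. hypothetical emotion codes.
channel_feature = [0.1, 0.4, 0.35, 0.8, 0.7]
labels = [0, 1, 1, 2, 2]
print(round(pearson_r(channel_feature, labels), 3))
```

Ranking channels by |r| and keeping the top-k gives the "optimal channels" fed into the fused differential-entropy and wavelet-packet-energy features.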
11. Morningstar M, Hughes C, French RC, Grannis C, Mattson WI, Nelson EE. Functional connectivity during facial and vocal emotion recognition: Preliminary evidence for dissociations in developmental change by nonverbal modality. Neuropsychologia 2024; 202:108946. PMID: 38945440; DOI: 10.1016/j.neuropsychologia.2024.108946.
Abstract
The developmental trajectory of emotion recognition (ER) skills is thought to vary by nonverbal modality, with vocal ER becoming mature later than facial ER. To investigate potential neural mechanisms contributing to this dissociation at a behavioural level, the current study examined whether youth's neural functional connectivity during vocal and facial ER tasks showed differential developmental change across time. Youth ages 8-19 (n = 41) completed facial and vocal ER tasks while undergoing functional magnetic resonance imaging, at two timepoints (1 year apart; n = 36 for behavioural data, n = 28 for neural data). Partial least squares analyses revealed that functional connectivity during ER is both distinguishable by modality (with different patterns of connectivity for facial vs. vocal ER) and across time-with changes in connectivity being particularly pronounced for vocal ER. ER accuracy was greater for faces than voices, and positively associated with age; although task performance did not change appreciably across a 1-year period, changes in latent functional connectivity patterns across time predicted participants' ER accuracy at Time 2. Taken together, these results suggest that vocal and facial ER are supported by distinguishable neural correlates that may undergo different developmental trajectories. Our findings are also preliminary evidence that changes in network integration may support the development of ER skills in childhood and adolescence.
12. Kramer M, Fink F, Campo LA, Akinci E, Wieser MO, Juckel G, Mavrogiorgou P. Video analysis of interaction in schizophrenia reveals functionally relevant abnormalities. Schizophr Res 2024; 274:24-32. PMID: 39250840; DOI: 10.1016/j.schres.2024.09.003.
Abstract
OBJECTIVE Deficits of dyadic social interaction seem to diminish social functioning in schizophrenia. However, most previous studies are of a limited ecological validity due to decontextualized experimental conditions far off from real-world interaction. In this pilot study, we thus exposed participants to a more real-world-like situation to generate new hypotheses for research and therapeutic interventions. METHODS Dyads of either participants with schizophrenia (n = 21) or control participants without mental disorder (n = 21) were presented with a 5-min emotionally engaging movie. The subsequent uninstructed dyadic interaction was videotaped and analyzed by means of a semi-quantitative, software-supported behavioral analysis. RESULTS The patients with schizophrenia showed significant abnormalities regarding their social interaction, such as more negative verbalizations, a more open display of negative affect and gaze abnormalities. Their interaction behavior was mostly characterized by neutral affect, silence and avoidance of direct eye contact. Neutral affect was associated with poorer psychosocial performance. Verbal intelligence and empathy were associated with positive interaction variables, which were also not impaired by psychotic symptom severity. CONCLUSION In this real-world-like dyadic interaction, participants with schizophrenia show distinct abnormalities that are relevant to psychosocial performance and consistent with a hypothesized lack of attunement to interaffective situations.
13. Bry C, Propice K, Bourgin J, Métral M. Social cognition, psychosocial development and well-being in galactosemia. Orphanet J Rare Dis 2024; 19:325. PMID: 39243040; PMCID: PMC11378408; DOI: 10.1186/s13023-024-03335-2.
Abstract
BACKGROUND Classic galactosemia is a rare inherited metabolic disease with long-term complications, particularly in the psychosocial domain. Patients report a lower quality of social life, difficulties in interactions and social relationships, and a lower mental health. We hypothesised that social cognition deficits could partially explain this psychological symptomatology. Eleven adults with galactosemia and 31 control adults participated in the study. We measured social cognition skills in cognitive and affective theory of Mind, and in basic and complex emotion recognition. We explored psychosocial development and mental well-being. RESULTS We found significant deficits on all 4 social cognition measures. Compared to controls, participants with galactosemia were impaired in the 2nd-order cognitive theory of mind, in affective theory of mind, and in basic and complex emotion recognition. Participants with galactosemia had a significant delay in their psychosexual development, but we found no delay in social development and no significant decrease in mental health. CONCLUSION Social cognition processes seem impaired among our participants with galactosemia. We discuss the future path research may follow. More research is needed to replicate and strengthen these results and establish the links between psychosocial complications and deficits in social cognition.
14. Pang B, Peng Y, Gao J, Kong W. Semi-supervised bipartite graph construction with active EEG sample selection for emotion recognition. Med Biol Eng Comput 2024; 62:2805-2824. PMID: 38700614; DOI: 10.1007/s11517-024-03094-z.
Abstract
Electroencephalogram (EEG) signals are derived from the central nervous system and are inherently difficult to camouflage, leading to the recent popularity of EEG-based emotion recognition. However, due to the non-stationary nature of EEG, inter-subject variability becomes an obstacle for recognition models to adapt well to different subjects. In this paper, we propose a novel approach called semi-supervised bipartite graph construction with active EEG sample selection (SBGASS) for cross-subject emotion recognition, which offers two significant advantages. Firstly, SBGASS adaptively learns a bipartite graph to characterize the underlying relationships between labeled and unlabeled EEG samples, effectively implementing the semantic connection for samples from different subjects. Secondly, we employ an active sample selection technique to reduce the impact of negative samples (outliers or noise) on bipartite graph construction. From the experimental results on the SEED-IV data set, we gain the following three insights. (1) SBGASS actively rejects negative labeled samples, which helps mitigate their impact when constructing the optimal bipartite graph and improves model performance. (2) Through the learned optimal bipartite graph, the transferability of labeled EEG samples is quantitatively analyzed and exhibits a decreasing tendency as the distance between each labeled sample and the corresponding class centroid increases. (3) Besides the improved recognition accuracy, the spatial-frequency patterns in emotion recognition are investigated via the acquired projection matrix.
15. Narsimha Reddy CH, Mahesh S, Manjunathachari K. Hybrid feature integration model and adaptive transformer approach for emotion recognition with EEG signals. Comput Methods Biomech Biomed Engin 2024; 27:1610-1632. PMID: 37688466; DOI: 10.1080/10255842.2023.2252551.
Abstract
Nowadays, electroencephalogram (EEG) signals are a major source for recognizing emotions, as they carry rich brain-related information. Several traditional recognition methodologies have been developed, but certain challenges remain that degrade and mislead the recognition process, including insignificant features, poor generalization capability, and computational burden. To alleviate these issues, an effective emotion recognition model is proposed using a heuristic-aided, transformer-based approach. Initially, the input EEG signals are gathered and decomposed via a 5-level Discrete Wavelet Transform (DWT). The decomposed signals are then fed to feature extraction, where Logistic Regression (LR) is used to obtain the noteworthy features. These features are passed to a hybrid weighted feature selection stage, in which the weights are optimized with the Adaptive Bald Eagle Search Optimization (ABESO) algorithm. Finally, the selected weighted features are sent to the recognition phase, where an Optimized Block Recurrent Transformer (OBRT) performs recognition, with its parameters likewise tuned by the ABESO algorithm. The accuracy and precision of the designed approach reach 96% and 94%, respectively. The assessment shows improved performance, making the system more feasible for recognizing human emotions.
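The 5-level DWT front end repeatedly splits a signal into a coarse approximation and a detail band. A minimal sketch using the Haar wavelet (a stand-in here, since the abstract does not name the mother wavelet used):

```python
import math

def haar_dwt(signal):
    # One level of the Haar DWT: pairwise averages (approximation)
    # and pairwise differences (detail), scaled to preserve energy.
    s = 1 / math.sqrt(2)
    approx = [(a + b) * s for a, b in zip(signal[::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[::2], signal[1::2])]
    return approx, detail

def wavedec(signal, levels):
    # Multi-level decomposition: repeatedly split the approximation,
    # as in the 5-level DWT front end described above.
    details = []
    approx = signal
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return approx, details

x = [float(i) for i in range(32)]          # toy 32-sample EEG segment
approx, details = wavedec(x, 5)
print(len(approx), [len(d) for d in details])  # → 1 [16, 8, 4, 2, 1]
```

Each detail band roughly corresponds to one frequency range, which is what makes the decomposition a natural precursor to band-wise feature extraction.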
Collapse
|
16
|
Fu B, Yu X, Jiang G, Sun N, Liu Y. Enhancing local representation learning through global-local integration with functional connectivity for EEG-based emotion recognition. Comput Biol Med 2024; 179:108857. [PMID: 39018882 DOI: 10.1016/j.compbiomed.2024.108857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Revised: 06/21/2024] [Accepted: 07/06/2024] [Indexed: 07/19/2024]
Abstract
Emotion recognition based on electroencephalogram (EEG) signals is crucial in understanding human affective states. Current research is limited in its extraction of local features, whose representation capability is insufficient to comprehensively capture emotional information. In this study, a novel approach is proposed to enhance local representation learning through global-local integration with functional connectivity for EEG-based emotion recognition. By leveraging the functional connectivity of brain regions, EEG signals are divided into global embeddings that represent comprehensive brain connectivity patterns throughout the entire process and local embeddings that reflect dynamic interactions within specific brain functional networks at particular moments. Firstly, a convolutional feature extraction branch based on the residual network is designed to extract local features from the global embedding. To further improve the representation ability and accuracy of local features, a multidimensional collaborative attention (MCA) module is introduced. Secondly, the local features and patch-embedded local embeddings are integrated into the feature coupling module (FCM), which utilizes hierarchical connections and enhanced cross-attention to couple region-level features, thereby enhancing local representation learning. Experimental results on three public datasets show that, compared with other methods, this method improves accuracy by 4.92% on DEAP, 1.11% on SEED, and 7.76% on SEED-IV, demonstrating its superior performance in emotion recognition tasks.
Collapse
|
17
|
Ger E, Manfredi M, Osório AAC, Ribeiro CF, Almeida A, Güdel A, Calbi M, Daum MM. Duration of face mask exposure matters: evidence from Swiss and Brazilian kindergartners' ability to recognise emotions. Cogn Emot 2024; 38:857-871. [PMID: 38576358 DOI: 10.1080/02699931.2024.2331795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2023] [Revised: 03/07/2024] [Accepted: 03/09/2024] [Indexed: 04/06/2024]
Abstract
Wearing facial masks became a common practice worldwide during the COVID-19 pandemic. This study investigated (1) whether facial masks that cover adult faces affect 4- to 6-year-old children's recognition of emotions in those faces and (2) whether the duration of children's exposure to masks is associated with emotion recognition. We tested children from Switzerland (N = 38) and Brazil (N = 41). Brazil represented longer mask exposure due to a stricter mandate during COVID-19. Children had to choose a face displaying a specific emotion (happy, angry, or sad) when the face wore either no cover, a facial mask, or sunglasses. Longer mask exposure was associated with better emotion recognition. Controlling for the hours of exposure, children were less likely to recognise emotions in partially hidden faces. Moreover, Brazilian children were more accurate in recognising happy faces than Swiss children. Overall, facial masks may negatively impact children's emotion recognition. However, prolonged exposure appears to buffer the lack of facial cues from the nose and mouth. In conclusion, restricting facial cues due to masks may impair kindergarten children's emotion recognition in the short run. However, it may facilitate their broader reading of facial emotional cues in the long run.
Collapse
|
18
|
Tan W, Zhang H, Wang Z, Li H, Gao X, Zeng N. S 3T-Net: A novel electroencephalogram signals-oriented emotion recognition model. Comput Biol Med 2024; 179:108808. [PMID: 38996556 DOI: 10.1016/j.compbiomed.2024.108808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2024] [Revised: 06/01/2024] [Accepted: 06/24/2024] [Indexed: 07/14/2024]
Abstract
In this paper, a novel skipping spatial-spectral-temporal network (S3T-Net) is developed to handle intra-individual differences in electroencephalogram (EEG) signals for accurate, robust, and generalized emotion recognition. In particular, aiming at the 4D features extracted from the raw EEG signals, a multi-branch architecture is proposed to learn spatial-spectral cross-domain representations, which enhances the model's generalization ability. Time dependency among different spatial-spectral features is further captured via a bi-directional long short-term memory module, which employs an attention mechanism to integrate context information. Moreover, a skip-change unit is designed to add an auxiliary pathway for updating model parameters, which alleviates the vanishing gradient problem in complex spatial-temporal networks. Evaluation results show that the proposed S3T-Net outperforms other advanced models in terms of emotion recognition accuracy, yielding performance improvements of 0.23%, 0.13%, and 0.43% over the sub-optimal model in three test scenes, respectively. In addition, the effectiveness and superiority of the key components of S3T-Net are demonstrated through various experiments. As a reliable and competent emotion recognition model, the proposed S3T-Net contributes to the development of intelligent sentiment analysis in the human-computer interaction (HCI) realm.
Collapse
|
19
|
Hendel RK, Hellem MNN, Larsen IU, Vinther-Jensen T, Hjermind LE, Nielsen JE, Vogel A. Impairments of social cognition significantly predict the progression of functional decline in Huntington's disease: A 6-year follow-up study. APPLIED NEUROPSYCHOLOGY. ADULT 2024; 31:777-786. [PMID: 35549503 DOI: 10.1080/23279095.2022.2073824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
This study sought to investigate whether Huntington's Disease gene expansion carriers who were impaired in social cognition differed significantly from those who were impaired in executive functions. It also investigated which of the two cognitive domains could predict the decrease in total functional capacity over a 6-year follow-up period. Premanifest and motor-manifest Huntington's Disease gene expansion carriers (N = 98) underwent a neurological and neuropsychological examination at Time 1 (2012-2013). Regression-based normative data were used to classify impairments in the two cognitive domains. Follow-up participants (N = 80) had their functional capacity reexamined at Time 2 (2018-2020) to examine which cognitive domain could predict the decrease in functional capacity over the 6-year follow-up. More than 50% of the participants were impaired in the domain of social cognition, and these participants differed significantly from those who were impaired in executive functions. Motor function and impairments in social cognition significantly predicted the decline in functional capacity. The Emotion Hexagon test was the only social cognitive task that significantly predicted the decline in functional capacity. Social cognition comprises unique and separate functions in Huntington's Disease, unaffected by executive functions. This study emphasizes the importance of regular assessment of social cognition in Huntington's Disease and the clinical relevance of impaired social cognitive function.
Collapse
|
20
|
Raheel A. Emotion analysis and recognition in 3D space using classifier-dependent feature selection in response to tactile enhanced audio-visual content using EEG. Comput Biol Med 2024; 179:108807. [PMID: 38970831 DOI: 10.1016/j.compbiomed.2024.108807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2024] [Revised: 06/13/2024] [Accepted: 06/24/2024] [Indexed: 07/08/2024]
Abstract
Traditional media such as text, images, audio, and video primarily target specific senses like vision and hearing. In contrast, multiple sensorial media aims to create immersive experiences by integrating additional sensory modalities such as touch, smell, and taste where applicable. Tactile enhanced audio-visual content leverages the sense of touch in addition to visual and auditory stimuli, aiming to create a more immersive and engaging interaction for users. Previously, tactile enhanced content has been explored in 2D emotional space (valence and arousal). In this paper, EEG data recorded in response to tactile enhanced audio-visual content is labeled based on a self-assessment manikin scale in three dimensions, i.e., valence, arousal, and dominance. Statistical significance (with a 95% confidence interval) is also established based on the gathered scores, highlighting a significant difference between traditional media and tactile enhanced media in the arousal and dominance dimensions. A new methodology is proposed using a classifier-dependent feature selection approach to classify valence, arousal, and dominance states using three different classifiers. Highest accuracies of 75%, 73.8%, and 75% are achieved for classifying valence, arousal, and dominance states, respectively. The proposed scheme outperforms previous emotion-recognition studies of enhanced multimedia content in terms of accuracy, F-score, and other error parameters.
Collapse
|
21
|
Fang A, Zhong P, Pan F, Li Y, He P. Impact of emotional states on tinnitus sound therapy efficacy based on ECG signals and emotion recognition model. J Neurosci Methods 2024; 409:110213. [PMID: 38964476 DOI: 10.1016/j.jneumeth.2024.110213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2024] [Revised: 06/07/2024] [Accepted: 06/28/2024] [Indexed: 07/06/2024]
Abstract
BACKGROUND Diagnosis and severity assessment of tinnitus are mostly based on the patient's descriptions and subjective questionnaires, lacking objective means of diagnosis and assessment; their accuracy fluctuates with the clarity of the patient's description. This complicates the timely modification of treatment strategies or therapeutic music to improve treatment efficacy. NEW METHOD We employed a novel random convolutional kernel-based method for electrocardiogram (ECG) signal analysis to identify patients' emotional states during Music Tinnitus Sound Therapy (Music-TST) sessions. We then analyzed correlations between emotional changes in different treatment phases and Tinnitus Handicap Inventory (THI) score differences to determine the impact of emotions on tinnitus treatment efficacy. RESULTS This study revealed a significant correlation between patients' emotion changes during Music-TST and the therapy's effectiveness. Changes in the arousal and dominance dimensions were strongly linked to THI variations. These findings highlight the substantial impact of emotional responses on sound therapy's efficacy, offering a new perspective for understanding and optimizing tinnitus treatment. COMPARISON WITH EXISTING METHODS Compared to existing methods, we proposed an objective indicator to assess the progress of sound therapy; the indicator could also be used to provide feedback to optimize the therapeutic music. CONCLUSIONS This study revealed the critical role of emotion changes in tinnitus sound therapy. By integrating objective ECG-based emotion analysis with traditional subjective scales such as the THI, we present an innovative approach to assess and potentially optimize therapy effectiveness. This finding could lead to more personalized and effective treatment strategies for tinnitus sound therapy.
Collapse
|
22
|
Biaggi A, Hazelgrove K, Waites F, Bind RH, Lawrence AJ, Fuste M, Conroy S, Howard LM, Mehta MA, Miele M, Seneviratne G, Pawlby S, Pariante CM, Dazzan P. Predictors of mother-infant interaction quality in women at risk of postpartum psychosis: The role of emotion recognition. J Affect Disord 2024; 367:562-572. [PMID: 39216645 DOI: 10.1016/j.jad.2024.08.180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/27/2023] [Revised: 08/26/2024] [Accepted: 08/27/2024] [Indexed: 09/04/2024]
Abstract
BACKGROUND Limited research exists on mother-infant interaction in women at risk of postpartum psychosis (PP). This study aimed to investigate potential predictors of mother-infant interaction quality in women at risk of PP during the first postnatal year. The potential predictors investigated were: maternal ability to recognize emotions, childhood maltreatment, parenting stress, and infant social-interactive behaviour at birth. METHODS 98 women (and their offspring) were included, 40 at risk of PP because of a diagnosis of Bipolar Disorder, Schizoaffective Disorder or previous PP, and 58 with no current/previous mental illness or family history of PP. Mother-infant interaction was assessed using the CARE-Index at 8 weeks and 12 months postpartum. Maternal ability to recognize emotions was assessed with the VERT-K, maternal experience of childhood maltreatment with the CECA-Q, maternal parenting stress with the PSI-SF, and infant social-interactive behaviour with the NBAS. RESULTS Women at risk of PP were less able to recognize fear than healthy controls, and this predicted the quality of the mother-infant interaction at 8 weeks and 12 months postpartum, over and above the effect of maternal Group (respectively, β = 0.33, p = .015; β = 0.40, p = .006). Infant social-interactive behaviour at birth was a significant predictor of mother-infant interaction at 12 months (β = 0.32, p = .031), although this did not differ significantly between the groups. LIMITATIONS A relatively small sample size precluded a more in-depth investigation of indirect pathways and other potential predictors. CONCLUSIONS These results are important as they suggest that preventive interventions targeting emotion recognition may be implemented in women at risk of PP, with the aim of improving mother-infant interaction and potentially also the infant's long-term development.
Collapse
|
23
|
Farhadi Sedehi J, Jafarnia Dabanloo N, Maghooli K, Sheikhani A. Multimodal insights into granger causality connectivity: Integrating physiological signals and gated eye-tracking data for emotion recognition using convolutional neural network. Heliyon 2024; 10:e36411. [PMID: 39253213 PMCID: PMC11381760 DOI: 10.1016/j.heliyon.2024.e36411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2024] [Revised: 08/14/2024] [Accepted: 08/14/2024] [Indexed: 09/11/2024] Open
Abstract
This study introduces a groundbreaking method to enhance the accuracy and reliability of emotion recognition systems by combining electrocardiogram (ECG) with electroencephalogram (EEG) data, using an eye-tracking gated strategy. Initially, we propose a technique to filter out irrelevant portions of emotional data by employing pupil diameter metrics from eye-tracking data. Subsequently, we introduce an innovative approach for estimating effective connectivity to capture the dynamic interaction between the brain and the heart during emotional states of happiness and sadness. Granger causality (GC) is estimated and utilized to optimize input for a highly effective pre-trained convolutional neural network (CNN), specifically ResNet-18. To assess this methodology, we employed EEG and ECG data from the publicly available MAHNOB-HCI database, using a 5-fold cross-validation approach. Our method achieved an impressive average accuracy and area under the curve (AUC) of 91.00% and 0.97, respectively, for GC-EEG-ECG images processed with ResNet-18. Comparative analysis with state-of-the-art studies clearly shows that augmenting ECG with EEG and refining data with an eye-tracking strategy significantly enhances emotion recognition performance across various emotions.
Collapse
|
24
|
Qiu L, Zhong L, Li J, Feng W, Zhou C, Pan J. SFT-SGAT: A semi-supervised fine-tuning self-supervised graph attention network for emotion recognition and consciousness detection. Neural Netw 2024; 180:106643. [PMID: 39186838 DOI: 10.1016/j.neunet.2024.106643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Revised: 04/11/2024] [Accepted: 08/14/2024] [Indexed: 08/28/2024]
Abstract
Emotion recognition is highly important in the field of brain-computer interfaces (BCIs). However, due to the individual variability in electroencephalogram (EEG) signals and the challenges in obtaining accurate emotional labels, traditional methods have shown poor performance in cross-subject emotion recognition. In this study, we propose a cross-subject EEG emotion recognition method based on a semi-supervised fine-tuning self-supervised graph attention network (SFT-SGAT). First, we model multi-channel EEG signals by constructing a graph structure that dynamically captures the spatiotemporal topological features of EEG signals. Second, we employ a self-supervised graph attention neural network to facilitate model training, mitigating the impact of signal noise on the model. Finally, a semi-supervised approach is used to fine-tune the model, enhancing its generalization ability in cross-subject classification. By combining supervised and unsupervised learning techniques, SFT-SGAT maximizes the utility of limited labeled data in EEG emotion recognition tasks, thereby enhancing the model's performance. Experiments based on leave-one-subject-out cross-validation demonstrate that SFT-SGAT achieves state-of-the-art cross-subject emotion recognition performance on the SEED and SEED-IV datasets, with accuracies of 92.04% and 82.76%, respectively. Furthermore, experiments conducted on a self-collected dataset comprising ten healthy subjects and eight patients with disorders of consciousness (DOCs) revealed that SFT-SGAT attains high classification performance in healthy subjects (maximum accuracy of 95.84%) and was successfully applied to DOC patients, with four patients achieving emotion recognition accuracies exceeding 60%. The experiments demonstrate the effectiveness of the proposed SFT-SGAT model in cross-subject EEG emotion recognition and its potential for assessing levels of consciousness in patients with DOC.
Collapse
|
25
|
Ivanova-Serokhvostova A, Fanti K, Bonillo A, Supèr H, Corrales M, Pérez-Bonaventura I, Pamias M, Ramos-Quiroga AJ, Torrubia R, Nadal R, Frick PJ, Molinuevo B. Do Children with High Callous-Unemotional Traits Have Attentional Deficits to Emotional Stimuli? Evidence from a Multi-Method and Multi-Informant Study. Child Psychiatry Hum Dev 2024:10.1007/s10578-024-01739-6. [PMID: 39152275 DOI: 10.1007/s10578-024-01739-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 07/07/2024] [Indexed: 08/19/2024]
Abstract
Callous-unemotional (CU) traits in children and adolescents are linked to severe and persistent antisocial behavior. Based on past empirical research, several theoretical models have suggested that CU traits may be partly explained by difficulties in correctly identifying others' emotional states as well as their reduced attention to others' eyes, which could be important for both causal theory and treatment. This study tested the relationships among CU traits, emotion recognition of facial expressions and visual behavior in a sample of 52 boys referred to a clinic for conduct problems (Mage = 10.29 years; SD = 2.06). We conducted a multi-method and multi-informant assessment of CU traits through the Child Problematic Traits Inventory (CPTI), the Inventory of Callous-Unemotional (ICU), and the Clinical Assessment of Prosocial Emotions-Version 1.1 (CAPE). The primary goal of the study was to compare the utility of these methods for forming subgroups of youth that differ in their emotional processing abilities. An emotion recognition task assessed recognition accuracy (percentage of mistakes) and absolute dwell time on the eyes or mouth region for each emotion. Results from repeated measures ANOVAs revealed that low and high CU groups did not differ in emotion recognition accuracy, irrespective of the method of assessing CU traits. However, the high CU group showed reduced attention to the eyes of fearful and sad facial expressions (using the CPTI) or to all emotions (using the CAPE). The high CU group also showed a general increase in attention to the mouth area, but only when assessed by the CAPE. These findings provide evidence to support abnormalities in how those elevated on CU traits process emotional stimuli, especially when assessed by a clinical interview, which could guide appropriate assessment and more successful interventions for this group of youth.
Collapse
|