1. Sabatinelli D, Farkas AH, Gehr MC. Moving toward reality: Electrocortical reactivity to naturalistic multimodal emotional videos. Psychophysiology 2024; 61:e14526. [PMID: 38273427 DOI: 10.1111/psyp.14526]
Abstract
While previous research has investigated the effects of emotional videos on peripheral physiological measures and conscious experience, this study extends that work to electrocortical measures, specifically the steady-state visual-evoked potential (ssVEP). A carefully curated set of 45 videos, designed to represent a wide range of emotional and neutral content, was presented with a flickering border. The videos featured a continuous single-shot perspective and a natural soundtrack, and excluded elements associated with professional films, to enhance realism. The results demonstrate a consistent reduction in ssVEP amplitude during emotional videos, which strongly correlates with the rated emotional intensity of the clips. This suggests that narrative audiovisual stimuli can track dynamic emotional processing in the cortex, providing new avenues for research in affective neuroscience. The findings highlight the potential of realistic video stimuli for investigating how the human brain processes emotional events in a paradigm with increased ecological validity. Future studies can develop this paradigm further by expanding the video set, targeting specific cortical networks, and manipulating narrative predictability. Overall, this study establishes a foundation for investigating emotional perception with realistic video stimuli and may expand our understanding of real-world emotional processing in the human brain.
Affiliation(s)
- Dean Sabatinelli
- Department of Psychology, University of Georgia, Athens, Georgia, USA
- Department of Neuroscience, University of Georgia, Athens, Georgia, USA
- Andrew H Farkas
- Department of Psychology, University of Georgia, Athens, Georgia, USA
- Matthew C Gehr
- Department of Psychology, University of Georgia, Athens, Georgia, USA
2. Paliwal V, Das K, Doesburg SM, Medvedev G, Xi P, Ribary U, Pachori RB, Vakorin VA. Classifying Routine Clinical Electroencephalograms With Multivariate Iterative Filtering and Convolutional Neural Networks. IEEE Trans Neural Syst Rehabil Eng 2024; 32:2038-2048. [PMID: 38768007 DOI: 10.1109/tnsre.2024.3403198]
Abstract
Electroencephalography (EEG) is widely used in basic and clinical neuroscience to explore neural states in various populations, and classifying these recordings is a fundamental challenge. While machine learning shows promising results in classifying long multivariate time series, optimal prediction models and feature extraction methods for EEG classification remain elusive. Our study addressed EEG classification under the framework of brain age prediction, applying a deep learning model to EEG time series. We hypothesized that decomposing EEG signals into oscillatory modes would yield more accurate age predictions than using raw or canonically frequency-filtered EEG. Specifically, we employed multivariate intrinsic mode functions (MIMFs), an empirical mode decomposition (EMD) variant based on multivariate iterative filtering (MIF), with a convolutional neural network (CNN) model. Testing a large dataset of routine clinical EEG scans (n = 6540) from patients aged 1 to 103 years, we found that an ad-hoc CNN model without fine-tuning could reasonably predict brain age from EEGs. Crucially, MIMF decomposition significantly improved performance compared to canonical brain rhythms (from delta to lower gamma oscillations). Our approach achieved a mean absolute error (MAE) of 13.76 ± 0.33 and a correlation coefficient of 0.64 ± 0.01 in brain age prediction over the entire lifespan. Our findings indicate that CNN models applied to EEGs that preserve their original temporal structure remain a promising framework for EEG classification, and that adaptive signal decompositions such as MIF can enhance CNN performance in this task.
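As a rough illustration of the evaluation reported above (not the authors' code), the two headline metrics, mean absolute error and the correlation between true and predicted age, can be computed as follows; the toy age values are invented for demonstration:

```python
import numpy as np

def brain_age_metrics(y_true, y_pred):
    """Mean absolute error and Pearson correlation, the two
    evaluation metrics reported for brain-age prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = np.abs(y_true - y_pred).mean()
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return mae, r

# Hypothetical chronological vs. predicted ages
mae, r = brain_age_metrics([10, 40, 70], [18, 45, 60])
print(round(mae, 2), round(r, 3))  # → 7.67 0.987
```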
3. Dong H, Zhou J, Fan C, Zheng W, Tao L, Kwan HK. Multi-scale 3D-CRU for EEG emotion recognition. Biomed Phys Eng Express 2024; 10:045018. [PMID: 38670076 DOI: 10.1088/2057-1976/ad43f1]
Abstract
In this paper, we propose a novel multi-scale 3D-CRU model with the goal of extracting more discriminative emotion features from EEG signals. By concurrently exploiting the relative electrode locations and different frequency subbands of EEG signals, a three-dimensional feature representation is reconstructed in which the Delta (δ) frequency pattern is included. We employ a multi-scale approach, termed 3D-CRU, to concurrently extract frequency and spatial features at varying levels of granularity within each time segment. In the proposed 3D-CRU, we introduce a multi-scale 3D Convolutional Neural Network (3D-CNN) to effectively capture discriminative information embedded within the 3D feature representation. To model the temporal dynamics across consecutive time segments, we incorporate a Gated Recurrent Unit (GRU) module to extract temporal representations from the time series of combined frequency-spatial features. Ultimately, the 3D-CRU model yields a global feature representation encompassing comprehensive information across the time, frequency, and spatial domains. Extensive experiments on the publicly available DEAP and SEED databases provide empirical evidence of the enhanced emotion recognition performance of the proposed model. These findings underscore the efficacy of the features extracted by the proposed multi-scale 3D-CRU model, particularly with the incorporation of the Delta (δ) frequency pattern. Specifically, on the DEAP dataset, the accuracies for Valence and Arousal are 93.12% and 94.31%, respectively, while on the SEED dataset the accuracy is 92.25%.
Affiliation(s)
- Hao Dong
- School of Computer Science and Technology, Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, Anhui, People's Republic of China
- Jian Zhou
- School of Computer Science and Technology, Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, Anhui, People's Republic of China
- Cunhang Fan
- School of Computer Science and Technology, Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, Anhui, People's Republic of China
- Wenming Zheng
- School of Biological Science and Medical Engineering, Key Laboratory of Child Development and Learning Science, Southeast University, Nanjing 210096, People's Republic of China
- Liang Tao
- School of Computer Science and Technology, Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, Anhui, People's Republic of China
- Hon Keung Kwan
- Department of Electrical and Computer Engineering, University of Windsor, Windsor, Ontario, N9B 3P4, Canada
4. Huang W, Xu W, Wan R, Zhang P, Zha Y, Pang M. Auto Diagnosis of Parkinson's Disease Via a Deep Learning Model Based on Mixed Emotional Facial Expressions. IEEE J Biomed Health Inform 2024; 28:2547-2557. [PMID: 37022035 DOI: 10.1109/jbhi.2023.3239780]
Abstract
Parkinson's disease (PD) is a common degenerative disease of the nervous system in the elderly. Early diagnosis of PD is very important for potential patients to receive prompt treatment and avoid aggravation of the disease. Recent studies have found that PD patients often suffer from disordered emotional expression, forming the characteristic "masked face". On this basis, we propose an automatic PD diagnosis method based on mixed emotional facial expressions. The method comprises four steps: First, we synthesize virtual face images containing six basic expressions (i.e., anger, disgust, fear, happiness, sadness, and surprise) via generative adversarial learning, to approximate the premorbid expressions of PD patients. Second, we design an effective screening scheme to assess the quality of the synthesized facial expression images and shortlist the high-quality ones. Third, we train a deep feature extractor together with a facial expression classifier on a mixture of the original facial expression images of PD patients, the high-quality synthesized facial expression images of PD patients, and normal facial expression images from other public face datasets. Finally, we use the well-trained deep feature extractor to extract latent expression features from six facial expression images of a potential PD patient to conduct PD/non-PD prediction. To show real-world impact, we also collected a new facial expression dataset of PD patients in collaboration with a hospital. Extensive experiments validate the effectiveness of the proposed method for PD diagnosis and facial expression recognition.
5. Zhang Y, Liu H, Wang D, Zhang D, Lou T, Zheng Q, Quek C. Cross-modal credibility modelling for EEG-based multimodal emotion recognition. J Neural Eng 2024; 21:026040. [PMID: 38565099 DOI: 10.1088/1741-2552/ad3987]
Abstract
Objective. The study of emotion recognition through electroencephalography (EEG) has garnered significant attention recently. Integrating EEG with other peripheral physiological signals may greatly enhance performance in emotion recognition. Nonetheless, existing approaches still suffer from two predominant challenges: modality heterogeneity, stemming from the diverse mechanisms across modalities, and fusion credibility, which arises when one or multiple modalities fail to provide highly credible signals. Approach. In this paper, we introduce a novel multimodal physiological signal fusion model that incorporates both intra-inter modality reconstruction and sequential pattern consistency, thereby ensuring computable and credible EEG-based multimodal emotion recognition. For the modality heterogeneity issue, we first implement a local self-attention transformer to obtain intra-modal features for each modality. Subsequently, we devise a pairwise cross-attention transformer to reveal the inter-modal correlations among different modalities, thereby rendering different modalities compatible and diminishing the heterogeneity concern. For the fusion credibility issue, we introduce the concept of sequential pattern consistency to measure whether different modalities evolve in a consistent way. Specifically, we propose to measure the varying trends of different modalities and compute inter-modality consistency scores to ascertain fusion credibility. Main results. We conduct extensive experiments on two benchmark datasets (DEAP and MAHNOB-HCI) with the subject-dependent paradigm. For the DEAP dataset, our method improves accuracy by 4.58% and F1 score by 0.63% compared to the state-of-the-art baseline. Similarly, for the MAHNOB-HCI dataset, our method improves accuracy by 3.97% and F1 score by 4.21%. In addition, we gain insight into the proposed framework through significance tests, ablation experiments, confusion matrices, and hyperparameter analysis, demonstrating the effectiveness of the proposed credibility modelling through statistical analysis and carefully designed experiments. Significance. All experimental results demonstrate the effectiveness of our proposed architecture and indicate that credibility modelling is essential for multimodal emotion recognition.
Affiliation(s)
- Yuzhe Zhang
- School of Computer Science and Technology, MOEKLINNS Lab, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Huan Liu
- School of Computer Science and Technology, MOEKLINNS Lab, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Di Wang
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, Singapore 639798, Singapore
- Dalin Zhang
- Department of Computer Science, Aalborg University, Fredrik Bajers Vej 7 K, 9220 Aalborg, Denmark
- Tianyu Lou
- School of Computer Science and Technology, MOEKLINNS Lab, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Qinghua Zheng
- School of Computer Science and Technology, MOEKLINNS Lab, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Chai Quek
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, Singapore 639798, Singapore
6. Gan K, Li R, Zhang J, Sun Z, Yin Z. Instantaneous estimation of momentary affective responses using neurophysiological signals and a spatiotemporal emotional intensity regression network. Neural Netw 2024; 172:106080. [PMID: 38160622 DOI: 10.1016/j.neunet.2023.12.034]
Abstract
Previous studies in affective computing often use a fixed emotional label to train an emotion classifier with electroencephalography (EEG) from individuals experiencing an affective stimulus. However, EEGs encode emotional dynamics that include varying intensities within a given emotional category. To investigate these variations in emotional intensity, we propose a framework that obtains momentary affective labels for fine-grained segments of EEGs with human feedback. We then model these labeled segments using a novel spatiotemporal emotional intensity regression network (STEIR-Net), which integrates temporal EEG patterns from nine predefined cortical regions to provide a continuous estimation of emotional intensity. We demonstrate that STEIR-Net outperforms classical regression models, reducing the root mean square error (RMSE) by an average of 4-9% and 2-4% for the SEED and SEED-IV databases, respectively. We find that the frontal and temporal cortical regions contribute significantly to the variation in affective intensity. Higher absolute values of the Spearman correlation coefficient between model estimates and momentary affective labels were observed under happiness (0.2114) and fear (0.2072) than under neutral (0.1694) and sad (0.1895) emotions. In addition, increasing the input length of the EEG segments from 4 to 20 s further reduces the RMSE from 1.3548 to 1.3188.
Affiliation(s)
- Kaiyu Gan
- Engineering Research Center of Optical Instrument and System, Ministry of Education, Shanghai Key Lab of Modern Optical System, University of Shanghai for Science and Technology, Shanghai 200093, PR China; School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, PR China
- Ruiding Li
- Engineering Research Center of Optical Instrument and System, Ministry of Education, Shanghai Key Lab of Modern Optical System, University of Shanghai for Science and Technology, Shanghai 200093, PR China; School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, PR China
- Jianhua Zhang
- OsloMet Artificial Intelligence Lab, Department of Computer Science, Oslo Metropolitan University, Oslo N-0130, Norway
- Zhanquan Sun
- Engineering Research Center of Optical Instrument and System, Ministry of Education, Shanghai Key Lab of Modern Optical System, University of Shanghai for Science and Technology, Shanghai 200093, PR China; School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, PR China
- Zhong Yin
- Engineering Research Center of Optical Instrument and System, Ministry of Education, Shanghai Key Lab of Modern Optical System, University of Shanghai for Science and Technology, Shanghai 200093, PR China; School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, PR China
7. Wang Z, Wang Y, Wan X, Tang Y. Cerebral asymmetry representation learning-based deep subdomain adaptation network for electroencephalogram-based emotion recognition. Physiol Meas 2024; 45:035004. [PMID: 38422513 DOI: 10.1088/1361-6579/ad2eb6]
Abstract
Objective. Extracting discriminative spatial information from multiple electrodes is a crucial and challenging problem for electroencephalogram (EEG)-based emotion recognition. Additionally, the domain shift caused by individual differences degrades the performance of cross-subject EEG classification. Approach. To deal with the above problems, we propose the cerebral asymmetry representation learning-based deep subdomain adaptation network (CARL-DSAN) to enhance cross-subject EEG-based emotion recognition. Specifically, the CARL module is inspired by the neuroscience finding that asymmetrical activations of the left and right brain hemispheres occur during cognitive and affective processes. In the CARL module, we introduce a novel two-step strategy for extracting discriminative features through intra-hemisphere spatial learning and asymmetry representation learning. Moreover, the transformer encoders within the CARL module can emphasize the contributive electrodes and electrode pairs. Subsequently, the DSAN module, known for its superior performance over global domain adaptation, is adopted to mitigate domain shift and further improve cross-subject performance by aligning relevant subdomains that share the same class samples. Main results. To validate the effectiveness of CARL-DSAN, we conduct subject-independent experiments on the DEAP database, achieving accuracies of 68.67% and 67.11% for arousal and valence classification, respectively, and corresponding accuracies of 67.70% and 67.18% on the MAHNOB-HCI database. Significance. The results demonstrate that CARL-DSAN can achieve outstanding cross-subject performance in both arousal and valence classification.
Affiliation(s)
- Zhe Wang
- The School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, People's Republic of China
- Yongxiong Wang
- The School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, People's Republic of China
- Xin Wan
- The School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, People's Republic of China
- Yiheng Tang
- The School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, People's Republic of China
8. Aydın S, Onbaşı L. Graph theoretical brain connectivity measures to investigate neural correlates of music rhythms associated with fear and anger. Cogn Neurodyn 2024; 18:49-66. [PMID: 38406195 PMCID: PMC10881947 DOI: 10.1007/s11571-023-09931-5]
Abstract
The present study tests the hypothesis that the emotions of fear and anger are associated with distinct psychophysiological and neural circuitry, in line with the discrete emotion model and their contrasting neurotransmitter activities, despite being grouped together in many studies because of their similar arousal-valence scores in dimensional emotion models. EEG data were downloaded from the OpenNeuro platform (accession number ds002721). Brain connectivity estimates were obtained using both functional and effective connectivity estimators in the analysis of short (2 s) and long (6 s) EEG segments across the cortex. In tests, discrete emotions and resting states were identified by frequency-band-specific brain network measures, and contrasting emotional states were then classified with 5-fold cross-validated Long Short-Term Memory networks. Logistic regression modeling was also examined to provide robust performance criteria. Overall, the best results were obtained using Partial Directed Coherence (PDC) in the Gamma (31.5-60.5 Hz) sub-band of short EEG segments. In particular, fear and anger were classified with an accuracy of 91.79%, supporting our hypothesis. Anger was characterized by increased transitivity, decreased local efficiency, and lower modularity in the Gamma band in comparison to fear. Local efficiency refers to functional brain segregation arising from the ability of the brain to exchange information locally. Transitivity refers to the overall probability that adjacent neural populations are interconnected, revealing the existence of tightly connected cortical regions. Modularity quantifies how well the brain can be partitioned into functional cortical regions. In conclusion, PDC-based graph theoretical analysis of short EEG epochs is proposed as providing robust emotional indicators sensitive to the perception of affective sounds.
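As an illustrative sketch (not the study's code), the transitivity measure discussed above can be computed from a binary undirected adjacency matrix via the standard triangle-counting identity; the example graph here is hypothetical:

```python
import numpy as np

def transitivity(adj: np.ndarray) -> float:
    """Global transitivity of an undirected binary graph:
    3 * (number of triangles) / (number of connected triples)."""
    a = (adj > 0).astype(float)
    np.fill_diagonal(a, 0.0)
    a3 = np.linalg.matrix_power(a, 3)
    triangles = np.trace(a3) / 6.0           # each triangle counted 6 times
    deg = a.sum(axis=1)
    triples = (deg * (deg - 1)).sum() / 2.0  # open/closed paths of length 2
    return 0.0 if triples == 0 else 3.0 * triangles / triples

# Fully connected triad: every triple closes into a triangle
tri = np.ones((3, 3)) - np.eye(3)
print(transitivity(tri))  # → 1.0
```

In a connectivity study, `adj` would be a thresholded channel-by-channel coupling matrix rather than this toy example.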
Affiliation(s)
- Serap Aydın
- Department of Biophysics, Faculty of Medicine, Hacettepe University, Sıhhiye, Ankara, Turkey
- Lara Onbaşı
- School of Medicine, Hacettepe University, Sıhhiye, Ankara, Turkey
9. Ahmed MAO, Satar YA, Darwish EM, Zanaty EA. Synergistic integration of Multi-View Brain Networks and advanced machine learning techniques for auditory disorders diagnostics. Brain Inform 2024; 11:3. [PMID: 38219249 PMCID: PMC10788326 DOI: 10.1186/s40708-023-00214-7]
Abstract
In the field of audiology, achieving accurate discrimination of auditory impairments remains a formidable challenge. Conditions such as deafness and tinnitus exert a substantial impact on patients' overall quality of life, emphasizing the urgent need for precise and efficient classification methods. This study introduces an innovative approach, utilizing Multi-View Brain Network data acquired from three distinct cohorts: 51 deaf patients, 54 with tinnitus, and 42 normal controls. Electroencephalogram (EEG) recordings were meticulously collected, focusing on 70 electrodes attached to an end-to-end key with 10 regions of interest (ROIs). These data are synergistically integrated with machine learning algorithms. To tackle the inherently high-dimensional nature of brain connectivity data, principal component analysis (PCA) is employed for feature reduction, enhancing interpretability. The proposed approach is evaluated using ensemble learning techniques, including Random Forest, Extra Trees, Gradient Boosting, and CatBoost. Performance is scrutinized across a comprehensive set of metrics, encompassing cross-validation accuracy (CVA), precision, recall, F1-score, Kappa, and Matthews correlation coefficient (MCC). The proposed models demonstrate statistical significance and effectively diagnose auditory disorders, contributing to early detection and personalized treatment and thereby enhancing patient outcomes and quality of life. Notably, they exhibit reliability and robustness, characterized by high Kappa and MCC values. This research represents a significant advancement at the intersection of audiology, neuroimaging, and machine learning, with transformative implications for clinical practice and care.
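A minimal sketch of the PCA feature-reduction step described above, using synthetic data in place of the study's connectivity features (the subject and feature counts here are illustrative, not the paper's):

```python
import numpy as np

def pca_reduce(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)                      # center each feature
    # SVD of centered data; rows of Vt are principal axes, variance-ordered
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(147, 300))   # e.g. 147 subjects x 300 connectivity features
Z = pca_reduce(X, 10)
print(Z.shape)  # → (147, 10)
```

The reduced matrix `Z` would then feed the ensemble classifiers (Random Forest, CatBoost, etc.) mentioned in the abstract.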
Affiliation(s)
- Muhammad Atta Othman Ahmed
- Department of Computer Science, Faculty of Computers and Information, Luxor University, 85951, Luxor, Egypt.
- Yasser Abdel Satar
- Mathematics Department, Faculty of Science, Sohag University, 82511, Sohag, Egypt
- Eed M Darwish
- Physics Department, College of Science, Taibah University, Medina, 41411, Saudi Arabia
- Physics Department, Faculty of Science, Sohag University, 82524, Sohag, Egypt
- Elnomery A Zanaty
- Department of Computer Science, Faculty of Computers and Artificial Intelligence, Sohag University, 82511, Sohag, Egypt
10. Cheng S, Chen X, Zhang Y, Wang Y, Li X, Li X, Xie P. Multiscale information interaction at local frequency band in functional corticomuscular coupling. Cogn Neurodyn 2023; 17:1575-1589. [PMID: 37974587 PMCID: PMC10640559 DOI: 10.1007/s11571-022-09895-y]
Abstract
The multiscale information interaction between the cortex and the corresponding muscles is of great significance for understanding functional corticomuscular coupling (FCMC) in sensorimotor systems. Though the multiscale transfer entropy (MSTE) method can effectively detect the multiscale characteristics shared between two signals, it falls short in describing local frequency-band characteristics. Therefore, to quantify the multiscale interaction at local frequency bands between the cortex and the muscles, we proposed a novel method, named bivariate empirical mode decomposition-MSTE (BMSTE), combining bivariate empirical mode decomposition (BEMD) with MSTE. To verify this, we introduced two simulation models and then applied the method to explore FCMC by analyzing EEG over the brain scalp and surface EMG signals from the effector muscles during steady-state force output. The simulation results showed that the BMSTE method could describe multiscale time-frequency characteristics, unlike the MSTE method, and was sensitive to coupling strength but not to data length. The experimental results showed that coupling at the beta1 (15-25 Hz), beta2 (25-35 Hz), and gamma (35-60 Hz) bands in the descending direction was higher than in the opposite direction, and that coupling at the beta2 band was higher than at the beta1 band. Furthermore, there were significant differences at low scales in the beta1 band, almost all scales in the beta2 band, and high scales in the gamma band. These results suggest the effectiveness of the BMSTE method in describing the interaction between two signals at different time-frequency scales, and further provide a novel approach to understanding motor control. Supplementary Information: The online version contains supplementary material available at 10.1007/s11571-022-09895-y.
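Multiscale analyses of the kind described above typically rest on coarse-graining a signal at successive scales before computing an information measure; a minimal sketch of that first step (not the authors' implementation, which additionally involves BEMD and transfer entropy):

```python
import numpy as np

def coarse_grain(x: np.ndarray, scale: int) -> np.ndarray:
    """Non-overlapping window averaging used to build a multiscale series."""
    n = len(x) // scale                       # number of complete windows
    return x[: n * scale].reshape(n, scale).mean(axis=1)

x = np.arange(10, dtype=float)
print(coarse_grain(x, 2))  # → [0.5 2.5 4.5 6.5 8.5]
```

At scale 1 the series is unchanged; at larger scales it reflects progressively slower dynamics, which is what allows scale-dependent coupling differences like those reported for the beta and gamma bands.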
Affiliation(s)
- Shengcui Cheng
- Key Laboratory of Measurement Technology and Instrumentation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
- Xiaoling Chen
- Key Laboratory of Measurement Technology and Instrumentation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
- Key Laboratory of Intelligent Rehabilitation and Neuromodulation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
- Yuanyuan Zhang
- Key Laboratory of Measurement Technology and Instrumentation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
- Ying Wang
- Key Laboratory of Measurement Technology and Instrumentation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
- Xin Li
- Key Laboratory of Measurement Technology and Instrumentation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
- Xiaoli Li
- National Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Ping Xie
- Key Laboratory of Measurement Technology and Instrumentation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
- Key Laboratory of Intelligent Rehabilitation and Neuromodulation of Hebei Province, Institute of Electric Engineering, Yanshan University, Qinhuangdao, Hebei China
11. Williams TF, Cohen AS, Sanchez-Lopez A, Joormann J, Mittal VA. Attentional biases in facial emotion processing in individuals at clinical high risk for psychosis. Eur Arch Psychiatry Clin Neurosci 2023; 273:1825-1835. [PMID: 36920535 PMCID: PMC10502185 DOI: 10.1007/s00406-023-01582-1]
Abstract
Individuals at clinical high risk (CHR) for psychosis exhibit altered facial emotion processing (FEP) and poor social functioning. It is unclear whether FEP deficits result from attentional biases, and further, how these abnormalities are linked to symptomatology (e.g., negative symptoms) and highly comorbid disorders that are also tied to abnormal FEP (e.g., depression). In the present study, we employed an eye-tracking paradigm to assess attentional biases and clinical interviews to examine differences between CHR (N = 34) individuals and healthy controls (HC; N = 46), as well as how such biases relate to symptoms and functioning in CHR individuals. Although no CHR-HC differences emerged in attentional biases, within the CHR group, symptoms and functioning were related to biases. Depressive symptoms were related to some free-view attention switching biases (e.g., to and from fearful faces, r = .50). Negative symptoms were related to more slowly disengaging from happy faces (r = .44), spending less time looking at neutral faces (r = - .42), and more time looking at no face (Avolition, r = .44). In addition, global social functioning was related to processes that overlapped with both depression and negative symptoms, including time looking at no face (r = - .68) and free-view attention switching with fearful faces (r = - .40). These findings are consistent with previous research, indicating that negative symptoms play a prominent role in the CHR syndrome, with distinct mechanisms relative to depression. Furthermore, the results suggest that attentional bias indices from eye-tracking paradigms may be predictive of social functioning.
Affiliation(s)
- Trevor F Williams: Department of Psychology, Northwestern University, Evanston, IL, 60208, USA
- Alex S Cohen: Department of Psychology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Alvaro Sanchez-Lopez: Department of Clinical Psychology, Complutense University of Madrid, Madrid, 28223, Spain
- Jutta Joormann: Department of Psychology, Yale University, New Haven, CT, 06520, USA
- Vijay A Mittal: Department of Psychology, Northwestern University, Evanston, IL, 60208, USA
12
Bai Z, Hou F, Sun K, Wu Q, Zhu M, Mao Z, Song Y, Gao Q. SECT: A Method of Shifted EEG Channel Transformer for Emotion Recognition. IEEE J Biomed Health Inform 2023; 27:4758-4767. [PMID: 37540609 DOI: 10.1109/jbhi.2023.3301993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/06/2023]
Abstract
Recently, electroencephalography (EEG)-based emotion recognition has attracted attention in the field of human-computer interaction (HCI). However, most existing EEG emotion datasets consist primarily of data from subjects with normal hearing. To enhance diversity, this study collected EEG signals from 30 hearing-impaired subjects while they watched video clips displaying six different emotions (happiness, inspiration, neutral, anger, fear, and sadness). The frequency domain feature matrix of the EEG signals, comprising power spectral density (PSD) and differential entropy (DE), was up-sampled using cubic spline interpolation to capture the correlation among different channels. To select emotion representation information from both global and localized brain regions, a novel method called Shifted EEG Channel Transformer (SECT) was proposed. The SECT method consists of two layers: the first layer uses the traditional channel Transformer (CT) structure to process information from global brain regions, while the second layer acquires localized information from centrally symmetrical and reorganized brain regions via a shifted channel Transformer (S-CT). We conducted a subject-dependent experiment in which the accuracy of the PSD and DE features reached 82.51% and 84.76%, respectively, for six-class emotion classification. Moreover, subject-independent experiments were conducted on public datasets, yielding accuracies of 85.43% (3-class, SEED), 66.83% (2-class on valence, DEAP), and 65.31% (2-class on arousal, DEAP).
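The PSD and DE features named above are standard EEG descriptors; under the Gaussian assumption common in EEG emotion work, the DE of a band-filtered segment has a closed form, h = 0.5 ln(2πeσ²). A minimal sketch of both features (illustrative only, not the authors' code; the function names are ours):

```python
import numpy as np

def differential_entropy(segment):
    """DE of a band-limited EEG segment under the Gaussian assumption:
    h = 0.5 * ln(2 * pi * e * sigma^2)."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(segment))

def band_power(segment, fs, f_lo, f_hi):
    """Mean power spectral density in [f_lo, f_hi) Hz via a plain periodogram."""
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(segment)) ** 2 / (fs * len(segment))
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return psd[mask].mean()
```

Stacking these per channel and per frequency band yields the channel-by-band feature matrix the abstract describes; the cubic-spline up-sampling step could then be done with `scipy.interpolate`.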
13
Domke AK, Hempel M, Hartling C, Stippl A, Carstens L, Gruzman R, Herrera Melendez AL, Bajbouj M, Gärtner M, Grimm S. Functional connectivity changes between amygdala and prefrontal cortex after ECT are associated with improvement in distinct depressive symptoms. Eur Arch Psychiatry Clin Neurosci 2023; 273:1489-1499. [PMID: 36715751 PMCID: PMC10465635 DOI: 10.1007/s00406-023-01552-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/19/2022] [Accepted: 01/09/2023] [Indexed: 01/31/2023]
Abstract
Electroconvulsive therapy (ECT) is one of the most effective treatments for treatment-resistant depression. However, the underlying mechanisms of action are not yet fully understood. The investigation of depression-specific networks using resting-state fMRI and the relation to differential symptom improvement might be an innovative approach providing new insights into the underlying processes. In this naturalistic study, we investigated the relationship between changes in resting-state functional connectivity (rsFC) and symptom improvement after ECT in 21 patients with treatment-resistant depression. We investigated rsFC before and after ECT and focused our analyses on FC changes directly related to symptom reduction and on FC at baseline to identify neural targets that might predict individual clinical responses to ECT. Additional analyses were performed to identify the direct relationship between rsFC change and symptom dimensions such as sadness, negative thoughts, detachment, and neurovegetative symptoms. An increase in rsFC between the left amygdala and left dorsolateral prefrontal cortex (DLPFC) after ECT was related to overall symptom reduction (Bonferroni-corrected p = 0.033) as well as to a reduction in specific symptoms such as sadness (r = 0.524, uncorrected p = 0.014), negative thoughts (r = 0.700, Bonferroni-corrected p = 0.002) and detachment (r = 0.663, p = 0.004), but not in neurovegetative symptoms. Furthermore, high baseline rsFC between the left amygdala and the right frontal pole (FP) predicted treatment outcome (uncorrected p = 0.039). We conclude that changes in FC in regions of the limbic-prefrontal network are associated with symptom improvement, particularly in affective and cognitive dimensions. Frontal-limbic connectivity has the potential to predict symptom improvement after ECT. Further research combining functional imaging biomarkers and a symptom-based approach might be promising.
Affiliation(s)
- Ann-Kathrin Domke: Department of Psychiatry, Centre for Affective Neuroscience (CAN), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany
- Moritz Hempel: Department of Psychology, MSB Medical School Berlin, Rüdesheimer Straße 50, 14197, Berlin, Germany
- Corinna Hartling: Department of Psychiatry, Centre for Affective Neuroscience (CAN), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany
- Anna Stippl: Department of Psychiatry, Centre for Affective Neuroscience (CAN), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany
- Luisa Carstens: Department of Psychology, MSB Medical School Berlin, Rüdesheimer Straße 50, 14197, Berlin, Germany
- Rebecca Gruzman: Department of Psychology, MSB Medical School Berlin, Rüdesheimer Straße 50, 14197, Berlin, Germany
- Ana Lucia Herrera Melendez: Department of Psychiatry, Centre for Affective Neuroscience (CAN), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany
- Malek Bajbouj: Department of Psychiatry, Centre for Affective Neuroscience (CAN), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany
- Matti Gärtner: Department of Psychiatry, Centre for Affective Neuroscience (CAN), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany; Department of Psychology, MSB Medical School Berlin, Rüdesheimer Straße 50, 14197, Berlin, Germany
- Simone Grimm: Department of Psychiatry, Centre for Affective Neuroscience (CAN), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany; Department of Psychology, MSB Medical School Berlin, Rüdesheimer Straße 50, 14197, Berlin, Germany; Department of Psychiatry, Psychotherapy and Psychosomatics, Psychiatric Hospital, University of Zurich, Lenggstrasse 31, 8032, Zurich, Switzerland
14
Schiano di Cola V, Chiaro D, Prezioso E, Izzo S, Giampaolo F. Insight Extraction From E-Health Bookings by Means of Hypergraph and Machine Learning. IEEE J Biomed Health Inform 2023; 27:4649-4659. [PMID: 37018305 DOI: 10.1109/jbhi.2022.3233498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
New technologies are transforming medicine, and this revolution starts with data. Health services within public healthcare systems are usually accessed through a booking centre managed by local health authorities and controlled by the regional government. From this perspective, structuring e-health data with a Knowledge Graph (KG) approach offers a feasible way to quickly and simply organize data and retrieve new information. Starting from raw health booking data from the public healthcare system in Italy, a KG method is presented to support e-health services through the extraction of medical knowledge and novel insights. By exploiting graph embedding, which arranges the various attributes of the entities into the same vector space, we are able to apply Machine Learning (ML) techniques to the embedded vectors. The findings suggest that KGs can be used to assess patients' medical booking patterns with either unsupervised or supervised ML. In particular, the former can reveal the possible presence of hidden groups of entities that are not immediately apparent in the original legacy dataset structure. The latter, although the performance of the algorithms used is not very high, shows encouraging results in predicting a patient's likelihood of undergoing a particular medical visit within a year. However, many technological advances remain to be made, especially in graph database technologies and graph embedding algorithms.
15
Quayson E, Ganaa ED, Zhu Q, Shen XJ. Multi-view Representation Induced Kernel Ensemble Support Vector Machine. Neural Process Lett 2023. [DOI: 10.1007/s11063-023-11250-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/05/2023]
16
Li Z, Zhang G, Wang L, Wei J, Dang J. Emotion recognition using spatial-temporal EEG features through convolutional graph attention network. J Neural Eng 2023; 20. [PMID: 36720164 DOI: 10.1088/1741-2552/acb79e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2022] [Accepted: 01/31/2023] [Indexed: 02/02/2023]
Abstract
Objective. Constructing an efficient human emotion recognition model based on electroencephalogram (EEG) signals is significant for realizing emotional brain-computer interaction and improving machine intelligence. Approach. In this paper, we present a spatial-temporal feature fused convolutional graph attention network (STFCGAT) model based on multi-channel EEG signals for human emotion recognition. First, we combined the single-channel differential entropy (DE) feature with the cross-channel functional connectivity (FC) feature to extract both the temporal variation and spatial topological information of EEG. After that, a novel convolutional graph attention network was used to fuse the DE and FC features and further extract higher-level graph structural information with sufficient expressive power for emotion recognition. Furthermore, we introduced a multi-headed attention mechanism in graph neural networks to improve the generalization ability of the model. Main results. We evaluated the emotion recognition performance of our proposed model on the public SEED and DEAP datasets. It achieved classification accuracies of 99.11% ± 0.83% and 94.83% ± 3.41% in the subject-dependent and subject-independent experiments on the SEED dataset, and accuracies of 91.19% ± 1.24% and 92.03% ± 4.57% for discrimination of arousal and valence in subject-independent experiments on the DEAP dataset. Notably, our model achieved state-of-the-art performance on cross-subject emotion recognition tasks for both datasets. In addition, we gained insight into the proposed framework through both the ablation experiments and the analysis of the spatial patterns of the FC and DE features. Significance. All these results prove the effectiveness of the STFCGAT architecture for emotion recognition and indicate that there are significant differences in the spatial-temporal characteristics of the brain under different emotional states.
Affiliation(s)
- Zhongjie Li, Gaoyan Zhang, Longbiao Wang, Jianguo Wei, Jianwu Dang: Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin 300350, People's Republic of China
17
Peketi S, Dhok SB. Machine Learning Enabled P300 Classifier for Autism Spectrum Disorder Using Adaptive Signal Decomposition. Brain Sci 2023; 13:brainsci13020315. [PMID: 36831857 PMCID: PMC9954262 DOI: 10.3390/brainsci13020315] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2023] [Revised: 02/06/2023] [Accepted: 02/08/2023] [Indexed: 02/16/2023] Open
Abstract
A deficiency in joint attention skills in autism spectrum disorder (ASD) hinders individuals from communicating effectively. The P300 electroencephalogram (EEG) signal-based brain-computer interface (BCI) helps these individuals in neurorehabilitation training to overcome this deficiency. Detection of the P300 signal is more challenging in ASD because it is noisier, lower in amplitude, and higher in latency than in other individuals. This paper presents a novel application of the variational mode decomposition (VMD) technique in a BCI system involving ASD subjects for P300 signal identification. The EEG signal is decomposed into five modes using VMD, and thirty linear and non-linear time- and frequency-domain features are extracted for each mode. Synthetic minority oversampling technique (SMOTE) data augmentation is performed to overcome the class imbalance in the chosen dataset. A comparative analysis of three popular machine learning classifiers is then performed. VMD's fifth mode with a support vector machine (fine Gaussian kernel) classifier gave the best performance parameters, namely accuracy, F1-score, and area under the curve, of 91.12%, 91.18%, and 96.6%, respectively. These results compare favourably with other state-of-the-art methods.
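The class-imbalance step above uses SMOTE, whose core idea is to synthesize minority-class samples by interpolating between a minority sample and one of its minority-class nearest neighbours. A minimal numpy sketch of that step (a hypothetical illustration, not the paper's implementation; parameter names are ours):

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples: each is a random point
    on the segment between a minority sample and one of its k nearest
    minority-class neighbours (Euclidean distance)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    out = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(n)                      # pick a minority sample
        d = np.linalg.norm(X_min - X_min[j], axis=1)
        nbrs = np.argsort(d)[1:k + 1]            # skip the sample itself
        nb = X_min[rng.choice(nbrs)]
        out[i] = X_min[j] + rng.random() * (nb - X_min[j])
    return out
```

Because each synthetic point is a convex combination of two real minority samples, the augmented set never leaves the minority class's feature ranges.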
18
Srinivasulu A, Sriraam N, Prakash VS. A Signal Processing Framework for the Detection of Abnormal Cardiac Episodes. Cardiovasc Eng Technol 2023; 14:331-349. [PMID: 36750523 DOI: 10.1007/s13239-023-00656-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/26/2022] [Accepted: 01/24/2023] [Indexed: 02/09/2023]
Abstract
MOTIVATION Cardiologists generally rely on long-duration Holter electrocardiogram (ECG) recordings to assess abnormal cardiac episodes, a process that is tedious and time consuming. An automatic abnormal cardiac episode detection algorithm, optimized to reduce this manual burden, is therefore needed. OBJECTIVE The current study presents a signal processing framework with a cross-database design to detect abnormal episodes in long-term ECG signals. METHODOLOGY The data were pre-processed to remove power line interference and baseline drift using basis pursuit sparsely decomposed tunable-Q wavelet transform (BPSD-TQWT). A total of 44 time domain, frequency domain, and time-frequency domain features were extracted from the ECG signal. Classification performance was tested with support vector machine (SVM), K-nearest neighbour (KNN), decision tree, naïve Bayes, nearest mean, and nearest root mean square classifiers. Models trained on open-source data were used to predict abnormal episodes in the proprietary database and vice versa. Finally, performance was analysed via recall rate, specificity, precision, F1-score, and accuracy. RESULTS Among the six classification models, SVM performed best. Trained on the open-source database, the SVM model achieved 95.01% accuracy and detected abnormal episodes in the proprietary database with an accuracy of 99.31%. Trained on the proprietary database, the SVM model classified normal and abnormal cardiac episodes with an accuracy of 99.89% and detected abnormal episodes in the open-source database with an accuracy of 92.51%. CONCLUSION Compared with results in the literature, the proposed framework performed well. As a result, it could be used in an autonomous diagnosis system.
Affiliation(s)
- Avvaru Srinivasulu: Department of Electrical, Electronics and Communication Engineering, GITAM, Bangalore Campus, Bengaluru, India
- N Sriraam: Center for Medical Electronics and Computing, M.S. Ramaiah Institute of Technology, Bengaluru, India
- V S Prakash: Department of Cardiology, M.S. Ramaiah Medical College and Hospitals, Bengaluru, India
19
Wang H, Zheng X, Hao T, Yu Y, Xu K, Wang Y. Research on mental load state recognition based on combined information sources. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
20
Parallel genetic algorithm based common spatial patterns selection on time–frequency decomposed EEG signals for motor imagery brain-computer interface. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
21
Su F, Wei M, Sun M, Jiang L, Dong Z, Wang J, Zhang C. Deep learning-based synapse counting and synaptic ultrastructure analysis of electron microscopy images. J Neurosci Methods 2023; 384:109750. [PMID: 36414102 DOI: 10.1016/j.jneumeth.2022.109750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Revised: 11/11/2022] [Accepted: 11/18/2022] [Indexed: 11/21/2022]
Abstract
BACKGROUND Synapses are the connections between neurons in the central nervous system (CNS) or between neurons and other excitable cells in the peripheral nervous system (PNS), where electrical or chemical signals rapidly travel through one cell to another with high spatial precision. Synaptic analysis, based on synapse numbers and fine morphology, is the basis for understanding neurological functions and diseases. Manual analysis of synaptic structures in electron microscopy (EM) images is often limited by low efficiency and subjective bias. NEW METHOD We developed a multifunctional synaptic analysis system based on several advanced deep learning (DL) models. The system achieved synapse counting in low-magnification EM images and synaptic ultrastructure analysis in high-magnification EM images. RESULTS The synapse counting system based on ResNet18 and a Faster R-CNN model had a mean average precision (mAP) of 92.55%. For synaptic ultrastructure analysis, the Faster R-CNN model based on ResNet50 achieved a mAP of 91.60%, the DeepLab v3+ model based on ResNet50 enabled high performance in presynaptic and postsynaptic membrane segmentation with a global accuracy of 0.9811, and the Faster R-CNN model based on ResNet18 achieved a mAP of 91.41% for synaptic vesicle detection. CONCLUSIONS The proposed multifunctional synaptic analysis system may help to overcome the experimental bias inherent in manual analysis, thereby facilitating EM image-based synaptic function studies.
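The mAP figures above rest on intersection-over-union (IoU) matching between predicted and ground-truth boxes: a detection counts as correct only if its IoU with a ground-truth box exceeds a threshold. A minimal IoU sketch (ours, for illustration, not the paper's code):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0
```

mAP then averages precision over recall levels (and over classes) using these matches at a chosen IoU threshold, commonly 0.5.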
Affiliation(s)
- Feng Su: Department of Neurobiology, School of Basic Medical Sciences, Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing 100069, China; Chinese Institute for Brain Research, Beijing 102206, China; State Key Laboratory of Translational Medicine and Innovative Drug Development, Nanjing 210000, Jiangsu, China; Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Mengping Wei: Department of Neurobiology, School of Basic Medical Sciences, Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing 100069, China
- Meng Sun: Department of Neurobiology, School of Basic Medical Sciences, Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing 100069, China
- Lixin Jiang: Peking University Institute of Mental Health (Sixth Hospital), No. 51 Huayuanbei Road, Haidian District, Beijing 100191, China
- Zhaoqi Dong: Department of Neurobiology, School of Basic Medical Sciences, Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing 100069, China
- Jue Wang: Department of Neurobiology, School of Basic Medical Sciences, Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing 100069, China
- Chen Zhang: Department of Neurobiology, School of Basic Medical Sciences, Beijing Key Laboratory of Neural Regeneration and Repair, Capital Medical University, Beijing 100069, China; Chinese Institute for Brain Research, Beijing 102206, China; State Key Laboratory of Translational Medicine and Innovative Drug Development, Nanjing 210000, Jiangsu, China
22
Sun C, Li H, Ma L. Speech emotion recognition based on improved masking EMD and convolutional recurrent neural network. Front Psychol 2023; 13:1075624. [PMID: 36698559 PMCID: PMC9869168 DOI: 10.3389/fpsyg.2022.1075624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2022] [Accepted: 12/16/2022] [Indexed: 01/12/2023] Open
Abstract
Speech emotion recognition (SER) is the key to human-computer emotion interaction. However, the nonlinear characteristics of speech emotion are variable, complex, and subtly changing, so accurate recognition of emotions from speech remains a challenge. Empirical mode decomposition (EMD), an effective decomposition method for nonlinear non-stationary signals, has been successfully used to analyze emotional speech signals, but its mode mixing problem degrades the performance of EMD-based methods for SER. Various improved versions of EMD have been proposed to alleviate mode mixing, yet they still suffer from mode mixing, residual noise, and long computation times, and their main parameters cannot be set adaptively. To overcome these problems, we propose a novel SER framework, named IMEMD-CRNN, based on the combination of an improved version of the masking signal-based EMD (IMEMD) and a convolutional recurrent neural network (CRNN). First, IMEMD is proposed to decompose the speech. IMEMD is a novel disturbance-assisted EMD method that can adapt the parameters of the masking signals to the nature of the signals. Second, we extract 43-dimensional time-frequency features that characterize the emotion from the intrinsic mode functions (IMFs) obtained by IMEMD. Finally, we input these features into a CRNN network to recognize emotions. In the CRNN, 2D convolutional neural network (CNN) layers capture nonlinear local temporal and frequency information of the emotional speech, and bidirectional gated recurrent unit (BiGRU) layers further learn the temporal context. Experiments on the publicly available TESS and Emo-DB datasets demonstrate the effectiveness of our proposed IMEMD-CRNN framework. The TESS dataset consists of 2,800 utterances covering seven emotions recorded by two native English speakers; the Emo-DB dataset consists of 535 utterances covering seven emotions recorded by ten native German speakers. The proposed IMEMD-CRNN framework achieves a state-of-the-art overall accuracy of 100% on the TESS dataset over seven emotions and 93.54% on the Emo-DB dataset over seven emotions. IMEMD alleviates mode mixing and obtains IMFs with less noise and more physical meaning, with significantly improved efficiency. Our IMEMD-CRNN framework significantly improves the performance of emotion recognition.
23
Wu M, Teng W, Fan C, Pei S, Li P, Lv Z. An Investigation of Olfactory-Enhanced Video on EEG-Based Emotion Recognition. IEEE Trans Neural Syst Rehabil Eng 2023; 31:1602-1613. [PMID: 37028354 DOI: 10.1109/tnsre.2023.3253866] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/18/2023]
Abstract
Collecting emotional physiological signals is important for building affective Human-Computer Interaction (HCI). However, efficiently evoking subjects' emotions in EEG-based emotional experiments remains a challenge. In this work, we developed a novel experimental paradigm that allows odors to participate dynamically in different stages of video-evoked emotions, in order to investigate the efficiency of olfactory-enhanced videos in inducing subjects' emotions. According to the period in which the odors participated, the stimuli were divided into four patterns: olfactory-enhanced video in the early/later stimulus period (OVEP/OVLP) and traditional video in the early/later stimulus period (TVEP/TVLP). The differential entropy (DE) feature and four classifiers were employed to test the efficiency of emotion recognition. The best average accuracies of the OVEP, OVLP, TVEP, and TVLP were 50.54%, 51.49%, 40.22%, and 57.55%, respectively. The experimental results indicated that the OVEP significantly outperformed the TVEP in classification performance, while there was no significant difference between the OVLP and TVLP. Besides, olfactory-enhanced videos were more efficient than traditional videos in evoking negative emotions. Moreover, we found that the neural patterns in response to emotions were stable across stimulus methods, and that for Fp1, Fp2, and F7 there were significant differences depending on whether odors were presented.
Affiliation(s)
- Minchao Wu, Wei Teng, Cunhang Fan, Shengbing Pei, Ping Li, Zhao Lv: Anhui Province Key Laboratory of Multimodal Cognitive Computation and the School of Computer Science and Technology, Anhui University, Hefei, China
24
Jun S, Joo Y, Sim Y, Pyo C, Ham K. Fronto-parietal single-trial brain connectivity benefits successful memory recognition. Transl Neurosci 2022; 13:506-513. [PMID: 36660006 PMCID: PMC9816457 DOI: 10.1515/tnsci-2022-0265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Revised: 11/11/2022] [Accepted: 11/22/2022] [Indexed: 01/04/2023] Open
Abstract
Successful recognition has been known to produce distinct patterns of neural activity. Many studies have used spectral power or event-related potentials of single recognition-specific regions as classification features. However, this does not accurately reflect the mechanisms behind recognition, in that recognition requires multiple brain regions to work together. Hence, classification accuracy of subsequent memory performance could be improved by using functional connectivity within memory-related brain networks, rather than local brain activity, as classifiers. In this study, we examined electroencephalography (EEG) signals recorded with a 32-channel cap while participants performed a word recognition memory task. Connectivity measures related to left-hemispheric fronto-parietal connectivity (P3 and F3) were found to contribute to the accurate recognition of previously studied memory items. Classification of subsequent memory outcome using connectivity features revealed that a support vector machine classifier achieved the highest classification accuracy, 86.79 ± 5.93% (mean ± standard deviation), using theta-band (3-8 Hz) connectivity during successful recognition trials. The results strongly suggest that highly accurate classification of subsequent memory outcome can be achieved by using single-trial functional connectivity.
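The abstract does not name the exact connectivity metric, so as an illustration only: one widely used single-trial phase-coupling measure for band-filtered EEG channel pairs is the phase-locking value (PLV), computed from each channel's analytic signal. A sketch under that assumption (function names are ours):

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal (discrete Hilbert transform)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def plv(x, y):
    """Phase-locking value between two equally long channel segments:
    1 for a constant phase difference, near 0 for unrelated phases."""
    dphi = np.angle(analytic_signal(x)) - np.angle(analytic_signal(y))
    return np.abs(np.mean(np.exp(1j * dphi)))
```

Applied to theta-band-filtered P3 and F3 segments trial by trial, such a measure yields one connectivity feature per trial for the classifier.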
Affiliation(s)
- Soyeon Jun: Neuroscience Research Institute, Seoul National University College of Medicine, Seoul, South Korea
- Yihyun Joo, Youjin Sim, Chuyun Pyo, Keunsoo Ham: National Forensic Services, Forensic Medical Examination Division, 10, Ipchun-ro, Wonju-si, Gangwon-do, 26460, South Korea
25
Zhao X, Chen J, Chen T, Wang S, Liu Y, Zeng X, Liu G. Responses of functional brain networks in micro-expressions: An EEG study. Front Psychol 2022; 13:996905. [DOI: 10.3389/fpsyg.2022.996905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 10/04/2022] [Indexed: 11/13/2022] Open
Abstract
Micro-expressions (MEs) can reflect an individual's subjective emotions and true mental state, and they are widely used in the fields of mental health, justice, law enforcement, intelligence, and security. However, one of the major challenges of working with MEs is that their neural mechanism is not entirely understood. To the best of our knowledge, the present study is the first to use electroencephalography (EEG) to investigate the reorganization of functional brain networks involved in MEs. We aimed to reveal the underlying neural mechanisms that can provide electrophysiological indicators for ME recognition. A real-time supervision and emotional expression suppression experimental paradigm was designed to collect video and EEG data from 70 participants expressing positive emotions with MEs and with no expressions (NEs). Based on graph theory, we analyzed the efficiency of the functional brain networks at the scalp level on both macro and micro scales. The results revealed that during MEs, compared with NEs, participants exhibited higher global efficiency and higher nodal efficiency in the frontal, occipital, and temporal regions. Additionally, using the random forest algorithm to select a subset of functional connectivity features as input, a support vector machine classifier achieved a classification accuracy for MEs and NEs of 0.81, with an area under the curve of 0.85. This finding demonstrates the possibility of using EEG to recognize MEs, with application scenarios ranging from persons wearing face masks to patients with expression disorders.
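The global efficiency measure used above is defined in graph theory as the mean of inverse shortest-path lengths over all ordered node pairs. The study works with weighted functional connectivity graphs, so the following unweighted BFS version is a simplified sketch of the idea only:

```python
from collections import deque

def global_efficiency(adj):
    """Global efficiency of an unweighted graph given as an adjacency
    map {node: set(neighbours)}: mean of 1/shortest_path_length over
    all ordered node pairs (disconnected pairs contribute 0)."""
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for s in nodes:
        dist = {s: 0}
        q = deque([s])
        while q:                      # breadth-first search from s
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1)) if n > 1 else 0.0
```

Nodal efficiency restricts the same average to the pairs involving a single node, which is how region-level differences (frontal, occipital, temporal) can be reported.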
|
26
|
Li M, Liu Y, Liu Y, Pu C, Yin R, Zeng Z, Deng L, Wang X. Resting-state EEG-based convolutional neural network for the diagnosis of depression and its severity. Front Physiol 2022; 13:956254. [PMID: 36299253 PMCID: PMC9589234 DOI: 10.3389/fphys.2022.956254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
Purpose: This study aimed to assess the value of a resting-state electroencephalogram (EEG)-based convolutional neural network (CNN) for diagnosing depression and grading its severity, in order to better serve depressed patients and at-risk populations. Methods: We used a resting-state EEG-based CNN to identify depression and evaluate its severity. EEG data were collected from depressed patients and healthy controls using the Nihon Kohden EEG-1200 system. Resting-state EEG data were processed with Python and MATLAB. The questionnaire battery included the Self-Rating Anxiety Scale (SAS), Self-Rating Depression Scale (SDS), Symptom Check-List-90 (SCL-90), and the Eysenck Personality Questionnaire (EPQ). Results: A total of 82 subjects were included, 41 in the depression group and 41 in the healthy control group. The area under the curve (AUC) of the CNN for depression diagnosis was 0.74 (95% CI: 0.70–0.77) with an accuracy of 66.40%. Within the depression group, SDS, SAS, SCL-90 subscale, and N scores were significantly higher in the major depression subgroup than in the non-major depression subgroup (p < 0.05). The AUC of the model for grading depression severity was 0.70 (95% CI: 0.65–0.75) with an accuracy of 66.93%. Correlation analysis revealed that major depression AI scores were significantly correlated with SAS scores (r = 0.508, p = 0.003) and SDS scores (r = 0.765, p < 0.001). Conclusion: Our model can identify the depression-specific EEG signal for both diagnosis and severity grading, and could eventually provide new strategies for the early diagnosis of depression and its severity.
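The AUC figures reported above have a simple rank-based interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. A minimal sketch (illustrative only, not the authors' evaluation code):

```python
def auc(scores, labels):
    """AUC as the normalized Mann-Whitney U statistic: the fraction of
    positive/negative pairs in which the positive scores higher
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfectly separated scores give AUC = 1.0; an AUC of 0.74, as reported
# here, means a positive case outranks a negative one 74% of the time.
```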
Affiliation(s)
- Mengqian Li
- Department of Psychosomatic Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Yuan Liu
- Department of Psychosomatic Medicine, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Yan Liu
- Second Clinical Medical College, Nanchang University, Nanchang, China
- Changqin Pu
- Queen Mary College, Nanchang University, Nanchang, China
- Ruocheng Yin
- Queen Mary College, Nanchang University, Nanchang, China
- Ziqiang Zeng
- Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- School of Public Health, Nanchang University, Nanchang, China
- Libin Deng
- Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- *Correspondence: Libin Deng; Xing Wang
- Xing Wang
- School of Life Sciences, Nanchang University, Nanchang, China
- Clinical Medical Experiment Center, Nanchang University, Nanchang, China
- *Correspondence: Libin Deng; Xing Wang
|
27
|
Huxley muscle model surrogates for high-speed multi-scale simulations of cardiac contraction. Comput Biol Med 2022; 149:105963. [DOI: 10.1016/j.compbiomed.2022.105963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2022] [Revised: 07/17/2022] [Accepted: 08/13/2022] [Indexed: 11/19/2022]
|
28
|
Zuo X, Zhang C, Hämäläinen T, Gao H, Fu Y, Cong F. Cross-Subject Emotion Recognition Using Fused Entropy Features of EEG. ENTROPY (BASEL, SWITZERLAND) 2022; 24:1281. [PMID: 36141167 PMCID: PMC9497745 DOI: 10.3390/e24091281] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Revised: 09/04/2022] [Accepted: 09/05/2022] [Indexed: 06/16/2023]
Abstract
Emotion recognition based on electroencephalography (EEG) has attracted strong interest in fields such as health care, user-experience evaluation, and human-computer interaction (HCI), as emotion plays an important role in daily life. Although various approaches have been proposed to detect emotion states, the dynamic changes of EEG across emotions still need further study for accurate detection. Entropy-based features have proven effective for mining complexity information in EEG in many areas; however, different entropy features vary in how well they reveal the implicit information in EEG. To improve system reliability, this paper proposes a framework for EEG-based cross-subject emotion recognition using fused entropy features and a Bidirectional Long Short-Term Memory (BiLSTM) network. Features including approximate entropy (AE), fuzzy entropy (FE), Rényi entropy (RE), differential entropy (DE), and multi-scale entropy (MSE) are first calculated to capture dynamic emotional information. A BiLSTM classifier is then trained on these entropy features to identify different emotions. Our results show that MSE is more effective than the other single-entropy features in recognizing emotions, and that fusing the entropy features further improves BiLSTM performance to an accuracy of 70.05%, compared with any single feature type.
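Multi-scale entropy, the strongest single feature in this study, is sample entropy computed on progressively coarse-grained copies of the signal. A compact sketch (illustrative only, with textbook parameter defaults m = 2 and tolerance r = 0.2; the authors' exact settings are not given here):

```python
import math

def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    return [sum(x[i:i + scale]) / scale
            for i in range(0, len(x) - scale + 1, scale)]

def sample_entropy(x, m=2, r=0.2):
    """SampEn = -ln(A/B): B counts pairs of length-m templates that match
    within tolerance r, A counts matches of length m+1."""
    def count_matches(length):
        templates = [x[i:i + length] for i in range(len(x) - length)]
        hits = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    hits += 1
        return hits
    b, a = count_matches(m), count_matches(m + 1)
    return -math.log(a / b) if a > 0 and b > 0 else float("inf")

def multiscale_entropy(x, max_scale=3, m=2, r=0.2):
    """Sample entropy of the signal at each coarse-graining scale."""
    return [sample_entropy(coarse_grain(x, s), m, r)
            for s in range(1, max_scale + 1)]
```

Lower values indicate a more regular signal; sweeping the scale exposes complexity at different temporal granularities, which is what makes MSE informative for EEG.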
Affiliation(s)
- Xin Zuo
- School of Biomedical Engineering, Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, China
- Faculty of Information Technology, University of Jyväskylä, 40014 Jyväskylä, Finland
- Chi Zhang
- School of Biomedical Engineering, Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, China
- Liaoning Key Laboratory of Integrated Circuit and Biomedical Electronic System, Dalian 116024, China
- Timo Hämäläinen
- Faculty of Information Technology, University of Jyväskylä, 40014 Jyväskylä, Finland
- Hanbing Gao
- School of Biomedical Engineering, Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, China
- Yu Fu
- School of Biomedical Engineering, Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, China
- Fengyu Cong
- School of Biomedical Engineering, Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, China
- Faculty of Information Technology, University of Jyväskylä, 40014 Jyväskylä, Finland
|
29
|
Rachkovskij DA. Representation of spatial objects by shift-equivariant similarity-preserving hypervectors. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07619-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
|
30
|
Aly M, Alotaibi NS. A novel deep learning model to detect COVID-19 based on wavelet features extracted from Mel-scale spectrogram of patients' cough and breathing sounds. INFORMATICS IN MEDICINE UNLOCKED 2022; 32:101049. [PMID: 35989705 PMCID: PMC9375256 DOI: 10.1016/j.imu.2022.101049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2022] [Revised: 08/08/2022] [Accepted: 08/09/2022] [Indexed: 10/26/2022] Open
Abstract
The goal of this paper is to classify cough and breath sounds containing COVID-19 artefacts in signals from dynamic real-life environments. Cough and breath sounds were chosen over other common symptoms so that people can monitor themselves regularly from the comfort of their homes, neither overloading the healthcare system nor unwittingly spreading the disease. The presented model includes two main phases. The first phase is a sound-to-image transformation based on the Mel-scale spectrogram. The second phase extracts features and classifies them using nine deep transfer models (ResNet18/34/50/100/101, GoogLeNet, SqueezeNet, MobileNetv2, and NasNetmobile). The dataset contains data from almost 1600 people (1185 male and 415 female) from all over the world. The best configuration reaches an accuracy of 99.2% with the SGDM optimizer, which is high enough that a large set of labelled cough and breath data could be used to test generalization. The results show that ResNet18 is the most stable model for classifying cough and breath tones from a restricted dataset, with a sensitivity of 98.3% and a specificity of 97.8%, making the presented model more reliable and accurate than existing models. The accuracy on cough and breath sounds is promising enough to put extrapolation and generalization to the test.
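The Mel-scale spectrogram underlying the sound-to-image phase rests on a simple frequency warp. A sketch of the standard HTK-style conversion (illustrative; the paper's exact filter-bank settings are not stated here):

```python
import math

def hz_to_mel(f):
    """HTK-style mel scale: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse mapping: hz = 700 * (10 ** (m / 2595) - 1)."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Mel filter banks place band edges evenly on the mel axis, so the edges
# in Hz cluster at low frequencies, mimicking human hearing. Five edges
# spanning 0-8000 Hz:
lo, hi = hz_to_mel(0.0), hz_to_mel(8000.0)
edges_hz = [mel_to_hz(lo + i * (hi - lo) / 4) for i in range(5)]
```

Rendering the resulting mel spectrogram as an image is what lets off-the-shelf vision networks like ResNet18 be transferred to audio.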
Affiliation(s)
- Mohammed Aly
- Department of Artificial Intelligence, Faculty of Computers and Artificial Intelligence, Egyptian Russian University, Badr City, 11829, Cairo, Egypt
- Nouf Saeed Alotaibi
- Department of Computer Science, College of Science, Shaqra University, Shaqra City, 11961, Saudi Arabia
|
31
|
Giangiacomo E, Visaggi MC, Aceti F, Giacchetti N, Martucci M, Giovannone F, Valente D, Galeoto G, Tofani M, Sogos C. Early Neuro-Psychomotor Therapy Intervention for Theory of Mind and Emotion Recognition in Neurodevelopmental Disorders: A Pilot Study. CHILDREN (BASEL, SWITZERLAND) 2022; 9:children9081142. [PMID: 36010032 PMCID: PMC9406700 DOI: 10.3390/children9081142] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 07/10/2022] [Revised: 07/26/2022] [Accepted: 07/27/2022] [Indexed: 11/22/2022]
Abstract
The aim of the present study is to explore the effect of early neuro-psychomotor therapy on improving theory-of-mind skills and emotion recognition in children with neurodevelopmental disorders. A pilot study was set up, consisting of in-group training activities based on the neuro-psychomotor approach. Children were evaluated using the Neuropsychological Assessment for Children (NEPSY-II), the Test of Emotion Comprehension (TEC), and the Social Communication Questionnaire (SCQ). For data analysis, a one-sample Wilcoxon signed-rank test was used with a significance threshold of p < 0.05. Two children with a developmental language disorder and four children with autism spectrum disorder participated in a 3-month training program. Our findings revealed a significant improvement in emotion recognition, as measured with the NEPSY-II (p = 0.04), while no statistically significant improvement was found for theory of mind. Despite the limited sample, early neuro-psychomotor therapy appears to improve emotion recognition skills in children with neurodevelopmental disorders. However, considering the explorative nature of the study, the findings should be interpreted with caution.
|
32
|
Yang C, Wang X, Yao L, Long G, Jiang J, Xu G. Attentional Gated Res2Net for Multivariate Time Series Classification. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10944-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Multivariate time series classification is a critical problem in data mining with broad applications. It requires harnessing the inter-relationships of multiple variables and various ranges of temporal dependencies to assign the correct classification label to a time series. Multivariate time series may come from a wide range of sources and be used in various scenarios, which challenges the classifier's temporal representation learning. We propose a novel convolutional neural network architecture called Attentional Gated Res2Net for multivariate time series classification. Our model uses hierarchical residual-like connections to achieve multi-scale receptive fields and capture multi-granular temporal information. A gating mechanism enables the model to consider the relations between feature maps extracted by receptive fields of multiple sizes for information fusion. Further, we propose two types of attention modules, channel-wise attention and block-wise attention, to better leverage the multi-granular temporal patterns. Experimental results on 14 benchmark multivariate time-series datasets show that our model outperforms several baselines and state-of-the-art methods by a large margin: its classification accuracy is 10.16% better than that of the SOTA model. Besides, we demonstrate that our model improves the performance of existing models when used as a plugin. Finally, based on our experiments and analysis, we provide practical advice on applying our model to a new problem.
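The "hierarchical residual-like connections" can be pictured as splitting the feature channels into groups, where each group's transform also receives the previous group's output, widening the effective receptive field group by group. A toy sketch (a scalar scaling stands in for the learned convolutions; this illustrates only the connection pattern, not the published architecture):

```python
def hierarchical_residual(feature_groups, transform):
    """Res2Net-style connection pattern: group 0 passes through untouched;
    each later group is transformed together with the previous group's
    output, so group k has effectively seen k transforms."""
    outputs = [feature_groups[0]]
    prev = feature_groups[0]
    for g in feature_groups[1:]:
        prev = transform([a + b for a, b in zip(g, prev)])
        outputs.append(prev)
    return outputs

# Toy "convolution": scale by 0.5 (a placeholder for a learned 3x3 conv).
groups = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = hierarchical_residual(groups, lambda v: [0.5 * x for x in v])
```

The gating and attention modules described in the abstract then weight these per-group outputs before fusing them.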
|
33
|
Comparison and Analysis of Acoustic Features of Western and Chinese Classical Music Emotion Recognition Based on V-A Model. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12125787] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Music emotion recognition is becoming increasingly important in scientific research and practical applications. Because of the differences in musical characteristics between Western and Chinese classical music, the distinctions in emotional feature sets must be investigated to improve the accuracy of cross-cultural emotion recognition models. Therefore, a comparative study of emotion recognition in Chinese and Western classical music was conducted. Using the V-A (valence-arousal) model as the emotional perception model, approximately 1000 Western and Chinese classical excerpts in total were selected, and approximately 20-dimensional feature sets were extracted for each emotional dimension of each dataset. Different algorithms were considered at each step of the training process, from pre-processing to feature selection and regression model selection. The results reveal that the combination of MaxAbsScaler pre-processing and a wrapper method using recursive feature elimination based on extremely randomized trees is the optimal pipeline. The harmonic change detection function is a culturally universal feature, whereas spectral flux is a culturally specific feature for Chinese classical music. Pitch features are more significant for Western classical music, whereas loudness and rhythm features are more significant for Chinese classical music.
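The MaxAbsScaler step singled out above simply divides each feature column by its largest absolute value, mapping everything into [-1, 1] without shifting the data. A minimal pure-Python sketch of that transform (scikit-learn provides the production version):

```python
def max_abs_scale(columns):
    """Scale each feature column by its maximum absolute value, mapping
    values into [-1, 1] without centering (so zeros stay zero)."""
    scaled = []
    for col in columns:
        peak = max(abs(v) for v in col) or 1.0  # guard against all-zero columns
        scaled.append([v / peak for v in col])
    return scaled

# Two feature columns with very different ranges end up comparable.
features = [[-2.0, 1.0, 0.5], [10.0, -5.0, 2.5]]
scaled = max_abs_scale(features)
```

Because no offset is applied, sparse feature vectors keep their sparsity, which is one reason this scaler pairs well with tree-based feature selection.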
|
34
|
Visual Affective Stimulus Database: A Validated Set of Short Videos. Behav Sci (Basel) 2022; 12:bs12050137. [PMID: 35621434 PMCID: PMC9138138 DOI: 10.3390/bs12050137] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2022] [Revised: 05/02/2022] [Accepted: 05/05/2022] [Indexed: 12/10/2022] Open
Abstract
Two hundred and ninety-nine videos representing four categories (people, animals, objects, and scenes) were standardized using Adobe Premiere Pro CC 2018, with a unified duration of 3 s, a resolution of 1080 pixels/inch, and a frame size of 1920 × 1080 pixels. One hundred and sixteen participants (mean age 22.60 ± 2.06 years; 51 males) rated the videos on self-reported 9-point scales along three dimensions of emotion: valence, arousal, and dominance. A clip was assigned a specific valence (positive, neutral, or negative) if more than 60% of the participants identified it with that emotion category. Results: In total, 242 short videos, comprising 112 positive, 47 neutral, and 83 negative clips, were retained in the video stimuli database. The distribution of scores across the three emotional dimensions conformed to the fundamental characteristics of emotion. The internal consistency reliability coefficients for valence, arousal, and dominance were 0.968, 0.984, and 0.970, respectively, and the internal consistency reliability of the emotional dimensions for people and faces, animals, objects, and scenes ranged between 0.799 and 0.968. Conclusions: The emotion short-video database contains multi-scene dynamic stimuli with good reliability and score distributions, and is applicable to emotion and emotion-related research.
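Internal consistency coefficients like the 0.968–0.984 values above are typically Cronbach's alpha: the ratio of shared variance to total variance across raters or items. A minimal sketch of the computation (illustrative; the abstract does not state which coefficient the authors used, so this assumes alpha with population variances):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha = k/(k-1) * (1 - sum of item variances / variance
    of the per-subject totals), for k items rated over the same subjects."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]
    item_var = sum(pvariance(it) for it in items)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Two raters who agree perfectly yield alpha = 1.0; disagreement pulls
# the coefficient down toward (and below) zero.
```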
|