1
Ma F, Yuan Y, Xie Y, Ren H, Liu I, He Y, Ren F, Yu FR, Ni S. Generative technology for human emotion recognition: A scoping review. Information Fusion 2025; 115:102753. [DOI: 10.1016/j.inffus.2024.102753]
2
Nia AF, Tang V, Talou GM, Billinghurst M. Synthesizing affective neurophysiological signals using generative models: A review paper. J Neurosci Methods 2024; 406:110129. [PMID: 38614286] [DOI: 10.1016/j.jneumeth.2024.110129]
Abstract
The integration of emotional intelligence in machines is an important step in advancing human-computer interaction. This demands the development of reliable end-to-end emotion recognition systems. However, the scarcity of public affective datasets presents a challenge. In this literature review, we emphasize the use of generative models to address this issue in neurophysiological signals, particularly electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS). We provide a comprehensive analysis of the different generative models used in the field, examining their input formulation, deployment strategies, and methodologies for evaluating the quality of synthesized data. The review offers insights into the advantages, challenges, and promising future directions of applying generative models in emotion recognition systems. Through this review, we aim to facilitate the progression of neurophysiological data augmentation, thereby supporting the development of more efficient and reliable emotion recognition systems.
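As a concrete illustration of the pipeline family this review covers, the sketch below trains a variational autoencoder on fixed-length EEG windows and samples synthetic windows from the prior. It is a minimal sketch in PyTorch; all shapes, layer sizes, and hyperparameters are illustrative assumptions, not any reviewed method.

```python
# Minimal VAE sketch for synthesizing EEG-like windows; sizes are assumptions.
import torch
import torch.nn as nn

class EEGVAE(nn.Module):
    def __init__(self, n_channels=32, n_samples=256, latent_dim=64):
        super().__init__()
        flat = n_channels * n_samples
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(flat, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)      # posterior mean
        self.logvar = nn.Linear(512, latent_dim)  # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                 nn.Linear(512, flat))
        self.shape = (n_channels, n_samples)

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        recon = self.dec(z).view(-1, *self.shape)
        return recon, mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    rec = nn.functional.mse_loss(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

vae = EEGVAE()
x = torch.randn(8, 32, 256)                 # a batch of toy EEG windows
recon, mu, logvar = vae(x)
vae_loss(recon, x, mu, logvar).backward()
# After training, synthetic windows come from decoding prior samples:
fake = vae.dec(torch.randn(8, 64)).view(-1, 32, 256)
```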
Affiliation(s)
- Alireza F Nia
- Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand
- Vanessa Tang
- Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand
- Gonzalo Maso Talou
- Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand
- Mark Billinghurst
- Auckland Bioengineering Institute, 70 Symonds Street, Auckland, 1010, New Zealand
3
Pang M, Wang H, Huang J, Vong CM, Zeng Z, Chen C. Multi-Scale Masked Autoencoders for Cross-Session Emotion Recognition. IEEE Trans Neural Syst Rehabil Eng 2024; 32:1637-1646. [PMID: 38619940] [DOI: 10.1109/tnsre.2024.3389037]
Abstract
Affective brain-computer interfaces (aBCIs) have found widespread application, with remarkable advancements in utilizing electroencephalogram (EEG) technology for emotion recognition. However, the time-consuming process of annotating EEG data, inherent individual differences, the non-stationary characteristics of EEG data, and noise artifacts in EEG data collection pose formidable challenges in developing subject-specific cross-session emotion recognition models. To address these challenges simultaneously, we propose a unified pre-training framework based on multi-scale masked autoencoders (MSMAE), which utilizes large-scale unlabeled EEG signals from multiple subjects and sessions to extract noise-robust, subject-invariant, and temporally invariant features. We subsequently fine-tune the obtained generalized features with only a small amount of labeled data from a specific subject for personalization, enabling cross-session emotion recognition. Our framework emphasizes: 1) multi-scale representation to capture diverse aspects of EEG signals and obtain comprehensive information; 2) an improved masking mechanism for robust channel-level representation learning, addressing missing-channel issues while preserving inter-channel relationships; and 3) invariance learning for regional correlations in spatial-level representation, minimizing inter-subject and inter-session variances. With these designs, the proposed MSMAE exhibits a remarkable ability to decode emotional states from a different session of EEG data during the testing phase. Extensive experiments conducted on two publicly available datasets, SEED and SEED-IV, demonstrate that the proposed MSMAE consistently achieves stable results and outperforms competitive baseline methods in cross-session emotion recognition.
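A minimal sketch of the channel-masking pretraining idea, assuming SEED-like dimensions (62 channels). The actual MSMAE's multi-scale encoder, invariance losses, and fine-tuning stage are not reproduced; this only shows masked-channel reconstruction with a small Transformer encoder in PyTorch.

```python
# Masked-autoencoder pretraining sketch: mask channels, reconstruct them.
import torch
import torch.nn as nn

class MaskedEEGAutoencoder(nn.Module):
    def __init__(self, n_channels=62, n_samples=200, hidden=256):
        super().__init__()
        self.proj = nn.Linear(n_samples, hidden)   # one token per channel
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.Linear(hidden, n_samples)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, hidden))

    def forward(self, x, mask_ratio=0.3):
        # x: (batch, channels, samples); mask a random subset of channels.
        tokens = self.proj(x)
        b, c, _ = tokens.shape
        mask = torch.rand(b, c, device=x.device) < mask_ratio
        tokens = torch.where(mask.unsqueeze(-1),
                             self.mask_token.expand_as(tokens), tokens)
        recon = self.decoder(self.encoder(tokens))
        # Loss only on masked channels, as in masked-autoencoder training.
        return ((recon - x) ** 2)[mask].mean()

model = MaskedEEGAutoencoder()
loss = model(torch.randn(4, 62, 200))   # toy unlabeled EEG batch
loss.backward()
```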
4
Wang D, Lian J, Cheng H, Zhou Y. Music-evoked emotions classification using vision transformer in EEG signals. Front Psychol 2024; 15:1275142. [PMID: 38638516] [PMCID: PMC11024288] [DOI: 10.3389/fpsyg.2024.1275142]
Abstract
Introduction: The field of electroencephalogram (EEG)-based emotion identification has received significant attention and has been widely utilized in both human-computer interaction and therapeutic settings. Manually analyzing EEG signals requires a significant investment of time and effort. While machine learning methods have shown promising results in classifying emotions based on EEG data, extracting distinct characteristics from these signals still poses a considerable difficulty.
Methods: In this study, we present a deep learning model that incorporates an attention mechanism to effectively extract spatial and temporal information from emotion EEG recordings, addressing an existing gap in the field. Emotion EEG classification is implemented with a global average pooling layer and a fully connected layer, which are employed to leverage the discernible characteristics.
Experiments: To assess the effectiveness of the suggested methodology, we first gathered a dataset of EEG recordings related to music-induced emotions and ran comparative tests between state-of-the-art algorithms and the proposed method on this proprietary dataset. A publicly accessible dataset was also included in the subsequent comparative trials.
Discussion: The experimental findings provide evidence that the suggested methodology outperforms existing approaches in the categorization of emotion EEG signals, in both binary (positive and negative) and ternary (positive, negative, and neutral) scenarios.
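The described pipeline (attention-based feature extraction followed by global average pooling and a fully connected layer) can be sketched as below. This is a hedged PyTorch illustration, not the authors' model; patch length, model width, and class count are assumptions.

```python
# Transformer-over-patches sketch with global average pooling and FC head.
import torch
import torch.nn as nn

class EEGAttentionClassifier(nn.Module):
    def __init__(self, n_channels=32, patch_len=16, n_classes=3, d_model=128):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(n_channels * patch_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_classes)   # fully connected layer

    def forward(self, x):                  # x: (batch, channels, samples)
        b, c, t = x.shape
        n = t // self.patch_len
        patches = x[:, :, :n * self.patch_len].reshape(b, c, n, self.patch_len)
        patches = patches.permute(0, 2, 1, 3).reshape(b, n, -1)  # time patches
        z = self.encoder(self.embed(patches))
        return self.head(z.mean(dim=1))    # global average pooling over patches

logits = EEGAttentionClassifier()(torch.randn(4, 32, 256))   # -> (4, 3)
```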
Affiliation(s)
- Dong Wang
- School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan, China
- School of Intelligence Engineering, Shandong Management University, Jinan, China
- Jian Lian
- School of Intelligence Engineering, Shandong Management University, Jinan, China
- Hebin Cheng
- School of Intelligence Engineering, Shandong Management University, Jinan, China
- Yanan Zhou
- School of Arts, Beijing Foreign Studies University, Beijing, China
5
Gong M, Zhong W, Ye L, Zhang Q. MISNet: multi-source information-shared EEG emotion recognition network with two-stream structure. Front Neurosci 2024; 18:1293962. [PMID: 38419660] [PMCID: PMC10899343] [DOI: 10.3389/fnins.2024.1293962]
Abstract
Introduction: When constructing machine learning and deep neural networks, the domain shift across different subjects complicates subject-independent electroencephalography (EEG) emotion recognition. Most existing domain adaptation methods either treat all source domains as equivalent or train source-specific learners directly, misleading the network into acquiring unreasonable transfer knowledge and thus resulting in negative transfer.
Methods: This paper incorporates the individual differences and group commonality of distinct domains and proposes a multi-source information-shared network (MISNet) to enhance the performance of subject-independent EEG emotion recognition models. Network stability is enhanced by employing a two-stream training structure with a loop iteration strategy to prevent outlier sources from confusing the model. Additionally, we design two auxiliary loss functions for aligning the marginal distributions of domain-specific and domain-shared features, and then optimize the convergence process by constraining the gradient penalty on these auxiliary loss functions. Furthermore, a pre-training strategy is proposed to ensure that the initial mapping of the shared encoder contains sufficient emotional information.
Results: We evaluate the proposed MISNet to ascertain the impact of several hyper-parameters on the network's domain adaptation capability. Ablation experiments are conducted on two publicly accessible datasets, SEED and SEED-IV, to assess the effectiveness of each loss function.
Discussion: The experimental results demonstrate that by disentangling private and shared emotional characteristics from differential entropy features of EEG signals, the proposed MISNet gains robust subject-independent performance and strong domain adaptability.
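As one illustration of the auxiliary alignment losses mentioned in the abstract, the sketch below penalizes the distance between the mean embeddings of two feature batches, a simple linear stand-in for a marginal-distribution alignment term. MISNet's actual loss functions and gradient-penalty constraint are not reproduced.

```python
# Linear mean-discrepancy sketch for aligning feature distributions.
import torch

def marginal_alignment_loss(feats_a: torch.Tensor,
                            feats_b: torch.Tensor) -> torch.Tensor:
    """Squared distance between the mean embeddings of two feature batches."""
    return (feats_a.mean(dim=0) - feats_b.mean(dim=0)).pow(2).sum()

shared_src = torch.randn(64, 128, requires_grad=True)  # source-domain features
shared_tgt = torch.randn(64, 128, requires_grad=True)  # target-domain features
loss = marginal_alignment_loss(shared_src, shared_tgt)
loss.backward()
```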
Affiliation(s)
- Ming Gong
- Key Laboratory of Media Audio and Video (Communication University of China), Ministry of Education, Beijing, China
- Wei Zhong
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China
- Long Ye
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China
- Qin Zhang
- State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China
6
Li R, Ren C, Zhang S, Yang Y, Zhao Q, Hou K, Yuan W, Zhang X, Hu B. STSNet: a novel spatio-temporal-spectral network for subject-independent EEG-based emotion recognition. Health Inf Sci Syst 2023; 11:25. [PMID: 37265664] [PMCID: PMC10229500] [DOI: 10.1007/s13755-023-00226-x]
Abstract
How to use the characteristics of EEG signals to obtain more complementary and discriminative data representations is an open issue in EEG-based emotion recognition. Many studies have tried spatio-temporal or spatio-spectral feature fusion to obtain higher-level representations of EEG data. However, these studies ignored the complementarity between the spatial, temporal, and spectral domains of EEG signals, thus limiting the classification ability of models. This study proposes an end-to-end network based on ManifoldNet and BiLSTM, named STSNet. STSNet first constructs a 4-D spatio-temporal-spectral data representation and a spatio-temporal data representation of the EEG signals in manifold space. These are then fed into the ManifoldNet network and the BiLSTM network, respectively, to compute higher-level features and achieve spatio-temporal-spectral feature fusion. Finally, extensive comparative experiments were performed on two public datasets, DEAP and DREAMER, using a subject-independent leave-one-subject-out cross-validation strategy. On the DEAP dataset, the average accuracies for valence and arousal are 69.38% and 71.88%, respectively; on the DREAMER dataset, they are 78.26% and 82.37%, respectively. The experimental results show that the STSNet model has good emotion recognition performance.
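Since STSNet builds on differential entropy (DE) features arranged over bands, windows, and channels, here is a minimal sketch of that feature construction. Band edges, window length, and the (band, window, channel) layout are assumptions; the paper's manifold-space representation and networks are not reproduced.

```python
# Band-wise differential entropy features for a spatio-temporal-spectral tensor.
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def de_features(eeg, fs=128, win_sec=1.0):
    """eeg: (channels, samples) -> DE tensor of shape (bands, windows, channels)."""
    win = int(fs * win_sec)
    n_win = eeg.shape[1] // win
    out = np.empty((len(BANDS), n_win, eeg.shape[0]))
    for bi, (lo, hi) in enumerate(BANDS.values()):
        b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
        filtered = filtfilt(b, a, eeg, axis=1)
        for wi in range(n_win):
            seg = filtered[:, wi * win:(wi + 1) * win]
            # DE of a Gaussian signal: 0.5 * log(2 * pi * e * variance)
            out[bi, wi] = 0.5 * np.log(2 * np.pi * np.e * seg.var(axis=1))
    return out

features = de_features(np.random.randn(32, 128 * 60))   # -> (4, 60, 32)
```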
Affiliation(s)
- Rui Li
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Chao Ren
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Sipo Zhang
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Yikun Yang
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Qiqi Zhao
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Kechen Hou
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Wenjie Yuan
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Xiaowei Zhang
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Bin Hu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
7
Carrle FP, Hollenbenders Y, Reichenbach A. Generation of synthetic EEG data for training algorithms supporting the diagnosis of major depressive disorder. Front Neurosci 2023; 17:1219133. [PMID: 37849893] [PMCID: PMC10577178] [DOI: 10.3389/fnins.2023.1219133]
Abstract
Introduction: Major depressive disorder (MDD) is the most common mental disorder worldwide, leading to impairment in quality and independence of life. Electroencephalography (EEG) biomarkers processed with machine learning (ML) algorithms have been explored for objective diagnoses with promising results. However, the generalizability of those models, a prerequisite for clinical application, is restricted by small datasets. One approach to training ML models with good generalizability is complementing the original data with synthetic data produced by generative algorithms. Another advantage of synthetic data is the possibility of publishing the data for other researchers without risking patient data privacy. Synthetic EEG time-series have not yet been generated for two clinical populations such as MDD patients and healthy controls.
Methods: We first reviewed 27 studies presenting EEG data augmentation with generative algorithms for classification tasks, such as diagnosis, to assess the possibilities and shortcomings of recent methods. The subsequent empirical study generated EEG time-series based on two public datasets with 30/28 and 24/29 subjects (MDD/controls). To obtain baseline diagnostic accuracies, convolutional neural networks (CNN) were trained with time-series from each dataset. The data were synthesized with generative adversarial networks (GAN) consisting of CNNs. We evaluated the synthetic data qualitatively and quantitatively and finally used them to re-train the diagnostic model.
Results: The reviewed studies improved their classification accuracies by between 1 and 40% with the synthetic data. Our own diagnostic accuracy improved by up to 10% for one dataset but not significantly for the other. We found a rich repertoire of generative models in the reviewed literature, solving various technical issues. A major shortcoming in the field is the lack of meaningful evaluation metrics for synthetic data. The few studies analyzing the data in the frequency domain, including our own, show that only some features can be reproduced truthfully.
Discussion: The systematic review combined with our own investigation provides an overview of the available methods for generating EEG data for classification tasks, along with their possibilities and shortcomings. The approach is promising and the technical basis is set. For broad application of these techniques in neuroscience research or clinical practice, the methods need fine-tuning facilitated by domain expertise in (clinical) EEG research.
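A minimal sketch of the model family the study trains: a GAN whose generator and discriminator are built from (transposed) 1-D convolutions over EEG windows. Channel count, window length, and layer sizes are illustrative assumptions, not the paper's architecture.

```python
# CNN-based GAN sketch over EEG windows; sizes are assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim=100, n_channels=19, n_samples=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 128 * (n_samples // 4)), nn.ReLU(),
            nn.Unflatten(1, (128, n_samples // 4)),
            nn.ConvTranspose1d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(64, n_channels, kernel_size=4, stride=2, padding=1),
        )
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self, n_channels=19):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv1d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(128, 1),
        )
    def forward(self, x):
        return self.net(x)

g, d = Generator(), Discriminator()
fake = g(torch.randn(8, 100))                        # (8, 19, 256) synthetic EEG
bce = nn.BCEWithLogitsLoss()
d_loss = bce(d(fake.detach()), torch.zeros(8, 1))    # discriminator: fake half
g_loss = bce(d(fake), torch.ones(8, 1))              # generator tries to fool d
```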
Affiliation(s)
- Friedrich Philipp Carrle
- Center for Machine Learning, Heilbronn University, Heilbronn, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- Yasmin Hollenbenders
- Center for Machine Learning, Heilbronn University, Heilbronn, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- Alexandra Reichenbach
- Center for Machine Learning, Heilbronn University, Heilbronn, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
8
Zhou Y, Lian J. Identification of emotions evoked by music via spatial-temporal transformer in multi-channel EEG signals. Front Neurosci 2023; 17:1188696. [PMID: 37483354] [PMCID: PMC10358766] [DOI: 10.3389/fnins.2023.1188696]
Abstract
Introduction: Emotion plays a vital role in understanding activities and associations. Because EEG is non-invasive, many researchers have adopted it as a reliable technique for emotion recognition. Identifying emotions from multi-channel EEG signals is evolving into a crucial task for diagnosing emotional disorders in neuroscience. One challenge in automated emotion recognition from EEG signals is extracting and selecting the discriminating features needed to classify different emotions accurately.
Methods: In this study, we propose a novel Transformer model for identifying emotions from multi-channel EEG signals. Note that we feed the raw EEG signal directly into the proposed Transformer, which aims at eliminating the issues caused by the local receptive fields in convolutional neural networks. The presented deep learning model consists of two separate channels to address the spatial and temporal information in the EEG signals, respectively.
Results: In the experiments, we first collected EEG recordings from 20 subjects while they listened to music. Experimental results of the proposed approach for binary emotion classification (positive and negative) and ternary emotion classification (positive, negative, and neutral) indicated accuracies of 97.3% and 97.1%, respectively. We conducted comparison experiments on the same dataset between the proposed method and state-of-the-art techniques and achieved a promising outcome.
Discussion: Given its performance, the proposed approach can be a potentially valuable instrument for human-computer interface systems.
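The two-channel design described above can be sketched as two self-attention streams, one over electrodes (spatial) and one over time steps (temporal), fused before classification. This is an assumption-laden PyTorch illustration, not the authors' exact Transformer.

```python
# Two-stream attention sketch: spatial (electrode) and temporal (time) streams.
import torch
import torch.nn as nn

class TwoStreamEEGTransformer(nn.Module):
    def __init__(self, n_channels=20, n_samples=256, d_model=64, n_classes=2):
        super().__init__()
        self.spatial_in = nn.Linear(n_samples, d_model)    # tokens = electrodes
        self.temporal_in = nn.Linear(n_channels, d_model)  # tokens = time steps
        self.spatial_att = nn.MultiheadAttention(d_model, num_heads=4,
                                                 batch_first=True)
        self.temporal_att = nn.MultiheadAttention(d_model, num_heads=4,
                                                  batch_first=True)
        self.head = nn.Linear(2 * d_model, n_classes)

    def forward(self, x):                        # x: (batch, channels, samples)
        s = self.spatial_in(x)                   # (batch, channels, d_model)
        t = self.temporal_in(x.transpose(1, 2))  # (batch, samples, d_model)
        s, _ = self.spatial_att(s, s, s)         # attention across electrodes
        t, _ = self.temporal_att(t, t, t)        # attention across time
        fused = torch.cat([s.mean(dim=1), t.mean(dim=1)], dim=-1)
        return self.head(fused)

logits = TwoStreamEEGTransformer()(torch.randn(4, 20, 256))   # -> (4, 2)
```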
Affiliation(s)
- Yanan Zhou
- School of Arts, Beijing Foreign Studies University, Beijing, China
- Jian Lian
- School of Intelligence Engineering, Shandong Management University, Jinan, China
9
EEG Emotion Recognition Based on Temporal and Spatial Features of Sensitive Signals. Journal of Electrical and Computer Engineering 2022. [DOI: 10.1155/2022/5130184]
Abstract
Currently, there are some problems in electroencephalogram (EEG) emotion recognition research, such as reliance on a single feature and redundant signals, which make it impossible to achieve high recognition accuracy when only a few channel signals are used. To solve these problems, the authors propose a method for emotion recognition based on a long short-term memory (LSTM) neural network and a convolutional neural network (CNN) combined with neurophysiological knowledge. First, the authors selected emotion-sensitive signals based on the physiological function of EEG regions and the active scenario of the band signals, and then merged the temporal and spatial features extracted from these sensitive signals by the LSTM and CNN. Finally, the merged features were classified to recognize emotion. The method was evaluated on the DEAP dataset; the average accuracies in the valence and arousal dimensions were 92.87% and 93.23%, respectively. Compared with similar studies, it not only improved the recognition accuracy but also greatly reduced the number of channels required for computation, demonstrating the superiority of the method.
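A minimal sketch of the fusion idea: an LSTM extracts temporal features and a small CNN extracts cross-channel spatial features from the selected sensitive channels, and the merged features are classified. The channel count and layer sizes are assumptions, not the paper's configuration.

```python
# LSTM (temporal) + CNN (spatial) feature fusion sketch for a few channels.
import torch
import torch.nn as nn

class LSTMCNNFusion(nn.Module):
    def __init__(self, n_channels=8, hidden=64, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden,
                            batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(hidden + 32, n_classes)

    def forward(self, x):                 # x: (batch, channels, samples)
        temporal, _ = self.lstm(x.transpose(1, 2))   # time-major for the LSTM
        temporal = temporal[:, -1]                   # last hidden state
        spatial = self.cnn(x)                        # cross-channel features
        return self.classifier(torch.cat([temporal, spatial], dim=-1))

logits = LSTMCNNFusion()(torch.randn(4, 8, 256))   # -> (4, 2)
```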
10
Aung ST, Hassan M, Brady M, Mannan ZI, Azam S, Karim A, Zaman S, Wongsawat Y. Entropy-Based Emotion Recognition from Multichannel EEG Signals Using Artificial Neural Network. Computational Intelligence and Neuroscience 2022; 2022:6000989. [PMID: 36275950] [PMCID: PMC9584707] [DOI: 10.1155/2022/6000989]
Abstract
Humans experience a variety of emotions throughout the course of their daily lives, including happiness, sadness, and rage. As a result, an effective emotion identification system that allows electroencephalography (EEG) data to accurately reflect emotion in real time is essential. Although recent studies on this problem provide acceptable performance measures, they are still not adequate for the implementation of a complete emotion recognition system. In this research work, we propose a new approach for an emotion recognition system, using multichannel EEG computation with our newly developed entropy measure, multivariate multiscale modified-distribution entropy (MM-mDistEn), combined with an artificial neural network (ANN) model, to attain better outcomes than existing methods. The proposed system has been tested with two different datasets and achieved higher accuracy than existing methods. For the GAMEEMO dataset, we achieved an average accuracy ± standard deviation of 95.73% ± 0.67 for valence and 96.78% ± 0.25 for arousal. Moreover, the average accuracy for the DEAP dataset reached 92.57% ± 1.51 in valence and 80.23% ± 1.83 in arousal.
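MM-mDistEn is the authors' own measure and is not reproduced here. As an illustrative stand-in for the entropy-feature-plus-ANN pipeline, the sketch below computes classical distribution entropy (DistEn) per channel on toy data and trains a small scikit-learn neural network; the embedding dimension, bin count, and toy shapes are assumptions.

```python
# Distribution entropy (DistEn) features per channel, fed to a small MLP.
import numpy as np
from sklearn.neural_network import MLPClassifier

def dist_en(x, m=2, bins=64):
    """Distribution entropy of a 1-D signal with embedding dimension m."""
    n = len(x) - m + 1
    emb = np.stack([x[i:i + n] for i in range(m)], axis=1)   # (n, m) vectors
    # Chebyshev distances between all distinct pairs of embedded vectors.
    d = np.abs(emb[:, None, :] - emb[None, :, :]).max(axis=2)
    d = d[np.triu_indices(n, k=1)]
    p, _ = np.histogram(d, bins=bins)
    p = p / p.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum() / np.log(bins)   # normalized to [0, 1]

rng = np.random.default_rng(0)
X = np.array([[dist_en(trial[ch]) for ch in range(14)]
              for trial in rng.standard_normal((40, 14, 384))])
y = rng.integers(0, 2, size=40)                    # toy valence labels
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)
```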
Affiliation(s)
- Si Thu Aung
- Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, Salaya, Thailand
- Mehedi Hassan
- Computer Science and Engineering, North Western University, Khulna, Bangladesh
- Mark Brady
- Asia Pacific College of Business and Law, Charles Darwin University, Casuarina, NT, Australia
- Zubaer Ibna Mannan
- Department of Smart Computing, Kyungdong University, Global Campus, Goseong-Gun, Republic of Korea
- Sami Azam
- College of Engineering, IT and Environment, Charles Darwin University, Casuarina, NT, Australia
- Asif Karim
- College of Engineering, IT and Environment, Charles Darwin University, Casuarina, NT, Australia
- Sadika Zaman
- Computer Science and Engineering, North Western University, Khulna, Bangladesh
- Yodchanan Wongsawat
- Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, Salaya, Thailand
11
Na Y, Joo H, Trang LT, Quan LDA, Woo J. Objective speech intelligibility prediction using a deep learning model with continuous speech-evoked cortical auditory responses. Front Neurosci 2022; 16:906616. [PMID: 36061597] [PMCID: PMC9433707] [DOI: 10.3389/fnins.2022.906616]
Abstract
Auditory prostheses provide an opportunity for rehabilitation of hearing-impaired patients. Speech intelligibility can be used to estimate the extent to which an auditory prosthesis improves the user's speech comprehension. Although behavior-based speech intelligibility is the gold standard, precise evaluation is limited by its subjectivity. Here, we used a convolutional neural network to predict speech intelligibility from electroencephalography (EEG). Sixty-four-channel EEGs were recorded from 87 adult participants with normal hearing. Sentences spectrally degraded by a 2-, 3-, 4-, 5-, or 8-channel vocoder were used to set relatively low speech intelligibility conditions, using a Korean sentence recognition test. The speech intelligibility scores were divided into 41 discrete levels ranging from 0 to 100%, with a step of 2.5%; three scores (30.0, 37.5, and 40.0%) were not collected. Speech features, namely the speech temporal envelope (ENV) and phoneme (PH) onset, were used to extract continuous-speech EEGs for speech intelligibility prediction. The deep learning model was trained on datasets of event-related potentials (ERP), of correlation coefficients between the ERPs and ENVs, between the ERPs and PH onsets, or between the ERPs and the product of PH and ENV (PHENV). The speech intelligibility prediction accuracies were 97.33% (ERP), 99.42% (ENV), 99.55% (PH), and 99.91% (PHENV). The models were interpreted using the occlusion sensitivity approach. While the informative electrodes of the ENV model were located in the occipital area, the occlusion sensitivity maps showed that the informative electrodes of the phoneme models, i.e., PH and PHENV, were located in the language-processing area. Of the models tested, the PHENV model obtained the best speech intelligibility prediction accuracy and may promote clinical prediction of speech intelligibility with a more comfortable speech intelligibility test.
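One ingredient of the feature extraction above, the per-channel correlation between EEG and the speech temporal envelope (ENV), can be sketched as follows. The Hilbert-envelope extraction and toy shapes are assumptions; the phoneme-onset (PH) features and the CNN itself are not reproduced.

```python
# Per-channel Pearson correlation between EEG and the speech envelope.
import numpy as np
from scipy.signal import hilbert

def speech_envelope(audio):
    """Temporal envelope as the magnitude of the analytic signal."""
    return np.abs(hilbert(audio))

def envelope_correlations(eeg, envelope):
    """eeg: (channels, samples) -> Pearson r per channel with the envelope."""
    e = (envelope - envelope.mean()) / envelope.std()
    x = (eeg - eeg.mean(axis=1, keepdims=True)) / eeg.std(axis=1, keepdims=True)
    return x @ e / len(e)

fs = 64                                      # toy sampling rate, in Hz
audio = np.random.randn(fs * 10)             # 10 s of toy "speech"
eeg = np.random.randn(64, fs * 10)           # 64-channel toy EEG
features = envelope_correlations(eeg, speech_envelope(audio))   # -> (64,)
```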
Affiliation(s)
- Youngmin Na
- Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea
- Hyosung Joo
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Le Thi Trang
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Luong Do Anh Quan
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Jihwan Woo
- Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
12
Shim M, Im CH, Lee SH, Hwang HJ. Enhanced Performance by Interpretable Low-Frequency Electroencephalogram Oscillations in the Machine Learning-Based Diagnosis of Post-traumatic Stress Disorder. Front Neuroinform 2022; 16:811756. [PMID: 35571868] [PMCID: PMC9094422] [DOI: 10.3389/fninf.2022.811756]
Abstract
Electroencephalography (EEG)-based diagnosis of psychiatric diseases using machine-learning approaches has made objective diagnosis of various psychiatric diseases possible. The objective of this study was to improve the performance of a resting-state EEG-based computer-aided diagnosis (CAD) system for post-traumatic stress disorder (PTSD) by optimizing the frequency bands used to extract EEG features. We used eyes-closed resting-state EEG data recorded from 77 PTSD patients and 58 healthy controls (HC). Source-level power spectral densities (PSDs) of the resting-state EEG data were extracted from six frequency bands (delta, theta, alpha, low-beta, high-beta, and gamma), and the PSD features of each frequency band and their combinations were independently used to discriminate PTSD and HC. The classification performance was evaluated using a support vector machine with leave-one-out cross-validation. The PSD features extracted from the slower frequency bands (delta and theta) showed significantly higher classification performance than those of relatively higher frequency bands. The best classification performance was achieved using delta PSD features (86.61%), about 13% higher than that reported in a recent study. The PSD features selected to obtain better classification performance could be explained from a neurophysiological point of view, demonstrating the promising potential to develop a clinically reliable EEG-based CAD system for PTSD diagnosis.
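The classification pipeline described, band-wise PSD features with an SVM under leave-one-out cross-validation, can be sketched as follows on toy data. Band edges, sampling rate, and SVM settings are assumptions, not the study's exact configuration.

```python
# Band-power (Welch PSD) features + SVM with leave-one-out cross-validation.
import numpy as np
from scipy.signal import welch
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "low_beta": (13, 20), "high_beta": (20, 30), "gamma": (30, 45)}

def band_powers(eeg, fs=250):
    """eeg: (channels, samples) -> mean PSD per (channel, band), flattened."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2, axis=1)
    return np.concatenate([psd[:, (freqs >= lo) & (freqs < hi)].mean(axis=1)
                           for lo, hi in BANDS.values()])

rng = np.random.default_rng(0)
X = np.array([band_powers(s) for s in rng.standard_normal((30, 19, 250 * 10))])
y = rng.integers(0, 2, size=30)                    # toy PTSD / HC labels
acc = cross_val_score(SVC(kernel="linear"), X, y, cv=LeaveOneOut()).mean()
```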
Affiliation(s)
- Miseon Shim
- Department of Electronics and Information, Korea University, Sejong, South Korea
- Industry Development Institute, Korea University, Sejong, South Korea
- Chang-Hwan Im
- Department of Biomedical Engineering, Hanyang University, Seoul, South Korea
- Seung-Hwan Lee
- Department of Psychiatry, Ilsan Paik Hospital, Inje University, Goyang, South Korea
- Clinical Emotion and Cognition Research Laboratory, Goyang, South Korea
- Han-Jeong Hwang
- Department of Electronics and Information, Korea University, Sejong, South Korea
- Interdisciplinary Graduate Program for Artificial Intelligence Smart Convergence Technology, Korea University, Sejong, South Korea