1
Novičić M, Djordjević O, Miler-Jerković V, Konstantinović L, Savić AM. Improving the Performance of Electrotactile Brain-Computer Interface Using Machine Learning Methods on Multi-Channel Features of Somatosensory Event-Related Potentials. Sensors (Basel) 2024; 24:8048. [PMID: 39771785 PMCID: PMC11679428 DOI: 10.3390/s24248048]
Abstract
Traditional tactile brain-computer interfaces (BCIs), particularly those based on steady-state somatosensory-evoked potentials, face challenges such as lower accuracy, reduced bit rates, and the need for spatially distant stimulation points. In contrast, using transient electrical stimuli offers a promising alternative for generating tactile BCI control signals: somatosensory event-related potentials (sERPs). This study aimed to optimize the performance of a novel electrotactile BCI by employing advanced feature extraction and machine learning techniques on sERP signals for the classification of users' selective tactile attention. The experimental protocol involved ten healthy subjects performing a tactile attention task, with EEG signals recorded from five EEG channels over the sensory-motor cortex. We employed sequential forward selection (SFS) of features from temporal sERP waveforms of all EEG channels. We systematically tested classification performance using machine learning algorithms, including logistic regression, k-nearest neighbors, support vector machines, random forests, and artificial neural networks. We explored the effects of the number of stimuli required to obtain sERP features for classification and their influence on accuracy and information transfer rate. Our approach indicated significant improvements in classification accuracy compared to previous studies. We demonstrated that the number of stimuli for sERP generation can be reduced while increasing the information transfer rate without a statistically significant decrease in classification accuracy. In the case of the support vector machine classifier, we achieved a mean accuracy over 90% for 10 electrical stimuli, while for 6 stimuli, the accuracy decreased by less than 7%, and the information transfer rate increased by 60%. This research advances methods for tactile BCI control based on event-related potentials. 
This work is significant since tactile stimulation is an understudied modality for BCI control, and electrically induced sERPs are the least studied control signals in reactive BCIs. Exploring and optimizing the parameters of sERP elicitation, as well as feature extraction and classification methods, is crucial for addressing the accuracy versus speed trade-off in various assistive BCI applications where the tactile modality may have added value.
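The accuracy-versus-speed trade-off discussed above is conventionally quantified with the Wolpaw information transfer rate (ITR). The sketch below is not the authors' code: it assumes a binary attention task, and the accuracies and per-stimulus duration are illustrative placeholders loosely based on the abstract's figures.

```python
import math

def wolpaw_itr_bits_per_trial(accuracy: float, n_classes: int) -> float:
    """Wolpaw ITR in bits per selection for an n-class task."""
    p, n = accuracy, n_classes
    if p <= 1.0 / n:
        return 0.0          # at or below chance carries no information
    if p >= 1.0:
        return math.log2(n)
    return (math.log2(n) + p * math.log2(p)
            + (1 - p) * math.log2((1 - p) / (n - 1)))

def itr_bits_per_min(accuracy: float, n_classes: int, trial_seconds: float) -> float:
    return wolpaw_itr_bits_per_trial(accuracy, n_classes) * 60.0 / trial_seconds

# Illustrative only: fewer stimuli per decision shortens the trial, so ITR can
# rise even though accuracy drops (assumed 1 s per electrical stimulus).
acc_10, acc_6 = 0.90, 0.84
itr_10 = itr_bits_per_min(acc_10, 2, 10 * 1.0)
itr_6 = itr_bits_per_min(acc_6, 2, 6 * 1.0)
```

With these placeholder numbers, the 6-stimulus condition yields a higher ITR despite the lower accuracy, which is the trade-off the study exploits.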
Affiliation(s)
- Marija Novičić
- School of Electrical Engineering, University of Belgrade, 11000 Belgrade, Serbia
- Olivera Djordjević
- Faculty of Medicine, University of Belgrade, 11000 Belgrade, Serbia
- Clinic for Rehabilitation “Dr. Miroslav Zotović”, 11000 Belgrade, Serbia
- Vera Miler-Jerković
- Innovation Center of the School of Electrical Engineering, University of Belgrade, 11000 Belgrade, Serbia
- Ljubica Konstantinović
- Faculty of Medicine, University of Belgrade, 11000 Belgrade, Serbia
- Clinic for Rehabilitation “Dr. Miroslav Zotović”, 11000 Belgrade, Serbia
- Andrej M. Savić
- School of Electrical Engineering, University of Belgrade, 11000 Belgrade, Serbia
2
Zhang Z, Ding X, Bao Y, Zhao Y, Liang X, Qin B, Liu T. Chisco: An EEG-based BCI dataset for decoding of imagined speech. Sci Data 2024; 11:1265. [PMID: 39572577 PMCID: PMC11582579 DOI: 10.1038/s41597-024-04114-1]
Abstract
The rapid advancement of deep learning has enabled brain-computer interface (BCI) technology, particularly neural decoding techniques, to achieve higher accuracy and deeper levels of interpretation. Interest in decoding imagined speech has increased significantly because the concept is akin to "mind reading". However, previous studies on decoding neural language have predominantly focused on brain activity patterns during human reading. The absence of imagined speech electroencephalography (EEG) datasets has constrained further research in this field. We present the Chinese Imagined Speech Corpus (Chisco), including over 20,000 sentences of high-density EEG recordings of imagined speech from healthy adults. Each subject's EEG data exceeds 900 minutes, the largest per-individual dataset currently available for decoding neural language. Furthermore, the experimental stimuli include over 6,000 everyday phrases across 39 semantic categories, covering nearly all aspects of daily language. We believe that Chisco represents a valuable resource for the BCI field, facilitating the development of more user-friendly BCIs.
Affiliation(s)
- Zihan Zhang
- Harbin Institute of Technology, Department of Computer Science, Harbin, 150000, China
- Xiao Ding
- Harbin Institute of Technology, Department of Computer Science, Harbin, 150000, China
- Yu Bao
- Harbin Institute of Technology, Department of Computer Science, Harbin, 150000, China
- Yi Zhao
- Harbin Institute of Technology, Department of Computer Science, Harbin, 150000, China
- Xia Liang
- Harbin Institute of Technology, The School of Space Environment and Material Science, Harbin, 150000, China
- Bing Qin
- Harbin Institute of Technology, Department of Computer Science, Harbin, 150000, China
- Ting Liu
- Harbin Institute of Technology, Department of Computer Science, Harbin, 150000, China
3
Pan H, Song W, Li L, Qin X. The design and implementation of multi-character classification scheme based on EEG signals of visual imagery. Cogn Neurodyn 2024; 18:2299-2309. [PMID: 39678727 PMCID: PMC11639744 DOI: 10.1007/s11571-024-10087-z]
Abstract
In visual-imagery-based brain-computer interfaces (VI-BCIs), the singleness of the imagination task and insufficient description of feature information seriously hinder the development and application of VI-BCI technology in the field of restoring communication. In this paper, we design and optimize a multi-character classification scheme based on electroencephalogram (EEG) signals of visual imagery (VI), which is used to classify 29 characters comprising 26 lowercase English letters and three punctuation marks. Firstly, a new paradigm that randomly presents characters and includes a preparation stage is designed to acquire EEG signals and construct a multi-character dataset, eliminating interference between VI tasks. Secondly, tensor data are obtained by the Morlet wavelet transform, and a feature extraction algorithm based on tensor uncorrelated multilinear principal component analysis (UMPCA) is used to extract high-quality features. Finally, three classifiers, namely support vector machine, K-nearest neighbor, and extreme learning machine, are employed for multi-character classification, and the results are compared. The experimental results demonstrate that the proposed scheme effectively extracts character features with minimal redundancy, weak correlation, and strong representation capability, and achieves an average classification accuracy of 97.59% for 29 characters, surpassing existing research in both accuracy and number of classes. The present study designs a new paradigm for acquiring EEG signals of VI, and combines the Morlet wavelet transform and the UMPCA algorithm to extract character features, enabling multi-character classification with various classifiers. This research paves a novel pathway for establishing direct brain-to-world communication.
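The Morlet step that produces the time-frequency tensor can be sketched with plain NumPy. This is an illustrative implementation, not the authors' pipeline: the sampling rate, frequency grid, and `n_cycles` are assumed values, and the UMPCA stage is omitted.

```python
import numpy as np

def morlet_tfr(signal, fs, freqs, n_cycles=5.0):
    """Time-frequency power map via convolution with complex Morlet wavelets."""
    out = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2 * np.pi * f)              # temporal width at f
        t = np.arange(-3 * sigma_t, 3 * sigma_t, 1 / fs)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # unit-energy norm
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return out  # (n_freqs, n_samples); stacking channels gives a 3-D tensor

fs = 250.0                                   # assumed sampling rate
time = np.arange(0, 2.0, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * time)          # synthetic 10 Hz oscillation
tfr = morlet_tfr(eeg, fs, freqs=np.array([6.0, 10.0, 20.0]))
```

Power concentrates in the 10 Hz row for this synthetic input; per-channel maps stacked along a third axis form the kind of tensor a UMPCA step would then reduce.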
Affiliation(s)
- Hongguang Pan
- College of Electrical and Control Engineering, Xi’an University of Science and Technology, Xi’an, 710054 Shaanxi China
- Xi’an Key Laboratory of Electrical Equipment Condition Monitoring and Power Supply Security, Xi’an, 710054 Shaanxi China
- Wei Song
- College of Electrical and Control Engineering, Xi’an University of Science and Technology, Xi’an, 710054 Shaanxi China
- Li Li
- College of Electrical and Control Engineering, Xi’an University of Science and Technology, Xi’an, 710054 Shaanxi China
- Xuebin Qin
- College of Electrical and Control Engineering, Xi’an University of Science and Technology, Xi’an, 710054 Shaanxi China
4
Lee BH, Cho JH, Kwon BH, Lee M, Lee SW. Iteratively Calibratable Network for Reliable EEG-Based Robotic Arm Control. IEEE Trans Neural Syst Rehabil Eng 2024; 32:2793-2804. [PMID: 39074028 DOI: 10.1109/tnsre.2024.3434983]
Abstract
Robotic arms are increasingly utilized in shared workspaces, which necessitates accurate interpretation of human intentions for both efficiency and safety. Electroencephalogram (EEG) signals, commonly employed to measure brain activity, offer a direct communication channel between humans and robotic arms. However, the ambiguous and unstable characteristics of EEG signals, coupled with their widespread distribution, make it challenging to collect sufficient data and hinder calibration performance for new signals, reducing the reliability of EEG-based applications. To address these issues, this study proposes an iteratively calibratable network aimed at enhancing the reliability and efficiency of EEG-based robotic arm control systems. The proposed method integrates feature inputs with network expansion techniques, allowing a network trained on an extensive initial dataset to adapt effectively to new users during calibration. Additionally, our approach combines motor imagery and speech imagery datasets to increase not only its intuitiveness but also the number of command classes. The evaluation is conducted in a pseudo-online manner, with a robotic arm operating in real time to collect data, which is then analyzed offline. The evaluation results showed that the proposed method outperformed the comparison group across 10 sessions and achieved competitive results when the two paradigms were combined. It was therefore confirmed that the network can be calibrated and personalized using only new data from new users.
5
Pan H, Wang Y, Li Z, Chu X, Teng B, Gao H. A Complete Scheme for Multi-Character Classification Using EEG Signals From Speech Imagery. IEEE Trans Biomed Eng 2024; 71:2454-2462. [PMID: 38470574 DOI: 10.1109/tbme.2024.3376603]
Abstract
Some classification studies of brain-computer interfaces (BCIs) based on speech imagery show potential for improving communication in patients with amyotrophic lateral sclerosis (ALS). However, current research on speech imagery is limited in scope and primarily focuses on vowels or a few selected words. In this paper, we propose a complete research scheme for multi-character classification based on EEG signals derived from speech imagery. Firstly, we record 31 speech imagery items, comprising the 26 letters of the alphabet and five commonly used punctuation marks, from seven subjects using a 32-channel electroencephalogram (EEG) device. Secondly, we introduce the wavelet scattering transform (WST), which shares a structural resemblance to convolutional neural networks (CNNs), for feature extraction. The WST is a knowledge-driven technique that preserves high-frequency information and maintains the deformation stability of EEG signals. To reduce the dimensionality of the wavelet scattering coefficient features, we employ kernel principal component analysis (KPCA). Finally, the reduced features are fed into an extreme gradient boosting (XGBoost) classifier within a multi-classification framework. The XGBoost classifier is optimized through hyperparameter tuning using grid search and 10-fold cross-validation, resulting in an average accuracy of 78.73% for the multi-character classification task. We use t-distributed stochastic neighbor embedding (t-SNE) to visualize the low-dimensional representation of multi-character speech imagery, which effectively reveals the clustering of similar characters. The experimental results demonstrate the effectiveness of the proposed multi-character classification scheme; furthermore, our classification categories and accuracy exceed those reported in existing research.
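The KPCA dimensionality-reduction step can be illustrated with a NumPy-only kernel PCA on synthetic feature vectors. This is a generic sketch, not the paper's implementation: the RBF kernel, `gamma`, and the synthetic two-class data are assumptions, and the WST and XGBoost stages are omitted.

```python
import numpy as np

def kernel_pca(X, n_components, gamma=1.0):
    """Kernel PCA with an RBF kernel (training-set projection only)."""
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = len(X)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one      # double-center the kernel
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]     # largest eigenvalues first
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                               # projected training samples

rng = np.random.default_rng(0)
# two synthetic "classes" of scattering-like feature vectors (assumed data)
X = np.vstack([rng.normal(0, 1, (20, 50)), rng.normal(3, 1, (20, 50))])
Z = kernel_pca(X, n_components=2, gamma=0.01)
```

The two synthetic classes separate along the first kernel principal component; in the paper's pipeline, such reduced features would then feed the gradient-boosted classifier.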
6
Lee M, Kang H, Yu SH, Cho H, Oh J, van der Lande G, Gosseries O, Jeong JH. Automatic Sleep Stage Classification Using Nasal Pressure Decoding Based on a Multi-Kernel Convolutional BiLSTM Network. IEEE Trans Neural Syst Rehabil Eng 2024; 32:2533-2544. [PMID: 38941194 DOI: 10.1109/tnsre.2024.3420715]
Abstract
Sleep quality is an essential parameter of a healthy human life, yet sleep disorders such as sleep apnea are abundant. The gold standard for investigating sleep and its dysfunction is polysomnography, which utilizes an extensive range of variables for sleep stage classification. However, full polysomnography requires many body-worn sensors, which makes the setup heavy and sleep uncomfortable, imposing a significant burden. In this study, sleep stage classification was performed using the single channel of nasal pressure, dramatically decreasing the complexity of the process; in turn, such improvements could increase the much-needed clinical applicability. Specifically, we propose a deep learning structure consisting of multi-kernel convolutional neural networks and bidirectional long short-term memory for sleep stage classification. Sleep stages of 25 healthy subjects were classified into three classes (wake, rapid eye movement (REM), and non-REM) and four classes (wake, REM, light, and deep sleep) based on nasal pressure. Following leave-one-subject-out cross-validation, the overall 3-class metrics were an accuracy of 0.704, an F1-score of 0.490, and a kappa value of 0.283; the overall 4-class metrics were an accuracy of 0.604, an F1-score of 0.349, and a kappa value of 0.217. These results exceeded those of the four comparative models, including the class-wise F1-scores. This demonstrates the possibility of a sleep stage classification model using only easily applicable and highly practical nasal pressure recordings, which could also be combined with interventions that help treat sleep-related diseases.
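The kappa values reported above are Cohen's kappa, which corrects raw agreement for chance agreement between the predicted and true stage distributions. A minimal NumPy sketch on toy epoch labels (not the study's data):

```python
import numpy as np

def kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    classes = np.unique(np.concatenate([y_true, y_pred]))
    po = np.mean(y_true == y_pred)                              # observed
    pe = sum(np.mean(y_true == c) * np.mean(y_pred == c)        # expected
             for c in classes)
    return (po - pe) / (1 - pe)

# toy 3-class epoch-wise labels (0 = wake, 1 = REM, 2 = non-REM)
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2, 2, 1])
```

Here accuracy is 0.75 but kappa is only 0.6, illustrating why the paper reports kappa alongside accuracy for imbalanced sleep-stage data.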
7
Zhang Y, Li M, Wang H, Zhang M, Xu G. Preparatory movement state enhances premovement EEG representations for brain-computer interfaces. J Neural Eng 2024; 21:036044. [PMID: 38806037 DOI: 10.1088/1741-2552/ad5109]
Abstract
Objective. Motor-related brain-computer interfaces (BCIs) have a broad range of applications, with the detection of premovement intentions being a prominent use case. However, electroencephalography (EEG) features during the premovement phase are not distinctly evident and are susceptible to attentional influences. These limitations impede performance gains in motor-based BCIs. The objective of this study is to establish a premovement BCI encoding paradigm that integrates the preparatory movement state and to validate its feasibility for improving the detection of movement intentions. Methods. Two button tasks were designed to induce in subjects a preparation state for two movement intentions (left and right) based on visual guidance, in contrast to spontaneous premovement. Low-frequency movement-related cortical potentials (MRCPs) and high-frequency event-related desynchronization (ERD) EEG data from 14 subjects were recorded. Extracted features were fused and classified using task-related common spatial patterns (CSP) and CSP algorithms. Differences between prepared and spontaneous premovement were compared in terms of time domain, frequency domain, and classification accuracy. Results. In the time domain, MRCP features reveal that prepared premovement induces lower amplitude and earlier latency over both the contralateral and ipsilateral motor cortex compared to spontaneous premovement, with susceptibility to the dominant hand's influence. In the frequency domain, ERD features indicate that prepared premovement induces lower ERD values bilaterally, and ERD recovery after the button press is fastest. Using the fusion approach, classification accuracy increased from 78.92% for spontaneous premovement to 83.59% for prepared premovement (p < 0.05). Along with the 4.67% improvement in classification accuracy, the standard deviation decreased by 0.95. Significance. The research findings confirm that incorporating a preparatory state into premovement enhances neural representations related to movement. This encoding-enhancement paradigm effectively improves the performance of motor-based BCIs. Additionally, this concept has the potential to broaden the range of decodable movement intentions and related information in motor-related BCIs.
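The CSP step used above can be illustrated with a compact NumPy implementation for the two-class case. This is a textbook sketch, not the authors' code (their task-related CSP variant is omitted); the synthetic trials below merely encode a variance difference between two channels.

```python
import numpy as np

def csp_filters(X1, X2, n_pairs=1):
    """Common spatial patterns for two classes.
    X1, X2: (trials, channels, samples) arrays of band-passed EEG."""
    def mean_cov(X):
        return np.mean([x @ x.T / np.trace(x @ x.T) for x in X], axis=0)
    C1, C2 = mean_cov(X1), mean_cov(X2)
    vals, vecs = np.linalg.eigh(C1 + C2)
    P = vecs @ np.diag(vals ** -0.5) @ vecs.T        # whitening of C1 + C2
    d, B = np.linalg.eigh(P @ C1 @ P.T)              # simultaneous diagonalization
    W = B.T @ P                                       # rows = spatial filters
    order = np.argsort(d)                             # extreme eigenvalues are
    pick = np.r_[order[:n_pairs], order[-n_pairs:]]   # most discriminative
    return W[pick]

def log_var_features(W, X):
    Z = np.einsum("fc,tcs->tfs", W, X)                # spatially filtered trials
    v = Z.var(axis=2)
    return np.log(v / v.sum(axis=1, keepdims=True))   # normalized log-variance

rng = np.random.default_rng(1)
X1 = rng.normal(size=(30, 4, 200)); X1[:, 0, :] *= 3.0   # class 1: channel 0 strong
X2 = rng.normal(size=(30, 4, 200)); X2[:, 1, :] *= 3.0   # class 2: channel 1 strong
W = csp_filters(X1, X2)
F1, F2 = log_var_features(W, X1), log_var_features(W, X2)
```

The two retained filters produce log-variance features that separate the classes almost perfectly on this synthetic data; the paper fuses such ERD-band features with MRCP features before classification.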
Affiliation(s)
- Yuxin Zhang
- School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin 300130, People's Republic of China
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Mengfan Li
- School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin 300130, People's Republic of China
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Haili Wang
- School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin 300130, People's Republic of China
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Mingyu Zhang
- School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin 300130, People's Republic of China
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Guizhi Xu
- School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin 300130, People's Republic of China
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300130, People's Republic of China
- Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin 300130, People's Republic of China
8
Kilmarx J, Tashev I, del R. Millán J, Sulzer J, Lewis-Peacock JA. Evaluating the Feasibility of Visual Imagery for an EEG-Based Brain-Computer Interface. IEEE Trans Neural Syst Rehabil Eng 2024; 32:2209-2219. [PMID: 38843055 PMCID: PMC11249027 DOI: 10.1109/tnsre.2024.3410870]
Abstract
Visual imagery, or the mental simulation of visual information from memory, could serve as an effective control paradigm for a brain-computer interface (BCI) due to its ability to directly convey the user's intention through many natural ways of envisioning an intended action. However, initial investigations into using visual imagery as a BCI control strategy have been unable to fully evaluate the capabilities of true spontaneous visual mental imagery. One major limitation of these prior works is that the target image is typically displayed immediately preceding the imagery period. This paradigm does not capture spontaneous mental imagery as would be necessary in an actual BCI application, but rather something more akin to short-term retention in visual working memory. Results from the present study show that short-term visual imagery following the presentation of a specific target image provides a stronger, more easily classifiable neural signature in EEG than spontaneous visual imagery from long-term memory following an auditory cue for the image. We also show that short-term visual imagery and visual perception share commonalities in their most predictive electrodes and spectral features; however, visual imagery received greater influence from frontal electrodes, whereas perception was mostly confined to occipital electrodes. This suggests that visual perception is primarily driven by sensory information, whereas visual imagery has greater contributions from areas associated with memory and attention. This work provides the first direct comparison of short-term and long-term visual imagery tasks and offers greater insight into the feasibility of using visual imagery as a BCI control strategy.
Affiliation(s)
- José del R. Millán
- Department of Electrical and Computer Engineering and Department of Neurology, University of Texas at Austin, Austin, TX, USA
- James Sulzer
- Department of Physical Medicine and Rehabilitation, MetroHealth Medical Center and Case Western Reserve University, Cleveland, OH, USA
9
Li C, Liu Y, Li J, Miao Y, Liu J, Song L. Decoding Bilingual EEG Signals With Complex Semantics Using Adaptive Graph Attention Convolutional Network. IEEE Trans Neural Syst Rehabil Eng 2024; 32:249-258. [PMID: 38163312 DOI: 10.1109/tnsre.2023.3348981]
Abstract
Decoding neural signals of silent reading with brain-computer interface (BCI) techniques offers a fast and intuitive communication method for patients with severe aphasia. Electroencephalogram (EEG) acquisition is convenient, easily wearable, and has high temporal resolution. However, existing EEG-based decoding units primarily concentrate on individual words due to the low signal-to-noise ratio, rendering them insufficient for facilitating daily communication; decoding at the word level is less efficient than decoding at the phrase or sentence level. Furthermore, with the prevalence of multilingualism, decoding EEG signals with complex semantics across multiple languages is urgent and necessary. To the best of our knowledge, there is currently no research on decoding EEG signals during silent reading of complex semantics, let alone on silent reading EEG signals with complex semantics in a bilingual setting; moreover, the feasibility of decoding such signals remains to be investigated. In this work, we collected silent reading EEG signals for 9 English Phrases (EP), 7 English Sentences (ES), 10 Chinese Phrases (CP), and 7 Chinese Sentences (CS) from a single subject over 26 days. We propose a novel Adaptive Graph Attention Convolution Network (AGACN) for classification. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods, achieving the highest classification accuracies of 54.70%, 62.26%, 44.55%, and 57.14% for silent reading EEG signals of EP, ES, CP, and CS, respectively. Moreover, our results prove the feasibility of decoding EEG signals with complex semantics. This work will aid aphasic patients in achieving regular communication while providing novel ideas for neural signal decoding research.
10
Jeong JH, Cho JH, Lee BH, Lee SW. Real-Time Deep Neurolinguistic Learning Enhances Noninvasive Neural Language Decoding for Brain-Machine Interaction. IEEE Trans Cybern 2023; 53:7469-7482. [PMID: 36251899 DOI: 10.1109/tcyb.2022.3211694]
Abstract
Electroencephalogram (EEG)-based brain-machine interfaces (BMIs) have been utilized to help patients regain motor function and have recently been validated for use by healthy people because of their ability to directly decipher human intentions. In particular, neurolinguistic research using EEG has been investigated as an intuitive and naturalistic communication tool between humans and machines. In this study, neural languages based on speech imagery were decoded directly from the human mind using the proposed deep neurolinguistic learning. Through real-time experiments, we evaluated whether BMI-based cooperative tasks between multiple users could be accomplished using a variety of neural languages. We successfully demonstrated a BMI system that supports a variety of scenarios, such as essential activity, collaborative play, and emotional interaction. This outcome presents a novel BMI frontier that can interact at the level of human-like intelligence in real time and extends the boundaries of the communication paradigm.
11
Moon J, Chau T. Online Ternary Classification of Covert Speech by Leveraging the Passive Perception of Speech. Int J Neural Syst 2023; 33:2350048. [PMID: 37522623 DOI: 10.1142/s012906572350048x]
Abstract
Brain-computer interfaces (BCIs) provide communicative alternatives to those without functional speech. Covert speech (CS)-based BCIs enable communication simply by thinking of words and thus have intuitive appeal. However, an elusive barrier to their clinical translation is the collection of voluminous examples of high-quality CS signals, as iteratively rehearsing words for long durations is mentally fatiguing. Research on CS and speech perception (SP) identifies common spatiotemporal patterns in their respective electroencephalographic (EEG) signals, pointing towards shared encoding mechanisms. The goal of this study was to investigate whether a model that leverages the signal similarities between SP and CS can differentiate speech-related EEG signals online. Ten participants completed a dyadic protocol in which, on each trial, they listened to a randomly selected word and then mentally rehearsed it. In the offline sessions, eight words were presented to participants. For the subsequent online sessions, the two most distinct words (most separable in terms of their EEG signals) were chosen to form a ternary classification problem (two words and rest). The model comprised a functional mapping derived from SP and CS signals of the same speech token, with features extracted via a Riemannian approach. An average ternary online accuracy of 75.3% (60% chance level) was achieved across participants, with individual accuracies as high as 93%. Moreover, we observed that the signal-to-noise ratio (SNR) of CS signals was enhanced by perception-covert modeling according to the level of high-frequency ([Formula: see text]-band) correspondence between CS and SP. These findings may lead to less burdensome data collection for training speech BCIs, which could eventually enhance the rate at which the vocabulary can grow.
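A "Riemannian approach" to EEG features typically means mapping trial covariance matrices into the tangent space at a reference point on the manifold of symmetric positive-definite matrices. The sketch below is a generic illustration, not the authors' exact variant; it uses the arithmetic mean as the reference, whereas a full pipeline would usually use the Riemannian (geometric) mean.

```python
import numpy as np

def _logm_spd(C):
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(np.log(vals)) @ vecs.T

def _sqrtm_inv_spd(C):
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def tangent_space_features(covs, C_ref):
    """Log-map SPD covariance matrices to the tangent space at C_ref."""
    iS = _sqrtm_inv_spd(C_ref)
    feats = []
    for C in covs:
        M = iS @ C @ iS
        M = (M + M.T) / 2                # symmetrize for numerical safety
        L = _logm_spd(M)
        iu = np.triu_indices(len(L))
        feats.append(L[iu])              # upper triangle as the feature vector
    return np.array(feats)

rng = np.random.default_rng(0)
def _rand_spd(n=4):                      # synthetic 4-channel trial covariances
    A = rng.normal(size=(n, n))
    return A @ A.T + n * np.eye(n)

covs = [_rand_spd() for _ in range(5)]
C_ref = np.mean(covs, axis=0)            # assumed reference (arithmetic mean)
feats = tangent_space_features(covs, C_ref)
```

The resulting Euclidean vectors can be fed to an ordinary classifier; by construction, the reference point itself maps to the zero vector.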
Affiliation(s)
- Jae Moon
- Institute of Biomedical Engineering, University of Toronto, Holland Bloorview Kid's Rehabilitation Hospital, Toronto, Ontario, Canada
- Tom Chau
- Institute of Biomedical Engineering, University of Toronto, Holland Bloorview Kid's Rehabilitation Hospital, Toronto, Ontario, Canada
12
Park HJ, Lee B. Multiclass classification of imagined speech EEG using noise-assisted multivariate empirical mode decomposition and multireceptive field convolutional neural network. Front Hum Neurosci 2023; 17:1186594. [PMID: 37645689 PMCID: PMC10461632 DOI: 10.3389/fnhum.2023.1186594]
Abstract
Introduction: In this study, we classified electroencephalography (EEG) data of imagined speech using signal decomposition and a multireceptive convolutional neural network. Imagined speech EEG for five vowels /a/, /e/, /i/, /o/, and /u/, and mute (rest), was obtained from ten study participants. Materials and methods: First, two different signal decomposition methods were applied for comparison: noise-assisted multivariate empirical mode decomposition and wavelet packet decomposition. Six statistical features were calculated from the EEG of each of the eight decomposed sub-frequency bands. Next, all features obtained from each channel of a trial were vectorized and used as the input vector of the classifiers. Lastly, the EEG was classified using a multireceptive field convolutional neural network and several other classifiers for comparison. Results: We achieved an average classification rate of 73.09% and up to 80.41% in a multiclass (six-class) setup (chance: 16.67%), with significant improvements over the various comparison classifiers (p-value < 0.05). Frequency sub-band analysis showed that the high-frequency band regions and the lowest-frequency band region contain more information about the imagined vowel EEG data. The misclassification and classification rates of each imagined vowel were analyzed through a confusion matrix. Discussion: Imagined speech EEG can be classified successfully using the proposed signal decomposition method and a convolutional neural network. The proposed classification method can contribute to developing a practical imagined-speech-based brain-computer interface system.
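The per-band statistical features can be sketched as follows. The specific six statistics chosen here (mean, standard deviation, skewness, kurtosis, RMS, and zero-crossing count) are an assumption, since the abstract does not name them; the decomposition step itself is omitted.

```python
import numpy as np

def six_stats(x):
    """Six per-band statistics (an assumed set; the paper's six may differ)."""
    mu, sd = x.mean(), x.std()
    skew = np.mean(((x - mu) / sd) ** 3)
    kurt = np.mean(((x - mu) / sd) ** 4)
    rms = np.sqrt(np.mean(x ** 2))
    zc = np.count_nonzero(np.signbit(x[:-1]) != np.signbit(x[1:]))
    return np.array([mu, sd, skew, kurt, rms, zc])

def feature_vector(subbands):
    """Concatenate the six statistics over all decomposed sub-bands."""
    return np.concatenate([six_stats(b) for b in subbands])

rng = np.random.default_rng(0)
subbands = rng.normal(size=(8, 512))   # stand-in for 8 NA-MEMD/WPD sub-bands
fv = feature_vector(subbands)          # 8 bands x 6 stats = 48 values/channel
```

Stacking this 48-element vector across all channels of a trial yields the classifier input described above.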
Affiliation(s)
- Hyeong-jun Park: Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
- Boreom Lee: Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea; AI Graduate School, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
13
Li M, Pun SH, Chen F. Impacts of Cortical Regions on EEG-based Classification of Lexical Tones and Vowels in Spoken Speech. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-4. [PMID: 38083036] [DOI: 10.1109/embc40787.2023.10340428]
Abstract
Speech impairment is one of the most serious problems for patients with communication disorders, e.g., stroke survivors. Brain-computer interface (BCI) systems have shown potential for restoring control or rehabilitating neurological damage in speech production. Studying the contributions of different cortical regions is essential for improving the performance of speech-based BCI systems. This work aimed to explore the impacts of different speech-related cortical regions on the electroencephalogram (EEG) based classification of seventy spoken Mandarin monosyllables carrying four vowels and four lexical tones. Several speech production-related electrode groupings were studied: Broca's and Wernicke's areas, the auditory cortex, motor cortex, prefrontal cortex, and sensory cortex, and the left, right, and whole brain. Following previous studies in which EEG signals were collected from ten subjects during Mandarin speech production, features were extracted by the Riemannian manifold method, and linear discriminant analysis (LDA) was used as the classifier for vowels and lexical tones. The results showed that using electrodes from the whole brain yielded the best performance: 48.5% for lexical tones and 70.0% for vowels. Vowel classification under Broca's and Wernicke's areas, the auditory cortex, or the prefrontal cortex was higher than under the motor or sensory cortex. No such differences were observed in the lexical tone classification task.
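The Riemannian feature-extraction step can be sketched as below. This is a hedged illustration rather than the paper's implementation: it uses a log-Euclidean simplification of the tangent-space mapping (trial covariance, then matrix logarithm, then upper-triangle vectorization), and the LDA classification stage is omitted.

```python
import numpy as np

def logm_spd(C):
    """Matrix logarithm of a symmetric positive-definite matrix
    via eigendecomposition: V diag(log w) V^T."""
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def riemann_feature(trial, eps=1e-6):
    """Map one EEG trial (channels x samples) to a Euclidean feature
    vector: regularized covariance -> matrix log -> upper triangle."""
    C = np.cov(trial) + eps * np.eye(trial.shape[0])
    L = logm_spd(C)
    iu = np.triu_indices(C.shape[0])
    return L[iu]

rng = np.random.default_rng(1)
trial = rng.standard_normal((8, 500))     # 8 electrodes, 500 samples
f = riemann_feature(trial)
print(f.shape)                            # 8*9/2 = (36,)
```

In the paper such vectors feed an LDA classifier; restricting the rows of `trial` to the electrodes over a given cortical region reproduces the region-wise comparison.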
14
Wilson H, Golbabaee M, Proulx MJ, Charles S, O'Neill E. EEG-based BCI Dataset of Semantic Concepts for Imagination and Perception Tasks. Sci Data 2023; 10:386. [PMID: 37322034] [PMCID: PMC10272218] [DOI: 10.1038/s41597-023-02287-9]
Abstract
Electroencephalography (EEG) is a widely used neuroimaging technique in brain-computer interfaces (BCIs) due to its non-invasive nature, accessibility, and high temporal resolution. A range of input representations has been explored for BCIs. The same semantic meaning can be conveyed in different representations, such as visual (orthographic and pictorial) and auditory (spoken words), and these stimuli can be either imagined or perceived by the BCI user. Open source EEG datasets for imagined visual content are scarce, and to our knowledge there are no open source EEG datasets of semantics captured through multiple sensory modalities for both perceived and imagined content. Here we present an open source multisensory imagination and perception dataset from twelve participants, acquired with a 124-channel EEG system. The aim is for the dataset to be open for purposes such as BCI-related decoding and for better understanding of the neural mechanisms underlying perception and imagination across sensory modalities when the semantic category is held constant.
Affiliation(s)
- Holly Wilson: Department of Computer Science, University of Bath, Bath, BA2 7AY, UK
- Mohammad Golbabaee: Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1TW, UK
- Stephen Charles: Department of Computer Science, University of Bath, Bath, BA2 7AY, UK
- Eamonn O'Neill: Department of Computer Science, University of Bath, Bath, BA2 7AY, UK
15
Shah U, Alzubaidi M, Mohsen F, Abd-Alrazaq A, Alam T, Househ M. The Role of Artificial Intelligence in Decoding Speech from EEG Signals: A Scoping Review. Sensors (Basel) 2022; 22:6975. [PMID: 36146323] [PMCID: PMC9505262] [DOI: 10.3390/s22186975]
Abstract
Background: Brain traumas, mental disorders, and vocal abuse can result in permanent or temporary speech impairment, significantly reducing quality of life and occasionally resulting in social isolation. Brain-computer interfaces (BCIs) can enable people with speech impairments or paralysis to communicate with their surroundings via brain signals. EEG-based BCIs have therefore received significant attention in the last two decades for multiple reasons: (i) clinical research has yielded detailed knowledge of EEG signals, (ii) EEG devices are inexpensive, and (iii) the technology has applications in medical and social fields. Objective: This study explores the existing literature and summarizes EEG data acquisition, feature extraction, and artificial intelligence (AI) techniques for decoding speech from brain signals. Method: We followed the PRISMA-ScR guidelines to conduct this scoping review. We searched six electronic databases: PubMed, IEEE Xplore, the ACM Digital Library, Scopus, arXiv, and Google Scholar. We carefully selected search terms based on the target intervention (i.e., imagined speech and AI) and target data (EEG signals), and some of the search terms were derived from previous reviews. The study selection process was carried out in three phases: study identification, study selection, and data extraction. Two reviewers independently carried out study selection and data extraction, and a narrative approach was adopted to synthesize the extracted data. Results: A total of 263 studies were evaluated, of which 34 met the eligibility criteria for inclusion in this review. We found 64-electrode EEG devices to be the most widely used in the included studies. The most common signal normalization and feature extraction methods were band-pass filtering and wavelet-based feature extraction. We categorized the studies by AI technique, such as machine learning (ML) and deep learning (DL). The most prominent ML algorithm was the support vector machine, and the most prominent DL algorithm was the convolutional neural network. Conclusions: EEG-based BCI is a viable technology that can enable people with severe or temporary voice impairment to communicate with the world directly from their brain. However, the development of BCI technology is still in its infancy.
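As a concrete illustration of the band-pass preprocessing the review identifies as most common, here is a minimal zero-phase filter. The FFT mask below is a stand-in for the Butterworth designs most studies employ, and the 1-40 Hz cut-offs are illustrative assumptions.

```python
import numpy as np

def fft_bandpass(x, fs, lo, hi):
    """Zero-phase band-pass via FFT masking -- a minimal stand-in for the
    Butterworth band-pass filters typically used in the reviewed studies."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=x.size)

fs = 250.0
t = np.arange(fs * 2) / fs
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)   # 10 Hz + 60 Hz line noise
y = fft_bandpass(x, fs, 1.0, 40.0)        # keep the EEG band, drop 60 Hz

# the 60 Hz component is strongly attenuated while 10 Hz survives
```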
Affiliation(s)
- Uzair Shah: College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Mahmood Alzubaidi: College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Farida Mohsen: College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Alaa Abd-Alrazaq: AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha P.O. Box 34110, Qatar
- Tanvir Alam: College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Mowafa Househ: College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
16
Jeong JH, Cho JH, Lee YE, Lee SH, Shin GH, Kweon YS, Millán JDR, Müller KR, Lee SW. 2020 International brain-computer interface competition: A review. Front Hum Neurosci 2022; 16:898300. [PMID: 35937679] [PMCID: PMC9354666] [DOI: 10.3389/fnhum.2022.898300]
Abstract
The brain-computer interface (BCI) has been investigated as a communication tool between the brain and external devices, and BCIs have been extended beyond communication and control over the years. The 2020 international BCI competition aimed to provide high-quality, openly accessible neuroscientific data that could be used to evaluate the current degree of technical advancement in BCI. Although a variety of challenges remain for future BCI advances, we discuss some of the more recent application directions: (i) few-shot EEG learning, (ii) micro-sleep detection, (iii) imagined speech decoding, (iv) cross-session classification, and (v) EEG (+ ear-EEG) detection in an ambulatory environment. Not only scientists from the BCI field but also scholars with a broad variety of backgrounds and nationalities participated in the competition to address these challenges. Each dataset was prepared and separated into three subsets, released to the competitors as training and validation sets followed by a test set. Remarkable BCI advances were identified through the 2020 competition, indicating some trends of interest to BCI researchers.
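The three-way data release described above can be mimicked with a small helper. This is an illustrative sketch only; the trial counts and the 180/60/60 split below are assumptions, not the competition's actual ratios.

```python
import numpy as np

def split_trials(n_trials, n_train, n_val, seed=0):
    """Shuffle trial indices and split them into disjoint
    train / validation / test index sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_trials)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train, val, test = split_trials(300, 180, 60)
print(len(train), len(val), len(test))    # 180 60 60
```

Releasing `train`/`val` first and holding back `test` reproduces the competition's evaluation protocol in miniature.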
Affiliation(s)
- Ji-Hoon Jeong: School of Computer Science, Chungbuk National University, Cheongju, South Korea
- Jeong-Hyun Cho: Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Young-Eun Lee: Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Seo-Hyun Lee: Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Gi-Hwan Shin: Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- Young-Seok Kweon: Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
- José del R. Millán: Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, United States
- Klaus-Robert Müller: Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea; Machine Learning Group, Department of Computer Science, Berlin Institute of Technology, Berlin, Germany; Max Planck Institute for Informatics, Saarbrücken, Germany; Department of Artificial Intelligence, Korea University, Seoul, South Korea
- Seong-Whan Lee: Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea; Department of Artificial Intelligence, Korea University, Seoul, South Korea
17
Lee KW, Lee DH, Kim SJ, Lee SW. Decoding Neural Correlation of Language-Specific Imagined Speech using EEG Signals. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:1977-1980. [PMID: 36086641] [DOI: 10.1109/embc48229.2022.9871721]
Abstract
Speech impairments due to cerebral lesions and degenerative disorders can be devastating. For humans with severe speech deficits, imagined speech has been a promising brain-computer interface paradigm for reconstructing the neural signals of speech production. However, studies in the EEG-based imagined speech domain still face limitations due to high variability in spatial and temporal information and a low signal-to-noise ratio. In this paper, we investigated the neural signals of two groups of native speakers performing imagined speech tasks in two languages, English and Chinese. Our assumption was that English, a non-tonal and phonogram-based language, would show spectral differences in neural computation compared to Chinese, a tonal and ideogram-based language. The results showed a significant difference in relative power spectral density between English and Chinese in specific frequency band groups. Also, the spatial pattern for Chinese native speakers in the theta band was distinctive during the imagination task. Hence, this paper suggests key spectral and spatial features of language-specific word imagination for decoding the neural signals of speech. Clinical Relevance: Imagined speech studies support the development of assistive communication technology, especially for patients with speech disorders such as aphasia due to brain damage. This study identifies significant spectral features by analyzing cross-language differences in EEG-based imagined speech using two widely used languages.
18
Cooney C, Folli R, Coyle D. A bimodal deep learning architecture for EEG-fNIRS decoding of overt and imagined speech. IEEE Trans Biomed Eng 2021; 69:1983-1994. [PMID: 34874850] [DOI: 10.1109/tbme.2021.3132861]
Abstract
OBJECTIVE: Brain-computer interface (BCI) studies are increasingly leveraging different attributes of multiple signal modalities simultaneously. Bimodal data acquisition protocols combining the temporal resolution of electroencephalography (EEG) with the spatial resolution of functional near-infrared spectroscopy (fNIRS) require novel approaches to decoding. METHODS: We present an EEG-fNIRS hybrid BCI that employs a new bimodal deep neural network architecture consisting of two convolutional sub-networks (subnets) to decode overt and imagined speech. Features from each subnet are fused before further feature extraction and classification. Nineteen participants performed overt and imagined speech in a novel cue-based paradigm enabling investigation of stimulus and linguistic effects on decoding. RESULTS: Using the hybrid approach, classification accuracies (46.31% and 34.29% for overt and imagined speech, respectively; chance: 25%) indicated a significant improvement over EEG used independently for imagined speech (p = 0.020), while tending towards significance for overt speech (p = 0.098). In comparison with fNIRS, significant improvements for both speech types were achieved with bimodal decoding (p < 0.001). There was a mean difference of about 12.02% between overt and imagined speech, with accuracies as high as 87.18% and 53%, respectively. Deeper subnets enhanced performance, while stimulus type affected overt and imagined speech in significantly different ways. CONCLUSION: The bimodal approach was a significant improvement on unimodal results for several tasks. The results indicate the potential of multimodal deep learning for enhancing neural signal decoding. SIGNIFICANCE: This novel architecture can be used to enhance speech decoding from bimodal neural signals.
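The fuse-before-classify idea can be caricatured in a few lines of NumPy. This is a toy sketch, not the paper's network: each "subnet" is reduced to a single convolution, a ReLU, and global average pooling, and the random kernels and linear head are placeholders for learned weights.

```python
import numpy as np

rng = np.random.default_rng(2)

def conv_subnet(x, kernel):
    """Toy 'subnet': 1-D temporal convolution, ReLU, global average
    pooling. Stands in for one convolutional sub-network."""
    y = np.convolve(x, kernel, mode="valid")
    return np.maximum(y, 0.0).mean(keepdims=True)

eeg = rng.standard_normal(1000)           # fast signal, high temporal resolution
fnirs = rng.standard_normal(100)          # slow haemodynamic signal

k_eeg = rng.standard_normal(25)           # placeholder learned kernels
k_fnirs = rng.standard_normal(5)

# modality-specific feature extraction, then fusion by concatenation
fused = np.concatenate([conv_subnet(eeg, k_eeg), conv_subnet(fnirs, k_fnirs)])
logits = fused @ rng.standard_normal((2, 4))   # linear head, 4 classes
print(fused.shape, logits.shape)
```

The key design choice mirrored here is that each modality keeps its own feature extractor matched to its sampling rate, and the two feature vectors are only concatenated afterwards.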
19
Sarmiento LC, Villamizar S, López O, Collazos AC, Sarmiento J, Rodríguez JB. Recognition of EEG Signals from Imagined Vowels Using Deep Learning Methods. Sensors (Basel) 2021; 21:6503. [PMID: 34640824] [PMCID: PMC8512781] [DOI: 10.3390/s21196503]
Abstract
The use of imagined speech with electroencephalographic (EEG) signals is a promising field of brain-computer interfaces (BCIs) that seeks communication between language-related areas of the cerebral cortex and devices or machines. However, the complexity of this brain process makes the analysis and classification of this type of signal a relevant research topic. The goals of this study were: to develop a new Deep Learning (DL) algorithm, referred to as CNNeeg1-1, to recognize EEG signals in imagined vowel tasks; to create an imagined speech database of 50 subjects performing imagined Spanish vowels (/a/,/e/,/i/,/o/,/u/); and to contrast the performance of CNNeeg1-1 with the DL benchmark algorithms Shallow CNN and EEGNet using an open access database (BD1) and the newly developed database (BD2). A mixed analysis of variance was conducted to assess intra-subject and inter-subject training of the proposed algorithms. The results show that, for intra-subject training, the best performance among Shallow CNN, EEGNet, and CNNeeg1-1 in classifying imagined vowels (/a/,/e/,/i/,/o/,/u/) was exhibited by CNNeeg1-1, with an accuracy of 65.62% for the BD1 database and 85.66% for the BD2 database.
Affiliation(s)
- Luis Carlos Sarmiento: Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Sergio Villamizar: Department of Electrical and Electronics Engineering, School of Engineering, Universidad Nacional de Colombia, Bogotá 111321, Colombia
- Omar López: Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Ana Claros Collazos: Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Jhon Sarmiento: Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Jan Bacca Rodríguez: Department of Electrical and Electronics Engineering, School of Engineering, Universidad Nacional de Colombia, Bogotá 111321, Colombia
20
Shi R, Zhao Y, Cao Z, Liu C, Kang Y, Zhang J. Categorizing objects from MEG signals using EEGNet. Cogn Neurodyn 2021; 16:365-377. [PMID: 35401863] [PMCID: PMC8934895] [DOI: 10.1007/s11571-021-09717-7]
Abstract
Magnetoencephalography (MEG) signals have demonstrated practical utility for decoding human brain activity. Current neural decoding studies have made great progress in building subject-wise decoding models that extract and discriminate the temporal and spatial features in neural signals. In this paper, we used a compact convolutional neural network, EEGNet, to build a common decoder across subjects, which deciphered the categories of objects (faces, tools, animals, and scenes) from MEG data. This study investigated the influence of the spatiotemporal structure of the MEG input on EEGNet's classification performance. Furthermore, we replaced EEGNet's convolution layers with two sets of parallel convolution structures to extract spatial and temporal features simultaneously. Our results showed that the organization of the MEG data fed into EEGNet affects classification accuracy, and that the parallel convolution structures are beneficial for extracting and fusing spatial and temporal MEG features. The classification accuracy demonstrated that EEGNet succeeds in building a common decoder model across subjects and outperforms several state-of-the-art feature-fusion methods.
Affiliation(s)
- Ran Shi: School of Artificial Intelligence, Beijing Normal University, Beijing, 100875, China
- Yanyu Zhao: School of Artificial Intelligence, Beijing Normal University, Beijing, 100875, China
- Zhiyuan Cao: School of Artificial Intelligence, Beijing Normal University, Beijing, 100875, China
- Chunyu Liu: School of Artificial Intelligence, Beijing Normal University, Beijing, 100875, China
- Yi Kang: School of Artificial Intelligence, Beijing Normal University, Beijing, 100875, China
- Jiacai Zhang: School of Artificial Intelligence, Beijing Normal University, Beijing, 100875, China; Engineering Research Center of Intelligent Technology and Educational Application, Ministry of Education, Beijing, 100875, China
21
Li F, Chao W, Li Y, Fu B, Ji Y, Wu H, Shi G. Decoding imagined speech from EEG signals using hybrid-scale spatial-temporal dilated convolution network. J Neural Eng 2021; 18. [PMID: 34256357] [DOI: 10.1088/1741-2552/ac13c0]
Abstract
Objective: Directly decoding imagined speech from electroencephalogram (EEG) signals has attracted much interest in brain-computer interface applications, because it provides a natural and intuitive communication method for locked-in patients. Several methods have been applied to imagined speech decoding, but how to construct spatial-temporal dependencies and capture long-range contextual cues in EEG signals for better decoding remains an open question. Approach: In this study, we propose a novel model called the hybrid-scale spatial-temporal dilated convolution network (HS-STDCN) for EEG-based imagined speech recognition. HS-STDCN integrates feature learning from temporal and spatial information into a unified end-to-end model. To characterize the temporal dependencies of the EEG sequences, we adopted a hybrid-scale temporal convolution layer to capture temporal information at multiple levels. A depthwise spatial convolution layer was then designed to construct intrinsic spatial relationships of the EEG electrodes, producing a spatial-temporal representation of the input EEG data. Based on this representation, dilated convolution layers were further employed to learn long-range discriminative features for the final classification. Main results: To evaluate the proposed method, we compared HS-STDCN with other existing methods on our collected dataset. HS-STDCN achieved an average classification accuracy of 54.31% for decoding eight imagined words, significantly better than the other methods at a significance level of 0.05. Significance: The proposed HS-STDCN model provides an effective approach to exploiting both the temporal and spatial dependencies of the input EEG signals for imagined speech recognition. We also visualized word semantic differences to analyze the impact of word semantics on imagined speech recognition, investigated the important regions in the decoding process, and explored the use of fewer electrodes to achieve comparable performance.
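The long-range-context claim for dilated convolutions is easy to verify numerically. The sketch below is illustrative, not HS-STDCN itself: stacking kernel-size-3 layers with dilations 1, 2, and 4 gives each output a receptive field of 15 input samples while every layer has only 3 taps.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """1-D dilated convolution without padding. With kernel size k and
    dilation d, each output sees a span of (k-1)*d + 1 input samples."""
    k = kernel.size
    span = (k - 1) * dilation + 1
    n_out = x.size - span + 1
    taps = np.arange(k) * dilation
    return np.array([x[i + taps] @ kernel for i in range(n_out)])

x = np.arange(32, dtype=float)
kernel = np.array([1.0, 1.0, 1.0])

# stacking dilations 1, 2, 4 grows the receptive field exponentially,
# which is how dilated layers capture long-range context cheaply
y = x
for d in (1, 2, 4):
    y = dilated_conv1d(y, kernel, d)
print(y.size)   # 32 - (2 + 4 + 8) = 18
```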
Affiliation(s)
- Fu Li, Weibing Chao, Yang Li, Boxun Fu, Youshuo Ji, Hao Wu, Guangming Shi: Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
22
Lee DY, Lee M, Lee SW. Decoding Imagined Speech Based on Deep Metric Learning for Intuitive BCI Communication. IEEE Trans Neural Syst Rehabil Eng 2021; 29:1363-1374. [PMID: 34255630] [DOI: 10.1109/tnsre.2021.3096874]
Abstract
Imagined speech is a highly promising paradigm due to its intuitive application and multiclass scalability in the field of brain-computer interfaces. However, optimal feature extraction methods and classifiers have not yet been established, and retraining still requires a large number of trials when new classes are added. The aims of this study were (i) to increase the classification performance for imagined speech and (ii) to apply a new class to a pretrained classifier with a small number of trials. We propose a novel framework based on deep metric learning that learns distances by comparing the similarity between samples. We also applied instantaneous frequency and spectral entropy, features used for speech signals, to electroencephalography signals recorded during imagined speech. The method was evaluated on two public datasets (six-class Coretto DB and five-class BCI Competition DB). We achieved a six-class accuracy of 45.00 ± 3.13% and a five-class accuracy of 48.10 ± 3.68% using the proposed method, which significantly outperformed state-of-the-art methods. Additionally, we verified that a new class could be detected through incremental learning with a small number of trials, with average accuracies of 44.50 ± 0.26% for Coretto DB and 47.12 ± 0.27% for BCI Competition DB, similar to the baseline accuracy without incremental learning. Our results show that accuracy can be greatly improved even with a small number of trials by selecting appropriate features of imagined speech. The proposed framework could be used directly to help construct an extensible, intuitive communication system based on brain-computer interfaces.
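The two speech-derived features named above can be computed directly from the power spectrum and the analytic signal. The following is a minimal NumPy sketch under the usual textbook definitions, not the authors' exact feature code.

```python
import numpy as np

def spectral_entropy(x):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    psd = np.abs(np.fft.rfft(x)) ** 2
    p = psd / psd.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz from the analytic signal,
    built with an FFT-based Hilbert transform."""
    n = x.size
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    analytic = np.fft.ifft(np.fft.fft(x) * h)
    phase = np.unwrap(np.angle(analytic))
    return np.diff(phase) * fs / (2 * np.pi)

fs = 256.0
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 12 * t)            # a pure 12 Hz oscillation

se = spectral_entropy(x)                  # low for a narrowband signal
inst_f = instantaneous_frequency(x, fs)
print(round(float(np.median(inst_f)), 1)) # ~12.0
```

A narrowband signal gives near-zero spectral entropy and a flat instantaneous-frequency track; broadband imagined-speech EEG yields higher, class-dependent values, which is what makes these usable as features.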
23
Ko W, Jeon E, Jeong S, Suk HI. Multi-Scale Neural Network for EEG Representation Learning in BCI. IEEE Comput Intell Mag 2021. [DOI: 10.1109/mci.2021.3061875]