1
Zhang X, Landsness EC, Miao H, Chen W, Tang MJ, Brier LM, Culver JP, Lee JM, Anastasio MA. Attention-based CNN-BiLSTM for sleep state classification of spatiotemporal wide-field calcium imaging data. J Neurosci Methods 2024; 411:110250. [PMID: 39151658] [DOI: 10.1016/j.jneumeth.2024.110250]
Abstract
BACKGROUND: Wide-field calcium imaging (WFCI) with genetically encoded calcium indicators allows spatiotemporal recording of neuronal activity in mice. When applied to the study of sleep, WFCI data are manually scored into the sleep states of wakefulness, non-REM (NREM) and REM using adjunct EEG and EMG recordings. However, this process is time-consuming, invasive and often suffers from low inter- and intra-rater reliability. An automated sleep state classification method that operates directly on spatiotemporal WFCI data is therefore desirable.
NEW METHOD: A hybrid network architecture was proposed to classify WFCI data into wakefulness, NREM and REM sleep, consisting of a convolutional neural network (CNN) that extracts spatial features from image frames and a bidirectional long short-term memory network (BiLSTM) with an attention mechanism that identifies temporal dependencies among time points.
RESULTS: Sleep states were classified with an accuracy of 84% and a Cohen's κ of 0.64. Gradient-weighted class activation maps revealed that the frontal region of the cortex carries more importance when classifying WFCI data into NREM sleep, while the posterior area contributes most to the identification of wakefulness. The attention scores indicated that the network focuses on short- and long-range temporal dependencies in a state-specific manner.
COMPARISON WITH EXISTING METHOD: On a held-out, repeated 3-hour WFCI recording, the CNN-BiLSTM achieved a κ of 0.67, comparable to the κ of 0.65 obtained by human EEG/EMG-based scoring.
CONCLUSIONS: The CNN-BiLSTM effectively classifies sleep states from spatiotemporal WFCI data and will enable broader application of WFCI in sleep research.
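As a rough illustration of this kind of hybrid model (not the authors' published implementation; the frame size, layer widths and sequence length below are assumptions), a minimal PyTorch sketch of a CNN-BiLSTM with attention over a WFCI frame sequence could look like:

# Minimal sketch, assuming illustrative shapes: a CNN encodes each WFCI frame,
# a bidirectional LSTM models the frame sequence, and a simple attention layer
# pools the time steps before a 3-way softmax (wake / NREM / REM).
import torch
import torch.nn as nn

class CNNBiLSTMAttention(nn.Module):
    def __init__(self, n_classes=3, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(                      # per-frame spatial features
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())     # -> (batch*time, 32*4*4)
        self.bilstm = nn.LSTM(32 * 4 * 4, hidden, batch_first=True,
                              bidirectional=True)      # temporal dependencies
        self.attn = nn.Linear(2 * hidden, 1)           # scalar score per time step
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                              # x: (batch, time, 1, H, W)
        b, t = x.shape[:2]
        f = self.cnn(x.flatten(0, 1)).view(b, t, -1)   # per-frame embeddings
        h, _ = self.bilstm(f)                          # (b, t, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)         # attention over time
        ctx = (w * h).sum(dim=1)                       # weighted temporal pooling
        return self.head(ctx)                          # class logits

logits = CNNBiLSTMAttention()(torch.randn(2, 10, 1, 64, 64))  # e.g. a 10-frame epoch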
Affiliation(s)
- Xiaohui Zhang
- Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Eric C Landsness
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Hanyang Miao
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Wei Chen
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Michelle J Tang
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Lindsey M Brier
- Department of Radiology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Joseph P Culver
- Department of Radiology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Biomedical Engineering, Washington University School of Engineering, St. Louis, MO 63130, USA; Department of Electrical and Systems Engineering, Washington University School of Engineering, St. Louis, MO 63130, USA; Department of Physics, Washington University School of Arts and Science, St. Louis, MO 63130, USA
- Jin-Moo Lee
- Department of Neurology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Radiology, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Biomedical Engineering, Washington University School of Engineering, St. Louis, MO 63130, USA
- Mark A Anastasio
- Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
2
Komeiji S, Mitsuhashi T, Iimura Y, Suzuki H, Sugano H, Shinoda K, Tanaka T. Feasibility of decoding covert speech in ECoG with a Transformer trained on overt speech. Sci Rep 2024; 14:11491. [PMID: 38769115] [PMCID: PMC11106343] [DOI: 10.1038/s41598-024-62230-9]
Abstract
Several attempts at speech brain-computer interfacing (BCI) have been made to decode phonemes, sub-words, words or sentences from invasive measurements, such as the electrocorticogram (ECoG), recorded during auditory speech perception, overt speech or imagined (covert) speech. Decoding sentences from covert speech is a particularly challenging task. Sixteen epilepsy patients with intracranially implanted electrodes participated in this study, and ECoG was recorded during overt and covert speech of eight Japanese sentences, each consisting of three tokens. A Transformer neural network model, trained on ECoG obtained during overt speech, was applied to decode text sentences from covert speech. We first examined the proposed Transformer model using the same task for training and testing, and then evaluated its performance when trained on the overt task and used to decode covert speech. The Transformer model trained on covert speech achieved an average token error rate (TER) of 46.6% for decoding covert speech, whereas the model trained on overt speech achieved a TER of 46.3% (p > 0.05; d = 0.07). Therefore, the challenge of collecting training data for covert speech can be addressed using overt speech, and covert speech decoding may improve further by employing additional overt speech recordings.
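A minimal sketch of the train-on-overt, decode-covert idea, assuming a generic encoder-decoder Transformer with illustrative feature dimensions, vocabulary and sequence lengths (not the paper's exact model or data), could be:

# Sketch under assumed shapes: a standard encoder-decoder Transformer maps a
# window of ECoG features to a short token sequence; it is fit on overt-speech
# trials and then applied to covert-speech trials.
import torch
import torch.nn as nn

class ECoGToTokens(nn.Module):
    def __init__(self, feat_dim=64, d_model=128, vocab=32):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)       # ECoG features -> model dim
        self.tok_emb = nn.Embedding(vocab, d_model)    # target token embeddings
        self.transformer = nn.Transformer(d_model, nhead=4,
                                          num_encoder_layers=2,
                                          num_decoder_layers=2,
                                          batch_first=True)
        self.out = nn.Linear(d_model, vocab)

    def forward(self, ecog, tokens):                   # ecog: (b, T, feat_dim)
        h = self.transformer(self.proj(ecog), self.tok_emb(tokens))
        return self.out(h)                             # (b, L, vocab) logits

model = ECoGToTokens()
overt_ecog, sentence = torch.randn(8, 100, 64), torch.randint(0, 32, (8, 3))
loss = nn.CrossEntropyLoss()(model(overt_ecog, sentence).flatten(0, 1),
                             sentence.flatten())       # train on overt trials
covert_ecog = torch.randn(8, 100, 64)
covert_logits = model(covert_ecog, sentence)           # evaluate on covert trials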
Affiliation(s)
- Shuji Komeiji
- Department of Electronic and Information Engineering, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
- Takumi Mitsuhashi
- Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo, 113-8421, Japan
- Yasushi Iimura
- Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo, 113-8421, Japan
- Hiroharu Suzuki
- Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo, 113-8421, Japan
- Hidenori Sugano
- Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo, 113-8421, Japan
- Koichi Shinoda
- Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
- Toshihisa Tanaka
- Department of Electronic and Information Engineering, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
3
Sharma N, Upadhyay A, Sharma M, Singhal A. Deep temporal networks for EEG-based motor imagery recognition. Sci Rep 2023; 13:18813. [PMID: 37914729] [PMCID: PMC10620382] [DOI: 10.1038/s41598-023-41653-w]
Abstract
Electroencephalogram (EEG)-based motor imagery (MI) signal classification, also known as motion recognition, is a highly popular area of research due to its applications in robotics, gaming and medical fields. However, the problem is ill-posed, as these signals are non-stationary and noisy. Recently, many efforts have been made to improve MI signal classification using combinations of signal decomposition and machine learning techniques, but they fail to perform adequately on large multi-class datasets. Previously, researchers have applied long short-term memory (LSTM) networks, which are capable of learning time-series information, to MI-EEG datasets for motion recognition; however, LSTMs cannot model the very long-term dependencies present in motion recognition data. With the advent of transformer networks in natural language processing (NLP), the long-term dependency issue has been widely addressed. Motivated by the success of transformer models, in this article we propose a transformer-based deep learning architecture that performs motion recognition on the raw BCI Competition III IVa and IV 2a datasets. The validation results show that the proposed method outperforms existing state-of-the-art methods, producing classification accuracies of 99.7% and 84% on the binary-class and multi-class datasets, respectively. Further, the performance of the proposed transformer-based model is also compared with LSTM.
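For illustration only, a small transformer encoder operating on raw multi-channel MI-EEG could be sketched as below; the temporal patching scheme, layer sizes and class count are assumptions rather than the configuration reported in the paper:

# Illustrative sketch: raw multi-channel MI-EEG is split into temporal patches,
# each patch is linearly embedded, and a Transformer encoder feeds a
# classification head.
import torch
import torch.nn as nn

class EEGTransformerClassifier(nn.Module):
    def __init__(self, n_channels=22, patch_len=25, d_model=64, n_classes=4):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(n_channels * patch_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                        # x: (batch, channels, samples)
        b, c, s = x.shape
        x = x[:, :, : s - s % self.patch_len]    # trim to whole patches
        p = x.unfold(2, self.patch_len, self.patch_len)     # (b, c, n, patch)
        p = p.permute(0, 2, 1, 3).flatten(2)                 # (b, n, c*patch)
        h = self.encoder(self.embed(p))
        return self.head(h.mean(dim=1))          # average-pool tokens -> logits

logits = EEGTransformerClassifier()(torch.randn(4, 22, 1000))  # 4 MI trials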
Affiliation(s)
- Neha Sharma
- Department of Electronics and Communication Engineering, Bennett University, Greater Noida, 201310, India
- Avinash Upadhyay
- Department of Electronics and Communication Engineering, Bennett University, Greater Noida, 201310, India
- Manoj Sharma
- Department of Electronics and Communication Engineering, Bennett University, Greater Noida, 201310, India
- Amit Singhal
- Department of Electronics and Communication Engineering, NSUT, New Delhi, 110078, India
4
Zhang J, Li K, Yang B, Han X. Local and global convolutional transformer-based motor imagery EEG classification. Front Neurosci 2023; 17:1219988. [PMID: 37662099] [PMCID: PMC10469791] [DOI: 10.3389/fnins.2023.1219988]
Abstract
The Transformer, a deep learning model with a self-attention mechanism, combined with the convolutional neural network (CNN), has been successfully applied to decoding electroencephalogram (EEG) signals in motor imagery (MI) brain-computer interfaces (BCIs). However, the extremely non-linear, non-stationary characteristics of EEG signals limit the effectiveness and efficiency of deep learning methods, and variability across subjects and experimental sessions affects model adaptability. In this study, we propose a local and global convolutional transformer-based approach for MI-EEG classification. A local transformer encoder is incorporated to dynamically extract temporal features and compensate for the shortcomings of the CNN model. Spatial features from all channels and the differences between hemispheres are obtained to improve the robustness of the model. To acquire adequate temporal-spatial feature representations, we combine a global transformer encoder with a densely connected network to improve information flow and reuse. To validate the performance of the proposed model, three scenarios were designed: within-session, cross-session and two-session. In the experiments, the proposed method achieves up to 1.46%, 7.49% and 7.46% accuracy improvements, respectively, in the three scenarios on the public Korean dataset compared with current state-of-the-art models. For the BCI Competition IV 2a dataset, the proposed model also achieves 2.12% and 2.21% improvements for the cross-session and two-session scenarios, respectively. The results confirm that the proposed approach can effectively extract a much richer set of MI features from EEG signals and improve performance in BCI applications.
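A greatly simplified sketch of the local/global idea, with assumed shapes and without the hemisphere-difference features or the densely connected network of the published model, might be:

# Rough sketch: a convolution mixes channels into spatial features, a "local"
# Transformer encoder operates within short temporal windows, and a "global"
# Transformer encoder relates the windows before classification.
import torch
import torch.nn as nn

class LocalGlobalConvTransformer(nn.Module):
    def __init__(self, n_channels=22, d_model=32, win=10, n_classes=4):
        super().__init__()
        self.win = win
        self.spatial = nn.Conv1d(n_channels, d_model, kernel_size=1)  # channel mixing
        mk = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), 1)
        self.local_enc, self.global_enc = mk(), mk()
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                            # x: (batch, channels, time)
        f = self.spatial(x).transpose(1, 2)          # (b, time, d_model)
        b, t, d = f.shape
        f = f[:, : t - t % self.win]                 # keep whole windows only
        local = self.local_enc(f.reshape(-1, self.win, d))   # attention within windows
        tokens = local.mean(dim=1).reshape(b, -1, d)          # one token per window
        g = self.global_enc(tokens).mean(dim=1)               # attention across windows
        return self.head(g)

logits = LocalGlobalConvTransformer()(torch.randn(4, 22, 250))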
Affiliation(s)
- Jiayang Zhang
- School of Electrical Engineering, University of Leeds, Leeds, United Kingdom
- Kang Li
- School of Electrical Engineering, University of Leeds, Leeds, United Kingdom
- Banghua Yang
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai, China
- Xiaofei Han
- School of Electrical Engineering, University of Leeds, Leeds, United Kingdom
5
Jing Y, Wang W, Wang J, Jiao Y, Xiang K, Lin T, Shi W, Hou ZG. Transformer Based Cross-Subject Mental Workload Classification Using FNIRS for Real-World Application. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-5. [PMID: 38082781] [DOI: 10.1109/embc40787.2023.10341167]
Abstract
Mental state monitoring is an active research topic, particularly in neurorehabilitation and skill training, and functional near-infrared spectroscopy (fNIRS) has been suggested for this purpose; real-world applications usually require fewer detection channels and robust cross-subject performance. To this end, we propose a transformer-based method for cross-subject mental workload classification using a small number of fNIRS channels. First, the fNIRS signals within a window are divided into patches in temporal order and transformed into embeddings, to which a classification token and learnable position embeddings are added. A transformer encoder then learns the long-range dependencies among the embeddings, and its output classification token is passed to a multilayer perceptron (MLP) head, whose outputs give the mental workload classification. Finally, comparison experiments were conducted on the open-access fNIRS2MW dataset. The results show that the proposed method outperforms previous methods in cross-subject classification accuracy while maintaining relatively efficient computation.
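A sketch of the described pipeline under assumed dimensions (window length, channel count, layer sizes and class count below are illustrative, not the paper's configuration) could be written as:

# Sketch: fNIRS windows are split into temporal patches, a classification token
# and learnable position embeddings are added, a Transformer encoder models the
# sequence, and an MLP head classifies workload from the output [CLS] token.
import torch
import torch.nn as nn

class FNIRSWorkloadTransformer(nn.Module):
    def __init__(self, n_channels=8, patch_len=5, n_patches=30, d_model=64,
                 n_classes=2):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(n_channels * patch_len, d_model)
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))              # [CLS] token
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, d_model))  # learnable positions
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.mlp_head = nn.Sequential(nn.LayerNorm(d_model),
                                      nn.Linear(d_model, n_classes))

    def forward(self, x):                        # x: (batch, channels, samples)
        b = x.shape[0]
        p = x.unfold(2, self.patch_len, self.patch_len)      # temporal patches
        p = p.permute(0, 2, 1, 3).flatten(2)                 # (b, n_patches, c*len)
        tok = torch.cat([self.cls.expand(b, -1, -1), self.embed(p)], dim=1)
        h = self.encoder(tok + self.pos)
        return self.mlp_head(h[:, 0])            # classify from the [CLS] token

logits = FNIRSWorkloadTransformer()(torch.randn(4, 8, 150))  # 150-sample window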
6
Wan Z, Li M, Liu S, Huang J, Tan H, Duan W. EEGformer: A transformer-based brain activity classification method using EEG signal. Front Neurosci 2023; 17:1148855. [PMID: 37034169] [PMCID: PMC10079879] [DOI: 10.3389/fnins.2023.1148855]
Abstract
Background: Effective analysis methods for steady-state visual evoked potential (SSVEP) signals are critical for supporting an early diagnosis of glaucoma. Most efforts have focused on adapting existing techniques to the SSVEP-based brain-computer interface (BCI) task rather than proposing new ones specifically suited to the domain.
Method: Given that electroencephalogram (EEG) signals possess temporal, regional and synchronous characteristics of brain activity, we propose a transformer-based EEG analysis model, EEGformer, to capture these EEG characteristics in a unified manner. A one-dimensional convolutional neural network (1DCNN) automatically extracts EEG-channel-wise features, and its output is fed into the EEGformer, which is constructed sequentially from three components: regional, synchronous and temporal transformers. In addition to using a large benchmark database (BETA) for the SSVEP-BCI application to validate model performance, we compared the EEGformer with current state-of-the-art deep learning models on two EEG datasets obtained from our previous study: the SJTU emotion EEG dataset (SEED) and a depressive EEG database (DepEEG).
Results: The experimental results show that the EEGformer achieves the best classification performance across the three EEG datasets, indicating that the rationale of our model architecture and learning EEG characteristics in a unified manner improve classification performance.
Conclusion: EEGformer generalizes well to different EEG datasets, demonstrating that our approach is potentially suitable for providing accurate brain activity classification and for use in different application scenarios, such as SSVEP-based early glaucoma diagnosis, emotion recognition and depression discrimination.
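A greatly simplified, illustrative sketch of this idea (the published regional, synchronous and temporal transformers are more elaborate; all shapes below are assumptions) might be:

# Simplified sketch: a shared 1-D convolution extracts channel-wise features,
# then one Transformer encoder attends across channels and another across time
# before classification.
import torch
import torch.nn as nn

class TinyEEGformer(nn.Module):
    def __init__(self, n_channels=62, d_model=32, n_classes=3):
        super().__init__()
        self.conv = nn.Conv1d(1, d_model, kernel_size=7, stride=4)  # per-channel 1DCNN
        mk = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), 1)
        self.channel_enc, self.temporal_enc = mk(), mk()
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                              # x: (batch, channels, time)
        b, c, t = x.shape
        f = self.conv(x.reshape(b * c, 1, t))          # shared 1DCNN per channel
        f = f.reshape(b, c, f.shape[1], -1)            # (b, c, d_model, t')
        f = f.permute(0, 3, 1, 2)                      # (b, t', c, d_model)
        f = self.channel_enc(f.flatten(0, 1))          # attention across channels
        f = f.reshape(b, -1, c, f.shape[-1]).mean(2)   # (b, t', d_model)
        f = self.temporal_enc(f)                       # attention across time
        return self.head(f.mean(dim=1))

logits = TinyEEGformer()(torch.randn(2, 62, 256))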
Affiliation(s)
- Zhijiang Wan
- The First Affiliated Hospital of Nanchang University, Nanchang University, Nanchang, Jiangxi, China
- School of Information Engineering, Nanchang University, Nanchang, Jiangxi, China
- Industrial Institute of Artificial Intelligence, Nanchang University, Nanchang, Jiangxi, China
- Manyu Li
- School of Information Engineering, Nanchang University, Nanchang, Jiangxi, China
- Shichang Liu
- School of Computer Science, Shaanxi Normal University, Xi’an, Shaanxi, China
- Jiajin Huang
- Faculty of Information Technology, Beijing University of Technology, Beijing, China
- Hai Tan
- School of Computer Science, Nanjing Audit University, Nanjing, Jiangsu, China
- Wenfeng Duan
- The First Affiliated Hospital of Nanchang University, Nanchang University, Nanchang, Jiangxi, China