1. Sharma Y, Singh BK, Dhurandhar S. Vocal tasks-based EEG and speech signal analysis in children with neurodevelopmental disorders: a multimodal investigation. Cogn Neurodyn 2024; 18:2387-2403. PMID: 39555290; PMCID: PMC11564584; DOI: 10.1007/s11571-024-10096-y.
Abstract
Neurodevelopmental disorders (NDs) often impair multiple functional domains of a child's brain. Despite several studies of their neural and speech responses, multimodal research on NDs is extremely rare. The present work examined electroencephalography (EEG) and speech signals from ND and control children who performed Hindi-language vocal tasks (V) of seven categories: 'vowel', 'consonant', 'one syllable', 'multi-syllable', 'compound', 'complex', and 'sentence' (V1-V7). Statistical testing of EEG parameters showed substantially higher beta- and gamma-band energies at frontal, central, and temporal head sites in the ND group for tasks V1-V5, and at parietal sites as well for V6. For the 'sentence' task (V7), the ND children showed significantly higher theta and lower alpha energies in the parietal area. These findings imply that even a general context-based task imposes a heavy cognitive load on children with NDs. The children also exhibited poor auditory comprehension when producing long phrases. Speech signal analysis further revealed significantly higher amplitude (V1-V7) and frequency (V3-V7) perturbations in the voices of ND children. Subjects were then classified as ND or control using EEG and speech features: we attained 100% accuracy, precision, and F-measure with EEG features from every task, and with speech features from the 'complex' task. Overall, the 'complex' task emerged as the best vocal stimulus among V1-V7 for characterizing ND brains. We also examined interrelations between EEG energies and speech attributes in the ND group. Our work thus presents a unique multimodal framework for exploring the distinctiveness of neuro-impaired children.
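A minimal sketch of the kind of band-energy measure such group comparisons rest on: integrating a Welch power spectral density estimate over each frequency band. The band edges, sampling rate, and epoch length below are common defaults, not the paper's exact processing chain.

```python
import numpy as np
from scipy.signal import welch

# Illustrative band edges in Hz; the paper's exact definitions may differ.
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_energies(channel, fs):
    """Estimate per-band energy for one EEG channel by integrating its
    Welch power spectral density over each band."""
    freqs, psd = welch(channel, fs=fs, nperseg=2 * fs)
    df = freqs[1] - freqs[0]
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() * df
            for name, (lo, hi) in BANDS.items()}

fs = 256                              # assumed sampling rate
epoch = np.random.randn(10 * fs)      # stand-in for one 10 s frontal epoch
print(band_energies(epoch, fs))       # group tests would compare these values
                                      # between ND and control children
```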
Affiliation(s)
- Yogesh Sharma
- Department of Biomedical Engineering, National Institute of Technology Raipur, Raipur, Chhattisgarh 492010, India
- Bikesh Kumar Singh
- Department of Biomedical Engineering, National Institute of Technology Raipur, Raipur, Chhattisgarh 492010, India
2. Bordonaro M. Postmortem communication. Theory Biosci 2024; 143:229-234. PMID: 39096453; DOI: 10.1007/s12064-024-00423-6.
Abstract
The phenomenon of near-death and dying experiences has attracted both popular interest and scientific speculation. However, mental perception at the point of death remains a subjective experience that has not been formally evaluated. While postmortem gene expression, even in humans, has been evaluated, restoration of postmortem brain activity has heretofore been attempted only in animal models, at the molecular and cellular levels. Meanwhile, progress has been made in translating the brain activity of living humans into speech and images. This paper proposes two interrelated thought experiments. First, assuming continued progress and refinement of technologies that translate human brain activity into interpretable speech and images, could applying these technologies to dying humans yield an objective analysis of death experiences? Second, can human brain function be revived postmortem and, if so, can the relevant technologies be used to communicate with (recently) deceased individuals? The paper considers these questions and explores their possible implications.
Affiliation(s)
- Michael Bordonaro
- Department of Medical Education, Geisinger Commonwealth School of Medicine, 525 Pine Street, Scranton, PA, 18509, USA.
3. Liang Y, Zhang C, An S, Wang Z, Shi K, Peng T, Ma Y, Xie X, He J, Zheng K. FetchEEG: a hybrid approach combining feature extraction and temporal-channel joint attention for EEG-based emotion classification. J Neural Eng 2024; 21:036011. PMID: 38701773; DOI: 10.1088/1741-2552/ad4743.
Abstract
Objective. Electroencephalogram (EEG) analysis has long been an important tool in neural engineering, and recognizing and classifying human emotions is one of its important tasks. EEG data, recorded from electrodes placed on the scalp, are a valuable source of information for brain activity analysis and emotion recognition. Feature-extraction methods have shown promising results, but recent trends have shifted toward end-to-end deep learning methods. These approaches, however, often overlook channel representations, and their complex structures make the models hard to fit. Approach. To address these challenges, this paper proposes a hybrid approach named FetchEEG that combines feature extraction with temporal-channel joint attention. Leveraging the advantages of both traditional feature extraction and deep learning, FetchEEG adopts a multi-head self-attention mechanism to extract representations across time moments and channels simultaneously. The joint representations are then concatenated and classified by fully connected layers for emotion recognition. FetchEEG is verified through comparison experiments on a self-developed dataset and two public datasets. Main results. In both subject-dependent and subject-independent experiments, FetchEEG shows better performance and stronger generalization than state-of-the-art methods on all datasets. Its performance is further analyzed for different sliding-window sizes and overlap rates in the feature-extraction module, and the sensitivity of emotion recognition is investigated for three- and five-frequency-band scenarios. Significance. FetchEEG is a novel hybrid EEG-based emotion classification method that combines EEG feature extraction with Transformer neural networks. It achieves state-of-the-art performance on a self-developed dataset and multiple public datasets, with significantly higher training efficiency than end-to-end methods, demonstrating its effectiveness and feasibility.
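As a rough illustration of temporal-channel joint attention, the sketch below applies multi-head self-attention once across time steps and once across channels, then concatenates the two summaries for classification. All layer sizes, the mean pooling, and the token projections are illustrative assumptions; the published FetchEEG architecture, with its feature-extraction front end, differs in detail.

```python
import torch
import torch.nn as nn

class TemporalChannelAttention(nn.Module):
    """Minimal sketch of joint attention over time steps and channels."""
    def __init__(self, n_channels=32, n_times=128, d_model=64, n_classes=3):
        super().__init__()
        self.time_proj = nn.Linear(n_channels, d_model)  # each time step -> token
        self.chan_proj = nn.Linear(n_times, d_model)     # each channel -> token
        self.time_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.chan_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * d_model, n_classes)

    def forward(self, x):                            # x: (batch, channels, times)
        t = self.time_proj(x.transpose(1, 2))        # (batch, times, d_model)
        c = self.chan_proj(x)                        # (batch, channels, d_model)
        t, _ = self.time_attn(t, t, t)               # attention across time
        c, _ = self.chan_attn(c, c, c)               # attention across channels
        joint = torch.cat([t.mean(dim=1), c.mean(dim=1)], dim=-1)
        return self.classifier(joint)

logits = TemporalChannelAttention()(torch.randn(8, 32, 128))  # -> (8, 3)
```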
Affiliation(s)
- Yu Liang
- Faculty of Information Technology, Beijing University of Technology, Beijing, People's Republic of China
- Chenlong Zhang
- Faculty of Information Technology, Beijing University of Technology, Beijing, People's Republic of China
- Shan An
- JD Health International Inc., Beijing, People's Republic of China
- Zaitian Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing, People's Republic of China
- Kaize Shi
- University of Technology Sydney, Sydney, Australia
- Tianhao Peng
- Beihang University, Beijing, People's Republic of China
- Yuqing Ma
- Beihang University, Beijing, People's Republic of China
- Xiaoyang Xie
- Faculty of Information Technology, Beijing University of Technology, Beijing, People's Republic of China
- Jian He
- Faculty of Information Technology, Beijing University of Technology, Beijing, People's Republic of China
- Kun Zheng
- Faculty of Information Technology, Beijing University of Technology, Beijing, People's Republic of China
4. Qin Y, Zhang W, Tao X. TBEEG: A Two-Branch Manifold Domain Enhanced Transformer Algorithm for Learning EEG Decoding. IEEE Trans Neural Syst Rehabil Eng 2024; 32:1466-1476. PMID: 38526885; DOI: 10.1109/tnsre.2024.3380595.
Abstract
The electroencephalogram-based (EEG) brain-computer interface (BCI) has garnered significant attention in recent research, but its practicality remains constrained by the lack of efficient EEG decoding technology: the challenge lies in effectively translating intricate EEG into meaningful, generalizable information. EEG decoding relies primarily on either time-domain or frequency-domain information, and no existing method simultaneously and effectively extracts both kinds of features while fusing them efficiently. To address these limitations, a two-branch manifold-domain enhanced Transformer algorithm is designed to holistically capture the spatio-temporal information of EEG. The method projects the time-domain information of EEG signals into Riemannian space to fully decode the time dependence of the signals. Using the wavelet transform, time-domain information is converted into frequency-domain information, and the spatial information it contains is mined through the spectrogram. The effectiveness of the proposed TBEEG algorithm is validated on the BCIC-IV-2a and MAMEM-SSVEP-II datasets.
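The Riemannian projection step can be sketched as mapping each trial's spatial covariance matrix to the tangent space at a reference point. The sketch below uses a plain arithmetic-mean reference for simplicity (the Riemannian geometric mean is the more standard choice) and is only a minimal stand-in for the manifold branch of TBEEG.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power, logm

def tangent_space_features(trials, ref=None):
    """Project EEG trial covariances onto the tangent space at a reference.

    trials : array (n_trials, n_channels, n_times)
    Returns one flattened tangent vector per trial.
    """
    covs = np.array([t @ t.T / t.shape[1] for t in trials])  # SPD covariance per trial
    if ref is None:
        ref = covs.mean(axis=0)            # simple reference; Riemannian mean is standard
    inv_sqrt = fractional_matrix_power(ref, -0.5)
    feats = []
    for c in covs:
        s = logm(inv_sqrt @ c @ inv_sqrt)  # log-map: whiten, then matrix logarithm
        feats.append(s[np.triu_indices_from(s)].real)  # upper triangle as features
    return np.array(feats)

X = tangent_space_features(np.random.randn(20, 8, 256))
print(X.shape)                             # (20, 36) for 8 channels
```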
5. Liu W, Li G, Huang Z, Jiang W, Luo X, Xu X. Enhancing generalized anxiety disorder diagnosis precision: MSTCNN model utilizing high-frequency EEG signals. Front Psychiatry 2023; 14:1310323. PMID: 38179243; PMCID: PMC10764566; DOI: 10.3389/fpsyt.2023.1310323.
Abstract
Generalized anxiety disorder (GAD) is a prevalent mental disorder on the rise in modern society, and precise diagnosis is crucial for improving treatment and averting exacerbation. Although a growing number of researchers are beginning to explore deep learning algorithms for detecting mental disorders, there is a dearth of reports on precise GAD diagnosis. This study proposes a multi-scale spatial-temporal local-sequential and global-parallel convolutional model, named MSTCNN, designed to achieve highly accurate GAD diagnosis using high-frequency electroencephalogram (EEG) signals. To this end, 10 min of resting EEG data were collected from 45 GAD patients and 36 healthy controls (HC), and various frequency bands were extracted from the EEG data as inputs to the MSTCNN. The results demonstrate that the proposed MSTCNN, combined with the attention mechanism of Squeeze-and-Excitation Networks, achieves outstanding classification performance for GAD detection, with an accuracy of 99.48% on 4-30 Hz EEG data, competitive with state-of-the-art methods for GAD classification. Furthermore, the results highlight the pivotal role of the high-frequency bands in GAD diagnosis: as the frequency band increases, diagnostic accuracy improves. Notably, high-frequency EEG data in the 10-30 Hz range yielded an accuracy of 99.47%, paralleling the performance of the broader 4-30 Hz band. In summary, these findings are a step toward the practical application of automatic GAD diagnosis and provide basic theory and technical support for the development of future clinical diagnostic systems.
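The Squeeze-and-Excitation attention the abstract mentions is a published channel-reweighting block (Hu et al., 2018); a minimal PyTorch rendering is below. The feature width, reduction ratio, and placement within MSTCNN's multi-scale convolutions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention, sketched in isolation;
    the surrounding multi-scale convolutional layers are omitted."""
    def __init__(self, n_features, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(n_features, n_features // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(n_features // reduction, n_features),
            nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (batch, features, time)
        w = self.fc(x.mean(dim=-1))       # squeeze: global average over time
        return x * w.unsqueeze(-1)        # excite: reweight feature maps

out = SEBlock(16)(torch.randn(4, 16, 750))  # e.g. 3 s of 250 Hz feature maps
```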
Affiliation(s)
- Wei Liu
- College of Computer Science and Technology, Zhejiang Normal University, Jinhua, China
- Gang Li
- College of Mathematical Medicine, Zhejiang Normal University, Jinhua, China
- Ziyi Huang
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou, China
- Weixiong Jiang
- College of Mathematical Medicine, Zhejiang Normal University, Jinhua, China
- Xingjuan Xu
- College of Mathematical Medicine, Zhejiang Normal University, Jinhua, China
6. Moon J, Chau T. Online Ternary Classification of Covert Speech by Leveraging the Passive Perception of Speech. Int J Neural Syst 2023; 33:2350048. PMID: 37522623; DOI: 10.1142/s012906572350048x.
Abstract
Brain-computer interfaces (BCIs) provide communicative alternatives to those without functional speech. Covert speech (CS)-based BCIs enable communication simply by thinking of words and thus have intuitive appeal. However, an elusive barrier to their clinical translation is the collection of voluminous examples of high-quality CS signals, as iteratively rehearsing words for long durations is mentally fatiguing. Research on CS and speech perception (SP) identifies common spatiotemporal patterns in their respective electroencephalographic (EEG) signals, pointing toward shared encoding mechanisms. The goal of this study was to investigate whether a model that leverages the signal similarities between SP and CS can differentiate speech-related EEG signals online. Ten participants completed a dyadic protocol in which, on each trial, they listened to a randomly selected word and then mentally rehearsed it. In the offline sessions, eight words were presented; for the subsequent online sessions, the two words most separable in terms of their EEG signals were chosen to form a ternary classification problem (two words and rest). The model comprised a functional mapping derived from SP and CS signals of the same speech token, with features extracted via a Riemannian approach. An average ternary online accuracy of 75.3% (60% chance level) was achieved across participants, with individual accuracies as high as 93%. Moreover, the signal-to-noise ratio (SNR) of CS signals was enhanced by perception-covert modeling according to the degree of high-frequency-band correspondence between CS and SP. These findings may lead to less burdensome data collection for training speech BCIs, which could eventually increase the rate at which their vocabularies can grow.
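One way to picture the perception-covert "functional mapping" is as a regularized linear map from SP features to CS features of the same word, learned per participant. The ridge estimator, the tangent-space-style feature dimensions, and the synthetic data below are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical feature matrices: one row per trial of the same spoken word,
# e.g. Riemannian tangent-space vectors from EEG (dimensions are illustrative).
rng = np.random.default_rng(0)
X_sp = rng.standard_normal((60, 36))                       # speech-perception features
X_cs = 0.6 * X_sp + 0.1 * rng.standard_normal((60, 36))    # covert-speech features

mapping = Ridge(alpha=1.0).fit(X_sp, X_cs)   # learn the SP -> CS transfer
X_cs_pred = mapping.predict(X_sp)            # mapped features could then augment
print(X_cs_pred.shape)                       # scarce covert-speech training data
```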
Affiliation(s)
- Jae Moon
- Institute of Biomedical Engineering, University of Toronto, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada
- Tom Chau
- Institute of Biomedical Engineering, University of Toronto, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada
7. Park HJ, Lee B. Multiclass classification of imagined speech EEG using noise-assisted multivariate empirical mode decomposition and multireceptive field convolutional neural network. Front Hum Neurosci 2023; 17:1186594. PMID: 37645689; PMCID: PMC10461632; DOI: 10.3389/fnhum.2023.1186594.
Abstract
Introduction: In this study, we classified electroencephalography (EEG) data of imagined speech using signal decomposition and a multireceptive field convolutional neural network. Imagined speech EEG for five vowels, /a/, /e/, /i/, /o/, and /u/, and a mute (rest) condition was obtained from ten study participants. Materials and methods: First, two signal decomposition methods were applied for comparison: noise-assisted multivariate empirical mode decomposition and wavelet packet decomposition. Six statistical features were calculated from each of the eight decomposed sub-frequency bands of the EEG. All features obtained from each channel of a trial were then vectorized and used as the input vector of the classifiers. Lastly, the EEG was classified using the multireceptive field convolutional neural network and several other classifiers for comparison. Results: We achieved an average classification rate of 73.09% and up to 80.41% in a multiclass (six-class) setup (chance: 16.67%), a significant improvement over various other classifiers (p-value < 0.05). Frequency sub-band analysis showed that the high-frequency band regions and the lowest-frequency band region contain the most information about imagined vowel EEG data, and the misclassification and classification rates of each imagined vowel were analyzed through a confusion matrix. Discussion: Imagined speech EEG can be classified successfully using the proposed signal decomposition method and a convolutional neural network. The proposed classification method can contribute to developing practical imagined speech-based brain-computer interface systems.
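A sketch of the per-sub-band statistical feature extraction is shown below. The abstract does not list which six statistics were used, so this particular set (and the channel/band counts) is an assumption for illustration.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def stat_features(band_signal):
    """Six simple statistics for one decomposed sub-band of one channel;
    an illustrative selection, not necessarily the paper's six."""
    x = np.asarray(band_signal)
    return np.array([
        x.mean(),
        x.std(),
        skew(x),
        kurtosis(x),
        np.sqrt(np.mean(x ** 2)),   # root mean square
        np.sum(x ** 2),             # band energy
    ])

# One feature vector per trial: concatenate over 8 sub-bands x all channels.
bands = np.random.randn(8, 64, 512)   # (sub-bands, channels, samples), illustrative
vec = np.concatenate([stat_features(b) for ch in bands for b in ch])
print(vec.shape)                      # (8 * 64 * 6,) = (3072,)
```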
Affiliation(s)
- Hyeong-jun Park
- Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
- Boreom Lee
- Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
- AI Graduate School, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
8. Shah U, Alzubaidi M, Mohsen F, Abd-Alrazaq A, Alam T, Househ M. The Role of Artificial Intelligence in Decoding Speech from EEG Signals: A Scoping Review. Sensors (Basel) 2022; 22:6975. PMID: 36146323; PMCID: PMC9505262; DOI: 10.3390/s22186975.
Abstract
Background: Brain traumas, mental disorders, and vocal abuse can result in permanent or temporary speech impairment, significantly reducing quality of life and occasionally leading to social isolation. Brain-computer interfaces (BCI) can help people who have speech impairments or who are paralyzed communicate with their surroundings via brain signals. EEG signal-based BCI has therefore received significant attention over the last two decades for several reasons: (i) clinical research has yielded detailed knowledge of EEG signals, (ii) EEG devices are inexpensive, and (iii) the technology applies to both medical and social fields. Objective: This study explores the existing literature and summarizes EEG data acquisition, feature extraction, and artificial intelligence (AI) techniques for decoding speech from brain signals. Method: We followed the PRISMA-ScR guidelines to conduct this scoping review, searching six electronic databases: PubMed, IEEE Xplore, the ACM Digital Library, Scopus, arXiv, and Google Scholar. Search terms were carefully selected based on the target intervention (i.e., imagined speech and AI) and target data (EEG signals), with some terms derived from previous reviews. The study selection process was carried out in three phases: study identification, study selection, and data extraction. Two reviewers independently carried out study selection and data extraction, and a narrative approach was adopted to synthesize the extracted data. Results: A total of 263 studies were evaluated, of which 34 met the eligibility criteria for inclusion in this review. We found 64-electrode EEG devices to be the most widely used in the included studies. The most common signal normalization and feature-extraction methods were the bandpass filter and wavelet-based feature extraction. We categorized the studies by AI technique, namely machine learning (ML) and deep learning (DL); the most prominent ML algorithm was the support vector machine, and the most prominent DL algorithm was the convolutional neural network. Conclusions: EEG signal-based BCI is a viable technology that can enable people with severe or temporary voice impairment to communicate with the world directly from their brain. However, the development of BCI technology is still in its infancy.
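The review's most common pipeline, bandpass filtering followed by wavelet-based features and a support vector machine, can be sketched as follows. The filter band, wavelet family, decomposition level, and data shapes are typical choices rather than prescriptions from any single included study.

```python
import numpy as np
import pywt
from scipy.signal import butter, filtfilt
from sklearn.svm import SVC

def preprocess_and_featurize(trials, fs=128):
    """Bandpass filter each trial, then take wavelet-coefficient statistics."""
    b, a = butter(4, [1, 40], btype="band", fs=fs)        # 1-40 Hz bandpass
    feats = []
    for trial in trials:                                  # trial: (channels, samples)
        filt = filtfilt(b, a, trial, axis=-1)
        coeffs = pywt.wavedec(filt, "db4", level=4, axis=-1)
        feats.append(np.concatenate([c.std(axis=-1) for c in coeffs]))
    return np.array(feats)

X = preprocess_and_featurize(np.random.randn(40, 8, 640))  # 40 synthetic trials
y = np.repeat([0, 1], 20)                                  # two imagined words
clf = SVC(kernel="rbf").fit(X, y)       # SVM: the review's most prominent ML model
```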
Affiliation(s)
- Uzair Shah
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Mahmood Alzubaidi
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Farida Mohsen
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Alaa Abd-Alrazaq
- AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha P.O. Box 34110, Qatar
- Tanvir Alam
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Mowafa Househ
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
9. Cooney C, Folli R, Coyle D. Opportunities, pitfalls and trade-offs in designing protocols for measuring the neural correlates of speech. Neurosci Biobehav Rev 2022; 140:104783. PMID: 35907491; DOI: 10.1016/j.neubiorev.2022.104783.
Abstract
Research on decoding speech and speech-related processes directly from the human brain has intensified in recent years, as such a decoder could positively impact people with limited communication capacity due to disease or injury. It could also enable entirely new forms of human-computer interaction and human-machine communication in general, and facilitate better neuroscientific understanding of speech processes. Here, we synthesize the literature on how neural speech decoding experiments have been conducted, coalescing around the necessity for thoughtful experimental design aimed at specific research goals and for robust procedures for evaluating speech decoding paradigms. We examine the modalities used to present stimuli to participants, methods for constructing paradigms, including timings and speech rhythms, and possible linguistic considerations. In addition, novel methods for eliciting naturalistic speech and for validating imagined speech task performance in experimental settings are presented based on recent research. We also describe the multitude of terms used to instruct participants on how to produce imagined speech during experiments and propose methods for investigating the effect of these terms on imagined speech decoding. We demonstrate that the range of experimental procedures used in neural speech decoding studies can have unintended consequences for the efficacy of the knowledge obtained. The review delineates the strengths and weaknesses of present approaches and proposes methodological advances that we anticipate will improve experimental design and progress toward the optimal design of movement-independent direct speech brain-computer interfaces.
Affiliation(s)
- Ciaran Cooney
- Intelligent Systems Research Centre, Ulster University, Derry, UK
- Raffaella Folli
- Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
- Damien Coyle
- Intelligent Systems Research Centre, Ulster University, Derry, UK
10. Multiclass Classification of Imagined Speech Vowels and Words of Electroencephalography Signals Using Deep Learning. Advances in Human-Computer Interaction 2022. DOI: 10.1155/2022/1374880.
Abstract
This paper focuses on decoding imagined speech from the electroencephalography (EEG) neural signals of individuals, in line with the expansion of brain-computer interfaces to encompass individuals whose speech problems cause communication challenges. Decoding an individual's imagined speech from nonstationary, nonlinear EEG signals is a complex task, and related work in the field has shown that decoding performance and accuracy still need improvement. Advances in deep learning increase the likelihood of decoding imagined speech from EEG signals with enhanced performance. We propose a novel supervised deep learning model that combines temporal convolutional networks with convolutional neural networks to retrieve information from EEG signals. The experiment was carried out using an open-access dataset of multichannel imagined speech signals (vowels and words) from fifteen subjects. The raw multichannel EEG signals of multiple subjects were processed using the discrete wavelet transform, the model was trained and evaluated on the preprocessed signals, and the hyperparameters were adjusted to achieve higher classification accuracy. The results demonstrate that the proposed model achieved a higher overall multiclass accuracy of 0.9649 with a classification error rate of 0.0350. These results indicate that individuals with speech difficulties might well be able to leverage a noninvasive EEG-based imagined speech brain-computer interface as a long-term alternative medium for artificial verbal communication.
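The temporal-convolutional half of such a model is typically built from dilated causal convolutions with residual connections. A single illustrative block is sketched below; channel counts, kernel size, dilation, and the fusion with the CNN branch are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    """One dilated causal convolution block with a residual connection,
    sketching the TCN half of a TCN+CNN imagined-speech decoder."""
    def __init__(self, channels, kernel_size=3, dilation=2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation        # left padding keeps causality
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                              # x: (batch, channels, time)
        out = self.conv(nn.functional.pad(x, (self.pad, 0)))
        return self.relu(out + x)                      # residual connection

x = torch.randn(4, 14, 256)   # e.g. 14-channel EEG after wavelet preprocessing
y = TCNBlock(14)(x)           # shape preserved: (4, 14, 256)
```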