1
Carvalho VR, Mendes EMAM, Fallah A, Sejnowski TJ, Comstock L, Lainscsek C. Decoding imagined speech with delay differential analysis. Front Hum Neurosci 2024; 18:1398065. PMID: 38826617; PMCID: PMC11140152; DOI: 10.3389/fnhum.2024.1398065. Received 03/08/2024; accepted 04/25/2024.
Abstract
Speech decoding from non-invasive EEG signals can achieve relatively high accuracy (70-80%) for strictly delimited classification tasks, but for more complex tasks non-invasive speech decoding typically yields 20-50% classification accuracy. However, decoder generalization, or how well algorithms perform objectively across datasets, is complicated by the small size and heterogeneity of existing EEG datasets. Furthermore, the limited availability of open-access code hampers comparison between methods. This study explores the application of a novel non-linear signal processing method, delay differential analysis (DDA), to speech decoding. We provide a systematic evaluation of its performance on two public imagined speech decoding datasets relative to all publicly available deep learning methods. The results support DDA as a compelling alternative or complementary approach to deep learning methods for speech decoding. DDA is a fast, efficient, open-source time-domain method that fits data using only a few strong features and does not require extensive preprocessing.
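The DDA procedure can be illustrated with a minimal sketch: a sparse delay differential model is fitted to the signal by least squares, and the fitted coefficients plus the fitting error serve as features for a classifier. The particular functional form, delays, and derivative estimate below are illustrative assumptions; the abstract does not specify the model form or delay values used in the paper.

```python
import numpy as np

def dda_features(x, tau1=7, tau2=10, dt=1.0):
    """Fit dx/dt ~ a1*x(t-tau1) + a2*x(t-tau2) + a3*x(t-tau1)**2
    by least squares; return the three coefficients plus the residual
    fitting error, which together form the feature vector."""
    # centered finite-difference estimate of the derivative
    dxdt = (x[2:] - x[:-2]) / (2 * dt)
    # align the delayed copies of the signal with the derivative estimate
    t0 = max(tau1, tau2)
    n = len(x) - 2 - t0                      # number of usable samples
    y = dxdt[t0:t0 + n]
    x1 = x[1 + t0 - tau1 : 1 + t0 - tau1 + n]
    x2 = x[1 + t0 - tau2 : 1 + t0 - tau2 + n]
    A = np.column_stack([x1, x2, x1**2])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    rho = np.sqrt(np.mean((A @ coeffs - y) ** 2))  # fitting error
    return np.append(coeffs, rho)
```

Because the model has only a handful of terms, the fit is a small linear least-squares problem per window, which is what makes the method fast and preprocessing-light.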
Affiliation(s)
- Vinícius Rezende Carvalho
- RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Postgraduate Program in Electrical Engineering, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Aria Fallah
- Department of Neurosurgery, University of California, Los Angeles, Los Angeles, CA, United States
- Terrence J. Sejnowski
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, United States
- Institute for Neural Computation, University of California, San Diego, La Jolla, CA, United States
- Department of Neurobiology, University of California, San Diego, La Jolla, CA, United States
- Lindy Comstock
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, United States
- Claudia Lainscsek
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, United States
- Institute for Neural Computation, University of California, San Diego, La Jolla, CA, United States
2
Xu Z, Hu Y, Shao X, Shi T, Yang J, Wan Q, Liu Y. The Efficacy of Machine Learning Models for Predicting the Prognosis of Heart Failure: A Systematic Review and Meta-Analysis. Cardiology 2024:1-19. PMID: 38648752; DOI: 10.1159/000538639. Received 09/12/2023; accepted 03/28/2024.
Abstract
INTRODUCTION Heart failure (HF) is a major global public health concern. The application of machine learning (ML) to identify individuals at high risk and enable early intervention is a promising approach to improving HF prognosis. We aim to systematically evaluate the performance and value of ML models for predicting HF prognosis. METHODS The PubMed, Web of Science, Scopus, and Embase online databases were searched up to April 30, 2023, to identify studies on the use of ML models to predict HF prognosis. HF prognosis primarily encompasses readmission and mortality. The meta-analysis was conducted with MedCalc software. Subgroup analyses included grouping by type of ML model, time interval, sample size, number of predictive variables, validation method, whether hyperparameter optimization and calibration were conducted, and data set partitioning method. RESULTS A total of 31 studies were included. The most common ML models were random forest, boosting, support vector machine, and neural network. The area under the receiver operating characteristic curve (AUC) for predicting HF readmission was 0.675 (95% CI: 0.651-0.699, p < 0.001), and the AUC for predicting HF mortality was 0.790 (95% CI: 0.765-0.816, p < 0.001). Subgroup analyses revealed that models with a prediction time interval of 1 year, sample sizes ≥10,000, ≥100 predictive variables, external validation, hyperparameter tuning, calibration adjustment, and data set partitioning using 10-fold cross-validation exhibited favorable performance within their respective subgroups. CONCLUSION The performance of ML models in predicting HF readmission is relatively poor, while their performance in predicting HF mortality is moderate. The quality of the relevant studies is generally low, and it is essential to enhance the predictive capabilities of ML models through targeted improvements in practical applications.
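The pooled-AUC computation behind such a meta-analysis can be sketched as an inverse-variance weighted average. The abstract does not state which pooling model MedCalc applied, so the fixed-effect choice and the example numbers below are assumptions for illustration only.

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of per-study estimates,
    e.g. AUCs: each study is weighted by 1/SE^2, and the pooled
    standard error gives the 95% confidence interval."""
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci
```

A random-effects model would additionally estimate the between-study variance and add it to each study's variance before weighting, widening the interval when studies disagree.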
Affiliation(s)
- Zhaohui Xu
- Department of Cardiovascular Disease, ShuGuang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yinqin Hu
- Department of Cardiovascular Disease, ShuGuang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Xinyi Shao
- The Grier School, Tyrone, Pennsylvania, USA
- Tianyun Shi
- Department of Cardiovascular Disease, ShuGuang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Jiahui Yang
- Department of Cardiovascular Disease, ShuGuang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Qiqi Wan
- Department of Cardiovascular Disease, ShuGuang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yongming Liu
- Department of Cardiovascular Disease, ShuGuang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Department of Cardiovascular Disease, Anhui Provincial Hospital of Integrated Medicine, Hefei, Anhui, China
3
Rakhmatulin I, Dao MS, Nassibi A, Mandic D. Exploring Convolutional Neural Network Architectures for EEG Feature Extraction. Sensors (Basel, Switzerland) 2024; 24:877. PMID: 38339594; PMCID: PMC10856895; DOI: 10.3390/s24030877. Received 12/04/2023; revised 01/12/2024; accepted 01/20/2024.
Abstract
The main purpose of this paper is to provide information on how to create a convolutional neural network (CNN) for extracting features from EEG signals. Our task was to understand the primary aspects of creating and fine-tuning CNNs for various application scenarios. We considered the characteristics of EEG signals, coupled with an exploration of various signal processing and data preparation techniques, including noise reduction, filtering, encoding, decoding, and dimension reduction, among others. In addition, we conducted an in-depth analysis of well-known CNN architectures, categorizing them into four distinct groups: standard implementation, recurrent convolutional, decoder architecture, and combined architecture. The paper further offers a comprehensive evaluation of these architectures, covering accuracy metrics, hyperparameters, and an appendix with a table outlining the parameters of commonly used CNN architectures for feature extraction from EEG signals.
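The "standard implementation" category can be sketched as a single convolutional stage applied to one EEG channel: convolve with a bank of kernels, apply a ReLU nonlinearity, then pool. The kernels below are arbitrary placeholders; a real CNN learns them from data and stacks several such layers.

```python
import numpy as np

def conv1d_relu_pool(signal, kernels, pool=4):
    """One convolutional 'layer' on a single EEG channel:
    valid-mode 1-D convolution with each kernel, ReLU activation,
    then non-overlapping average pooling over `pool` samples."""
    feats = []
    for k in kernels:
        out = np.convolve(signal, k, mode="valid")
        out = np.maximum(out, 0.0)                    # ReLU
        trim = len(out) - len(out) % pool             # drop the ragged tail
        feats.append(out[:trim].reshape(-1, pool).mean(axis=1))
    return np.concatenate(feats)
```

In a full network this feature map would feed further convolutional or dense layers; here it simply illustrates how convolution, nonlinearity, and pooling shrink a raw signal into a feature vector.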
Affiliation(s)
- Ildar Rakhmatulin
- Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK
- Minh-Son Dao
- National Institute of Information and Communications Technology (NICT), Tokyo 184-0015, Japan
- Amir Nassibi
- Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK
- Danilo Mandic
- Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK
4
Jeong JH, Cho JH, Lee BH, Lee SW. Real-Time Deep Neurolinguistic Learning Enhances Noninvasive Neural Language Decoding for Brain-Machine Interaction. IEEE Transactions on Cybernetics 2023; 53:7469-7482. PMID: 36251899; DOI: 10.1109/tcyb.2022.3211694.
Abstract
Electroencephalogram (EEG)-based brain-machine interfaces (BMIs) have been utilized to help patients regain motor function and have recently been validated for use by healthy people because of their ability to directly decipher human intentions. In particular, neurolinguistic research using EEG has been investigated as an intuitive and naturalistic communication tool between humans and machines. In this study, neural languages based on speech imagery were decoded directly from the human mind using the proposed deep neurolinguistic learning. Through real-time experiments, we evaluated whether BMI-based cooperative tasks between multiple users could be accomplished using a variety of neural languages. We successfully demonstrated a BMI system that supports a variety of scenarios, such as essential activity, collaborative play, and emotional interaction. This outcome presents a novel BMI frontier that can interact at the level of human-like intelligence in real time and extends the boundaries of the communication paradigm.
5
Park HJ, Lee B. Multiclass classification of imagined speech EEG using noise-assisted multivariate empirical mode decomposition and multireceptive field convolutional neural network. Front Hum Neurosci 2023; 17:1186594. PMID: 37645689; PMCID: PMC10461632; DOI: 10.3389/fnhum.2023.1186594. Received 03/15/2023; accepted 07/21/2023.
Abstract
Introduction In this study, we classified electroencephalography (EEG) data of imagined speech using signal decomposition and a multireceptive field convolutional neural network. Imagined speech EEG for the five vowels /a/, /e/, /i/, /o/, and /u/, and mute (rest) sounds was obtained from ten study participants. Materials and methods First, two different signal decomposition methods were applied for comparison: noise-assisted multivariate empirical mode decomposition and wavelet packet decomposition. Six statistical features were calculated from the eight decomposed sub-frequency bands of the EEG. Next, all features obtained from each channel of a trial were vectorized and used as the input vector of the classifiers. Lastly, the EEG was classified using a multireceptive field convolutional neural network and several other classifiers for comparison. Results We achieved an average classification rate of 73.09% and up to 80.41% in a multiclass (six classes) setup (chance: 16.67%). Significant improvements over the various other classifiers were achieved (p-value < 0.05). The frequency sub-band analysis indicated that the high-frequency band regions and the lowest-frequency band region contain more information about imagined vowel EEG data. The misclassification and classification rate of each imagined vowel EEG was analyzed through a confusion matrix. Discussion Imagined speech EEG can be classified successfully using the proposed signal decomposition method and a convolutional neural network. The proposed classification method for imagined speech EEG can contribute to developing a practical imagined speech-based brain-computer interface system.
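The feature step can be sketched as below. The abstract does not name the six statistical features used, so this particular set (mean, variance, skewness, kurtosis, RMS, and zero-crossing rate) is an assumed, commonly used choice, not the paper's confirmed list.

```python
import numpy as np

def subband_stats(band):
    """Six statistical features for one decomposed sub-band
    (assumed set: mean, variance, skewness, kurtosis, RMS,
    zero-crossing rate)."""
    band = np.asarray(band, dtype=float)
    mu, sigma = band.mean(), band.std()
    centered = band - mu
    skew = (centered**3).mean() / (sigma**3 + 1e-12)
    kurt = (centered**4).mean() / (sigma**4 + 1e-12)
    rms = np.sqrt((band**2).mean())
    zcr = np.mean(np.abs(np.diff(np.sign(band))) > 0)
    return np.array([mu, sigma**2, skew, kurt, rms, zcr])

def trial_vector(subbands):
    """Concatenate the per-band features (and, in practice, the
    per-channel features) into one classifier input vector."""
    return np.concatenate([subband_stats(b) for b in subbands])
```

With eight sub-bands and six features this yields 48 values per channel, which are then concatenated across channels to form the trial's input vector.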
Affiliation(s)
- Hyeong-jun Park
- Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
- Boreom Lee
- Department of Biomedical Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
- AI Graduate School, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea
6
Nitta T, Horikawa J, Iribe Y, Taguchi R, Katsurada K, Shinohara S, Kawai G. Linguistic representation of vowels in speech imagery EEG. Front Hum Neurosci 2023; 17:1163578. PMID: 37275343; PMCID: PMC10237317; DOI: 10.3389/fnhum.2023.1163578. Received 02/10/2023; accepted 04/27/2023.
Abstract
Speech imagery recognition from electroencephalograms (EEGs) could potentially become a strong contender among non-invasive brain-computer interfaces (BCIs). In this report, we first extract language representations as the difference of line-spectra of phones by statistically analyzing many EEG signals from the Broca area. Then we extract vowels using iterative search over hand-labeled short-syllable data. The iterative search process consists of principal component analysis (PCA), which visualizes the linguistic representation of vowels through eigenvectors φ(m), and a subspace method (SM), which searches for an optimum line-spectrum for redesigning φ(m). The extracted linguistic representation of the Japanese vowels /i/ /e/ /a/ /o/ /u/ shows two distinct spectral peaks (P1, P2) in the upper frequency range, and the five vowels are aligned on the P1-P2 chart. A five-vowel recognition experiment using a data set of 5 subjects and a convolutional neural network (CNN) classifier gave a mean accuracy rate of 72.6%.
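The PCA step of such an iterative search can be sketched as a covariance eigendecomposition: the leading eigenvectors play the role of the φ(m) above. The data dimensions here are placeholders, not the paper's spectral representation.

```python
import numpy as np

def pca_eigenvectors(X, m=2):
    """Eigendecompose the sample covariance of feature vectors
    (rows of X) and return the top-m eigenvalues and eigenvectors;
    each returned column is one principal direction."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)          # eigh: ascending order
    order = np.argsort(vals)[::-1][:m]        # keep the m largest
    return vals[order], vecs[:, order]
```

Projecting the spectral feature vectors onto the two leading eigenvectors is what allows a two-dimensional chart (like the P1-P2 chart) to be drawn and inspected.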
Affiliation(s)
- Tsuneo Nitta
- Graduate School of Engineering, Toyohashi University of Technology, Toyohashi, Japan
- Junsei Horikawa
- Graduate School of Engineering, Toyohashi University of Technology, Toyohashi, Japan
- Yurie Iribe
- Graduate School of Information Science and Technology, Aichi Prefectural University, Nagakute, Japan
- Ryo Taguchi
- Graduate School of Information, Nagoya Institute of Technology, Nagoya, Japan
- Kouichi Katsurada
- Faculty of Science and Technology, Tokyo University of Science, Noda, Japan
- Shuji Shinohara
- School of Science and Engineering, Tokyo Denki University, Saitama, Japan
- Goh Kawai
- Online Learning Support Team, Tokyo University of Foreign Studies, Tokyo, Japan
7
Yang L, Van Hulle MM. Real-Time Navigation in Google Street View® Using a Motor Imagery-Based BCI. Sensors (Basel, Switzerland) 2023; 23:1704. PMID: 36772744; PMCID: PMC9921617; DOI: 10.3390/s23031704. Received 01/13/2023; revised 01/28/2023; accepted 01/30/2023.
Abstract
Navigation in virtual worlds is ubiquitous in games and other virtual reality (VR) applications and mainly relies on external controllers. As brain-computer interfaces (BCIs) rely on mental control, bypassing traditional neural pathways, they provide paralyzed users with an alternative way to navigate. However, the majority of BCI-based navigation studies adopt cue-based visual paradigms in which the evoked brain responses are encoded into navigation commands. Although robust and accurate, these paradigms are less intuitive and comfortable for navigation than imagining limb movements (motor imagery, MI). However, decoding motor imagery from EEG activity is notoriously challenging. Typically, wet electrodes are used to improve EEG signal quality, a large number of them are needed to discriminate between movements of different limbs, and a cue-based paradigm is used instead of a self-paced one to maximize decoding performance. Motor BCI applications primarily focus on typing or on navigating a wheelchair; the latter raises safety concerns, calling for sensors that scan the environment for obstacles and potentially hazardous scenarios. With the help of new technologies such as VR, vivid graphics can be rendered, providing the user with a safe and immersive experience, and they could be used for navigation purposes, a topic that has yet to be fully explored in the BCI community. In this study, we propose a novel MI-BCI application based on an 8-dry-electrode EEG setup with which users can explore and navigate in Google Street View®. We pay attention to system design to address the lower performance of the MI decoder due to the dry electrodes' lower signal quality and the small number of electrodes. Specifically, we restricted the number of navigation commands by using a novel middle-level control scheme and avoided decoder mistakes by introducing eye blinks as a control signal in different navigation stages. Both offline and online experiments were conducted with 20 healthy subjects. The results showed acceptable performance, even given the limitations of the EEG set-up, which we attribute to the design of the BCI application. The study suggests the use of MI-BCI in future games and VR applications for consumers and patients temporarily or permanently devoid of muscle control.
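The general idea of a middle-level control scheme with eye-blink stage switching can be sketched as a small state machine: blinks cycle between navigation stages, and the binary MI decoder output selects a command within the active stage. The stage and command names below are illustrative placeholders, not taken from the paper.

```python
class NavigationBCI:
    """Sketch of middle-level control: eye blinks switch navigation
    stages; a two-class MI decoder output (0 or 1) picks the command
    within the current stage. Names are hypothetical."""
    STAGES = {"heading": ["turn_left", "turn_right"],
              "movement": ["go_forward", "stop"]}

    def __init__(self):
        self.stage_names = list(self.STAGES)
        self.stage = 0

    def on_blink(self):
        # an eye blink advances to the next navigation stage
        self.stage = (self.stage + 1) % len(self.stage_names)
        return self.stage_names[self.stage]

    def on_mi(self, mi_class):
        # the decoded MI class selects a command within the active stage
        return self.STAGES[self.stage_names[self.stage]][mi_class]
```

Restricting each stage to two commands is what lets a low-channel-count dry-electrode decoder remain usable: the classifier only ever has to make a binary decision.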
8
A survey on multi-objective hyperparameter optimization algorithms for machine learning. Artif Intell Rev 2022. DOI: 10.1007/s10462-022-10359-2.
Abstract
Hyperparameter optimization (HPO) is a necessary step to ensure the best possible performance of machine learning (ML) algorithms. Several methods have been developed to perform HPO; most of these are focused on optimizing one performance measure (usually an error-based measure), and the literature on such single-objective HPO problems is vast. Recently, though, algorithms have appeared that focus on optimizing multiple conflicting objectives simultaneously. This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms, distinguishing between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both. We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
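The target of any multi-objective HPO algorithm is the Pareto front: the set of configurations not dominated by any other. A minimal sketch of the dominance check, assuming all objectives are minimized (e.g. error rate vs. inference time), illustrates what the surveyed algorithms approximate:

```python
def pareto_front(points):
    """Return the non-dominated points among tuples of objective
    values, all to be minimized. q dominates p when q is no worse
    in every objective and differs from p in at least one."""
    front = []
    for p in points:
        dominated = any(
            all(q[i] <= p[i] for i in range(len(p))) and q != p
            for q in points)
        if not dominated:
            front.append(p)
    return front
```

This brute-force check is quadratic in the number of configurations; the metaheuristic and metamodel approaches in the survey exist precisely because evaluating each configuration (training a model) is far more expensive than the dominance test itself.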
9
Shah U, Alzubaidi M, Mohsen F, Abd-Alrazaq A, Alam T, Househ M. The Role of Artificial Intelligence in Decoding Speech from EEG Signals: A Scoping Review. Sensors (Basel, Switzerland) 2022; 22:6975. PMID: 36146323; PMCID: PMC9505262; DOI: 10.3390/s22186975. Received 06/25/2022; revised 08/01/2022; accepted 08/09/2022.
Abstract
Background: Brain traumas, mental disorders, and vocal abuse can result in permanent or temporary speech impairment, significantly impairing one's quality of life and occasionally resulting in social isolation. Brain-computer interfaces (BCIs) can help people who have speech impairments or who are paralyzed to communicate with their surroundings via brain signals. EEG signal-based BCI has therefore received significant attention in the last two decades for multiple reasons: (i) clinical research has yielded detailed knowledge of EEG signals, (ii) EEG devices are inexpensive, and (iii) the technology applies to both medical and social fields. Objective: This study explores the existing literature and summarizes EEG data acquisition, feature extraction, and artificial intelligence (AI) techniques for decoding speech from brain signals. Method: We followed the PRISMA-ScR guidelines to conduct this scoping review. We searched six electronic databases: PubMed, IEEE Xplore, the ACM Digital Library, Scopus, arXiv, and Google Scholar. We carefully selected search terms based on the target intervention (i.e., imagined speech and AI) and target data (EEG signals), and some of the search terms were derived from previous reviews. The study selection process was carried out in three phases: study identification, study selection, and data extraction. Two reviewers independently carried out study selection and data extraction. A narrative approach was adopted to synthesize the extracted data. Results: A total of 263 studies were evaluated; however, 34 met the eligibility criteria for inclusion in this review. We found 64-electrode EEG devices to be the most widely used in the included studies. The most common signal normalization and feature extraction methods in the included studies were the bandpass filter and wavelet-based feature extraction. We categorized the studies based on AI techniques, such as machine learning (ML) and deep learning (DL). The most prominent ML algorithm was the support vector machine, and the most prominent DL algorithm was the convolutional neural network. Conclusions: EEG signal-based BCI is a viable technology that can enable people with severe or temporary voice impairment to communicate with the world directly from their brain. However, the development of BCI technology is still in its infancy.
Affiliation(s)
- Uzair Shah
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Mahmood Alzubaidi
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Farida Mohsen
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Alaa Abd-Alrazaq
- AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha P.O. Box 34110, Qatar
- Tanvir Alam
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
- Mowafa Househ
- College of Science and Engineering, Hamad Bin Khalifa University, Doha P.O. Box 34110, Qatar
10
Cooney C, Folli R, Coyle D. Opportunities, pitfalls and trade-offs in designing protocols for measuring the neural correlates of speech. Neurosci Biobehav Rev 2022; 140:104783. PMID: 35907491; DOI: 10.1016/j.neubiorev.2022.104783. Received 02/23/2022; revised 07/12/2022; accepted 07/15/2022.
Abstract
Research on decoding speech and speech-related processes directly from the human brain has intensified in recent years, as such a decoder has the potential to positively impact people with limited communication capacity due to disease or injury. Additionally, it can enable entirely new forms of human-computer interaction and human-machine communication in general and facilitate a better neuroscientific understanding of speech processes. Here, we synthesize the literature on neural speech decoding, focusing on how speech decoding experiments have been conducted, and coalescing around a necessity for thoughtful experimental design aimed at specific research goals and robust procedures for evaluating speech decoding paradigms. We examine the use of different modalities for presenting stimuli to participants, methods for constructing paradigms including timings and speech rhythms, and possible linguistic considerations. In addition, novel methods for eliciting naturalistic speech and validating imagined speech task performance in experimental settings are presented based on recent research. We also describe the multitude of terms used to instruct participants on how to produce imagined speech during experiments and propose methods for investigating the effect of these terms on imagined speech decoding. We demonstrate that the range of experimental procedures used in neural speech decoding studies can have unintended consequences which can impact the efficacy of the knowledge obtained. The review delineates the strengths and weaknesses of present approaches and proposes methodological advances which we anticipate will enhance experimental design and progress toward the optimal design of movement-independent direct speech brain-computer interfaces.
Affiliation(s)
- Ciaran Cooney
- Intelligent Systems Research Centre, Ulster University, Derry, UK
- Raffaella Folli
- Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
- Damien Coyle
- Intelligent Systems Research Centre, Ulster University, Derry, UK
11
Multiclass Classification of Imagined Speech Vowels and Words of Electroencephalography Signals Using Deep Learning. Advances in Human-Computer Interaction 2022. DOI: 10.1155/2022/1374880.
Abstract
The paper's emphasis is on decoding imagined speech from the electroencephalography (EEG) neural signals of individuals, in line with the expansion of the brain-computer interface to encompass individuals with speech problems who face communication challenges. Decoding an individual's imagined speech from nonstationary and nonlinear EEG neural signals is a complex task. Related research in the field has revealed that imagined speech decoding performance and accuracy still require improvement. The evolution of deep learning technology increases the likelihood of decoding imagined speech from EEG signals with enhanced performance. We proposed a novel supervised deep learning model that combined temporal convolutional networks and convolutional neural networks with the intent of retrieving information from the EEG signals. The experiment was carried out using an open-access dataset of fifteen subjects' imagined speech multichannel signals of vowels and words. The raw multichannel EEG signals of multiple subjects were processed using the discrete wavelet transformation technique. The model was trained and evaluated using the preprocessed signals, and the model hyperparameters were adjusted to achieve higher accuracy in the classification of imagined speech. The experiment results demonstrated that the multiclass imagined speech classification of the proposed model exhibited a higher overall accuracy of 0.9649 and a classification error rate of 0.0350. The results indicate that individuals with speech difficulties might well be able to leverage a noninvasive EEG-based imagined speech brain-computer interface system as a long-term alternative artificial verbal communication medium.
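The discrete wavelet preprocessing step can be sketched with the Haar wavelet; the abstract only states that a discrete wavelet transformation was applied, so the choice of the Haar wavelet here is an assumption for illustration.

```python
import numpy as np

def haar_dwt(x):
    """One level of a discrete wavelet transform with the Haar wavelet:
    pairwise sums give the approximation (low-pass) coefficients and
    pairwise differences give the detail (high-pass) coefficients."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                       # pad odd-length signals
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def wavelet_decompose(x, levels=3):
    """Multilevel decomposition: collect the detail coefficients of
    each level plus the final approximation, the usual form of DWT
    preprocessing before feature extraction."""
    coeffs = []
    for _ in range(levels):
        x, d = haar_dwt(x)
        coeffs.append(d)
    coeffs.append(x)
    return coeffs
```

The 1/sqrt(2) scaling makes the transform orthonormal, so signal energy is preserved across the sub-bands, which is convenient when sub-band features are compared.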
12
Liu Y, Gong A, Ding P, Zhao L, Qian Q, Zhou J, Su L, Fu Y. [Key technology of brain-computer interaction based on speech imagery]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (Journal of Biomedical Engineering) 2022; 39:596-611. PMID: 35788530; PMCID: PMC10950764; DOI: 10.7507/1001-5515.202107018. Received 07/08/2021; revised 04/14/2022.
Abstract
Speech expression is an important high-level cognitive behavior of human beings, and its realization is closely related to human brain activity. Both actual speech expression and speech imagination can activate parts of the same brain areas. Therefore, speech imagery has become a new paradigm of brain-computer interaction. Brain-computer interfaces (BCIs) based on speech imagery have the advantages of spontaneous generation, no required training, and friendliness to subjects, and so have attracted the attention of many scholars. However, this interaction technology is not yet mature in the design of experimental paradigms and the choice of imagination materials, and many issues urgently need to be discussed. In response to these problems, this article first expounds the neural mechanism of speech imagery. Then, by reviewing previous BCI research on speech imagery, the mainstream methods and core technologies for experimental paradigms, imagination materials, data processing, and so on are systematically analyzed. Finally, the key problems and main challenges that restrict the development of this type of BCI are discussed, and the future development and application prospects of speech imagery BCI systems are considered.
Affiliation(s)
- Yanpeng Liu
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Anmin Gong
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Peng Ding
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Lei Zhao
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Qian Qian
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Jianhua Zhou
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Lei Su
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Yunfa Fu
- School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- College of Information Engineering, Engineering University of PAP, Xi'an 710000, P. R. China
- Faculty of Science, Kunming University of Science and Technology, Kunming 650500, P. R. China
13
Medical Internet-of-Things Based Breast Cancer Diagnosis Using Hyperparameter-Optimized Neural Networks. Future Internet 2022. DOI: 10.3390/fi14050153.
Abstract
In today’s healthcare setting, the accurate and timely diagnosis of breast cancer is critical for early-stage treatment and recovery. In recent years, the Internet of Things (IoT) has undergone a transformation that allows the analysis of real-time and historical data using artificial intelligence (AI) and machine learning (ML) approaches. Medical IoT combines medical devices and AI applications with healthcare infrastructure to support medical diagnostics. Current state-of-the-art approaches often fail to diagnose breast cancer in its initial period, contributing to high mortality among women. As a result, early breast cancer detection remains a substantial challenge for medical professionals and researchers. To address the difficulty of identifying early-stage breast cancer, we propose a medical IoT-based diagnostic system that competently distinguishes malignant from benign cases in an IoT environment. An artificial neural network (ANN) and a convolutional neural network (CNN) with hyperparameter optimization were used for malignant vs. benign classification, while a Support Vector Machine (SVM) and Multilayer Perceptron (MLP) were utilized as baseline classifiers for comparison. Hyperparameters are important for machine learning algorithms because they directly control the behavior of training algorithms and have a significant effect on model performance. We employed a particle swarm optimization (PSO) feature selection approach to select more informative features from the breast cancer dataset to enhance the classification performance of the MLP and SVM, while a grid-based search was used to find the best combination of hyperparameters for the CNN and ANN models. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset was used to test the proposed approach. The proposed model achieved a classification accuracy of 98.5% using the CNN and 99.2% using the ANN.
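The grid-based hyperparameter search mentioned in this abstract reduces to exhaustively scoring every combination on a validation set. A minimal sketch follows; the `validation_score` function and the grid values are hypothetical stand-ins for cross-validated CNN/ANN accuracy, not the paper's actual settings:

```python
from itertools import product

# Hypothetical stand-in for a model's cross-validated accuracy as a
# function of its hyperparameters; peaks at lr=0.01, hidden=64.
def validation_score(lr, hidden):
    return -(lr - 0.01) ** 2 - (hidden - 64) ** 2 / 1e4

grid = {"lr": [0.001, 0.01, 0.1], "hidden": [32, 64, 128]}

# Exhaustive grid search: score every combination, keep the best.
best_params, best_score = None, float("-inf")
for lr, hidden in product(grid["lr"], grid["hidden"]):
    score = validation_score(lr, hidden)
    if score > best_score:
        best_params, best_score = {"lr": lr, "hidden": hidden}, score
```

In practice the inner call would train and evaluate the network on each fold, which is why cheaper search strategies (such as the swarm-based PSO step used here for feature selection) become attractive as the grid grows.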
14
Lopez-Bernal D, Balderas D, Ponce P, Molina A. A State-of-the-Art Review of EEG-Based Imagined Speech Decoding. Front Hum Neurosci 2022; 16:867281. [PMID: 35558735] [PMCID: PMC9086783] [DOI: 10.3389/fnhum.2022.867281]
Abstract
Currently, the most widely used method to measure brain activity non-invasively is the electroencephalogram (EEG), owing to its high temporal resolution, ease of use, and safety. These signals can be used within a Brain-Computer Interface (BCI) framework, which can provide a new communication channel to people who are unable to speak due to motor disabilities or other neurological diseases. Nevertheless, EEG-based BCI systems for imagined speech recognition have proved challenging to implement in real-life situations due to the difficulty of interpreting EEG signals, which have a low signal-to-noise ratio (SNR). As a consequence, to help researchers make an informed decision when approaching this problem, we offer a review article that summarizes the main findings of the most relevant studies on this subject since 2009. This review focuses mainly on the pre-processing, feature extraction, and classification techniques used by several authors, as well as the target vocabulary. Furthermore, we propose ideas that may be useful for future work toward achieving a practical application of EEG-based BCI systems for imagined speech decoding.
Affiliation(s)
- Diego Lopez-Bernal
- Tecnologico de Monterrey, National Department of Research, Mexico City, Mexico
15
Li L, Zhang Z, Xiong Y, Hu Z, Liu S, Tu B, Yao Y. Prediction of hospital mortality in mechanically ventilated patients with congestive heart failure using machine learning approaches. Int J Cardiol 2022; 358:59-64. [PMID: 35483478] [DOI: 10.1016/j.ijcard.2022.04.063]
Abstract
BACKGROUND Mechanically ventilated patients with congestive heart failure (CHF) are at high risk of mortality. We aimed to develop and validate a prediction model based on machine learning (ML) algorithms to predict hospital mortality in mechanically ventilated patients with CHF. METHODS Least absolute shrinkage and selection operator (LASSO) regression was used to identify the key features. Hyperparameter optimization (HPO) was conducted to refine the prediction model. The area under the receiver operating characteristic curve (AUC), accuracy, calibration curve, and decision curve analysis were used to evaluate prediction performance. The final model was validated using an external validation set from another database. The prediction results were represented by a nomogram. RESULTS A total of 4530 qualified patients were included. Among 11 ML algorithms, CatBoost showed the best prediction performance (AUC = 0.833). Ten key features (10/63) were selected based on the LASSO regression. After HPO, the prediction performance of the CatBoost model based on the key features was significantly improved (AUCs: 0.805 vs. 0.821). Additionally, the CatBoost model also showed satisfactory prediction performance in the external validation set (AUC = 0.806). CONCLUSION The present study developed and validated a CatBoost model that could accurately predict hospital mortality in mechanically ventilated patients with CHF.
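The LASSO feature-selection step described here can be illustrated in isolation: with an orthonormal design matrix, the LASSO estimate is the soft-thresholded least-squares solution, so coefficients smaller than the penalty are zeroed out and the surviving columns are the "key features". The data below are synthetic, not the study's clinical variables:

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding operator used by LASSO."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

rng = np.random.default_rng(0)
X, _ = np.linalg.qr(rng.normal(size=(100, 5)))  # orthonormal columns
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.05])
y = X @ beta_true

ols = X.T @ y                          # least-squares fit (since X^T X = I)
beta_lasso = soft_threshold(ols, 0.1)  # lambda = 0.1 zeroes weak effects
selected = np.flatnonzero(beta_lasso)  # indices of retained features
```

The tiny true coefficient (0.05) falls below the penalty and is discarded along with the exact zeros, which is the selection behavior exploited in the study before hyperparameter tuning.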
Affiliation(s)
- Le Li
- Chinese Academy of Medical Sciences, Peking Union Medical College, National Center for Cardiovascular Diseases, Fu Wai Hospital, Beijing, China
- Zhenhao Zhang
- Chinese Academy of Medical Sciences, Peking Union Medical College, National Center for Cardiovascular Diseases, Fu Wai Hospital, Beijing, China
- Yulong Xiong
- Chinese Academy of Medical Sciences, Peking Union Medical College, National Center for Cardiovascular Diseases, Fu Wai Hospital, Beijing, China
- Zhao Hu
- Chinese Academy of Medical Sciences, Peking Union Medical College, National Center for Cardiovascular Diseases, Fu Wai Hospital, Beijing, China
- Shangyu Liu
- Chinese Academy of Medical Sciences, Peking Union Medical College, National Center for Cardiovascular Diseases, Fu Wai Hospital, Beijing, China
- Bin Tu
- Chinese Academy of Medical Sciences, Peking Union Medical College, National Center for Cardiovascular Diseases, Fu Wai Hospital, Beijing, China
- Yan Yao
- Chinese Academy of Medical Sciences, Peking Union Medical College, National Center for Cardiovascular Diseases, Fu Wai Hospital, Beijing, China
16
Chandler JA, Van der Loos KI, Boehnke S, Beaudry JS, Buchman DZ, Illes J. Brain Computer Interfaces and Communication Disabilities: Ethical, Legal, and Social Aspects of Decoding Speech From the Brain. Front Hum Neurosci 2022; 16:841035. [PMID: 35529778] [PMCID: PMC9069963] [DOI: 10.3389/fnhum.2022.841035]
Abstract
A brain-computer interface technology that can decode the neural signals associated with attempted but unarticulated speech could offer a future efficient means of communication for people with severe motor impairments. Recent demonstrations have validated this approach. Here we assume that it will be possible in future to decode imagined (i.e., attempted but unarticulated) speech in people with severe motor impairments, and we consider the characteristics that could maximize the social utility of a BCI for communication. As a social interaction, communication involves the needs and goals of both speaker and listener, particularly in contexts that have significant potential consequences. We explore three high-consequence legal situations in which neurally-decoded speech could have implications: Testimony, where decoded speech is used as evidence; Consent and Capacity, where it may be used as a means of agency and participation such as consent to medical treatment; and Harm, where such communications may be networked or may cause harm to others. We then illustrate how design choices might impact the social and legal acceptability of these technologies.
Affiliation(s)
- Jennifer A. Chandler
- Bertram Loeb Research Chair, Faculty of Law, University of Ottawa, Ottawa, ON, Canada
- Correspondence: Jennifer A. Chandler
- Susan Boehnke
- Centre for Neuroscience Studies, Queen’s University, Kingston, ON, Canada
- Jonas S. Beaudry
- Institute for Health and Social Policy (IHSP) and Faculty of Law, McGill University, Montreal, QC, Canada
- Daniel Z. Buchman
- Centre for Addiction and Mental Health, Dalla Lana School of Public Health, Krembil Research Institute, University of Toronto Joint Centre for Bioethics, Toronto, ON, Canada
- Judy Illes
- Division of Neurology, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
17
Rethinking the Methods and Algorithms for Inner Speech Decoding and Making Them Reproducible. NeuroSci 2022. [DOI: 10.3390/neurosci3020017]
Abstract
This study focuses on the automatic decoding of inner speech using noninvasive methods such as electroencephalography (EEG). While inner speech has been a research topic in philosophy and psychology for half a century, recent attempts have been made to decode nonvoiced spoken words using various brain-computer interfaces. The main shortcomings of existing work are reproducibility and the availability of data and code. In this work, we investigate various methods (Convolutional Neural Networks (CNN), Gated Recurrent Units (GRU), and Long Short-Term Memory networks (LSTM)) for the task of detecting five vowels and six words on a publicly available EEG dataset. The main contributions of this work are (1) a comparison of subject-dependent vs. subject-independent approaches, (2) an analysis of the effect of different preprocessing steps (Independent Component Analysis (ICA), down-sampling, and filtering), and (3) word classification, where we achieve state-of-the-art performance on a publicly available dataset. Overall, we achieve accuracies of 35.20% and 29.21% when classifying five vowels and six words, respectively, using our tuned iSpeech-CNN architecture. All of our code and processed data are publicly available to ensure reproducibility. As such, this work contributes to a deeper understanding and reproducibility of experiments in the area of inner speech detection.
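Of the preprocessing steps compared in this abstract, down-sampling is the simplest to sketch. The toy example below assumes a 1024 Hz recording reduced to 256 Hz (these rates are illustrative, not the dataset's); a real pipeline would low-pass filter before decimating to avoid aliasing:

```python
import numpy as np

fs_in, fs_out = 1024, 256          # assumed original and target rates (Hz)
factor = fs_in // fs_out           # integer decimation factor

t = np.arange(0, 2.0, 1 / fs_in)   # 2 s of sample times
eeg = np.sin(2 * np.pi * 10 * t)   # synthetic 10 Hz "alpha-band" channel

# Naive decimation: keep every `factor`-th sample. Safe here only because
# the signal is band-limited far below the new Nyquist frequency (128 Hz).
eeg_ds = eeg[::factor]
```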
18
Sun K, Hu X, Feng Z, Wang H, Lv H, Wang Z, Zhang G, Xu S, You X. Predicting Ca2+ and Mg2+ ligand binding sites by deep neural network algorithm. BMC Bioinformatics 2022; 22:324. [PMID: 35045825] [PMCID: PMC8772041] [DOI: 10.1186/s12859-021-04250-0]
Abstract
BACKGROUND Alkaline earth metal ions are important protein-binding ligands in the human body, and predicting their binding residues is of great significance. RESULTS In this paper, Mg2+ and Ca2+ ligands are taken as the research objects. Based on characteristic parameters of protein sequences, amino acids, physicochemical characteristics of amino acids, and predicted structural information, a deep neural network algorithm is used to predict the binding sites of proteins. By optimizing the hyperparameters of the deep learning algorithm, the prediction results under fivefold cross-validation are better than those of the Ionseq method. In addition, to further verify the performance of the proposed model, an undersampling data processing method is adopted, and the prediction results on the independent test set are better than those obtained by the support vector machine algorithm. CONCLUSIONS An efficient method for predicting Mg2+ and Ca2+ ligand binding sites is presented.
Affiliation(s)
- Kai Sun
- College of Sciences, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, People's Republic of China
- Xiuzhen Hu
- College of Sciences, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, People's Republic of China
- Zhenxing Feng
- College of Sciences, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, People's Republic of China
- Hongbin Wang
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Haotian Lv
- College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Ziyang Wang
- College of Sciences, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, People's Republic of China
- Gaimei Zhang
- Hohhot First Hospital, Hohhot, 010051, People's Republic of China
- Shuang Xu
- College of Sciences, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, People's Republic of China
- Xiaoxiao You
- College of Sciences, Inner Mongolia University of Technology, Hohhot, 010051, People's Republic of China
- Inner Mongolia Key Laboratory of Statistical Analysis Theory for Life Data and Neural Network Modeling, Hohhot, People's Republic of China
19
Proix T, Delgado Saa J, Christen A, Martin S, Pasley BN, Knight RT, Tian X, Poeppel D, Doyle WK, Devinsky O, Arnal LH, Mégevand P, Giraud AL. Imagined speech can be decoded from low- and cross-frequency intracranial EEG features. Nat Commun 2022; 13:48. [PMID: 35013268] [PMCID: PMC8748882] [DOI: 10.1038/s41467-021-27725-3]
Abstract
Reconstructing intended speech from neural activity using brain-computer interfaces holds great promise for people with severe speech production deficits. While decoding overt speech has progressed, decoding imagined speech has met limited success, mainly because the associated neural signals are weak and variable compared to overt speech and hence difficult for learning algorithms to decode. We obtained three electrocorticography datasets from 13 patients, with electrodes implanted for epilepsy evaluation, who performed overt and imagined speech production tasks. Based on recent theories of speech neural processing, we extracted consistent and specific neural features usable for future brain-computer interfaces, and assessed their ability to discriminate speech items in articulatory, phonetic, and vocalic representation spaces. While high-frequency activity provided the best signal for overt speech, both low- and higher-frequency power and local cross-frequency features contributed to imagined speech decoding, in particular in phonetic and vocalic, i.e., perceptual, spaces. These findings show that low-frequency power and cross-frequency dynamics contain key information for imagined speech decoding.
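The band-limited power features at issue in this abstract (low-frequency vs. high-frequency activity) can be computed with a plain FFT. The trace below is synthetic, with an exaggerated theta component, and the band edges are illustrative rather than the paper's:

```python
import numpy as np

fs = 1000                              # sampling rate (Hz), assumed
t = np.arange(0, 1.0, 1 / fs)
# synthetic trace: strong 6 Hz (theta) plus weaker 80 Hz (high-gamma)
sig = 2.0 * np.sin(2 * np.pi * 6 * t) + 0.5 * np.sin(2 * np.pi * 80 * t)

def band_power(x, fs, lo, hi):
    """Sum of FFT power in the [lo, hi) frequency band."""
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2
    return psd[(freqs >= lo) & (freqs < hi)].sum()

low_power = band_power(sig, fs, 1, 12)      # low-frequency feature
gamma_power = band_power(sig, fs, 70, 150)  # high-gamma feature
```

Cross-frequency features would additionally relate the two bands (one common construction couples low-frequency phase to high-gamma amplitude), which is the kind of interaction the study found informative for imagined speech.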
Affiliation(s)
- Timothée Proix
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Jaime Delgado Saa
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Andy Christen
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Stephanie Martin
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Brian N Pasley
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, USA
- Robert T Knight
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, USA
- Department of Psychology, University of California, Berkeley, Berkeley, USA
- Xing Tian
- Division of Arts and Sciences, New York University Shanghai, Shanghai, China
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
- David Poeppel
- Department of Psychology, New York University, New York, NY, USA
- Ernst Strüngmann Institute for Neuroscience, Frankfurt, Germany
- Werner K Doyle
- Department of Neurology, New York University Grossman School of Medicine, New York, NY, USA
- Orrin Devinsky
- Department of Neurology, New York University Grossman School of Medicine, New York, NY, USA
- Luc H Arnal
- Institut de l'Audition, Institut Pasteur, INSERM, F-75012, Paris, France
- Pierre Mégevand
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Division of Neurology, Geneva University Hospitals, Geneva, Switzerland
- Anne-Lise Giraud
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
20
Cooney C, Folli R, Coyle D. A bimodal deep learning architecture for EEG-fNIRS decoding of overt and imagined speech. IEEE Trans Biomed Eng 2021; 69:1983-1994. [PMID: 34874850] [DOI: 10.1109/tbme.2021.3132861]
Abstract
OBJECTIVE Brain-computer interface (BCI) studies are increasingly leveraging different attributes of multiple signal modalities simultaneously. Bimodal data acquisition protocols combining the temporal resolution of electroencephalography (EEG) with the spatial resolution of functional near-infrared spectroscopy (fNIRS) require novel approaches to decoding. METHODS We present an EEG-fNIRS hybrid BCI that employs a new bimodal deep neural network architecture consisting of two convolutional sub-networks (subnets) to decode overt and imagined speech. Features from each subnet are fused before further feature extraction and classification. Nineteen participants performed overt and imagined speech in a novel cue-based paradigm enabling investigation of stimulus and linguistic effects on decoding. RESULTS Using the hybrid approach, classification accuracies (46.31% and 34.29% for overt and imagined speech, respectively; chance: 25%) indicated a significant improvement over EEG used independently for imagined speech (p = 0.020), while tending towards significance for overt speech (p = 0.098). In comparison with fNIRS, significant improvements for both speech types were achieved with bimodal decoding (p < 0.001). There was a mean difference of ~12.02% between overt and imagined speech, with accuracies as high as 87.18% and 53%. Deeper subnets enhanced performance, while the stimulus affected overt and imagined speech in significantly different ways. CONCLUSION The bimodal approach was a significant improvement over unimodal results for several tasks. The results indicate the potential of multi-modal deep learning for enhancing neural signal decoding. SIGNIFICANCE This novel architecture can be used to enhance speech decoding from bimodal neural signals.
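The fusion step described above (features from each subnet combined before further feature extraction and classification) amounts, at its simplest, to concatenation along the feature axis. The shapes below are invented for illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(7)
eeg_feats = rng.normal(size=(19, 32))    # 19 trials x 32 EEG subnet features
fnirs_feats = rng.normal(size=(19, 16))  # 19 trials x 16 fNIRS subnet features

# Fuse the two unimodal feature maps before a shared classification head:
# each trial's fused vector carries both modalities' information.
fused = np.concatenate([eeg_feats, fnirs_feats], axis=1)
```

Everything downstream of this concatenation sees both modalities at once, which is what lets a shared head exploit complementary EEG/fNIRS information.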
21
Vorontsova D, Menshikov I, Zubov A, Orlov K, Rikunov P, Zvereva E, Flitman L, Lanikin A, Sokolova A, Markov S, Bernadotte A. Silent EEG-Speech Recognition Using Convolutional and Recurrent Neural Network with 85% Accuracy of 9 Words Classification. Sensors (Basel) 2021; 21:6744. [PMID: 34695956] [PMCID: PMC8541074] [DOI: 10.3390/s21206744]
Abstract
In this work, we focus on silent speech recognition in electroencephalography (EEG) data from healthy individuals to advance brain-computer interface (BCI) development toward including people with neurodegeneration and movement and communication difficulties in society. Our dataset was recorded from 270 healthy subjects during silent speech of eight different Russian words (commands): ‘forward’, ‘backward’, ‘up’, ‘down’, ‘help’, ‘take’, ‘stop’, and ‘release’, and one pseudoword. We began by demonstrating that silent word distributions can be very close statistically and that there are words describing directed movements that share similar patterns of brain activity. However, after training on one individual, we achieved 85% accuracy on nine-word classification (including the pseudoword) and 88% accuracy on binary classification on average. We show that a smaller dataset collected from one participant allows for building a more accurate classifier for a given subject than a larger dataset collected from a group of people. At the same time, we show that the learning outcomes on a limited sample of EEG data are transferable to the general population. Thus, we demonstrate the possibility of using selected command words to create an EEG-based input device for people on whom the neural network classifier has not been trained, which is particularly important for people with disabilities.
Affiliation(s)
- Darya Vorontsova
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Software Engineering Department, National Research University of Electronic Technology (MIET), 124498 Moscow, Russia
- Ivan Menshikov
- Faculty of Mechanics and Mathematics, Moscow State University, GSP-1, 1 Leninskiye Gory, Main Building, 119991 Moscow, Russia
- Department of Control and Applied Mathematics, Moscow Institute of Physics and Technology (MIPT), 141700 Dolgoprudny, Russia
- Aleksandr Zubov
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Department of Information Technologies and Computer Sciences, National University of Science and Technology MISIS (NUST MISIS), 119049 Moscow, Russia
- Kirill Orlov
- Research Center of Endovascular Neurosurgery, Federal State Budgetary Institution “Federal Center of Brain Research and Neurotechnologies” of the Federal Medical Biological Agency, Ostrovityanova Street, 1, p. 10, 117997 Moscow, Russia
- Russia Endovascular Neuro Society (RENS), 107078 Moscow, Russia
- Peter Rikunov
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Ekaterina Zvereva
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Lev Flitman
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Anton Lanikin
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Anna Sokolova
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Sergey Markov
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Alexandra Bernadotte
- Experimental ML Systems Subdivision, SberDevices Department, PJSC Sberbank, 121165 Moscow, Russia
- Faculty of Mechanics and Mathematics, Moscow State University, GSP-1, 1 Leninskiye Gory, Main Building, 119991 Moscow, Russia
- Department of Information Technologies and Computer Sciences, National University of Science and Technology MISIS (NUST MISIS), 119049 Moscow, Russia
22
Sarmiento LC, Villamizar S, López O, Collazos AC, Sarmiento J, Rodríguez JB. Recognition of EEG Signals from Imagined Vowels Using Deep Learning Methods. Sensors (Basel) 2021; 21:6503. [PMID: 34640824] [PMCID: PMC8512781] [DOI: 10.3390/s21196503]
Abstract
The use of imagined speech with electroencephalographic (EEG) signals is a promising field of brain-computer interfaces (BCI) that seeks communication between areas of the cerebral cortex related to language and devices or machines. However, the complexity of this brain process makes the analysis and classification of this type of signal a relevant research topic. The goals of this study were: to develop a new Deep Learning (DL) algorithm, referred to as CNNeeg1-1, to recognize EEG signals in imagined vowel tasks; to create an imagined speech database with 50 subjects specialized in imagined vowels from the Spanish language (/a/, /e/, /i/, /o/, /u/); and to contrast the performance of the CNNeeg1-1 algorithm with the DL Shallow CNN and EEGNet benchmark algorithms using an open-access database (BD1) and the newly developed database (BD2). In this study, a mixed analysis of variance was conducted to assess the intra-subject and inter-subject training of the proposed algorithms. The results show that for intra-subject training, the best performance among the Shallow CNN, EEGNet, and CNNeeg1-1 methods in classifying imagined vowels (/a/, /e/, /i/, /o/, /u/) was exhibited by CNNeeg1-1, with an accuracy of 65.62% for the BD1 database and 85.66% for the BD2 database.
Affiliation(s)
- Luis Carlos Sarmiento
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Sergio Villamizar
- Department of Electrical and Electronics Engineering, School of Engineering, Universidad Nacional de Colombia, Bogotá 111321, Colombia
- Omar López
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Ana Claros Collazos
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Jhon Sarmiento
- Departamento de Tecnología, Universidad Pedagógica Nacional, Bogotá 111321, Colombia
- Jan Bacca Rodríguez
- Department of Electrical and Electronics Engineering, School of Engineering, Universidad Nacional de Colombia, Bogotá 111321, Colombia
23
Chowdhury MH, Eldaly ABM, Agadagba SK, Cheung RCC, Chan LLH. Machine Learning Based Hardware Architecture for DOA Measurement from Mice EEG. IEEE Trans Biomed Eng 2021; 69:314-324. [PMID: 34351851] [DOI: 10.1109/tbme.2021.3093037]
Abstract
OBJECTIVE This research aims to design a hardware-optimized, machine-learning-based Depth of Anesthesia (DOA) measurement framework for mice and its FPGA implementation. METHODS Electroencephalography (EEG) signals were acquired from 16 mice in the Neural Interface Research (NIR) Laboratory of the City University of Hong Kong. We present a logistic regression based approach with mathematically uncomplicated feature extraction techniques, suited to efficient hardware implementation, to estimate the DOA. RESULTS With the extraction of only two features, the proposed system can classify the state of consciousness with 94% accuracy for a 1 second EEG epoch, leading to a 100% accurate channel prediction after a 7 second run-time on average. CONCLUSION Performance evaluation and a comparative study confirmed the efficacy of the prototype. SIGNIFICANCE Traditionally, DOA is estimated by checking the biophysical responses of a patient during surgery. However, the physical symptoms can be misleading for a decisive conclusion due to the patient's health condition or as a side effect of anesthetic drugs. Recently, several neuroscientific studies have correlated the EEG signal with conscious states, which is likely to have less interference with the patient's medical condition. This research presents a first-of-its-kind hardware-implemented automatic DOA computation system for mice.
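A two-feature logistic-regression classifier of the kind described is small enough to sketch end-to-end. The features and labels below are synthetic Gaussian clusters, not the mouse EEG data, and the feature names are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# two hypothetical per-epoch features (e.g., an amplitude and a spectral
# ratio); class 1 is shifted so the problem is largely separable
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
               rng.normal(2.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

w, b = np.zeros(2), 0.0
for _ in range(500):                 # plain batch gradient descent
    p = sigmoid(X @ w + b)
    grad = (p - y) / len(y)          # gradient of the logistic loss
    w -= 0.1 * (X.T @ grad)
    b -= 0.1 * grad.sum()

train_acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()
```

For hardware, the appeal of this model is that inference is just a dot product, a bias add, and a threshold, which maps directly onto FPGA arithmetic.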
24
Panachakel JT, Ramakrishnan AG. Decoding Covert Speech From EEG-A Comprehensive Review. Front Neurosci 2021; 15:642251. [PMID: 33994922] [PMCID: PMC8116487] [DOI: 10.3389/fnins.2021.642251]
Abstract
Over the past decade, many researchers have developed systems for decoding covert or imagined speech from EEG (electroencephalogram). These systems differ from each other in several aspects, from data acquisition to machine learning algorithms, which makes comparison between implementations difficult. This review article puts together all the relevant works published in the last decade on decoding imagined speech from EEG into a single framework. Every important aspect of designing such a system, such as the selection of words to be imagined, the number of electrodes to be recorded, temporal and spatial filtering, feature extraction, and classifier choice, is reviewed. This helps a researcher compare the relative merits and demerits of the different approaches and choose the most suitable one. Because speech is the most natural form of communication, which human beings acquire even without formal education, imagined speech is an ideal choice of prompt for evoking brain activity patterns for a BCI (brain-computer interface) system, although research on developing real-time (online) speech-imagery-based BCI systems is still in its infancy. Covert speech based BCIs can help people with disabilities improve their quality of life. They can also be used for covert communication in environments that do not support vocal communication. This paper also discusses some future directions that will aid the deployment of speech-imagery-based BCIs for practical applications, rather than only laboratory experiments.
Affiliation(s)
- Jerrin Thomas Panachakel
- Medical Intelligence and Language Engineering Laboratory, Department of Electrical Engineering, Indian Institute of Science, Bangalore, India
25
Standardization-refinement domain adaptation method for cross-subject EEG-based classification in imagined speech recognition. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2020.11.013]