1. Silva AB, Littlejohn KT, Liu JR, Moses DA, Chang EF. The speech neuroprosthesis. Nat Rev Neurosci 2024; 25:473-492. PMID: 38745103. DOI: 10.1038/s41583-024-00819-9.
Abstract
Loss of speech after paralysis is devastating, but circumventing motor-pathway injury by directly decoding speech from intact cortical activity has the potential to restore natural communication and self-expression. Recent discoveries have defined how key features of speech production are facilitated by the coordinated activity of vocal-tract articulatory and motor-planning cortical representations. In this Review, we highlight such progress and how it has led to successful speech decoding, first in individuals implanted with intracranial electrodes for clinical epilepsy monitoring and subsequently in individuals with paralysis as part of early feasibility clinical trials to restore speech. We discuss high-spatiotemporal-resolution neural interfaces and the adaptation of state-of-the-art speech computational algorithms that have driven rapid and substantial progress in decoding neural activity into text, audible speech, and facial movements. Although restoring natural speech is a long-term goal, speech neuroprostheses already have performance levels that surpass communication rates offered by current assistive-communication technology. Given this accelerated rate of progress in the field, we propose key evaluation metrics for speed and accuracy, among others, to help standardize across studies. We finish by highlighting several directions to more fully explore the multidimensional feature space of speech and language, which will continue to accelerate progress towards a clinically viable speech neuroprosthesis.
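The Review proposes standardized speed and accuracy metrics; the two most widely reported in this literature are word error rate (WER) and words per minute (WPM). The sketch below shows the standard definitions for reference; it is not code from the Review, and the example sentences are made up.

```python
# Sketch of the two evaluation metrics most often reported for speech
# neuroprostheses: word error rate (WER) and words per minute (WPM).
# Illustrative only; not code from the Review.

def word_error_rate(reference: list[str], hypothesis: list[str]) -> float:
    """Levenshtein edit distance over words, normalized by reference length."""
    n, m = len(reference), len(hypothesis)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution
    return d[n][m] / max(n, 1)

def words_per_minute(n_decoded_words: int, elapsed_seconds: float) -> float:
    return 60.0 * n_decoded_words / elapsed_seconds

ref = "i would like a glass of water".split()
hyp = "i would like glass of waters".split()
print(f"WER = {word_error_rate(ref, hyp):.2f}")   # 2 edits / 7 words = 0.29
print(f"WPM = {words_per_minute(7, 5.3):.1f}")
```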
Affiliation(s)
- Alexander B Silva: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Kaylo T Littlejohn: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Jessie R Liu: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- David A Moses: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
2. Tankus A, Stern E, Klein G, Kaptzon N, Nash L, Marziano T, Shamia O, Gurevitch G, Bergman L, Goldstein L, Fahoum F, Strauss I. A speech neuroprosthesis in the frontal lobe and hippocampus: decoding high-frequency activity into phonemes. Neurosurgery 2024. PMID: 38934637. DOI: 10.1227/neu.0000000000003068.
Abstract
BACKGROUND AND OBJECTIVES: Loss of speech due to injury or disease is devastating. Here, we report a novel speech neuroprosthesis that artificially articulates building blocks of speech based on high-frequency activity in brain areas never before harnessed for a neuroprosthesis: the anterior cingulate and orbitofrontal cortices, and the hippocampus.
METHODS: A 37-year-old male neurosurgical epilepsy patient with intact speech, implanted with depth electrodes for clinical reasons only, silently controlled the neuroprosthesis almost immediately and in a natural way to voluntarily produce two vowel sounds.
RESULTS: During the first set of trials, the participant made the neuroprosthesis produce the different vowel sounds artificially with 85% accuracy. In the following trials, performance improved consistently, which may be attributed to neuroplasticity. We show that a neuroprosthesis trained on overt speech data can be controlled silently.
CONCLUSION: This may open the way for a novel strategy of neuroprosthesis implantation at earlier disease stages (eg, amyotrophic lateral sclerosis), while speech is intact, for improved training that still allows silent control at later stages. The results demonstrate the clinical feasibility of directly decoding high-frequency activity, which includes spiking activity, in the aforementioned areas for silent production of phonemes that could serve as part of a neuroprosthesis for replacing lost speech-control pathways.
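The decoder here runs on high-frequency activity. A common, generic way to obtain such features from intracranial recordings is to band-pass the signal in the high-gamma range and take its analytic amplitude; the sketch below shows that generic preprocessing step with illustrative parameters, not the authors' pipeline.

```python
# Generic high-frequency (high-gamma) feature extraction for intracranial
# signals, as commonly used by speech decoders. Illustrative parameters only.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def high_gamma_envelope(x: np.ndarray, fs: float,
                        band: tuple = (70.0, 150.0)) -> np.ndarray:
    """x: (n_channels, n_samples) raw signal. Returns the analytic amplitude
    of the band-passed signal, a proxy for local population spiking activity."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, x, axis=-1)
    return np.abs(hilbert(filtered, axis=-1))

fs = 1000.0
x = np.random.randn(8, 10 * int(fs))   # 8 channels, 10 s of fake data
features = high_gamma_envelope(x, fs)
print(features.shape)                   # (8, 10000)
```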
Affiliation(s)
- Ariel Tankus: Functional Neurosurgery Unit, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel; Department of Neurology and Neurosurgery, School of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Einat Stern: Department of Neurology and Neurosurgery, School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Guy Klein: Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel; Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Nufar Kaptzon: Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel; Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Lilac Nash: Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel; Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Tal Marziano: School of Electrical Engineering, Iby and Aladar Fleischman Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel
- Omer Shamia: School of Electrical Engineering, Iby and Aladar Fleischman Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel
- Guy Gurevitch: Sagol Brain Institute, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel; Department of Physiology and Pharmacology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Lottem Bergman: Functional Neurosurgery Unit, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Lilach Goldstein: Department of Neurology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Firas Fahoum: Department of Neurology and Neurosurgery, School of Medicine, Tel Aviv University, Tel Aviv, Israel; Department of Neurology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Ido Strauss: Functional Neurosurgery Unit, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel; Department of Neurology and Neurosurgery, School of Medicine, Tel Aviv University, Tel Aviv, Israel
3. Chau G, Wang C, Talukder S, Subramaniam V, Soedarmadji S, Yue Y, Katz B, Barbu A. Population Transformer: learning population-level representations of intracranial activity. arXiv 2024; arXiv:2406.03044v1. PMID: 38883237. PMCID: PMC11177958.
Abstract
We present a self-supervised framework that learns population-level codes for intracranial neural recordings at scale, unlocking the benefits of representation learning for a key neuroscience recording modality. The Population Transformer (PopT) lowers the amount of data required for decoding experiments, while increasing accuracy, even on never-before-seen subjects and tasks. We address two key challenges in developing PopT: sparse electrode distribution and varying electrode location across patients. PopT stacks on top of pretrained representations and enhances downstream tasks by enabling learned aggregation of multiple spatially sparse data channels. Beyond decoding, we interpret the pretrained PopT and fine-tuned models to show how they can be used to provide neuroscience insights learned from massive amounts of data. We release a pretrained PopT to enable off-the-shelf improvements in multi-channel intracranial data decoding and interpretability, and code is available at https://github.com/czlwang/PopulationTransformer.
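The pretrained model and code are at the GitHub link above. As a rough, self-contained illustration of the core idea (learned aggregation over a variable number of spatially sparse channel embeddings via a summary token), consider the sketch below; the class and parameter names are ours, and this is not the released PopT API.

```python
# Schematic of learned aggregation over a variable number of channel
# embeddings, the core idea behind PopT. Not the released PopT API.
import torch
import torch.nn as nn

class ChannelAggregator(nn.Module):
    def __init__(self, d_model: int = 128, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))  # summary token
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.pos = nn.Linear(3, d_model)  # embed (x, y, z) electrode position

    def forward(self, chan_emb: torch.Tensor, chan_xyz: torch.Tensor):
        """chan_emb: (batch, n_channels, d_model) pretrained per-channel features;
        chan_xyz: (batch, n_channels, 3) electrode coordinates."""
        tokens = chan_emb + self.pos(chan_xyz)        # inject location info
        cls = self.cls.expand(tokens.size(0), -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1))
        return out[:, 0]                               # population summary

agg = ChannelAggregator()
summary = agg(torch.randn(2, 57, 128), torch.randn(2, 57, 3))
print(summary.shape)  # torch.Size([2, 128])
```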
4. Wandelt SK, Bjånes DA, Pejsa K, Lee B, Liu C, Andersen RA. Representation of internal speech by single neurons in human supramarginal gyrus. Nat Hum Behav 2024; 8:1136-1149. PMID: 38740984. PMCID: PMC11199147. DOI: 10.1038/s41562-024-01867-y.
Abstract
Speech brain-machine interfaces (BMIs) translate brain signals into words or audio outputs, enabling communication for people who have lost their speech abilities due to disease or injury. While important advances in vocalized, attempted and mimed speech decoding have been achieved, results for internal speech decoding are sparse and have yet to achieve high functionality. Notably, it is still unclear from which brain areas internal speech can be decoded. Here, two participants with tetraplegia, implanted with microelectrode arrays in the supramarginal gyrus (SMG) and primary somatosensory cortex (S1), performed internal and vocalized speech of six words and two pseudowords. In both participants, we found significant neural representation of internal and vocalized speech at the single-neuron and population levels in the SMG. From recorded population activity in the SMG, the internally spoken and vocalized words were significantly decodable. In an offline analysis, we achieved average decoding accuracies of 55% and 24% for the two participants, respectively (chance level 12.5%), and during an online internal speech BMI task, we averaged 79% and 23% accuracy, respectively. Evidence of shared neural representations between internal speech, word reading and vocalized speech processes was found in participant 1. SMG represented words as well as pseudowords, providing evidence for phonetic encoding. Furthermore, our decoder achieved high classification with multiple internal speech strategies (auditory imagination/visual imagination). Activity in S1 was modulated by vocalized but not internal speech in both participants, suggesting that no articulator movements of the vocal tract occurred during internal speech production. This work represents a proof of concept for a high-performance internal speech BMI.
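With six words and two pseudowords, chance is 1/8 = 12.5%; whether a reported accuracy such as 55% beats chance can be checked with a one-sided binomial test, as sketched below (the trial count is a made-up placeholder, not a number from the paper).

```python
# Checking whether a decoding accuracy beats chance with a binomial test.
# The trial count here is a placeholder, not a number from the paper.
from scipy.stats import binomtest

n_classes = 8                 # six words + two pseudowords -> chance = 12.5%
chance = 1 / n_classes
n_trials = 160                # hypothetical number of test trials
n_correct = round(0.55 * n_trials)

result = binomtest(n_correct, n_trials, p=chance, alternative="greater")
print(f"accuracy = {n_correct / n_trials:.1%}, chance = {chance:.1%}, "
      f"p = {result.pvalue:.2e}")
```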
Affiliation(s)
- Sarah K Wandelt: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA
- David A Bjånes: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA; Rancho Los Amigos National Rehabilitation Center, Downey, CA, USA
- Kelsie Pejsa: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA
- Brian Lee: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; Department of Neurological Surgery, Keck School of Medicine of USC, Los Angeles, CA, USA; USC Neurorestoration Center, Keck School of Medicine of USC, Los Angeles, CA, USA
- Charles Liu: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; Rancho Los Amigos National Rehabilitation Center, Downey, CA, USA; Department of Neurological Surgery, Keck School of Medicine of USC, Los Angeles, CA, USA; USC Neurorestoration Center, Keck School of Medicine of USC, Los Angeles, CA, USA
- Richard A Andersen: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA
5. Ng HW, Guan C. Subject-independent meta-learning framework towards optimal training of EEG-based classifiers. Neural Netw 2024; 172:106108. PMID: 38219680. DOI: 10.1016/j.neunet.2024.106108.
Abstract
Advances in deep learning have shown great promise for high-accuracy electroencephalography (EEG) signal classification in a variety of tasks. However, many EEG-based datasets are plagued by high inter-subject signal variability. Robust deep learning models are notoriously difficult to train under such conditions, often yielding subpar or widely varying performance across subjects under the leave-one-subject-out paradigm. Recently, the model-agnostic meta-learning framework was introduced as a way to increase a model's ability to generalize to new tasks. While the original framework focused on task-based meta-learning, this research shows that the meta-learning methodology can be modified for subject-based signal classification while maintaining the same task objectives, and can achieve state-of-the-art performance. Specifically, we propose a few/zero-shot subject-independent meta-learning framework for multi-class inner-speech and binary-class motor-imagery classification. Compared with current subject-adaptive methods, which require a large number of labels from the target subject, the proposed framework is effective for training zero-calibration and few-shot models for subject-independent EEG classification. The proposed mechanism performs well on both small and large datasets and achieves robust, generalized performance across subjects. The results show a significant improvement over the current state of the art, with binary-class motor imagery reaching 88.70% accuracy and multi-class inner speech averaging 31.15%. Code will be made publicly available upon publication.
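In model-agnostic meta-learning (MAML), an inner loop adapts a shared initialization to each task with a few gradient steps, and an outer loop updates that initialization from post-adaptation performance; here the "tasks" are subjects. The sketch below is our own schematic of that subject-based loop, not the authors' code.

```python
# Schematic subject-based MAML step: the "tasks" are subjects, and the outer
# loop learns an initialization that adapts quickly to unseen subjects.
import torch
import torch.nn.functional as F

def maml_step(model, subjects, meta_opt, inner_lr=0.01, inner_steps=1):
    """subjects: list of ((xs, ys), (xq, yq)) support/query batches per subject."""
    meta_opt.zero_grad()
    for (xs, ys), (xq, yq) in subjects:
        # Inner loop: adapt a fast copy of the weights to this subject.
        fast = {name: p.clone() for name, p in model.named_parameters()}
        for _ in range(inner_steps):
            loss = F.cross_entropy(torch.func.functional_call(model, fast, xs), ys)
            grads = torch.autograd.grad(loss, list(fast.values()),
                                        create_graph=True)
            fast = {name: p - inner_lr * g
                    for (name, p), g in zip(fast.items(), grads)}
        # Outer loop: evaluate adapted weights on the subject's held-out query
        # data; gradients flow back to the shared initialization.
        F.cross_entropy(torch.func.functional_call(model, fast, xq), yq).backward()
    meta_opt.step()
```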
Affiliation(s)
- Han Wei Ng: Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore; AI Singapore, 3 Research Link, 117602, Singapore
- Cuntai Guan: Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore
6. Fernyhough C, Borghi AM. Inner speech as language process and cognitive tool. Trends Cogn Sci 2023; 27:1180-1193. PMID: 37770286. DOI: 10.1016/j.tics.2023.08.014.
Abstract
Many people report a form of internal language known as inner speech (IS). This review examines recent growth of research interest in the phenomenon, which has broadly supported a theoretical model in which IS is a functional language process that can confer benefits for cognition in a range of domains. A key insight to have emerged in recent years is that IS is an embodied experience characterized by varied subjective qualities, which can be usefully modeled in artificial systems and whose neural signals have the potential to be decoded through advancing brain-computer interface technologies. Challenges for future research include understanding individual differences in IS and mapping form to function across IS subtypes.
Affiliation(s)
- Charles Fernyhough: Department of Psychology and Centre for Research into Inner Experience, Durham University, Durham DH1 3LE, UK
- Anna M Borghi: Department of Dynamic and Clinical Psychology, and Health Studies, Sapienza University of Rome and Institute of Cognitive Sciences and Technologies, Italian National Research Council, 00185 Rome, Italy
7. Canny E, Vansteensel MJ, van der Salm SMA, Müller-Putz GR, Berezutskaya J. Boosting brain-computer interfaces with functional electrical stimulation: potential applications in people with locked-in syndrome. J Neuroeng Rehabil 2023; 20:157. PMID: 37980536. PMCID: PMC10656959. DOI: 10.1186/s12984-023-01272-y.
Abstract
Individuals in a locked-in state live with severe whole-body paralysis that limits their ability to communicate with family and loved ones. Recent advances in brain-computer interface (BCI) technology have presented a potential alternative for these people to communicate by detecting neural activity associated with attempted hand or speech movements and translating the decoded intended movements into a control signal for a computer. A technique that could potentially enrich the communication capacity of BCIs is functional electrical stimulation (FES) of paralyzed limbs and face to restore body and facial movements of paralyzed individuals, making it possible to add body language and facial expression to communication BCI utterances. Here, we review the current state of the art of existing BCI and FES work in people with paralysis of body and face and propose that a combined BCI-FES approach, which has already proved successful in several applications in stroke and spinal cord injury, can provide a novel and promising mode of communication for locked-in individuals.
Affiliation(s)
- Evan Canny: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Mariska J Vansteensel: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Sandra M A van der Salm: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Gernot R Müller-Putz: Institute of Neural Engineering, Laboratory of Brain-Computer Interfaces, Graz University of Technology, Graz, Austria
- Julia Berezutskaya: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
8. Sankaran N, Moses D, Chiong W, Chang EF. Recommendations for promoting user agency in the design of speech neuroprostheses. Front Hum Neurosci 2023; 17:1298129. PMID: 37920562. PMCID: PMC10619159. DOI: 10.3389/fnhum.2023.1298129.
Abstract
Brain-computer interfaces (BCIs) that directly decode speech from brain activity aim to restore communication in people with paralysis who cannot speak. Despite recent advances, neural inference of speech remains imperfect, limiting the ability of speech BCIs to enable experiences such as fluent conversation that promote agency - that is, the ability of users to author and transmit messages enacting their intentions. Here, we make recommendations for promoting agency based on existing and emerging strategies in neural engineering. The focus is on achieving fast, accurate, and reliable performance while ensuring volitional control over when a decoder is engaged, what exactly is decoded, and how messages are expressed. Additionally, alongside neuroscientific progress within controlled experimental settings, we argue that a parallel line of research must consider how to translate experimental successes into real-world environments. While such research will ultimately require input from prospective users, here we identify and describe design choices inspired by human-factors work conducted in existing fields of assistive technology, which address practical issues likely to emerge in future real-world speech BCI applications.
Affiliation(s)
- Narayan Sankaran: Kavli Center for Ethics, Science and the Public, University of California, Berkeley, Berkeley, CA, United States; Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- David Moses: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- Winston Chiong: Memory and Aging Center, Department of Neurology, University of California, San Francisco, San Francisco, CA, United States
- Edward F. Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
9. Berezutskaya J, Freudenburg ZV, Vansteensel MJ, Aarnoutse EJ, Ramsey NF, van Gerven MAJ. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. J Neural Eng 2023; 20:056010. PMID: 37467739. PMCID: PMC10510111. DOI: 10.1088/1741-2552/ace8be.
Abstract
Objective. Development of brain-computer interface (BCI) technology is key for enabling communication in individuals who have lost the faculty of speech due to severe motor paralysis. A BCI control strategy that is gaining attention employs speech decoding from neural data. Recent studies have shown that a combination of direct neural recordings and advanced computational models can provide promising results. Understanding which decoding strategies deliver the best and most directly applicable results is crucial for advancing the field.
Approach. In this paper, we optimized and validated a decoding approach based on speech reconstruction directly from high-density electrocorticography recordings from sensorimotor cortex during a speech production task.
Main results. We show that (1) dedicated machine learning optimization of reconstruction models is key for achieving the best reconstruction performance; (2) individual word decoding in reconstructed speech achieves 92%-100% accuracy (chance level is 8%); (3) direct reconstruction from sensorimotor brain activity produces intelligible speech.
Significance. These results underline the need for model optimization in achieving the best speech decoding results and highlight the potential that reconstruction-based speech decoding from sensorimotor cortex offers for the development of next-generation BCI technology for communication.
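Word decoding from reconstructed speech is typically scored by matching each reconstructed spectrogram to candidate word templates; the 8% chance level quoted above is consistent with roughly 12 candidate words. A minimal correlation-based matching sketch, with the 12-class setup inferred from that chance level and everything else illustrative:

```python
# Nearest-template word classification of reconstructed spectrograms by
# Pearson correlation. Illustrative; the 12-class setup is inferred from
# the 8% chance level quoted in the abstract.
import numpy as np

def classify_by_correlation(recon: np.ndarray, templates: np.ndarray) -> int:
    """recon: (freq, time) reconstructed spectrogram;
    templates: (n_words, freq, time) mean spectrogram per word."""
    r = recon.ravel()
    scores = [np.corrcoef(r, t.ravel())[0, 1] for t in templates]
    return int(np.argmax(scores))

rng = np.random.default_rng(0)
templates = rng.standard_normal((12, 40, 100))               # 12 candidate words
recon = templates[3] + 0.5 * rng.standard_normal((40, 100))  # noisy copy of word 3
print(classify_by_correlation(recon, templates))             # -> 3
```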
Affiliation(s)
- Julia Berezutskaya: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands; Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
- Zachary V Freudenburg: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Mariska J Vansteensel: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Erik J Aarnoutse: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Nick F Ramsey: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Marcel A J van Gerven: Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
10. Meng K, Goodarzy F, Kim E, Park YJ, Kim JS, Cook MJ, Chung CK, Grayden DB. Continuous synthesis of artificial speech sounds from human cortical surface recordings during silent speech production. J Neural Eng 2023; 20:046019. PMID: 37459853. DOI: 10.1088/1741-2552/ace7f6.
Abstract
Objective. Brain-computer interfaces can restore various forms of communication in paralyzed patients who have lost their ability to articulate intelligible speech. This study aimed to demonstrate the feasibility of closed-loop synthesis of artificial speech sounds from human cortical surface recordings during silent speech production.
Approach. Ten participants with intractable epilepsy were temporarily implanted with intracranial electrode arrays over cortical surfaces. A decoding model that predicted audible outputs directly from patient-specific neural feature inputs was trained during overt word reading and immediately tested with overt, mimed and imagined word reading. Predicted outputs were later assessed objectively against corresponding voice recordings and subjectively through human perceptual judgments.
Main results. Artificial speech sounds were successfully synthesized during overt and mimed utterances by two participants with some coverage of the precentral gyrus. About a third of these sounds were correctly identified by naïve listeners in two-alternative forced-choice tasks. A similar outcome could not be achieved during imagined utterances by any of the participants. However, neural feature contribution analyses suggested the presence of exploitable activation patterns during imagined speech in the postcentral gyrus and the superior temporal gyrus. In future work, more comprehensive coverage of cortical surfaces, including posterior parts of the middle frontal gyrus and the inferior frontal gyrus, could improve synthesis performance during imagined speech.
Significance. As the field of speech neuroprostheses moves rapidly toward clinical trials, this study addressed important considerations about task instructions and brain coverage when conducting research on silent speech with non-target participants.
Affiliation(s)
- Kevin Meng: Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia; Graeme Clark Institute for Biomedical Engineering, The University of Melbourne, Melbourne, Australia
- Farhad Goodarzy: Department of Medicine, St Vincent's Hospital, The University of Melbourne, Melbourne, Australia
- EuiYoung Kim: Interdisciplinary Program in Neuroscience, Seoul National University, Seoul, Republic of Korea
- Ye Jin Park: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea
- June Sic Kim: Research Institute of Basic Sciences, Seoul National University, Seoul, Republic of Korea
- Mark J Cook: Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia; Graeme Clark Institute for Biomedical Engineering, The University of Melbourne, Melbourne, Australia; Department of Medicine, St Vincent's Hospital, The University of Melbourne, Melbourne, Australia
- Chun Kee Chung: Department of Brain and Cognitive Sciences, Seoul National University, Seoul, Republic of Korea; Department of Neurosurgery, Seoul National University Hospital, Seoul, Republic of Korea
- David B Grayden: Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia; Graeme Clark Institute for Biomedical Engineering, The University of Melbourne, Melbourne, Australia; Department of Medicine, St Vincent's Hospital, The University of Melbourne, Melbourne, Australia
11. Simistira Liwicki F, Gupta V, Saini R, De K, Abid N, Rakesh S, Wellington S, Wilson H, Liwicki M, Eriksson J. Bimodal electroencephalography-functional magnetic resonance imaging dataset for inner-speech recognition. Sci Data 2023; 10:378. PMID: 37311807. DOI: 10.1038/s41597-023-02286-w.
Abstract
The recognition of inner speech, which could give a 'voice' to patients who have no ability to speak or move, is a challenge for brain-computer interfaces (BCIs). A shortcoming of the available datasets is that they do not combine modalities to increase the performance of inner speech recognition. Multimodal datasets of brain data enable the fusion of neuroimaging modalities with complementary properties, such as the high spatial resolution of functional magnetic resonance imaging (fMRI) and the high temporal resolution of electroencephalography (EEG), and are therefore promising for decoding inner speech. This paper presents the first publicly available bimodal dataset containing EEG and fMRI data acquired nonsimultaneously during inner-speech production. Data were obtained from four healthy, right-handed participants during an inner-speech task with words in either a social or numerical category. Each of the eight word stimuli was assessed with 40 trials, resulting in 320 trials in each modality for each participant. The aim of this work is to provide a publicly available bimodal dataset on inner speech, contributing towards speech prostheses.
Affiliation(s)
- Foteini Simistira Liwicki: Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems LAB, Luleå, Sweden
- Vibha Gupta: Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems LAB, Luleå, Sweden
- Rajkumar Saini: Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems LAB, Luleå, Sweden
- Kanjar De: Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems LAB, Luleå, Sweden
- Nosheen Abid: Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems LAB, Luleå, Sweden
- Sumit Rakesh: Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems LAB, Luleå, Sweden
- Holly Wilson: University of Bath, Department of Computer Science, Bath, UK
- Marcus Liwicki: Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Intelligent Systems LAB, Luleå, Sweden
- Johan Eriksson: Umeå University, Department of Integrative Medical Biology (IMB) and Umeå Center for Functional Brain Imaging (UFBI), Umeå, Sweden
12. Nitta T, Horikawa J, Iribe Y, Taguchi R, Katsurada K, Shinohara S, Kawai G. Linguistic representation of vowels in speech imagery EEG. Front Hum Neurosci 2023; 17:1163578. PMID: 37275343. PMCID: PMC10237317. DOI: 10.3389/fnhum.2023.1163578.
Abstract
Speech imagery recognition from electroencephalograms (EEGs) could potentially become a strong contender among non-invasive brain-computer interfaces (BCIs). In this report, we first extract language representations as differences of line-spectra of phones by statistically analyzing many EEG signals from Broca's area. We then extract vowels by iterative search from hand-labeled short-syllable data. The iterative search process consists of principal component analysis (PCA), which visualizes the linguistic representation of vowels through eigenvectors φ(m), and a subspace method (SM), which searches for an optimum line-spectrum for redesigning φ(m). The extracted linguistic representation of the Japanese vowels /i/ /e/ /a/ /o/ /u/ shows two distinct spectral peaks (P1, P2) in the upper frequency range, and the five vowels are aligned on the P1-P2 chart. A five-vowel recognition experiment using a dataset of five subjects and a convolutional neural network (CNN) classifier gave a mean accuracy of 72.6%.
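The PCA half of this iterative search is a standard eigendecomposition of the trial-by-trial covariance of line-spectrum features; the sketch below shows that step with our own shapes and names, not the authors' implementation.

```python
# Sketch of the PCA half of the iterative search: project line-spectrum
# features of EEG trials onto leading eigenvectors to expose vowel structure.
# Shapes and names are ours, not the authors'.
import numpy as np

def leading_eigenvectors(spectra: np.ndarray, k: int = 2):
    """spectra: (n_trials, n_freq_bins) line-spectrum features.
    Returns (eigvecs, projections): the top-k eigenvectors phi(m) of the
    covariance matrix and each trial's coordinates along them."""
    centered = spectra - spectra.mean(axis=0)
    cov = centered.T @ centered / (len(spectra) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :k]            # reorder to take top-k components
    return top, centered @ top

rng = np.random.default_rng(1)
spectra = rng.standard_normal((200, 64))     # 200 trials, 64 frequency bins
phi, proj = leading_eigenvectors(spectra, k=2)
print(phi.shape, proj.shape)                 # (64, 2) (200, 2)
```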
Affiliation(s)
- Tsuneo Nitta: Graduate School of Engineering, Toyohashi University of Technology, Toyohashi, Japan
- Junsei Horikawa: Graduate School of Engineering, Toyohashi University of Technology, Toyohashi, Japan
- Yurie Iribe: Graduate School of Information Science and Technology, Aichi Prefectural University, Nagakute, Japan
- Ryo Taguchi: Graduate School of Information, Nagoya Institute of Technology, Nagoya, Japan
- Kouichi Katsurada: Faculty of Science and Technology, Tokyo University of Science, Noda, Japan
- Shuji Shinohara: School of Science and Engineering, Tokyo Denki University, Saitama, Japan
- Goh Kawai: Online Learning Support Team, Tokyo University of Foreign Studies, Tokyo, Japan
13. EEG-based covert speech decoding using random rotation extreme learning machine ensemble for intuitive BCI communication. Biomed Signal Process Control 2023. DOI: 10.1016/j.bspc.2022.104379.
14. Blenkmann AO, Solbakk AK, Ivanovic J, Larsson PG, Knight RT, Endestad T. Modeling intracranial electrodes: a simulation platform for the evaluation of localization algorithms. Front Neuroinform 2022; 16:788685. PMID: 36277477. PMCID: PMC9582989. DOI: 10.3389/fninf.2022.788685.
Abstract
Introduction. Intracranial electrodes are implanted in patients with drug-resistant epilepsy as part of their pre-surgical evaluation. This allows the investigation of normal and pathological brain functions with excellent spatial and temporal resolution. The spatial resolution relies on methods that precisely localize the implanted electrodes in the cerebral cortex, which is critical for drawing valid inferences about the anatomical localization of brain function. Multiple methods have been developed to localize the electrodes, mainly relying on pre-implantation MRI and post-implantation computed tomography (CT) images. However, they are hard to validate because there is no ground-truth data to test them, and there is no standard approach to systematically quantify their performance; in other words, their validation lacks standardization. Our work aimed to model intracranial electrode arrays and simulate realistic implantation scenarios, thereby providing localization algorithms with new ways to evaluate and optimize their performance.
Results. We implemented novel methods to model the coordinates of implanted grids, strips, and depth electrodes, as well as the CT artifacts these produce. We successfully modeled realistic implantation scenarios, including different sizes, inter-electrode distances, and brain areas. In total, ∼3,300 grids and strips were fitted over the brain surface, and ∼850 depth electrode arrays penetrating the cortical tissue were modeled. Realistic CT artifacts were simulated at the electrode locations under 12 different noise levels. Altogether, ∼50,000 thresholded CT artifact arrays were simulated in these scenarios and validated against real data from 17 patients regarding the coordinates' spatial deformation and the CT artifacts' shape, intensity distribution, and noise level. Finally, we provide an example of how the simulation platform is used to characterize the performance of two cluster-based localization methods.
Conclusion. We developed the first platform to model implanted intracranial grids, strips, and depth electrodes and to realistically simulate thresholded CT artifacts and their noise. These methods provide a basis for developing more complex models, while the simulations allow systematic evaluation of the performance of electrode localization techniques. The methods described in this article, and the results obtained from the simulations, are freely available via open repositories. A graphical user interface implementation is also accessible via the open-source iElectrodes toolbox.
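At its simplest, simulating an implanted grid amounts to generating a regular lattice of coordinates, deforming it, and adding localization noise where CT artifacts would be detected. The toy below shows only that first step under our own simplifications (a flat "cortex" and isotropic jitter); the released iElectrodes platform implements the full surface-fitting and artifact models.

```python
# Toy simulation of an electrode grid with localization noise, a simplified
# version of the scenarios the platform models (the released platform fits
# grids to real brain surfaces; here the "cortex" is a flat plane).
import numpy as np

def simulate_grid(rows: int, cols: int, pitch_mm: float = 10.0,
                  noise_mm: float = 0.5, seed: int = 0) -> np.ndarray:
    """Returns (rows*cols, 3) electrode coordinates: a planar grid with
    isotropic Gaussian jitter standing in for CT-artifact localization error."""
    rng = np.random.default_rng(seed)
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    xyz = np.stack([r.ravel() * pitch_mm, c.ravel() * pitch_mm,
                    np.zeros(rows * cols)], axis=1)
    return xyz + rng.normal(scale=noise_mm, size=xyz.shape)

grid = simulate_grid(8, 8)                 # an 8 x 8 grid, 10 mm pitch
print(grid.shape)                          # (64, 3)
print(np.linalg.norm(grid[0] - grid[1]))   # roughly the inter-electrode distance
```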
Affiliation(s)
- Alejandro O. Blenkmann: Department of Psychology, University of Oslo, Oslo, Norway; RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Anne-Kristin Solbakk: Department of Psychology, University of Oslo, Oslo, Norway; RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway; Department of Neurosurgery, Oslo University Hospital, Oslo, Norway; Department of Neuropsychology, Helgeland Hospital, Mosjøen, Norway
- Robert T. Knight: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Tor Endestad: Department of Psychology, University of Oslo, Oslo, Norway; RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway; Department of Neuropsychology, Helgeland Hospital, Mosjøen, Norway
15. Cooney C, Folli R, Coyle D. Opportunities, pitfalls and trade-offs in designing protocols for measuring the neural correlates of speech. Neurosci Biobehav Rev 2022; 140:104783. PMID: 35907491. DOI: 10.1016/j.neubiorev.2022.104783.
Abstract
Research on decoding speech and speech-related processes directly from the human brain has intensified in recent years, as such a decoder has the potential to positively impact people with limited communication capacity due to disease or injury. Additionally, it can present entirely new forms of human-computer interaction and human-machine communication in general, and facilitate better neuroscientific understanding of speech processes. Here, we synthesize the literature on neural speech decoding with respect to how speech decoding experiments have been conducted, coalescing around the necessity for thoughtful experimental design aimed at specific research goals and robust procedures for evaluating speech decoding paradigms. We examine the use of different modalities for presenting stimuli to participants, methods for constructing paradigms including timings and speech rhythms, and possible linguistic considerations. In addition, novel methods for eliciting naturalistic speech and validating imagined speech task performance in experimental settings are presented based on recent research. We also describe the multitude of terms used to instruct participants on how to produce imagined speech during experiments and propose methods for investigating the effect of these terms on imagined speech decoding. We demonstrate that the range of experimental procedures used in neural speech decoding studies can have unintended consequences that can impact the validity of the knowledge obtained. The review delineates the strengths and weaknesses of present approaches and proposes methodological advances that we anticipate will enhance experimental design and progress toward the optimal design of movement-independent direct speech brain-computer interfaces.
Affiliation(s)
- Ciaran Cooney: Intelligent Systems Research Centre, Ulster University, Derry, UK
- Raffaella Folli: Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
- Damien Coyle: Intelligent Systems Research Centre, Ulster University, Derry, UK
16. Ward LM, Guevara R. Qualia and phenomenal consciousness arise from the information structure of an electromagnetic field in the brain. Front Hum Neurosci 2022; 16:874241. PMID: 35860400. PMCID: PMC9289677. DOI: 10.3389/fnhum.2022.874241.
Abstract
In this paper we address the following problems and provide realistic answers to them: (1) What could be the physical substrate for subjective, phenomenal consciousness (P-consciousness)? Our answer: the electromagnetic (EM) field generated by the movement and changes of electrical charges in the brain. (2) Is this substrate generated in some particular part of the brains of conscious entities, or does it comprise the entirety of the brain/body? Our answer: a part of the thalamus in mammals, and homologous parts of other brains, generate the critical EM field. (3) From whence arise the qualia experienced in P-consciousness? Our answer: the relevant EM field is “structured” by emulating in the brain the information in EM fields arising from both external (the environment) and internal (the body) sources. (4) What differentiates the P-conscious EM field from other EM fields, e.g., the flux of photons scattered from object surfaces, the EM field of an electro-magnet, or the EM fields generated in the brain that do not enter P-consciousness, such as those generated in the retina or occipital cortex, or those generated in brain areas that guide behavior through visual information in persons exhibiting “blindsight”? Our answer: living systems express a boundary between themselves and the environment, requiring them to model (coarsely emulate) information from their environment in order to control, through actions and to the extent possible, the vast sea of variety in which they are immersed. This model, expressed in an EM field, is P-consciousness. The model is the best possible representation of the moment-to-moment, niche-relevant (action-relevant: affordance) information an organism can generate (a Gestalt). Information that is at a lower level than niche-relevant, such as the unanalyzed retinal vector-field, is not represented in P-consciousness because it is not niche-relevant. Living organisms have sensory and other systems that have evolved to supply such information, albeit in a coarse form.
Affiliation(s)
- Lawrence M. Ward: Department of Psychology and Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, BC, Canada
- Ramón Guevara: Department of Physics and Astronomy, University of Padua, Padua, Italy; Department of Developmental Psychology and Socialization, Padova Neuroscience Center, University of Padua, Padua, Italy
17. Lee KW, Lee DH, Kim SJ, Lee SW. Decoding neural correlation of language-specific imagined speech using EEG signals. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:1977-1980. PMID: 36086641. DOI: 10.1109/embc48229.2022.9871721.
Abstract
Speech impairments due to cerebral lesions and degenerative disorders can be devastating. For humans with severe speech deficits, imagined speech has been a promising brain-computer interface approach for reconstructing the neural signals of speech production. However, studies in the EEG-based imagined speech domain still have limitations due to high variability in spatial and temporal information and a low signal-to-noise ratio. In this paper, we investigated the neural signals of two groups of native speakers performing tasks in two different languages, English and Chinese. Our assumption was that English, a non-tonal and phonogram-based language, would show spectral differences in neural computation compared with Chinese, a tonal and ideogram-based language. The results showed significant differences in the relative power spectral density between English and Chinese in specific frequency-band groups. Also, the spatial pattern of Chinese native speakers in the theta band was distinctive during the imagination task. Hence, this paper suggests key spectral and spatial information for word imagination in a specific language when decoding the neural signals of speech. Clinical relevance: Imagined speech research supports the development of assistive communication technology, especially for patients with speech disorders such as aphasia due to brain damage. This study identifies significant spectral features by analyzing cross-language differences in EEG-based imagined speech using two widely used languages.
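Relative power spectral density is band power divided by total power, computed per channel. The sketch below uses Welch's method and one common set of band boundaries, which are not necessarily the ones used in this paper.

```python
# Relative power spectral density per frequency band via Welch's method.
# Band boundaries are one common convention, not necessarily the paper's.
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def relative_band_power(x: np.ndarray, fs: float) -> dict:
    """x: (n_samples,) single-channel EEG; returns band power / total power."""
    freqs, psd = welch(x, fs=fs, nperseg=int(2 * fs))
    total = psd.sum()
    # Uniform frequency spacing, so summing PSD bins is proportional to power.
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}

fs = 250.0
eeg = np.random.randn(int(10 * fs))   # 10 s of fake single-channel EEG
print(relative_band_power(eeg, fs))
```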
18. Liu Y, Gong A, Ding P, Zhao L, Qian Q, Zhou J, Su L, Fu Y. [Key technology of brain-computer interaction based on speech imagery]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (Journal of Biomedical Engineering) 2022; 39:596-611. PMID: 35788530. PMCID: PMC10950764. DOI: 10.7507/1001-5515.202107018.
Abstract
Speech expression is an important high-level cognitive behavior of human beings, and its realization is closely tied to brain activity. Overt speech expression and speech imagination activate some of the same brain areas, so speech imagery has become a new paradigm of brain-computer interaction. Brain-computer interfaces (BCIs) based on speech imagery have the advantages of spontaneous generation, no training requirement, and friendliness to subjects, and have therefore attracted the attention of many researchers. However, this interaction technology is not yet mature in the design of experimental paradigms and the choice of imagined material, and many issues remain open. In response to these problems, this article first describes the neural mechanism of speech imagery. Then, by reviewing previous BCI research on speech imagery, it systematically analyzes the mainstream methods and core technologies of experimental paradigms, imagined materials, data processing, and related areas. Finally, the key problems and main challenges that restrict the development of this type of BCI are discussed, and the future development and applications of speech-imagery BCI systems are considered.
Affiliation(s)
- Liu Yanpeng: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China; Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Gong Anmin: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Ding Peng: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China; Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Zhao Lei: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Qian Qian: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China; Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Zhou Jianhua: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China; Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Su Lei: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China; Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China
- Fu Yunfa: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P. R. China; Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming 650500, P. R. China; College of Information Engineering, Engineering University of PAP, Xi'an 710000, P. R. China; Faculty of Science, Kunming University of Science and Technology, Kunming 650500, P. R. China
19. Wilson BS, Tucci DL, Moses DA, Chang EF, Young NM, Zeng FG, Lesica NA, Bur AM, Kavookjian H, Mussatto C, Penn J, Goodwin S, Kraft S, Wang G, Cohen JM, Ginsburg GS, Dawson G, Francis HW. Harnessing the power of artificial intelligence in otolaryngology and the communication sciences. J Assoc Res Otolaryngol 2022; 23:319-349. PMID: 35441936. PMCID: PMC9086071. DOI: 10.1007/s10162-022-00846-2.
Abstract
Use of artificial intelligence (AI) is a burgeoning field in otolaryngology and the communication sciences. A virtual symposium on the topic was convened from Duke University on October 26, 2020, and was attended by more than 170 participants worldwide. This review presents summaries of all but one of the talks presented during the symposium; recordings of all the talks, along with the discussions for the talks, are available at https://www.youtube.com/watch?v=ktfewrXvEFg and https://www.youtube.com/watch?v=-gQ5qX2v3rg. Each of the summaries is about 2500 words in length and each summary includes two figures. This level of detail far exceeds the brief summaries presented in traditional reviews and thus provides a more informed glimpse into the power and diversity of current AI applications in otolaryngology and the communication sciences and how to harness that power for future applications.
Affiliation(s)
- Blake S. Wilson
- grid.26009.3d0000 0004 1936 7961Department of Head and Neck Surgery & Communication Sciences, Duke University School of Medicine, Durham, NC 27710 USA ,grid.26009.3d0000 0004 1936 7961Duke Hearing Center, Duke University School of Medicine, Durham, NC 27710 USA ,grid.26009.3d0000 0004 1936 7961Department of Electrical & Computer Engineering, Duke University, Durham, NC 27708 USA ,grid.26009.3d0000 0004 1936 7961Department of Biomedical Engineering, Duke University, Durham, NC 27708 USA ,grid.410711.20000 0001 1034 1720Department of Otolaryngology – Head & Neck Surgery, University of North Carolina, Chapel Hill, Chapel Hill, NC 27599 USA
| | - Debara L. Tucci
- grid.26009.3d0000 0004 1936 7961Department of Head and Neck Surgery & Communication Sciences, Duke University School of Medicine, Durham, NC 27710 USA ,grid.214431.10000 0001 2226 8444National Institute On Deafness and Other Communication Disorders, National Institutes of Health, Bethesda, MD 20892 USA
| | - David A. Moses
- grid.266102.10000 0001 2297 6811Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143 USA ,grid.266102.10000 0001 2297 6811UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94117 USA
| | - Edward F. Chang
- grid.266102.10000 0001 2297 6811Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA 94143 USA ,grid.266102.10000 0001 2297 6811UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94117 USA
| | - Nancy M. Young
- grid.413808.60000 0004 0388 2248Division of Otolaryngology, Ann and Robert H. Lurie Childrens Hospital of Chicago, Chicago, IL 60611 USA ,grid.16753.360000 0001 2299 3507Department of Otolaryngology - Head and Neck Surgery, Northwestern University Feinberg School of Medicine, Chicago, IL 60611 USA ,grid.16753.360000 0001 2299 3507Department of Communication, Knowles Hearing Center, Northwestern University, Evanston, IL 60208 USA
| | - Fan-Gang Zeng
- grid.266093.80000 0001 0668 7243Center for Hearing Research, University of California, Irvine, Irvine, CA 92697 USA ,grid.266093.80000 0001 0668 7243Department of Anatomy and Neurobiology, University of California, Irvine, Irvine, CA 92697 USA ,grid.266093.80000 0001 0668 7243Department of Biomedical Engineering, University of California, Irvine, Irvine, CA 92697 USA ,grid.266093.80000 0001 0668 7243Department of Cognitive Sciences, University of California, Irvine, Irvine, CA 92697 USA ,grid.266093.80000 0001 0668 7243Department of Otolaryngology – Head and Neck Surgery, University of California, Irvine, CA 92697 USA
| | - Nicholas A. Lesica
- grid.83440.3b0000000121901201UCL Ear Institute, University College London, London, WC1X 8EE UK
| | - Andrés M. Bur
- grid.266515.30000 0001 2106 0692Department of Otolaryngology - Head and Neck Surgery, Medical Center, University of Kansas, Kansas City, KS 66160 USA
| | - Hannah Kavookjian
- grid.266515.30000 0001 2106 0692Department of Otolaryngology - Head and Neck Surgery, Medical Center, University of Kansas, Kansas City, KS 66160 USA
| | - Caroline Mussatto
- grid.266515.30000 0001 2106 0692Department of Otolaryngology - Head and Neck Surgery, Medical Center, University of Kansas, Kansas City, KS 66160 USA
| | - Joseph Penn
- grid.266515.30000 0001 2106 0692Department of Otolaryngology - Head and Neck Surgery, Medical Center, University of Kansas, Kansas City, KS 66160 USA
| | - Sara Goodwin
- grid.266515.30000 0001 2106 0692Department of Otolaryngology - Head and Neck Surgery, Medical Center, University of Kansas, Kansas City, KS 66160 USA
| | - Shannon Kraft
- grid.266515.30000 0001 2106 0692Department of Otolaryngology - Head and Neck Surgery, Medical Center, University of Kansas, Kansas City, KS 66160 USA
| | - Guanghui Wang
- grid.68312.3e0000 0004 1936 9422Department of Computer Science, Ryerson University, Toronto, ON M5B 2K3 Canada
| | - Jonathan M. Cohen
- grid.26009.3d0000 0004 1936 7961Department of Head and Neck Surgery & Communication Sciences, Duke University School of Medicine, Durham, NC 27710 USA ,grid.415014.50000 0004 0575 3669ENT Department, Kaplan Medical Center, 7661041 Rehovot, Israel
| | - Geoffrey S. Ginsburg
- Department of Biomedical Engineering, Duke University, Durham, NC 27708 USA
- MEDx (Medicine & Engineering at Duke), Duke University, Durham, NC 27708 USA
- Center for Applied Genomics & Precision Medicine, Duke University School of Medicine, Durham, NC 27710 USA
- Department of Medicine, Duke University School of Medicine, Durham, NC 27710 USA
- Department of Pathology, Duke University School of Medicine, Durham, NC 27710 USA
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710 USA
| | - Geraldine Dawson
- Duke Institute for Brain Sciences, Duke University, Durham, NC 27710 USA
- Duke Center for Autism and Brain Development, Duke University School of Medicine and the Duke Institute for Brain Sciences, NIH Autism Center of Excellence, Durham, NC 27705 USA
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC 27701 USA
| | - Howard W. Francis
- Department of Head and Neck Surgery & Communication Sciences, Duke University School of Medicine, Durham, NC 27710 USA
| |
Collapse
|
20
|
Fujiwara Y, Ushiba J. Deep Residual Convolutional Neural Networks for Brain-Computer Interface to Visualize Neural Processing of Hand Movements in the Human Brain. Front Comput Neurosci 2022; 16:882290. [PMID: 35669388 PMCID: PMC9165810 DOI: 10.3389/fncom.2022.882290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2022] [Accepted: 04/19/2022] [Indexed: 11/29/2022] Open
Abstract
Concomitant with the development of deep learning, brain-computer interface (BCI) decoding technology has been rapidly evolving. Convolutional neural networks (CNNs), which are generally used as electroencephalography (EEG) classification models, are often deployed in BCI prototypes to improve the estimation accuracy of a participant's brain activity. However, because most BCI models are trained, validated, and tested via within-subject cross-validation and there is no corresponding generalization model, their applicability to unknown participants is not guaranteed. In this study, to facilitate the generalization of BCI model performance to unknown participants, we trained a model comprising multiple layers of residual CNNs and visualized the reasons for BCI classification to reveal the location and timing of neural activities that contribute to classification. Specifically, to develop a BCI that can distinguish between rest, left-hand movement, and right-hand movement tasks with high accuracy, we created multiple layers of CNNs, inserted residual networks into the multilayer structure, and used a larger dataset than in previous studies. The constructed model was analyzed with gradient-weighted class activation mapping (Grad-CAM). We evaluated the developed model via cross-subject validation and found that it achieved significantly improved accuracy (85.69 ± 1.10%) compared with conventional models and with the same architecture without residual networks. Grad-CAM analysis of correctly classified cases showed localized activity near the premotor cortex. These results confirm the effectiveness of inserting residual networks into CNNs for tuning BCIs. Further, they suggest that recording EEG signals over the premotor cortex and some other areas contributes to high classification accuracy.
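To make the approach above concrete, here is a minimal PyTorch sketch of a residual CNN for three-class EEG decoding (rest vs. left-hand vs. right-hand movement). All layer widths, kernel sizes, and input shapes are illustrative assumptions, not the authors' published architecture.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv1d(channels, channels, kernel_size=7, padding=3)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size=7, padding=3)
        self.bn1 = nn.BatchNorm1d(channels)
        self.bn2 = nn.BatchNorm1d(channels)
        self.act = nn.ELU()

    def forward(self, x):
        # Identity shortcut: the block learns a residual correction to x.
        h = self.act(self.bn1(self.conv1(x)))
        h = self.bn2(self.conv2(h))
        return self.act(h + x)

class EEGResNet(nn.Module):
    def __init__(self, n_electrodes=64, n_classes=3, width=32, n_blocks=4):
        super().__init__()
        self.stem = nn.Conv1d(n_electrodes, width, kernel_size=25, padding=12)
        self.blocks = nn.Sequential(*[ResidualBlock(width) for _ in range(n_blocks)])
        self.head = nn.Linear(width, n_classes)

    def forward(self, x):                      # x: (batch, electrodes, time samples)
        h = self.blocks(self.stem(x))
        return self.head(h.mean(dim=-1))       # global average pool over time

model = EEGResNet()
logits = model(torch.randn(8, 64, 500))        # e.g. 2 s of 64-channel EEG at 250 Hz
print(logits.shape)                            # torch.Size([8, 3])
```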
Collapse
Affiliation(s)
- Yosuke Fujiwara
- Graduate School of Science and Technology, Keio University, Yokohama, Japan
- Information Services International-Dentsu, Ltd., Tokyo, Japan
| | - Junichi Ushiba
- Faculty of Science and Technology, Keio University, Yokohama, Japan
| |
Collapse
|
21
|
Rethinking the Methods and Algorithms for Inner Speech Decoding and Making Them Reproducible. NEUROSCI 2022. [DOI: 10.3390/neurosci3020017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
This study focuses on the automatic decoding of inner speech using noninvasive methods, such as electroencephalography (EEG). While inner speech has been a research topic in philosophy and psychology for half a century, recent attempts have been made to decode nonvoiced spoken words by using various brain–computer interfaces. The main shortcomings of existing work are reproducibility and the availability of data and code. In this work, we investigate various methods (Convolutional Neural Network (CNN), Gated Recurrent Unit (GRU), and Long Short-Term Memory (LSTM) networks) for classifying five vowels and six words on a publicly available EEG dataset. The main contributions of this work are (1) a comparison of subject-dependent vs. subject-independent approaches, (2) an analysis of the effect of different preprocessing steps (Independent Component Analysis (ICA), down-sampling, and filtering), and (3) word classification (where we achieve state-of-the-art performance on a publicly available dataset). Overall, we achieve a performance accuracy of 35.20% and 29.21% when classifying five vowels and six words, respectively, in a publicly available dataset, using our tuned iSpeech-CNN architecture. All of our code and processed data are publicly available to ensure reproducibility. As such, this work contributes to a deeper understanding and reproducibility of experiments in the area of inner speech detection.
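The filtering and down-sampling preprocessing steps this study compares can be sketched with standard SciPy routines (ICA is omitted here for brevity). The sampling rates, band edges, and array shapes below are assumptions for illustration, not the dataset's actual parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, decimate

def preprocess(eeg, fs=1024, band=(2.0, 40.0), target_fs=256):
    """eeg: (trials, channels, samples) raw array -> filtered, down-sampled copy."""
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, eeg, axis=-1)          # zero-phase band-pass
    q = fs // target_fs                                # integer down-sampling factor
    return decimate(filtered, q, axis=-1, zero_phase=True)

rng = np.random.default_rng(0)
trials = rng.standard_normal((60, 128, 2048))          # fake 2 s trials at 1024 Hz
print(preprocess(trials).shape)                        # (60, 128, 512)
```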
Collapse
|
22
|
Luo S, Rabbani Q, Crone NE. Brain-Computer Interface: Applications to Speech Decoding and Synthesis to Augment Communication. Neurotherapeutics 2022; 19:263-273. [PMID: 35099768 PMCID: PMC9130409 DOI: 10.1007/s13311-022-01190-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/16/2022] [Indexed: 01/03/2023] Open
Abstract
Damage or degeneration of motor pathways necessary for speech and other movements, as in brainstem strokes or amyotrophic lateral sclerosis (ALS), can interfere with efficient communication without affecting brain structures responsible for language or cognition. In the worst-case scenario, this can result in locked-in syndrome (LIS), a condition in which individuals cannot initiate communication and can only express themselves by answering yes/no questions with eye blinks or other rudimentary movements. Existing augmentative and alternative communication (AAC) devices that rely on eye tracking can improve the quality of life for people with this condition, but brain-computer interfaces (BCIs) are also increasingly being investigated as AAC devices, particularly when eye tracking is too slow or unreliable. Moreover, with recent and ongoing advances in machine learning and neural recording technologies, BCIs may offer the only means to go beyond cursor control and text generation on a computer, to allow real-time synthesis of speech, which would arguably offer the most efficient and expressive channel for communication. The potential for BCI speech synthesis has only recently been realized because of seminal studies of the neuroanatomical and neurophysiological underpinnings of speech production using intracranial electrocorticographic (ECoG) recordings in patients undergoing epilepsy surgery. These studies have shown that cortical areas responsible for vocalization and articulation are distributed over a large area of ventral sensorimotor cortex, and that it is possible to decode speech and reconstruct its acoustics from ECoG if these areas are recorded with sufficiently dense and comprehensive electrode arrays. In this article, we review these advances, including the latest neural decoding strategies that range from deep learning models to the direct concatenation of speech units. We also discuss state-of-the-art vocoders that are integral in constructing natural-sounding audio waveforms for speech BCIs. Finally, this review outlines some of the challenges ahead in directly synthesizing speech for patients with LIS.
Collapse
Affiliation(s)
- Shiyu Luo
- Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA.
| | - Qinwan Rabbani
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA
| | - Nathan E Crone
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
| |
Collapse
|
23
|
Kiroy V, Bakhtin O, Krivko E, Lazurenko D, Aslanyan E, Shaposhnikov D, Shcherban I. Spoken and Inner Speech-related EEG Connectivity in Different Spatial Direction. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103224] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
|
24
|
Cooney C, Folli R, Coyle D. A bimodal deep learning architecture for EEG-fNIRS decoding of overt and imagined speech. IEEE Trans Biomed Eng 2021; 69:1983-1994. [PMID: 34874850 DOI: 10.1109/tbme.2021.3132861] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
OBJECTIVE Brain-computer interface (BCI) studies are increasingly leveraging different attributes of multiple signal modalities simultaneously. Bimodal data acquisition protocols combining the temporal resolution of electroencephalography (EEG) with the spatial resolution of functional near-infrared spectroscopy (fNIRS) require novel approaches to decoding. METHODS We present an EEG-fNIRS hybrid BCI that employs a new bimodal deep neural network architecture consisting of two convolutional sub-networks (subnets) to decode overt and imagined speech. Features from each subnet are fused before further feature extraction and classification. Nineteen participants performed overt and imagined speech in a novel cue-based paradigm enabling investigation of stimulus and linguistic effects on decoding. RESULTS Using the hybrid approach, classification accuracies (46.31% and 34.29% for overt and imagined speech, respectively; chance: 25%) indicated a significant improvement over EEG used independently for imagined speech (p=0.020), while tending towards significance for overt speech (p=0.098). In comparison with fNIRS, significant improvements for both speech types were achieved with bimodal decoding (p<0.001). There was a mean difference of ~12.02% between overt and imagined speech, with accuracies as high as 87.18% and 53%. Deeper subnets enhanced performance, while stimulus type affected overt and imagined speech in significantly different ways. CONCLUSION The bimodal approach was a significant improvement on unimodal results for several tasks. Results indicate the potential of multi-modal deep learning for enhancing neural signal decoding. SIGNIFICANCE This novel architecture can be used to enhance speech decoding from bimodal neural signals.
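A minimal sketch of the fusion idea named in this abstract, one convolutional subnet per modality with features concatenated before a shared classifier, might look as follows in PyTorch. Channel counts, window lengths, and layer sizes are invented for illustration.

```python
import torch
import torch.nn as nn

class Subnet(nn.Module):
    """One convolutional feature extractor per modality."""
    def __init__(self, in_channels, width=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, width, kernel_size=11, padding=5),
            nn.ELU(),
            nn.AdaptiveAvgPool1d(1),   # collapse the time axis
            nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)

class BimodalNet(nn.Module):
    def __init__(self, eeg_ch=32, fnirs_ch=16, n_classes=4):
        super().__init__()
        self.eeg_subnet = Subnet(eeg_ch)
        self.fnirs_subnet = Subnet(fnirs_ch)
        # Fused features (16 + 16) feed a shared classifier head.
        self.classifier = nn.Sequential(nn.Linear(32, 32), nn.ELU(), nn.Linear(32, n_classes))

    def forward(self, eeg, fnirs):
        fused = torch.cat([self.eeg_subnet(eeg), self.fnirs_subnet(fnirs)], dim=1)
        return self.classifier(fused)

net = BimodalNet()
out = net(torch.randn(8, 32, 1000), torch.randn(8, 16, 100))  # EEG and fNIRS windows
print(out.shape)  # torch.Size([8, 4]) -- four classes, matching 25% chance level
```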
Collapse
|
25
|
Cajigas I, Davis KC, Meschede-Krasa B, Prins NW, Gallo S, Naeem JA, Palermo A, Wilson A, Guerra S, Parks BA, Zimmerman L, Gant K, Levi AD, Dietrich WD, Fisher L, Vanni S, Tauber JM, Garwood IC, Abel JH, Brown EN, Ivan ME, Prasad A, Jagid J. Implantable brain-computer interface for neuroprosthetic-enabled volitional hand grasp restoration in spinal cord injury. Brain Commun 2021; 3:fcab248. [PMID: 34870202 PMCID: PMC8637800 DOI: 10.1093/braincomms/fcab248] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2021] [Revised: 07/27/2021] [Accepted: 08/19/2021] [Indexed: 11/12/2022] Open
Abstract
Loss of hand function after cervical spinal cord injury severely impairs functional independence. We describe a method for restoring volitional control of hand grasp in one 21-year-old male subject with complete cervical quadriplegia (C5 American Spinal Injury Association Impairment Scale A) using a portable fully implanted brain-computer interface within the home environment. The brain-computer interface consists of subdural surface electrodes placed over the dominant-hand motor cortex and connects to a transmitter implanted subcutaneously below the clavicle, which allows continuous reading of the electrocorticographic activity. Movement intent was used to trigger functional electrical stimulation of the dominant hand during an initial 29-week laboratory study and subsequently via a mechanical hand orthosis during in-home use. Movement-intent information could be decoded consistently throughout the 29-week in-laboratory study with a mean accuracy of 89.0% (range 78-93.3%). Improvements were observed in both the speed and accuracy of various upper extremity tasks, including lifting small objects and transferring objects to specific targets. At-home decoding accuracy reached 91.3% (range 80-98.95%) during open-loop trials and 88.3% (range 77.6-95.5%) during closed-loop trials. Importantly, the temporal stability of both the functional outcomes and decoder metrics was not explored in this study. A fully implanted brain-computer interface can be safely used to reliably decode movement intent from motor cortex, allowing for accurate volitional control of hand grasp.
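The closed-loop principle described here, classifying short ECoG windows as movement intent vs. rest and triggering stimulation only on sustained detections, can be sketched as below. The log-power features, window length, and debouncing rule are assumptions, not the study's actual decoder.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def band_power(window):
    """Log power per channel; window: (channels, samples)."""
    return np.log(np.mean(window ** 2, axis=1) + 1e-12)

rng = np.random.default_rng(1)
rest = rng.standard_normal((200, 8, 256))              # simulated training windows
move = rng.standard_normal((200, 8, 256)) * 1.5        # crude effect: higher power
X = np.array([band_power(w) for w in np.concatenate([rest, move])])
y = np.array([0] * 200 + [1] * 200)
decoder = LinearDiscriminantAnalysis().fit(X, y)

def should_trigger(recent_preds, k=3):
    # Debounce: require k consecutive "move" predictions before stimulating.
    return len(recent_preds) >= k and all(p == 1 for p in recent_preds[-k:])

preds = [int(decoder.predict(band_power(w)[None, :])[0]) for w in move[:5]]
print(preds, should_trigger(preds))
```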
Collapse
Affiliation(s)
- Iahn Cajigas
- Department of Neurological Surgery, University of Miami, Miami, FL 33136, USA
| | - Kevin C Davis
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
| | - Benyamin Meschede-Krasa
- Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Noeline W Prins
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
- Department of Electrical and Information Engineering, Faculty of Engineering, University of Ruhuna, Hapugala, Galle 80000, Sri Lanka
| | - Sebastian Gallo
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
| | - Jasim Ahmad Naeem
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
| | - Anne Palermo
- Department of Physical Therapy, University of Miami, Miami, FL 33146, USA
| | - Audrey Wilson
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| | - Santiago Guerra
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
| | - Brandon A Parks
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
| | - Lauren Zimmerman
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
| | - Katie Gant
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| | - Allan D Levi
- Department of Neurological Surgery, University of Miami, Miami, FL 33136, USA
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| | - W Dalton Dietrich
- Department of Neurological Surgery, University of Miami, Miami, FL 33136, USA
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| | - Letitia Fisher
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| | - Steven Vanni
- Department of Neurological Surgery, University of Miami, Miami, FL 33136, USA
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| | - John Michael Tauber
- Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Indie C Garwood
- Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - John H Abel
- Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Emery N Brown
- Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Michael E Ivan
- Department of Neurological Surgery, University of Miami, Miami, FL 33136, USA
| | - Abhishek Prasad
- Department of Biomedical Engineering, University of Miami, Miami, FL 33146, USA
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| | - Jonathan Jagid
- Department of Neurological Surgery, University of Miami, Miami, FL 33136, USA
- Miami Project to Cure Paralysis, University of Miami, Miami, FL 33136, USA
| |
Collapse
|
26
|
Duffau H. The death of localizationism: The concepts of functional connectome and neuroplasticity deciphered by awake mapping, and their implications for best care of brain-damaged patients. Rev Neurol (Paris) 2021; 177:1093-1103. [PMID: 34563375 DOI: 10.1016/j.neurol.2021.07.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2021] [Revised: 06/20/2021] [Accepted: 07/23/2021] [Indexed: 11/28/2022]
Abstract
Although clinical neurology was mainly erected on the dogma of localizationism, numerous reports have described functional recovery after lesions involving presumed non-compensable areas in an inflexible view of brain processing. Here, the purpose is to review new insights into the functional connectome and the mechanisms underpinning neural plasticity, gained from intraoperative direct electrostimulation mapping and real-time behavioral monitoring in awake patients, combined with perioperative neuropsychological and neuroimaging data. Such longitudinal anatomo-functional correlations resulted in the reappraisal of classical models of cognition, especially by highlighting the dynamic interplay within and between neural circuits, leading to the concept of meta-network (network of networks), as well as by emphasizing that subcortical connectivity is the main limitation of neuroplastic potential. Beyond their contribution to basic neurosciences, these findings might also be helpful for an optimization of care for brain-damaged patients, such as in resective oncological or epilepsy neurosurgery in structures traditionally deemed inoperable (e.g., in Broca's area) as well as for elaborating new programs of functional rehabilitation, eventually combined with transcranial brain stimulation, aiming to change the connectivity patterns in order to enhance cognitive competences following cerebral injury.
Collapse
Affiliation(s)
- H Duffau
- Department of Neurosurgery, Gui-de-Chauliac Hospital, Montpellier University Medical Center, 80, avenue Augustin-Fliche, 34295 Montpellier, France
- National Institute for Health and Medical Research (INSERM), U1191 Laboratory, Team "Brain Plasticity, Stem Cells and Low-Grade Gliomas", Institute of Functional Genomics, University of Montpellier, 34091 Montpellier, France
| |
Collapse
|
27
|
Local field potentials in a pre-motor region predict learned vocal sequences. PLoS Comput Biol 2021; 17:e1008100. [PMID: 34555020 PMCID: PMC8460039 DOI: 10.1371/journal.pcbi.1008100] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2020] [Accepted: 07/08/2021] [Indexed: 11/19/2022] Open
Abstract
Neuronal activity within the premotor region HVC is tightly synchronized to, and crucial for, the articulate production of learned song in birds. Characterizations of this neural activity detail patterns of sequential bursting in small, carefully identified subsets of neurons in the HVC population. The dynamics of HVC are well described by these characterizations, but have not been verified beyond this scale of measurement. There is a rich history of using local field potentials (LFP) to extract information about behavior that extends beyond the contribution of individual cells. These signals have the advantage of being stable over longer periods of time, and they have been used to study and decode human speech and other complex motor behaviors. Here we characterize LFP signals presumptively from the HVC of freely behaving male zebra finches during song production to determine if population activity may yield similar insights into the mechanisms underlying complex motor-vocal behavior. Following an initial observation that structured changes in the LFP were distinct to all vocalizations during song, we show that it is possible to extract time-varying features from multiple frequency bands to decode the identity of specific vocalization elements (syllables) and to predict their temporal onsets within the motif. This demonstrates the utility of LFP for studying vocal behavior in songbirds. Surprisingly, the time frequency structure of HVC LFP is qualitatively similar to well-established oscillations found in both human and non-human mammalian motor areas. This physiological similarity, despite distinct anatomical structures, may give insight into common computational principles for learning and/or generating complex motor-vocal behaviors. Vocalizations, such as speech and song, are a motor process that requires the coordination of numerous muscle groups receiving instructions from specific brain regions. In songbirds, HVC is a premotor brain region required for singing; it is populated by a set of neurons that fire sparsely during song. How HVC enables song generation is not well understood. Here we describe network activity presumptively from HVC that precedes the initiation of each vocal element during singing. This network activity can be used to predict both the identity of each vocal element (syllable) and when it will occur during song. In addition, this network activity is similar to activity that has been documented in human, non-human primate, and mammalian premotor regions tied to muscle movements. These similarities add to a growing body of literature that finds parallels between songbirds and humans in respect to the motor control of vocal organs. Furthermore, given the similarities of the songbird and human motor-vocal systems, these results suggest that the songbird model could be leveraged to accelerate the development of clinically translatable speech prosthesis.
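The decoding step described above, extracting features from multiple LFP frequency bands to classify syllable identity, can be sketched as follows. The band edges, window sizes, and simulated data are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

BANDS = [(4, 8), (8, 15), (15, 30), (30, 70), (70, 150)]  # illustrative bands

def band_features(lfp, fs=1000):
    """lfp: (samples,) single-channel window -> one mean Hilbert envelope per band."""
    feats = []
    for lo, hi in BANDS:
        sos = butter(4, (lo, hi), btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, lfp)))   # amplitude envelope
        feats.append(env.mean())
    return np.array(feats)

rng = np.random.default_rng(2)
windows = rng.standard_normal((120, 500))          # 120 fake pre-syllable LFP windows
labels = rng.integers(0, 4, size=120)              # 4 syllable classes (fake)
X = np.array([band_features(w) for w in windows])
clf = LinearDiscriminantAnalysis().fit(X, labels)
print(clf.score(X, labels))                        # training fit only, for shape checks
```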
Collapse
|
28
|
Lestrell E, O'Brien CM, Elnathan R, Voelcker NH. Vertically Aligned Nanostructured Topographies for Human Neural Stem Cell Differentiation and Neuronal Cell Interrogation. ADVANCED THERAPEUTICS 2021. [DOI: 10.1002/adtp.202100061] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Affiliation(s)
- Esther Lestrell
- Faculty of Pharmacy and Pharmaceutical Sciences, Monash University, Parkville, VIC 3052, Australia
- Melbourne Centre for Nanofabrication, Victorian Node of the Australian National Fabrication Facility, 151 Wellington Road, Clayton, Victoria 3168, Australia
- CSIRO Manufacturing, Clayton, Victoria 3168, Australia
| | - Carmel M. O'Brien
- CSIRO Manufacturing, Clayton, Victoria 3168, Australia
- Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria 3168, Australia
| | - Roey Elnathan
- Faculty of Pharmacy and Pharmaceutical Sciences, Monash University, Parkville, VIC 3052, Australia
- Melbourne Centre for Nanofabrication, Victorian Node of the Australian National Fabrication Facility, 151 Wellington Road, Clayton, Victoria 3168, Australia
| | - Nicolas H. Voelcker
- Faculty of Pharmacy and Pharmaceutical Sciences, Monash University, Parkville, VIC 3052, Australia
- Melbourne Centre for Nanofabrication, Victorian Node of the Australian National Fabrication Facility, 151 Wellington Road, Clayton, Victoria 3168, Australia
- CSIRO Manufacturing, Clayton, Victoria 3168, Australia
| |
Collapse
|
29
|
Li F, Chao W, Li Y, Fu B, Ji Y, Wu H, Shi G. Decoding imagined speech from EEG signals using hybrid-scale spatial-temporal dilated convolution network. J Neural Eng 2021; 18. [PMID: 34256357 DOI: 10.1088/1741-2552/ac13c0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2021] [Accepted: 07/13/2021] [Indexed: 11/12/2022]
Abstract
Objective. Directly decoding imagined speech from electroencephalogram (EEG) signals has attracted much interest in brain-computer interface applications, because it provides a natural and intuitive communication method for locked-in patients. Several methods have been applied to imagined speech decoding, but how to construct spatial-temporal dependencies and capture long-range contextual cues in EEG signals to better decode imagined speech should be considered. Approach. In this study, we propose a novel model called hybrid-scale spatial-temporal dilated convolution network (HS-STDCN) for EEG-based imagined speech recognition. HS-STDCN integrates feature learning from temporal and spatial information into a unified end-to-end model. To characterize the temporal dependencies of the EEG sequences, we adopted a hybrid-scale temporal convolution layer to capture temporal information at multiple levels. A depthwise spatial convolution layer was then designed to construct intrinsic spatial relationships of EEG electrodes, which can produce a spatial-temporal representation of the input EEG data. Based on the spatial-temporal representation, dilated convolution layers were further employed to learn long-range discriminative features for the final classification. Main results. To evaluate the proposed method, we compared the HS-STDCN with other existing methods on our collected dataset. The HS-STDCN achieved an averaged classification accuracy of 54.31% for decoding eight imagined words, which is significantly better than other methods at a significance level of 0.05. Significance. The proposed HS-STDCN model provided an effective approach to make use of both the temporal and spatial dependencies of the input EEG signals for imagined speech recognition. We also visualized the word semantic differences to analyze the impact of word semantics on imagined speech recognition, investigated the important regions in the decoding process, and explored the use of fewer electrodes to achieve comparable performance.
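A rough PyTorch sketch of the three ingredients named in this abstract (parallel multi-scale temporal convolutions, a depthwise spatial convolution across electrodes, and dilated temporal convolutions) follows. Sizes are illustrative assumptions; this is not the published HS-STDCN implementation.

```python
import torch
import torch.nn as nn

class HybridScaleSketch(nn.Module):
    def __init__(self, n_electrodes=64, n_classes=8, width=8):
        super().__init__()
        # Hybrid-scale temporal convs: several kernel lengths in parallel.
        self.temporal = nn.ModuleList(
            [nn.Conv2d(1, width, (1, k), padding=(0, k // 2)) for k in (15, 31, 63)]
        )
        # Depthwise spatial conv: one filter spanning all electrodes per feature map.
        self.spatial = nn.Conv2d(3 * width, 3 * width, (n_electrodes, 1), groups=3 * width)
        # Dilated temporal convs for long-range context.
        self.dilated = nn.Sequential(
            nn.Conv1d(3 * width, 3 * width, 3, dilation=2, padding=2), nn.ELU(),
            nn.Conv1d(3 * width, 3 * width, 3, dilation=4, padding=4), nn.ELU(),
        )
        self.head = nn.Linear(3 * width, n_classes)

    def forward(self, x):                  # x: (batch, electrodes, time)
        x = x.unsqueeze(1)                 # -> (batch, 1, electrodes, time)
        t = torch.cat([conv(x) for conv in self.temporal], dim=1)
        s = self.spatial(t).squeeze(2)     # -> (batch, feature maps, time)
        return self.head(self.dilated(s).mean(dim=-1))

model = HybridScaleSketch()
print(model(torch.randn(4, 64, 256)).shape)   # torch.Size([4, 8])
```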
Collapse
Affiliation(s)
- Fu Li
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
| | - Weibing Chao
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
| | - Yang Li
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
| | - Boxun Fu
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
| | - Youshuo Ji
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
| | - Hao Wu
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
| | - Guangming Shi
- Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China
| |
Collapse
|
30
|
Moses DA, Metzger SL, Liu JR, Anumanchipalli GK, Makin JG, Sun PF, Chartier J, Dougherty ME, Liu PM, Abrams GM, Tu-Chan A, Ganguly K, Chang EF. Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria. N Engl J Med 2021; 385:217-227. [PMID: 34260835 PMCID: PMC8972947 DOI: 10.1056/nejmoa2027540] [Citation(s) in RCA: 134] [Impact Index Per Article: 44.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
BACKGROUND Technology to restore the ability to communicate in paralyzed persons who cannot speak has the potential to improve autonomy and quality of life. An approach that decodes words and sentences directly from the cerebral cortical activity of such patients may represent an advancement over existing methods for assisted communication. METHODS We implanted a subdural, high-density, multielectrode array over the area of the sensorimotor cortex that controls speech in a person with anarthria (the loss of the ability to articulate speech) and spastic quadriparesis caused by a brain-stem stroke. Over the course of 48 sessions, we recorded 22 hours of cortical activity while the participant attempted to say individual words from a vocabulary set of 50 words. We used deep-learning algorithms to create computational models for the detection and classification of words from patterns in the recorded cortical activity. We applied these computational models, as well as a natural-language model that yielded next-word probabilities given the preceding words in a sequence, to decode full sentences as the participant attempted to say them. RESULTS We decoded sentences from the participant's cortical activity in real time at a median rate of 15.2 words per minute, with a median word error rate of 25.6%. In post hoc analyses, we detected 98% of the attempts by the participant to produce individual words, and we classified words with 47.1% accuracy using cortical signals that were stable throughout the 81-week study period. CONCLUSIONS In a person with anarthria and spastic quadriparesis caused by a brain-stem stroke, words and sentences were decoded directly from cortical activity during attempted speech with the use of deep-learning models and a natural-language model. (Funded by Facebook and others; ClinicalTrials.gov number, NCT03698149.).
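The decoding principle in this abstract, combining per-attempt word-classification probabilities with next-word probabilities from a language model, can be illustrated with a toy example. The four-word vocabulary, probability values, and greedy search below are invented for illustration; the study used a 50-word vocabulary, deep-learning classifiers, and a more sophisticated language model.

```python
import numpy as np

VOCAB = ["i", "am", "thirsty", "hello"]
# classifier_probs[t, w]: P(word w | neural activity at attempt t) -- fake values.
classifier_probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.55, 0.15, 0.10],
    [0.15, 0.15, 0.60, 0.10],
])
# bigram[w_prev][w]: P(w | w_prev) from a language model -- fake values.
bigram = {
    "<s>":     {"i": 0.5,  "am": 0.1,  "thirsty": 0.1, "hello": 0.3},
    "i":       {"i": 0.05, "am": 0.7,  "thirsty": 0.2, "hello": 0.05},
    "am":      {"i": 0.05, "am": 0.05, "thirsty": 0.8, "hello": 0.1},
    "thirsty": {"i": 0.4,  "am": 0.2,  "thirsty": 0.2, "hello": 0.2},
    "hello":   {"i": 0.6,  "am": 0.1,  "thirsty": 0.1, "hello": 0.2},
}

def decode(probs, lm_weight=1.0):
    sentence, prev = [], "<s>"
    for t in range(probs.shape[0]):
        # Combine classifier evidence and language-model prior in log space.
        scores = [np.log(probs[t, i]) + lm_weight * np.log(bigram[prev][w])
                  for i, w in enumerate(VOCAB)]
        prev = VOCAB[int(np.argmax(scores))]   # greedy; real systems use beam search
        sentence.append(prev)
    return " ".join(sentence)

print(decode(classifier_probs))   # "i am thirsty"
```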
Collapse
Affiliation(s)
- David A Moses
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Sean L Metzger
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Jessie R Liu
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Gopala K Anumanchipalli
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Joseph G Makin
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Pengfei F Sun
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Josh Chartier
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Maximilian E Dougherty
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Patricia M Liu
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Gary M Abrams
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Adelyn Tu-Chan
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Karunesh Ganguly
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| | - Edward F Chang
- From the Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), the Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and the Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco, and the Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
| |
Collapse
|
31
|
Delfino E, Pastore A, Zucchini E, Cruz MFP, Ius T, Vomero M, D'Ausilio A, Casile A, Skrap M, Stieglitz T, Fadiga L. Prediction of Speech Onset by Micro-Electrocorticography of the Human Brain. Int J Neural Syst 2021; 31:2150025. [PMID: 34130614 DOI: 10.1142/s0129065721500258] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Recent technological advances show the feasibility of offline decoding speech from neuronal signals, paving the way to the development of chronically implanted speech brain-computer interfaces (sBCI). Two key steps that still need to be addressed for the online deployment of sBCI are, on the one hand, the definition of relevant design parameters of the recording arrays and, on the other hand, the identification of robust physiological markers of the patient's intention to speak, which can be used to online trigger the decoding process. To address these issues, we acutely recorded speech-related signals from the frontal cortex of two human patients undergoing awake neurosurgery for brain tumors using three different micro-electrocorticographic (µECoG) devices. First, we observed that, at the smallest investigated pitch (600 µm), neighboring channels are highly correlated, suggesting that more closely spaced electrodes would provide some redundant information. Second, we trained a classifier to recognize speech-related motor preparation from high-gamma oscillations (70-150 Hz), demonstrating that these neuronal signals can be used to reliably predict speech onset. Notably, our model generalized both across subjects and recording devices, showing the robustness of its performance. These findings provide crucial information for the design of future online sBCI.
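The redundancy analysis described above, relating pairwise channel correlation to inter-electrode distance, can be sketched on simulated data. The grid geometry, source model, and use of the 600 µm pitch value below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_ch, n_samp, pitch_um = 16, 5000, 600                 # 4x4 grid, assumed 600 um pitch
coords = np.array([(i % 4, i // 4) for i in range(n_ch)], dtype=float) * pitch_um

# Two simulated sources with Gaussian spatial footprints, so nearby electrodes
# record similar mixtures and correlation falls off with distance.
sources = rng.standard_normal((2, n_samp))
centers = np.array([[0.0, 0.0], [3.0, 3.0]]) * pitch_um
gains = np.exp(-np.linalg.norm(coords[:, None, :] - centers[None, :, :], axis=-1)
               / (1.5 * pitch_um))                     # (n_ch, 2) spatial gains
signals = gains @ sources + 0.5 * rng.standard_normal((n_ch, n_samp))

corr = np.corrcoef(signals)
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
iu = np.triu_indices(n_ch, k=1)                        # unique channel pairs
for d in np.unique(np.round(dist[iu])):
    mask = np.round(dist[iu]) == d
    print(f"{d:7.0f} um pairs: mean |r| = {np.abs(corr[iu][mask]).mean():.3f}")
```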
Collapse
Affiliation(s)
- Emanuela Delfino
- Center for Translational Neurophysiology, Istituto Italiano di Tecnologia, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
- Section of Physiology, University of Ferrara, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
| | - Aldo Pastore
- Center for Translational Neurophysiology, Istituto Italiano di Tecnologia, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
- Section of Physiology, University of Ferrara, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
| | - Elena Zucchini
- Center for Translational Neurophysiology, Istituto Italiano di Tecnologia, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
- Section of Physiology, University of Ferrara, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
| | - Maria Francisca Porto Cruz
- Center for Translational Neurophysiology, Istituto Italiano di Tecnologia, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
- Section of Physiology, University of Ferrara, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
- Laboratory for Biomedical Microtechnology, Department of Microsystems Engineering (IMTEK), University of Freiburg, Georges-Köhler-Allee 102, Freiburg im Breisgau 79110, Germany
| | - Tamara Ius
- Struttura Complessa di Neurochirurgia, Azienda Ospedaliero-Universitaria Santa Maria della Misericordia, Piazzale Santa Maria della Misericordia 15, Udine 33100, Italy
| | - Maria Vomero
- Bioelectronic Systems Laboratory, Columbia University, 500 West 120th Street, New York, NY 10027, USA
| | - Alessandro D'Ausilio
- Center for Translational Neurophysiology, Istituto Italiano di Tecnologia, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
- Section of Physiology, University of Ferrara, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
| | - Antonino Casile
- Center for Translational Neurophysiology, Istituto Italiano di Tecnologia, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
| | - Miran Skrap
- Struttura Complessa di Neurochirurgia, Azienda Ospedaliero-Universitaria Santa Maria della Misericordia, Piazzale Santa Maria della Misericordia 15, Udine 33100, Italy
| | - Thomas Stieglitz
- Laboratory for Biomedical Microtechnology, Department of Microsystems Engineering (IMTEK), University of Freiburg, Georges-Köhler-Allee 102, Freiburg im Breisgau 79110, Germany
- BrainLinks-BrainTools Center, University of Freiburg, Georges-Köhler-Allee 80, Freiburg im Breisgau 79110, Germany
| | - Luciano Fadiga
- Center for Translational Neurophysiology, Istituto Italiano di Tecnologia, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
- Section of Physiology, University of Ferrara, Via Fossato di Mortara 17-19, Ferrara 44121, Italy
| |
Collapse
|
32
|
Geraci A, D'Amico A, Pipitone A, Seidita V, Chella A. Automation Inner Speech as an Anthropomorphic Feature Affecting Human Trust: Current Issues and Future Directions. Front Robot AI 2021; 8:620026. [PMID: 33969001 PMCID: PMC8102901 DOI: 10.3389/frobt.2021.620026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2020] [Accepted: 02/26/2021] [Indexed: 11/18/2022] Open
Abstract
This paper aims to discuss the possible role of inner speech in influencing trust in human–automation interaction. Inner speech is an everyday covert inner monolog or dialog with oneself, which is essential for human psychological life and functioning as it is linked to self-regulation and self-awareness. Recently, in the field of machine consciousness, computational models using different forms of robot speech have been developed that make it possible to implement inner speech in robots. As is discussed, robot inner speech could be a new feature affecting human trust by increasing robot transparency and anthropomorphism.
Collapse
Affiliation(s)
- Alessandro Geraci
- Robotics Lab, Department of Engineering, University of Palermo, Palermo, Italy
- Department of Psychology, Educational Science and Human Movement, University of Palermo, Palermo, Italy
| | - Antonella D'Amico
- Department of Psychology, Educational Science and Human Movement, University of Palermo, Palermo, Italy
| | - Arianna Pipitone
- Robotics Lab, Department of Engineering, University of Palermo, Palermo, Italy
| | - Valeria Seidita
- Robotics Lab, Department of Engineering, University of Palermo, Palermo, Italy
| | - Antonio Chella
- Robotics Lab, Department of Engineering, University of Palermo, Palermo, Italy
| |
Collapse
|
33
|
Panachakel JT, Ramakrishnan AG. Decoding Covert Speech From EEG-A Comprehensive Review. Front Neurosci 2021; 15:642251. [PMID: 33994922 PMCID: PMC8116487 DOI: 10.3389/fnins.2021.642251] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2020] [Accepted: 03/18/2021] [Indexed: 11/13/2022] Open
Abstract
Over the past decade, many researchers have come up with different implementations of systems for decoding covert or imagined speech from EEG (electroencephalogram). They differ from each other in several aspects, from data acquisition to machine learning algorithms, due to which, a comparison between different implementations is often difficult. This review article puts together all the relevant works published in the last decade on decoding imagined speech from EEG into a single framework. Every important aspect of designing such a system, such as selection of words to be imagined, number of electrodes to be recorded, temporal and spatial filtering, feature extraction and classifier are reviewed. This helps a researcher to compare the relative merits and demerits of the different approaches and choose the one that is most optimal. Speech being the most natural form of communication which human beings acquire even without formal education, imagined speech is an ideal choice of prompt for evoking brain activity patterns for a BCI (brain-computer interface) system, although the research on developing real-time (online) speech imagery based BCI systems is still in its infancy. Covert speech based BCI can help people with disabilities to improve their quality of life. It can also be used for covert communication in environments that do not support vocal communication. This paper also discusses some future directions, which will aid the deployment of speech imagery based BCI for practical applications, rather than only for laboratory experiments.
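The design stages this review enumerates (filtering, feature extraction, and classification) can be wired together generically with scikit-learn, as in the sketch below. The components and parameters are illustrative stand-ins, not a recommendation from the review.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 64 * 20))     # e.g. 64 channels x 20 band-power features
y = np.repeat(np.arange(5), 20)             # 5 imagined words (fake, balanced labels)

pipeline = make_pipeline(
    StandardScaler(),                        # per-feature normalization
    PCA(n_components=30),                    # crude stand-in for spatial filtering
    SVC(kernel="rbf", C=1.0),                # classifier stage
)
scores = cross_val_score(pipeline, X, y, cv=5)
print(scores.mean())                         # ~chance (0.2) on random data
```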
Collapse
Affiliation(s)
- Jerrin Thomas Panachakel
- Medical Intelligence and Language Engineering Laboratory, Department of Electrical Engineering, Indian Institute of Science, Bangalore, India
| | | |
Collapse
|
34
|
Brain connectomics applied to oncological neuroscience: from a traditional surgical strategy focusing on glioma topography to a meta-network approach. Acta Neurochir (Wien) 2021; 163:905-917. [PMID: 33564906 DOI: 10.1007/s00701-021-04752-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2020] [Accepted: 02/01/2021] [Indexed: 02/07/2023]
Abstract
The classical way for surgical selection and planning in cerebral glioma mainly focused on tumor topography. The emerging science of connectomics, which aims of mapping brain connectivity, resulted in a paradigmatic shift from a modular account of cerebral organization to a meta-network perspective. Adaptive behavior is actually mediated by constant changes in interactions within and across large-scale delocalized neural systems underlying conation, cognition, and emotion. Here, to optimize the onco-functional balance of glioma surgery, the purpose is to switch toward a connectome-based resection taking account of both relationships between the tumor and critical distributed circuits (especially subcortical pathways) as well as the perpetual instability of the meta-network. Such dynamic in the neural spatiotemporal integration permits functional reallocation leading to neurological recovery after massive resection in structures traditionally thought as "inoperable." This better understanding of connectome increases benefit/risk ratio of surgery (i) by selecting resection in areas deemed "eloquent" according to a localizationist dogma; (ii), conversely, by refining intraoperative awake cognitive mapping and monitoring in so-called non-eloquent areas; (iii) by improving preoperative information, enabling an optimal selection of intrasurgical tasks tailored to the patient's wishes; (iv) by developing an "oncological disconnection surgery"; (v) by defining a personalized multistep surgical strategy adapted to individual brain reshaping potential; and (vi) ultimately by preserving environmentally and socially appropriate behavior, including return to work, while increasing the extent of (possibly repeated) resection(s). Such a holistic vision of neural processing can enhance reliability of connectomal surgery in oncological neuroscience and may also be applied to restorative neurosurgery.
Collapse
|
35
|
Wilson GH, Stavisky SD, Willett FR, Avansino DT, Kelemen JN, Hochberg LR, Henderson JM, Druckmann S, Shenoy KV. Decoding spoken English from intracortical electrode arrays in dorsal precentral gyrus. J Neural Eng 2020; 17:066007. [PMID: 33236720 PMCID: PMC8293867 DOI: 10.1088/1741-2552/abbfef] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
OBJECTIVE To evaluate the potential of intracortical electrode array signals for brain-computer interfaces (BCIs) to restore lost speech, we measured the performance of decoders trained to discriminate a comprehensive basis set of 39 English phonemes and to synthesize speech sounds via a neural pattern matching method. We decoded neural correlates of spoken-out-loud words in the 'hand knob' area of precentral gyrus, a step toward the eventual goal of decoding attempted speech from ventral speech areas in patients who are unable to speak. APPROACH Neural and audio data were recorded while two BrainGate2 pilot clinical trial participants, each with two chronically-implanted 96-electrode arrays, spoke 420 different words that broadly sampled English phonemes. Phoneme onsets were identified from audio recordings, and their identities were then classified from neural features consisting of each electrode's binned action potential counts or high-frequency local field potential power. Speech synthesis was performed using the 'Brain-to-Speech' pattern matching method. We also examined two potential confounds specific to decoding overt speech: acoustic contamination of neural signals and systematic differences in labeling different phonemes' onset times. MAIN RESULTS A linear decoder achieved up to 29.3% classification accuracy (chance = 6%) across 39 phonemes, while an RNN classifier achieved 33.9% accuracy. Parameter sweeps indicated that performance did not saturate when adding more electrodes or more training data, and that accuracy improved when utilizing time-varying structure in the data. Microphonic contamination and phoneme onset differences modestly increased decoding accuracy, but could be mitigated by acoustic artifact subtraction and using a neural speech onset marker, respectively. Speech synthesis achieved r = 0.523 correlation between true and reconstructed audio. SIGNIFICANCE The ability to decode speech using intracortical electrode array signals from a nontraditional speech area suggests that placing electrode arrays in ventral speech areas is a promising direction for speech BCIs.
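The core classification step described here, building feature vectors of binned activity around each phoneme onset and fitting a linear decoder, can be sketched as follows. The bin counts, array sizes, and simulated data are assumptions; the study additionally used high-frequency LFP features and an RNN classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_events, n_electrodes, n_bins, n_phonemes = 400, 192, 10, 39

# Spike counts in 10 time bins per electrode around each phoneme onset (fake).
counts = rng.poisson(3.0, size=(n_events, n_electrodes, n_bins))
labels = np.arange(n_events) % n_phonemes      # balanced fake phoneme labels

X = counts.reshape(n_events, -1)               # flatten electrodes x bins
clf = LogisticRegression(max_iter=2000)
print(cross_val_score(clf, X, labels, cv=4).mean())  # ~chance (1/39) on noise
```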
Collapse
Affiliation(s)
- Guy H Wilson
- Neurosciences Graduate Program, Stanford University, Stanford, CA, United States of America
| | - Sergey D Stavisky
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
- Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America
| | - Francis R Willett
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
- Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America
- Howard Hughes Medical Institute at Stanford University, Stanford, CA, United States of America
| | - Donald T Avansino
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
| | - Jessica N Kelemen
- Department of Neurology, Harvard Medical School, Boston, MA, United States of America
| | - Leigh R Hochberg
- Department of Neurology, Harvard Medical School, Boston, MA, United States of America
- Center for Neurotechnology and Neurorecovery, Dept. of Neurology, Massachusetts General Hospital, Boston, MA, United States of America
- VA RR&D Center for Neurorestoration and Neurotechnology, Rehabilitation R&D Service, Providence VA Medical Center, Providence, RI, United States of America
- Carney Institute for Brain Science and School of Engineering, Brown University, Providence, RI, United States of America
| | - Jaimie M Henderson
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
| | - Shaul Druckmann
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
- Department of Neurobiology, Stanford University, Stanford, CA, United States of America
| | - Krishna V Shenoy
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
- Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America
- Howard Hughes Medical Institute at Stanford University, Stanford, CA, United States of America
- Department of Neurobiology, Stanford University, Stanford, CA, United States of America
- Department of Bioengineering, Stanford University, Stanford, CA, United States of America
| |
Collapse
|
36
|
Dash D, Wisler A, Ferrari P, Davenport EM, Maldjian J, Wang J. MEG Sensor Selection for Neural Speech Decoding. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2020; 8:182320-182337. [PMID: 33204579 PMCID: PMC7668411 DOI: 10.1109/access.2020.3028831] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Direct decoding of speech from the brain is a faster alternative to current electroencephalography (EEG) speller-based brain-computer interfaces (BCI) in providing communication assistance to locked-in patients. Magnetoencephalography (MEG) has recently shown great potential as a non-invasive neuroimaging modality for neural speech decoding, owing in part to its spatial selectivity over other high-temporal resolution devices. Standard MEG systems have a large number of cryogenically cooled channels/sensors (200-300) encapsulated within a fixed liquid helium dewar, precluding their use as wearable BCI devices. Fortunately, recently developed optically pumped magnetometers (OPM) do not require cryogens, and have the potential to be wearable and movable making them more suitable for BCI applications. This design is also modular allowing for customized montages to include only the sensors necessary for a particular task. As the number of sensors bears a heavy influence on the cost, size, and weight of MEG systems, minimizing the number of sensors is critical for designing practical MEG-based BCIs in the future. In this study, we sought to identify an optimal set of MEG channels to decode imagined and spoken phrases from the MEG signals. Using a forward selection algorithm with a support vector machine classifier we found that nine optimally located MEG gradiometers provided higher decoding accuracy compared to using all channels. Additionally, the forward selection algorithm achieved similar performance to dimensionality reduction using a stacked-sparse-autoencoder. Analysis of spatial dynamics of speech decoding suggested that both left and right hemisphere sensors contribute to speech decoding. Sensors approximately located near Broca's area were found to be commonly contributing among the higher-ranked sensors across all subjects.
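The greedy forward selection procedure named in this abstract, repeatedly adding the sensor whose inclusion most improves cross-validated SVM accuracy, can be sketched as below. The data, channel count, and stopping rule are simulated assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n_trials, n_channels = 80, 20
X = rng.standard_normal((n_trials, n_channels))
y = rng.integers(0, 2, size=n_trials)
X[:, 3] += y * 2.0                            # make channel 3 informative (fake)

selected, remaining = [], list(range(n_channels))
for _ in range(5):                            # pick up to 5 channels
    best_ch, best_acc = None, -np.inf
    for ch in remaining:
        # Score the candidate set with cross-validated SVM accuracy.
        acc = cross_val_score(SVC(), X[:, selected + [ch]], y, cv=4).mean()
        if acc > best_acc:
            best_ch, best_acc = ch, acc
    selected.append(best_ch)
    remaining.remove(best_ch)
    print(f"added channel {best_ch}, CV accuracy {best_acc:.2f}")
```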
Collapse
Affiliation(s)
- Debadatta Dash
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
| | - Alan Wisler
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA
| | - Paul Ferrari
- MEG Laboratory, Dell Children's Medical Center, Austin, TX 78723, USA
- Department of Psychology, The University of Texas at Austin, Austin, TX 78712, USA
| | | | - Joseph Maldjian
- Department of Radiology, University of Texas at Southwestern, Dallas, TX 75390, USA
| | - Jun Wang
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA
| |
Collapse
|
37
|
Langland-Hassan P. Inner speech. WILEY INTERDISCIPLINARY REVIEWS. COGNITIVE SCIENCE 2020; 12:e1544. [PMID: 32949083 DOI: 10.1002/wcs.1544] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/20/2020] [Revised: 06/25/2020] [Accepted: 08/13/2020] [Indexed: 11/07/2022]
Abstract
Inner speech travels under many aliases: the inner voice, verbal thought, thinking in words, internal verbalization, "talking in your head," the "little voice in the head," and so on. It is both a familiar element of first-person experience and a psychological phenomenon whose complex cognitive components and distributed neural bases are increasingly well understood. There is evidence that inner speech plays a variety of cognitive roles, from enabling abstract thought to supporting metacognition, memory, and executive function. One active area of controversy concerns the relation of inner speech to auditory verbal hallucinations (AVHs) in schizophrenia, with a common proposal being that sufferers of AVHs misidentify their own inner speech as being generated by someone else. Recently, researchers have used artificial intelligence to translate the neural and neuromuscular signatures of inner speech into corresponding outer speech signals, laying the groundwork for a variety of new applications and interventions. This article is categorized under: Philosophy > Foundations of Cognitive Science; Linguistics > Language in Mind and Brain; Philosophy > Consciousness; Philosophy > Psychological Capacities.
Collapse
|
38
|
Gearing M, Kennedy P. Histological Confirmation of Myelinated Neural Filaments Within the Tip of the Neurotrophic Electrode After a Decade of Neural Recordings. Front Hum Neurosci 2020; 14:111. [PMID: 32372930 PMCID: PMC7187752 DOI: 10.3389/fnhum.2020.00111] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2019] [Accepted: 03/11/2020] [Indexed: 11/13/2022] Open
Abstract
Aim: Electrodes that provide brain-to-machine or brain-to-computer interfacing must survive the lifetime of the person to be considered an acceptable prosthetic. The electrodes may be external, as with electroencephalography (EEG); internal but extracortical, as with electrocorticography (ECoG); or intracortical. Methods: Most intracortical electrodes are placed close to the neuropil being recorded and do not survive years of recording. The Neurotrophic Electrode, however, is placed within the cortex, and the neuropil grows into and through its hollow tip, becoming trapped inside. Highly flexible coiled lead wires minimize the strain on the electrode tip. Histological analysis included immunohistochemical detection of neurofilaments and confirmation of the absence of gliosis. Results: This configuration enabled a decade of recording in a locked-in participant. At year nine, conditioning experiments indicated that the recorded neural activity was functional signal and not noise. This paper presents histological analysis of the tissue inside the electrode tip after 13 years of implantation. Conclusion: This paper is a singular example of histological analysis after a decade of recording. The analysis laid out herein is strong evidence that the brain can grow neurites into the electrode tip and support recording for a decade. This is profoundly important for brain-to-machine and brain-to-computer interfacing, implying that long-term electrodes should incorporate some means of growing the neuropil into the electrode rather than placing the electrode into the neuropil.
Collapse
Affiliation(s)
- Marla Gearing
- Laboratory Medicine and Neurology, Department of Pathology, Emory University School of Medicine, Atlanta, GA, United States
| | | |
Collapse
|
39
|
Iturrate I, Chavarriaga R, Millán JDR. General principles of machine learning for brain-computer interfacing. HANDBOOK OF CLINICAL NEUROLOGY 2020; 168:311-328. [PMID: 32164862 DOI: 10.1016/b978-0-444-63934-9.00023-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
Brain-computer interfaces (BCIs) are systems that translate brain-activity patterns into commands that can be executed by an artificial device. This makes it possible to control devices such as a prosthetic arm or exoskeleton, a wheelchair, typing applications, or games directly by modulating brain activity. For this purpose, BCI systems rely on signal processing and machine learning algorithms to decode the brain activity. This chapter provides an overview of the main steps required for this process, including signal preprocessing, feature extraction and selection, and decoding. Given the large number of possible methods for these steps, a comprehensive review is beyond the scope of this chapter; it focuses instead on the general principles that should be taken into account, as well as good practices for applying and evaluating these methods in the design of reliable BCI systems.
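As an illustration of the preprocessing → feature extraction/selection → decoding sequence the chapter outlines, here is a minimal sketch of one common instantiation for EEG; the sampling rate, frequency band, and estimator choices are assumptions for illustration, not prescriptions from the chapter.
```python
# Sketch of a generic EEG decoding pipeline: band-pass filtering,
# log band-power features, univariate feature selection, linear decoder.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 250  # assumed sampling rate (Hz)

def bandpower_features(epochs, fs=FS, band=(8.0, 30.0)):
    """epochs: (trials, channels, samples) -> (trials, channels) log band power."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, epochs, axis=-1)  # zero-phase band-pass
    return np.log(np.mean(filtered ** 2, axis=-1))

def make_decoder(k_features=10):
    # Feature selection followed by a linear discriminant decoder,
    # a common default for small EEG datasets.
    return Pipeline([
        ("select", SelectKBest(f_classif, k=k_features)),
        ("clf", LinearDiscriminantAnalysis()),
    ])

# Usage: decoder = make_decoder(); decoder.fit(bandpower_features(X_train), y_train)
```
The pipeline structure mirrors the chapter's point that each stage (filtering, features, selection, classifier) is a swappable component that should be chosen and evaluated jointly.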
Collapse
Affiliation(s)
- Iñaki Iturrate
- Center for Neuroprosthetics, École Polytechnique Fédérale de Lausanne, Geneva, Switzerland
| | - Ricardo Chavarriaga
- Center for Neuroprosthetics, École Polytechnique Fédérale de Lausanne, Geneva, Switzerland; Institute of Applied Information Technology (InIT), Zurich University of Applied Sciences ZHAW, Winterthur, Switzerland.
| | - José Del R Millán
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States; Department of Neurology, The University of Texas at Austin, Austin, TX, United States
| |
Collapse
|
40
|
Taitz A, Assaneo MF, Shalom DE, Trevisan MA. Motor representations underlie the reading of unfamiliar letter combinations. Sci Rep 2020; 10:3828. [PMID: 32123186 PMCID: PMC7052247 DOI: 10.1038/s41598-020-59199-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2019] [Accepted: 12/13/2019] [Indexed: 12/03/2022] Open
Abstract
Silent reading is a cognitive operation that produces verbal content with no vocal output. One relevant question is the extent to which this verbal content is processed as overt speech in the brain. To address this, we acquired sound, eye trajectories, and lip dynamics during the reading of consonant-consonant-vowel (CCV) combinations that are infrequent in the language. We found that the duration of the first fixations on the CCVs during silent reading correlates with the duration of the transitions between consonants when the CCVs are actually uttered. With the aid of an articulatory model of the vocal system, we show that these transition durations measure the articulatory effort required to produce the CCVs. This means that first fixations during silent reading are lengthened when the CCVs require greater laryngeal and/or articulatory effort to pronounce. Our results support the view that a speech motor code is used for the recognition of infrequent text strings during silent reading.
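For concreteness, the core analysis reduces to a per-CCV correlation between first-fixation durations in silent reading and consonant-transition durations in overt production; the sketch below shows the shape of that computation, with placeholder numbers that are not the study's data.
```python
# Illustrative correlation of fixation durations with transition durations.
import numpy as np
from scipy.stats import pearsonr

# One value per CCV string, in seconds; placeholder data only.
first_fixation_s = np.array([0.21, 0.25, 0.19, 0.30, 0.27, 0.23])  # silent reading
cc_transition_s = np.array([0.06, 0.09, 0.05, 0.12, 0.10, 0.07])   # overt speech

r, p = pearsonr(first_fixation_s, cc_transition_s)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```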
Collapse
Affiliation(s)
- Alan Taitz
- Physics Institute of Buenos Aires (IFIBA) CONICET, Buenos Aires, Argentina.
| | - M Florencia Assaneo
- Department of Psychology, New York University, New York, NY, 10003, USA
- Instituto de Neurobiología, UNAM, Campus Juriquilla, Querétaro, México
| | - Diego E Shalom
- Physics Institute of Buenos Aires (IFIBA) CONICET, Buenos Aires, Argentina
- Department of Physics, University of Buenos Aires (UBA), Buenos Aires, 1428EGA, Argentina
| | - Marcos A Trevisan
- Physics Institute of Buenos Aires (IFIBA) CONICET, Buenos Aires, Argentina
- Department of Physics, University of Buenos Aires (UBA), Buenos Aires, 1428EGA, Argentina
| |
Collapse
|
41
|
Tottrup L, Leerskov K, Hadsund JT, Kamavuako EN, Kaseler RL, Jochumsen M. Decoding covert speech for intuitive control of brain-computer interfaces based on single-trial EEG: a feasibility study. IEEE Int Conf Rehabil Robot 2020; 2019:689-693. [PMID: 31374711 DOI: 10.1109/icorr.2019.8779499] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
For individuals with severe motor deficiencies, controlling external devices such as robotic arms or wheelchairs can be challenging, as many devices require some degree of motor control to operate, e.g. via a joystick. A brain-computer interface (BCI) relies only on signals from the brain and may be used as a controller in place of muscles. Motor imagery (MI) has been used as a control signal for BCIs in many studies. However, MI may not be suitable for all control purposes, and several people cannot obtain BCI control with MI. In this study, the aim was to investigate the feasibility of decoding covert speech from single-trial EEG and to compare and combine it with MI. In seven healthy subjects, EEG was recorded from twenty-five channels during six different actions: speaking three words (both covert and overt speech), two arm movements (both motor imagery and execution), and an idle class. Temporal and spectral features were derived from the epochs and classified with a random forest classifier. The average classification accuracy was 67 ± 9% for covert speech and 75 ± 7% for overt speech; this was 5-10% lower than for movement classification. The combined movement-speech decoder performed at 61 ± 9% (covert) and 67 ± 7% (overt), but the combination makes more classes available for control. These results outline the possibility of using covert speech to control a BCI, a step towards a multimodal BCI system with improved usability.
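A rough sketch of this feature-and-classifier setup appears below: per-epoch temporal statistics plus Welch band-power estimates feed a random forest. The specific features, frequency bands, and sampling rate are stand-in assumptions rather than the study's exact pipeline.
```python
# Hedged sketch: temporal + spectral epoch features, random forest decoder.
import numpy as np
from scipy.signal import welch
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def epoch_features(epochs, fs=500):
    """epochs: (trials, channels, samples) -> (trials, features).
    Assumes epochs of at least 256 samples."""
    mav = np.mean(np.abs(epochs), axis=-1)   # temporal: mean absolute value
    var = np.var(epochs, axis=-1)            # temporal: variance
    freqs, psd = welch(epochs, fs=fs, nperseg=256, axis=-1)
    bands = [(4, 8), (8, 13), (13, 30), (30, 45)]  # theta/alpha/beta/gamma
    band_power = [psd[..., (freqs >= lo) & (freqs < hi)].mean(axis=-1)
                  for lo, hi in bands]
    return np.concatenate([mav, var] + band_power, axis=1)

# Usage with a hypothetical dataset (X: trials x 25 channels x samples):
# acc = cross_val_score(RandomForestClassifier(n_estimators=200),
#                       epoch_features(X), y, cv=5).mean()
```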
Collapse
|
42
|
Annen J, Laureys S, Gosseries O. Brain-computer interfaces for consciousness assessment and communication in severely brain-injured patients. HANDBOOK OF CLINICAL NEUROLOGY 2020; 168:137-152. [DOI: 10.1016/b978-0-444-63934-9.00011-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
|
44
|
Rabbani Q, Milsap G, Crone NE. The Potential for a Speech Brain-Computer Interface Using Chronic Electrocorticography. Neurotherapeutics 2019; 16:144-165. [PMID: 30617653 PMCID: PMC6361062 DOI: 10.1007/s13311-018-00692-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
A brain-computer interface (BCI) is a technology that uses neural features to restore or augment the capabilities of its user. A BCI for speech would enable communication in real time via neural correlates of attempted or imagined speech. Such a technology could restore communication and improve quality of life for locked-in patients and other patients with severe communication disorders. There have been many recent developments in neural decoders, neural feature extraction, and brain-recording modalities that facilitate BCIs for prosthetic control, as well as in automatic speech recognition (ASR). Indeed, ASR and related fields have developed significantly over the past years and lend many insights into the requirements, goals, and strategies for speech BCIs. Neural speech decoding is a comparatively new field but has shown much promise, with recent studies demonstrating semantic, auditory, and articulatory decoding using electrocorticography (ECoG) and other neural recording modalities. Because the neural representations for speech and language are widely distributed over cortical regions spanning the frontal, parietal, and temporal lobes, the mesoscopic scale of population activity captured by ECoG surface-electrode arrays may have distinct advantages for speech BCIs, in contrast to the advantages of microelectrode arrays for upper-limb BCIs. Nevertheless, many challenges remain for the translation of speech BCIs to clinical populations. This review discusses and outlines the current state of the art for speech BCIs and explores what a speech BCI using chronic ECoG might entail.
Collapse
Affiliation(s)
- Qinwan Rabbani
- Department of Electrical Engineering, The Johns Hopkins University Whiting School of Engineering, Baltimore, MD, USA.
| | - Griffin Milsap
- Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Nathan E Crone
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
| |
Collapse
|