1
Wyse-Sookoo K, Luo S, Candrea D, Schippers A, Tippett DC, Wester B, Fifer M, Vansteensel MJ, Ramsey NF, Crone NE. Stability of ECoG high gamma signals during speech and implications for a speech BCI system in an individual with ALS: a year-long longitudinal study. J Neural Eng 2024; 21. [PMID: 38925110] [PMCID: PMC11245360] [DOI: 10.1088/1741-2552/ad5c02]
Abstract
Objective. Speech brain-computer interfaces (BCIs) have the potential to augment communication in individuals with impaired speech due to muscle weakness, for example in amyotrophic lateral sclerosis (ALS) and other neurological disorders. However, to achieve long-term, reliable use of a speech BCI, it is essential for speech-related neural signal changes to be stable over long periods of time. Here we study, for the first time, the stability of speech-related electrocorticographic (ECoG) signals recorded from a chronically implanted ECoG BCI over a 12-month period. Approach. ECoG signals were recorded by an ECoG array implanted over the ventral sensorimotor cortex in a clinical trial participant with ALS. Because ECoG-based speech decoding has most often relied on broadband high gamma (HG) signal changes relative to baseline (non-speech) conditions, we studied longitudinal changes of HG band power at baseline and during speech, and we compared these with residual high-frequency noise levels at baseline. Stability was further assessed by longitudinal measurements of signal-to-noise ratio, activation ratio, and peak speech-related HG response magnitude (HG response peaks). Lastly, we analyzed the stability of the event-related HG power changes (HG responses) for individual syllables at each electrode. Main results. We found that speech-related ECoG signal responses were stable over a range of syllables activating different articulators for the first year after implantation. Significance. Together, our results indicate that ECoG can be a stable recording modality for long-term speech BCI systems for those living with severe paralysis. Clinical trial information. ClinicalTrials.gov, registration number NCT03567213.
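The baseline-versus-speech high-gamma measures this study tracks (band power and a signal-to-noise ratio in dB) can be sketched on synthetic data. The sampling rate, the 70-110 Hz band edges, and both signals below are illustrative assumptions, not the study's actual recording parameters:

```python
import numpy as np

def band_power(x, fs, lo=70.0, hi=110.0):
    """Average power of x in the [lo, hi] Hz band via an FFT periodogram."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

rng = np.random.default_rng(0)
fs, n = 1000, 4000
t = np.arange(n) / fs
baseline = rng.normal(scale=1.0, size=n)
# "Speech" trial: the same noise floor plus an added 90 Hz high-gamma component.
speech = rng.normal(scale=1.0, size=n) + 1.5 * np.sin(2 * np.pi * 90 * t)

hg_base = band_power(baseline, fs)
hg_speech = band_power(speech, fs)
snr_db = 10 * np.log10(hg_speech / hg_base)
print(f"speech-vs-baseline high-gamma SNR: {snr_db:.1f} dB")
```

Longitudinal stability could then be assessed by repeating this measurement per session and per electrode and testing for drift.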
Affiliation(s)
- Kimberley Wyse-Sookoo: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Shiyu Luo: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Daniel Candrea: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Anouck Schippers: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Donna C Tippett: Departments of Neurology; Otolaryngology-Head and Neck Surgery; and Physical Medicine and Rehabilitation, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Brock Wester: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD, USA
- Matthew Fifer: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD, USA
- Mariska J Vansteensel: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Nick F Ramsey: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Nathan E Crone: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
2
Silva AB, Littlejohn KT, Liu JR, Moses DA, Chang EF. The speech neuroprosthesis. Nat Rev Neurosci 2024; 25:473-492. [PMID: 38745103] [DOI: 10.1038/s41583-024-00819-9]
Abstract
Loss of speech after paralysis is devastating, but circumventing motor-pathway injury by directly decoding speech from intact cortical activity has the potential to restore natural communication and self-expression. Recent discoveries have defined how key features of speech production are facilitated by the coordinated activity of vocal-tract articulatory and motor-planning cortical representations. In this Review, we highlight such progress and how it has led to successful speech decoding, first in individuals implanted with intracranial electrodes for clinical epilepsy monitoring and subsequently in individuals with paralysis as part of early feasibility clinical trials to restore speech. We discuss high-spatiotemporal-resolution neural interfaces and the adaptation of state-of-the-art speech computational algorithms that have driven rapid and substantial progress in decoding neural activity into text, audible speech, and facial movements. Although restoring natural speech is a long-term goal, speech neuroprostheses already have performance levels that surpass communication rates offered by current assistive-communication technology. Given this accelerated rate of progress in the field, we propose key evaluation metrics for speed and accuracy, among others, to help standardize across studies. We finish by highlighting several directions to more fully explore the multidimensional feature space of speech and language, which will continue to accelerate progress towards a clinically viable speech neuroprosthesis.
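As one concrete instance of the standardized accuracy metrics this review calls for, text-decoding studies commonly report word error rate (WER). A minimal implementation using a word-level Levenshtein distance is sketched below; the example sentences are invented:

```python
def word_error_rate(ref: str, hyp: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    r, h = ref.split(), hyp.split()
    # Dynamic-programming edit-distance table over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution/match
    return d[-1][-1] / len(r)

print(word_error_rate("i want some water please", "i want water please"))
```

Pairing WER with a speed measure such as words per minute gives the two axes the review proposes for cross-study comparison.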
Affiliation(s)
- Alexander B Silva: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Kaylo T Littlejohn: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Jessie R Liu: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- David A Moses: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
3
Wu X, Wellington S, Fu Z, Zhang D. Speech decoding from stereo-electroencephalography (sEEG) signals using advanced deep learning methods. J Neural Eng 2024; 21:036055. [PMID: 38885688] [DOI: 10.1088/1741-2552/ad593a]
Abstract
Objective. Brain-computer interfaces (BCIs) are technologies that bypass damaged or disrupted neural pathways and directly decode brain signals to perform intended actions. BCIs for speech have the potential to restore communication by decoding intended speech directly. Many studies have demonstrated promising results using invasive micro-electrode arrays and electrocorticography, but the use of stereo-electroencephalography (sEEG) for speech decoding has received comparatively little attention. Approach. In this research, recently released sEEG data were used to decode Dutch words spoken by participants with epilepsy. We decoded speech waveforms from sEEG data using advanced deep-learning methods. Three methods were implemented: linear regression, a recurrent neural network (RNN)-based sequence-to-sequence model, and a transformer model. Main results. Our RNN and transformer models significantly outperformed linear regression, while no significant difference was found between the two deep-learning methods. Further investigation of individual electrodes showed that the same decoding result can be obtained using only a few of the electrodes. Significance. This study demonstrated that decoding speech from sEEG signals is possible and that the location of the electrodes is critical to decoding performance.
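The linear-regression baseline and the observation that a few electrodes can carry most of the decodable information can be illustrated with a least-squares sketch on simulated data. The electrode count, noise level, number of informative electrodes, and weights below are all arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n_t, n_elec = 2000, 64
X = rng.normal(size=(n_t, n_elec))            # simulated sEEG features (time x electrodes)
w_true = np.zeros(n_elec)
w_true[:5] = [1.0, -1.5, 0.8, 1.2, -0.7]      # only 5 electrodes are informative
y = X @ w_true + 0.5 * rng.normal(size=n_t)   # target speech-audio feature

# Fit the full linear decoder, then refit using only the top-weighted electrodes.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
top = np.argsort(-np.abs(w))[:5]
w_top, *_ = np.linalg.lstsq(X[:, top], y, rcond=None)

r_full = np.corrcoef(X @ w, y)[0, 1]
r_top = np.corrcoef(X[:, top] @ w_top, y)[0, 1]
print(f"r(full) = {r_full:.2f}, r(top-5) = {r_top:.2f}")
```

When the informative signal is concentrated in a few channels, the top-5 decoder matches the full decoder almost exactly, mirroring the paper's per-electrode finding.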
Affiliation(s)
- Xiaolong Wu: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
- Scott Wellington: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
- Zhichun Fu: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
- Dingguo Zhang: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
4
van der Heijden K, Patel P, Bickel S, Herrero JL, Mehta AD, Mesgarani N. Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes. bioRxiv 2024:2024.05.13.593814. [PMID: 38798551] [PMCID: PMC11118436] [DOI: 10.1101/2024.05.13.593814]
Abstract
Listeners readily extract multi-dimensional auditory objects such as a 'localized talker' from complex acoustic scenes with multiple talkers. Yet, the neural mechanisms underlying simultaneous encoding and linking of different sound features, for example a talker's voice and location, are poorly understood. We analyzed invasive intracranial recordings in neurosurgical patients attending to a localized talker in real-life cocktail party scenarios. We found that sensitivity to an individual talker's voice and location features was distributed throughout auditory cortex and that neural sites exhibited a gradient from sensitivity to a single feature to joint sensitivity to both features. On a population level, cortical response patterns of both dual-feature sensitive sites and single-feature sensitive sites revealed simultaneous encoding of an attended talker's voice and location features. However, for single-feature sensitive sites, the representation of the primary feature was more precise. Further, sites which selectively tracked an attended speech stream concurrently encoded the attended talker's voice and location features, indicating that such sites combine selective tracking of an attended auditory object with encoding of the object's features. Finally, we found that attending to a localized talker selectively enhanced temporal coherence between single-feature voice-sensitive sites and single-feature location-sensitive sites, providing an additional mechanism for linking voice and location in multi-talker scenes. These results demonstrate that a talker's voice and location features are linked during multi-dimensional object formation in naturalistic multi-talker scenes by joint population coding as well as by temporal coherence between neural sites.
Significance statement. Listeners effortlessly extract auditory objects from complex acoustic scenes consisting of multiple sound sources in naturalistic, spatial sound scenes. Yet, how the brain links different sound features to form a multi-dimensional auditory object is poorly understood. We investigated how neural responses encode and integrate an attended talker's voice and location features in spatial multi-talker sound scenes to elucidate which neural mechanisms underlie simultaneous encoding and linking of different auditory features. Our results show that joint population coding as well as temporal coherence mechanisms contribute to distributed multi-dimensional auditory object encoding. These findings shed new light on cortical functional specialization and multi-dimensional auditory object formation in complex, naturalistic listening scenes.
Highlights
- Cortical responses to a single talker exhibit a distributed gradient, ranging from sites that are sensitive to both a talker's voice and location (dual-feature sensitive sites) to sites that are sensitive to either voice or location (single-feature sensitive sites).
- Population response patterns of dual-feature sensitive sites encode voice and location features of the attended talker in multi-talker scenes jointly and with equal precision.
- Despite their sensitivity to a single feature at the level of individual cortical sites, population response patterns of single-feature sensitive sites also encode location and voice features of a talker jointly, but with higher precision for the feature they are primarily sensitive to.
- Neural sites which selectively track an attended speech stream concurrently encode the attended talker's voice and location features.
- Attention selectively enhances temporal coherence between voice-selective and location-selective sites over time.
- Joint population coding as well as temporal coherence mechanisms underlie distributed multi-dimensional auditory object encoding in auditory cortex.
5
Guerreiro Fernandes F, Raemaekers M, Freudenburg Z, Ramsey N. Considerations for implanting speech brain computer interfaces based on functional magnetic resonance imaging. J Neural Eng 2024; 21:036005. [PMID: 38648782] [DOI: 10.1088/1741-2552/ad4178]
Abstract
Objective. Brain-computer interfaces (BCIs) have the potential to reinstate lost communication faculties. Results from speech decoding studies indicate that a usable speech BCI based on activity in the sensorimotor cortex (SMC) can be achieved using subdurally implanted electrodes. However, the optimal characteristics for a successful speech implant are largely unknown. We address this topic in a high-field blood-oxygenation-level-dependent functional magnetic resonance imaging (fMRI) study by assessing the decodability of spoken words as a function of hemisphere, gyrus, sulcal depth, and position along the ventral/dorsal axis. Approach. Twelve subjects conducted a 7T fMRI experiment in which they pronounced 6 different pseudo-words over 6 runs. We divided the SMC by hemisphere, gyrus, sulcal depth, and position along the ventral/dorsal axis. Classification was performed in these SMC areas using a multiclass support vector machine (SVM). Main results. Significant classification was possible from the SMC, but no preference for the left or right hemisphere, nor for the precentral or postcentral gyrus, was detected for optimal word classification. Classification using information from the cortical surface was slightly better than using information from deep in the central sulcus, and was highest within the ventral 50% of the SMC. Confusion matrices were highly similar across the entire SMC. An SVM searchlight analysis revealed significant classification in the superior temporal gyrus and left planum temporale in addition to the SMC. Significance. The current results support a unilateral implant using surface electrodes covering the ventral 50% of the SMC. The added value of depth electrodes is unclear. We did not observe evidence for variations in the qualitative nature of information across the SMC. The current results need to be confirmed in paralyzed patients performing attempted speech.
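The multiclass word-classification setup described above can be sketched on synthetic "voxel" patterns. A nearest-centroid classifier stands in for the study's SVM here, and the trial counts, voxel counts, and noise level are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n_words, n_trials, n_voxels = 6, 30, 50
# Hypothetical per-word activation prototypes plus per-trial noise.
prototypes = rng.normal(size=(n_words, n_voxels))
X = np.concatenate([p + 0.8 * rng.normal(size=(n_trials, n_voxels)) for p in prototypes])
y = np.repeat(np.arange(n_words), n_trials)

# Split: even trials train, odd trials test.
idx = np.arange(len(y))
train, test = idx % 2 == 0, idx % 2 == 1
centroids = np.stack([X[train & (y == w)].mean(axis=0) for w in range(n_words)])

# Assign each test trial to the nearest word centroid.
dists = np.linalg.norm(X[test, None, :] - centroids[None, :, :], axis=2)
acc = (dists.argmin(axis=1) == y[test]).mean()
print(f"accuracy: {acc:.2f} (chance = {1 / n_words:.2f})")
```

In the actual study, repeating this region by region (per hemisphere, gyrus, and depth bin) is what allows the spatial comparisons reported above.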
Affiliation(s)
- F Guerreiro Fernandes: Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- M Raemaekers: Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Z Freudenburg: Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- N Ramsey: Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
6
Angrick M, Luo S, Rabbani Q, Candrea DN, Shah S, Milsap GW, Anderson WS, Gordon CR, Rosenblatt KR, Clawson L, Tippett DC, Maragakis N, Tenore FV, Fifer MS, Hermansky H, Ramsey NF, Crone NE. Online speech synthesis using a chronically implanted brain-computer interface in an individual with ALS. Sci Rep 2024; 14:9617. [PMID: 38671062] [PMCID: PMC11053081] [DOI: 10.1038/s41598-024-60277-2]
Abstract
Brain-computer interfaces (BCIs) that reconstruct and synthesize speech using brain activity recorded with intracranial electrodes may pave the way toward novel communication interfaces for people who have lost their ability to speak, or who are at high risk of losing this ability, due to neurological disorders. Here, we report online synthesis of intelligible words using a chronically implanted brain-computer interface (BCI) in a man with impaired articulation due to ALS, participating in a clinical trial (ClinicalTrials.gov, NCT03567213) exploring different strategies for BCI communication. The 3-stage approach reported here relies on recurrent neural networks to identify, decode and synthesize speech from electrocorticographic (ECoG) signals acquired across motor, premotor and somatosensory cortices. We demonstrate a reliable BCI that synthesizes commands freely chosen and spoken by the participant from a vocabulary of 6 keywords previously used for decoding commands to control a communication board. Evaluation of the intelligibility of the synthesized speech indicates that 80% of the words can be correctly recognized by human listeners. Our results show that a speech-impaired individual with ALS can use a chronically implanted BCI to reliably produce synthesized words while preserving the participant's voice profile, and provide further evidence for the stability of ECoG for speech-based BCIs.
Affiliation(s)
- Miguel Angrick: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Shiyu Luo: Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Qinwan Rabbani: Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA
- Daniel N Candrea: Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Samyak Shah: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Griffin W Milsap: Research and Exploratory Development Department, Johns Hopkins Applied Physics Laboratory, Laurel, MD, USA
- William S Anderson: Department of Neurosurgery, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Chad R Gordon: Department of Neurosurgery, The Johns Hopkins University School of Medicine, Baltimore, MD, USA; Section of Neuroplastic and Reconstructive Surgery, Department of Plastic Surgery, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kathryn R Rosenblatt: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Anesthesiology & Critical Care Medicine, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Lora Clawson: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Donna C Tippett: Departments of Neurology; Otolaryngology-Head and Neck Surgery; and Physical Medicine and Rehabilitation, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Nicholas Maragakis: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Francesco V Tenore: Research and Exploratory Development Department, Johns Hopkins Applied Physics Laboratory, Laurel, MD, USA
- Matthew S Fifer: Research and Exploratory Development Department, Johns Hopkins Applied Physics Laboratory, Laurel, MD, USA
- Hynek Hermansky: Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD, USA; Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, MD, USA
- Nick F Ramsey: UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Nathan E Crone: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
7
Anastasopoulou I, Cheyne DO, van Lieshout P, Johnson BW. Decoding kinematic information from beta-band motor rhythms of speech motor cortex: a methodological/analytic approach using concurrent speech movement tracking and magnetoencephalography. Front Hum Neurosci 2024; 18:1305058. [PMID: 38646159] [PMCID: PMC11027130] [DOI: 10.3389/fnhum.2024.1305058]
Abstract
Introduction. Articulography and functional neuroimaging are two major tools for studying the neurobiology of speech production. Until now, however, it has generally not been feasible to use both in the same experimental setup because of technical incompatibilities between the two methodologies. Methods. Here we describe results from a novel articulography system dubbed Magneto-articulography for the Assessment of Speech Kinematics (MASK), which is technically compatible with magnetoencephalography (MEG) brain scanning systems. In the present paper we describe our methodological and analytic approach for extracting brain motor activities related to key kinematic and coordination event parameters derived from time-registered MASK tracking measurements. Data were collected from 10 healthy adults with tracking coils on the tongue, lips, and jaw. Analyses targeted the gestural landmarks of reiterated utterances /ipa/ and /api/, produced at normal and faster rates. Results. The results show that (1) speech sensorimotor cortex can be reliably located in peri-rolandic regions of the left hemisphere; (2) mu (8-12 Hz) and beta band (13-30 Hz) neuromotor oscillations are present in the speech signals and contain information structures that are independent of those present in higher-frequency bands; and (3) hypotheses concerning the information content of speech motor rhythms can be systematically evaluated with multivariate pattern-analytic techniques. Discussion. These results show that MASK makes it possible to derive subject-specific articulatory parameters, based on well-established and robust motor control parameters, in the same experimental setup as the MEG brain recordings and in temporal and spatial co-registration with the brain data. The analytic approach described here provides new capabilities for testing hypotheses concerning the types of kinematic information that are encoded and processed within specific components of the speech neuromotor system.
Affiliation(s)
- Douglas Owen Cheyne: Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada; Hospital for Sick Children Research Institute, Toronto, ON, Canada
- Pascal van Lieshout: Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
8
Chen J, Chen X, Wang R, Le C, Khalilian-Gourtani A, Jensen E, Dugan P, Doyle W, Devinsky O, Friedman D, Flinker A, Wang Y. Subject-Agnostic Transformer-Based Neural Speech Decoding from Surface and Depth Electrode Signals. bioRxiv 2024:2024.03.11.584533. [PMID: 38559163] [PMCID: PMC10980022] [DOI: 10.1101/2024.03.11.584533]
Abstract
Objective. This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior works can only work with electrodes on a 2D grid (i.e., an electrocorticographic or ECoG array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface (ECoG) and depth (stereotactic EEG, or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements, and the trained model should perform well on participants unseen during training. Approach. We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train both subject-specific models using data from a single participant and multi-patient models exploiting data from multiple participants. Main results. The subject-specific models using only low-density 8x8 ECoG data achieved a high decoding Pearson correlation coefficient with the ground-truth spectrogram (PCC = 0.817) over N = 43 participants, outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating the additional strip, depth, and grid electrodes available in each participant (N = 39) led to further improvement (PCC = 0.838). For participants with only sEEG electrodes (N = 9), subject-specific models still achieved comparable performance, with an average PCC = 0.798. The multi-subject models achieved high performance on unseen participants, with an average PCC = 0.765 in leave-one-out cross-validation. Significance. The proposed SwinTW decoder enables future speech neuroprostheses to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. Importantly, the generalizability of the multi-patient models suggests the exciting possibility of developing speech neuroprostheses for people with speech disability without relying on their own neural data for training, which is not always feasible.
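The Pearson correlation coefficient (PCC) used above to score decoded spectrograms against ground truth can be computed by flattening both spectrograms and correlating them. The spectrogram dimensions and noise level below are invented for illustration:

```python
import numpy as np

def spectrogram_pcc(pred, truth):
    """Pearson correlation between flattened predicted and ground-truth spectrograms."""
    p, t = pred.ravel(), truth.ravel()
    p = p - p.mean()
    t = t - t.mean()
    return float(p @ t / (np.linalg.norm(p) * np.linalg.norm(t)))

rng = np.random.default_rng(2)
truth = rng.random((80, 128))                      # 80 frequency bins x 128 time frames
pred = truth + 0.3 * rng.normal(size=truth.shape)  # decoder output = truth + noise
print(f"PCC = {spectrogram_pcc(pred, truth):.3f}")
```

Averaging this per-utterance score over held-out trials gives the participant-level PCC figures the paper reports.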
Affiliation(s)
- Junbo Chen: Electrical and Computer Engineering Department, New York University, 370 Jay Street, Brooklyn, 11201, NY, USA
- Xupeng Chen: Electrical and Computer Engineering Department, New York University, 370 Jay Street, Brooklyn, 11201, NY, USA
- Ran Wang: Electrical and Computer Engineering Department, New York University, 370 Jay Street, Brooklyn, 11201, NY, USA
- Chenqian Le: Electrical and Computer Engineering Department, New York University, 370 Jay Street, Brooklyn, 11201, NY, USA
- Erika Jensen: Neurology Department, New York University, 223 East 34th Street, Manhattan, 10016, NY, USA
- Patricia Dugan: Neurology Department, New York University, 223 East 34th Street, Manhattan, 10016, NY, USA
- Werner Doyle: Neurosurgery Department, New York University, 550 1st Avenue, Manhattan, 10016, NY, USA
- Orrin Devinsky: Neurology Department, New York University, 223 East 34th Street, Manhattan, 10016, NY, USA
- Daniel Friedman: Neurology Department, New York University, 223 East 34th Street, Manhattan, 10016, NY, USA
- Adeen Flinker: Neurology Department, New York University, 223 East 34th Street, Manhattan, 10016, NY, USA; Biomedical Engineering Department, New York University, 370 Jay Street, Brooklyn, 11201, NY, USA
- Yao Wang: Electrical and Computer Engineering Department, New York University, 370 Jay Street, Brooklyn, 11201, NY, USA; Biomedical Engineering Department, New York University, 370 Jay Street, Brooklyn, 11201, NY, USA
9
Vitória MA, Fernandes FG, van den Boom M, Ramsey N, Raemaekers M. Decoding Single and Paired Phonemes Using 7T Functional MRI. Brain Topogr 2024. [PMID: 38261272] [DOI: 10.1007/s10548-024-01034-6]
Abstract
Several studies have shown that mouth movements related to the pronunciation of individual phonemes are represented in the sensorimotor cortex. This would theoretically allow for brain-computer interfaces capable of decoding continuous speech by training classifiers on the activity in the sensorimotor cortex related to the production of individual phonemes. To address this, we investigated the decodability of trials with individual and paired phonemes (pronounced consecutively with a one-second interval) using activity in the sensorimotor cortex. Fifteen participants pronounced 3 different phonemes and 3 combinations of two of the same phonemes in a 7T functional MRI experiment. We confirmed that support vector machine (SVM) classification of single and paired phonemes was possible. Importantly, by combining classifiers trained on single phonemes, we were able to classify paired phonemes with an accuracy of 53% (33% chance level), demonstrating that activity of isolated phonemes is present and distinguishable in combined phonemes. An SVM searchlight analysis showed that the phoneme representations are widely distributed in the ventral sensorimotor cortex. These findings provide insights into the neural representations of single and paired phonemes. Furthermore, they support the notion that a speech BCI may be feasible based on machine learning algorithms trained on individual phonemes using intracranial electrode grids.
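The idea of combining classifiers trained on single phonemes to label paired-phoneme trials can be sketched as follows. A nearest-prototype classifier stands in for the study's SVMs, the response patterns are entirely synthetic, and all pairs of 3 phonemes are enumerated rather than the study's 3 specific combinations:

```python
import numpy as np

rng = np.random.default_rng(3)
n_phon, n_feat = 3, 40
protos = rng.normal(size=(n_phon, n_feat))  # hypothetical single-phoneme activity patterns

def classify_single(x):
    """Nearest single-phoneme prototype (stand-in for an SVM trained on single phonemes)."""
    return int(np.linalg.norm(protos - x, axis=1).argmin())

# A paired trial is modeled as two consecutive noisy single-phoneme responses;
# the pair label is decoded by applying the single-phoneme classifier to each part.
pairs = [(i, j) for i in range(n_phon) for j in range(n_phon)]
n_rep, correct = 20, 0
for _ in range(n_rep):
    for pair in pairs:
        trial = [protos[p] + 0.5 * rng.normal(size=n_feat) for p in pair]
        decoded = tuple(classify_single(part) for part in trial)
        correct += decoded == pair
acc = correct / (n_rep * len(pairs))
print(f"paired-phoneme accuracy from single-phoneme classifiers: {acc:.2f}")
```

Above-chance accuracy here reflects the same logic as the study's finding: information about each isolated phoneme remains distinguishable inside the combined trial.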
Affiliation(s)
- Maria Araújo Vitória
- Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Francisco Guerreiro Fernandes
- Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Max van den Boom
- Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
| | - Nick Ramsey
- Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Mathijs Raemaekers
- Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands.
10. Canny E, Vansteensel MJ, van der Salm SMA, Müller-Putz GR, Berezutskaya J. Boosting brain-computer interfaces with functional electrical stimulation: potential applications in people with locked-in syndrome. J Neuroeng Rehabil 2023; 20:157. PMID: 37980536; PMCID: PMC10656959; DOI: 10.1186/s12984-023-01272-y.
Abstract
Individuals in a locked-in state live with severe whole-body paralysis that limits their ability to communicate with family and loved ones. Recent advances in brain-computer interface (BCI) technology have presented a potential alternative for these people to communicate by detecting neural activity associated with attempted hand or speech movements and translating the decoded intended movements into a control signal for a computer. A technique that could potentially enrich the communication capacity of BCIs is functional electrical stimulation (FES) of paralyzed limbs and face to restore body and facial movements of paralyzed individuals, allowing body language and facial expression to be added to communication BCI utterances. Here, we review the current state of the art of BCI and FES work in people with paralysis of body and face and propose that a combined BCI-FES approach, which has already proved successful in several applications in stroke and spinal cord injury, can provide a promising new mode of communication for locked-in individuals.
Affiliation(s)
- Evan Canny
- Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Mariska J Vansteensel
- Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Sandra M A van der Salm
- Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Gernot R Müller-Putz
- Institute of Neural Engineering, Laboratory of Brain-Computer Interfaces, Graz University of Technology, Graz, Austria
- Julia Berezutskaya
- Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
11. Duraivel S, Rahimpour S, Chiang CH, Trumpis M, Wang C, Barth K, Harward SC, Lad SP, Friedman AH, Southwell DG, Sinha SR, Viventi J, Cogan GB. High-resolution neural recordings improve the accuracy of speech decoding. Nat Commun 2023; 14:6938. PMID: 37932250; PMCID: PMC10628285; DOI: 10.1038/s41467-023-42555-1.
Abstract
Patients suffering from debilitating neurodegenerative diseases often lose the ability to communicate, detrimentally affecting their quality of life. One solution to restore communication is to decode signals directly from the brain to enable neural speech prostheses. However, decoding has been limited by coarse neural recordings that inadequately capture the rich spatio-temporal structure of human brain signals. To resolve this limitation, we performed high-resolution micro-electrocorticographic (µECoG) neural recordings during intra-operative speech production. We obtained neural signals with 57× higher spatial resolution and 48% higher signal-to-noise ratio compared to macro-ECoG and SEEG. This increased signal quality improved decoding by 35% compared to standard intracranial signals. Accurate decoding depended on the high spatial resolution of the neural interface. Non-linear decoding models designed to utilize the enhanced spatio-temporal neural information outperformed linear techniques. We show that high-density µECoG can enable high-quality speech decoding for future neural speech prostheses.
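As a toy illustration of how such gains are reported, signal-to-noise ratio can be taken as task-band power over baseline power, with the µECoG advantage expressed as a fractional improvement. The numbers below are invented stand-ins chosen to reproduce a 48% gain; they are not values from the paper.

```python
# Hypothetical sketch: SNR as task power over baseline power, and the relative
# µECoG-vs-macro-ECoG advantage as a ratio of SNRs. All values are toy numbers.
def snr(task_power, baseline_power):
    """Simple power-ratio signal-to-noise estimate."""
    return task_power / baseline_power

snr_micro = snr(task_power=7.4, baseline_power=2.0)  # invented µECoG values
snr_macro = snr(task_power=5.0, baseline_power=2.0)  # invented macro-ECoG values
gain = snr_micro / snr_macro - 1.0                   # fractional SNR improvement
```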
Affiliation(s)
- Shervin Rahimpour
- Department of Neurosurgery, Duke School of Medicine, Durham, NC, USA
- Department of Neurosurgery, Clinical Neuroscience Center, University of Utah, Salt Lake City, UT, USA
- Chia-Han Chiang
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Michael Trumpis
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Charles Wang
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Katrina Barth
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Stephen C Harward
- Department of Neurosurgery, Duke School of Medicine, Durham, NC, USA
- Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC, USA
- Shivanand P Lad
- Department of Neurosurgery, Duke School of Medicine, Durham, NC, USA
- Allan H Friedman
- Department of Neurosurgery, Duke School of Medicine, Durham, NC, USA
- Derek G Southwell
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Department of Neurosurgery, Duke School of Medicine, Durham, NC, USA
- Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC, USA
- Department of Neurobiology, Duke School of Medicine, Durham, NC, USA
- Saurabh R Sinha
- Penn Epilepsy Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jonathan Viventi
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Department of Neurosurgery, Duke School of Medicine, Durham, NC, USA
- Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC, USA
- Department of Neurobiology, Duke School of Medicine, Durham, NC, USA
- Gregory B Cogan
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Department of Neurosurgery, Duke School of Medicine, Durham, NC, USA
- Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC, USA
- Department of Neurology, Duke School of Medicine, Durham, NC, USA
- Department of Psychology and Neuroscience, Duke University, Durham, NC, USA
- Center for Cognitive Neuroscience, Duke University, Durham, NC, USA
12. Wilmskoetter J, Roth R, McDowell K, Munsell B, Fontenot S, Andrews K, Chang A, Johnson LP, Sangtian S, Behroozmand R, van Mierlo P, Fridriksson J, Bonilha L. Semantic categorization of naming responses based on prearticulatory electrical brain activity. J Clin Neurophysiol 2023; 40:608-615. PMID: 37931162; PMCID: PMC10628367; DOI: 10.1097/wnp.0000000000000933.
Abstract
PURPOSE: Object naming requires visual decoding, conceptualization, semantic categorization, and phonological encoding, all within 400 to 600 ms of stimulus presentation and before a word is spoken. In this study, we sought to predict the semantic categories of naming responses from prearticulatory brain activity recorded with scalp EEG in healthy individuals. METHODS: We assessed 19 healthy individuals who completed a naming task while undergoing EEG. The naming task consisted of 120 drawings of animate objects, inanimate objects, or abstract figures. We applied a one-dimensional, two-layer neural network to predict the semantic categories of naming responses from prearticulatory brain activity. RESULTS: Classification of animate, inanimate, and abstract responses had an average accuracy of 80%, sensitivity of 72%, and specificity of 87% across participants. The time points with the highest average weights fell between 470 and 490 ms after stimulus presentation, and the electrodes with the highest weights were located over the left and right frontal brain areas. CONCLUSIONS: Scalp EEG can be successfully used to predict naming responses from prearticulatory brain activity. Interparticipant variability in feature weights suggests that individualized models are necessary for the highest accuracy. Our findings may inform future applications of EEG in reconstructing speech for individuals with and without speech impairments.
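The reported metrics (accuracy, plus sensitivity and specificity for a multi-class problem) can be computed one-vs-rest from true and predicted category labels, roughly as below. The toy labels are invented, and the network itself is omitted.

```python
# Sketch of one-vs-rest accuracy/sensitivity/specificity for a three-class
# (animate / inanimate / abstract) prediction. Labels below are toy data,
# not the study's.
import numpy as np

y_true = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 0, 1, 2, 2])

def one_vs_rest_metrics(y_true, y_pred, cls):
    """Sensitivity and specificity for class `cls` against all other classes."""
    pos = y_true == cls
    tp = np.sum(pos & (y_pred == cls))      # correctly detected target class
    tn = np.sum(~pos & (y_pred != cls))     # correctly rejected other classes
    sensitivity = tp / np.sum(pos)
    specificity = tn / np.sum(~pos)
    return sensitivity, specificity

accuracy = float(np.mean(y_true == y_pred))
sens0, spec0 = one_vs_rest_metrics(y_true, y_pred, 0)
```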
Affiliation(s)
- Janina Wilmskoetter
- Department of Rehabilitation Sciences, College of Health Professions, Medical University of South Carolina, Charleston, SC 29425, USA
- Rebecca Roth
- Department of Neurology, College of Medicine, Medical University of South Carolina, Charleston, SC 29425, USA
- Konnor McDowell
- Department of Neurology, College of Medicine, Medical University of South Carolina, Charleston, SC 29425, USA
- Brent Munsell
- Department of Computer Science, College of Arts and Sciences, University of North Carolina-Chapel Hill, Chapel Hill, NC 27599, USA
- Skyler Fontenot
- Department of Neurology, College of Medicine, Medical University of South Carolina, Charleston, SC 29425, USA
- Keeghan Andrews
- Department of Neurology, College of Medicine, Medical University of South Carolina, Charleston, SC 29425, USA
- Allen Chang
- Department of Neurology, College of Medicine, Medical University of South Carolina, Charleston, SC 29425, USA
- Lorelei Phillip Johnson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC 29208, USA
- Stacey Sangtian
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC 29208, USA
- Roozbeh Behroozmand
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC 29208, USA
- Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, SC 29208, USA
- Leonardo Bonilha
- Department of Neurology, College of Medicine, Medical University of South Carolina, Charleston, SC 29425, USA
13. Sankaran N, Moses D, Chiong W, Chang EF. Recommendations for promoting user agency in the design of speech neuroprostheses. Front Hum Neurosci 2023; 17:1298129. PMID: 37920562; PMCID: PMC10619159; DOI: 10.3389/fnhum.2023.1298129.
Abstract
Brain-computer interfaces (BCIs) that directly decode speech from brain activity aim to restore communication in people with paralysis who cannot speak. Despite recent advances, neural inference of speech remains imperfect, limiting the ability of speech BCIs to enable experiences such as fluent conversation that promote agency, that is, the ability of users to author and transmit messages enacting their intentions. Here, we make recommendations for promoting agency based on existing and emerging strategies in neural engineering. The focus is on achieving fast, accurate, and reliable performance while ensuring volitional control over when a decoder is engaged, what exactly is decoded, and how messages are expressed. Additionally, alongside neuroscientific progress within controlled experimental settings, we argue that a parallel line of research must consider how to translate experimental successes into real-world environments. While such research will ultimately require input from prospective users, here we identify and describe design choices inspired by human-factors work in existing fields of assistive technology, which address practical issues likely to emerge in future real-world speech BCI applications.
Affiliation(s)
- Narayan Sankaran
- Kavli Center for Ethics, Science and the Public, University of California, Berkeley, Berkeley, CA, United States
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- David Moses
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- Winston Chiong
- Memory and Aging Center, Department of Neurology, University of California, San Francisco, San Francisco, CA, United States
- Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
14. Berezutskaya J, Freudenburg ZV, Vansteensel MJ, Aarnoutse EJ, Ramsey NF, van Gerven MAJ. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. J Neural Eng 2023; 20:056010. PMID: 37467739; PMCID: PMC10510111; DOI: 10.1088/1741-2552/ace8be.
Abstract
Objective. Development of brain-computer interface (BCI) technology is key to enabling communication in individuals who have lost the faculty of speech due to severe motor paralysis. A BCI control strategy that is gaining attention employs speech decoding from neural data. Recent studies have shown that a combination of direct neural recordings and advanced computational models can provide promising results. Understanding which decoding strategies deliver the best and most directly applicable results is crucial for advancing the field. Approach. In this paper, we optimized and validated a decoding approach based on speech reconstruction directly from high-density electrocorticography recordings from sensorimotor cortex during a speech production task. Main results. We show that (1) dedicated machine learning optimization of reconstruction models is key to achieving the best reconstruction performance; (2) individual word decoding in reconstructed speech achieves 92%-100% accuracy (chance level is 8%); (3) direct reconstruction from sensorimotor brain activity produces intelligible speech. Significance. These results underline the need for model optimization to achieve the best speech decoding results and highlight the potential that reconstruction-based speech decoding from sensorimotor cortex offers for the development of next-generation BCI technology for communication.
Affiliation(s)
- Julia Berezutskaya
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
- Zachary V Freudenburg
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Mariska J Vansteensel
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Erik J Aarnoutse
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Nick F Ramsey
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Marcel A J van Gerven
- Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
15. Zhao Y, Chen Y, Cheng K, Huang W. Artificial intelligence based multimodal language decoding from brain activity: a review. Brain Res Bull 2023; 201:110713. PMID: 37487829; DOI: 10.1016/j.brainresbull.2023.110713.
Abstract
Decoding brain activity is conducive to breakthroughs in brain-computer interface (BCI) technology. The development of artificial intelligence (AI) continually promotes progress in brain language decoding. Existing research has mainly focused on a single modality and paid insufficient attention to AI methods. Our objective is therefore to provide an overview of relevant decoding research from the perspective of different modalities and methodologies. The modalities involve text, speech, image, and video, whereas the core method is using AI-built decoders to translate brain signals induced by multimodal stimuli into text or vocal language. The semantic information of brain activity can be successfully decoded into language at various levels, ranging from words through sentences to discourses. However, the decoding effect is affected by various factors, such as the decoding model, the vector representation model, and the brain regions involved. Challenges and future directions are also discussed. Advances in brain language decoding and BCI technology could ultimately assist patients with clinical aphasia in regaining the ability to communicate.
Affiliation(s)
- Yuhao Zhao
- College of Language Intelligence, Sichuan International Studies University, Chongqing 400031, PR China
- Yu Chen
- Technical College for the Deaf, Tianjin University of Technology, Tianjin 300384, PR China
- Kaiwen Cheng
- College of Language Intelligence, Sichuan International Studies University, Chongqing 400031, PR China
- Wei Huang
- Sichuan Provincial Key Laboratory for Human Disease Gene Study, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, PR China
16. Easthope E, Shamei A, Liu Y, Gick B, Fels S. Cortical control of posture in fine motor skills: evidence from inter-utterance rest position. Front Hum Neurosci 2023; 17:1139569. PMID: 37662639; PMCID: PMC10469778; DOI: 10.3389/fnhum.2023.1139569.
Abstract
The vocal tract continuously employs tonic muscle activity in the maintenance of postural configurations. Gamma-band activity in the sensorimotor cortex underlies transient movements during speech production, yet little is known about the neural control of postural states in the vocal tract. Simultaneously, there is evidence that sensorimotor beta-band activations contribute to a system of inhibition and state maintenance that is integral to postural control in the body. Here we use electrocorticography to assess the contribution of sensorimotor beta-band activity during speech articulation and postural maintenance, and demonstrate that beta-band activity corresponds to the inhibition of discrete speech movements and the maintenance of tonic postural states in the vocal tract. Our findings identify consistencies between the neural control of posture in speech and what is previously reported in gross motor contexts, providing support for a unified theory of postural control across gross and fine motor skills.
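The band-power contrast at the core of this comparison (beta-band activity for postural maintenance versus gamma-band activity for transient movements) can be sketched with a simple FFT-based measure on a synthetic signal. The signal, band edges, and sampling rate below are illustrative assumptions, not the study's recordings or pipeline.

```python
# Minimal band-power sketch (synthetic signal): a 1 s trace with a strong
# 20 Hz (beta) component should show far more beta-band than gamma-band power.
import numpy as np

fs = 1000                                    # sampling rate, Hz (assumed)
t = np.arange(fs) / fs
rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * 20 * t) + 0.2 * rng.normal(size=fs)  # beta-dominated toy signal

def band_power(x, fs, lo, hi):
    """Mean spectral power of x between lo and hi Hz (one-sided FFT)."""
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    band = (freqs >= lo) & (freqs < hi)
    return float(psd[band].mean())

beta = band_power(x, fs, 13, 30)     # beta band
gamma = band_power(x, fs, 70, 170)   # (high-)gamma band
```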
Affiliation(s)
- Eric Easthope
- Human Communication Technologies Lab, Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
- Arian Shamei
- Integrated Speech Research Lab, Department of Linguistics, University of British Columbia, Vancouver, BC, Canada
- Yadong Liu
- Integrated Speech Research Lab, Department of Linguistics, University of British Columbia, Vancouver, BC, Canada
- Bryan Gick
- Integrated Speech Research Lab, Department of Linguistics, University of British Columbia, Vancouver, BC, Canada
- Haskins Laboratories, New Haven, CT, United States
- Sidney Fels
- Human Communication Technologies Lab, Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
17. Angrick M, Luo S, Rabbani Q, Candrea DN, Shah S, Milsap GW, Anderson WS, Gordon CR, Rosenblatt KR, Clawson L, Maragakis N, Tenore FV, Fifer MS, Hermansky H, Ramsey NF, Crone NE. Online speech synthesis using a chronically implanted brain-computer interface in an individual with ALS. medRxiv 2023:2023.06.30.23291352. PMID: 37425721; PMCID: PMC10327279; DOI: 10.1101/2023.06.30.23291352.
Abstract
Recent studies have shown that speech can be reconstructed and synthesized using only brain activity recorded with intracranial electrodes, but until now this has only been done through retrospective analyses of recordings from able-bodied patients temporarily implanted with electrodes for epilepsy surgery. Here, we report online synthesis of intelligible words using a chronically implanted brain-computer interface (BCI) in a clinical trial participant (ClinicalTrials.gov, NCT03567213) with dysarthria due to amyotrophic lateral sclerosis (ALS). We demonstrate a reliable BCI that synthesizes commands freely chosen and spoken by the user from a vocabulary of 6 keywords originally designed to allow intuitive selection of items on a communication board. Our results show for the first time that a speech-impaired individual with ALS can use a chronically implanted BCI to reliably produce synthesized words that are intelligible to human listeners while preserving the participant's voice profile.
Affiliation(s)
- Miguel Angrick
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Shiyu Luo
- Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Qinwan Rabbani
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA
- Daniel N Candrea
- Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Samyak Shah
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Griffin W Milsap
- Research and Exploratory Development Department, Johns Hopkins Applied Physics Laboratory, Laurel, MD, USA
- William S Anderson
- Department of Neurosurgery, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Chad R Gordon
- Department of Neurosurgery, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Section of Neuroplastic and Reconstructive Surgery, Department of Plastic Surgery, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kathryn R Rosenblatt
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Anesthesiology & Critical Care Medicine, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Lora Clawson
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Nicholas Maragakis
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Francesco V Tenore
- Research and Exploratory Development Department, Johns Hopkins Applied Physics Laboratory, Laurel, MD, USA
- Matthew S Fifer
- Research and Exploratory Development Department, Johns Hopkins Applied Physics Laboratory, Laurel, MD, USA
- Hynek Hermansky
- Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD, USA
- Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, MD, USA
- Nick F Ramsey
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Nathan E Crone
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
18. Soroush PZ, Herff C, Ries SK, Shih JJ, Schultz T, Krusienski DJ. The nested hierarchy of overt, mouthed, and imagined speech activity evident in intracranial recordings. Neuroimage 2023; 269:119913. PMID: 36731812; DOI: 10.1016/j.neuroimage.2023.119913.
Abstract
Recent studies have demonstrated that it is possible to decode and synthesize various aspects of acoustic speech directly from intracranial measurements of electrophysiological brain activity. To continue progressing toward the development of a practical speech neuroprosthesis for individuals with speech impairments, better understanding and modeling of imagined speech processes are required. The present study uses intracranial brain recordings from participants who performed a speaking task with trials consisting of overt, mouthed, and imagined speech modes, representing decreasing degrees of behavioral output. Speech activity detection models are constructed using spatial, spectral, and temporal brain activity features, and the features and model performances are characterized and compared across the three degrees of behavioral output. The results indicate a hierarchy in which the relevant channels for the lower behavioral output modes form nested subsets of the relevant channels for the higher behavioral output modes. This provides important insights for the elusive goal of developing more effective imagined speech decoding models relative to their better-established overt speech counterparts.
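The nested-hierarchy claim can be made concrete with a small check over per-mode sets of relevant channels: each lower-output mode's set should be contained in the next higher-output mode's set. The channel names below are invented for illustration.

```python
# Sketch of the "nested hierarchy" property: relevant channels for imagined
# speech ⊆ those for mouthed speech ⊆ those for overt speech.
# Channel sets here are toy examples, not the study's data.
relevant = {
    "overt":    {"ch1", "ch2", "ch3", "ch4", "ch5"},
    "mouthed":  {"ch2", "ch3", "ch5"},
    "imagined": {"ch2", "ch5"},
}

def is_nested(relevant, order=("imagined", "mouthed", "overt")):
    """True if each mode's channel set is contained in the next mode's set."""
    return all(relevant[a] <= relevant[b] for a, b in zip(order, order[1:]))

nested = is_nested(relevant)
```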
19. Branco MP, Geukes SH, Aarnoutse EJ, Ramsey NF, Vansteensel MJ. Nine decades of electrocorticography: a comparison between epidural and subdural recordings. Eur J Neurosci 2023; 57:1260-1288. PMID: 36843389; DOI: 10.1111/ejn.15941.
Abstract
In recent years, electrocorticography (ECoG) has arisen as a neural signal recording tool in the development of clinically viable neural interfaces. ECoG electrodes are generally placed below the dura mater (subdural) but can also be placed on top of the dura (epidural). In deciding which of these modalities best suits long-term implants, complications and signal quality are important considerations. Conceptually, epidural placement may present a lower risk of complications, as the dura is left intact, but also lower signal quality, with the dura acting as a signal attenuator. The extent to which complications and signal quality are affected by the dura, however, has been a matter of debate. To improve our understanding of the effects of the dura on complications and signal quality, we conducted a literature review. We inventoried the effect of the dura on signal quality, decodability, and longevity of acute and chronic ECoG recordings in humans and non-human primates. We also compared the incidence and nature of serious complications in studies that employed epidural and subdural ECoG. Overall, we found that, even though epidural recordings exhibit attenuated signal amplitude relative to subdural recordings, particularly for high-density grids, the decodability of epidurally recorded signals does not seem to be markedly affected. Additionally, we found that the nature of serious complications was comparable between epidural and subdural recordings. These results indicate that both epidural and subdural ECoG may be suited for long-term neural signal recordings, at least for current generations of clinical and high-density ECoG grids.
Affiliation(s)
- Mariana P Branco
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Simon H Geukes
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Erik J Aarnoutse
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Nick F Ramsey
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Mariska J Vansteensel
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
20. Towards clinical application of implantable brain-computer interfaces for people with late-stage ALS: medical and ethical considerations. J Neurol 2023; 270:1323-1336. PMID: 36450968; PMCID: PMC9971103; DOI: 10.1007/s00415-022-11464-6.
Abstract
Individuals with amyotrophic lateral sclerosis (ALS) frequently develop speech and communication problems in the course of their disease. Currently available augmentative and alternative communication technologies do not present a solution for many people with advanced ALS, because these devices depend on residual and reliable motor activity. Brain-computer interfaces (BCIs) use neural signals for computer control and may allow people with late-stage ALS to communicate even when conventional technology falls short. Recent years have witnessed fast progress in the development and validation of implanted BCIs, which place neural signal recording electrodes in or on the cortex. Eventual widespread clinical application of implanted BCIs as an assistive communication technology for people with ALS will have significant consequences for their daily life, as well as for the clinical management of the disease, not least because of the potential interaction between the BCI and other procedures people with ALS undergo, such as tracheostomy. This article aims to facilitate responsible real-world implementation of implanted BCIs. We review the state of the art of research on implanted BCIs for communication, as well as the medical and ethical implications of the clinical application of this technology. We conclude that the contribution of all BCI stakeholders, including clinicians of the various ALS-related disciplines, will be needed to develop procedures for, and shape the process of, the responsible clinical application of implanted BCIs.
21. Verwoert M, Ottenhoff MC, Goulis S, Colon AJ, Wagner L, Tousseyn S, van Dijk JP, Kubben PL, Herff C. Dataset of speech production in intracranial electroencephalography. Sci Data 2022; 9:434. PMID: 35869138; PMCID: PMC9307753; DOI: 10.1038/s41597-022-01542-9.
Abstract
Speech production is an intricate process involving a large number of muscles and cognitive processes. The neural processes underlying speech production are not completely understood. As speech is a uniquely human ability, it cannot be investigated in animal models. High-fidelity human data can only be obtained in clinical settings and are therefore not easily available to all researchers. Here, we provide a dataset of 10 participants reading out individual words while we measured intracranial EEG from a total of 1103 electrodes. The data, with their high temporal resolution and coverage of a large variety of cortical and sub-cortical brain regions, can help in better understanding the speech production process. At the same time, the data can be used to test speech decoding and synthesis approaches from neural data, toward the development of speech brain-computer interfaces and speech neuroprostheses.
Measurement(s): Brain activity. Technology Type(s): Stereotactic electroencephalography. Sample Characteristic - Organism: Homo sapiens. Sample Characteristic - Environment: Epilepsy monitoring center. Sample Characteristic - Location: The Netherlands.
22. Petrosyan A, Voskoboinikov A, Sukhinin D, Makarova A, Skalnaya A, Arkhipova N, Sinkin M, Ossadtchi A. Speech decoding from a small set of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network. J Neural Eng 2022; 19. PMID: 36356309; DOI: 10.1088/1741-2552/aca1e1.
Abstract
Objective. Speech decoding, one of the most intriguing brain-computer interface applications, opens up plentiful opportunities, from the rehabilitation of patients to direct and seamless communication between humans. Typical solutions rely on invasive recordings with a large number of distributed electrodes implanted through craniotomy. Here we explored the possibility of creating a speech prosthesis in a minimally invasive setting with a small number of spatially segregated intracranial electrodes. Approach. We collected one hour of data (from two sessions) in two patients implanted with invasive electrodes. We then used only the contacts that pertained to a single stereotactic electroencephalographic (sEEG) shaft or a single electrocorticographic (ECoG) strip to decode neural activity into 26 words and one silence class. We employed a compact convolutional network-based architecture whose spatial and temporal filter weights allow for a physiologically plausible interpretation. Main results. We achieved on average 55% accuracy using only six channels of data recorded with a single minimally invasive sEEG electrode in the first patient, and 70% accuracy using only eight channels of data recorded from a single ECoG strip in the second patient, in classifying 26+1 overtly pronounced words. Our compact architecture did not require the use of pre-engineered features, learned fast, and resulted in a stable, interpretable and physiologically meaningful decision rule that successfully operated over a contiguous dataset collected during a different time interval than that used for training. Spatial characteristics of the pivotal neuronal populations corroborate active and passive speech mapping results and exhibit the inverse space-frequency relationship characteristic of neural activity. Compared to other architectures, our compact solution performed on par with or better than those recently featured in the neural speech decoding literature. Significance. We showcase the possibility of building a speech prosthesis with a small number of electrodes, based on a compact, feature-engineering-free decoder derived from a small amount of training data.
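The decoding setup in the entry above (26 words plus a silence class from six to eight channels) can be illustrated with a deliberately simple stand-in classifier. The sketch below substitutes a nearest-centroid rule for the compact convolutional network described in the paper; the channel and class counts mirror the study, but the data and all parameter values are synthetic assumptions.

```python
import numpy as np

class NearestCentroidDecoder:
    """Toy word decoder: assign a trial to the class whose mean
    feature vector (centroid) is closest in Euclidean distance.
    An illustrative stand-in, not the cited architecture."""

    def fit(self, X, y):
        # X: (n_trials, n_features), y: integer class labels.
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared distance from every trial to every centroid.
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=-1)
        return self.classes_[d.argmin(axis=1)]

# Synthetic "per-channel high-gamma" features: 6 channels, 27 classes (26 words + silence).
rng = np.random.default_rng(0)
n_classes, n_channels, trials_per_class = 27, 6, 10
centers = 3.0 * rng.standard_normal((n_classes, n_channels))
y = np.repeat(np.arange(n_classes), trials_per_class)
X = centers[y] + rng.standard_normal((len(y), n_channels))

clf = NearestCentroidDecoder().fit(X, y)
acc = (clf.predict(X) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

A real pipeline would, as the entry stresses, evaluate on data from a separate recording interval rather than on the training set shown here.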
Affiliation(s)
- Artur Petrosyan
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia
- Dmitrii Sukhinin
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia
- Anna Makarova
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia
- Mikhail Sinkin
- Moscow State University of Medicine and Dentistry; N.V. Sklifosovsky Research Institute of Emergency Medicine, Moscow, Russia
- Alexei Ossadtchi
- Center for Bioelectric Interfaces, Higher School of Economics, Moscow, Russia; Artificial Intelligence Research Institute (AIRI), Moscow, Russia
23. Rainey S. Speaker Responsibility for Synthetic Speech Derived from Neural Activity. J Med Philos 2022; 47:503-515. PMID: 36333930; DOI: 10.1093/jmp/jhac011.
Abstract
This article provides analysis of the mechanisms and outputs involved in language-use mediated by a neuroprosthetic device. It is motivated by the thought that users of speech neuroprostheses require sufficient control over what their devices externalize as synthetic speech if they are to be thought of as responsible for it, but that the nature of this control, and so the status of their responsibility, is not clear.
24. Vansteensel MJ, Branco MP, Leinders S, Freudenburg ZF, Schippers A, Geukes SH, Gaytant MA, Gosselaar PH, Aarnoutse EJ, Ramsey NF. Methodological Recommendations for Studies on the Daily Life Implementation of Implantable Communication-Brain-Computer Interfaces for Individuals With Locked-in Syndrome. Neurorehabil Neural Repair 2022; 36:666-677. PMID: 36124975; DOI: 10.1177/15459683221125788.
Abstract
Implantable brain-computer interfaces (BCIs) promise to be a viable means to restore communication in individuals with locked-in syndrome (LIS). In 2016, we presented the world's first fully implantable BCI system that uses subdural electrocorticography electrodes to record brain signals and a subcutaneous amplifier to transmit the signals to the outside world, and that enabled an individual with LIS to communicate via a tablet computer by selecting icons in spelling software. For future clinical implementation of implantable communication-BCIs, however, much work is still needed, for example, to validate these systems in daily life settings with more participants, and to improve the speed of communication. We believe the design and execution of future studies on these and other topics may benefit from the experience we have gained. Therefore, based on relevant literature and our own experiences, we here provide an overview of procedures, as well as recommendations, for recruitment, screening, inclusion, imaging, hospital admission, implantation, training, and support of participants with LIS, for studies on daily life implementation of implantable communication-BCIs. With this article, we not only aim to inform the BCI community about important topics of concern, but also hope to contribute to improved methodological standardization of implantable BCI research.
Affiliation(s)
- Mariska J Vansteensel
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Mariana P Branco
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Sacha Leinders
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Zac F Freudenburg
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Anouck Schippers
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Simon H Geukes
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Michael A Gaytant
- Department of Pulmonary Diseases/Home Mechanical Ventilation, University Medical Center Utrecht, Utrecht, The Netherlands
- Peter H Gosselaar
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Erik J Aarnoutse
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Nick F Ramsey
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
25. Kennedy P, Cervantes AJ. Recruitment and Differential Firing Patterns of Single Units During Conditioning to a Tone in a Mute Locked-In Human. Front Hum Neurosci 2022; 16:864983. PMID: 36211127; PMCID: PMC9532552; DOI: 10.3389/fnhum.2022.864983.
Abstract
Single units that are not related to the desired task can become related to the task by conditioning their firing rates. We theorized that, during conditioning of firing rates to a tone, (a) unrelated single units would be recruited to the task; (b) the recruitment would depend on the phase of the task; (c) tones of different frequencies would produce different patterns of single unit recruitment. In our mute locked-in participant, we conditioned single units using tones of different frequencies emitted from a tone generator. The conditioning task had three phases: Listen to the tone for 20 s, then silently sing the tone for 10 s, with a prior control period of resting for 10 s. Twenty single units were recorded simultaneously while feedback of one of the twenty single units was made audible to the mute locked-in participant. The results indicate that (a) some of the non-audible single units were recruited during conditioning, (b) some were recruited differentially depending on the phase of the paradigm (listen, rest, or silent sing), and (c) single unit firing patterns were specific for different tone frequencies such that the tone could be recognized from the pattern of single unit firings. These data are important when conditioning single unit firings in brain-computer interfacing tasks because they provide evidence that increased numbers of previously unrelated single units can be incorporated into the task. This incorporation expands the bandwidth of the recorded single unit population and thus enhances the brain-computer interface. This is the first report of conditioning of single unit firings in a human participant with a brain-computer implant.
Affiliation(s)
- Philip Kennedy
- Neural Signals, Inc., Duluth, GA, United States
26. Mercier MR, Dubarry AS, Tadel F, Avanzini P, Axmacher N, Cellier D, Vecchio MD, Hamilton LS, Hermes D, Kahana MJ, Knight RT, Llorens A, Megevand P, Melloni L, Miller KJ, Piai V, Puce A, Ramsey NF, Schwiedrzik CM, Smith SE, Stolk A, Swann NC, Vansteensel MJ, Voytek B, Wang L, Lachaux JP, Oostenveld R. Advances in human intracranial electroencephalography research, guidelines and good practices. Neuroimage 2022; 260:119438. PMID: 35792291; DOI: 10.1016/j.neuroimage.2022.119438.
Abstract
Since the second half of the twentieth century, intracranial electroencephalography (iEEG), including both electrocorticography (ECoG) and stereo-electroencephalography (sEEG), has provided an intimate view into the human brain. At the interface between fundamental research and the clinic, iEEG provides both high temporal resolution and high spatial specificity, but comes with constraints, such as the sparse, individually tailored electrode sampling. Over the years, researchers in neuroscience developed their practices to make the most of the iEEG approach. Here we offer a critical review of iEEG research practices in a didactic framework for newcomers, as well as addressing issues encountered by proficient researchers. The scope is threefold: (i) review common practices in iEEG research, (ii) suggest potential guidelines for working with iEEG data and answer frequently asked questions based on the most widespread practices, and (iii) based on current neurophysiological knowledge and methodologies, pave the way to good practice standards in iEEG research. The organization of this paper follows the steps of iEEG data processing. The first section contextualizes iEEG data collection. The second section focuses on localization of intracranial electrodes. The third section highlights the main pre-processing steps. The fourth section presents iEEG signal analysis methods. The fifth section discusses statistical approaches. The sixth section draws some unique perspectives on iEEG research. Finally, to ensure a consistent nomenclature throughout the manuscript and to align with other guidelines, e.g., Brain Imaging Data Structure (BIDS) and the OHBM Committee on Best Practices in Data Analysis and Sharing (COBIDAS), we provide a glossary to disambiguate terms related to iEEG research.
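For speech and motor studies of the kind collected in this list, the pre-processing steps surveyed in the entry above often reduce to a short chain: common-average re-referencing, line-noise removal, and broadband high-gamma envelope extraction. The sketch below illustrates such a chain; the filter orders, the 70-170 Hz band edges, and the 50 Hz line frequency are illustrative assumptions, not recommendations taken from the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, iirnotch

def preprocess_ieeg(data, fs=1000.0):
    """Illustrative iEEG pre-processing chain.
    data: (n_channels, n_samples) raw voltages; fs: sampling rate in Hz.
    Returns the high-gamma analytic amplitude (envelope) per channel."""
    # 1) Common-average reference: subtract the mean across electrodes.
    car = data - data.mean(axis=0, keepdims=True)
    # 2) Notch out 50 Hz line noise and its first harmonic.
    for f0 in (50.0, 100.0):
        b, a = iirnotch(f0, Q=30.0, fs=fs)
        car = filtfilt(b, a, car, axis=-1)
    # 3) Band-pass in a broadband high-gamma range (70-170 Hz).
    b, a = butter(4, [70.0, 170.0], btype="bandpass", fs=fs)
    hg = filtfilt(b, a, car, axis=-1)
    # 4) Envelope via the analytic signal (Hilbert transform).
    return np.abs(hilbert(hg, axis=-1))

rng = np.random.default_rng(0)
env = preprocess_ieeg(rng.standard_normal((16, 2000)))
print(env.shape)  # (16, 2000)
```

Re-referencing schemes and band definitions vary across labs, which is precisely the kind of methodological variability the cited guidelines aim to make explicit.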
27. Favero P, Berezutskaya J, Ramsey NF, Nazarov A, Freudenburg ZV. Mapping Acoustics to Articulatory Gestures in Dutch: Relating Speech Gestures, Acoustics and Neural Data. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:802-806. PMID: 36085697; DOI: 10.1109/embc48229.2022.9871909.
Abstract
Completely locked-in patients suffer from paralysis affecting every muscle in their body, reducing their communication means to brain-computer interfaces (BCIs). State-of-the-art BCIs have a slow spelling rate, which inevitably places a burden on patients' quality of life. Novel techniques address this problem by following a bio-mimetic approach, which consists of decoding sensory-motor cortex (SMC) activity that underlies the movements of the vocal tract's articulators. As recording articulatory data in combination with neural recordings is often unfeasible, the goal of this study was to develop an acoustic-to-articulatory inversion (AAI) model, i.e. an algorithm that generates articulatory data (speech gestures) from acoustics. A fully convolutional neural network was trained to solve the AAI mapping, and was tested on an unseen acoustic set, recorded simultaneously with neural data. Representational similarity analysis was then used to assess the relationship between predicted gestures and neural responses. The network's predictions and targets were significantly correlated. Moreover, SMC neural activity was correlated to the vocal tract gestural dynamics. The present AAI model has the potential to further our understanding of the relationship between neural, gestural and acoustic signals and lay the foundations for the development of a bio-mimetic speech BCI. Clinical Relevance: This study investigates the relationship between articulatory gestures during speech and the underlying neural activity. The topic is central for development of brain-computer interfaces for severely paralysed individuals.
28. Berezutskaya J, Ambrogioni L, Ramsey NF, van Gerven MAJ. Towards Naturalistic Speech Decoding from Intracranial Brain Data. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:3100-3104. PMID: 36085779; DOI: 10.1109/embc48229.2022.9871301.
Abstract
Speech decoding from brain activity can enable development of brain-computer interfaces (BCIs) to restore naturalistic communication in paralyzed patients. Previous work has focused on development of decoding models from isolated speech data with a clean background and multiple repetitions of the material. In this study, we describe a novel approach to speech decoding that relies on a generative adversarial neural network (GAN) to reconstruct speech from brain data recorded during a naturalistic speech listening task (watching a movie). We compared the GAN-based approach, where reconstruction was done from the compressed latent representation of sound decoded from the brain, with several baseline models that reconstructed the sound spectrogram directly. We show that the novel approach provides more accurate reconstructions compared to the baselines. These results underscore the potential of GAN models for speech decoding in naturalistic noisy environments and the further advancement of BCIs for naturalistic communication. Clinical Relevance: This study presents a novel speech decoding paradigm that combines advances in deep learning, speech synthesis and neural engineering, and has the potential to advance the field of BCI for severely paralyzed individuals.
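The baseline models mentioned in the entry above map neural features directly to sound-spectrogram frames. A minimal linear version of such a direct-mapping baseline is ridge regression from brain features to spectrogram bins; the sketch below uses synthetic data, and the feature count (40) and bin count (32) are arbitrary assumptions, not values from the study.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression: solve (X'X + lam*I) W = X'Y.
    X: (n_samples, n_features) neural features,
    Y: (n_samples, n_bins) spectrogram frames."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 40))          # e.g. high-gamma features over electrodes
W_true = rng.standard_normal((40, 32))      # ground-truth linear mapping
Y = X @ W_true + 0.1 * rng.standard_normal((500, 32))  # noisy spectrogram frames

W = ridge_fit(X, Y, lam=1e-2)
err = np.abs(W - W_true).mean()
print(f"mean coefficient error: {err:.4f}")
```

The GAN-based model in the entry replaces this direct spectrogram mapping with reconstruction from a compressed latent sound representation, which is what yields the reported accuracy gains.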
29. Mirchi N, Warsi NM, Zhang F, Wong SM, Suresh H, Mithani K, Erdman L, Ibrahim GM. Decoding Intracranial EEG With Machine Learning: A Systematic Review. Front Hum Neurosci 2022; 16:913777. PMID: 35832872; PMCID: PMC9271576; DOI: 10.3389/fnhum.2022.913777.
Abstract
Advances in intracranial electroencephalography (iEEG) and neurophysiology have enabled the study of previously inaccessible brain regions with high-fidelity temporal and spatial resolution. Studies of iEEG have revealed a rich neural code subserving healthy brain function and which fails in disease states. Machine learning (ML), a form of artificial intelligence, is a modern tool that may be able to better decode complex neural signals and enhance interpretation of these data. To date, a number of publications have applied ML to iEEG, but clinician awareness of these techniques, and of their relevance to neurosurgery, has been limited. The present work presents a review of existing applications of ML techniques in iEEG data, discusses the relative merits and limitations of the various approaches, and examines potential avenues for clinical translation in neurosurgery. One hundred seven articles examining artificial intelligence applications to iEEG were identified from three databases. Clinical applications of ML from these articles were categorized into four domains: i) seizure analysis, ii) motor tasks, iii) cognitive assessment, and iv) sleep staging. The review revealed that supervised algorithms were most commonly used across studies and often leveraged publicly available time-series datasets. We conclude with recommendations for future work and potential clinical applications.
Affiliation(s)
- Nykan Mirchi
- Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Nebras M. Warsi
- Division of Neurosurgery, Hospital for Sick Children, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Frederick Zhang
- Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Simeon M. Wong
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Program in Neuroscience and Mental Health, Hospital for Sick Children Research Institute, Toronto, ON, Canada
- Hrishikesh Suresh
- Division of Neurosurgery, Hospital for Sick Children, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Karim Mithani
- Division of Neurosurgery, Hospital for Sick Children, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Lauren Erdman
- Vector Institute for Artificial Intelligence, MaRS Centre, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Hospital for Sick Children, Toronto, ON, Canada
- George M. Ibrahim
- Division of Neurosurgery, Hospital for Sick Children, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Program in Neuroscience and Mental Health, Hospital for Sick Children Research Institute, Toronto, ON, Canada
- Institute of Medical Science, University of Toronto, Toronto, ON, Canada
30. Lin Y, Hsieh PJ. Neural decoding of speech with semantic-based classification. Cortex 2022; 154:231-240. DOI: 10.1016/j.cortex.2022.05.018.
31. Michail G, Senkowski D, Holtkamp M, Wächter B, Keil J. Early beta oscillations in multisensory association areas underlie crossmodal performance enhancement. Neuroimage 2022; 257:119307. PMID: 35577024; DOI: 10.1016/j.neuroimage.2022.119307.
Abstract
The combination of signals from different sensory modalities can enhance perception and facilitate behavioral responses. While previous research described crossmodal influences in a wide range of tasks, it remains unclear how such influences drive performance enhancements. In particular, the neural mechanisms underlying performance-relevant crossmodal influences, as well as the latency and spatial profile of such influences, are not well understood. Here, we examined data from high-density electroencephalography recordings (N = 30) to characterize the oscillatory signatures of crossmodal facilitation of response speed, as manifested in the speeding of visual responses by concurrent task-irrelevant auditory information. Using a data-driven analysis approach, we found that individual gains in response speed correlated with a larger beta power difference (13-25 Hz) between the audiovisual and the visual condition, starting within 80 ms after stimulus onset in the secondary visual cortex and in multisensory association areas in the parietal cortex. In addition, we examined data from electrocorticography (ECoG) recordings in four epileptic patients in a comparable paradigm. These ECoG data revealed reduced beta power in audiovisual compared with visual trials in the superior temporal gyrus (STG). Collectively, our data suggest that the crossmodal facilitation of response speed is associated with reduced early beta power in multisensory association and secondary visual areas. The reduced early beta power may reflect an auditory-driven feedback signal to improve visual processing through attentional gating. These findings improve our understanding of the neural mechanisms underlying crossmodal response speed facilitation and highlight the critical role of beta oscillations in mediating behaviorally relevant multisensory processing.
Affiliation(s)
- Georgios Michail
- Department of Psychiatry and Psychotherapy, Charité Campus Mitte (CCM), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin 10117, Germany
- Daniel Senkowski
- Department of Psychiatry and Psychotherapy, Charité Campus Mitte (CCM), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin 10117, Germany
- Martin Holtkamp
- Epilepsy-Center Berlin-Brandenburg, Institute for Diagnostics of Epilepsy, Berlin 10365, Germany; Department of Neurology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charité Campus Mitte (CCM), Charitéplatz 1, Berlin 10117, Germany
- Bettina Wächter
- Epilepsy-Center Berlin-Brandenburg, Institute for Diagnostics of Epilepsy, Berlin 10365, Germany
- Julian Keil
- Biological Psychology, Christian-Albrechts-University Kiel, Kiel 24118, Germany
32. Merk T, Peterson V, Köhler R, Haufe S, Richardson RM, Neumann WJ. Machine learning based brain signal decoding for intelligent adaptive deep brain stimulation. Exp Neurol 2022; 351:113993. PMID: 35104499; PMCID: PMC10521329; DOI: 10.1016/j.expneurol.2022.113993.
Abstract
Sensing enabled implantable devices and next-generation neurotechnology allow real-time adjustments of invasive neuromodulation. The identification of symptom and disease-specific biomarkers in invasive brain signal recordings has inspired the idea of demand dependent adaptive deep brain stimulation (aDBS). Expanding the clinical utility of aDBS with machine learning may hold the potential for the next breakthrough in the therapeutic success of clinical brain computer interfaces. To this end, sophisticated machine learning algorithms optimized for decoding of brain states from neural time-series must be developed. To support this venture, this review summarizes the current state of machine learning studies for invasive neurophysiology. After a brief introduction to the machine learning terminology, the transformation of brain recordings into meaningful features for decoding of symptoms and behavior is described. Commonly used machine learning models are explained and analyzed from the perspective of utility for aDBS. This is followed by a critical review on good practices for training and testing to ensure conceptual and practical generalizability for real-time adaptation in clinical settings. Finally, first studies combining machine learning with aDBS are highlighted. This review takes a glimpse into the promising future of intelligent adaptive DBS (iDBS) and concludes by identifying four key ingredients on the road for successful clinical adoption: i) multidisciplinary research teams, ii) publicly available datasets, iii) open-source algorithmic solutions and iv) strong world-wide research collaborations.
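One of the good practices the review above stresses, testing on data that is temporally separate from the training data so that real-time generalization is not overestimated, can be made concrete with contiguous (block-wise) cross-validation folds instead of shuffled ones. The sketch below is a generic illustration; the fold count and sample count are arbitrary assumptions.

```python
import numpy as np

def contiguous_splits(n_samples, n_folds=5):
    """Yield (train, test) index arrays where each test set is one
    contiguous block of the time series, so test samples never
    interleave with training samples (avoiding temporal leakage)."""
    edges = np.linspace(0, n_samples, n_folds + 1, dtype=int)
    for k in range(n_folds):
        test = np.arange(edges[k], edges[k + 1])
        train = np.concatenate(
            [np.arange(0, edges[k]), np.arange(edges[k + 1], n_samples)]
        )
        yield train, test

folds = list(contiguous_splits(100, n_folds=5))
print(len(folds))  # 5
```

Shuffled k-fold splits on overlapping feature windows would let nearly identical samples appear in both partitions, which is exactly the leakage this scheme is meant to prevent.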
Affiliation(s)
- Timon Merk
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité - Universitätsmedizin Berlin, Chariteplatz 1, 10117 Berlin, Germany
- Victoria Peterson
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, United States
- Richard Köhler
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité - Universitätsmedizin Berlin, Chariteplatz 1, 10117 Berlin, Germany
- Stefan Haufe
- Berlin Center for Advanced Neuroimaging (BCAN), Charité - Universitätsmedizin Berlin, Chariteplatz 1, 10117 Berlin, Germany
- R Mark Richardson
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, United States
- Wolf-Julian Neumann
- Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité - Universitätsmedizin Berlin, Chariteplatz 1, 10117 Berlin, Germany
33. Glanz O, Hader M, Schulze-Bonhage A, Auer P, Ball T. A Study of Word Complexity Under Conditions of Non-experimental, Natural Overt Speech Production Using ECoG. Front Hum Neurosci 2022; 15:711886. PMID: 35185491; PMCID: PMC8854223; DOI: 10.3389/fnhum.2021.711886.
Abstract
The linguistic complexity of words has largely been studied on the behavioral level and in experimental settings. Only little is known about the neural processes underlying it in uninstructed, spontaneous conversations. We built up a multimodal neurolinguistic corpus composed of synchronized audio, video, and electrocorticographic (ECoG) recordings from the fronto-temporo-parietal cortex to address this phenomenon based on uninstructed, spontaneous speech production. We performed extensive linguistic annotations of the language material and calculated word complexity using several numeric parameters. We orthogonalized the parameters with the help of a linear regression model. Then, we correlated the spectral components of neural activity with the individual linguistic parameters and with the residuals of the linear regression model, and compared the results. The proportional relation between the number of consonants and vowels, which was the most informative parameter with regard to the neural representation of word complexity, showed effects in two areas: the frontal one was at the junction of the premotor cortex, the prefrontal cortex, and Brodmann area 44. The postcentral one lay directly above the lateral sulcus and comprised the ventral central sulcus, the parietal operculum and the adjacent inferior parietal cortex. Beyond the physiological findings summarized here, our methods may be useful for those interested in ways of studying neural effects related to natural language production and in surmounting the intrinsic problem of collinearity between multiple features of spontaneously spoken material.
Affiliation(s)
- Olga Glanz
- GRK 1624 “Frequency Effects in Language,” University of Freiburg, Freiburg, Germany
- Department of German Linguistics, University of Freiburg, Freiburg, Germany
- The Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany
- BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany
- Neurobiology and Biophysics, Faculty of Biology, University of Freiburg, Freiburg, Germany
- Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Marina Hader
- BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany
- Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Andreas Schulze-Bonhage
- Department of Neurosurgery, Faculty of Medicine, Epilepsy Center, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
- Peter Auer
- GRK 1624 “Frequency Effects in Language,” University of Freiburg, Freiburg, Germany
- Department of German Linguistics, University of Freiburg, Freiburg, Germany
- The Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany
- Tonio Ball
- BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany
- Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
34. Luo S, Rabbani Q, Crone NE. Brain-Computer Interface: Applications to Speech Decoding and Synthesis to Augment Communication. Neurotherapeutics 2022; 19:263-273. PMID: 35099768; PMCID: PMC9130409; DOI: 10.1007/s13311-022-01190-2.
Abstract
Damage or degeneration of motor pathways necessary for speech and other movements, as in brainstem strokes or amyotrophic lateral sclerosis (ALS), can interfere with efficient communication without affecting brain structures responsible for language or cognition. In the worst-case scenario, this can result in locked-in syndrome (LIS), a condition in which individuals cannot initiate communication and can only express themselves by answering yes/no questions with eye blinks or other rudimentary movements. Existing augmentative and alternative communication (AAC) devices that rely on eye tracking can improve the quality of life for people with this condition, but brain-computer interfaces (BCIs) are also increasingly being investigated as AAC devices, particularly when eye tracking is too slow or unreliable. Moreover, with recent and ongoing advances in machine learning and neural recording technologies, BCIs may offer the only means to go beyond cursor control and text generation on a computer, to allow real-time synthesis of speech, which would arguably offer the most efficient and expressive channel for communication. The potential for BCI speech synthesis has only recently been realized because of seminal studies of the neuroanatomical and neurophysiological underpinnings of speech production using intracranial electrocorticographic (ECoG) recordings in patients undergoing epilepsy surgery. These studies have shown that cortical areas responsible for vocalization and articulation are distributed over a large area of ventral sensorimotor cortex, and that it is possible to decode speech and reconstruct its acoustics from ECoG if these areas are recorded with sufficiently dense and comprehensive electrode arrays. In this article, we review these advances, including the latest neural decoding strategies that range from deep learning models to the direct concatenation of speech units.
We also discuss state-of-the-art vocoders that are integral in constructing natural-sounding audio waveforms for speech BCIs. Finally, this review outlines some of the challenges ahead in directly synthesizing speech for patients with LIS.
Affiliation(s)
- Shiyu Luo
- Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Qinwan Rabbani
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA
- Nathan E Crone
- Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA

35
Dash D, Ferrari P, Babajani-Feremi A, Borna A, Schwindt PDD, Wang J. Magnetometers vs Gradiometers for Neural Speech Decoding. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:6543-6546. [PMID: 34892608 DOI: 10.1109/embc46164.2021.9630489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Neural speech decoding aims at providing natural-rate communication assistance to patients in a locked-in state (e.g., due to amyotrophic lateral sclerosis, ALS), in contrast to traditional brain-computer interface (BCI) spellers, which are slow. Recent studies have shown that magnetoencephalography (MEG) is a suitable neuroimaging modality for studying neural speech decoding, considering its excellent temporal resolution, which can characterize the fast dynamics of speech. Gradiometers have been the preferred choice for sensor-space analysis with MEG, due to their efficacy in noise suppression over magnetometers. However, the recent development of optically pumped magnetometer (OPM)-based wearable MEG devices has shown great potential for future BCI applications; yet no prior study had evaluated the performance of magnetometers in neural speech decoding. In this study, we decoded imagined and spoken speech from the MEG signals of seven healthy participants and compared the performance of magnetometers and gradiometers. Experimental results indicated that magnetometers also have potential for neural speech decoding, although the performance was significantly lower than that obtained with gradiometers. Further, we implemented a wavelet-based denoising strategy that significantly improved the performance of both magnetometers and gradiometers. These findings reconfirm that gradiometers are preferable in MEG-based decoding analysis but also point towards the use of magnetometers (or OPMs) in the development of next-generation speech BCIs.
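The wavelet-based denoising mentioned above can be sketched in a few lines. This is a minimal illustration, assuming a one-level Haar transform with soft thresholding of the detail coefficients; the study's actual wavelet family, decomposition depth, and threshold rule may differ, and the signals here are synthetic.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar transform: approximation and detail coefficients."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def haar_idwt(approx, detail):
    """Inverse one-level Haar transform."""
    even = (approx + detail) / np.sqrt(2)
    odd = (approx - detail) / np.sqrt(2)
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

def denoise(signal, threshold):
    """Soft-threshold the detail coefficients to suppress broadband noise
    while keeping the slowly varying component."""
    approx, detail = haar_dwt(signal)
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    return haar_idwt(approx, detail)

# Synthetic example: a slow sinusoid buried in Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256)
clean = np.sin(2 * np.pi * 3 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
denoised = denoise(noisy, threshold=0.3)
print(np.mean((noisy - clean) ** 2), np.mean((denoised - clean) ** 2))
```

Because the Haar transform is orthonormal, thresholding in the wavelet domain removes noise energy from the detail band while the slow signal, whose detail coefficients are small, passes through nearly unchanged.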
36
Verwoert M, Vansteensel MJ, Freudenburg ZV, Aarnoutse EJ, Leijten FS, Ramsey NF, Branco MP. Decoding four hand gestures with a single bipolar pair of electrocorticography electrodes. J Neural Eng 2021; 18:10.1088/1741-2552/ac2c9f. [PMID: 34607318 PMCID: PMC8744490 DOI: 10.1088/1741-2552/ac2c9f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2021] [Accepted: 10/04/2021] [Indexed: 11/12/2022]
Abstract
Objective.Electrocorticography (ECoG) based brain-computer interfaces (BCIs) can be used to restore communication in individuals with locked-in syndrome. In motor-based BCIs, the number of degrees-of-freedom, and thus the speed of the BCI, directly depends on the number of classes that can be discriminated from the neural activity in the sensorimotor cortex. When considering minimally invasive BCI implants, the size of the subdural ECoG implant must be minimized without compromising the number of degrees-of-freedom.Approach.Here we investigated if four hand gestures could be decoded using a single ECoG strip of four consecutive electrodes spaced 1 cm apart and compared the performance between a unipolar and a bipolar montage. For that we collected data of seven individuals with intractable epilepsy implanted with ECoG grids, covering the hand region of the sensorimotor cortex. Based on the implanted grids, we generated virtual ECoG strips and compared the decoding accuracy between (a) a single unipolar electrode (Unipolar Electrode), (b) a combination of four unipolar electrodes (Unipolar Strip), (c) a single bipolar pair (Bipolar Pair) and (d) a combination of six bipolar pairs (Bipolar Strip).Main results.We show that four hand gestures can be equally well decoded using 'Unipolar Strips' (mean 67.4 ± 11.7%), 'Bipolar Strips' (mean 66.6 ± 12.1%) and 'Bipolar Pairs' (mean 67.6 ± 9.4%), while 'Unipolar Electrodes' (61.6 ± 5.9%) performed significantly worse compared to 'Unipolar Strips' and 'Bipolar Pairs'.Significance.We conclude that a single bipolar pair is a potential candidate for minimally invasive motor-based BCIs and encourage the use of ECoG as a robust and reliable BCI platform for multi-class movement decoding.
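A bipolar montage of the kind compared here is simply a re-referencing of unipolar channels to pairwise differences. The sketch below, with made-up signals, shows how four unipolar electrodes yield the six bipolar pairs of a 'Bipolar Strip', and why a component common to all electrodes cancels in the bipolar derivations.

```python
import numpy as np
from itertools import combinations

def bipolar_montage(unipolar):
    """Re-reference a (channels x samples) unipolar recording to all pairwise
    bipolar channels: one derived channel per electrode pair (i, j)."""
    pairs = list(combinations(range(unipolar.shape[0]), 2))
    return np.stack([unipolar[i] - unipolar[j] for i, j in pairs]), pairs

# Four unipolar electrodes on a virtual strip -> six bipolar pairs.
# A component shared by all electrodes (e.g. a far-field source) cancels
# in every difference, leaving only electrode-specific activity.
rng = np.random.default_rng(1)
common = rng.standard_normal(1000)              # shared far-field signal
local = 0.1 * rng.standard_normal((4, 1000))    # electrode-specific activity
unipolar = common + local
bipolar, pairs = bipolar_montage(unipolar)
print(bipolar.shape)  # (6, 1000)
```

The variance of the bipolar channels is dominated by the local activity alone, which is one intuition for why a well-placed single bipolar pair can decode as well as a unipolar strip.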
Affiliation(s)
- Maxime Verwoert
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Mariska J. Vansteensel
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Zachary V. Freudenburg
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Erik J. Aarnoutse
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Frans S.S. Leijten
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Nick F. Ramsey
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands
- Mariana P. Branco
- Department of Neurology and Neurosurgery, University Medical Center Utrecht Brain Center, Utrecht University, Utrecht, The Netherlands

37
Chaudhary U, Chander BS, Ohry A, Jaramillo-Gonzalez A, Lulé D, Birbaumer N. Brain Computer Interfaces for Assisted Communication in Paralysis and Quality of Life. Int J Neural Syst 2021; 31:2130003. [PMID: 34587854 DOI: 10.1142/s0129065721300035] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023]
Abstract
The rapid evolution of Brain-Computer Interface (BCI) technology and the exponential growth of BCI literature during the past 20 years is a consequence of increasing computational power and the achievements of statistical learning theory and machine learning since the 1960s. Despite this rapid scientific progress, the range of successful clinical and societal applications remained limited, with some notable exceptions in the rehabilitation of chronic stroke and first steps towards BCI-based assisted verbal communication in paralysis. In this contribution, we focus on the effects of noninvasive and invasive BCI-based verbal communication on the quality of life (QoL) of patients with amyotrophic lateral sclerosis (ALS) in the locked-in state (LIS) and the completely locked-in state (CLIS). Despite a substantial lack of replicated scientific data, this paper complements the existing methodological knowledge and focuses future investigators' attention on (1) Social determinants of QoL and (2) Brain reorganization and behavior. While it is not documented in controlled studies that the good QoL in these patients is a consequence of BCI-based neurorehabilitation, the proposed determinants of QoL might become the theoretical background needed to develop clinically more useful BCI systems and to evaluate the effects of BCI-based communication on QoL for advanced ALS patients and other forms of severe paralysis.
Affiliation(s)
- Ujwal Chaudhary
- Institute of Medical Psychology and Behavioral Neurobiology, University of Tübingen, Tübingen 72076, Germany; ALSVOICE gGmbH, Mössingen 72116, Germany
- Bankim Subhash Chander
- ALSVOICE gGmbH, Mössingen 72116, Germany; Department of Psychiatry and Psychotherapy, Center for Innovative Psychiatric and Psychotherapeutic Research, Central Institute of Mental Health Mannheim, Medical Faculty Mannheim, University of Heidelberg, Mannheim 68159, Germany
- Avi Ohry
- Sackler Faculty of Medicine, Tel Aviv University & Reuth Medical & Rehabilitation Center, Tel Aviv, Israel
- Andres Jaramillo-Gonzalez
- Institute of Medical Psychology and Behavioral Neurobiology, University of Tübingen, Tübingen 72076, Germany
- Niels Birbaumer
- Institute of Medical Psychology and Behavioral Neurobiology, University of Tübingen, Tübingen 72076, Germany; ALSVOICE gGmbH, Mössingen 72116, Germany

38
Wittevrongel B, Holmes N, Boto E, Hill R, Rea M, Libert A, Khachatryan E, Van Hulle MM, Bowtell R, Brookes MJ. Practical real-time MEG-based neural interfacing with optically pumped magnetometers. BMC Biol 2021; 19:158. [PMID: 34376215 PMCID: PMC8356471 DOI: 10.1186/s12915-021-01073-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2020] [Accepted: 04/25/2021] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND Brain-computer interfaces decode intentions directly from the human brain with the aim to restore lost functionality, control external devices or augment daily experiences. To combine optimal performance with wide applicability, high-quality brain signals should be captured non-invasively. Magnetoencephalography (MEG) is a potent candidate but currently requires costly and confining recording hardware. The recently developed optically pumped magnetometers (OPMs) promise to overcome this limitation, but are currently untested in the context of neural interfacing. RESULTS In this work, we show that OPM-MEG allows robust single-trial analysis which we exploited in a real-time 'mind-spelling' application yielding an average accuracy of 97.7%. CONCLUSIONS This shows that OPM-MEG can be used to exploit neuro-magnetic brain responses in a practical and flexible manner, and opens up new avenues for a wide range of new neural interface applications in the future.
Affiliation(s)
- Benjamin Wittevrongel
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, KU Leuven, Leuven, Belgium; Leuven Institute for Artificial Intelligence (Leuven.AI), Leuven, Belgium; Leuven Brain Institute (LBI), Leuven, Belgium
- Niall Holmes
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, Nottingham, UK
- Elena Boto
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, Nottingham, UK
- Ryan Hill
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, Nottingham, UK
- Molly Rea
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, Nottingham, UK
- Arno Libert
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, KU Leuven, Leuven, Belgium; Leuven Brain Institute (LBI), Leuven, Belgium
- Elvira Khachatryan
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, KU Leuven, Leuven, Belgium; Leuven Brain Institute (LBI), Leuven, Belgium
- Marc M Van Hulle
- Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, KU Leuven, Leuven, Belgium; Leuven Institute for Artificial Intelligence (Leuven.AI), Leuven, Belgium; Leuven Brain Institute (LBI), Leuven, Belgium
- Richard Bowtell
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, Nottingham, UK
- Matthew J Brookes
- Sir Peter Mansfield Imaging Centre, School of Physics and Astronomy, University of Nottingham, Nottingham, UK

39
Trumpis M, Chiang CH, Orsborn AL, Bent B, Li J, Rogers JA, Pesaran B, Cogan G, Viventi J. Sufficient sampling for kriging prediction of cortical potential in rat, monkey, and human µECoG. J Neural Eng 2021; 18. [PMID: 33326943 DOI: 10.1088/1741-2552/abd460] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2020] [Accepted: 12/16/2020] [Indexed: 12/22/2022]
Abstract
Objective. Large channel count surface-based electrophysiology arrays (e.g. µECoG) are high-throughput neural interfaces with good chronic stability. Electrode spacing remains ad hoc due to redundancy and nonstationarity of field dynamics. Here, we establish a criterion for electrode spacing based on the expected accuracy of predicting unsampled field potential from sampled sites.Approach. We applied spatial covariance modeling and field prediction techniques based on geospatial kriging to quantify sufficient sampling for thousands of 500 ms µECoG snapshots in human, monkey, and rat. We calculated a probably approximately correct (PAC) spacing based on kriging that would be required to predict µECoG fields at≤10% error for most cases (95% of observations).Main results. Kriging theory accurately explained the competing effects of electrode density and noise on predicting field potential. Across five frequency bands from 4-7 to 75-300 Hz, PAC spacing was sub-millimeter for auditory cortex in anesthetized and awake rats, and posterior superior temporal gyrus in anesthetized human. At 75-300 Hz, sub-millimeter PAC spacing was required in all species and cortical areas.Significance. PAC spacing accounted for the effect of signal-to-noise on prediction quality and was sensitive to the full distribution of non-stationary covariance states. Our results show that µECoG arrays should sample at sub-millimeter resolution for applications in diverse cortical areas and for noise resilience.
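The core kriging step, predicting an unsampled site as a covariance-weighted combination of sampled sites, can be sketched as follows. The exponential covariance model, length scale, and one-dimensional electrode layout are illustrative assumptions, not the fitted parameters of this study.

```python
import numpy as np

def kriging_predict(coords, values, target, length_scale=1.0, nugget=1e-6):
    """Simple kriging: predict the field at `target` from sampled sites,
    assuming an exponential spatial covariance C(d) = exp(-d / length_scale)."""
    diffs = coords[:, None, :] - coords[None, :, :]
    K = np.exp(-np.linalg.norm(diffs, axis=-1) / length_scale)
    K += nugget * np.eye(len(coords))                      # noise / stability term
    k = np.exp(-np.linalg.norm(coords - target, axis=-1) / length_scale)
    weights = np.linalg.solve(K, k)                        # kriging weights
    return weights @ values

# Three electrodes on a line at 0, 1 and 2 mm; predict the potential at 0.5 mm.
coords = np.array([[0.0], [1.0], [2.0]])
values = np.array([1.0, 3.0, 5.0])
estimate = kriging_predict(coords, values, np.array([0.5]))
print(estimate)
```

With a negligible nugget the predictor reproduces the samples at the sampled sites themselves; the prediction error at unsampled sites, as a function of electrode spacing, is the quantity a spacing criterion like the paper's PAC spacing is built on.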
Affiliation(s)
- Michael Trumpis
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America
- Chia-Han Chiang
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America
- Amy L Orsborn
- Center for Neural Science, New York University, New York, NY 10003, United States of America; Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, United States of America; Department of Bioengineering, University of Washington, Seattle, WA 98105, United States of America; Washington National Primate Research Center, Seattle, WA 98195, United States of America
- Brinnae Bent
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America
- Jinghua Li
- Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, United States of America; Department of Materials Science and Engineering, The Ohio State University, Columbus, OH 43210, United States of America; Chronic Brain Injury Program, The Ohio State University, Columbus, OH 43210, United States of America
- John A Rogers
- Department of Materials Science and Engineering, Northwestern University, Evanston, IL 60208, United States of America; Simpson Querrey Institute, Northwestern University, Chicago, IL 60611, United States of America; Department of Biomedical Engineering, Northwestern University, Evanston, IL 60208, United States of America; Department of Neurological Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, United States of America
- Bijan Pesaran
- Center for Neural Science, New York University, New York, NY 10003, United States of America
- Gregory Cogan
- Department of Neurosurgery, Duke School of Medicine, Durham, NC 27710, United States of America; Department of Psychology and Neuroscience, Duke University, Durham, NC 27708, United States of America; Center for Cognitive Neuroscience, Duke University, Durham, NC 27708, United States of America; Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC 27710, United States of America
- Jonathan Viventi
- Department of Biomedical Engineering, Duke University, Durham, NC 27708, United States of America; Department of Neurosurgery, Duke School of Medicine, Durham, NC 27710, United States of America; Duke Comprehensive Epilepsy Center, Duke School of Medicine, Durham, NC 27710, United States of America; Department of Neurobiology, Duke School of Medicine, Durham, NC 27710, United States of America

40
Wilson GH, Stavisky SD, Willett FR, Avansino DT, Kelemen JN, Hochberg LR, Henderson JM, Druckmann S, Shenoy KV. Decoding spoken English from intracortical electrode arrays in dorsal precentral gyrus. J Neural Eng 2020; 17:066007. [PMID: 33236720 PMCID: PMC8293867 DOI: 10.1088/1741-2552/abbfef] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
OBJECTIVE To evaluate the potential of intracortical electrode array signals for brain-computer interfaces (BCIs) to restore lost speech, we measured the performance of decoders trained to discriminate a comprehensive basis set of 39 English phonemes and to synthesize speech sounds via a neural pattern matching method. We decoded neural correlates of spoken-out-loud words in the 'hand knob' area of precentral gyrus, a step toward the eventual goal of decoding attempted speech from ventral speech areas in patients who are unable to speak. APPROACH Neural and audio data were recorded while two BrainGate2 pilot clinical trial participants, each with two chronically-implanted 96-electrode arrays, spoke 420 different words that broadly sampled English phonemes. Phoneme onsets were identified from audio recordings, and their identities were then classified from neural features consisting of each electrode's binned action potential counts or high-frequency local field potential power. Speech synthesis was performed using the 'Brain-to-Speech' pattern matching method. We also examined two potential confounds specific to decoding overt speech: acoustic contamination of neural signals and systematic differences in labeling different phonemes' onset times. MAIN RESULTS A linear decoder achieved up to 29.3% classification accuracy (chance = 6%) across 39 phonemes, while an RNN classifier achieved 33.9% accuracy. Parameter sweeps indicated that performance did not saturate when adding more electrodes or more training data, and that accuracy improved when utilizing time-varying structure in the data. Microphonic contamination and phoneme onset differences modestly increased decoding accuracy, but could be mitigated by acoustic artifact subtraction and using a neural speech onset marker, respectively. Speech synthesis achieved r = 0.523 correlation between true and reconstructed audio. SIGNIFICANCE The ability to decode speech using intracortical electrode array signals from a nontraditional speech area suggests that placing electrode arrays in ventral speech areas is a promising direction for speech BCIs.
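One of the neural features described here, binned action potential counts following each phoneme onset, can be sketched as below; the spike times, onset, and bin settings are invented for illustration.

```python
import numpy as np

def bin_spike_counts(spike_times, onset, n_bins=10, bin_ms=20):
    """Count spikes in consecutive fixed-width bins after an event onset
    (all times in milliseconds); returns one feature vector per electrode."""
    edges = onset + bin_ms * np.arange(n_bins + 1)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts

# Hypothetical spike times on one electrode, phoneme onset at 100 ms.
spikes = np.array([105, 112, 131, 158, 159, 240])
features = bin_spike_counts(spikes, onset=100)
print(features)  # [2 1 2 0 0 0 0 1 0 0]
```

Stacking such vectors across electrodes gives the feature matrix a linear decoder or RNN classifier is trained on.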
Affiliation(s)
- Guy H Wilson
- Neurosciences Graduate Program, Stanford University, Stanford, CA, United States of America
- Sergey D Stavisky
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
- Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America
- Francis R Willett
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
- Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America
- Howard Hughes Medical Institute at Stanford University, Stanford, CA, United States of America
- Donald T Avansino
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
- Jessica N Kelemen
- Department of Neurology, Harvard Medical School, Boston, MA, United States of America
- Leigh R Hochberg
- Department of Neurology, Harvard Medical School, Boston, MA, United States of America
- Center for Neurotechnology and Neurorecovery, Department of Neurology, Massachusetts General Hospital, Boston, MA, United States of America
- VA RR&D Center for Neurorestoration and Neurotechnology, Rehabilitation R&D Service, Providence VA Medical Center, Providence, RI, United States of America
- Carney Institute for Brain Science and School of Engineering, Brown University, Providence, RI, United States of America
- Jaimie M Henderson
- Department of Neurosurgery, Stanford University, Stanford, CA, United States of America
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
- Shaul Druckmann
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
- Department of Neurobiology, Stanford University, Stanford, CA, United States of America
- Krishna V Shenoy
- Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America
- Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America
- Howard Hughes Medical Institute at Stanford University, Stanford, CA, United States of America
- Department of Neurobiology, Stanford University, Stanford, CA, United States of America
- Department of Bioengineering, Stanford University, Stanford, CA, United States of America

41
Dash D, Wisler A, Ferrari P, Davenport EM, Maldjian J, Wang J. MEG Sensor Selection for Neural Speech Decoding. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2020; 8:182320-182337. [PMID: 33204579 PMCID: PMC7668411 DOI: 10.1109/access.2020.3028831] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Direct decoding of speech from the brain is a faster alternative to current electroencephalography (EEG) speller-based brain-computer interfaces (BCI) in providing communication assistance to locked-in patients. Magnetoencephalography (MEG) has recently shown great potential as a non-invasive neuroimaging modality for neural speech decoding, owing in part to its spatial selectivity over other high-temporal resolution devices. Standard MEG systems have a large number of cryogenically cooled channels/sensors (200 - 300) encapsulated within a fixed liquid helium dewar, precluding their use as wearable BCI devices. Fortunately, recently developed optically pumped magnetometers (OPM) do not require cryogens, and have the potential to be wearable and movable making them more suitable for BCI applications. This design is also modular allowing for customized montages to include only the sensors necessary for a particular task. As the number of sensors bears a heavy influence on the cost, size, and weight of MEG systems, minimizing the number of sensors is critical for designing practical MEG-based BCIs in the future. In this study, we sought to identify an optimal set of MEG channels to decode imagined and spoken phrases from the MEG signals. Using a forward selection algorithm with a support vector machine classifier we found that nine optimally located MEG gradiometers provided higher decoding accuracy compared to using all channels. Additionally, the forward selection algorithm achieved similar performance to dimensionality reduction using a stacked-sparse-autoencoder. Analysis of spatial dynamics of speech decoding suggested that both left and right hemisphere sensors contribute to speech decoding. Sensors approximately located near Broca's area were found to be commonly contributing among the higher-ranked sensors across all subjects.
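The forward selection procedure described here, greedily adding the sensor that most improves held-out classification accuracy, can be sketched with a toy classifier. A nearest-centroid classifier stands in for the study's support vector machine, and the data are synthetic, with a single informative channel.

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Held-out accuracy of a nearest-centroid classifier on chosen channels."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=-1)
    return np.mean(classes[d.argmin(axis=1)] == y_test)

def forward_select(X_train, y_train, X_test, y_test, n_keep):
    """Greedily add the channel that most improves held-out accuracy."""
    selected, remaining = [], list(range(X_train.shape[1]))
    for _ in range(n_keep):
        scores = [nearest_centroid_accuracy(X_train[:, selected + [ch]], y_train,
                                            X_test[:, selected + [ch]], y_test)
                  for ch in remaining]
        selected.append(remaining.pop(int(np.argmax(scores))))
    return selected

# Synthetic data: only channel 2 of five carries class information.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = rng.standard_normal((100, 5))
X[:, 2] += 3.0 * y                      # the informative channel
order = forward_select(X[::2], y[::2], X[1::2], y[1::2], n_keep=2)
print(order)
```

The informative channel is selected first; in the study the analogous ranking identified roughly nine gradiometers, often near Broca's area, as sufficient.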
Affiliation(s)
- Debadatta Dash
- Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Alan Wisler
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA
- Paul Ferrari
- MEG Laboratory, Dell Children's Medical Center, Austin, TX 78723, USA
- Department of Psychology, The University of Texas at Austin, Austin, TX 78712, USA
- Joseph Maldjian
- Department of Radiology, University of Texas at Southwestern, Dallas, TX 75390, USA
- Jun Wang
- Department of Neurology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX 78712, USA

42
Gearing M, Kennedy P. Histological Confirmation of Myelinated Neural Filaments Within the Tip of the Neurotrophic Electrode After a Decade of Neural Recordings. Front Hum Neurosci 2020; 14:111. [PMID: 32372930 PMCID: PMC7187752 DOI: 10.3389/fnhum.2020.00111] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2019] [Accepted: 03/11/2020] [Indexed: 11/13/2022] Open
Abstract
Aim Electrodes that provide brain-to-machine or brain-to-computer interfacing must survive the lifetime of the person to be considered an acceptable prosthetic. The electrodes may be external, such as electroencephalographic (EEG) electrodes; internal extracortical, such as electrocorticographic (ECoG) electrodes; or intracortical. Methods Most intracortical electrodes are placed close to the neuropil being recorded and do not survive years of recording. However, the Neurotrophic Electrode is placed within the cortex, and the neuropil grows inside and through the hollow tip of the electrode and is thus trapped inside. Highly flexible coiled lead wires minimize the strain on the electrode tip. Histological analysis included immunohistochemical detection of neurofilaments and the absence of gliosis. Results This configuration led to a decade-long recording in a locked-in person. At year nine, the neural activity underwent conditioning experiments, indicating that the neural activity was functional and not noise. This paper presents data on the histological analysis of the tissue inside the electrode tip after 13 years of implantation. Conclusion This paper is a singular example of histological analysis after a decade of recording. The histological analysis laid out herein is strong evidence that the brain can grow neurites into the electrode tip and support recording for a decade. This is profoundly important in the field of brain-to-machine or brain-to-computer interfacing, implying that long-term electrodes should incorporate some means of growing the neuropil into the electrode rather than placing the electrode into the neuropil.
Affiliation(s)
- Marla Gearing
- Laboratory Medicine and Neurology, Department of Pathology, Emory University School of Medicine, Atlanta, GA, United States

43
Dash D, Ferrari P, Wang J. Decoding Imagined and Spoken Phrases From Non-invasive Neural (MEG) Signals. Front Neurosci 2020; 14:290. [PMID: 32317917 PMCID: PMC7154084 DOI: 10.3389/fnins.2020.00290] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2019] [Accepted: 03/13/2020] [Indexed: 11/16/2022] Open
Abstract
Speech production is a hierarchical mechanism involving the synchronization of the brain and the oral articulators, in which intended linguistic concepts are transformed into meaningful sounds. Individuals with locked-in syndrome (fully paralyzed but aware) lose motor ability completely, including articulation and even eye movement. The neural pathway may be the only option to resume a certain level of communication for these patients. Current brain-computer interfaces (BCIs) use patients' visual and attentional correlates to build communication, resulting in a slow communication rate (a few words per minute). Direct decoding of imagined speech from neural signals (and then driving a speech synthesizer) has the potential for a higher communication rate. In this study, we investigated the decoding of five imagined and spoken phrases from single-trial, non-invasive magnetoencephalography (MEG) signals collected from eight adult subjects. Two machine learning algorithms were used. One was an artificial neural network (ANN) with statistical features as the baseline approach. The other was a convolutional neural network (CNN) applied to the spatial, spectral, and temporal features extracted from the MEG signals. Experimental results indicated the possibility of decoding imagined and spoken phrases directly from neuromagnetic signals. The CNN approach was found to be highly effective, with an average decoding accuracy of up to 93% for the imagined and 96% for the spoken phrases.
Affiliation(s)
- Debadatta Dash
- Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, United States
- Department of Neurology, Dell Medical School, University of Texas at Austin, Austin, TX, United States
- Paul Ferrari
- MEG Lab, Dell Children's Medical Center, Austin, TX, United States
- Department of Psychology, University of Texas at Austin, Austin, TX, United States
- Jun Wang
- Department of Neurology, Dell Medical School, University of Texas at Austin, Austin, TX, United States
- Department of Communication Sciences and Disorders, University of Texas at Austin, Austin, TX, United States

44
Annen J, Laureys S, Gosseries O. Brain-computer interfaces for consciousness assessment and communication in severely brain-injured patients. BRAIN-COMPUTER INTERFACES 2020; 168:137-152. [DOI: 10.1016/b978-0-444-63934-9.00011-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
45
Abstract
Human brain function research has evolved dramatically in the last decades. In this chapter, the role of modern methods of recording brain activity in understanding human brain function is explained. Current knowledge of brain function relevant to brain-computer interface (BCI) research is detailed, with an emphasis on the motor system, which provides an exceptional level of detail for decoding intended or attempted movements in paralyzed beneficiaries of BCI technology and translating them into computer-mediated actions. The BCI technologies that stand to benefit most from the detailed organization of the human cortex are, and for the foreseeable future are likely to be, reliant on intracranial electrodes. These evolving technologies are expected to enable severely paralyzed people to regain the faculty of movement and speech in the coming decades.
Affiliation(s)
- Nick F Ramsey
- Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands

46
Abstract
Locked-in syndrome (LIS) is characterized by an inability to move or speak in the presence of intact cognition and can be caused by brainstem trauma or neuromuscular disease. Quality of life (QoL) in LIS is strongly impaired by the inability to communicate, which cannot always be remedied by traditional augmentative and alternative communication (AAC) solutions if residual muscle activity is insufficient to control the AAC device. Brain-computer interfaces (BCIs) may offer a solution by employing the person's neural signals instead of relying on muscle activity. Here, we review the latest communication BCI research using noninvasive signal acquisition approaches (electroencephalography, functional magnetic resonance imaging, functional near-infrared spectroscopy) and subdural and intracortical implanted electrodes, and we discuss current efforts to translate research knowledge into usable BCI-enabled communication solutions that aim to improve the QoL of individuals with LIS.
47
Stavisky SD, Willett FR, Wilson GH, Murphy BA, Rezaii P, Avansino DT, Memberg WD, Miller JP, Kirsch RF, Hochberg LR, Ajiboye AB, Druckmann S, Shenoy KV, Henderson JM. Neural ensemble dynamics in dorsal motor cortex during speech in people with paralysis. eLife 2019; 8:e46015. [PMID: 31820736 PMCID: PMC6954053 DOI: 10.7554/elife.46015] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2019] [Accepted: 11/14/2019] [Indexed: 01/20/2023] Open
Abstract
Speaking is a sensorimotor behavior whose neural basis is difficult to study with single neuron resolution due to the scarcity of human intracortical measurements. We used electrode arrays to record from the motor cortex 'hand knob' in two people with tetraplegia, an area not previously implicated in speech. Neurons modulated during speaking and during non-speaking movements of the tongue, lips, and jaw. This challenges whether the conventional model of a 'motor homunculus' division by major body regions extends to the single-neuron scale. Spoken words and syllables could be decoded from single trials, demonstrating the potential of intracortical recordings for brain-computer interfaces to restore speech. Two neural population dynamics features previously reported for arm movements were also present during speaking: a component that was mostly invariant across initiating different words, followed by rotatory dynamics during speaking. This suggests that common neural dynamical motifs may underlie movement of arm and speech articulators.
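The single-trial word decoding reported above can be illustrated with a toy sketch: a leave-one-out nearest-centroid classifier applied to per-trial firing-rate vectors. Everything here (unit and trial counts, the synthetic tuning model, and the classifier choice) is an illustrative assumption, not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_words = 96, 5
n_trials = 40

# Synthetic tuning: each word evokes a distinct mean firing-rate pattern
# across the recorded units.
tuning = rng.gamma(shape=2.0, scale=5.0, size=(n_words, n_units))
labels = np.repeat(np.arange(n_words), n_trials // n_words)
rates = tuning[labels] + rng.normal(0.0, 1.0, size=(n_trials, n_units))

# Leave-one-out nearest-centroid decoding of the spoken word.
correct = 0
for i in range(n_trials):
    mask = np.arange(n_trials) != i
    centroids = np.stack([rates[mask][labels[mask] == w].mean(axis=0)
                          for w in range(n_words)])
    pred = np.argmin(np.linalg.norm(centroids - rates[i], axis=1))
    correct += int(pred == labels[i])
accuracy = correct / n_trials
print(f"single-trial decoding accuracy: {accuracy:.2f}")
```

With well-separated synthetic tuning, the toy decoder is near-perfect; with real intracortical data, cross-validated accuracy depends on tuning strength and trial counts.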
Affiliation(s)
- Sergey D Stavisky
- Department of Neurosurgery, Stanford University, Stanford, United States
- Department of Electrical Engineering, Stanford University, Stanford, United States
- Francis R Willett
- Department of Neurosurgery, Stanford University, Stanford, United States
- Department of Electrical Engineering, Stanford University, Stanford, United States
- Guy H Wilson
- Neurosciences Program, Stanford University, Stanford, United States
- Brian A Murphy
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, United States
- FES Center, Rehab R&D Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
- Paymon Rezaii
- Department of Neurosurgery, Stanford University, Stanford, United States
- William D Memberg
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, United States
- FES Center, Rehab R&D Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
- Jonathan P Miller
- FES Center, Rehab R&D Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
- Department of Neurosurgery, University Hospitals Cleveland Medical Center, Cleveland, United States
- Robert F Kirsch
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, United States
- FES Center, Rehab R&D Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
- Leigh R Hochberg
- VA RR&D Center for Neurorestoration and Neurotechnology, Rehabilitation R&D Service, Providence VA Medical Center, Providence, United States
- Center for Neurotechnology and Neurorecovery, Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, United States
- School of Engineering and Robert J. & Nandy D. Carney Institute for Brain Science, Brown University, Providence, United States
- A Bolu Ajiboye
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, United States
- FES Center, Rehab R&D Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
- Shaul Druckmann
- Department of Neurobiology, Stanford University, Stanford, United States
- Krishna V Shenoy
- Department of Electrical Engineering, Stanford University, Stanford, United States
- Department of Neurobiology, Stanford University, Stanford, United States
- Department of Bioengineering, Stanford University, Stanford, United States
- Howard Hughes Medical Institute, Stanford University, Stanford, United States
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, United States
- Bio-X Program, Stanford University, Stanford, United States
- Jaimie M Henderson
- Department of Neurosurgery, Stanford University, Stanford, United States
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, United States
- Bio-X Program, Stanford University, Stanford, United States
48
Loza CA, Reddy CG, Akella S, Príncipe JC. Discrimination of Movement-Related Cortical Potentials Exploiting Unsupervised Learned Representations From ECoGs. Front Neurosci 2019; 13:1248. [PMID: 31824249 PMCID: PMC6882771 DOI: 10.3389/fnins.2019.01248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2019] [Accepted: 11/05/2019] [Indexed: 11/13/2022] Open
Abstract
Brain-computer interfaces (BCIs) aim to bypass the peripheral nervous system by linking the brain to external devices via successful modeling of decoding mechanisms. BCIs based on electrocorticography (ECoG) represent a viable compromise between clinical practicality, spatial resolution, and signal quality when it comes to extracellular electrical potentials from local neuronal assemblies. Classic analysis of ECoG traces usually falls under the umbrella of time-frequency decompositions, with adaptations of Fourier analysis and wavelets as its most prominent variants. However, analyzing such high-dimensional, multivariate time series demands specialized signal processing and neurophysiological principles. We propose a generative model for single-channel ECoG that fully characterizes reoccurring, rhythm-specific neuromodulations as weighted activations of prototypical templates over time. The set of timings, weights, and indexes comprises a temporal marked point process (TMPP) that accesses a set of bases from vector spaces of different dimensions: a dictionary. The shallow nature of the model admits the equivalence between latent variables and representations; learning the model parameters is therefore a case of unsupervised representation learning. We exploit principles of minimum description length (MDL) encoding to yield a data-driven framework in which prototypical neuromodulations (not restricted to a particular duration) can be estimated alongside the timings and features of the TMPP. We validate the proposed methodology on discrimination of movement-related tasks using 32-electrode grids implanted in the frontal cortex of six epileptic subjects. We show that the learned representations from the high-gamma band (85–145 Hz) are not only interpretable but also discriminant in a lower-dimensional space. The results also underscore the practicality of the algorithm (two main hyperparameters that can be readily set via neurophysiology) and emphasize the need for principled, interpretable representation learning to model encoding mechanisms in the brain.
Affiliation(s)
- Carlos A. Loza
- Department of Mathematics, Universidad San Francisco de Quito, Quito, Ecuador
- Instituto de Neurociencias, Universidad San Francisco de Quito, Quito, Ecuador
- Correspondence: Carlos A. Loza
- Chandan G. Reddy
- Department of Neurosurgery, University of Iowa, Iowa City, IA, United States
- Department of Neurosurgery, University of Florida, Gainesville, FL, United States
- Computational NeuroEngineering Lab, Electrical and Computer Engineering Department, University of Florida, Gainesville, FL, United States
- Shailaja Akella
- Computational NeuroEngineering Lab, Electrical and Computer Engineering Department, University of Florida, Gainesville, FL, United States
- José C. Príncipe
- Computational NeuroEngineering Lab, Electrical and Computer Engineering Department, University of Florida, Gainesville, FL, United States
49
Herff C, Diener L, Angrick M, Mugler E, Tate MC, Goldrick MA, Krusienski DJ, Slutzky MW, Schultz T. Generating Natural, Intelligible Speech From Brain Activity in Motor, Premotor, and Inferior Frontal Cortices. Front Neurosci 2019; 13:1267. [PMID: 31824257 PMCID: PMC6882773 DOI: 10.3389/fnins.2019.01267] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2019] [Accepted: 11/07/2019] [Indexed: 12/17/2022] Open
Abstract
Neural interfaces that directly produce intelligible speech from brain activity would allow people with severe impairment from neurological disorders to communicate more naturally. Here, we record neural population activity in motor, premotor and inferior frontal cortices during speech production using electrocorticography (ECoG) and show that ECoG signals alone can be used to generate intelligible speech output that can preserve conversational cues. To produce speech directly from neural data, we adapted a method from the field of speech synthesis called unit selection, in which units of speech are concatenated to form audible output. In our approach, which we call Brain-To-Speech, we chose subsequent units of speech based on the measured ECoG activity to generate audio waveforms directly from the neural recordings. Brain-To-Speech employed the user's own voice to generate speech that sounded very natural and included features such as prosody and accentuation. By investigating the brain areas involved in speech production separately, we found that speech motor cortex provided more information for the reconstruction process than the other cortical areas.
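The unit-selection idea behind Brain-To-Speech can be sketched minimally: for each frame of neural features, pick the training unit whose neural features are closest, then concatenate that unit's stored audio. The feature dimensions, inventory size, and plain nearest-neighbor search below are illustrative assumptions on synthetic data; the published system additionally accounts for unit transitions and draws on the participant's own recorded voice.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_feat, unit_len = 200, 64, 80  # inventory size, neural features, audio samples per unit

# Training inventory: paired (neural feature vector, audio unit) examples.
train_feats = rng.normal(size=(n_units, n_feat))
train_audio = rng.normal(size=(n_units, unit_len))

def brain_to_speech(ecog_frames: np.ndarray) -> np.ndarray:
    """For each neural frame, select the closest training unit and
    concatenate the selected units' audio into one waveform."""
    out = []
    for frame in ecog_frames:
        idx = np.argmin(np.linalg.norm(train_feats - frame, axis=1))
        out.append(train_audio[idx])
    return np.concatenate(out)

# Decode a 10-frame utterance: output is 10 concatenated audio units.
test_frames = rng.normal(size=(10, n_feat))
waveform = brain_to_speech(test_frames)
print(waveform.shape)  # (800,)
```

Because output audio is stitched from real recorded units, this family of methods can preserve speaker identity and prosody, at the cost of audible joins between units.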
Affiliation(s)
- Christian Herff
- School of Mental Health & Neuroscience, Maastricht University, Maastricht, Netherlands
- Cognitive Systems Lab, University of Bremen, Bremen, Germany
- Lorenz Diener
- Cognitive Systems Lab, University of Bremen, Bremen, Germany
- Miguel Angrick
- Cognitive Systems Lab, University of Bremen, Bremen, Germany
- Emily Mugler
- Department of Neurology, Northwestern University, Chicago, IL, United States
- Matthew C. Tate
- Department of Neurosurgery, Northwestern University, Chicago, IL, United States
- Matthew A. Goldrick
- Department of Linguistics, Northwestern University, Chicago, IL, United States
- Dean J. Krusienski
- Biomedical Engineering Department, Virginia Commonwealth University, Richmond, VA, United States
- Marc W. Slutzky
- Department of Neurology, Northwestern University, Chicago, IL, United States
- Department of Physiology, Northwestern University, Chicago, IL, United States
- Department of Physical Medicine & Rehabilitation, Northwestern University, Chicago, IL, United States
- Tanja Schultz
- Cognitive Systems Lab, University of Bremen, Bremen, Germany
50
Salari E, Freudenburg ZV, Branco MP, Aarnoutse EJ, Vansteensel MJ, Ramsey NF. Classification of Articulator Movements and Movement Direction from Sensorimotor Cortex Activity. Sci Rep 2019; 9:14165. [PMID: 31578420 PMCID: PMC6775133 DOI: 10.1038/s41598-019-50834-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2019] [Accepted: 09/11/2019] [Indexed: 12/21/2022] Open
Abstract
For people with severe paralysis, communication can be difficult or nearly impossible. Brain-computer interfaces (BCIs) are being developed to assist these people with communication by using their brain activity to control a computer without any muscle activity. To benefit the development of BCIs that employ neural activity related to speech, we investigated whether neural activity patterns related to different articulator movements can be distinguished from each other. Using electrocorticography (ECoG), we recorded the neural activity related to different articulator movements in four epilepsy patients and classified which articulator participants moved based on sensorimotor cortex activity patterns. The same was done for different movement directions of a single articulator, the tongue. In both experiments highly accurate classification was obtained: on average 92% for different articulators and 85% for different tongue directions. Furthermore, the data show that only a small part of the sensorimotor cortex (ca. 1 cm²) is needed for classification. We show that recordings from small parts of the sensorimotor cortex contain information about different articulator movements that might be used for BCI control. Our results are of interest for BCI systems that aim to decode neural activity related to (actual or attempted) movements from a contained cortical area.
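The kind of pipeline this abstract describes (per-electrode high-frequency-band power feeding a classifier) can be sketched on synthetic data. The band edges, sampling rate, class-to-electrode mapping, and nearest-centroid classifier below are assumptions for illustration, not the authors' exact methods.

```python
import numpy as np

fs, n_trials, n_elec, n_samp = 512.0, 60, 16, 1024
rng = np.random.default_rng(2)

def hfb_power(trials: np.ndarray) -> np.ndarray:
    """trials: (n_trials, n_electrodes, n_samples) -> mean 65-95 Hz power per electrode."""
    freqs = np.fft.rfftfreq(n_samp, 1.0 / fs)
    band = (freqs >= 65) & (freqs <= 95)
    spec = np.abs(np.fft.rfft(trials, axis=-1)) ** 2
    return spec[..., band].mean(axis=-1)

# Synthetic trials: three articulator classes (e.g. lips / tongue / jaw),
# each driving an 80 Hz rhythm on a different trio of electrodes.
labels = np.repeat([0, 1, 2], n_trials // 3)
trials = rng.normal(size=(n_trials, n_elec, n_samp))
carrier = np.sin(2 * np.pi * 80.0 * np.arange(n_samp) / fs)
for i, lab in enumerate(labels):
    trials[i, lab * 3:lab * 3 + 3] += 2.0 * carrier

X = hfb_power(trials)

# Leave-one-out nearest-centroid classification on the band-power features.
correct = 0
for i in range(n_trials):
    mask = np.arange(n_trials) != i
    cents = np.stack([X[mask][labels[mask] == c].mean(axis=0) for c in range(3)])
    correct += int(np.argmin(np.linalg.norm(cents - X[i], axis=1)) == labels[i])
acc = correct / n_trials
print(f"leave-one-out accuracy: {acc:.2f}")
```

Note that only the electrodes carrying class-specific power matter for the decision, mirroring the paper's finding that a small cortical patch can suffice.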
Affiliation(s)
- E Salari
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Z V Freudenburg
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- M P Branco
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- E J Aarnoutse
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- M J Vansteensel
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- N F Ramsey
- UMC Utrecht Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands