1
Vitória MA, Fernandes FG, van den Boom M, Ramsey N, Raemaekers M. Decoding Single and Paired Phonemes Using 7T Functional MRI. Brain Topogr 2024; 37:731-747. PMID: 38261272; PMCID: PMC11393141; DOI: 10.1007/s10548-024-01034-6.
Abstract
Several studies have shown that mouth movements related to the pronunciation of individual phonemes are represented in the sensorimotor cortex. In principle, this would allow for brain-computer interfaces capable of decoding continuous speech using classifiers trained on sensorimotor activity related to the production of individual phonemes. To address this, we investigated the decodability of trials with individual and paired phonemes (pronounced consecutively with a one-second interval) using activity in the sensorimotor cortex. Fifteen participants pronounced 3 different phonemes and 3 pairs of identical phonemes in a 7T functional MRI experiment. We confirmed that support vector machine (SVM) classification of single and paired phonemes was possible. Importantly, by combining classifiers trained on single phonemes, we were able to classify paired phonemes with an accuracy of 53% (33% chance level), demonstrating that activity of isolated phonemes is present and distinguishable in combined phonemes. An SVM searchlight analysis showed that the phoneme representations are widely distributed in the ventral sensorimotor cortex. These findings provide insight into the neural representations of single and paired phonemes and support the notion that speech BCIs based on machine learning algorithms trained on individual phonemes recorded with intracranial electrode grids may be feasible.
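Illustrative sketch of the combination idea reported above (not the authors' code): train an SVM on single-phoneme trials and score paired trials by multiplying the per-position class probabilities. All data, shapes and the 3-class setup are synthetic stand-ins for the 7T voxel patterns.

```python
# Synthetic sketch: combine single-phoneme SVM classifiers to decode pairs.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_voxels = 200
X_single = rng.normal(size=(90, n_voxels))       # 90 single-phoneme trials
y_single = np.repeat([0, 1, 2], 30)              # 3 phonemes
X_single += 0.2 * y_single[:, None]              # inject class structure

clf = SVC(kernel="linear", probability=True).fit(X_single, y_single)

# Paired trials: one feature vector per produced phoneme; here each pair
# repeats the same phoneme, mirroring the identical-phoneme pairs in the study.
y_pair = rng.integers(0, 3, size=30)
X_pair = rng.normal(size=(30, 2, n_voxels)) + 0.2 * y_pair[:, None, None]

p1 = clf.predict_proba(X_pair[:, 0, :])          # first-position evidence
p2 = clf.predict_proba(X_pair[:, 1, :])          # second-position evidence
pred = np.argmax(p1 * p2, axis=1)                # combine the two classifiers
print("paired accuracy:", (pred == y_pair).mean())
```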
Affiliation(s)
- Maria Araújo Vitória: Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Francisco Guerreiro Fernandes: Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Max van den Boom: Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands; Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Nick Ramsey: Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
- Mathijs Raemaekers: Brain Center Rudolf Magnus, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht, The Netherlands
2
Canny E, Vansteensel MJ, van der Salm SMA, Müller-Putz GR, Berezutskaya J. Boosting brain-computer interfaces with functional electrical stimulation: potential applications in people with locked-in syndrome. J Neuroeng Rehabil 2023; 20:157. PMID: 37980536; PMCID: PMC10656959; DOI: 10.1186/s12984-023-01272-y.
Abstract
Individuals in a locked-in state live with severe whole-body paralysis that limits their ability to communicate with family and loved ones. Recent advances in brain-computer interface (BCI) technology offer a potential alternative: detecting the neural activity associated with attempted hand or speech movements and translating the decoded intended movements into a control signal for a computer. A technique that could enrich the communication capacity of BCIs is functional electrical stimulation (FES) of the paralyzed limbs and face to restore body and facial movements, allowing body language and facial expression to be added to communication BCI utterances. Here, we review the current state of the art of BCI and FES work in people with paralysis of body and face, and propose that a combined BCI-FES approach, which has already proved successful in several applications in stroke and spinal cord injury, could provide a promising new mode of communication for locked-in individuals.
Affiliation(s)
- Evan Canny: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Mariska J Vansteensel: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Sandra M A van der Salm: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
- Gernot R Müller-Putz: Institute of Neural Engineering, Laboratory of Brain-Computer Interfaces, Graz University of Technology, Graz, Austria
- Julia Berezutskaya: Department of Neurology and Neurosurgery, Brain Center, University Medical Center Utrecht, Utrecht, The Netherlands
3
Berezutskaya J, Freudenburg ZV, Vansteensel MJ, Aarnoutse EJ, Ramsey NF, van Gerven MAJ. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. J Neural Eng 2023; 20:056010. PMID: 37467739; PMCID: PMC10510111; DOI: 10.1088/1741-2552/ace8be.
Abstract
Objective. Development of brain-computer interface (BCI) technology is key for enabling communication in individuals who have lost the faculty of speech due to severe motor paralysis. A BCI control strategy that is gaining attention employs speech decoding from neural data. Recent studies have shown that a combination of direct neural recordings and advanced computational models can provide promising results. Understanding which decoding strategies deliver the best and most directly applicable results is crucial for advancing the field. Approach. In this paper, we optimized and validated a decoding approach based on speech reconstruction directly from high-density electrocorticography recordings from sensorimotor cortex during a speech production task. Main results. We show that (1) dedicated machine learning optimization of reconstruction models is key for achieving the best reconstruction performance; (2) individual word decoding in reconstructed speech achieves 92%-100% accuracy (chance level is 8%); (3) direct reconstruction from sensorimotor brain activity produces intelligible speech. Significance. These results underline the need for model optimization in achieving the best speech decoding results and highlight the potential that reconstruction-based speech decoding from sensorimotor cortex offers for the development of next-generation BCI technology for communication.
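A minimal sketch of the core regression step behind reconstruction-based decoding, assuming high-gamma feature frames and spectrogram targets; the authors' optimized deep learning models are not reproduced here, and the small grid search merely gestures at their dedicated optimization.

```python
# Synthetic sketch: regress spectrogram frames from windowed neural features.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
n_frames, n_elec, n_mels, ctx = 1000, 64, 40, 5
hg = rng.normal(size=(n_frames, n_elec))         # high-gamma power per frame
mel = rng.normal(size=(n_frames, n_mels))        # target spectrogram frames

# Stack a short temporal context window of neural activity around each frame.
X = np.hstack([np.roll(hg, s, axis=0) for s in range(-ctx, ctx + 1)])

# "Dedicated optimization" reduced to a tiny grid search for illustration.
search = GridSearchCV(
    MLPRegressor(max_iter=200, random_state=0),
    {"hidden_layer_sizes": [(64,), (128,)], "alpha": [1e-4, 1e-2]},
    cv=3,
)
search.fit(X[ctx:-ctx], mel[ctx:-ctx])           # drop frames wrapped by roll
print("best hyperparameters:", search.best_params_)
```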
Affiliation(s)
- Julia Berezutskaya: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands; Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
- Zachary V Freudenburg: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Mariska J Vansteensel: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Erik J Aarnoutse: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Nick F Ramsey: Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Marcel A J van Gerven: Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
4
Cooney C, Folli R, Coyle D. Opportunities, pitfalls and trade-offs in designing protocols for measuring the neural correlates of speech. Neurosci Biobehav Rev 2022; 140:104783. PMID: 35907491; DOI: 10.1016/j.neubiorev.2022.104783.
Abstract
Research on decoding speech and speech-related processes directly from the human brain has intensified in recent years, as such a decoder could positively impact people with limited communication capacity due to disease or injury. It could also enable entirely new forms of human-computer interaction and human-machine communication in general, and facilitate a better neuroscientific understanding of speech processes. Here, we synthesize the literature on how neural speech decoding experiments have been conducted, converging on the need for thoughtful experimental design aimed at specific research goals and for robust procedures for evaluating speech decoding paradigms. We examine the use of different modalities for presenting stimuli to participants, methods for constructing paradigms including timings and speech rhythms, and possible linguistic considerations. In addition, we present novel methods for eliciting naturalistic speech and for validating imagined speech task performance in experimental settings, based on recent research. We also describe the multitude of terms used to instruct participants on how to produce imagined speech during experiments and propose methods for investigating the effect of these terms on imagined speech decoding. We demonstrate that the range of experimental procedures used in neural speech decoding studies can have unintended consequences that impact the validity of the knowledge obtained. The review delineates the strengths and weaknesses of present approaches and proposes methodological advances that we anticipate will enhance experimental design and progress toward the optimal design of movement-independent direct speech brain-computer interfaces.
Affiliation(s)
- Ciaran Cooney: Intelligent Systems Research Centre, Ulster University, Derry, UK
- Raffaella Folli: Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
- Damien Coyle: Intelligent Systems Research Centre, Ulster University, Derry, UK
5
Berezutskaya J, Ambrogioni L, Ramsey NF, van Gerven MAJ. Towards Naturalistic Speech Decoding from Intracranial Brain Data. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:3100-3104. PMID: 36085779; DOI: 10.1109/embc48229.2022.9871301.
Abstract
Speech decoding from brain activity can enable the development of brain-computer interfaces (BCIs) to restore naturalistic communication in paralyzed patients. Previous work has focused on decoding models built from isolated speech data with a clean background and multiple repetitions of the material. In this study, we describe a novel approach to speech decoding that relies on a generative adversarial network (GAN) to reconstruct speech from brain data recorded during a naturalistic speech listening task (watching a movie). We compared the GAN-based approach, in which reconstruction was done from a compressed latent representation of sound decoded from the brain, with several baseline models that reconstructed the sound spectrogram directly. We show that the novel approach provides more accurate reconstructions than the baselines. These results underscore the potential of GAN models for speech decoding in naturalistic noisy environments and for further advancing BCIs for naturalistic communication. Clinical Relevance - This study presents a novel speech decoding paradigm that combines advances in deep learning, speech synthesis and neural engineering, and has the potential to advance the field of BCI for severely paralyzed individuals.
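A conceptual sketch of the two-stage idea, under the assumption of a pretrained generator: decode a compressed latent code of the sound from brain features with ridge regression, then let a GAN generator synthesize audio from it. The `generator` below is a hypothetical placeholder, not the paper's model.

```python
# Conceptual sketch: decode a GAN latent code, then synthesize from it.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(2)
brain = rng.normal(size=(500, 128))     # neural features per sound chunk
latent = rng.normal(size=(500, 32))     # latent codes of the target audio

dec = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(brain[:400], latent[:400])
z_hat = dec.predict(brain[400:])        # decoded latents for held-out chunks

def generator(z):
    # Hypothetical placeholder for a pretrained GAN generator that maps
    # latent codes to waveforms; here just a fixed random projection.
    return np.tanh(0.01 * z @ rng.normal(size=(32, 16000)))

audio = generator(z_hat)                # (100, 16000): ~1 s chunks at 16 kHz
print(audio.shape)
```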
6
Pandarinath C, Bensmaia SJ. The science and engineering behind sensitized brain-controlled bionic hands. Physiol Rev 2022; 102:551-604. PMID: 34541898; PMCID: PMC8742729; DOI: 10.1152/physrev.00034.2020.
Abstract
Advances in our understanding of brain function, along with the development of neural interfaces that allow for the monitoring and activation of neurons, have paved the way for brain-machine interfaces (BMIs), which harness neural signals to reanimate the limbs via electrical activation of the muscles or to control extracorporeal devices, thereby bypassing the muscles and senses altogether. BMIs consist of reading out motor intent from the neuronal responses monitored in motor regions of the brain and executing intended movements with bionic limbs, reanimated limbs, or exoskeletons. BMIs also allow for the restoration of the sense of touch by electrically activating neurons in somatosensory regions of the brain, thereby evoking vivid tactile sensations and conveying feedback about object interactions. In this review, we discuss the neural mechanisms of motor control and somatosensation in able-bodied individuals and describe approaches to use neuronal responses as control signals for movement restoration and to activate residual sensory pathways to restore touch. Although the focus of the review is on intracortical approaches, we also describe alternative signal sources for control and noninvasive strategies for sensory restoration.
Affiliation(s)
- Chethan Pandarinath: Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, Georgia; Department of Neurosurgery, Emory University, Atlanta, Georgia
- Sliman J Bensmaia: Department of Organismal Biology and Anatomy, University of Chicago, Chicago, Illinois; Committee on Computational Neuroscience, University of Chicago, Chicago, Illinois; Grossman Institute for Neuroscience, Quantitative Biology, and Human Behavior, University of Chicago, Chicago, Illinois
7
Glanz O, Hader M, Schulze-Bonhage A, Auer P, Ball T. A Study of Word Complexity Under Conditions of Non-experimental, Natural Overt Speech Production Using ECoG. Front Hum Neurosci 2022; 15:711886. PMID: 35185491; PMCID: PMC8854223; DOI: 10.3389/fnhum.2021.711886.
Abstract
The linguistic complexity of words has largely been studied at the behavioral level and in experimental settings. Little is known about the neural processes underlying it in uninstructed, spontaneous conversations. To address this phenomenon based on uninstructed, spontaneous speech production, we built a multimodal neurolinguistic corpus composed of synchronized audio, video, and electrocorticographic (ECoG) recordings from the fronto-temporo-parietal cortex. We performed extensive linguistic annotations of the language material and calculated word complexity using several numeric parameters. We orthogonalized the parameters with the help of a linear regression model. We then correlated the spectral components of neural activity with the individual linguistic parameters and with the residuals of the linear regression model, and compared the results. The proportional relation between the number of consonants and vowels, which was the most informative parameter with regard to the neural representation of word complexity, showed effects in two areas: a frontal one at the junction of the premotor cortex, the prefrontal cortex, and Brodmann area 44, and a postcentral one lying directly above the lateral sulcus and comprising the ventral central sulcus, the parietal operculum and the adjacent inferior parietal cortex. Beyond the physiological findings summarized here, our methods may be useful to those interested in studying neural effects related to natural language production and in surmounting the intrinsic problem of collinearity between multiple features of spontaneously spoken material.
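A small sketch of the orthogonalization step described above: residualize one collinear word-complexity parameter against the others with a linear regression, then correlate the residuals with spectral power. Names and data are illustrative, not from the study's corpus.

```python
# Sketch: residualize a collinear linguistic parameter, then correlate.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n_words = 400
params = rng.normal(size=(n_words, 4))      # collinear complexity parameters
params[:, 0] += 0.5 * params[:, 1]          # e.g., consonant/vowel relation
power = rng.normal(size=n_words)            # per-word spectral power

lm = LinearRegression().fit(params[:, 1:], params[:, 0])
resid = params[:, 0] - lm.predict(params[:, 1:])   # orthogonalized parameter

r_raw, _ = pearsonr(params[:, 0], power)
r_orth, _ = pearsonr(resid, power)
print(f"raw r = {r_raw:.3f}, orthogonalized r = {r_orth:.3f}")
```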
Affiliation(s)
- Olga Glanz (Iljina): GRK 1624 “Frequency Effects in Language,” University of Freiburg, Freiburg, Germany; Department of German Linguistics, University of Freiburg, Freiburg, Germany; The Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany; BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; Neurobiology and Biophysics, Faculty of Biology, University of Freiburg, Freiburg, Germany; Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Marina Hader: BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany
- Andreas Schulze-Bonhage: Department of Neurosurgery, Faculty of Medicine, Epilepsy Center, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany; Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
- Peter Auer: GRK 1624 “Frequency Effects in Language,” University of Freiburg, Freiburg, Germany; Department of German Linguistics, University of Freiburg, Freiburg, Germany; The Hermann Paul School of Linguistics, University of Freiburg, Freiburg, Germany
- Tonio Ball (corresponding author): BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; Translational Neurotechnology Lab, Department of Neurosurgery, Faculty of Medicine, Medical Center—University of Freiburg, University of Freiburg, Freiburg, Germany; Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
8
Luo S, Rabbani Q, Crone NE. Brain-Computer Interface: Applications to Speech Decoding and Synthesis to Augment Communication. Neurotherapeutics 2022; 19:263-273. PMID: 35099768; PMCID: PMC9130409; DOI: 10.1007/s13311-022-01190-2.
Abstract
Damage or degeneration of motor pathways necessary for speech and other movements, as in brainstem strokes or amyotrophic lateral sclerosis (ALS), can interfere with efficient communication without affecting brain structures responsible for language or cognition. In the worst-case scenario, this can result in locked-in syndrome (LIS), a condition in which individuals cannot initiate communication and can only express themselves by answering yes/no questions with eye blinks or other rudimentary movements. Existing augmentative and alternative communication (AAC) devices that rely on eye tracking can improve the quality of life for people with this condition, but brain-computer interfaces (BCIs) are also increasingly being investigated as AAC devices, particularly when eye tracking is too slow or unreliable. Moreover, with recent and ongoing advances in machine learning and neural recording technologies, BCIs may offer the only means to go beyond cursor control and text generation on a computer, allowing real-time synthesis of speech, which would arguably offer the most efficient and expressive channel for communication. The potential for BCI speech synthesis has only recently been realized because of seminal studies of the neuroanatomical and neurophysiological underpinnings of speech production using intracranial electrocorticographic (ECoG) recordings in patients undergoing epilepsy surgery. These studies have shown that cortical areas responsible for vocalization and articulation are distributed over a large area of ventral sensorimotor cortex, and that it is possible to decode speech and reconstruct its acoustics from ECoG if these areas are recorded with sufficiently dense and comprehensive electrode arrays. In this article, we review these advances, including the latest neural decoding strategies that range from deep learning models to the direct concatenation of speech units. We also discuss state-of-the-art vocoders that are integral in constructing natural-sounding audio waveforms for speech BCIs. Finally, this review outlines some of the challenges ahead in directly synthesizing speech for patients with LIS.
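To make the vocoder step concrete, here is a minimal classical illustration using Griffin-Lim phase reconstruction in librosa; the neural vocoders discussed in the review produce far more natural speech, and the chirp below merely stands in for a decoded spectrogram.

```python
# Classical illustration: invert a magnitude spectrogram with Griffin-Lim.
import numpy as np
import librosa

sr = 16000
y = librosa.chirp(fmin=100, fmax=4000, sr=sr, duration=1.0)  # stand-in audio
S = np.abs(librosa.stft(y, n_fft=512, hop_length=128))       # magnitudes only

# Iteratively estimate the missing phase and resynthesize a waveform.
y_hat = librosa.griffinlim(S, n_iter=60, hop_length=128)
print(y.shape, y_hat.shape)
```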
Affiliation(s)
- Shiyu Luo: Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Qinwan Rabbani: Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA
- Nathan E Crone: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
9
Tam WK, Wu T, Zhao Q, Keefer E, Yang Z. Human motor decoding from neural signals: a review. BMC Biomed Eng 2019; 1:22. PMID: 32903354; PMCID: PMC7422484; DOI: 10.1186/s42490-019-0022-z.
Abstract
Many people suffer from movement disability due to amputation or neurological disease. Fortunately, with modern neurotechnology it is now possible to intercept motor control signals at various points along the neural transduction pathway and use them to drive external devices for communication or control. Here we review the latest developments in human motor decoding, covering the various strategies for decoding motor intention from humans and their respective advantages and challenges. Neural control signals can be intercepted at various points in the neural signal transduction pathway, including the brain (electroencephalography, electrocorticography, intracortical recordings), the nerves (peripheral nerve recordings) and the muscles (electromyography). We systematically discuss the sites of signal acquisition, available neural features, signal processing techniques and decoding algorithms at each of these potential interception points, and review examples of applications and the current state-of-the-art performance. Although great strides have been made in human motor decoding, we are still far from achieving naturalistic and dexterous control like that of our native limbs. Concerted efforts from materials scientists, electrical engineers, and healthcare professionals are needed to further advance the field and make the technology widely available for clinical use.
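A generic sketch of the decoding pipeline the review surveys, on synthetic data: band-pass filtering, a log band-power feature per channel, and a linear classifier. No specific recording modality or dataset is implied.

```python
# Generic sketch: filter, extract band power, classify intended movement.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
fs, n_trials, n_ch, n_samp = 1000, 120, 16, 1000
raw = rng.normal(size=(n_trials, n_ch, n_samp))
y = rng.integers(0, 2, size=n_trials)            # two movement classes
raw[y == 1] *= 1.2                               # class-dependent amplitude

b, a = butter(4, [70, 170], btype="bandpass", fs=fs)  # high-gamma-like band
power = np.log(np.mean(filtfilt(b, a, raw, axis=-1) ** 2, axis=-1))

acc = cross_val_score(LinearDiscriminantAnalysis(), power, y, cv=5).mean()
print("cross-validated accuracy:", acc)
```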
Affiliation(s)
- Wing-kin Tam: Department of Biomedical Engineering, University of Minnesota Twin Cities, 7-105 Hasselmo Hall, 312 Church St. SE, Minneapolis, MN 55455, USA
- Tong Wu: Department of Biomedical Engineering, University of Minnesota Twin Cities, 7-105 Hasselmo Hall, 312 Church St. SE, Minneapolis, MN 55455, USA
- Qi Zhao: Department of Computer Science and Engineering, University of Minnesota Twin Cities, 4-192 Keller Hall, 200 Union Street SE, Minneapolis, MN 55455, USA
- Edward Keefer: Nerves Incorporated, P.O. Box 141295, Dallas, TX, USA
- Zhi Yang: Department of Biomedical Engineering, University of Minnesota Twin Cities, 7-105 Hasselmo Hall, 312 Church St. SE, Minneapolis, MN 55455, USA
10
Towards reconstructing intelligible speech from the human auditory cortex. Sci Rep 2019; 9:874. PMID: 30696881; PMCID: PMC6351601; DOI: 10.1038/s41598-018-37359-z.
Abstract
Auditory stimulus reconstruction is a technique that finds the best approximation of the acoustic stimulus from the population of evoked neural activity. Reconstructing speech from the human auditory cortex creates the possibility of a speech neuroprosthetic to establish direct communication with the brain, and has been shown to be possible in both overt and covert conditions. However, the low quality of the reconstructed speech has severely limited the utility of this method for brain-computer interface (BCI) applications. To advance the state of the art in speech neuroprosthesis, we combined recent advances in deep learning with the latest innovations in speech synthesis technologies to reconstruct closed-set intelligible speech from the human auditory cortex. We investigated the dependence of reconstruction accuracy on linear and nonlinear (deep neural network) regression methods and on the acoustic representation used as the target of reconstruction, including the auditory spectrogram and speech synthesis parameters. In addition, we compared reconstruction accuracy from low and high neural frequency ranges. Our results show that a deep neural network model that directly estimates the parameters of a speech synthesizer from all neural frequencies achieves the highest subjective and objective scores on a digit recognition task, improving intelligibility by 65% over the baseline method, which used linear regression to reconstruct the auditory spectrogram. These results demonstrate the efficacy of deep learning and speech synthesis algorithms for designing the next generation of speech BCI systems, which can not only restore communication for paralyzed patients but also have the potential to transform human-computer interaction technologies.
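A sketch of the linear baseline the study improves upon, assuming synthetic stand-ins for the neural features and spectrogram: ridge regression from neural activity to spectrogram bins, scored by per-bin correlation. The paper's DNN and speech-synthesis pipeline is not reproduced.

```python
# Sketch of the linear baseline: ridge regression to spectrogram bins.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
T, n_elec, n_bins = 3000, 100, 32
neural = rng.normal(size=(T, n_elec))
W = rng.normal(size=(n_elec, n_bins))            # hidden linear mapping
spec = neural @ W + rng.normal(scale=5.0, size=(T, n_bins))

model = Ridge(alpha=10.0).fit(neural[:2400], spec[:2400])
pred = model.predict(neural[2400:])

r = [np.corrcoef(pred[:, k], spec[2400:, k])[0, 1] for k in range(n_bins)]
print("mean per-bin reconstruction r:", float(np.mean(r)))
```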
11
Cooney C, Folli R, Coyle D. Neurolinguistics Research Advancing Development of a Direct-Speech Brain-Computer Interface. iScience 2018; 8:103-125. PMID: 30296666; PMCID: PMC6174918; DOI: 10.1016/j.isci.2018.09.016.
Abstract
A direct-speech brain-computer interface (DS-BCI) acquires neural signals corresponding to imagined speech, then processes and decodes these signals to produce a linguistic output in the form of phonemes, words, or sentences. Recent research has shown the potential of neurolinguistics to enhance decoding approaches to imagined speech with the inclusion of semantics and phonology in experimental procedures. As neurolinguistics research findings are beginning to be incorporated within the scope of DS-BCI research, it is our view that a thorough understanding of imagined speech, and its relationship with overt speech, must be considered an integral feature of research in this field. With a focus on imagined speech, we provide a review of the most important neurolinguistics research informing the field of DS-BCI and suggest how this research may be utilized to improve current experimental protocols and decoding techniques. Our review of the literature supports a cross-disciplinary approach to DS-BCI research, in which neurolinguistics concepts and methods are utilized to aid development of a naturalistic mode of communication.
Affiliation(s)
- Ciaran Cooney: Intelligent Systems Research Centre, Ulster University, Derry, UK
- Raffaella Folli: Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
- Damien Coyle: Intelligent Systems Research Centre, Ulster University, Derry, UK
12
Steadman MA, Sumner CJ. Changes in Neuronal Representations of Consonants in the Ascending Auditory System and Their Role in Speech Recognition. Front Neurosci 2018; 12:671. PMID: 30369863; PMCID: PMC6194309; DOI: 10.3389/fnins.2018.00671.
Abstract
A fundamental task of the ascending auditory system is to produce representations that facilitate the recognition of complex sounds. This is particularly challenging in the context of acoustic variability, such as that between different talkers producing the same phoneme. These representations are transformed as information is propagated throughout the ascending auditory system from the inner ear to the auditory cortex (AI). Investigating these transformations and their role in speech recognition is key to understanding hearing impairment and the development of future clinical interventions. Here, we obtained neural responses to an extensive set of natural vowel-consonant-vowel phoneme sequences, each produced by multiple talkers, in three stages of the auditory processing pathway. Auditory nerve (AN) representations were simulated using a model of the peripheral auditory system and extracellular neuronal activity was recorded in the inferior colliculus (IC) and primary auditory cortex (AI) of anaesthetized guinea pigs. A classifier was developed to examine the efficacy of these representations for recognizing the speech sounds. Individual neurons convey progressively less information from AN to AI. Nonetheless, at the population level, representations are sufficiently rich to facilitate recognition of consonants with a high degree of accuracy at all stages indicating a progression from a dense, redundant representation to a sparse, distributed one. We examined the timescale of the neural code for consonant recognition and found that optimal timescales increase throughout the ascending auditory system from a few milliseconds in the periphery to several tens of milliseconds in the cortex. Despite these longer timescales, we found little evidence to suggest that representations up to the level of AI become increasingly invariant to across-talker differences. Instead, our results support the idea that the role of the subcortical auditory system is one of dimensionality expansion, which could provide a basis for flexible classification of arbitrary speech sounds.
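A toy version of the timescale analysis, with simulated Poisson-like spike trains: bin responses at several resolutions and see where a simple classifier separates the classes best. Purely illustrative of the method, not the study's data.

```python
# Toy timescale analysis: classify binned spike counts at several bin widths.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n_trials, dur_ms = 200, 400
labels = rng.integers(0, 4, size=n_trials)            # 4 consonant classes
rates = 0.05 * rng.gamma(2.0, 1.0, size=(4, dur_ms))  # class rate profiles
spikes = rng.random((n_trials, dur_ms)) < rates[labels]

for bin_ms in (5, 20, 50, 100):
    counts = spikes.reshape(n_trials, dur_ms // bin_ms, bin_ms).sum(axis=-1)
    acc = cross_val_score(KNeighborsClassifier(5), counts, labels, cv=5).mean()
    print(f"bin {bin_ms:>3} ms: accuracy {acc:.2f}")
```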
Affiliation(s)
- Mark A. Steadman: MRC Institute of Hearing Research, School of Medicine, The University of Nottingham, Nottingham, United Kingdom; Department of Bioengineering, Imperial College London, London, United Kingdom
- Christian J. Sumner: MRC Institute of Hearing Research, School of Medicine, The University of Nottingham, Nottingham, United Kingdom
13
Varatharajah Y, Berry B, Cimbalnik J, Kremen V, Van Gompel J, Stead M, Brinkmann B, Iyer R, Worrell G. Integrating artificial intelligence with real-time intracranial EEG monitoring to automate interictal identification of seizure onset zones in focal epilepsy. J Neural Eng 2018; 15:046035. PMID: 29855436; PMCID: PMC6108188; DOI: 10.1088/1741-2552/aac960.
Abstract
OBJECTIVE An ability to map seizure-generating brain tissue, i.e. the seizure onset zone (SOZ), without recording actual seizures could reduce the duration of invasive EEG monitoring for patients with drug-resistant epilepsy. A widely adopted practice in the literature is to compare the incidence (events/time) of putative pathological electrophysiological biomarkers associated with epileptic brain tissue with the SOZ determined from spontaneous seizures recorded with intracranial EEG, primarily using a single biomarker. Clinical translation of these previous efforts suffers from their inability to generalize across patients because of (a) inter-patient variability and (b) temporal variability in the epileptogenic activity. APPROACH Here, we report an artificial intelligence-based approach that combines multiple interictal electrophysiological biomarkers and their temporal characteristics to account for the above barriers, and show that it can reliably identify seizure onset zones in a study cohort of 82 patients who underwent evaluation for drug-resistant epilepsy. MAIN RESULTS Our investigation provides evidence that utilizing the complementary information provided by multiple electrophysiological biomarkers and their temporal characteristics can significantly improve localization compared to previously published single-biomarker incidence-based approaches, resulting in an average area under the ROC curve (AUC) of 0.73 in a cohort of 82 patients. Our results also suggest that recording durations between 90 min and 2 h are sufficient to localize SOZs with accuracies that may prove clinically relevant. SIGNIFICANCE The successful validation of our approach on a large cohort of 82 patients warrants future investigation of the feasibility of utilizing intra-operative EEG monitoring and artificial intelligence to localize epileptogenic brain tissue. Broadly, our study demonstrates the use of artificial intelligence coupled with careful feature engineering in augmenting clinical decision making.
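A schematic of the core idea on synthetic electrodes: combine several interictal biomarker rates in one classifier and score seizure-onset-zone membership by AUC, rather than ranking electrodes by a single biomarker's incidence. Feature values are invented for illustration.

```python
# Schematic: combine biomarker rates per electrode and score SOZ AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(7)
n_elec = 300
is_soz = rng.random(n_elec) < 0.15               # ground-truth SOZ labels
# events/min for three biomarkers; SOZ electrodes get elevated rates
rates = rng.exponential(1.0, size=(n_elec, 3))
rates += is_soz[:, None] * np.array([0.8, 0.4, 0.2])

auc_single = roc_auc_score(is_soz, rates[:, 0])  # one-biomarker baseline
proba = cross_val_predict(LogisticRegression(), rates, is_soz,
                          cv=5, method="predict_proba")[:, 1]
print(f"single AUC {auc_single:.2f} vs combined AUC "
      f"{roc_auc_score(is_soz, proba):.2f}")
```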
Affiliation(s)
- Yogatheesan Varatharajah: Electrical and Computer Engineering, University of Illinois, Urbana, IL 61801, United States of America
14
Martin S, Iturrate I, Millán JDR, Knight RT, Pasley BN. Decoding Inner Speech Using Electrocorticography: Progress and Challenges Toward a Speech Prosthesis. Front Neurosci 2018; 12:422. PMID: 29977189; PMCID: PMC6021529; DOI: 10.3389/fnins.2018.00422.
Abstract
Certain brain disorders, resulting from brainstem infarcts, traumatic brain injury, cerebral palsy, stroke, and amyotrophic lateral sclerosis, limit verbal communication despite the patient being fully aware. People who cannot communicate due to neurological disorders would benefit from a system that can infer internal speech directly from brain signals. In this review article, we describe the state of the art in decoding inner speech, ranging from early acoustic sound features to higher-order speech units. We focus on intracranial recordings, as this technique allows monitoring of brain activity with high spatial, temporal, and spectral resolution, and therefore is a good candidate for investigating inner speech. Despite intense efforts, understanding how the human cortex encodes inner speech remains an elusive challenge, due to the lack of behavioral and observable measures. We emphasize various challenges commonly encountered when investigating inner speech decoding, and propose potential solutions in order to get closer to a natural speech assistive device.
Affiliation(s)
- Stephanie Martin: Defitech Chair in Brain Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Iñaki Iturrate: Defitech Chair in Brain Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- José del R. Millán: Defitech Chair in Brain Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Robert T. Knight: Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Department of Psychology, University of California, Berkeley, Berkeley, CA, United States
- Brian N. Pasley: Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
15
Ibayashi K, Kunii N, Matsuo T, Ishishita Y, Shimada S, Kawai K, Saito N. Decoding Speech With Integrated Hybrid Signals Recorded From the Human Ventral Motor Cortex. Front Neurosci 2018; 12:221. PMID: 29674950; PMCID: PMC5895763; DOI: 10.3389/fnins.2018.00221.
Abstract
Restoration of speech communication for locked-in patients by means of brain-computer interfaces (BCIs) is currently an important area of active research. Among the neural signals obtained from intracranial recordings, single/multi-unit activity (SUA/MUA), local field potential (LFP), and electrocorticography (ECoG) are good candidates for an input signal for BCIs. However, the question of which signal, or which combination of the three signal modalities, is best suited for decoding speech production remains unresolved. To record SUA, LFP, and ECoG simultaneously from a highly localized area of human ventral sensorimotor cortex (vSMC), we fabricated a 7 by 13 mm electrode array containing sparsely arranged microneedle and conventional macro contacts. We determined which signal modality is most capable of decoding speech production, and tested whether combining these signals could improve the decoding accuracy of spoken phonemes. Feature vectors were constructed from spike frequency obtained from SUAs and from event-related spectral perturbation derived from ECoG and LFP signals, then input to the decoder. The decoding accuracy for five spoken vowels was highest when features from multiple signals were combined and optimized for each subject, reaching 59% when averaged across all six subjects. This result suggests that multi-scale signals convey complementary information for speech articulation. The current study demonstrates that simultaneous recording of multi-scale neuronal activity can raise decoding accuracy even when the recording area is limited to a small portion of cortex, which is advantageous for future implementation of speech-assisting BCIs.
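A sketch of the multi-scale fusion idea with synthetic features: classify five vowels from each modality's features alone and from their concatenation. Dimensions and signal strengths are illustrative assumptions.

```python
# Sketch: five-vowel decoding from each modality and from fused features.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n_trials = 150
y = rng.integers(0, 5, size=n_trials)            # five vowels

def modality(dim, gain):                         # synthetic feature block
    return rng.normal(size=(n_trials, dim)) + gain * y[:, None]

sua, lfp, ecog = modality(20, 0.15), modality(48, 0.10), modality(64, 0.10)
for name, X in [("SUA", sua), ("LFP", lfp), ("ECoG", ecog),
                ("combined", np.hstack([sua, lfp, ecog]))]:
    acc = cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean()
    print(f"{name:>8}: accuracy {acc:.2f}")
```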
Affiliation(s)
- Kenji Ibayashi: Department of Neurosurgery, The University of Tokyo Hospital, Tokyo, Japan
- Naoto Kunii: Department of Neurosurgery, The University of Tokyo Hospital, Tokyo, Japan
- Takeshi Matsuo: Department of Neurosurgery, Tokyo Metropolitan Neurological Hospital, Tokyo, Japan
- Yohei Ishishita: Department of Neurosurgery, The University of Tokyo Hospital, Tokyo, Japan
- Seijiro Shimada: Department of Neurosurgery, The University of Tokyo Hospital, Tokyo, Japan
- Kensuke Kawai: Department of Neurosurgery, Jichi Medical University, Tochigi, Japan
- Nobuhito Saito: Department of Neurosurgery, The University of Tokyo Hospital, Tokyo, Japan
16
Brumberg JS, Pitt KM, Mantie-Kozlowski A, Burnison JD. Brain-Computer Interfaces for Augmentative and Alternative Communication: A Tutorial. Am J Speech Lang Pathol 2018; 27:1-12. PMID: 29318256; PMCID: PMC5968329; DOI: 10.1044/2017_ajslp-16-0244.
Abstract
PURPOSE Brain-computer interfaces (BCIs) have the potential to improve communication for people who require but are unable to use traditional augmentative and alternative communication (AAC) devices. As BCIs move toward clinical practice, speech-language pathologists (SLPs) will need to consider their appropriateness for AAC intervention. METHOD This tutorial provides a background on BCI approaches to provide AAC specialists foundational knowledge necessary for clinical application of BCI. Tutorial descriptions were generated based on a literature review of BCIs for restoring communication. RESULTS The tutorial responses directly address 4 major areas of interest for SLPs who specialize in AAC: (a) the current state of BCI with emphasis on SLP scope of practice (including the subareas: the way in which individuals access AAC with BCI, the efficacy of BCI for AAC, and the effects of fatigue), (b) populations for whom BCI is best suited, (c) the future of BCI as an addition to AAC access strategies, and (d) limitations of BCI. CONCLUSION Current BCIs have been designed as access methods for AAC rather than a replacement; therefore, SLPs can use existing knowledge in AAC as a starting point for clinical application. Additional training is recommended to stay updated with rapid advances in BCI.
Affiliation(s)
- Jonathan S. Brumberg: Department of Speech-Language-Hearing: Sciences and Disorders, Neuroscience Graduate Program, The University of Kansas, Lawrence
- Kevin M. Pitt: Department of Speech-Language-Hearing: Sciences and Disorders, The University of Kansas, Lawrence
17
Decoding spoken phonemes from sensorimotor cortex with high-density ECoG grids. Neuroimage 2017; 180:301-311. PMID: 28993231; DOI: 10.1016/j.neuroimage.2017.10.011.
Abstract
For people who cannot communicate due to severe paralysis or involuntary movements, technology that decodes intended speech from the brain may offer an alternative means of communication. If decoding proves feasible, intracranial brain-computer interface systems can be developed that translate decoded speech into computer-generated speech or into instructions for controlling assistive devices. Recent advances suggest that such decoding may be feasible from sensorimotor cortex, but it is not clear how this challenge is best approached. One approach is to identify and discriminate elements of spoken language, such as phonemes. We investigated the feasibility of decoding four spoken phonemes from the sensorimotor face area, using electrocorticographic signals obtained with high-density electrode grids. Several decoding algorithms, including spatiotemporal matched filters, spatial matched filters and support vector machines, were compared. Phonemes could be classified correctly at a level of over 75% with spatiotemporal matched filters. Support vector machine analysis reached a similar level, but spatial matched filters yielded significantly lower scores. The most informative electrodes were clustered along the central sulcus. The highest scores were achieved from time windows centered around voice onset time, but a 500 ms window before onset time could also be classified significantly above chance. The results suggest that phoneme production involves a sequence of robust and reproducible activity patterns on the cortical surface. Importantly, decoding requires the inclusion of temporal information to capture the rapid shifts of robust patterns associated with articulator muscle group contraction during production of a phoneme. The high classification scores are likely enabled by the use of high-density grids and of discrete phonemes. Implications for use in brain-computer interfaces are discussed.
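A compact sketch of a spatiotemporal matched filter on synthetic trials: average training trials of each phoneme into a space-by-time template and assign test trials to the best-correlating template, keeping the temporal structure that purely spatial filters discard.

```python
# Sketch: spatiotemporal matched-filter classification of synthetic trials.
import numpy as np

rng = np.random.default_rng(9)
n_trials, n_ch, n_t, n_cls = 160, 64, 50, 4
y = rng.integers(0, n_cls, size=n_trials)
proto = rng.normal(size=(n_cls, n_ch, n_t))           # latent class patterns
X = 0.5 * proto[y] + rng.normal(size=(n_trials, n_ch, n_t))

train = np.arange(n_trials) % 2 == 0                  # even/odd split
templates = np.stack([X[train & (y == k)].mean(0) for k in range(n_cls)])

def corr(a, b):                                       # correlation over space-time
    a, b = a.ravel() - a.mean(), b.ravel() - b.mean()
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

pred = np.array([np.argmax([corr(x, t) for t in templates]) for x in X[~train]])
print("matched-filter accuracy:", (pred == y[~train]).mean())
```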
18
Holdgraf CR, Rieger JW, Micheli C, Martin S, Knight RT, Theunissen FE. Encoding and Decoding Models in Cognitive Electrophysiology. Front Syst Neurosci 2017; 11:61. PMID: 29018336; PMCID: PMC5623038; DOI: 10.3389/fnsys.2017.00061.
Abstract
Cognitive neuroscience has seen rapid growth in the size and complexity of data recorded from the human brain as well as in the computational tools available to analyze these data. This data explosion has resulted in an increased use of multivariate, model-based methods for asking neuroscience questions, allowing scientists to investigate multiple hypotheses with a single dataset, to use complex, time-varying stimuli, and to study the human brain under more naturalistic conditions. These tools come in the form of "encoding" models, in which stimulus features are used to model brain activity, and "decoding" models, in which neural features are used to generate a stimulus output. Here we review the current state of encoding and decoding models in cognitive electrophysiology and provide a practical guide to conducting experiments and analyses in this emerging field. Our examples focus on using linear models in the study of human language and audition. We show how to calculate auditory receptive fields from natural sounds as well as how to decode neural recordings to predict speech. The paper aims to be a useful tutorial on these approaches and a practical introduction to using machine learning and applied statistics to build models of neural activity. The data-analytic approaches we discuss may also be applied to other sensory modalities, motor systems, and cognitive systems, and we cover some examples in these areas. In addition, a collection of Jupyter notebooks is publicly available as a complement to the material covered in this paper, providing code examples and tutorials for predictive modeling in Python. The aim is to provide a practical understanding of predictive modeling of human brain data and to propose best practices for conducting these analyses.
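In the spirit of the paper's companion Python notebooks, a minimal encoding-model sketch: estimate a spectro-temporal receptive field by ridge regression of a simulated response onto lagged spectrogram features. The data and ground-truth filter are synthetic.

```python
# Sketch: estimate a spectro-temporal receptive field with ridge regression.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(10)
T, n_freq, n_lags = 5000, 16, 10
spec = rng.normal(size=(T, n_freq))                   # stimulus spectrogram
strf = np.hanning(n_lags)[:, None] * rng.normal(size=(n_lags, n_freq))

# Lagged design matrix and the simulated electrode response it generates.
X = np.hstack([np.roll(spec, lag, axis=0) for lag in range(n_lags)])
y = X @ strf.ravel() + rng.normal(scale=2.0, size=T)

fit = RidgeCV(alphas=np.logspace(-1, 5, 13)).fit(X[n_lags:], y[n_lags:])
w = fit.coef_.reshape(n_lags, n_freq)                 # recovered STRF
print("weight recovery r:", np.corrcoef(w.ravel(), strf.ravel())[0, 1])
```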
Affiliation(s)
- Christopher R. Holdgraf: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Office of the Vice Chancellor for Research, Berkeley Institute for Data Science, University of California, Berkeley, Berkeley, CA, United States
- Jochem W. Rieger: Department of Psychology, Carl-von-Ossietzky University, Oldenburg, Germany
- Cristiano Micheli: Department of Psychology, Carl-von-Ossietzky University, Oldenburg, Germany; Institut des Sciences Cognitives Marc Jeannerod, Lyon, France
- Stephanie Martin: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Defitech Chair in Brain-Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Robert T. Knight: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Frederic E. Theunissen: Department of Psychology, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Department of Psychology, University of California, Berkeley, Berkeley, CA, United States
19
Wang X, Gkogkidis CA, Iljina O, Fiederer LDJ, Henle C, Mader I, Kaminsky J, Stieglitz T, Gierthmuehlen M, Ball T. Mapping the fine structure of cortical activity with different micro-ECoG electrode array geometries. J Neural Eng 2017; 14:056004. DOI: 10.1088/1741-2552/aa785e.
20
Pailla T, Jiang W, Dichter B, Chang EF, Gilja V. ECoG data analyses to inform closed-loop BCI experiments for speech-based prosthetic applications. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2016:5713-5716. PMID: 28269552; DOI: 10.1109/embc.2016.7592024.
Abstract
Brain-computer interfaces (BCIs) assist individuals with motor disabilities by enabling them to control prosthetic devices with their neural activity. The performance of closed-loop BCI systems can be improved by using design strategies that leverage structured and task-relevant neural activity. We use data from high-density electrocorticography (ECoG) grids implanted in three subjects to study sensory-motor activity during an instructed speech task in which the subjects vocalized three cardinal vowel phonemes. We show how our findings relate to the current understanding of speech physiology and the functional organization of human sensory-motor cortex. We investigate the effect of behavioral variations on the parameters and performance of the decoding model. Our analyses suggest experimental design strategies that may be critical for speech-based BCI performance.
21
Brumberg JS, Krusienski DJ, Chakrabarti S, Gunduz A, Brunner P, Ritaccio AL, Schalk G. Spatio-Temporal Progression of Cortical Activity Related to Continuous Overt and Covert Speech Production in a Reading Task. PLoS One 2016; 11:e0166872. PMID: 27875590; PMCID: PMC5119784; DOI: 10.1371/journal.pone.0166872.
Abstract
How the human brain plans, executes, and monitors continuous and fluent speech has remained largely elusive. For example, previous research has defined the cortical locations most important for different aspects of speech function, but has not yet yielded a definition of the temporal progression of involvement of those locations as speech progresses either overtly or covertly. In this paper, we uncovered the spatio-temporal evolution of neuronal population-level activity related to continuous overt speech, and identified those locations that shared activity characteristics across overt and covert speech. Specifically, we asked subjects to repeat continuous sentences aloud or silently while we recorded electrical signals directly from the surface of the brain using electrocorticography (ECoG). We then determined the relationship between cortical activity and speech output across different areas of cortex and at sub-second timescales. The results highlight a spatio-temporal progression of cortical involvement in the continuous speech process that initiates utterances in frontal-motor areas and ends with the monitoring of auditory feedback in superior temporal gyrus. Direct comparison of cortical activity related to overt versus covert conditions revealed a common network of brain regions involved in speech that may implement orthographic and phonological processing. Our results provide one of the first characterizations of the spatio-temporal electrophysiological representation of the continuous speech process, and also highlight the common neural substrate of overt and covert speech. These results thereby contribute to a refined understanding of speech functions in the human brain.
Affiliation(s)
- Jonathan S. Brumberg (corresponding author): Department of Speech-Language-Hearing: Sciences & Disorders, University of Kansas, Lawrence, KS, United States of America
- Dean J. Krusienski: Department of Electrical & Computer Engineering, Old Dominion University, Norfolk, VA, United States of America
- Shreya Chakrabarti: Department of Electrical & Computer Engineering, Old Dominion University, Norfolk, VA, United States of America
- Aysegul Gunduz: J. Crayton Pruitt Family Dept. of Biomedical Engineering, University of Florida, Gainesville, FL, United States of America
- Peter Brunner: National Center for Adaptive Neurotechnologies, Wadsworth Center, New York State Department of Health, Albany, NY, United States of America; Department of Neurology, Albany Medical College, Albany, NY, United States of America
- Anthony L. Ritaccio: Department of Neurology, Albany Medical College, Albany, NY, United States of America
- Gerwin Schalk: National Center for Adaptive Neurotechnologies, Wadsworth Center, New York State Department of Health, Albany, NY, United States of America; Department of Neurology, Albany Medical College, Albany, NY, United States of America
22
Wang NXR, Olson JD, Ojemann JG, Rao RPN, Brunton BW. Unsupervised Decoding of Long-Term, Naturalistic Human Neural Recordings with Automated Video and Audio Annotations. Front Hum Neurosci 2016; 10:165. PMID: 27148018; PMCID: PMC4838634; DOI: 10.3389/fnhum.2016.00165.
Abstract
Fully automated decoding of human activities and intentions from direct neural recordings is a tantalizing challenge in brain-computer interfacing. Implementing brain-computer interfaces (BCIs) outside carefully controlled laboratory experiments requires adaptive and scalable strategies with minimal supervision. Here we describe an unsupervised approach to decoding neural states from naturalistic human brain recordings. We analyzed continuous, long-term electrocorticography (ECoG) data recorded over many days from the brains of subjects in a hospital room, with simultaneous audio and video recordings. We discovered coherent clusters in high-dimensional ECoG recordings using hierarchical clustering and automatically annotated them using speech and movement labels extracted from the audio and video. To our knowledge, this represents the first time techniques from computer vision and speech processing have been used for natural ECoG decoding. Interpretable behaviors were decoded from the ECoG data, including moving, speaking and resting; the results were assessed by comparison with manual annotation. Discovered clusters were projected back onto the brain, revealing features consistent with known functional areas and opening the door to automated functional brain mapping in natural settings.
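A sketch of the unsupervised step on synthetic features: hierarchically cluster time windows, then compare the discovered clusters against behavior labels of the kind the automated audio/video annotation would provide.

```python
# Sketch: cluster ECoG feature windows, then compare to behavior labels.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(11)
states = rng.integers(0, 3, size=600)            # rest / move / speak
X = rng.normal(size=(600, 40)) + 0.8 * states[:, None]

Z = linkage(X, method="ward")
clusters = fcluster(Z, t=3, criterion="maxclust")

# Contingency table: discovered clusters vs. annotation-derived states.
table = np.zeros((3, 3), dtype=int)
for c, s in zip(clusters, states):
    table[c - 1, s] += 1
print(table)
```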
Affiliation(s)
- Nancy X R Wang
- Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA; Institute for Neuroengineering, University of Washington, Seattle, WA, USA; eScience Institute, University of Washington, Seattle, WA, USA; Center for Sensorimotor Neural Engineering, University of Washington, Seattle, WA, USA
- Jared D Olson
- Center for Sensorimotor Neural Engineering, University of Washington, Seattle, WA, USA; Department of Rehabilitation Medicine, University of Washington, Seattle, WA, USA
- Jeffrey G Ojemann
- Institute for Neuroengineering, University of Washington, Seattle, WA, USA; Center for Sensorimotor Neural Engineering, University of Washington, Seattle, WA, USA; Department of Neurological Surgery, University of Washington, Seattle, WA, USA
- Rajesh P N Rao
- Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA; Institute for Neuroengineering, University of Washington, Seattle, WA, USA; Center for Sensorimotor Neural Engineering, University of Washington, Seattle, WA, USA
- Bingni W Brunton
- Institute for Neuroengineering, University of Washington, Seattle, WA, USA; eScience Institute, University of Washington, Seattle, WA, USA; Department of Biology, University of Washington, Seattle, WA, USA
23
Herff C, Heger D, de Pesters A, Telaar D, Brunner P, Schalk G, Schultz T. Brain-to-text: decoding spoken phrases from phone representations in the brain. Front Neurosci 2015; 9:217. [PMID: 26124702 PMCID: PMC4464168 DOI: 10.3389/fnins.2015.00217] [Citation(s) in RCA: 144] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2015] [Accepted: 05/18/2015] [Indexed: 11/24/2022] Open
Abstract
It has long been speculated whether communication between humans and machines based on natural speech-related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones, or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech.
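The reported word and phone error rates are the standard ASR metrics: edit distance between the decoded and actually spoken sequences, normalized by reference length. A minimal sketch (the example sequences are invented, not from the study):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[m][n]

def error_rate(ref, hyp):
    """Word (or phone) error rate: edit operations per reference token."""
    return edit_distance(ref, hyp) / len(ref)

# Invented example: a decoded word sequence versus the spoken one.
spoken = "the quick brown fox jumps".split()
decoded = "the quick brown box".split()
print(f"WER = {error_rate(spoken, decoded):.0%}")  # 2 edits / 5 words = 40%
```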
Affiliation(s)
- Christian Herff
- Cognitive Systems Lab, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Dominic Heger
- Cognitive Systems Lab, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Adriana de Pesters
- New York State Department of Health, National Center for Adaptive Neurotechnologies, Wadsworth Center, Albany, NY, USA; Department of Biomedical Sciences, State University of New York at Albany, Albany, NY, USA
- Dominic Telaar
- Cognitive Systems Lab, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Peter Brunner
- New York State Department of Health, National Center for Adaptive Neurotechnologies, Wadsworth Center, Albany, NY, USA; Department of Neurology, Albany Medical College, Albany, NY, USA
- Gerwin Schalk
- New York State Department of Health, National Center for Adaptive Neurotechnologies, Wadsworth Center, Albany, NY, USA; Department of Biomedical Sciences, State University of New York at Albany, Albany, NY, USA; Department of Neurology, Albany Medical College, Albany, NY, USA
- Tanja Schultz
- Cognitive Systems Lab, Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany
24
Lotte F, Brumberg JS, Brunner P, Gunduz A, Ritaccio AL, Guan C, Schalk G. Electrocorticographic representations of segmental features in continuous speech. Front Hum Neurosci 2015; 9:97. [PMID: 25759647 PMCID: PMC4338752 DOI: 10.3389/fnhum.2015.00097] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2014] [Accepted: 02/06/2015] [Indexed: 11/25/2022] Open
Abstract
Acoustic speech output results from coordinated articulation of dozens of muscles, bones and cartilages of the vocal mechanism. While we commonly take the fluency and speed of our speech productions for granted, the neural mechanisms facilitating the requisite muscular control are not completely understood. Previous neuroimaging and electrophysiology studies of speech sensorimotor control have typically concentrated on speech sounds (i.e., phonemes, syllables and words) in isolation; sentence-length investigations have largely been used to inform coincident linguistic processing. In this study, we examined the neural representations of segmental features (place and manner of articulation, and voicing status) in the context of fluent, continuous speech production. We used recordings from the cortical surface [electrocorticography (ECoG)] to simultaneously evaluate the spatial topography and temporal dynamics of the neural correlates of speech articulation that may mediate the generation of hypothesized gestural or articulatory scores. We found that the representation of place of articulation involved broad networks of brain regions during all phases of speech production: preparation, execution and monitoring. In contrast, manner of articulation and voicing status were dominated by auditory cortical responses after speech had been initiated. These results provide new insight into the articulatory and auditory processes underlying speech production in terms of their motor requirements and acoustic correlates.
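Analyses of this kind project each phone onto a single segmental dimension before relating it to neural activity. A minimal sketch of such a mapping, with a hypothetical feature table covering a few English consonants:

```python
# Hypothetical segmental-feature table for a few English consonants,
# following the place/manner/voicing scheme analyzed in the study.
SEGMENTAL_FEATURES = {
    "p": {"place": "bilabial", "manner": "stop",      "voiced": False},
    "b": {"place": "bilabial", "manner": "stop",      "voiced": True},
    "t": {"place": "alveolar", "manner": "stop",      "voiced": False},
    "d": {"place": "alveolar", "manner": "stop",      "voiced": True},
    "s": {"place": "alveolar", "manner": "fricative", "voiced": False},
    "z": {"place": "alveolar", "manner": "fricative", "voiced": True},
    "m": {"place": "bilabial", "manner": "nasal",     "voiced": True},
}

def relabel(phones, feature):
    """Project a phone sequence onto one segmental dimension, e.g. to
    train per-feature decoders instead of per-phone decoders."""
    return [SEGMENTAL_FEATURES[p][feature] for p in phones]

print(relabel(["b", "s", "m"], "place"))   # ['bilabial', 'alveolar', 'bilabial']
print(relabel(["b", "s", "m"], "voiced"))  # [True, False, True]
```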
Affiliation(s)
- Jonathan S Brumberg
- Department of Speech-Language-Hearing, University of Kansas, Lawrence, KS, USA
- Peter Brunner
- National Center for Adaptive Neurotechnologies, Wadsworth Center, New York State Department of Health, Albany, NY, USA; Department of Neurology, Albany Medical College, Albany, NY, USA
- Aysegul Gunduz
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL, USA
- Cuntai Guan
- A*STAR Agency for Science, Technology and Research, Institute for Infocomm Research, Singapore
- Gerwin Schalk
- National Center for Adaptive Neurotechnologies, Wadsworth Center, New York State Department of Health, Albany, NY, USA; Department of Neurology, Albany Medical College, Albany, NY, USA
25
Song C, Xu R, Hong B. Decoding of Chinese phoneme clusters using ECoG. Annu Int Conf IEEE Eng Med Biol Soc 2015; 2014:1278-81. [PMID: 25570199 DOI: 10.1109/embc.2014.6943831] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
A finite set of phonetic units is used in human speech, but how our brain recognizes these units from speech streams is still largely unknown. Revealing this neural mechanism may lead to the development of new types of speech brain-computer interfaces (BCIs) and computer speech recognition systems. In this study, we used electrocorticography (ECoG) signals from human cortex to decode phonetic units during the perception of continuous speech. By exploring wavelet time-frequency features, we identified ECoG electrodes that respond selectively to specific Chinese phonemes. Gamma and high-gamma power of these electrodes were further combined to separate sets of phonemes into clusters. The clustered organization largely coincided with phonological categories defined by place and manner of articulation. These findings were incorporated into a decoding framework for Chinese phoneme clusters. Using a support vector machine (SVM) classifier, we achieved accuracies consistently above chance level across five patients when discriminating specific phonetic clusters, which suggests a promising direction for implementing a speech BCI.
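The decoding framework, band-limited envelope power fed to an SVM, might look like the sketch below; the band edges, trial dimensions, and cross-validation setup are assumptions for illustration, not the study's parameters:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def band_power(trials, fs, lo, hi):
    """Mean analytic-envelope power per trial and channel in [lo, hi] Hz."""
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="bandpass")
    env = np.abs(hilbert(filtfilt(b, a, trials, axis=-1), axis=-1))
    return env.mean(axis=-1)

# Hypothetical data: 100 trials x 32 electrodes x 1 s at 1 kHz, with one
# of three phoneme-cluster labels per trial (random here, so accuracy
# should hover around the 1/3 chance level).
rng = np.random.default_rng(0)
trials = rng.normal(size=(100, 32, 1000))
labels = rng.integers(0, 3, size=100)

# Gamma and high-gamma envelope power as features (band edges assumed).
X = np.hstack([band_power(trials, 1000, 30, 60),
               band_power(trials, 1000, 70, 140)])
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
print(cross_val_score(clf, X, labels, cv=5).mean())
```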
26
Kelly JW, Siewiorek DP, Smailagic A, Wang W. An Adaptive Filter for the Removal of Drifting Sinusoidal Noise Without a Reference. IEEE J Biomed Health Inform 2014; 20:213-21. [PMID: 25474814 DOI: 10.1109/jbhi.2014.2375318] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
This paper presents a method for filtering sinusoidal noise with a variable-bandwidth filter that is capable of tracking a sinusoid's drifting frequency. The method, which is based on the adaptive noise canceling (ANC) technique, is referred to here as the adaptive sinusoid canceler (ASC). The ASC eliminates sinusoidal contamination by tracking its frequency and achieving a narrower bandwidth than typical notch filters. The detected frequency is used to digitally generate an internal reference instead of relying on an external one, as ANC filters typically do. The filter's bandwidth adjusts to achieve faster and more accurate convergence. In this paper, the discussion and data focus on physiological signals, specifically electrocorticographic (ECoG) neural data contaminated with power line noise, but the presented technique could be applied to other recordings as well. On simulated data, the ASC reliably tracked the noise's frequency, properly adjusted its bandwidth, and outperformed comparative methods, including standard notch filters and an adaptive line enhancer. These results were reinforced by visual inspection of real ECoG data. The ASC could thus be an effective method for increasing signal-to-noise ratio in the presence of drifting sinusoidal noise, which is of significant interest for biomedical applications.
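A minimal fixed-frequency variant of this idea, a two-weight LMS canceler driven by an internally generated quadrature reference, is sketched below. The published ASC additionally tracks the drifting frequency and adapts its bandwidth, which this sketch omits:

```python
import numpy as np

def cancel_line_noise(x, fs, f0=60.0, mu=0.01):
    """Two-weight LMS canceler with an internally generated reference.

    Fixed-frequency simplification: the published ASC also tracks the
    sinusoid's drifting frequency and adapts its bandwidth.
    """
    n = np.arange(len(x))
    ref = np.stack([np.cos(2 * np.pi * f0 * n / fs),
                    np.sin(2 * np.pi * f0 * n / fs)])  # quadrature pair
    w = np.zeros(2)                                    # adaptive weights
    y = np.empty(len(x))
    for k in range(len(x)):
        est = w @ ref[:, k]        # current estimate of the interference
        e = x[k] - est             # error signal = cleaned sample
        w += 2 * mu * e * ref[:, k]
        y[k] = e
    return y

# Toy check: a 10 Hz "neural" rhythm buried under 60 Hz line noise.
fs = 1000
t = np.arange(2000) / fs
x = np.sin(2 * np.pi * 10 * t) + 2.0 * np.sin(2 * np.pi * 60 * t + 0.5)
cleaned = cancel_line_noise(x, fs)
print(f"post-convergence power ratio: {np.var(cleaned[500:]) / np.var(x):.2f}")
```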
27
Mugler EM, Patton JL, Flint RD, Wright ZA, Schuele SU, Rosenow J, Shih JJ, Krusienski DJ, Slutzky MW. Direct classification of all American English phonemes using signals from functional speech motor cortex. J Neural Eng 2014; 11:035015. [PMID: 24836588 PMCID: PMC4097188 DOI: 10.1088/1741-2560/11/3/035015] [Citation(s) in RCA: 114] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
OBJECTIVE Although brain-computer interfaces (BCIs) can be used in several different ways to restore communication, communicative BCI has not approached the rate or efficiency of natural human speech. Electrocorticography (ECoG) has precise spatiotemporal resolution that enables recording of brain activity distributed over a wide area of cortex, such as during speech production. In this study, we sought to decode elements of speech production using ECoG. APPROACH We investigated words that contain the entire set of phonemes in the general American accent using ECoG with four subjects. Using a linear classifier, we evaluated the degree to which individual phonemes within each word could be correctly identified from cortical signal. MAIN RESULTS We classified phonemes with up to 36% accuracy when classifying all phonemes and up to 63% accuracy for a single phoneme. Further, misclassified phonemes follow articulation organization described in phonology literature, aiding classification of whole words. Precise temporal alignment to phoneme onset was crucial for classification success. SIGNIFICANCE We identified specific spatiotemporal features that aid classification, which could guide future applications. Word identification was equivalent to information transfer rates as high as 3.0 bits/s (33.6 words/min), supporting pursuit of speech articulation for BCI control.
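Information transfer rate is conventionally computed with the Wolpaw formula, B = log2 N + P log2 P + (1 − P) log2[(1 − P)/(N − 1)], scaled by trial rate. A sketch with illustrative numbers (not the paper's):

```python
import math

def wolpaw_itr(n_classes, accuracy, trials_per_min):
    """Wolpaw information transfer rate in bits/s (standard BCI metric)."""
    p, n = accuracy, n_classes
    bits = math.log2(n) + p * math.log2(p)
    if p < 1.0:
        bits += (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * trials_per_min / 60.0

# Illustrative numbers only (not the paper's): a 24-word vocabulary
# decoded at 70% accuracy, 30 trials per minute.
print(f"{wolpaw_itr(24, 0.70, 30):.2f} bits/s")
```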
Affiliation(s)
- Emily M Mugler
- Bioengineering, University of Illinois at Chicago, 851 S. Morgan Street, Chicago, IL 60607, USA
28
Smith E, Duede S, Hanrahan S, Davis T, House P, Greger B. Seeing is believing: neural representations of visual stimuli in human auditory cortex correlate with illusory auditory perceptions. PLoS One 2013; 8:e73148. [PMID: 24023823 PMCID: PMC3762867 DOI: 10.1371/journal.pone.0073148] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2013] [Accepted: 07/19/2013] [Indexed: 11/18/2022] Open
Abstract
In interpersonal communication, the listener can often see as well as hear the speaker. Visual stimuli can subtly change a listener's auditory perception, as in the McGurk illusion, in which perception of a phoneme's auditory identity is changed by a concurrent video of a mouth articulating a different phoneme. Studies have yet to link visual influences on the neural representation of language with subjective language perception. Here we show that vision influences the electrophysiological representation of phonemes in human auditory cortex prior to the presentation of the auditory stimulus. We used the McGurk effect to dissociate the subjective perception of phonemes from the auditory stimuli. With this paradigm we demonstrate that neural representations in auditory cortex are more closely correlated with the visual stimuli of mouth articulation, which drive the illusory subjective auditory perception, than the actual auditory stimuli. Additionally, information about visual and auditory stimuli transfers in the caudal-rostral direction along the superior temporal gyrus during phoneme perception, as would be expected of visual information flowing from the occipital cortex into the ventral auditory processing stream. These results show that visual stimuli influence the neural representation in auditory cortex early in sensory processing and may override the subjective auditory perceptions normally generated by auditory stimuli. These findings demonstrate a marked influence of vision on the neural processing of audition in tertiary auditory cortex and suggest a mechanistic underpinning for the McGurk effect.
Affiliation(s)
- Elliot Smith
- Interdepartmental Program in Neuroscience, University of Utah, Salt Lake City, Utah, United States of America
- Department of Bioengineering, University of Utah, Salt Lake City, Utah, United States of America
- Scott Duede
- Department of Linguistics, University of Utah, Salt Lake City, Utah, United States of America
- Sara Hanrahan
- Department of Bioengineering, University of Utah, Salt Lake City, Utah, United States of America
- Tyler Davis
- Department of Bioengineering, University of Utah, Salt Lake City, Utah, United States of America
- Department of Neurosurgery, University of Utah, Salt Lake City, Utah, United States of America
- Paul House
- Department of Neurosurgery, University of Utah, Salt Lake City, Utah, United States of America
- Bradley Greger
- Interdepartmental Program in Neuroscience, University of Utah, Salt Lake City, Utah, United States of America
- Department of Bioengineering, University of Utah, Salt Lake City, Utah, United States of America
29
Rodriguez Merzagora A, Coffey TJ, Sperling MR, Sharan A, Litt B, Baltuch G, Jacobs J. Repeated stimuli elicit diminished high-gamma electrocorticographic responses. Neuroimage 2013; 85 Pt 2:844-52. [PMID: 23867555 DOI: 10.1016/j.neuroimage.2013.07.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2013] [Revised: 06/25/2013] [Accepted: 07/02/2013] [Indexed: 10/26/2022] Open
Abstract
In the phenomenon of repetition suppression (RS), when a person views a stimulus, the neural activity involved in processing that item is relatively diminished if that stimulus has been previously viewed. Previous noninvasive imaging studies mapped the prevalence of RS for different stimulus types to identify brain regions involved in representing a range of cognitive information. However, these noninvasive findings are challenging to interpret because they do not provide information on how RS relates to the brain's electrophysiological activity. We examined the electrophysiological basis of RS directly using brain recordings from implanted electrocorticographic (ECoG) electrodes in neurosurgical patients. Patients performed a memory task during ECoG recording, and we identified high-gamma signals (65-128 Hz) that distinguished the neuronal representation of specific memory items. We then compared the neural representation of each item between novel and repeated viewings. This revealed the presence of RS, in which the neuronal representation of a repeated item had a significantly decreased amplitude and duration compared with novel stimuli. Furthermore, the magnitude of RS was greatest for the stimuli that initially elicited the largest activation at each site. These results have implications for understanding the neural basis of RS and human memory by showing that individual cortical sites exhibit the largest RS for the stimuli that they most actively represent.
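The novel-versus-repeated contrast reduces to a paired test on per-item response summaries, and the stimulus-selectivity finding to a correlation between initial response strength and RS magnitude. A sketch on simulated amplitudes (effect sizes and counts are invented):

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical per-item high-gamma (65-128 Hz) response amplitudes at one
# electrode for 50 items, each viewed once as novel and once as repeated.
rng = np.random.default_rng(0)
amp_novel = rng.normal(1.0, 0.2, size=50)
# Toy RS effect whose size scales with the initial response strength.
rs_magnitude = 0.15 * amp_novel + rng.normal(0, 0.03, size=50)
amp_repeated = amp_novel - rs_magnitude

t, p = ttest_rel(amp_novel, amp_repeated)
print(f"amplitude reduction for repeats: t={t:.2f}, p={p:.3g}")

# The study also relates RS magnitude to the initial response strength:
r = np.corrcoef(amp_novel, amp_novel - amp_repeated)[0, 1]
print(f"RS magnitude vs. initial activation: r={r:.2f}")
```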
Affiliation(s)
- Anna Rodriguez Merzagora
- School of Biomedical Engineering, Science & Health Systems, Drexel University, Philadelphia, PA 19104, USA
30
Muthukumaraswamy SD. High-frequency brain activity and muscle artifacts in MEG/EEG: a review and recommendations. Front Hum Neurosci 2013; 7:138. [PMID: 23596409 PMCID: PMC3625857 DOI: 10.3389/fnhum.2013.00138] [Citation(s) in RCA: 376] [Impact Index Per Article: 34.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2013] [Accepted: 03/28/2013] [Indexed: 12/13/2022] Open
Abstract
In recent years, high-frequency brain activity in the gamma-frequency band (30-80 Hz) and above has become the focus of a growing body of work in MEG/EEG research. Unfortunately, high-frequency neural activity overlaps entirely with the spectral bandwidth of muscle activity (~20-300 Hz). It is becoming appreciated that artifacts of muscle activity may contaminate a number of non-invasive reports of high-frequency activity. In this review, the spectral, spatial, and temporal characteristics of muscle artifacts are compared with those described (so far) for high-frequency neural activity. In addition, several of the techniques that are being developed to help suppress muscle artifacts in MEG/EEG are reviewed. Suggestions are made for the collection, analysis, and presentation of experimental data with the aim of reducing the number of future publications that may contain muscle artifacts.
31
Pei X, Barbour DL, Leuthardt EC, Schalk G. Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans. J Neural Eng 2011; 8:046028. [PMID: 21750369 DOI: 10.1088/1741-2560/8/4/046028] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
Several stories in the popular media have speculated that it may be possible to infer from the brain which word a person is speaking or even thinking. While recent studies have demonstrated that brain signals can give detailed information about actual and imagined actions, such as different types of limb movements or spoken words, concrete experimental evidence for the possibility to 'read the mind', i.e., to interpret internally generated speech, has been scarce. In this study, we found that it is possible to use signals recorded from the surface of the brain (electrocorticography) to discriminate the vowels and consonants embedded in spoken and in imagined words, and we defined the cortical areas that held the most information about discrimination of vowels and consonants. The results shed light on the distinct mechanisms associated with production of vowels and consonants, and could provide the basis for brain-based communication using imagined speech.
Affiliation(s)
- Xiaomei Pei
- Brain-Computer Interface R&D Program, Wadsworth Center, New York State Department of Health, Albany, NY, USA
32
Brumberg JS, Wright EJ, Andreasen DS, Guenther FH, Kennedy PR. Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech-motor cortex. Front Neurosci 2011; 5:65. [PMID: 21629876 PMCID: PMC3096823 DOI: 10.3389/fnins.2011.00065] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2010] [Accepted: 04/24/2011] [Indexed: 11/30/2022] Open
Abstract
We conducted a neurophysiological study of attempted speech production in a paralyzed human volunteer using chronic microelectrode recordings. The volunteer suffers from locked-in syndrome, leaving him in a state of near-total paralysis, though he maintains good cognition and sensation. In this study, we investigated the feasibility of supervised classification techniques for prediction of intended phoneme production in the absence of any overt movements, including speech. Such classification or decoding ability has the potential to greatly improve the quality of life of many people who are otherwise unable to speak, by providing a direct communicative link to the general community. We examined the performance of three classifiers on a multi-class discrimination problem in which the items were 38 American English phonemes, including monophthong and diphthong vowels and consonants. The three classifiers differed in performance, but averaged between 16 and 21% overall accuracy (chance level is 1/38, or 2.6%). Further, the distribution of phonemes classified statistically above chance was non-uniform, though 20 of 38 phonemes were classified with statistical significance for all three classifiers. These preliminary results suggest supervised classification techniques are capable of performing large-scale multi-class discrimination for attempted speech production and may provide the basis for future communication prostheses.
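Whether a phoneme's classification accuracy exceeds the 1/38 chance level can be checked with a one-sided binomial test, as in this sketch with hypothetical trial counts:

```python
from scipy.stats import binomtest

# With 38 phonemes, chance is 1/38 (about 2.6%). A one-sided binomial
# test asks whether a classifier's hit count exceeds that chance level;
# the 100-trial, 18-hit session below is hypothetical.
n_trials, n_hits, chance = 100, 18, 1 / 38
result = binomtest(n_hits, n_trials, p=chance, alternative="greater")
print(f"accuracy {n_hits / n_trials:.0%} vs chance {chance:.1%}: "
      f"p = {result.pvalue:.2e}")
```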
Affiliation(s)
- Dinal S. Andreasen
- Neural Signals Inc., Duluth, GA, USA
- Georgia Tech Research Institute, Marietta, GA, USA
- Frank H. Guenther
- Department of Cognitive and Neural Systems, Boston University, Boston, MA, USA
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, MA, USA
- Division of Health Sciences and Technology, Harvard University–Massachusetts Institute of Technology, Cambridge, MA, USA
33
Leuthardt EC, Gaona C, Sharma M, Szrama N, Roland J, Freudenberg Z, Solis J, Breshears J, Schalk G. Using the electrocorticographic speech network to control a brain-computer interface in humans. J Neural Eng 2011; 8:036004. [PMID: 21471638 DOI: 10.1088/1741-2560/8/3/036004] [Citation(s) in RCA: 118] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Electrocorticography (ECoG) has emerged as a new signal platform for brain-computer interface (BCI) systems. Classically, the cortical physiology that has been commonly investigated and utilized for device control in humans has been brain signals from the sensorimotor cortex. Hence, it was unknown whether other neurophysiological substrates, such as the speech network, could be used to further improve on or complement existing motor-based control paradigms. We demonstrate here for the first time that ECoG signals associated with different overt and imagined phoneme articulation can enable invasively monitored human patients to control a one-dimensional computer cursor rapidly and accurately. This phonetic content was distinguishable within higher gamma frequency oscillations and enabled users to achieve final target accuracies between 68% and 91% within 15 min. Additionally, one of the patients achieved robust control using recordings from a microarray consisting of 1 mm spaced microwires. These findings suggest that the cortical network associated with speech could provide an additional cognitive and physiologic substrate for BCI operation and that these signals can be acquired from a cortical array that is small and minimally invasive.
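One-dimensional cursor paradigms of this kind typically map a normalized spectral feature onto cursor velocity. The toy control law below illustrates the idea; the study's actual feature definition, normalization, and gain are not specified here:

```python
import numpy as np

def cursor_step(position, power, baseline, gain=0.05):
    """Move a 1-D cursor in proportion to normalized high-gamma power.

    Toy linear control law; the study's actual features, normalization,
    and gains are not specified here.
    """
    velocity = gain * (power - baseline) / baseline
    return float(np.clip(position + velocity, -1.0, 1.0))

# Illustrative closed-loop run: power above baseline drives the cursor up.
position, baseline = 0.0, 10.0
for power in [12.0, 15.0, 9.0, 14.0]:
    position = cursor_step(position, power, baseline)
    print(f"power={power:4.1f} -> cursor={position:+.3f}")
```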
Affiliation(s)
- Eric C Leuthardt
- Department of Biomedical Engineering, Washington University in St. Louis, Campus Box 8057, 660 South Euclid, St Louis, MO 63130, USA.
34
Kellis S, Miller K, Thomson K, Brown R, House P, Greger B. Decoding spoken words using local field potentials recorded from the cortical surface. J Neural Eng 2010; 7:056007. [PMID: 20811093 DOI: 10.1088/1741-2560/7/5/056007] [Citation(s) in RCA: 171] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Pathological conditions such as amyotrophic lateral sclerosis or damage to the brainstem can leave patients severely paralyzed but fully aware, in a condition known as 'locked-in syndrome'. Communication in this state is often reduced to selecting individual letters or words by arduous residual movements. More intuitive and rapid communication may be restored by directly interfacing with language areas of the cerebral cortex. We used a grid of closely spaced, nonpenetrating micro-electrodes to record local field potentials (LFPs) from the surface of face motor cortex and Wernicke's area. From these LFPs we successfully classified a small set of words on a trial-by-trial basis at levels well above chance. We found that the pattern of electrodes with the highest accuracy changed for each word, which supports the idea that closely spaced micro-electrodes are capable of capturing neural signals from independent neural processing assemblies. These results further support using cortical surface potentials (electrocorticography) in brain-computer interfaces. They also show that LFPs recorded from the cortical surface (micro-electrocorticography) of language areas can be used to classify speech-related cortical rhythms and potentially restore communication to locked-in patients.
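Scoring each electrode separately, as done here to show that the informative sites differ from word to word, can be sketched as follows; the electrode count, feature dimensions, and LDA classifier are illustrative assumptions:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Hypothetical micro-ECoG features: 80 trials (10 words x 8 repetitions),
# 16 electrodes x 5 spectral features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 16, 5))
words = np.repeat(np.arange(10), 8)

# Score each electrode separately to map where word information resides;
# 10% is the chance level for ten words.
for ch in range(16):
    acc = cross_val_score(LinearDiscriminantAnalysis(),
                          X[:, ch, :], words, cv=4).mean()
    print(f"electrode {ch:2d}: accuracy {acc:.0%}")
```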
Affiliation(s)
- Spencer Kellis
- Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT 84112, USA
35
Slutzky MW, Jordan LR, Krieg T, Chen M, Mogul DJ, Miller LE. Optimal spacing of surface electrode arrays for brain-machine interface applications. J Neural Eng 2010; 7:26004. [PMID: 20197598 DOI: 10.1088/1741-2560/7/2/026004] [Citation(s) in RCA: 128] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
Brain-machine interfaces (BMIs) use signals recorded directly from the brain to control an external device, such as a computer cursor or a prosthetic limb. These control signals have been recorded from different levels of the brain, from field potentials at the scalp or cortical surface to single neuron action potentials. At present, the more invasive recordings have better signal quality, but also lower stability over time. Recently, subdural field potentials have been proposed as a stable, good quality source of control signals, with the potential for higher spatial and temporal bandwidth than EEG. Here we used finite element modeling in rats and humans and spatial spectral analysis in rats to compare the spatial resolution of signals recorded epidurally (outside the dura), with those recorded from subdural and scalp locations. Resolution of epidural and subdural signals was very similar in rats and somewhat less so in human models. Both were substantially better than signals recorded at the scalp. Resolution of epidural and subdural signals in humans was much more similar when the cerebrospinal fluid layer thickness was reduced. This suggests that the less invasive epidural recordings may yield signals of similar quality to subdural recordings, and hence may be more attractive as a source of control signals for BMIs.
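Spatial spectral analysis yields a spacing criterion through the Nyquist relation: if meaningful power extends up to a spatial frequency f_max (in cycles/mm), electrodes must be spaced no more than 1/(2 f_max) apart. A toy sketch under an assumed 95%-power cutoff:

```python
import numpy as np

def max_spacing_mm(profile, dx_mm, power_fraction=0.95):
    """Largest spacing that still adequately samples a spatial profile.

    Finds the spatial frequency below which `power_fraction` of the
    spectral power lies and returns the corresponding Nyquist spacing,
    1 / (2 * f_max). The 95% power criterion is an assumption.
    """
    spec = np.abs(np.fft.rfft(profile - profile.mean())) ** 2
    freqs = np.fft.rfftfreq(len(profile), d=dx_mm)   # cycles per mm
    cumulative = np.cumsum(spec) / spec.sum()
    f_max = freqs[np.searchsorted(cumulative, power_fraction)]
    return 1.0 / (2.0 * f_max)

# Illustrative profile: a smooth 0.05 cycles/mm cortical pattern plus
# noise, sampled every 1 mm at 64 sites along the cortical surface.
rng = np.random.default_rng(0)
x = np.arange(64) * 1.0
profile = np.sin(2 * np.pi * 0.05 * x) + 0.1 * rng.normal(size=64)
print(f"adequate spacing <= {max_spacing_mm(profile, dx_mm=1.0):.1f} mm")
```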
Affiliation(s)
- Marc W Slutzky
- Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA.
36
Kellis S, Miller K, Thomson K, Brown R, House P, Greger B. Classification of spoken words using surface local field potentials. Annu Int Conf IEEE Eng Med Biol Soc 2010; 2010:3827-3830. [PMID: 21097062 DOI: 10.1109/iembs.2010.5627682] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
Cortical surface potentials recorded by electrocorticography (ECoG) have enabled robust motor classification algorithms in large part because of the close proximity of the electrodes to the cortical surface. However, standard clinical ECoG electrodes are large in both diameter and spacing relative to the underlying cortical column architecture in which groups of neurons process similar types of stimuli. The potential for surface micro-electrodes closely spaced together to provide even higher fidelity in recording surface field potentials has been a topic of recent interest in the neural prosthetic community. This study describes the classification of spoken words from surface local field potentials (LFPs) recorded using grids of subdural, nonpenetrating high impedance micro-electrodes. Data recorded from these micro-ECoG electrodes supported accurate and rapid classification. Furthermore, electrodes spaced millimeters apart demonstrated varying classification characteristics, suggesting that cortical surface LFPs may be recorded with high temporal and spatial resolution to enable even more robust algorithms for motor classification.
Affiliation(s)
- Spencer Kellis
- Department of Bioengineering, University of Utah, Salt Lake City, UT 84112, USA
37
Leuthardt EC, Freudenberg Z, Bundy D, Roland J. Microscale recording from human motor cortex: implications for minimally invasive electrocorticographic brain-computer interfaces. Neurosurg Focus 2009; 27:E10. [PMID: 19569885 DOI: 10.3171/2009.4.focus0980] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
OBJECT There is a growing interest in the use of recording from the surface of the brain, known as electrocorticography (ECoG), as a practical signal platform for brain-computer interface application. The signal combines high quality with long-term stability, which may make it the ideal intermediate modality for future applications. The research paradigm for studying ECoG signals uses patients requiring invasive monitoring for seizure localization. The implanted arrays span cortex areas on the order of centimeters. Currently, it is unknown what level of motor information can be discerned from small regions of human cortex with microscale ECoG recording. METHODS In this study, a patient requiring invasive monitoring for seizure localization underwent concurrent implantation with a 16-microwire array (1-mm electrode spacing) placed over primary motor cortex. Microscale activity was recorded while the patient performed simple contra- and ipsilateral wrist movements that were monitored in parallel with electromyography. Using various statistical methods, linear and nonlinear relationships between these microcortical changes and recorded electromyography activity were defined. RESULTS Small regions of primary motor cortex (< 5 mm) carry sufficient information to separate multiple aspects of motor movements (that is, wrist flexion/extension and ipsilateral/contralateral movements). CONCLUSIONS These findings support the conclusion that small regions of cortex investigated by ECoG recording may provide sufficient information about motor intentions to support brain-computer interface operations in the future. Given the small scale of the cortical region required, the requisite implanted array would be minimally invasive in terms of surgical placement of the electrode array.
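Relating microscale cortical features to EMG can be cast as a regression problem. The sketch below uses simulated high-gamma features and a linear read-out; all dimensions and the linear model are assumptions, not the study's analysis:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical features: high-gamma power on 16 microwires over 600 time
# windows, regressed against a simultaneously recorded EMG envelope that
# is (by construction) a noisy linear function of the neural features.
rng = np.random.default_rng(0)
micro_power = rng.normal(size=(600, 16))
emg_envelope = micro_power @ rng.normal(size=16) + rng.normal(size=600)

r2 = cross_val_score(LinearRegression(), micro_power, emg_envelope,
                     cv=5, scoring="r2").mean()
print(f"cross-validated R^2, micro-ECoG -> EMG: {r2:.2f}")
```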
Affiliation(s)
- Eric C Leuthardt
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri 63110, USA.
38
Leuthardt EC, Schalk G, Roland J, Rouse A, Moran DW. Evolution of brain-computer interfaces: going beyond classic motor physiology. Neurosurg Focus 2009; 27:E4. [PMID: 19569892 PMCID: PMC2920041 DOI: 10.3171/2009.4.focus0979] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
The notion that a computer can decode brain signals to infer the intentions of a human and then enact those intentions directly through a machine is becoming a realistic technical possibility. These types of devices are known as brain-computer interfaces (BCIs). The evolution of these neuroprosthetic technologies could have significant implications for patients with motor disabilities by enhancing their ability to interact and communicate with their environment. The cortical physiology most investigated and used for device control has been brain signals from the primary motor cortex. To date, this classic motor physiology has been an effective substrate for demonstrating the potential efficacy of BCI-based control. However, emerging research now stands to further enhance our understanding of the cortical physiology underpinning human intent and provide further signals for more complex brain-derived control. In this review, the authors report the current status of BCIs and detail the emerging research trends that stand to augment clinical applications in the future.
Affiliation(s)
- Eric C Leuthardt
- Department of Biomedical Engineering, Washington University School of Medicine, St. Louis, Missouri 63110, USA.
39