1. Cao Y, Zhang Z, Qin BW, Sang W, Li H, Wang T, Tan F, Gan Y, Zhang X, Liu T, Xiang D, Lin W, Liu Q. Physical Reservoir Computing Using van der Waals Ferroelectrics for Acoustic Keyword Spotting. ACS Nano 2024; 18:23265-23276. [PMID: 39140427] [DOI: 10.1021/acsnano.4c06144]
Abstract
Acoustic keyword spotting (KWS) plays a pivotal role in voice-activated artificial intelligence (AI) systems, allowing hands-free interaction between humans and smart devices by retrieving information from voice commands. Cloud computing integrated with artificial neural networks has been employed to execute KWS tasks, but this approach suffers from propagation delay and the risk of privacy breaches. Here, we report a single-node reservoir computing (RC) system based on a CuInP2S6 (CIPS)/graphene heterostructure planar device that implements the KWS task at low computational cost. By deliberately tuning the Schottky barrier height at the ferroelectric CIPS interfaces for thermionic injection and transport of electrons, the device achieves the characteristic nonlinear current response and fading-memory behavior required for RC. Additionally, the device exhibits diverse synaptic plasticity with an excellent capability to separate temporal information. We construct an RC system that employs the ferroelectric device as the physical node to spot acoustic keywords, i.e., the spoken numbers 1 to 9, in simulation; the system demonstrates outstanding performance, with accuracy above 94.6% and recall above 92.0%. Our work establishes single-node physical RC as a prospective computing platform for processing acoustic keywords, promoting its application in artificial auditory systems at the edge.
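To make the single-node idea concrete, below is a minimal Python sketch under stated assumptions: a leaky nonlinear node (standing in for the ferroelectric device's nonlinear, fading-memory current response) maps an input sequence to a state trace, and only a linear readout is trained, as in reservoir computing. All data and parameters are synthetic illustrations, not the paper's device model.

```python
# Single-node reservoir sketch: the node's state trace is the feature vector;
# only the linear readout is trained (here, ridge regression).
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(0)

def reservoir_states(u, leak=0.3, gain=2.0):
    """Run an input sequence through a leaky nonlinear node (fading memory)."""
    x, states = 0.0, []
    for u_t in u:
        x = (1 - leak) * x + leak * np.tanh(gain * u_t + 0.5 * x)
        states.append(x)
    return np.asarray(states)

def make_trial(label):
    """Toy stand-in for a keyword: two classes with different temporal structure."""
    t = np.linspace(0, 4 * np.pi, 100)
    return np.sin(t * (1 + label)) + 0.1 * rng.standard_normal(100)

X = np.array([reservoir_states(make_trial(c)) for c in (0, 1) for _ in range(50)])
y = np.array([c for c in (0, 1) for _ in range(50)])
readout = RidgeClassifier().fit(X[::2], y[::2])      # train on alternate trials
print("held-out accuracy:", readout.score(X[1::2], y[1::2]))
```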
Affiliation(s)
- Yi Cao: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; School of Microelectronics, Fudan University, Shanghai 200433, China
- Zefeng Zhang: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; Research Institute of Intelligent Complex Systems and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- Bo-Wei Qin: Research Institute of Intelligent Complex Systems and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
- Weihui Sang: Shanghai Frontiers Science Research Base of Intelligent Optoelectronics and Perception, Institute of Optoelectronics and Department of Materials Science, Fudan University, Shanghai 200433, China
- Honghong Li: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; School of Microelectronics, Fudan University, Shanghai 200433, China
- Tinghao Wang: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; School of Microelectronics, Fudan University, Shanghai 200433, China
- Feixia Tan: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; School of Microelectronics, Fudan University, Shanghai 200433, China
- Yang Gan: Shanghai Frontiers Science Research Base of Intelligent Optoelectronics and Perception, Institute of Optoelectronics and Department of Materials Science, Fudan University, Shanghai 200433, China
- Xumeng Zhang: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai 200433, China
- Tao Liu: Shanghai Frontiers Science Research Base of Intelligent Optoelectronics and Perception, Institute of Optoelectronics and Department of Materials Science, Fudan University, Shanghai 200433, China
- Du Xiang: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai 200433, China
- Wei Lin: Research Institute of Intelligent Complex Systems and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China; School of Mathematical Sciences, SCMS, SCAM, and CCSB, Fudan University, Shanghai 200433, China
- Qi Liu: State Key Laboratory of Integrated Chips and Systems, Frontier Institute of Chip and System, Fudan University, Shanghai 200433, China; School of Microelectronics, Fudan University, Shanghai 200433, China; Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai 200433, China
2. Silva AB, Liu JR, Metzger SL, Bhaya-Grossman I, Dougherty ME, Seaton MP, Littlejohn KT, Tu-Chan A, Ganguly K, Moses DA, Chang EF. A bilingual speech neuroprosthesis driven by cortical articulatory representations shared between languages. Nat Biomed Eng 2024; 8:977-991. [PMID: 38769157] [PMCID: PMC11554235] [DOI: 10.1038/s41551-024-01207-5]
Abstract
Advancements in decoding speech from brain activity have focused on decoding a single language. Hence, the extent to which bilingual speech production relies on unique or shared cortical activity across languages has remained unclear. Here, we leveraged electrocorticography, along with deep-learning and statistical natural-language models of English and Spanish, to record and decode activity from speech-motor cortex of a Spanish-English bilingual with vocal-tract and limb paralysis into sentences in either language. This was achieved without requiring the participant to manually specify the target language. Decoding models relied on shared vocal-tract articulatory representations across languages, which allowed us to build a syllable classifier that generalized across a shared set of English and Spanish syllables. Transfer learning expedited training of the bilingual decoder by enabling neural data recorded in one language to improve decoding in the other language. Overall, our findings suggest shared cortical articulatory representations that persist after paralysis and enable the decoding of multiple languages without the need to train separate language-specific decoders.
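The transfer-learning step can be illustrated with a hedged sketch: a classifier over shared articulatory-like features is pre-trained on trials from one language and then continues learning from the other via incremental fitting. The features, class structure, and `partial_fit` warm-start below are illustrative assumptions, not the paper's actual decoder.

```python
# Cross-language transfer sketch: pre-train on language A, continue on language B.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)
n_feat, classes = 16, np.arange(10)                 # 10 shared syllable classes (toy)

def trials(n, shift):
    """Synthetic stand-in for shared articulatory features."""
    y = rng.integers(0, 10, n)
    X = rng.standard_normal((n, n_feat)) + 0.3 * y[:, None] + shift
    return X, y

X_a, y_a = trials(500, shift=0.0)                   # plentiful language-A data
X_b, y_b = trials(100, shift=0.1)                   # scarcer language-B data

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.partial_fit(X_a, y_a, classes=classes)          # pre-train on language A
clf.partial_fit(X_b, y_b)                           # transfer: refine on language B
X_te, y_te = trials(200, shift=0.1)
print("language-B accuracy after transfer:", clf.score(X_te, y_te))
```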
Affiliation(s)
- Alexander B Silva, Jessie R Liu, Sean L Metzger, Ilina Bhaya-Grossman: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
- Maximilian E Dougherty, Margaret P Seaton: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Kaylo T Littlejohn: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Adelyn Tu-Chan: Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Karunesh Ganguly: Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- David A Moses: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
3. Wu X, Wellington S, Fu Z, Zhang D. Speech decoding from stereo-electroencephalography (sEEG) signals using advanced deep learning methods. J Neural Eng 2024; 21:036055. [PMID: 38885688] [DOI: 10.1088/1741-2552/ad593a]
Abstract
Objective. Brain-computer interfaces (BCIs) are technologies that bypass damaged or disrupted neural pathways and directly decode brain signals to perform intended actions. BCIs for speech have the potential to restore communication by decoding intended speech directly. Many studies have demonstrated promising results using invasive micro-electrode arrays and electrocorticography, but the use of stereo-electroencephalography (sEEG) for speech decoding has not been fully explored. Approach. In this research, recently released sEEG data were used to decode Dutch words spoken by epileptic participants. We decoded speech waveforms from sEEG data using advanced deep-learning methods. Three methods were implemented: linear regression, a recurrent neural network (RNN)-based sequence-to-sequence model, and a transformer model. Main results. The RNN and transformer models significantly outperformed linear regression, while no significant difference was found between the two deep-learning methods. Further investigation of individual electrodes showed that the same decoding result could be obtained using only a few electrodes. Significance. This study demonstrates that decoding speech from sEEG signals is possible and that electrode location is critical to decoding performance.
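For context, the linear baseline the deep models are compared against can be sketched as a lagged ridge regression from neural features to spectrogram frames. Everything below is synthetic stand-in data; shapes and lags are arbitrary assumptions.

```python
# Linear-regression speech decoding baseline: lagged neural features -> mel frames.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
T, n_ch, n_mel, lags = 2000, 32, 23, 4
neural = rng.standard_normal((T, n_ch))              # stand-in sEEG features

def lagged(X):
    """Stack the current frame with the previous `lags - 1` frames."""
    return np.hstack([np.roll(X, k, axis=0) for k in range(lags)])

W_true = rng.standard_normal((n_ch * lags, n_mel))   # hidden linear mapping
audio = lagged(neural) @ W_true + 0.5 * rng.standard_normal((T, n_mel))

X, Y = lagged(neural)[lags:], audio[lags:]           # drop wrap-around rows
model = Ridge(alpha=1.0).fit(X[:1500], Y[:1500])
pred = model.predict(X[1500:])
corrs = [np.corrcoef(pred[:, m], Y[1500:, m])[0, 1] for m in range(n_mel)]
print("mean per-band correlation:", round(float(np.mean(corrs)), 3))
```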
Affiliation(s)
- Xiaolong Wu, Scott Wellington, Zhichun Fu, Dingguo Zhang: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
4. Metzger SL, Littlejohn KT, Silva AB, Moses DA, Seaton MP, Wang R, Dougherty ME, Liu JR, Wu P, Berger MA, Zhuravleva I, Tu-Chan A, Ganguly K, Anumanchipalli GK, Chang EF. A high-performance neuroprosthesis for speech decoding and avatar control. Nature 2023; 620:1037-1046. [PMID: 37612505] [PMCID: PMC10826467] [DOI: 10.1038/s41586-023-06443-4]
Abstract
Speech neuroprostheses have the potential to restore communication to people living with paralysis, but naturalistic speed and expressivity remain elusive. Here we use high-density surface recordings of the speech cortex in a clinical-trial participant with severe limb and vocal paralysis to achieve high-performance real-time decoding across three complementary speech-related output modalities: text, speech audio and facial-avatar animation. We trained and evaluated deep-learning models using neural data collected as the participant attempted to silently speak sentences. For text, we demonstrate accurate and rapid large-vocabulary decoding with a median rate of 78 words per minute and median word error rate of 25%. For speech audio, we demonstrate intelligible and rapid speech synthesis and personalization to the participant's pre-injury voice. For facial-avatar animation, we demonstrate the control of virtual orofacial movements for speech and non-speech communicative gestures. The decoders reached high performance with less than two weeks of training. Our findings introduce a multimodal speech-neuroprosthetic approach that has substantial promise to restore full, embodied communication to people living with severe paralysis.
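The reported word error rate is the word-level edit distance between decoded and reference sentences, normalized by reference length; a self-contained implementation of that standard metric:

```python
# Word error rate: (substitutions + insertions + deletions) / reference length.
def wer(reference: str, hypothesis: str) -> float:
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                       # deleting all reference words
    for j in range(len(h) + 1):
        d[0][j] = j                       # inserting all hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / max(len(r), 1)

print(wer("i am thirsty", "i am very thirsty"))   # one insertion / 3 words ~ 0.33
```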
Affiliation(s)
- Sean L Metzger, Alexander B Silva, Jessie R Liu: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
- Kaylo T Littlejohn, Gopala K Anumanchipalli: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- David A Moses, Ran Wang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Margaret P Seaton, Maximilian E Dougherty: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Peter Wu, Inga Zhuravleva: Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Adelyn Tu-Chan: Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Karunesh Ganguly: Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; University of California, Berkeley-University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
5. Wingfield C, Zhang C, Devereux B, Fonteneau E, Thwaites A, Liu X, Woodland P, Marslen-Wilson W, Su L. On the similarities of representations in artificial and brain neural networks for speech recognition. Front Comput Neurosci 2022; 16:1057439. [PMID: 36618270] [PMCID: PMC9811675] [DOI: 10.3389/fncom.2022.1057439]
Abstract
Introduction. In recent years, machines powered by deep learning have achieved near-human performance in speech recognition. Artificial systems and the human brain now reach similar levels of performance despite huge differences in implementation, so deep-learning models can, in principle, serve as candidate mechanistic models of the human auditory system. Methods. Using high-performance automatic speech recognition systems together with advanced non-invasive human neuroimaging (magnetoencephalography) and multivariate pattern-information analysis, the current study related machine-learned representations of speech to recorded human brain representations of the same speech. Results. In one direction, we found a quasi-hierarchical functional organization in the human auditory cortex that qualitatively matched the hidden layers of deep artificial neural networks trained as part of an automatic speech recognizer. In the reverse direction, we modified the hidden-layer organization of the artificial neural network based on neural activation patterns in human brains, substantially improving word-recognition accuracy and the learned speech representations. Discussion. We have demonstrated that artificial and brain neural networks can be mutually informative in the domain of speech recognition.
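The comparison logic here is representational similarity analysis: compute a representational dissimilarity matrix (RDM) for a network layer and for the brain recordings, then correlate them. A hedged sketch with synthetic stand-ins for the activations and MEG patterns:

```python
# RSA sketch: correlate the dissimilarity structure of a hidden layer with
# that of (stand-in) brain responses to the same stimuli.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_stimuli = 40
layer_act = rng.standard_normal((n_stimuli, 128))        # hidden-layer activations
brain_act = layer_act @ rng.standard_normal((128, 64))   # geometry-preserving map
brain_act += 2.0 * rng.standard_normal(brain_act.shape)  # measurement noise

rdm_layer = pdist(layer_act, metric="correlation")       # condensed RDMs
rdm_brain = pdist(brain_act, metric="correlation")
rho, p = spearmanr(rdm_layer, rdm_brain)
print(f"layer-brain RDM similarity: rho={rho:.2f}, p={p:.1e}")
```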
Affiliation(s)
- Cai Wingfield: Department of Psychology, Lancaster University, Lancaster, United Kingdom
- Chao Zhang, Phil Woodland: Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Barry Devereux: School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast, United Kingdom
- Elisabeth Fonteneau: Department of Psychology, University Paul Valéry Montpellier, Montpellier, France
- Andrew Thwaites: Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Xunying Liu: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Li Su (corresponding author): Department of Neuroscience, Neuroscience Institute, Insigneo Institute for in silico Medicine, University of Sheffield, Sheffield, United Kingdom; Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
6. Verwoert M, Ottenhoff MC, Goulis S, Colon AJ, Wagner L, Tousseyn S, van Dijk JP, Kubben PL, Herff C. Dataset of Speech Production in Intracranial Electroencephalography. Sci Data 2022; 9:434. [PMID: 35869138] [PMCID: PMC9307753] [DOI: 10.1038/s41597-022-01542-9]
Abstract
Speech production is an intricate process involving a large number of muscles and cognitive processes. The neural processes underlying speech production are not completely understood. As speech is a uniquely human ability, it cannot be investigated in animal models. High-fidelity human data can only be obtained in clinical settings and are therefore not easily available to all researchers. Here, we provide a dataset of 10 participants reading out individual words while we measured intracranial EEG from a total of 1103 electrodes. The data, with their high temporal resolution and coverage of a large variety of cortical and sub-cortical brain regions, can help in better understanding the speech production process. The data can also be used to test speech decoding and synthesis approaches from neural data in order to develop speech brain-computer interfaces and speech neuroprostheses.
Measurement(s): Brain activity
Technology Type(s): Stereotactic electroencephalography
Sample Characteristic - Organism: Homo sapiens
Sample Characteristic - Environment: Epilepsy monitoring center
Sample Characteristic - Location: The Netherlands
7. Metzger SL, Liu JR, Moses DA, Dougherty ME, Seaton MP, Littlejohn KT, Chartier J, Anumanchipalli GK, Tu-Chan A, Ganguly K, Chang EF. Generalizable spelling using a speech neuroprosthesis in an individual with severe limb and vocal paralysis. Nat Commun 2022; 13:6510. [PMID: 36347863] [PMCID: PMC9643551] [DOI: 10.1038/s41467-022-33611-3]
Abstract
Neuroprostheses have the potential to restore communication to people who cannot speak or type due to paralysis. However, it is unclear if silent attempts to speak can be used to control a communication neuroprosthesis. Here, we translated direct cortical signals in a clinical-trial participant (ClinicalTrials.gov; NCT03698149) with severe limb and vocal-tract paralysis into single letters to spell out full sentences in real time. We used deep-learning and language-modeling techniques to decode letter sequences as the participant attempted to silently spell using code words that represented the 26 English letters (e.g. "alpha" for "a"). We leveraged broad electrode coverage beyond speech-motor cortex to include supplemental control signals from hand cortex and complementary information from low- and high-frequency signal components to improve decoding accuracy. We decoded sentences using words from a 1,152-word vocabulary at a median character error rate of 6.13% and speed of 29.4 characters per minute. In offline simulations, we showed that our approach generalized to large vocabularies containing over 9,000 words (median character error rate of 8.23%). These results illustrate the clinical viability of a silently controlled speech neuroprosthesis to generate sentences from a large vocabulary through a spelling-based approach, complementing previous demonstrations of direct full-word decoding.
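The vocabulary-constrained decoding step can be sketched as scoring each candidate word against per-position letter probabilities from the neural classifier. The probabilities below are simulated, and the four-word vocabulary is a toy stand-in for the paper's 1,152-word set:

```python
# Spelling decoder sketch: pick the vocabulary word with the highest total
# log-probability under the per-position letter classifier outputs.
import numpy as np

rng = np.random.default_rng(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
vocab = ["hello", "world", "water", "happy"]        # toy stand-in vocabulary

def simulate_letter_probs(word, acc=0.7):
    """Noisy probability vectors peaked at the true letters."""
    probs = np.full((len(word), 26), (1 - acc) / 25)
    for t, ch in enumerate(word):
        probs[t, alphabet.index(ch)] = acc
    return probs

def decode(probs, vocab):
    cands = [w for w in vocab if len(w) == probs.shape[0]]
    score = lambda w: sum(np.log(probs[t, alphabet.index(c)])
                          for t, c in enumerate(w))
    return max(cands, key=score)

print(decode(simulate_letter_probs("hello"), vocab))   # -> 'hello'
```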
Affiliation(s)
- Sean L. Metzger, Jessie R. Liu, Edward F. Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
- David A. Moses, Josh Chartier: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Maximilian E. Dougherty, Margaret P. Seaton: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Kaylo T. Littlejohn, Gopala K. Anumanchipalli: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Adelyn Tu-Chan: Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Karunesh Ganguly: Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
8. Cooney C, Folli R, Coyle D. Opportunities, pitfalls and trade-offs in designing protocols for measuring the neural correlates of speech. Neurosci Biobehav Rev 2022; 140:104783. [PMID: 35907491] [DOI: 10.1016/j.neubiorev.2022.104783]
Abstract
Research on decoding speech and speech-related processes directly from the human brain has intensified in recent years, as such a decoder has the potential to positively impact people with limited communication capacity due to disease or injury. It can also enable entirely new forms of human-computer interaction and human-machine communication, and facilitate better neuroscientific understanding of speech processes. Here, we synthesize the literature on how neural speech decoding experiments have been conducted, coalescing around the necessity for thoughtful experimental design aimed at specific research goals and for robust procedures for evaluating speech decoding paradigms. We examine the use of different modalities for presenting stimuli to participants, methods for constructing paradigms including timings and speech rhythms, and possible linguistic considerations. In addition, we present novel methods for eliciting naturalistic speech and for validating imagined-speech task performance in experimental settings, based on recent research. We also describe the multitude of terms used to instruct participants on how to produce imagined speech during experiments and propose methods for investigating the effect of these terms on imagined speech decoding. We demonstrate that the range of experimental procedures used in neural speech decoding studies can have unintended consequences that affect the validity of the knowledge obtained. The review delineates the strengths and weaknesses of present approaches and proposes methodological advances that we anticipate will enhance experimental design and progress toward optimally designed, movement-independent, direct-speech brain-computer interfaces.
Affiliation(s)
- Ciaran Cooney: Intelligent Systems Research Centre, Ulster University, Derry, UK
- Raffaella Folli: Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
- Damien Coyle: Intelligent Systems Research Centre, Ulster University, Derry, UK
9. Luo S, Rabbani Q, Crone NE. Brain-Computer Interface: Applications to Speech Decoding and Synthesis to Augment Communication. Neurotherapeutics 2022; 19:263-273. [PMID: 35099768] [PMCID: PMC9130409] [DOI: 10.1007/s13311-022-01190-2]
Abstract
Damage or degeneration of motor pathways necessary for speech and other movements, as in brainstem strokes or amyotrophic lateral sclerosis (ALS), can interfere with efficient communication without affecting the brain structures responsible for language or cognition. In the worst case, this results in locked-in syndrome (LIS), a condition in which individuals cannot initiate communication and can only express themselves by answering yes/no questions with eye blinks or other rudimentary movements. Existing augmentative and alternative communication (AAC) devices that rely on eye tracking can improve quality of life for people with this condition, but brain-computer interfaces (BCIs) are also increasingly being investigated as AAC devices, particularly when eye tracking is too slow or unreliable. Moreover, with recent and ongoing advances in machine learning and neural recording technologies, BCIs may offer the only means to go beyond cursor control and text generation on a computer and allow real-time synthesis of speech, arguably the most efficient and expressive channel for communication. The potential for BCI speech synthesis has only recently been realized because of seminal studies of the neuroanatomical and neurophysiological underpinnings of speech production using intracranial electrocorticographic (ECoG) recordings in patients undergoing epilepsy surgery. These studies have shown that the cortical areas responsible for vocalization and articulation are distributed over a large area of ventral sensorimotor cortex, and that it is possible to decode speech and reconstruct its acoustics from ECoG if these areas are recorded with sufficiently dense and comprehensive electrode arrays. In this article, we review these advances, including the latest neural decoding strategies ranging from deep-learning models to the direct concatenation of speech units. We also discuss state-of-the-art vocoders that are integral to constructing natural-sounding audio waveforms for speech BCIs. Finally, this review outlines some of the challenges ahead in directly synthesizing speech for patients with LIS.
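Where the review discusses vocoders, the core operation is inverting a predicted spectral representation back to a waveform. A hedged sketch using librosa's Griffin-Lim-based mel inversion (the tone and the parameters are placeholders, not a decoder output):

```python
# Vocoder sketch: mel spectrogram -> waveform via Griffin-Lim phase estimation.
import librosa

sr = 16000
y = librosa.tone(220, sr=sr, length=sr)                      # stand-in audio
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=80)  # "decoded" features
wave = librosa.feature.inverse.mel_to_audio(mel, sr=sr, n_iter=32)
print(wave.shape, "samples reconstructed")
```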
Affiliation(s)
- Shiyu Luo: Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Qinwan Rabbani: Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, USA
- Nathan E Crone: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
10. Cooney C, Folli R, Coyle D. A bimodal deep learning architecture for EEG-fNIRS decoding of overt and imagined speech. IEEE Trans Biomed Eng 2021; 69:1983-1994. [PMID: 34874850] [DOI: 10.1109/tbme.2021.3132861]
Abstract
OBJECTIVE Brain-computer interface (BCI) studies increasingly leverage different attributes of multiple signal modalities simultaneously. Bimodal data acquisition protocols combining the temporal resolution of electroencephalography (EEG) with the spatial resolution of functional near-infrared spectroscopy (fNIRS) require novel approaches to decoding. METHODS We present an EEG-fNIRS hybrid BCI that employs a new bimodal deep neural network architecture consisting of two convolutional sub-networks (subnets) to decode overt and imagined speech. Features from each subnet are fused before further feature extraction and classification. Nineteen participants performed overt and imagined speech in a novel cue-based paradigm enabling investigation of stimulus and linguistic effects on decoding. RESULTS Using the hybrid approach, classification accuracies (46.31% for overt and 34.29% for imagined speech; chance: 25%) indicated a significant improvement over EEG used independently for imagined speech (p=0.020), while tending towards significance for overt speech (p=0.098). In comparison with fNIRS, significant improvements for both speech types were achieved with bimodal decoding (p<0.001). There was a mean difference of ~12.02% between overt and imagined speech, with accuracies as high as 87.18% and 53%, respectively. Deeper subnets enhanced performance, and stimulus type affected overt and imagined speech in significantly different ways. CONCLUSION The bimodal approach was a significant improvement on unimodal results for several tasks. These results indicate the potential of multimodal deep learning for enhancing neural signal decoding. SIGNIFICANCE This novel architecture can be used to enhance speech decoding from bimodal neural signals.
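A hedged PyTorch sketch of the two-subnet idea: one convolutional branch per modality, feature maps concatenated before a shared classification head. Channel counts, kernel sizes, and input shapes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class BimodalNet(nn.Module):
    """Two convolutional subnets (EEG, fNIRS) fused before classification."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.eeg_subnet = nn.Sequential(        # fast temporal dynamics
            nn.Conv1d(64, 16, kernel_size=7), nn.ReLU(), nn.AdaptiveAvgPool1d(8))
        self.fnirs_subnet = nn.Sequential(      # slow hemodynamic dynamics
            nn.Conv1d(16, 8, kernel_size=5), nn.ReLU(), nn.AdaptiveAvgPool1d(8))
        self.head = nn.Sequential(nn.Flatten(),
                                  nn.Linear((16 + 8) * 8, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))

    def forward(self, eeg, fnirs):
        fused = torch.cat([self.eeg_subnet(eeg),        # (B, 16, 8)
                           self.fnirs_subnet(fnirs)],   # (B, 8, 8)
                          dim=1)                        # fuse feature maps
        return self.head(fused)

net = BimodalNet()
print(net(torch.randn(2, 64, 200), torch.randn(2, 16, 50)).shape)  # (2, 4)
```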
11. Angrick M, et al. Real-time synthesis of imagined speech processes from minimally invasive recordings of neural activity. Commun Biol 2021; 4:1055. [PMID: 34556793] [PMCID: PMC8460739] [DOI: 10.1038/s42003-021-02578-0]
Abstract
Speech neuroprosthetics aim to provide a natural communication channel to individuals who are unable to speak due to physical or neurological impairments. Real-time synthesis of acoustic speech directly from measured neural activity could enable natural conversations and notably improve quality of life, particularly for individuals with severely limited means of communication. Recent advances in decoding approaches have led to high-quality reconstructions of acoustic speech from invasively measured neural activity. However, most prior research utilizes data collected during open-loop experiments of articulated speech, which might not directly translate to imagined speech processes. Here, we present an approach that synthesizes audible speech in real time for both imagined and whispered speech conditions. Using a participant implanted with stereotactic depth electrodes, we were able to reliably generate audible speech in real time. The decoding models rely predominantly on frontal activity, suggesting that speech processes have similar representations when vocalized, whispered, or imagined. While the reconstructed audio is not yet intelligible, our real-time synthesis approach represents an essential step towards investigating how patients will learn to operate a closed-loop speech neuroprosthesis based on imagined speech.
Editor's summary: Miguel Angrick et al. develop an intracranial EEG-based method to decode imagined speech from a human patient and translate it into audible speech in real time. This report presents an important proof of concept that acoustic output can be reconstructed on the basis of neural signals, and serves as a valuable step in the development of neuroprostheses to help nonverbal patients interact with their environment.
12. Moses DA, Metzger SL, Liu JR, Anumanchipalli GK, Makin JG, Sun PF, Chartier J, Dougherty ME, Liu PM, Abrams GM, Tu-Chan A, Ganguly K, Chang EF. Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria. N Engl J Med 2021; 385:217-227. [PMID: 34260835] [PMCID: PMC8972947] [DOI: 10.1056/nejmoa2027540]
Abstract
BACKGROUND Technology to restore the ability to communicate in paralyzed persons who cannot speak has the potential to improve autonomy and quality of life. An approach that decodes words and sentences directly from the cerebral cortical activity of such patients may represent an advancement over existing methods for assisted communication. METHODS We implanted a subdural, high-density, multielectrode array over the area of the sensorimotor cortex that controls speech in a person with anarthria (the loss of the ability to articulate speech) and spastic quadriparesis caused by a brain-stem stroke. Over the course of 48 sessions, we recorded 22 hours of cortical activity while the participant attempted to say individual words from a vocabulary set of 50 words. We used deep-learning algorithms to create computational models for the detection and classification of words from patterns in the recorded cortical activity. We applied these computational models, as well as a natural-language model that yielded next-word probabilities given the preceding words in a sequence, to decode full sentences as the participant attempted to say them. RESULTS We decoded sentences from the participant's cortical activity in real time at a median rate of 15.2 words per minute, with a median word error rate of 25.6%. In post hoc analyses, we detected 98% of the attempts by the participant to produce individual words, and we classified words with 47.1% accuracy using cortical signals that were stable throughout the 81-week study period. CONCLUSIONS In a person with anarthria and spastic quadriparesis caused by a brain-stem stroke, words and sentences were decoded directly from cortical activity during attempted speech with the use of deep-learning models and a natural-language model. (Funded by Facebook and others; ClinicalTrials.gov number, NCT03698149.).
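The combination of per-word neural probabilities with a next-word language model can be sketched as a Viterbi search over the vocabulary. The four-word vocabulary and bigram table below are toy assumptions standing in for the paper's 50-word set and natural-language model:

```python
# Viterbi sketch: neural word likelihoods + bigram LM -> most probable sentence.
import numpy as np

vocab = ["i", "am", "thirsty", "good"]
lm = np.array([[0.05, 0.80, 0.05, 0.10],     # toy P(next | previous)
               [0.05, 0.05, 0.50, 0.40],
               [0.40, 0.20, 0.20, 0.20],
               [0.40, 0.20, 0.20, 0.20]])

def viterbi(likelihoods, lm):
    """likelihoods: (T, V) classifier probabilities per attempted word."""
    T, V = likelihoods.shape
    score = np.log(likelihoods[0] / V + 1e-12)          # uniform start prior
    back = np.zeros((T, V), dtype=int)
    for t in range(1, T):
        trans = score[:, None] + np.log(lm + 1e-12)     # (previous, next)
        back[t] = trans.argmax(axis=0)
        score = trans.max(axis=0) + np.log(likelihoods[t] + 1e-12)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    return [vocab[i] for i in reversed(path)]

probs = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.2, 0.5, 0.2, 0.1],
                  [0.1, 0.2, 0.4, 0.3]])
print(viterbi(probs, lm))                    # -> ['i', 'am', 'thirsty']
```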
Affiliation(s)
- David A Moses, Sean L Metzger, Jessie R Liu, Gopala K Anumanchipalli, Joseph G Makin, Pengfei F Sun, Josh Chartier, Maximilian E Dougherty, Patricia M Liu, Gary M Abrams, Adelyn Tu-Chan, Karunesh Ganguly, Edward F Chang: Department of Neurological Surgery (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., M.E.D., E.F.C.), Weill Institute for Neuroscience (D.A.M., S.L.M., J.R.L., G.K.A., J.G.M., P.F.S., J.C., K.G., E.F.C.), and Departments of Rehabilitation Services (P.M.L.) and Neurology (G.M.A., A.T.-C., K.G.), University of California, San Francisco (UCSF), San Francisco; Graduate Program in Bioengineering, University of California, Berkeley-UCSF, Berkeley (S.L.M., J.R.L., E.F.C.)
13. Pei L, Ouyang G. Online recognition of handwritten characters from scalp-recorded brain activities during handwriting. J Neural Eng 2021; 18. [PMID: 34036941] [DOI: 10.1088/1741-2552/ac01a0]
Abstract
Objective. Brain-computer interfaces aim to build efficient communication with the world using neural signals, which may bring great benefits to human society, especially to people with physical impairments. To date, the ability to translate brain signals into effective communication remains low. This work explores whether the handwriting process could serve as a potential high-performance interface. To this end, we first examined how much scalp-recorded brain signals encode information related to handwriting, and whether it is feasible to precisely retrieve the handwritten content solely from scalp-recorded electrical data. Approach. Five participants were instructed to write the sentence 'HELLO, WORLD!' repeatedly on a tablet while their brain signals were simultaneously recorded by electroencephalography (EEG). The EEG signals were first decomposed by independent component analysis to extract features, which were used to train a convolutional neural network (CNN) to recognize the written symbols. Main results. The accuracy of the CNN-based classifier trained and applied on the same participant (with training and test data separated) ranged from 76.8% to 97.0%. The accuracy of cross-participant application was more variable, ranging from 14.7% to 58.7%. These results show the possibility of recognizing handwritten content directly from scalp-level brain signals. A demonstration of the recognition system in an online mode is presented. The recognition was grounded in the close association between the rich dynamics of EEG source activities and the kinematic information of the handwriting movements. Significance. This work reveals an explicit and precise mapping between scalp-level electrophysiological signals and the linguistic information conveyed by handwriting, providing a novel approach to developing brain-computer interfaces that focus on semantic communication.
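The pipeline's shape (ICA decomposition for features, then a trained classifier) can be sketched as follows; a linear classifier stands in for the paper's CNN, and the data are synthetic stand-ins rather than real EEG:

```python
# ICA features -> classifier sketch for handwritten-symbol recognition.
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trials, n_ch, n_times, n_symbols = 300, 32, 64, 8
y = rng.integers(0, n_symbols, n_trials)
templates = rng.standard_normal((n_symbols, n_ch * n_times))
X = templates[y] + 3.0 * rng.standard_normal((n_trials, n_ch * n_times))

ica = FastICA(n_components=20, random_state=0)
feats = ica.fit_transform(X)                  # per-trial component activations
Xtr, Xte, ytr, yte = train_test_split(feats, y, random_state=0)
clf = LogisticRegression(max_iter=2000).fit(Xtr, ytr)
print("symbol accuracy:", clf.score(Xte, yte))
```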
Affiliation(s)
- Leisi Pei, Guang Ouyang: Faculty of Education, The University of Hong Kong, Pokfulam, Hong Kong SAR, People's Republic of China
14. Cooney C, Korik A, Folli R, Coyle D. Evaluation of Hyperparameter Optimization in Machine and Deep Learning Methods for Decoding Imagined Speech EEG. Sensors 2020; 20:4629. [PMID: 32824559] [PMCID: PMC7472624] [DOI: 10.3390/s20164629]
Abstract
Classification of electroencephalography (EEG) signals corresponding to imagined speech production is important for the development of a direct-speech brain-computer interface (DS-BCI). Deep learning (DL) has been utilized with great success across several domains, but it remains an open question whether DL methods provide significant advances over traditional machine learning (ML) approaches for classification of imagined speech. Furthermore, hyperparameter (HP) optimization has been neglected in DL-EEG studies, leaving the significance of its effects uncertain. In this study, we aim to improve classification of imagined-speech EEG by employing DL methods while also statistically evaluating the impact of HP optimization on classifier performance. We trained three distinct convolutional neural networks (CNNs) on imagined-speech EEG using a nested cross-validation approach to HP optimization. Each of the CNNs evaluated was designed specifically for EEG decoding. An imagined-speech EEG dataset consisting of both words and vowels facilitated training on each set independently. CNN results were compared with three benchmark ML methods: support vector machine, random forest, and regularized linear discriminant analysis. Intra- and inter-subject methods of HP optimization were tested, and the effects of HPs were statistically analyzed. Accuracies obtained by the CNNs were significantly greater than those of the benchmark methods when trained on both datasets (words: 24.97%, p < 1 × 10-7, chance: 16.67%; vowels: 30.00%, p < 1 × 10-7, chance: 20%). The effects of varying HP values and the interactions between HPs and CNN architectures were both statistically significant. These results demonstrate how critical HP optimization is when training CNNs to decode imagined speech.
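Nested cross-validation, the HP-optimization procedure named here, keeps the outer test folds untouched by the inner hyperparameter search. A minimal sklearn sketch (an SVM on random stand-in data takes the place of the study's CNNs):

```python
# Nested CV: inner GridSearchCV selects hyperparameters, outer CV estimates
# generalization without leaking test data into the search.
import numpy as np
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 30))                  # trials x features (toy)
y = rng.integers(0, 6, 120)                         # six classes, chance ~16.7%

inner = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}, cv=3)
scores = cross_val_score(inner, X, y, cv=5)         # nested cross-validation
print("nested-CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```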
Affiliation(s)
- Ciaran Cooney (corresponding author), Attila Korik, Damien Coyle: Intelligent Systems Research Centre, Ulster University, Londonderry BT48 7JL, UK
- Raffaella Folli: Institute for Research in Social Sciences, Ulster University, Jordanstown BT37 0QB, UK
15. Delgado Saa J, Christen A, Martin S, Pasley BN, Knight RT, Giraud AL. Using Coherence-based spectro-spatial filters for stimulus features prediction from electro-corticographic recordings. Sci Rep 2020; 10:7637. [PMID: 32376909] [PMCID: PMC7203138] [DOI: 10.1038/s41598-020-63303-1]
Abstract
The traditional approach in neuroscience relies on encoding models in which brain responses are related to different stimuli in order to establish dependencies. In decoding tasks, by contrast, brain responses are used to predict the stimuli, and traditionally the signals are assumed to be stationary within trials, which is rarely the case for natural stimuli. We hypothesize that a decoding model that treats each experimental trial as a realization of a random process reflects the statistical properties of the underlying process more faithfully than the stationarity assumption. Here, we propose a coherence-based spectro-spatial filter that reconstructs stimulus features from features of the brain signals. The proposed method extracts common patterns between features of the brain signals and the stimuli that produced them. These patterns, originating from different recording electrodes, are combined into a spatial filter that produces a unified prediction of the presented stimulus. This approach takes into account the frequency, phase, and spatial distribution of brain features, avoiding the need to manually predefine specific frequency bands of interest or phase relationships between stimulus and brain responses. Furthermore, the model does not require hyperparameter tuning, significantly reducing its computational load. Using three different cognitive tasks (motor movements, speech perception, and speech production), we show that the proposed method consistently improves stimulus feature predictions in terms of correlation (group averages of 0.74 for motor movements, 0.84 for speech perception, and 0.74 for speech production) in comparison with other methods based on regularized multivariate regression, probabilistic graphical models, and artificial neural networks. Furthermore, the model parameters revealed the anatomical regions and spectral components that were discriminant in the different cognitive tasks. This novel method not only provides a useful tool to address fundamental neuroscience questions but could also be applied to neuroprosthetics.
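The central idea, weighting electrodes by their spectral coherence with the stimulus feature before combining them spatially, can be sketched as below. The per-frequency filters of the actual method are collapsed here into a single coherence-averaged weight per electrode, and the data are synthetic:

```python
# Coherence-weighted spatial combination of electrodes.
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(0)
fs, T, n_elec = 200, 4000, 8
stim = rng.standard_normal(T)                        # stimulus feature (e.g. envelope)
coupling = rng.uniform(0.3, 1.2, n_elec)             # per-electrode coupling strength
brain = np.outer(coupling, stim) + rng.standard_normal((n_elec, T))

weights = np.array([coherence(stim, ch, fs=fs, nperseg=256)[1].mean()
                    for ch in brain])                # mean coherence per electrode
weights /= weights.sum()
pred = weights @ brain                               # unified stimulus prediction
print("correlation with stimulus:", round(float(np.corrcoef(pred, stim)[0, 1]), 2))
```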
Affiliation(s)
- Jaime Delgado Saa: Auditory Language Group, University of Geneva, Geneva, Switzerland; BSPAI Lab, Universidad del Norte, Barranquilla, Colombia
- Andy Christen, Stephanie Martin, Anne-Lise Giraud: Auditory Language Group, University of Geneva, Geneva, Switzerland
- Brian N Pasley, Robert T Knight: Knight Lab, University of California at Berkeley, Berkeley, USA
16. Farrokhi B, Erfanian A. A state-based probabilistic method for decoding hand position during movement from ECoG signals in non-human primate. J Neural Eng 2020; 17:026042. [PMID: 32224511] [DOI: 10.1088/1741-2552/ab848b]
Abstract
OBJECTIVE In this study, we propose a state-based probabilistic method for decoding hand positions during unilateral and bilateral movements using ECoG signals recorded from the brain of a Rhesus monkey. APPROACH A customized electrode array was implanted subdurally in the right hemisphere, covering the cortex from the primary motor cortex to the frontal cortex. Three experimental paradigms were considered: ipsilateral, contralateral, and bilateral movements. During unilateral movement, the monkey was trained to get food with one hand, while during bilateral movement, the monkey used its left and right hands alternately to get food. To estimate the hand positions, a state-based probabilistic method was introduced, based on the conditional probability of the hand movement state (i.e., idle, right-hand movement, and left-hand movement) and the conditional expectation of the hand position for each state. Moreover, a hybrid feature extraction method based on linear discriminant analysis and partial least squares (PLS) was introduced. MAIN RESULTS The proposed method successfully decoded hand positions during ipsilateral, contralateral, and bilateral movements and significantly improved decoding performance compared with the conventional Kalman and PLS regression methods. The proposed hybrid feature extraction method outperformed both the PLS and PCA methods. Investigating the kinematic information of each frequency band showed that the more informative bands were β (15-30 Hz) and low γ (50-100 Hz) for ipsilateral movements and low γ and high γ (100-200 Hz) for contralateral movements. Ipsilateral movement was decoded better than contralateral movement in the μ (5-15 Hz) and β bands, while contralateral movement was decoded better in the γ (30-200 Hz) and hfECoG (200-400 Hz) bands. SIGNIFICANCE Accurately decoding bilateral movement from ECoG recorded in one brain hemisphere is an important step toward real-life applications of brain-machine interface technologies.
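The state-based decomposition can be sketched as a state classifier plus per-state regressors, combined as a conditional expectation over states. Features, labels, and models below are synthetic stand-ins for the ECoG pipeline:

```python
# State-based decoding sketch: E[pos | x] = sum_s P(s | x) * E[pos | s, x].
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)
n, d = 900, 20
state = rng.integers(0, 3, n)                   # 0 idle, 1 right hand, 2 left hand
mu = rng.standard_normal((3, d))                # state-dependent mean activity
X = mu[state] + rng.standard_normal((n, d))     # stand-in ECoG features
W = rng.standard_normal((3, d))                 # state-dependent position mapping
pos = np.einsum("nd,nd->n", X, W[state]) * (state > 0)   # idle => position 0

state_clf = LogisticRegression(max_iter=1000).fit(X, state)
regs = {s: Ridge().fit(X[state == s], pos[state == s]) for s in (1, 2)}

def decode(x):
    p = state_clf.predict_proba(x[None])[0]     # P(state | features)
    exp = {0: 0.0, 1: float(regs[1].predict(x[None])[0]),
           2: float(regs[2].predict(x[None])[0])}
    return sum(p[s] * exp[s] for s in range(3))

print("decoded position for one sample:", round(decode(X[0]), 3))
```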
Affiliation(s)
- Behraz Farrokhi: Department of Biomedical Engineering, School of Electrical Engineering, Iran University of Science and Technology (IUST), Iran Neural Technology Research Centre, Tehran, Iran
17
Heelan C, Lee J, O’Shea R, Lynch L, Brandman DM, Truccolo W, Nurmikko AV. Decoding speech from spike-based neural population recordings in secondary auditory cortex of non-human primates. Commun Biol 2019; 2:466. [PMID: 31840111 PMCID: PMC6906475 DOI: 10.1038/s42003-019-0707-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/07/2019] [Accepted: 11/15/2019] [Indexed: 11/21/2022] Open
Abstract
Direct electronic communication with sensory areas of the neocortex is a challenging ambition for brain-computer interfaces. Here, we report the first successful neural decoding of English words with high intelligibility from intracortical spike-based neural population activity recorded in the secondary auditory cortex of macaques. We acquired 96-channel full-broadband population recordings using intracortical microelectrode arrays in the rostral and caudal parabelt regions of the superior temporal gyrus (STG). We leveraged a new neural processing toolkit to investigate the effects of decoding algorithm, neural preprocessing, audio representation, channel count, and array location on neural decoding performance. The presented spike-based machine learning approach may further be useful in informing future encoding strategies that deliver direct auditory percepts to the brain as specific patterns of microstimulation.
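As a rough illustration of spike-based decoding of an audio feature, the sketch below regresses time-lagged binned spike counts onto a simulated audio signal with ridge regression. The study itself compared many decoder designs via a dedicated toolkit; this toy pipeline conveys only the general setup, and all data are synthetic.

```python
# Toy version of spike-based audio decoding: regress time-lagged binned
# spike counts onto an audio feature. Purely illustrative; shapes and
# parameters are assumptions, not those of the published study.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
T, n_units, n_lags = 1000, 96, 10
audio = rng.standard_normal(T)               # stand-in for a spectrogram band
rates = np.clip(0.5 + np.outer(audio, rng.standard_normal(n_units)), 0, None)
spikes = rng.poisson(rates)                  # binned spike counts (T x 96)

# Lagged design matrix: each audio sample is predicted from the
# preceding n_lags bins of population activity.
Xlag = np.hstack([np.roll(spikes, k, axis=0) for k in range(n_lags)])
model = Ridge(alpha=10.0).fit(Xlag[n_lags:], audio[n_lags:])
print("fit r =", np.corrcoef(model.predict(Xlag[n_lags:]), audio[n_lags:])[0, 1])
```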
Affiliation(s)
- Christopher Heelan: School of Engineering, Brown University, Providence, RI, USA; Connexon Systems, Providence, RI, USA
- Jihun Lee: School of Engineering, Brown University, Providence, RI, USA
- Ronan O’Shea: School of Engineering, Brown University, Providence, RI, USA
- Laurie Lynch: School of Engineering, Brown University, Providence, RI, USA
- David M. Brandman: Department of Surgery (Neurosurgery), Dalhousie University, Halifax, Nova Scotia, Canada
- Wilson Truccolo: Department of Neuroscience, Brown University, Providence, RI, USA; Carney Institute for Brain Science, Brown University, Providence, RI, USA
- Arto V. Nurmikko: School of Engineering, Brown University, Providence, RI, USA; Carney Institute for Brain Science, Brown University, Providence, RI, USA
18
Herff C, Diener L, Angrick M, Mugler E, Tate MC, Goldrick MA, Krusienski DJ, Slutzky MW, Schultz T. Generating Natural, Intelligible Speech From Brain Activity in Motor, Premotor, and Inferior Frontal Cortices. Front Neurosci 2019; 13:1267. [PMID: 31824257 PMCID: PMC6882773 DOI: 10.3389/fnins.2019.01267] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 05/03/2019] [Accepted: 11/07/2019] [Indexed: 12/17/2022] Open
Abstract
Neural interfaces that directly produce intelligible speech from brain activity would allow people with severe impairment from neurological disorders to communicate more naturally. Here, we record neural population activity in motor, premotor and inferior frontal cortices during speech production using electrocorticography (ECoG) and show that ECoG signals alone can be used to generate intelligible speech output that can preserve conversational cues. To produce speech directly from neural data, we adapted a method from the field of speech synthesis called unit selection, in which units of speech are concatenated to form audible output. In our approach, which we call Brain-To-Speech, we chose subsequent units of speech based on the measured ECoG activity to generate audio waveforms directly from the neural recordings. Brain-To-Speech employed the user's own voice to generate speech that sounded very natural and included features such as prosody and accentuation. By investigating the brain areas involved in speech production separately, we found that speech motor cortex provided more information for the reconstruction process than the other cortical areas.
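The unit-selection principle, retrieving the stored speech unit whose associated neural activity best matches the current ECoG frame and concatenating the retrieved units, can be sketched with a nearest-neighbour lookup. The published system additionally used target and concatenation costs; the version below is a deliberately minimal approximation with invented data shapes.

```python
# Minimal nearest-neighbour sketch of the unit-selection idea: each ECoG
# feature frame retrieves the training speech unit with the most similar
# neural signature, and the retrieved units are concatenated.
# Hypothetical data; not the published Brain-To-Speech system.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
n_train, n_test, d_neural, unit_len = 500, 20, 32, 160
train_ecog = rng.standard_normal((n_train, d_neural))   # neural features
train_units = rng.standard_normal((n_train, unit_len))  # paired audio snippets

# Index the training ECoG frames; at test time, look up the best unit.
index = NearestNeighbors(n_neighbors=1).fit(train_ecog)
test_ecog = rng.standard_normal((n_test, d_neural))
_, idx = index.kneighbors(test_ecog)
waveform = np.concatenate(train_units[idx.ravel()])      # decoded audio
print("output samples:", waveform.shape[0])
```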
Affiliation(s)
- Christian Herff: School of Mental Health & Neuroscience, Maastricht University, Maastricht, Netherlands; Cognitive Systems Lab, University of Bremen, Bremen, Germany
- Lorenz Diener: Cognitive Systems Lab, University of Bremen, Bremen, Germany
- Miguel Angrick: Cognitive Systems Lab, University of Bremen, Bremen, Germany
- Emily Mugler: Department of Neurology, Northwestern University, Chicago, IL, United States
- Matthew C. Tate: Department of Neurosurgery, Northwestern University, Chicago, IL, United States
- Matthew A. Goldrick: Department of Linguistics, Northwestern University, Chicago, IL, United States
- Dean J. Krusienski: Biomedical Engineering Department, Virginia Commonwealth University, Richmond, VA, United States
- Marc W. Slutzky: Department of Neurology, Northwestern University, Chicago, IL, United States; Department of Physiology, Northwestern University, Chicago, IL, United States; Department of Physical Medicine & Rehabilitation, Northwestern University, Chicago, IL, United States
- Tanja Schultz: Cognitive Systems Lab, University of Bremen, Bremen, Germany
19
Moses DA, Leonard MK, Makin JG, Chang EF. Real-time decoding of question-and-answer speech dialogue using human cortical activity. Nat Commun 2019; 10:3096. [PMID: 31363096 PMCID: PMC6667454 DOI: 10.1038/s41467-019-10994-4] [Citation(s) in RCA: 98] [Impact Index Per Article: 19.6] [Received: 11/10/2018] [Accepted: 06/06/2019] [Indexed: 01/15/2023] Open
Abstract
Natural communication often occurs in dialogue, differentially engaging auditory and sensorimotor brain regions during listening and speaking. However, previous attempts to decode speech directly from the human brain typically consider listening or speaking tasks in isolation. Here, human participants listened to questions and responded aloud with answers while we used high-density electrocorticography (ECoG) recordings to detect when they heard or said an utterance and to then decode the utterance's identity. Because certain answers were only plausible responses to certain questions, we could dynamically update the prior probabilities of each answer using the decoded question likelihoods as context. We decode produced and perceived utterances with accuracy rates as high as 61% and 76%, respectively (chance is 7% and 20%). Contextual integration of decoded question likelihoods significantly improves answer decoding. These results demonstrate real-time decoding of speech in an interactive, conversational setting, which has important implications for patients who are unable to communicate.
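The contextual-integration step lends itself to a compact illustration: decoded question likelihoods induce a prior over answers, which is multiplied by the answer likelihoods and renormalized. The numbers below are invented; only the Bayesian update pattern reflects the paper.

```python
# Sketch of context integration: decoded question likelihoods set a prior
# over answers, combined with the neural decoder's answer likelihoods.
# Toy probabilities; not the authors' exact model.
import numpy as np

p_question = np.array([0.7, 0.2, 0.1])      # P(question | neural data)
# P(answer | question): each question makes only some answers plausible.
p_answer_given_q = np.array([[0.45, 0.45, 0.05, 0.05],
                             [0.05, 0.05, 0.45, 0.45],
                             [0.25, 0.25, 0.25, 0.25]])
answer_lik = np.array([0.3, 0.4, 0.2, 0.1]) # P(neural data | answer)

prior = p_question @ p_answer_given_q       # context-derived answer prior
posterior = answer_lik * prior
posterior /= posterior.sum()
print("answer posterior:", posterior.round(3))
```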
Affiliation(s)
- David A. Moses: Department of Neurological Surgery and the Center for Integrative Neuroscience at UC San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard: Department of Neurological Surgery and the Center for Integrative Neuroscience at UC San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Joseph G. Makin: Department of Neurological Surgery and the Center for Integrative Neuroscience at UC San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Edward F. Chang: Department of Neurological Surgery and the Center for Integrative Neuroscience at UC San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
20
Angrick M, Herff C, Johnson G, Shih J, Krusienski D, Schultz T. Interpretation of convolutional neural networks for speech spectrogram regression from intracranial recordings. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.10.080] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 10/27/2022]
21
Slutzky MW. Brain-Machine Interfaces: Powerful Tools for Clinical Treatment and Neuroscientific Investigations. Neuroscientist 2019; 25:139-154. [PMID: 29772957 PMCID: PMC6611552 DOI: 10.1177/1073858418775355] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 01/15/2023]
Abstract
Brain-machine interfaces (BMIs) have exploded in popularity in the past decade. BMIs, also called brain-computer interfaces, provide a direct link between the brain and a computer, usually to control an external device. BMIs have a wide array of potential clinical applications, ranging from restoring communication to people unable to speak due to amyotrophic lateral sclerosis or a stroke, to restoring movement to people with paralysis from spinal cord injury or motor neuron disease, to restoring memory to people with cognitive impairment. Because BMIs are controlled directly by the activity of prespecified neurons or cortical areas, they also provide a powerful paradigm with which to investigate fundamental questions about brain physiology, including neuronal behavior, learning, and the role of oscillations. This article reviews the clinical and neuroscientific applications of BMIs, with a primary focus on motor BMIs.
Affiliation(s)
- Marc W. Slutzky: Departments of Neurology, Physiology, and Physical Medicine & Rehabilitation, Northwestern University, Chicago, IL, USA
22
Xie Z, Reetzke R, Chandrasekaran B. Machine Learning Approaches to Analyze Speech-Evoked Neurophysiological Responses. J Speech Lang Hear Res 2019; 62:587-601. [PMID: 30950746 PMCID: PMC6802895 DOI: 10.1044/2018_jslhr-s-astm-18-0244] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 06/18/2018] [Revised: 10/28/2018] [Accepted: 11/26/2018] [Indexed: 05/27/2023]
Abstract
Purpose Speech-evoked neurophysiological responses are often collected to answer clinically and theoretically driven questions concerning speech and language processing. Here, we highlight the practical application of machine learning (ML)-based approaches to analyzing speech-evoked neurophysiological responses. Method Two categories of ML-based approaches are introduced: decoding models, which generate a speech stimulus output using the features from the neurophysiological responses, and encoding models, which use speech stimulus features to predict neurophysiological responses. In this review, we focus on (a) a decoding model classification approach, wherein speech-evoked neurophysiological responses are classified as belonging to 1 of a finite set of possible speech events (e.g., phonological categories), and (b) an encoding model temporal response function approach, which quantifies the transformation of a speech stimulus feature to continuous neural activity. Results We illustrate the utility of the classification approach to analyze early electroencephalographic (EEG) responses to Mandarin lexical tone categories from a traditional experimental design, and to classify EEG responses to English phonemes evoked by natural continuous speech (i.e., an audiobook) into phonological categories (plosive, fricative, nasal, and vowel). We also demonstrate the utility of temporal response function to predict EEG responses to natural continuous speech from acoustic features. Neural metrics from the 3 examples all exhibit statistically significant effects at the individual level. Conclusion We propose that ML-based approaches can complement traditional analysis approaches to analyze neurophysiological responses to speech signals and provide a deeper understanding of natural speech and language processing using ecologically valid paradigms in both typical and clinical populations.
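The temporal response function (TRF) approach described above maps time-lagged stimulus features to neural activity with regularized regression. The following sketch recovers a known TRF from simulated EEG; it mirrors the general mTRF-style analysis rather than the authors' exact configuration, and all signal parameters are assumptions.

```python
# Sketch of an encoding-model temporal response function (TRF): ridge
# regression from time-lagged stimulus features to one EEG channel.
# Simulated signals; shapes and regularization are assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
T, n_lags = 3000, 40                         # samples, lag window length
envelope = rng.standard_normal(T)            # speech-envelope feature
true_trf = np.exp(-np.arange(n_lags) / 8.0) * np.sin(np.arange(n_lags) / 3.0)
eeg = np.convolve(envelope, true_trf)[:T] + 0.5 * rng.standard_normal(T)

# Lagged design matrix: column k holds the envelope delayed by k samples.
X = np.stack([np.roll(envelope, k) for k in range(n_lags)], axis=1)
trf_hat = Ridge(alpha=1.0).fit(X[n_lags:], eeg[n_lags:]).coef_
print("TRF recovery r =", np.corrcoef(trf_hat, true_trf)[0, 1])
```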
Affiliation(s)
- Zilong Xie: Department of Communication Sciences and Disorders, The University of Texas at Austin
- Rachel Reetzke: Department of Communication Sciences and Disorders, The University of Texas at Austin
- Bharath Chandrasekaran: Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh
23
Rabbani Q, Milsap G, Crone NE. The Potential for a Speech Brain-Computer Interface Using Chronic Electrocorticography. Neurotherapeutics 2019; 16:144-165. [PMID: 30617653 PMCID: PMC6361062 DOI: 10.1007/s13311-018-00692-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/30/2022] Open
Abstract
A brain-computer interface (BCI) is a technology that uses neural features to restore or augment the capabilities of its user. A BCI for speech would enable communication in real time via neural correlates of attempted or imagined speech. Such a technology would potentially restore communication and improve quality of life for locked-in patients and other patients with severe communication disorders. There have been many recent developments in neural decoders, neural feature extraction, and brain recording modalities, facilitating BCIs for the control of prosthetics as well as automatic speech recognition (ASR). Indeed, ASR and related fields have developed significantly over the past years and lend many insights into the requirements, goals, and strategies for speech BCI. Neural speech decoding is a comparatively new field but has shown much promise, with recent studies demonstrating semantic, auditory, and articulatory decoding using electrocorticography (ECoG) and other neural recording modalities. Because the neural representations for speech and language are widely distributed over cortical regions spanning the frontal, parietal, and temporal lobes, the mesoscopic scale of population activity captured by ECoG surface electrode arrays may have distinct advantages for speech BCI, in contrast to the advantages of microelectrode arrays for upper-limb BCI. Nevertheless, many challenges remain for the translation of speech BCIs to clinical populations. This review discusses and outlines the current state of the art for speech BCI and explores what a speech BCI using chronic ECoG might entail.
Affiliation(s)
- Qinwan Rabbani: Department of Electrical Engineering, The Johns Hopkins University Whiting School of Engineering, Baltimore, MD, USA
- Griffin Milsap: Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Nathan E. Crone: Department of Neurology, The Johns Hopkins University School of Medicine, Baltimore, MD, USA
24
Cooney C, Folli R, Coyle D. Neurolinguistics Research Advancing Development of a Direct-Speech Brain-Computer Interface. iScience 2018; 8:103-125. [PMID: 30296666 PMCID: PMC6174918 DOI: 10.1016/j.isci.2018.09.016] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 04/21/2018] [Revised: 09/04/2018] [Accepted: 09/18/2018] [Indexed: 01/09/2023] Open
Abstract
A direct-speech brain-computer interface (DS-BCI) acquires neural signals corresponding to imagined speech, then processes and decodes these signals to produce a linguistic output in the form of phonemes, words, or sentences. Recent research has shown the potential of neurolinguistics to enhance decoding approaches to imagined speech with the inclusion of semantics and phonology in experimental procedures. As neurolinguistics research findings are beginning to be incorporated within the scope of DS-BCI research, it is our view that a thorough understanding of imagined speech, and its relationship with overt speech, must be considered an integral feature of research in this field. With a focus on imagined speech, we provide a review of the most important neurolinguistics research informing the field of DS-BCI and suggest how this research may be utilized to improve current experimental protocols and decoding techniques. Our review of the literature supports a cross-disciplinary approach to DS-BCI research, in which neurolinguistics concepts and methods are utilized to aid development of a naturalistic mode of communication.
Affiliation(s)
- Ciaran Cooney: Intelligent Systems Research Centre, Ulster University, Derry, UK
- Raffaella Folli: Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
- Damien Coyle: Intelligent Systems Research Centre, Ulster University, Derry, UK
25
Abstract
OBJECTIVE Advances in electrophysiological methods such as electrocorticography (ECoG) have enabled researchers to decode phonemes, syllables, and words from brain activity. The ultimate aspiration underlying these efforts is the development of a brain-machine interface (BMI) that will enable speakers to produce real-time, naturalistic speech. In the effort to create such a device, researchers have typically followed a bottom-up approach whereby low-level units of language (e.g. phonemes, syllables, or letters) are decoded from articulation areas (e.g. premotor cortex) with the aim of assembling these low-level units into words and sentences. APPROACH In this paper, we recommend that researchers supplement the existing bottom-up approach with a novel top-down approach. According to the top-down proposal, initial decoding of top-down information may facilitate the subsequent decoding of downstream representations by constraining the hypothesis space from which low-level units are selected. MAIN RESULTS We identify types and sources of top-down information that may crucially inform BMI decoding ecosystems: communicative intentions (e.g. speech acts), situational pragmatics (e.g. recurrent communicative pressures), and formal linguistic data (e.g. syntactic rules and constructions, lexical collocations, speakers' individual speech histories). SIGNIFICANCE Given the inherently interactive nature of communication, we further propose that BMIs be entrained on neural responses associated with interactive dialogue tasks, as opposed to the typical practice of entraining BMIs with non-interactive presentations of language stimuli.
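The top-down proposal can be illustrated with a toy Bayesian update in which a context-derived prior over candidate outputs constrains the bottom-up neural likelihoods. All words and probabilities below are invented for illustration.

```python
# Toy illustration of the top-down proposal: a context-derived prior over
# candidate words constrains which low-level decodings are considered.
# Every value here is invented; only the update pattern is meaningful.
import numpy as np

words = ["water", "walker", "wander", "yes"]
neural_lik = np.array([0.30, 0.28, 0.27, 0.15])  # bottom-up likelihoods
# Top-down prior from communicative context (e.g. the preceding question
# was "What would you like to drink?").
context_prior = np.array([0.70, 0.05, 0.05, 0.20])

posterior = neural_lik * context_prior
posterior /= posterior.sum()
print(dict(zip(words, posterior.round(3))))
```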
Affiliation(s)
- Leon Li: Department of Psychology and Neuroscience, Duke University, Durham, NC, United States of America
26
Martin S, Iturrate I, Millán JDR, Knight RT, Pasley BN. Decoding Inner Speech Using Electrocorticography: Progress and Challenges Toward a Speech Prosthesis. Front Neurosci 2018; 12:422. [PMID: 29977189 PMCID: PMC6021529 DOI: 10.3389/fnins.2018.00422] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 02/28/2018] [Accepted: 06/04/2018] [Indexed: 01/01/2023] Open
Abstract
Certain brain disorders resulting from brainstem infarcts, traumatic brain injury, cerebral palsy, stroke, and amyotrophic lateral sclerosis limit verbal communication despite the patient being fully aware. People who cannot communicate due to neurological disorders would benefit from a system that can infer internal speech directly from brain signals. In this review article, we describe the state of the art in decoding inner speech, ranging from early acoustic sound features to higher-order speech units. We focus on intracranial recordings, as this technique allows monitoring brain activity with high spatial, temporal, and spectral resolution and therefore is a good candidate for investigating inner speech. Despite intense efforts, understanding how the human cortex encodes inner speech remains an elusive challenge, due to the lack of behavioral and observable measures. We highlight various challenges commonly encountered when investigating inner speech decoding and propose potential solutions in order to get closer to a natural speech assistive device.
Affiliation(s)
- Stephanie Martin: Defitech Chair in Brain Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
- Iñaki Iturrate: Defitech Chair in Brain Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- José del R. Millán: Defitech Chair in Brain Machine Interface, Center for Neuroprosthetics, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Robert T. Knight: Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States; Department of Psychology, University of California, Berkeley, Berkeley, CA, United States
- Brian N. Pasley: Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States