1. Del Vecchio M, Bontemps B, Lance F, Gannerie A, Sipp F, Albertini D, Cassani CM, Chatard B, Dupin M, Lachaux JP. Introducing HiBoP: a Unity-based visualization software for large iEEG datasets. J Neurosci Methods 2024;409:110179. PMID: 38823595. DOI: 10.1016/j.jneumeth.2024.110179.
Abstract
BACKGROUND: Intracranial EEG (iEEG) data offer unique spatio-temporal precision for investigating human brain functions. Large datasets have recently become accessible thanks to new iEEG data-sharing practices and tighter collaboration with clinicians. Yet the complexity of such datasets poses new challenges, especially for the visualization and anatomical display of iEEG. NEW METHOD: We introduce HiBoP, a multi-modal visualization software package designed specifically for large groups of patients and multiple experiments. Its main features include dynamic display of iEEG responses induced by tasks or stimulations, definition of regions and electrodes of interest, and switching between group-level and individual-level 3D anatomo-functional data. RESULTS: We provide a use case with data from 36 patients revealing the global cortical dynamics that follow tactile stimulation. We used HiBoP to visualize high-gamma [50-150 Hz] responses and defined three major response components in the primary somatosensory cortex, premotor cortex, and parietal operculum. COMPARISON WITH EXISTING METHOD(S): Several iEEG software packages with outstanding analysis features are now publicly available. However, most were developed in languages (Python/MATLAB) chosen to make it easy for users to add new analyses, rather than for visualization quality. HiBoP is a visualization tool built to videogame standards (Unity/C#); it performs detailed anatomical analysis rapidly, across multiple conditions, patients, and modalities, and exports easily to third-party software. CONCLUSION: HiBoP provides a user-friendly environment that greatly facilitates the exploration of large iEEG datasets and helps users decipher subtle structure/function relationships.
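The high-gamma [50-150 Hz] responses that HiBoP displays are conventionally extracted by band-pass filtering followed by Hilbert-envelope estimation and baseline normalization. The abstract does not describe HiBoP's internal pipeline, so the sketch below only illustrates that common convention on synthetic data; the sampling rate and baseline window are assumptions.

```python
# Minimal sketch: high-gamma (50-150 Hz) envelope extraction from one iEEG
# channel, baseline-normalized. A common convention, not HiBoP's own code.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0                                    # sampling rate in Hz (assumed)
rng = np.random.default_rng(0)
ieeg = rng.standard_normal(int(2 * fs))        # 2 s of synthetic iEEG

b, a = butter(4, [50 / (fs / 2), 150 / (fs / 2)], btype="band")
envelope = np.abs(hilbert(filtfilt(b, a, ieeg)))   # analytic amplitude

baseline = envelope[: int(0.5 * fs)]           # first 500 ms as baseline (assumed)
hg_response = (envelope - baseline.mean()) / baseline.std()
print(hg_response.shape)                       # z-scored high-gamma time course
```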
Affiliation(s)
- Maria Del Vecchio: Istituto di Neuroscienze, Consiglio Nazionale delle Ricerche, Parma 43125, Italy
- Benjamin Bontemps: Lyon Neuroscience Research Center, EDUWELL team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, Lyon F-69000, France
- Florian Lance: Lyon Neuroscience Research Center, EDUWELL team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, Lyon F-69000, France
- Adrien Gannerie: Lyon Neuroscience Research Center, EDUWELL team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, Lyon F-69000, France
- Florian Sipp: Lyon Neuroscience Research Center, EDUWELL team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, Lyon F-69000, France
- Davide Albertini: Dipartimento di Medicina e Chirurgia, Università di Parma, Via Volturno 39, Parma 43125, Italy
- Chiara Maria Cassani: Istituto di Neuroscienze, Consiglio Nazionale delle Ricerche, Parma 43125, Italy; School of Advanced Studies, University of Camerino, Italy
- Benoit Chatard: Lyon Neuroscience Research Center, EDUWELL team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, Lyon F-69000, France
- Maryne Dupin: Lyon Neuroscience Research Center, EDUWELL team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, Lyon F-69000, France
- Jean-Philippe Lachaux: Lyon Neuroscience Research Center, EDUWELL team, INSERM UMRS 1028, CNRS UMR 5292, Université Claude Bernard Lyon 1, Université de Lyon, Lyon F-69000, France

2. Bordonaro M. Postmortem communication. Theory Biosci 2024;143:229-234. PMID: 39096453. DOI: 10.1007/s12064-024-00423-6.
Abstract
The phenomenon of near-death and dying experiences has long been a subject of both popular interest and scientific speculation. However, mental perception at the point of death remains a subjective experience and has not been formally evaluated. While postmortem gene expression, even in humans, has been evaluated, restoration of postmortem brain activity has heretofore been attempted only in animal models, at the molecular and cellular levels. Meanwhile, progress has been made in translating the brain activity of living humans into speech and images. This paper proposes two interrelated thought experiments. First, assuming continued progress and refinement of technologies that translate human brain activity into interpretable speech and images, could these technologies provide an objective analysis of death experiences in dying humans? Second, can human brain function be revived postmortem and, if so, could the relevant technologies be used for communication with (recently) deceased individuals? This paper considers these questions and explores their possible implications.
Affiliation(s)
- Michael Bordonaro: Department of Medical Education, Geisinger Commonwealth School of Medicine, 525 Pine Street, Scranton, PA 18509, USA

3. Bickel B, Giraud AL, Zuberbühler K, van Schaik CP. Language follows a distinct mode of extra-genomic evolution. Phys Life Rev 2024;50:211-225. PMID: 39153248. DOI: 10.1016/j.plrev.2024.08.003.
Abstract
As one of the most specific, yet most diverse, of human behaviors, language is shaped by both genomic and extra-genomic evolution. Sharing methods and models between these modes of evolution has significantly advanced our understanding of language and inspired generalized theories of its evolution. Progress is hampered, however, by the fact that the extra-genomic evolution of languages, i.e. linguistic evolution, maps only partially onto other forms of evolution. Contrasting it with the biological evolution of eukaryotes and the cultural evolution of technology as the best-understood models, we show that linguistic evolution is special in yielding a stationary dynamic rather than stable solutions, and that this dynamic allows language change to be used for social differentiation while maintaining its global adaptiveness. Linguistic evolution furthermore differs from technological evolution by requiring vertical transmission, allowing the reconstruction of phylogenies; and it differs from eukaryotic biological evolution by forgoing a genotype vs phenotype distinction, allowing deliberate and biased change. Recognizing these differences will improve our empirical tools and open new avenues for analyzing how linguistic, cultural, and biological evolution interacted with each other when language emerged in the hominin lineage. Importantly, our framework will help to cope with the unprecedented scientific and ethical challenges that presently arise from how rapid cultural evolution impacts language, most urgently from interventional clinical tools for language disorders, potential epigenetic effects of technology on language, artificial intelligence and linguistic communicators, and global losses of linguistic diversity and identity. Beyond language, the distinctions made here help identify variation in other forms of biological and cultural evolution, opening new perspectives for empirical research.
Affiliation(s)
- Balthasar Bickel: Department of Comparative Language Science, University of Zurich, Switzerland; Center for the Interdisciplinary Study of Language Evolution (ISLE), University of Zurich, Switzerland
- Anne-Lise Giraud: Department of Basic Neurosciences, University of Geneva, Switzerland; Institut de l'Audition, Institut Pasteur, INSERM, Université Paris Cité, France
- Klaus Zuberbühler: Center for the Interdisciplinary Study of Language Evolution (ISLE), University of Zurich, Switzerland; Institute of Biology, University of Neuchâtel, Switzerland; School of Psychology and Neuroscience, University of St Andrews, United Kingdom
- Carel P van Schaik: Center for the Interdisciplinary Study of Language Evolution (ISLE), University of Zurich, Switzerland; Department of Evolutionary Biology and Environmental Science, University of Zurich, Switzerland; Max Planck Institute for Animal Behavior, Konstanz, Germany

4. Rabbani Q, Shah S, Milsap G, Fifer M, Hermansky H, Crone N. Iterative alignment discovery of speech-associated neural activity. J Neural Eng 2024;21:046056. PMID: 39194182. DOI: 10.1088/1741-2552/ad663c.
Abstract
Objective. Brain-computer interfaces (BCIs) have the potential to preserve or restore speech in patients with neurological disorders that weaken the muscles involved in speech production. However, successful training of low-latency speech synthesis and recognition models requires alignment of neural activity with intended phonetic or acoustic output with high temporal precision. This is particularly challenging in patients who cannot produce audible speech, as ground truth with which to pinpoint neural activity synchronized with speech is not available. Approach. In this study, we present a new iterative algorithm for neural voice activity detection (nVAD) called iterative alignment discovery dynamic time warping (IAD-DTW) that integrates DTW into the loss function of a deep neural network (DNN). The algorithm is designed to discover the alignment between a patient's electrocorticographic (ECoG) neural responses and their attempts to speak during collection of data for training BCI decoders for speech synthesis and recognition. Main results. To demonstrate the effectiveness of the algorithm, we tested its accuracy in predicting the onset and duration of acoustic signals produced by able-bodied patients with intact speech undergoing short-term diagnostic ECoG recordings for epilepsy surgery. We simulated a lack of ground truth by randomly perturbing the temporal correspondence between neural activity and an initial single estimate for all speech onsets and durations. We examined the model's ability to overcome these perturbations to estimate ground truth. IAD-DTW showed no notable degradation (<1% absolute decrease in accuracy) in performance in these simulations, even in the case of maximal misalignments between speech and silence. Significance. IAD-DTW is computationally inexpensive and can be easily integrated into existing DNN-based nVAD approaches, as it pertains only to the final loss computation. This approach makes it possible to train speech BCI algorithms using ECoG data from patients who are unable to produce audible speech, including those with locked-in syndrome.
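The paper's IAD-DTW formulation is not reproduced in the abstract. As a toy illustration of the general idea of replacing a frame-by-frame loss with an alignment cost, the sketch below computes a plain dynamic-time-warping distance between a hypothetical nVAD output and a deliberately shifted label sequence; everything here (sequences, shift size) is invented for illustration.

```python
# Toy DTW cost between a predicted voice-activity trace and misaligned labels.
# Plain DTW, not the paper's IAD-DTW: it only shows why an alignment-based
# loss tolerates onset/duration errors that a frame-wise loss does not.
import numpy as np

def dtw_cost(pred, target):
    """Cumulative DTW distance between two 1-D sequences."""
    n, m = len(pred), len(target)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (pred[i - 1] - target[j - 1]) ** 2        # local frame cost
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

pred = np.clip(np.sin(np.linspace(0, 3 * np.pi, 120)), 0, 1)   # fake nVAD output
labels = np.roll(pred, 10)                                     # 10-frame misalignment
print("frame-wise MSE:", float(np.mean((pred - labels) ** 2)))
print("DTW cost      :", float(dtw_cost(pred, labels)))        # far less shift-sensitive
```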
Affiliation(s)
- Qinwan Rabbani: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Samyak Shah: Department of Neurology, Johns Hopkins Medicine, Baltimore, MD 21287, USA
- Griffin Milsap: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723, USA
- Matthew Fifer: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723, USA
- Hynek Hermansky: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Nathan Crone: Department of Neurology, Johns Hopkins Medicine, Baltimore, MD 21287, USA

5. Coles L, Ventrella D, Carnicer-Lombarte A, Elmi A, Troughton JG, Mariello M, El Hadwe S, Woodington BJ, Bacci ML, Malliaras GG, Barone DG, Proctor CM. Origami-inspired soft fluidic actuation for minimally invasive large-area electrocorticography. Nat Commun 2024;15:6290. PMID: 39060241. PMCID: PMC11282215. DOI: 10.1038/s41467-024-50597-2.
Abstract
Electrocorticography is an established neural interfacing technique in which an array of electrodes enables large-area recording from the cortical surface. Electrocorticography is commonly used for seizure mapping; however, the implantation of large-area electrocorticography arrays is a highly invasive procedure, requiring a craniotomy larger than the implant area to place the device. In this work, flexible thin-film electrode arrays are combined with concepts from soft robotics to realize a large-area electrocorticography device that can change shape via integrated fluidic actuators. We show that the 32-electrode device can be packaged by origami-inspired folding into a compressed state and implanted through a small burr-hole craniotomy, then expanded on the surface of the brain for large-area cortical coverage. The implantation, expansion, and recording functionality of the device are confirmed in vitro and in porcine in vivo models. The integration of shape actuation into neural implants provides a clinically viable pathway to realizing large-area neural interfaces via minimally invasive surgical techniques.
Affiliation(s)
- Lawrence Coles: Department of Engineering, University of Cambridge, Cambridge, UK; Institute of Biomedical Engineering, Engineering Science Department, University of Oxford, Oxford, UK
- Domenico Ventrella: Department of Veterinary Medical Sciences, Alma Mater Studiorum, University of Bologna, Ozzano dell'Emilia, Bologna, Italy
- Alberto Elmi: Department of Veterinary Medical Sciences, Alma Mater Studiorum, University of Bologna, Ozzano dell'Emilia, Bologna, Italy
- Joe G Troughton: Department of Engineering, University of Cambridge, Cambridge, UK; Institute of Biomedical Engineering, Engineering Science Department, University of Oxford, Oxford, UK
- Massimo Mariello: Institute of Biomedical Engineering, Engineering Science Department, University of Oxford, Oxford, UK
- Salim El Hadwe: Department of Engineering, University of Cambridge, Cambridge, UK; Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Ben J Woodington: Department of Engineering, University of Cambridge, Cambridge, UK
- Maria L Bacci: Department of Veterinary Medical Sciences, Alma Mater Studiorum, University of Bologna, Ozzano dell'Emilia, Bologna, Italy
- Damiano G Barone: Department of Engineering, University of Cambridge, Cambridge, UK; Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Christopher M Proctor: Department of Engineering, University of Cambridge, Cambridge, UK; Institute of Biomedical Engineering, Engineering Science Department, University of Oxford, Oxford, UK

6. te Rietmolen N, Mercier MR, Trébuchon A, Morillon B, Schön D. Speech and music recruit frequency-specific distributed and overlapping cortical networks. eLife 2024;13:RP94509. PMID: 39038076. PMCID: PMC11262799. DOI: 10.7554/elife.94509.
Abstract
To what extent do speech and music processing rely on domain-specific and domain-general neural networks? Using whole-brain intracranial EEG recordings in 18 epilepsy patients listening to natural, continuous speech or music, we investigated the presence of frequency-specific and network-level brain activity. We combined these recordings with a statistical approach that makes a clear operational distinction between shared, preferred, and domain-selective neural responses. We show that the majority of focal and network-level neural activity is shared between speech and music processing. Our data also reveal an absence of anatomical regional selectivity. Instead, domain-selective neural responses are restricted to distributed and frequency-specific coherent oscillations, typical of spectral fingerprints. Our work highlights the importance of considering natural stimuli and brain dynamics in their full complexity to map cognitive and brain functions.
Affiliation(s)
- Noémie te Rietmolen: Institute for Language, Communication, and the Brain, Aix-Marseille University, Marseille, France; Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France
- Manuel R Mercier: Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France
- Agnès Trébuchon: Institute for Language, Communication, and the Brain, Aix-Marseille University, Marseille, France; Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France; APHM, Hôpital de la Timone, Service de Neurophysiologie Clinique, Marseille, France
- Benjamin Morillon: Institute for Language, Communication, and the Brain, Aix-Marseille University, Marseille, France; Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France
- Daniele Schön: Institute for Language, Communication, and the Brain, Aix-Marseille University, Marseille, France; Aix Marseille Université, INSERM, INS, Institut de Neurosciences des Systèmes, Marseille, France

7. de Borman A, Wittevrongel B, Dauwe I, Carrette E, Meurs A, Van Roost D, Boon P, Van Hulle MM. Imagined speech event detection from electrocorticography and its transfer between speech modes and subjects. Commun Biol 2024;7:818. PMID: 38969758. PMCID: PMC11226700. DOI: 10.1038/s42003-024-06518-6.
Abstract
Speech brain-computer interfaces aim to support communication-impaired patients by translating neural signals into speech. While impressive progress has been achieved in decoding performed, perceived, and attempted speech, imagined speech remains elusive, mainly because of the absence of behavioral output. Nevertheless, imagined speech is advantageous since it does not depend on any articulator movements, which might become impaired or even lost over the course of a neurodegenerative disease. In this study, we analyzed electrocorticography data recorded from 16 participants in response to 3 speech modes: performed, perceived (listening), and imagined speech. We used a linear model to detect speech events and examined the contributions of each frequency band, from delta to high gamma, given the speech mode and electrode location. For imagined speech detection, we observed a strong contribution of gamma bands in the motor cortex, whereas lower frequencies were more prominent in the temporal lobe, in particular of the left hemisphere. Based on the similarities in frequency patterns, we were able to transfer models between speech modes and participants with similar electrode locations.
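The abstract specifies a linear detection model operating on per-band contributions but not the exact features. The sketch below assumes one common realization (log band power per channel via band-pass filtering and Hilbert envelopes, fed to logistic regression); the band edges, data, and labels are all invented.

```python
# Generic sketch of linear speech-event detection from band-limited ECoG power.
# Band definitions and the logistic-regression detector are assumptions, not
# the paper's exact pipeline.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert
from sklearn.linear_model import LogisticRegression

fs = 1000.0
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 70), "high_gamma": (70, 170)}

def band_log_power(x, lo, hi):
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.log(np.abs(hilbert(filtfilt(b, a, x))) ** 2 + 1e-12)

rng = np.random.default_rng(1)
ecog = rng.standard_normal((5, int(10 * fs)))     # 5 channels x 10 s, synthetic
feats = np.stack([band_log_power(ch, lo, hi)
                  for ch in ecog for lo, hi in bands.values()])
X, y = feats.T, (rng.random(ecog.shape[1]) > 0.5).astype(int)  # fake labels

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.coef_.reshape(5, len(bands)))  # weight per channel x frequency band
```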
Affiliation(s)
- Aurélie de Borman: Laboratory for Neuro- and Psychophysiology, KU Leuven, Leuven, Belgium
- Ine Dauwe: Department of Neurology, Ghent University Hospital, Ghent, Belgium
- Evelien Carrette: Department of Neurology, Ghent University Hospital, Ghent, Belgium
- Alfred Meurs: Department of Neurology, Ghent University Hospital, Ghent, Belgium
- Dirk Van Roost: Department of Neurosurgery, Ghent University Hospital, Ghent, Belgium
- Paul Boon: Department of Neurology, Ghent University Hospital, Ghent, Belgium
- Marc M Van Hulle: Laboratory for Neuro- and Psychophysiology, KU Leuven, Leuven, Belgium; Leuven Brain Institute (LBI), Leuven, Belgium; Leuven Institute for Artificial Intelligence (Leuven.AI), Leuven, Belgium

8. Silva AB, Littlejohn KT, Liu JR, Moses DA, Chang EF. The speech neuroprosthesis. Nat Rev Neurosci 2024;25:473-492. PMID: 38745103. DOI: 10.1038/s41583-024-00819-9.
Abstract
Loss of speech after paralysis is devastating, but circumventing motor-pathway injury by directly decoding speech from intact cortical activity has the potential to restore natural communication and self-expression. Recent discoveries have defined how key features of speech production are facilitated by the coordinated activity of vocal-tract articulatory and motor-planning cortical representations. In this Review, we highlight such progress and how it has led to successful speech decoding, first in individuals implanted with intracranial electrodes for clinical epilepsy monitoring and subsequently in individuals with paralysis as part of early feasibility clinical trials to restore speech. We discuss high-spatiotemporal-resolution neural interfaces and the adaptation of state-of-the-art speech computational algorithms that have driven rapid and substantial progress in decoding neural activity into text, audible speech, and facial movements. Although restoring natural speech is a long-term goal, speech neuroprostheses already have performance levels that surpass communication rates offered by current assistive-communication technology. Given this accelerated rate of progress in the field, we propose key evaluation metrics for speed and accuracy, among others, to help standardize across studies. We finish by highlighting several directions to more fully explore the multidimensional feature space of speech and language, which will continue to accelerate progress towards a clinically viable speech neuroprosthesis.
Affiliation(s)
- Alexander B Silva: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Kaylo T Littlejohn: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Jessie R Liu: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- David A Moses: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Edward F Chang: Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA

9. Monney J, Dallaire SE, Stoutah L, Fanda L, Mégevand P. Voxeloc: a time-saving graphical user interface for localizing and visualizing stereo-EEG electrodes. J Neurosci Methods 2024;407:110154. PMID: 38697518. DOI: 10.1016/j.jneumeth.2024.110154.
Abstract
BACKGROUND: Thanks to its unrivalled spatial and temporal resolution and signal-to-noise ratio, intracranial EEG (iEEG) is becoming a valuable tool in neuroscience research. To attribute functional properties to cortical tissue, it is paramount to determine precisely the localization of each electrode with respect to a patient's brain anatomy. Several software packages or pipelines offer the possibility of localizing iEEG electrodes manually or semi-automatically. However, their reliability and ease of use may leave something to be desired. NEW METHOD: Voxeloc (voxel electrode locator) is a Matlab-based graphical user interface for localizing and visualizing stereo-EEG electrodes. Voxeloc adopts a semi-automated approach to determine the coordinates of each electrode contact: the user only needs to indicate the deepest contact of each electrode shaft and another point more proximally. RESULTS: With deliberately streamlined functionality and an intuitive graphical user interface, the main advantages of Voxeloc are ease of use and inter-user reliability. Additionally, oblique slices along the shaft of each electrode can be generated to facilitate the precise localization of each contact. Voxeloc is open-source software and is compatible with the open iEEG-BIDS (Brain Imaging Data Structure) format. COMPARISON WITH EXISTING METHODS: Localizing full patient iEEG implants was faster with Voxeloc than with two comparable software packages, and inter-user agreement was better. CONCLUSIONS: Voxeloc offers an easy-to-use and reliable tool to localize and visualize stereo-EEG electrodes. This will contribute to democratizing neuroscience research using iEEG.
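The geometric core of such a semi-automated approach can be sketched simply: once the deepest contact and one more proximal point are marked, the remaining contacts lie along the implied straight trajectory at the known inter-contact pitch. The contact count and 3.5 mm pitch below are illustrative assumptions, not Voxeloc's defaults.

```python
# Sketch of the geometric step behind semi-automated SEEG localization: given
# the deepest contact and a second, more proximal point on the same shaft,
# place the remaining contacts along the implied straight trajectory.
import numpy as np

def place_contacts(deep_tip, proximal_point, n_contacts=12, pitch_mm=3.5):
    deep_tip = np.asarray(deep_tip, float)
    direction = np.asarray(proximal_point, float) - deep_tip
    direction /= np.linalg.norm(direction)          # unit vector, tip -> entry
    return deep_tip + np.outer(np.arange(n_contacts) * pitch_mm, direction)

coords = place_contacts(deep_tip=[10.0, -42.0, 8.0],
                        proximal_point=[28.0, -30.0, 30.0])
print(coords.round(1))    # (12, 3) array of contact coordinates in scanner mm
```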
Affiliation(s)
- Jonathan Monney: Clinical Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Shannon E Dallaire: Clinical Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Dalhousie University, Halifax, Canada
- Lydia Stoutah: Clinical Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Université Paris-Saclay, Paris, France
- Lora Fanda: Clinical Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Pierre Mégevand: Clinical Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Basic Neuroscience Department, Faculty of Medicine, University of Geneva, Geneva, Switzerland; Neurology Division, Geneva University Hospitals, Geneva, Switzerland

10. Wu X, Wellington S, Fu Z, Zhang D. Speech decoding from stereo-electroencephalography (sEEG) signals using advanced deep learning methods. J Neural Eng 2024;21:036055. PMID: 38885688. DOI: 10.1088/1741-2552/ad593a.
Abstract
Objective. Brain-computer interfaces (BCIs) are technologies that bypass damaged or disrupted neural pathways and directly decode brain signals to perform intended actions. BCIs for speech have the potential to restore communication by decoding intended speech directly. Many studies have demonstrated promising results using invasive micro-electrode arrays and electrocorticography. However, the potential of stereo-electroencephalography (sEEG) for speech decoding has not been fully recognized. Approach. In this research, recently released sEEG data were used to decode Dutch words spoken by epileptic participants. We decoded speech waveforms from sEEG data using advanced deep-learning methods. Three methods were implemented: linear regression, a recurrent neural network (RNN)-based sequence-to-sequence model, and a Transformer model. Main results. Our RNN and Transformer models significantly outperformed linear regression, while no significant difference was found between the two deep-learning methods. Further investigation of individual electrodes showed that the same decoding result can be obtained using only a few electrodes. Significance. This study demonstrated that decoding speech from sEEG signals is possible and that electrode location is critical to decoding performance.
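As a reference point for the comparison the abstract describes, the sketch below implements only the weakest of the three decoders: a regularized linear regression from per-frame neural features to spectrogram-like targets, run on synthetic data. All shapes and the ridge penalty are invented; the RNN and Transformer models would replace the linear map.

```python
# Baseline of the kind the paper compares against: linear regression from sEEG
# feature frames to speech-spectrogram frames. Synthetic data for illustration.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
T, n_feats, n_mels = 2000, 128, 40
X = rng.standard_normal((T, n_feats))                  # sEEG features per frame
W_true = 0.1 * rng.standard_normal((n_feats, n_mels))
Y = X @ W_true + 0.5 * rng.standard_normal((T, n_mels))  # synthetic spectrogram

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
model = Ridge(alpha=10.0).fit(X_tr, Y_tr)
Y_hat = model.predict(X_te)
corr = np.mean([np.corrcoef(Y_te[:, k], Y_hat[:, k])[0, 1] for k in range(n_mels)])
print(f"mean per-mel-bin correlation: {corr:.2f}")
```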
Affiliation(s)
- Xiaolong Wu: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
- Scott Wellington: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
- Zhichun Fu: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom
- Dingguo Zhang: Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom

11. Tan H, Zeng X, Ni J, Liang K, Xu C, Zhang Y, Wang J, Li Z, Yang J, Han C, Gao Y, Yu X, Han S, Meng F, Ma Y. Intracranial EEG signals disentangle multi-areal neural dynamics of vicarious pain perception. Nat Commun 2024;15:5203. PMID: 38890380. PMCID: PMC11189531. DOI: 10.1038/s41467-024-49541-1.
Abstract
Empathy enables understanding and sharing of others' feelings. Human neuroimaging studies have identified critical brain regions supporting empathy for pain, including the anterior insula (AI), anterior cingulate cortex (ACC), amygdala, and inferior frontal gyrus (IFG). However, to date, the precise spatio-temporal profiles of empathic neural responses and inter-regional communications remain elusive. Here, using intracranial electroencephalography, we investigated electrophysiological signatures of vicarious pain perception. Perceiving others' pain induced early increases in high-gamma activity in the IFG and beta power increases in the ACC, but decreased beta power in the AI and amygdala. Vicarious pain perception also altered the beta-band-coordinated coupling between the ACC, AI, and amygdala, and increased the modulation of IFG high-gamma amplitudes by the beta phases of the amygdala/AI/ACC. We identified a necessary combination of neural features for decoding vicarious pain perception. These spatio-temporally specific regional activities and inter-regional interactions within the empathy network suggest a neurodynamic model of human pain empathy.
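The coupling of high-gamma amplitude to beta phase reported here is a form of phase-amplitude coupling. The sketch below computes one standard summary of it, the mean vector length, on a synthetic signal whose high-gamma amplitude is deliberately locked to a beta carrier; the band edges and simulation parameters are assumptions, not the paper's.

```python
# Sketch of a phase-amplitude coupling measure: mean vector length between a
# beta-band phase and a high-gamma envelope, on a synthetic coupled signal.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0
rng = np.random.default_rng(3)
t = np.arange(0, 10, 1 / fs)
hg_amp = 1 + 0.5 * np.sin(2 * np.pi * 20 * t)        # amplitude locked to 20 Hz
sig = (np.sin(2 * np.pi * 20 * t)                    # beta carrier
       + hg_amp * np.sin(2 * np.pi * 100 * t)        # modulated high gamma
       + 0.1 * rng.standard_normal(t.size))

def bandpass(x, lo, hi):
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

phase = np.angle(hilbert(bandpass(sig, 13, 30)))     # beta phase
amp = np.abs(hilbert(bandpass(sig, 70, 150)))        # high-gamma envelope
mvl = np.abs(np.mean(amp * np.exp(1j * phase)))      # mean vector length
print(f"PAC (mean vector length): {mvl:.3f}")
```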
Affiliation(s)
- Huixin Tan: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- Xiaoyu Zeng: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- Jun Ni: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- Kun Liang: Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Cuiping Xu: Department of Functional Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Yanyang Zhang: Department of Neurosurgery, Chinese PLA General Hospital, Beijing, China
- Jiaxin Wang: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- Zizhou Li: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- Jiaxin Yang: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China
- Chunlei Han: Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Yuan Gao: Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Xinguang Yu: Department of Neurosurgery, Chinese PLA General Hospital, Beijing, China
- Shihui Han: School of Psychological and Cognitive Sciences, PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing, China
- Fangang Meng: Beijing Tiantan Hospital, Capital Medical University, Beijing, China; Chinese Institute for Brain Research, Beijing, China
- Yina Ma: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China; Beijing Key Laboratory of Brain Imaging and Connectomics, Beijing Normal University, Beijing, China; Chinese Institute for Brain Research, Beijing, China

12. Wandelt SK, Bjånes DA, Pejsa K, Lee B, Liu C, Andersen RA. Representation of internal speech by single neurons in human supramarginal gyrus. Nat Hum Behav 2024;8:1136-1149. PMID: 38740984. PMCID: PMC11199147. DOI: 10.1038/s41562-024-01867-y.
Abstract
Speech brain-machine interfaces (BMIs) translate brain signals into words or audio outputs, enabling communication for people who have lost the ability to speak due to disease or injury. While important advances in vocalized, attempted, and mimed speech decoding have been achieved, results for internal speech decoding are sparse and have yet to achieve high functionality. Notably, it is still unclear from which brain areas internal speech can be decoded. Here, two participants with tetraplegia, with implanted microelectrode arrays located in the supramarginal gyrus (SMG) and primary somatosensory cortex (S1), performed internal and vocalized speech of six words and two pseudowords. In both participants, we found significant neural representation of internal and vocalized speech at the single-neuron and population levels in the SMG. The internally spoken and vocalized words were significantly decodable from recorded SMG population activity. In an offline analysis, we achieved average decoding accuracies of 55% and 24% for the two participants, respectively (chance level 12.5%), and during an online internal speech BMI task, we averaged 79% and 23% accuracy, respectively. Evidence of shared neural representations between internal speech, word reading, and vocalized speech processes was found in participant 1. The SMG represented words as well as pseudowords, providing evidence for phonetic encoding. Furthermore, our decoder achieved high classification accuracy with multiple internal speech strategies (auditory imagination/visual imagination). Activity in S1 was modulated by vocalized but not internal speech in both participants, suggesting that no articulator movements of the vocal tract occurred during internal speech production. This work represents a proof of concept for a high-performance internal speech BMI.
Affiliation(s)
- Sarah K Wandelt: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA
- David A Bjånes: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA; Rancho Los Amigos National Rehabilitation Center, Downey, CA, USA
- Kelsie Pejsa: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA
- Brian Lee: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; Department of Neurological Surgery, Keck School of Medicine of USC, Los Angeles, CA, USA; USC Neurorestoration Center, Keck School of Medicine of USC, Los Angeles, CA, USA
- Charles Liu: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; Rancho Los Amigos National Rehabilitation Center, Downey, CA, USA; Department of Neurological Surgery, Keck School of Medicine of USC, Los Angeles, CA, USA; USC Neurorestoration Center, Keck School of Medicine of USC, Los Angeles, CA, USA
- Richard A Andersen: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA

13. Zhang W, Jiang M, Teo KAC, Bhuvanakantham R, Fong L, Sim WKJ, Guo Z, Foo CHV, Chua RHJ, Padmanabhan P, Leong V, Lu J, Gulyás B, Guan C. Revealing the spatiotemporal brain dynamics of covert speech compared with overt speech: a simultaneous EEG-fMRI study. Neuroimage 2024;293:120629. PMID: 38697588. DOI: 10.1016/j.neuroimage.2024.120629.
Abstract
Covert speech (CS) refers to speaking internally to oneself without producing any sound or movement. CS is involved in multiple cognitive functions and disorders. Reconstructing CS content by brain-computer interface (BCI) is also an emerging technique. However, it is still controversial whether CS is a truncated neural process of overt speech (OS) or involves independent patterns. Here, we performed a word-speaking experiment with simultaneous EEG-fMRI. It involved 32 participants, who generated words both overtly and covertly. By integrating spatial constraints from fMRI into EEG source localization, we precisely estimated the spatiotemporal dynamics of neural activity. During CS, EEG source activity was localized in three regions: the left precentral gyrus, the left supplementary motor area, and the left putamen. Although OS involved more brain regions with stronger activations, CS was characterized by an earlier event-locked activation in the left putamen (peak at 262 ms versus 1170 ms). The left putamen was also identified as the only hub node within the functional connectivity (FC) networks of both OS and CS, while showing weaker FC strength towards speech-related regions in the dominant hemisphere during CS. Path analysis revealed significant multivariate associations, indicating an indirect association between the earlier activation in the left putamen and CS, which was mediated by reduced FC towards speech-related regions. These findings revealed the specific spatiotemporal dynamics of CS, offering insights into CS mechanisms that are potentially relevant for future treatment of self-regulation deficits, speech disorders, and development of BCI speech applications.
Affiliation(s)
- Wei Zhang: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- Muyun Jiang: School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Kok Ann Colin Teo: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore; IGP-Neuroscience, Interdisciplinary Graduate Programme, Nanyang Technological University, Singapore; Division of Neurosurgery, National University Health System, Singapore
- Raghavan Bhuvanakantham: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- LaiGuan Fong: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore
- Wei Khang Jeremy Sim: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore; IGP-Neuroscience, Interdisciplinary Graduate Programme, Nanyang Technological University, Singapore
- Zhiwei Guo: School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Parasuraman Padmanabhan: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- Victoria Leong: Division of Psychology, Nanyang Technological University, Singapore; Department of Pediatrics, University of Cambridge, United Kingdom
- Jia Lu: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore; DSO National Laboratories, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Balázs Gulyás: Cognitive Neuroimaging Centre, Nanyang Technological University, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore; Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Cuntai Guan: School of Computer Science and Engineering, Nanyang Technological University, Singapore

14. Komeiji S, Mitsuhashi T, Iimura Y, Suzuki H, Sugano H, Shinoda K, Tanaka T. Feasibility of decoding covert speech in ECoG with a Transformer trained on overt speech. Sci Rep 2024;14:11491. PMID: 38769115. PMCID: PMC11106343. DOI: 10.1038/s41598-024-62230-9.
Abstract
Several attempts at speech brain-computer interfacing (BCI) have been made to decode phonemes, sub-words, words, or sentences using invasive measurements, such as the electrocorticogram (ECoG), during auditory speech perception, overt speech, or imagined (covert) speech. Decoding sentences from covert speech is a challenging task. Sixteen epilepsy patients with intracranially implanted electrodes participated in this study, and ECoGs were recorded during overt and covert speech of eight Japanese sentences, each consisting of three tokens. In particular, a Transformer neural network model was applied to decode text sentences from covert speech, trained using ECoGs obtained during overt speech. We first examined the proposed Transformer model using the same task for training and testing, and then evaluated the model's performance when trained on the overt task for decoding covert speech. The Transformer model trained on covert speech achieved an average token error rate (TER) of 46.6% for decoding covert speech, whereas the model trained on overt speech achieved a TER of 46.3% (p > 0.05; d = 0.07). Therefore, the challenge of collecting training data for covert speech can be addressed using overt speech. Decoding performance for covert speech may improve further as more overt-speech recordings are used for training.
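The token error rate reported here is conventionally computed as the length-normalized edit distance between decoded and reference token sequences; the abstract does not define it further, so the sketch below assumes that convention, with an invented three-token example.

```python
# Sketch of a token error rate (TER): Levenshtein edit distance between
# decoded and reference token sequences, normalized by reference length.
def token_error_rate(ref, hyp):
    n, m = len(ref), len(hyp)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i                                   # deletions
    for j in range(m + 1):
        D[0][j] = j                                   # insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = D[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            D[i][j] = min(sub, D[i - 1][j] + 1, D[i][j - 1] + 1)
    return D[n][m] / max(n, 1)

ref = ["watashi", "wa", "genki"]          # hypothetical 3-token reference
hyp = ["watashi", "mo", "genki"]          # decoder output with one substitution
print(f"TER = {token_error_rate(ref, hyp):.1%}")      # 33.3%
```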
Affiliation(s)
- Shuji Komeiji: Department of Electronic and Information Engineering, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan
- Takumi Mitsuhashi: Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan
- Yasushi Iimura: Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan
- Hiroharu Suzuki: Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan
- Hidenori Sugano: Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan
- Koichi Shinoda: Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
- Toshihisa Tanaka: Department of Electronic and Information Engineering, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan

15. Li H, Li H, Ma L, Polina D. Revealing brain's cognitive process deeply: a study of the consistent EEG patterns of audio-visual perceptual holistic. Front Hum Neurosci 2024;18:1377233. PMID: 38601801. PMCID: PMC11004307. DOI: 10.3389/fnhum.2024.1377233.
Abstract
Introduction: To investigate the brain's cognitive processes and perceptual holism, we have developed a novel method that focuses on the informational attributes of stimuli. Methods: We recorded EEG signals during visual and auditory perceptual cognition experiments and conducted ERP analyses, observing specific positive and negative components occurring after 400 ms during both visual and auditory perceptual processes. These ERP components represent the brain's holistic perceptual processing activities, which we have named Information-Related Potentials (IRPs). We combined IRPs with machine learning methods to decode cognitive processes in the brain. Results: Our experimental results indicate that IRPs can better characterize information processing, particularly perceptual holism. Additionally, we conducted a brain network analysis and found that visual and auditory holistic perceptual processing share consistent neural pathways. Discussion: Our efforts not only demonstrate the specificity, significance, and reliability of IRPs but also reveal their great potential for future brain-mechanism research and BCI applications.
Affiliation(s)
- Haifeng Li: Faculty of Computing, Harbin Institute of Technology, Harbin, China

16. Wu X, Zhang D, Li G, Gao X, Metcalfe B, Chen L. Data augmentation for invasive brain-computer interfaces based on stereo-electroencephalography (SEEG). J Neural Eng 2024;21:016026. PMID: 38237174. DOI: 10.1088/1741-2552/ad200e.
Abstract
Objective. Deep learning is increasingly used for brain-computer interfaces (BCIs). However, the quantity of available data is sparse, especially for invasive BCIs. Data augmentation (DA) methods, such as generative models, can help to address this sparseness. However, all existing studies on brain signals were based on convolutional neural networks and ignored temporal dependence. This paper attempts to enhance generative models by capturing the temporal relationship from a time-series perspective. Approach. A conditional generative network based on the Transformer model, the conditional transformer-based generative adversarial network (cTGAN), is proposed. The proposed method was tested using a stereo-electroencephalography (SEEG) dataset recorded from eight epileptic patients performing five different movements. Three other commonly used DA methods were also implemented: noise injection (NI), variational autoencoder (VAE), and conditional Wasserstein generative adversarial network with gradient penalty (cWGANGP). Artificial SEEG data were generated using the proposed method, and several metrics were used to compare data quality, including visual inspection, cosine similarity (CS), Jensen-Shannon distance (JSD), and the effect on the performance of a deep learning-based classifier. Main results. Both the proposed cTGAN and the cWGANGP methods were able to generate realistic data, whereas NI and VAE produced inferior samples when visualized as raw sequences and in a lower-dimensional space. The cTGAN generated the best samples in terms of CS and JSD and significantly outperformed cWGANGP in enhancing the performance of a deep learning-based classifier (yielding improvements of 6% and 3.4%, respectively). Significance. This is the first time that DA methods have been applied to invasive BCIs based on SEEG. In addition, this study demonstrates the advantages of a model that preserves temporal dependence from a time-series perspective.
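Two of the sample-quality metrics named here are generic and easy to reproduce. The sketch below computes cosine similarity between a real and a generated trial and the Jensen-Shannon distance between their amplitude histograms; the binning and the stand-in "generated" trial are assumptions for illustration.

```python
# Sketch of two sample-quality metrics for generated SEEG: cosine similarity
# between trials, and Jensen-Shannon distance between amplitude distributions.
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(4)
real = rng.standard_normal(2000)                   # one real SEEG trial (flattened)
fake = real + 0.3 * rng.standard_normal(2000)      # stand-in for a generated trial

cs = np.dot(real, fake) / (np.linalg.norm(real) * np.linalg.norm(fake))

bins = np.linspace(-4, 4, 50)
p, _ = np.histogram(real, bins=bins, density=True)
q, _ = np.histogram(fake, bins=bins, density=True)
jsd = jensenshannon(p, q)                          # sqrt of the JS divergence
print(f"cosine similarity: {cs:.3f}, Jensen-Shannon distance: {jsd:.3f}")
```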
Affiliation(s)
- Xiaolong Wu: Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Dingguo Zhang: Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Guangye Li: School of Mechanical Engineering, Shanghai Jiao Tong University, People's Republic of China
- Xin Gao: Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Benjamin Metcalfe: Centre for Autonomous Robotics (CENTAUR), Department of Electronic & Electrical Engineering, University of Bath, Bath, United Kingdom
- Liang Chen: Huashan Hospital, Fudan University, People's Republic of China

17. Luo S, Angrick M, Coogan C, Candrea DN, Wyse-Sookoo K, Shah S, Rabbani Q, Milsap GW, Weiss AR, Anderson WS, Tippett DC, Maragakis NJ, Clawson LL, Vansteensel MJ, Wester BA, Tenore FV, Hermansky H, Fifer MS, Ramsey NF, Crone NE. Stable decoding from a speech BCI enables control for an individual with ALS without recalibration for 3 months. Adv Sci (Weinh) 2023;10:e2304853. PMID: 37875404. PMCID: PMC10724434. DOI: 10.1002/advs.202304853.
Abstract
Brain-computer interfaces (BCIs) can be used to control assistive devices by patients with neurological disorders like amyotrophic lateral sclerosis (ALS) that limit speech and movement. For assistive control, it is desirable for BCI systems to be accurate and reliable, preferably with minimal setup time. In this study, a participant with severe dysarthria due to ALS operates computer applications with six intuitive speech commands via a chronic electrocorticographic (ECoG) implant over the ventral sensorimotor cortex. Speech commands are accurately detected and decoded (median accuracy: 90.59%) throughout a 3-month study period without model retraining or recalibration. Use of the BCI does not require exogenous timing cues, enabling the participant to issue self-paced commands at will. These results demonstrate that a chronically implanted ECoG-based speech BCI can reliably control assistive devices over long time periods with only initial model training and calibration, supporting the feasibility of unassisted home use.
Affiliation(s)
- Shiyu Luo: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Miguel Angrick: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Christopher Coogan: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Daniel N Candrea: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Kimberley Wyse-Sookoo: Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Samyak Shah: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Qinwan Rabbani: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA; Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21218, USA
- Griffin W Milsap: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723, USA
- Alexander R Weiss: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- William S Anderson: Department of Neurosurgery, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Donna C Tippett: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Department of Physical Medicine and Rehabilitation, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Nicholas J Maragakis: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Lora L Clawson: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Mariska J Vansteensel: Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, Utrecht 3584, The Netherlands
- Brock A Wester: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723, USA
- Francesco V Tenore: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723, USA
- Hynek Hermansky: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA; Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21218, USA
- Matthew S Fifer: Research and Exploratory Development Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD 20723, USA
- Nick F Ramsey: Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, Utrecht 3584, The Netherlands
- Nathan E Crone: Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA

18. Hovsepyan S, Olasagasti I, Giraud AL. Rhythmic modulation of prediction errors: a top-down gating role for the beta-range in speech processing. PLoS Comput Biol 2023;19:e1011595. PMID: 37934766. PMCID: PMC10655987. DOI: 10.1371/journal.pcbi.1011595.
Abstract
Natural speech perception requires processing the ongoing acoustic input while keeping in mind the preceding one and predicting the next. This complex computational problem could be handled by a dynamic multi-timescale hierarchical inferential process that coordinates the information flow up and down the language network hierarchy. Using a predictive coding computational model (Precoss-β) that identifies online individual syllables from continuous speech, we address the advantage of a rhythmic modulation of up and down information flows, and whether beta oscillations could be optimal for this. In the model, and consistent with experimental data, theta and low-gamma neural frequency scales ensure syllable-tracking and phoneme-level speech encoding, respectively, while the beta rhythm is associated with inferential processes. We show that a rhythmic alternation of bottom-up and top-down processing regimes improves syllable recognition, and that optimal efficacy is reached when the alternation of bottom-up and top-down regimes, via oscillating prediction error precisions, is in the beta range (around 20-30 Hz). These results not only demonstrate the advantage of a rhythmic alternation of up- and down-going information, but also that the low-beta range is optimal given sensory analysis at theta and low-gamma scales. While specific to speech processing, the notion of alternating bottom-up and top-down processes with frequency multiplexing might generalize to other cognitive architectures.
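Precoss-β itself is a full predictive-coding model; as a one-state caricature of its central claim, the sketch below updates a latent estimate with a bottom-up prediction-error gain that oscillates at a beta rate (25 Hz assumed), alternating sensing-dominated and prediction-dominated regimes. All dynamics and constants are invented for illustration, not taken from the published model.

```python
# Toy: a latent estimate whose bottom-up error gain oscillates in the beta
# range, so updates alternate between tracking input and relaxing to a prior.
import numpy as np

fs = 1000.0
t = np.arange(0, 1, 1 / fs)
f_beta = 25.0                                          # gating frequency (assumed)
precision = 0.5 * (1 + np.sin(2 * np.pi * f_beta * t)) # 0..1 bottom-up gain

sensory = np.sin(2 * np.pi * 5 * t)                    # slowly varying "input"
prior = 0.0                                            # static top-down prediction
estimate = np.zeros_like(t)
x = 0.0
for k in range(1, t.size):
    err_bu = sensory[k] - x                            # bottom-up prediction error
    err_td = prior - x                                 # pull toward the prior
    x += 0.05 * (precision[k] * err_bu + (1 - precision[k]) * err_td)
    estimate[k] = x

print(estimate[-5:])   # estimate alternately tracks input and relaxes to prior
```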
Collapse
Affiliation(s)
- Sevada Hovsepyan
- Department of Basic Neurosciences, University of Geneva, Biotech Campus, Genève, Switzerland
| | - Itsaso Olasagasti
- Department of Basic Neurosciences, University of Geneva, Biotech Campus, Genève, Switzerland
| | - Anne-Lise Giraud
- Department of Basic Neurosciences, University of Geneva, Biotech Campus, Genève, Switzerland
- Institut Pasteur, Université Paris Cité, Inserm, Institut de l’Audition, France
| |
Collapse
|
19
|
Peterson V, Vissani M, Luo S, Rabbani Q, Crone NE, Bush A, Mark Richardson R. A supervised data-driven spatial filter denoising method for speech artifacts in intracranial electrophysiological recordings. bioRxiv 2023:2023.04.05.535577. [PMID: 37066306 PMCID: PMC10104030 DOI: 10.1101/2023.04.05.535577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/18/2023]
Abstract
Neurosurgical procedures that enable direct brain recordings in awake patients offer unique opportunities to explore the neurophysiology of human speech. The scarcity of these opportunities and the altruism of participating patients compel us to apply the highest rigor to signal analysis. Intracranial electroencephalography (iEEG) signals recorded during overt speech can contain a speech artifact that tracks the fundamental frequency (F0) of the participant's voice, involving the same high-gamma frequencies that are modulated during speech production and perception. To address this artifact, we developed a spatial-filtering approach to identify and remove acoustic-induced contaminations of the recorded signal. We found that traditional reference schemes jeopardized signal quality, whereas our data-driven method denoised the recordings while preserving underlying neural activity.
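One generic way to realize a data-driven spatial filter of this kind (a minimal sketch under assumed inputs, not the authors' exact pipeline) is a generalized eigendecomposition contrasting covariance estimated during artifact-laden periods against artifact-free periods, then projecting the strongest artifact components out of the recording:

```python
import numpy as np
from scipy.linalg import eigh

def spatial_denoise(X, artifact_cov, clean_cov, n_remove=1):
    """Generic GED-based spatial denoising sketch.
    X            : (channels, samples) iEEG segment to clean
    artifact_cov : channel covariance estimated when the artifact is present
    clean_cov    : channel covariance from artifact-free data
    n_remove     : number of artifact components to project out
    """
    # Generalized eigendecomposition: directions where artifact-period
    # variance is maximal relative to clean-period variance.
    evals, V = eigh(artifact_cov, clean_cov)   # eigenvalues ascending
    V = V[:, ::-1]                             # strongest artifact first
    W = V.T                                    # unmixing: components = W @ X
    A = np.linalg.pinv(W)                      # mixing: channels = A @ components
    comps = W @ X
    comps[:n_remove] = 0.0                     # zero out artifact components
    return A @ comps                           # back-project to channel space

# Toy usage with random stand-in data: 16 channels, 2000 samples.
rng = np.random.default_rng(0)
X = rng.standard_normal((16, 2000))
artifact_cov = np.cov(X[:, :1000])   # pretend the first half is contaminated
clean_cov = np.cov(X[:, 1000:])
X_clean = spatial_denoise(X, artifact_cov, clean_cov, n_remove=2)
```

In practice the two covariance matrices would come from labeled overt-speech and silent epochs, which is what makes the filter supervised and data-driven.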
Collapse
Affiliation(s)
- Victoria Peterson
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, United States
- Instituto de Matemática Aplicada del Litoral, IMAL, FIQ-UNL, CONICET, Santa Fe, Argentina
| | - Matteo Vissani
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, United States
| | - Shiyu Luo
- Department of Biomedical Engineering, The Johns Hopkins University School of Medicine
| | - Qinwan Rabbani
- Department of Electrical & Computer Engineering, The Johns Hopkins University
| | - Nathan E. Crone
- Department of Neurology, The Johns Hopkins University School of Medicine
| | - Alan Bush
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, United States
| | - R. Mark Richardson
- Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, United States
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
| |
Collapse
|
20
|
Sankaran N, Moses D, Chiong W, Chang EF. Recommendations for promoting user agency in the design of speech neuroprostheses. Front Hum Neurosci 2023; 17:1298129. [PMID: 37920562 PMCID: PMC10619159 DOI: 10.3389/fnhum.2023.1298129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2023] [Accepted: 10/04/2023] [Indexed: 11/04/2023] Open
Abstract
Brain-computer interfaces (BCI) that directly decode speech from brain activity aim to restore communication in people with paralysis who cannot speak. Despite recent advances, neural inference of speech remains imperfect, limiting the ability of speech BCIs to enable experiences such as fluent conversation that promote agency, that is, the ability of users to author and transmit messages enacting their intentions. Here, we make recommendations for promoting agency based on existing and emerging strategies in neural engineering. The focus is on achieving fast, accurate, and reliable performance while ensuring volitional control over when a decoder is engaged, what exactly is decoded, and how messages are expressed. Additionally, alongside neuroscientific progress within controlled experimental settings, we argue that a parallel line of research must consider how to translate experimental successes into real-world environments. While such research will ultimately require input from prospective users, here we identify and describe design choices inspired by human-factors work conducted in existing fields of assistive technology, which address practical issues likely to emerge in future real-world speech BCI applications.
Collapse
Affiliation(s)
- Narayan Sankaran
- Kavli Center for Ethics, Science and the Public, University of California, Berkeley, Berkeley, CA, United States
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
| | - David Moses
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
| | - Winston Chiong
- Memory and Aging Center, Department of Neurology, University of California, San Francisco, San Francisco, CA, United States
| | - Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, United States
| |
Collapse
|
21
|
Berezutskaya J, Freudenburg ZV, Vansteensel MJ, Aarnoutse EJ, Ramsey NF, van Gerven MAJ. Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models. J Neural Eng 2023; 20:056010. [PMID: 37467739 PMCID: PMC10510111 DOI: 10.1088/1741-2552/ace8be] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2022] [Revised: 07/12/2023] [Accepted: 07/19/2023] [Indexed: 07/21/2023]
Abstract
Objective. Development of brain-computer interface (BCI) technology is key for enabling communication in individuals who have lost the faculty of speech due to severe motor paralysis. A BCI control strategy that is gaining attention employs speech decoding from neural data. Recent studies have shown that a combination of direct neural recordings and advanced computational models can provide promising results. Understanding which decoding strategies deliver the best and most directly applicable results is crucial for advancing the field. Approach. In this paper, we optimized and validated a decoding approach based on speech reconstruction directly from high-density electrocorticography recordings from sensorimotor cortex during a speech production task. Main results. We show that (1) dedicated machine learning optimization of reconstruction models is key for achieving the best reconstruction performance; (2) individual word decoding in reconstructed speech achieves 92%-100% accuracy (chance level is 8%); (3) direct reconstruction from sensorimotor brain activity produces intelligible speech. Significance. These results underline the need for model optimization in achieving the best speech decoding results and highlight the potential that reconstruction-based speech decoding from sensorimotor cortex can offer for development of next-generation BCI technology for communication.
Collapse
Affiliation(s)
- Julia Berezutskaya
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
- Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
| | - Zachary V Freudenburg
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
| | - Mariska J Vansteensel
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
| | - Erik J Aarnoutse
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
| | - Nick F Ramsey
- Brain Center, Department of Neurology and Neurosurgery, University Medical Center Utrecht, Utrecht 3584 CX, The Netherlands
| | - Marcel A J van Gerven
- Donders Center for Brain, Cognition and Behaviour, Nijmegen 6525 GD, The Netherlands
| |
Collapse
|
22
|
Bellier L, Llorens A, Marciano D, Gunduz A, Schalk G, Brunner P, Knight RT. Music can be reconstructed from human auditory cortex activity using nonlinear decoding models. PLoS Biol 2023; 21:e3002176. [PMID: 37582062 PMCID: PMC10427021 DOI: 10.1371/journal.pbio.3002176] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2022] [Accepted: 05/30/2023] [Indexed: 08/17/2023] Open
Abstract
Music is core to human experience, yet the precise neural dynamics underlying music perception remain unknown. We analyzed a unique intracranial electroencephalography (iEEG) dataset of 29 patients who listened to a Pink Floyd song and applied a stimulus reconstruction approach previously used in the speech domain. We successfully reconstructed a recognizable song from direct neural recordings and quantified the impact of different factors on decoding accuracy. Combining encoding and decoding analyses, we found a right-hemisphere dominance for music perception with a primary role of the superior temporal gyrus (STG), evidenced a new STG subregion tuned to musical rhythm, and defined an anterior-posterior STG organization exhibiting sustained and onset responses to musical elements. Our findings show the feasibility of applying predictive modeling on short datasets acquired in single patients, paving the way for adding musical elements to brain-computer interface (BCI) applications.
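For readers unfamiliar with stimulus reconstruction, a minimal linear baseline (the study itself also used nonlinear models; all array shapes and the ridge penalty below are illustrative assumptions) regresses the auditory spectrogram on time-lagged neural features:

```python
import numpy as np
from sklearn.linear_model import Ridge

def lagged_features(X, n_lags):
    """Stack time-lagged copies of neural data so the decoder can use a
    short temporal context window. X: (samples, channels)."""
    n, c = X.shape
    out = np.zeros((n, c * n_lags))
    for k in range(n_lags):
        out[k:, k * c:(k + 1) * c] = X[:n - k]
    return out

# Toy stand-in data: 10 s at 100 Hz, 32 electrodes, 32 spectrogram bins.
rng = np.random.default_rng(0)
hfa = rng.standard_normal((1000, 32))    # high-frequency activity features
spec = rng.standard_normal((1000, 32))   # target auditory spectrogram
Xl = lagged_features(hfa, n_lags=25)     # 250 ms of neural context

model = Ridge(alpha=10.0).fit(Xl[:800], spec[:800])  # train on first 8 s
recon = model.predict(Xl[800:])          # reconstruct the held-out segment
```

A reconstructed spectrogram like `recon` can then be inverted to a waveform and compared against the original stimulus to quantify decoding accuracy.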
Collapse
Affiliation(s)
- Ludovic Bellier
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
| | - Anaïs Llorens
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
| | - Déborah Marciano
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
| | - Aysegul Gunduz
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, Florida, United States of America
| | - Gerwin Schalk
- Department of Neurology, Albany Medical College, Albany, New York, United States of America
| | - Peter Brunner
- Department of Neurology, Albany Medical College, Albany, New York, United States of America
- Department of Neurosurgery, Washington University School of Medicine, St. Louis, Missouri, United States of America
- National Center for Adaptive Neurotechnologies, Albany, New York, United States of America
| | - Robert T. Knight
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America
- Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America
| |
Collapse
|
23
|
Giraud AL, Su Y. Reconstructing language from brain signals and deconstructing adversarial thought-reading. Cell Rep Med 2023; 4:101115. [PMID: 37467714 PMCID: PMC10394252 DOI: 10.1016/j.xcrm.2023.101115] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2023] [Revised: 06/19/2023] [Accepted: 06/20/2023] [Indexed: 07/21/2023]
Abstract
Tang et al. report a noninvasive brain-computer interface (BCI) that reconstructs perceived and intended continuous language from semantic brain responses. The study offers new possibilities to radically facilitate neural speech decoder applications and addresses concerns about misuse in non-medical scenarios.
Collapse
Affiliation(s)
- Anne-Lise Giraud
- Institut Pasteur, Université Paris Cité, Inserm, Institut de l'Audition, Paris, France
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| | - Yaqing Su
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
| |
Collapse
|
24
|
Trammel T, Khodayari N, Luck SJ, Traxler MJ, Swaab TY. Decoding semantic relatedness and prediction from EEG: A classification method comparison. Neuroimage 2023:120268. [PMID: 37422278 DOI: 10.1016/j.neuroimage.2023.120268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2023] [Revised: 06/22/2023] [Accepted: 07/06/2023] [Indexed: 07/10/2023] Open
Abstract
Machine-learning (ML) decoding methods have become a valuable tool for analyzing information represented in electroencephalogram (EEG) data. However, a systematic quantitative comparison of the performance of major ML classifiers for the decoding of EEG data in neuroscience studies of cognition is lacking. Using EEG data from two visual word-priming experiments examining well-established N400 effects of prediction and semantic relatedness, we compared the performance of three major ML classifiers that each use different algorithms: support vector machine (SVM), linear discriminant analysis (LDA), and random forest (RF). We separately assessed the performance of each classifier in each experiment using EEG data averaged over cross-validation blocks and using single-trial EEG data by comparing them with analyses of raw decoding accuracy, effect size, and feature importance weights. The results of these analyses demonstrated that SVM outperformed the other ML methods on all measures and in both experiments.
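A sketch of this kind of comparison in scikit-learn, using random stand-in data in place of epoched EEG (trial counts, feature sizes, and hyperparameters are assumptions, and the paper's averaged-versus-single-trial distinction is omitted):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in for epoched EEG: 200 trials, each flattened from
# 64 channels x 50 time points, with binary labels
# (e.g., related vs. unrelated prime).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64 * 50))
y = rng.integers(0, 2, 200)

classifiers = {
    "SVM": SVC(kernel="linear", C=1.0),
    "LDA": LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"),
    "RF":  RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, clf in classifiers.items():
    pipe = make_pipeline(StandardScaler(), clf)       # scale, then classify
    acc = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {acc.mean():.3f} +/- {acc.std():.3f}")
```

With real data, the per-fold accuracies feed directly into the effect-size and feature-importance analyses the abstract describes.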
Collapse
Affiliation(s)
- Timothy Trammel
- Department of Psychology and Center for Mind and Brain, University of California, Davis, CA, United States.
| | - Natalia Khodayari
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, United States
| | - Steven J Luck
- Department of Psychology and Center for Mind and Brain, University of California, Davis, CA, United States
| | - Matthew J Traxler
- Department of Psychology and Center for Mind and Brain, University of California, Davis, CA, United States
| | - Tamara Y Swaab
- Department of Psychology and Center for Mind and Brain, University of California, Davis, CA, United States
| |
Collapse
|
25
|
Chu Q, Ma O, Hang Y, Tian X. Dual-stream cortical pathways mediate sensory prediction. Cereb Cortex 2023:7169133. [PMID: 37197767 DOI: 10.1093/cercor/bhad168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2022] [Revised: 04/24/2023] [Accepted: 04/26/2023] [Indexed: 05/19/2023] Open
Abstract
Predictions are constantly generated from diverse sources to optimize cognitive functions in the ever-changing environment. However, the neural origin and generation process of top-down induced prediction remain elusive. We hypothesized that motor-based and memory-based predictions are mediated by distinct descending networks from motor and memory systems to the sensory cortices. Using functional magnetic resonance imaging (fMRI) and a dual imagery paradigm, we found that motor and memory upstream systems activated the auditory cortex in a content-specific manner. Moreover, the inferior and posterior parts of the parietal lobe differentially relayed predictive signals in motor-to-sensory and memory-to-sensory networks. Dynamic causal modeling of directed connectivity revealed selective enabling and modulation of connections that mediate top-down sensory prediction and ground the distinctive neurocognitive basis of predictive processing.
Collapse
Affiliation(s)
- Qian Chu
- Shanghai Frontiers Science Center of Artificial Intelligence and Deep Learning, Division of Arts and Sciences, New York University Shanghai, Shanghai 200126, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Max Planck-University of Toronto Centre for Neural Science and Technology, Toronto, ON M5S 2E4, Canada
| | - Ou Ma
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
| | - Yuqi Hang
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Department of Administration, Leadership, and Technology, Steinhardt School of Culture, Education, and Human Development, New York University, New York, NY 10003, United States
| | - Xing Tian
- Shanghai Frontiers Science Center of Artificial Intelligence and Deep Learning, Division of Arts and Sciences, New York University Shanghai, Shanghai 200126, China
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China
- Shanghai Key Laboratory of Brain Functional Genomics (Ministry of Education), School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
| |
Collapse
|
26
|
Soroush PZ, Herff C, Ries SK, Shih JJ, Schultz T, Krusienski DJ. The nested hierarchy of overt, mouthed, and imagined speech activity evident in intracranial recordings. Neuroimage 2023; 269:119913. [PMID: 36731812 DOI: 10.1016/j.neuroimage.2023.119913] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 01/05/2023] [Accepted: 01/29/2023] [Indexed: 02/01/2023] Open
Abstract
Recent studies have demonstrated that it is possible to decode and synthesize various aspects of acoustic speech directly from intracranial measurements of electrophysiological brain activity. In order to continue progressing toward the development of a practical speech neuroprosthesis for individuals with speech impairments, better understanding and modeling of imagined speech processes are required. The present study uses intracranial brain recordings from participants who performed a speaking task with trials consisting of overt, mouthed, and imagined speech modes, representing varying degrees of decreasing behavioral output. Speech activity detection models are constructed using spatial, spectral, and temporal brain activity features, and the features and model performances are characterized and compared across the three degrees of behavioral output. The results indicate the existence of a hierarchy in which the relevant channels for the lower behavioral output modes form nested subsets of the relevant channels from the higher behavioral output modes. This provides important insights for the elusive goal of developing more effective imagined speech decoding models with respect to the better-established overt speech decoding counterparts.
Collapse
|
27
|
Lu L, Han M, Zou G, Zheng L, Gao JH. Common and distinct neural representations of imagined and perceived speech. Cereb Cortex 2022; 33:6486-6493. [PMID: 36587299 DOI: 10.1093/cercor/bhac519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2022] [Revised: 12/06/2022] [Accepted: 12/09/2022] [Indexed: 01/02/2023] Open
Abstract
Humans excel at constructing mental representations of speech streams in the absence of external auditory input: the internal experience of speech imagery. Elucidating the neural processes underlying speech imagery is critical to understanding this higher-order brain function in humans. Here, using functional magnetic resonance imaging, we investigated the shared and distinct neural correlates of imagined and perceived speech by asking participants to listen to poems articulated by a male voice (perception condition) and to imagine hearing poems spoken by that same voice (imagery condition). We found that compared to baseline, speech imagery and perception activated overlapping brain regions, including the bilateral superior temporal gyri and supplementary motor areas. The left inferior frontal gyrus was more strongly activated by speech imagery than by speech perception, suggesting functional specialization for generating speech imagery. Although more research with a larger sample size and a direct behavioral indicator is needed to clarify the neural systems underlying the construction of complex speech imagery, this study provides valuable insights into the neural mechanisms of the closely associated but functionally distinct processes of speech imagery and perception.
Collapse
Affiliation(s)
- Lingxi Lu
- Center for the Cognitive Science of Language, Beijing Language and Culture University, Beijing 100083, China
| | - Meizhen Han
- National Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing 100875, China
| | - Guangyuan Zou
- Center for MRI Research, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Li Zheng
- Center for MRI Research, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Jia-Hong Gao
- Center for MRI Research, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China
- Beijing City Key Lab for Medical Physics and Engineering, Institution of Heavy Ion Physics, School of Physics, Peking University, Beijing 100871, China
- National Biomedical Imaging Center, Peking University, Beijing 100871, China
| |
Collapse
|
28
|
Verwoert M, Ottenhoff MC, Goulis S, Colon AJ, Wagner L, Tousseyn S, van Dijk JP, Kubben PL, Herff C. Dataset of Speech Production in Intracranial Electroencephalography. Sci Data 2022; 9:434. [PMID: 35869138 PMCID: PMC9307753 DOI: 10.1038/s41597-022-01542-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2022] [Accepted: 07/08/2022] [Indexed: 11/28/2022] Open
Abstract
Speech production is an intricate process involving a large number of muscles and cognitive processes. The neural processes underlying speech production are not completely understood. As speech is a uniquely human ability, it cannot be investigated in animal models. High-fidelity human data can only be obtained in clinical settings and are therefore not easily available to all researchers. Here, we provide a dataset of 10 participants reading out individual words while we measured intracranial EEG from a total of 1103 electrodes. The data, with its high temporal resolution and coverage of a large variety of cortical and sub-cortical brain regions, can help in understanding the speech production process better. Simultaneously, the data can be used to test speech decoding and synthesis approaches from neural data to develop speech brain-computer interfaces and speech neuroprostheses.
Measurement(s): Brain activity. Technology Type(s): Stereotactic electroencephalography. Sample Characteristic - Organism: Homo sapiens. Sample Characteristic - Environment: Epilepsy monitoring center. Sample Characteristic - Location: The Netherlands.
Collapse
|
29
|
Metzger SL, Liu JR, Moses DA, Dougherty ME, Seaton MP, Littlejohn KT, Chartier J, Anumanchipalli GK, Tu-Chan A, Ganguly K, Chang EF. Generalizable spelling using a speech neuroprosthesis in an individual with severe limb and vocal paralysis. Nat Commun 2022; 13:6510. [PMID: 36347863 PMCID: PMC9643551 DOI: 10.1038/s41467-022-33611-3] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2021] [Accepted: 09/26/2022] [Indexed: 11/09/2022] Open
Abstract
Neuroprostheses have the potential to restore communication to people who cannot speak or type due to paralysis. However, it is unclear if silent attempts to speak can be used to control a communication neuroprosthesis. Here, we translated direct cortical signals in a clinical-trial participant (ClinicalTrials.gov; NCT03698149) with severe limb and vocal-tract paralysis into single letters to spell out full sentences in real time. We used deep-learning and language-modeling techniques to decode letter sequences as the participant attempted to silently spell using code words that represented the 26 English letters (e.g. "alpha" for "a"). We leveraged broad electrode coverage beyond speech-motor cortex to include supplemental control signals from hand cortex and complementary information from low- and high-frequency signal components to improve decoding accuracy. We decoded sentences using words from a 1,152-word vocabulary at a median character error rate of 6.13% and speed of 29.4 characters per minute. In offline simulations, we showed that our approach generalized to large vocabularies containing over 9,000 words (median character error rate of 8.23%). These results illustrate the clinical viability of a silently controlled speech neuroprosthesis to generate sentences from a large vocabulary through a spelling-based approach, complementing previous demonstrations of direct full-word decoding.
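Character error rate, the headline metric here, is the Levenshtein edit distance between the decoded and reference strings, normalized by reference length. A self-contained sketch of the standard computation:

```python
def char_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein edit distance between the two strings, normalized by
    the reference length; the standard CER metric in speech decoding."""
    m, n = len(reference), len(hypothesis)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                          # cost of deleting i chars
    for j in range(n + 1):
        d[0][j] = j                          # cost of inserting j chars
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,       # deletion
                          d[i][j - 1] + 1,       # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n] / max(m, 1)

print(char_error_rate("hello world", "hallo world"))  # one substitution: ~0.091
```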
Collapse
Affiliation(s)
- Sean L. Metzger
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - Jessie R. Liu
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| | - David A. Moses
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
| | - Maximilian E. Dougherty
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
| | - Margaret P. Seaton
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
| | - Kaylo T. Littlejohn
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
| | - Josh Chartier
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
| | - Gopala K. Anumanchipalli
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
| | - Adelyn Tu-Chan
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Karunesh Ganguly
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA, USA
- University of California, Berkeley - University of California, San Francisco Graduate Program in Bioengineering, Berkeley, CA, USA
| |
Collapse
|
30
|
Sui Y, Yu H, Zhang C, Chen Y, Jiang C, Li L. Deep brain-machine interfaces: sensing and modulating the human deep brain. Natl Sci Rev 2022; 9:nwac212. [PMID: 36644311 PMCID: PMC9834907 DOI: 10.1093/nsr/nwac212] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2022] [Revised: 10/02/2022] [Accepted: 10/04/2022] [Indexed: 01/18/2023] Open
Abstract
Different from conventional brain-machine interfaces that focus more on decoding the cerebral cortex, deep brain-machine interfaces enable interactions between external machines and deep brain structures. They sense and modulate deep brain neural activities, aiming at function restoration, device control and therapeutic improvements. In this article, we provide an overview of multiple deep brain recording and stimulation techniques that can serve as deep brain-machine interfaces. We highlight two widely used interface technologies, namely deep brain stimulation and stereotactic electroencephalography, for technical trends, clinical applications and brain connectivity research. We discuss the potential to develop closed-loop deep brain-machine interfaces and achieve more effective and applicable systems for the treatment of neurological and psychiatric disorders.
Collapse
Affiliation(s)
- Yanan Sui
- National Engineering Research Center of Neuromodulation, Tsinghua University, Beijing 100084, China
| | - Huiling Yu
- National Engineering Research Center of Neuromodulation, Tsinghua University, Beijing 100084, China
| | - Chen Zhang
- National Engineering Research Center of Neuromodulation, Tsinghua University, Beijing 100084, China
| | - Yue Chen
- National Engineering Research Center of Neuromodulation, Tsinghua University, Beijing 100084, China
| | - Changqing Jiang
- National Engineering Research Center of Neuromodulation, Tsinghua University, Beijing 100084, China
| | | |
Collapse
|
31
|
Kim T, Shin Y, Kang K, Kim K, Kim G, Byeon Y, Kim H, Gao Y, Lee JR, Son G, Kim T, Jun Y, Kim J, Lee J, Um S, Kwon Y, Son BG, Cho M, Sang M, Shin J, Kim K, Suh J, Choi H, Hong S, Cheng H, Kang HG, Hwang D, Yu KJ. Ultrathin crystalline-silicon-based strain gauges with deep learning algorithms for silent speech interfaces. Nat Commun 2022; 13:5815. [PMID: 36192403 PMCID: PMC9530138 DOI: 10.1038/s41467-022-33457-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2022] [Accepted: 09/16/2022] [Indexed: 11/28/2022] Open
Abstract
A wearable silent speech interface (SSI) is a promising platform that enables verbal communication without vocalization. The most widely studied methodology for SSI focuses on surface electromyography (sEMG). However, sEMG suffers from low scalability because of signal quality-related issues, including signal-to-noise ratio and interelectrode interference. Hence, here, we present a novel SSI utilizing crystalline-silicon-based strain sensors combined with a 3D convolutional deep learning algorithm. Two perpendicularly placed strain gauges with minimized cell dimension (<0.1 mm²) could effectively capture the biaxial strain information with high reliability. We attached four strain sensors near the subject's mouth and collected strain data for an unprecedentedly large wordset (100 words), which our SSI can classify at a high accuracy rate (87.53%). Several analysis methods were demonstrated to verify the system's reliability, along with a performance comparison against another SSI using sEMG electrodes of the same dimension, which exhibited a relatively low accuracy rate (42.60%). Designing an efficient platform that enables verbal communication without vocalization remains a challenge. Here, the authors propose a silent speech interface utilizing a deep learning algorithm combined with strain sensors attached near the subject's mouth, able to collect 100 words and classify them at a high accuracy rate.
Collapse
Affiliation(s)
- Taemin Kim
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Yejee Shin
- Medical Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Kyowon Kang
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Kiho Kim
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Gwanho Kim
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Yunsu Byeon
- Medical Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Hwayeon Kim
- Digital Signal Processing & Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Yuyan Gao
- Department of Engineering Science and Mechanics, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Jeong Ryong Lee
- Medical Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Geonhui Son
- Medical Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Taeseong Kim
- Medical Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Yohan Jun
- Medical Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Radiology, Harvard Medical School, Boston, MA, USA
| | - Jihyun Kim
- Digital Signal Processing & Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Jinyoung Lee
- Digital Signal Processing & Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Seyun Um
- Digital Signal Processing & Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Yoohwan Kwon
- Digital Signal Processing & Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Byung Gwan Son
- Digital Signal Processing & Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Myeongki Cho
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Mingyu Sang
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Jongwoon Shin
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Kyubeen Kim
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Jungmin Suh
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Heekyeong Choi
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Seokjun Hong
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Huanyu Cheng
- Department of Engineering Science and Mechanics, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Hong-Goo Kang
- Digital Signal Processing & Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea.
| | - Dosik Hwang
- Medical Artificial Intelligence Lab, School of Electrical and Electronic Engineering, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Department of Electrical and Electronic Engineering, YU-Korea Institute of Science and Technology (KIST) Institute, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea
| | - Ki Jun Yu
- Functional Bio-integrated Electronics and Energy Management Lab, School of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Department of Electrical and Electronic Engineering, YU-Korea Institute of Science and Technology (KIST) Institute, Yonsei University, 50, Yonsei-ro, Seodaemun-gu, Seoul, 03722, Korea
| |
Collapse
|
32
|
Cooney C, Folli R, Coyle D. Opportunities, pitfalls and trade-offs in designing protocols for measuring the neural correlates of speech. Neurosci Biobehav Rev 2022; 140:104783. [PMID: 35907491 DOI: 10.1016/j.neubiorev.2022.104783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2022] [Revised: 07/12/2022] [Accepted: 07/15/2022] [Indexed: 11/25/2022]
Abstract
Research on decoding speech and speech-related processes directly from the human brain has intensified in recent years, as such a decoder has the potential to positively impact people with limited communication capacity due to disease or injury. Additionally, it can present entirely new forms of human-computer interaction and human-machine communication in general and facilitate better neuroscientific understanding of speech processes. Here, we synthesize the literature on neural speech decoding pertaining to how speech decoding experiments have been conducted, coalescing around a necessity for thoughtful experimental design aimed at specific research goals, and robust procedures for evaluating speech decoding paradigms. We examine the use of different modalities for presenting stimuli to participants, methods for construction of paradigms including timings and speech rhythms, and possible linguistic considerations. In addition, novel methods for eliciting naturalistic speech and validating imagined speech task performance in experimental settings are presented based on recent research. We also describe the multitude of terms used to instruct participants on how to produce imagined speech during experiments and propose methods for investigating the effect of these terms on imagined speech decoding. We demonstrate that the range of experimental procedures used in neural speech decoding studies can have unintended consequences which can impact upon the efficacy of the knowledge obtained. The review delineates the strengths and weaknesses of present approaches and proposes methodological advances which we anticipate will enhance experimental design and progress toward the optimal design of movement-independent direct speech brain-computer interfaces.
Collapse
Affiliation(s)
- Ciaran Cooney
- Intelligent Systems Research Centre, Ulster University, Derry, UK.
| | - Raffaella Folli
- Institute for Research in Social Sciences, Ulster University, Jordanstown, UK
| | - Damien Coyle
- Intelligent Systems Research Centre, Ulster University, Derry, UK
| |
Collapse
|
33
|
Ward LM, Guevara R. Qualia and Phenomenal Consciousness Arise From the Information Structure of an Electromagnetic Field in the Brain. Front Hum Neurosci 2022; 16:874241. [PMID: 35860400 PMCID: PMC9289677 DOI: 10.3389/fnhum.2022.874241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2022] [Accepted: 06/17/2022] [Indexed: 11/17/2022] Open
Abstract
In this paper we address the following problems and provide realistic answers to them: (1) What could be the physical substrate for subjective, phenomenal, consciousness (P-consciousness)? Our answer: the electromagnetic (EM) field generated by the movement and changes of electrical charges in the brain. (2) Is this substrate generated in some particular part of the brains of conscious entities or does it comprise the entirety of the brain/body? Our answer: a part of the thalamus in mammals, and homologous parts of other brains generates the critical EM field. (3) From whence arise the qualia experienced in P-consciousness? Our answer, the relevant EM field is “structured” by emulating in the brain the information in EM fields arising from both external (the environment) and internal (the body) sources. (4) What differentiates the P-conscious EM field from other EM fields, e.g., the flux of photons scattered from object surfaces, the EM field of an electro-magnet, or the EM fields generated in the brain that do not enter P-consciousness, such as those generated in the retina or occipital cortex, or those generated in brain areas that guide behavior through visual information in persons exhibiting “blindsight”? Our answer: living systems express a boundary between themselves and the environment, requiring them to model (coarsely emulate) information from their environment in order to control through actions, to the extent possible, the vast sea of variety in which they are immersed. This model, expressed in an EM field, is P-consciousness. The model is the best possible representation of the moment-to-moment niche-relevant (action-relevant: affordance) information an organism can generate (a Gestalt). Information that is at a lower level than niche-relevant, such as the unanalyzed retinal vector-field, is not represented in P-consciousness because it is not niche-relevant. Living organisms have sensory and other systems that have evolved to supply such information, albeit in a coarse form.
Collapse
Affiliation(s)
- Lawrence M. Ward
- Department of Psychology and Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, BC, Canada
| | - Ramón Guevara
- Department of Physics and Astronomy, University of Padua, Padua, Italy
- Department of Developmental Psychology and Socialization, Padova Neuroscience Center, University of Padua, Padua, Italy
| |
Collapse
|
34
|
Lee KW, Lee DH, Kim SJ, Lee SW. Decoding Neural Correlation of Language-Specific Imagined Speech using EEG Signals. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:1977-1980. [PMID: 36086641 DOI: 10.1109/embc48229.2022.9871721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Speech impairments due to cerebral lesions and degenerative disorders can be devastating. For humans with severe speech deficits, imagined speech in the brain-computer interface has been a promising hope for reconstructing the neural signals of speech production. However, studies in the EEG-based imagined speech domain still have some limitations due to high variability in spatial and temporal information and a low signal-to-noise ratio. In this paper, we investigated the neural signals of two groups of native speakers performing an imagined speech task in two different languages, English and Chinese. Our assumption was that English, a non-tonal and phonogram-based language, would show spectral differences in neural computation compared to Chinese, a tonal and ideogram-based language. The results showed a significant difference in relative power spectral density between English and Chinese in specific frequency band groups. Also, the spatial evaluation of Chinese native speakers in the theta band was distinctive during the imagination task. Hence, this paper suggests the key spectral and spatial information of word imagination with specialized language while decoding the neural signals of speech. Clinical Relevance: Imagined speech-related studies lead to the development of assistive communication technology, especially for patients with speech disorders such as aphasia due to brain damage. This study suggests significant spectral features by analyzing cross-language differences in EEG-based imagined speech using two widely used languages.
Collapse
|
35
|
CNN Architectures and Feature Extraction Methods for EEG Imaginary Speech Recognition. Sensors 2022; 22:4679. [PMID: 35808173 PMCID: PMC9268757 DOI: 10.3390/s22134679] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/04/2022] [Revised: 06/13/2022] [Accepted: 06/17/2022] [Indexed: 11/25/2022]
Abstract
Speech is a complex mechanism allowing us to communicate our needs, desires and thoughts. In some cases of neural dysfunction, this ability is highly affected, which makes everyday life activities that require communication a challenge. This paper studies different parameters of an intelligent imaginary speech recognition system to obtain the best performance according to the developed method, one that can be applied to a low-cost system with limited resources. In developing the system, we used signals from the Kara One database containing recordings acquired for seven phonemes and four words. In the feature extraction stage, we used a method based on covariance in the frequency domain that performed better compared to the other time-domain methods. Further, we observed the system performance when using different window lengths for the input signal (0.25 s, 0.5 s and 1 s) to highlight the importance of short-term analysis of the signals for imaginary speech. The final goal being the development of a low-cost system, we studied several architectures of convolutional neural networks (CNN) and showed that a more complex architecture does not necessarily lead to better results. Our study was conducted on eight different subjects and is meant to be a system shared across subjects. The best performance reported in this paper is up to 37% accuracy for all 11 different phonemes and words when using cross-covariance computed over the signal spectrum of a 0.25 s window and a CNN containing two convolutional layers with 64 and 128 filters connected to a dense layer with 64 neurons. The final system qualifies as a low-cost system, using limited resources for decision-making and having a running time of 1.8 ms, tested on an AMD Ryzen 7 4800HS CPU.
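A sketch of the best-performing configuration described above, two convolutional layers with 64 and 128 filters feeding a 64-unit dense layer, written in Keras; the 62x62 input shape stands in for the cross-covariance feature maps, whose true dimensions are an assumption here:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

n_classes = 11  # 7 phonemes + 4 words, as in the Kara One task set

model = models.Sequential([
    layers.Input(shape=(62, 62, 1)),            # assumed covariance-map size
    layers.Conv2D(64, 3, activation="relu"),    # first conv layer, 64 filters
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),   # second conv layer, 128 filters
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),        # 64-neuron dense layer
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy batch to confirm the model trains end to end.
X = np.random.randn(32, 62, 62, 1).astype("float32")
y = np.random.randint(0, n_classes, 32)
model.fit(X, y, epochs=1, verbose=0)
```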
Collapse
|
36
|
The Construction of an Action-Speech Feature-Based School Violence Recognition Algorithm and Occupational Therapy Education Model for Adolescents. Occup Ther Int 2022; 2022:1723736. [PMID: 35685225 PMCID: PMC9166964 DOI: 10.1155/2022/1723736] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Revised: 05/07/2022] [Accepted: 05/13/2022] [Indexed: 12/30/2022] Open
Abstract
This paper constructs an algorithm for recognizing school violence among adolescents, and an occupational therapy education model for victims, based on the extraction of action and speech features. To distinguish violent actions from daily actions, features in the time and frequency domains are extracted and action categories are recognized by a BP neural network; for complex actions, decomposing them into basic actions is proposed to improve the recognition rate. To address the high algorithmic complexity caused by high feature dimensionality, an LDA dimensionality reduction algorithm is introduced: reducing the features to 8 dimensions cuts the system running time by about 51%, improves the accuracy of violent action recognition by 3.3%, and increases its recall by 8.86%, while preserving the overall performance of the system. Based on classical D-S theory, an improved D-S evidence fusion algorithm is proposed, which modifies the original evidence model with a new probability distribution function and constructs new fusion rules to resolve fusion conflicts. The recall rate for violent actions is thereby increased to 90.0%, reducing the system's missed alarm rate.
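A minimal sketch of the LDA-plus-neural-network stage (feature dimensionality and class count are assumptions, chosen so LDA can yield 8 discriminant dimensions, and scikit-learn's MLP stands in for the paper's BP network):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Toy stand-in: 500 windows of time/frequency-domain action features
# (40-D, an assumed size), labeled with 9 action classes so that LDA
# can project to its maximum of classes - 1 = 8 dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 40))
y = rng.integers(0, 9, 500)

pipe = make_pipeline(
    LinearDiscriminantAnalysis(n_components=8),              # reduce to 8 dims
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,    # BP-style network
                  random_state=0),
)
pipe.fit(X, y)
print(pipe.score(X, y))
```

Supervised reduction like this shrinks the classifier's input, which is where the reported runtime saving comes from, while keeping class-discriminative structure.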
Collapse
|
37
|
Cheng THZ, Creel SC, Iversen JR. How Do You Feel the Rhythm: Dynamic Motor-Auditory Interactions Are Involved in the Imagination of Hierarchical Timing. J Neurosci 2022; 42:500-512. [PMID: 34848500 PMCID: PMC8802922 DOI: 10.1523/jneurosci.1121-21.2021] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2021] [Revised: 11/10/2021] [Accepted: 11/12/2021] [Indexed: 11/21/2022] Open
Abstract
Predicting and organizing patterns of events is important for humans to survive in a dynamically changing world. The motor system has been proposed to be actively, and necessarily, engaged in not only the production but the perception of rhythm by organizing hierarchical timing that influences auditory responses. It is not yet well understood how the motor system interacts with the auditory system to perceive and maintain hierarchical structure in time. This study investigated the dynamic interaction between auditory and motor functional sources during the perception and imagination of musical meters. We pursued this using a novel method combining high-density EEG, EMG, and motion capture with independent component analysis to separate motor and auditory activity during meter imagery while robustly controlling against covert movement. We demonstrated that endogenous brain activity in both auditory and motor functional sources reflects the imagination of binary and ternary meters in the absence of corresponding acoustic cues or overt movement at the meter rate. We found clear evidence for hypothesized motor-to-auditory information flow at the beat rate in all conditions, suggesting a role for top-down influence of the motor system on auditory processing of beat-based rhythms, and reflecting an auditory-motor system with tight reciprocal informational coupling. These findings align with and further extend a set of motor hypotheses from beat perception to hierarchical meter imagination, adding supporting evidence to active engagement of the motor system in auditory processing, which may more broadly speak to the neural mechanisms of temporal processing in other human cognitive functions.
SIGNIFICANCE STATEMENT: Humans live in a world full of hierarchically structured temporal information, the accurate perception of which is essential for understanding speech and music. Music provides a window into the brain mechanisms of time perception, enabling us to examine how the brain groups musical beats into, for example a march or waltz. Using a novel paradigm combining measurement of electrical brain activity with data-driven analysis, this study directly investigates motor-auditory connectivity during meter imagination. Findings highlight the importance of the motor system in the active imagination of meter. This study sheds new light on a fundamental form of perception by demonstrating how auditory-motor interaction may support hierarchical timing processing, which may have clinical implications for speech and motor rehabilitation.
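The source-separation step can be sketched with a generic ICA decomposition (scikit-learn's FastICA here; the study's actual pipeline combined high-density EEG, EMG, and motion capture, and the component indices below are placeholders):

```python
import numpy as np
from sklearn.decomposition import FastICA

# Toy stand-in: 64-channel EEG, 10 s at 250 Hz, arranged as
# (samples, channels) as FastICA expects.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((2500, 64))

ica = FastICA(n_components=20, random_state=0)
sources = ica.fit_transform(eeg)   # (samples, components) time courses
mixing = ica.mixing_               # (channels, components) scalp patterns

# Drop components judged to be motor/EMG-like (indices are placeholders
# for what would be an expert or automated selection), then back-project
# the remaining components to channel space.
keep = [i for i in range(20) if i not in (3, 7)]
eeg_auditory = sources[:, keep] @ mixing[:, keep].T + ica.mean_
```

In practice, which components count as "motor" versus "auditory" is decided from their scalp patterns, spectra, and relation to the EMG and motion-capture channels, not from fixed indices.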
Collapse
Affiliation(s)
- Tzu-Han Zoe Cheng
- Department of Cognitive Science, University of California-San Diego, La Jolla, California 92093
- Institute for Neural Computation and Swartz Center for Computational Neuroscience, University of California-San Diego, La Jolla, California 92093
| | - Sarah C Creel
- Department of Cognitive Science, University of California-San Diego, La Jolla, California 92093
| | - John R Iversen
- Institute for Neural Computation and Swartz Center for Computational Neuroscience, University of California-San Diego, La Jolla, California 92093
| |
Collapse
|