1
Rotaru I, Geirnaert S, Heintz N, Van de Ryck I, Bertrand A, Francart T. What are we really decoding? Unveiling biases in EEG-based decoding of the spatial focus of auditory attention. J Neural Eng 2024; 21:016017. [PMID: 38266281 DOI: 10.1088/1741-2552/ad2214]
Abstract
Objective. Spatial auditory attention decoding (Sp-AAD) refers to the task of identifying the direction of the speaker to which a person is attending in a multi-talker setting, based on the listener's neural recordings, e.g. electroencephalography (EEG). The goal of this study is to thoroughly investigate potential biases when training such Sp-AAD decoders on EEG data, particularly eye-gaze biases and latent trial-dependent confounds, which may result in Sp-AAD models that decode eye-gaze or trial-specific fingerprints rather than spatial auditory attention. Approach. We designed a two-speaker audiovisual Sp-AAD protocol in which the spatial auditory and visual attention were enforced to be either congruent or incongruent, and we recorded EEG data from sixteen participants undergoing several trials recorded at distinct timepoints. We trained a simple linear model for Sp-AAD based on common spatial patterns filters in combination with either linear discriminant analysis (LDA) or k-means clustering, and evaluated them both across- and within-trial. Main results. We found that even a simple linear Sp-AAD model is susceptible to overfitting to confounding signal patterns such as eye-gaze and trial fingerprints (e.g. due to feature shifts across trials), resulting in artificially high decoding accuracies. Furthermore, we found that changes in the EEG signal statistics across trials deteriorate the trial generalization of the classifier, even when the latter is retrained on the test trial with an unsupervised algorithm. Significance. Collectively, our findings confirm that there exist subtle biases and confounds that can strongly interfere with the decoding of spatial auditory attention from EEG. It is expected that more complicated non-linear models based on deep neural networks, which are often used for Sp-AAD, are even more vulnerable to such biases. Future work should perform experiments and model evaluations that avoid and/or control for such biases in Sp-AAD tasks.
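As a concrete illustration of the decoder family studied here (CSP filtering followed by a simple classifier), the following is a minimal NumPy/SciPy sketch, not the authors' implementation; the generalized-eigenvalue formulation and all function names are our own illustrative choices:

```python
import numpy as np
from scipy.linalg import eigh  # generalized symmetric eigenproblem

def csp_filters(X1, X2, n_pairs=3):
    """Common spatial patterns for two classes of EEG trials.

    X1, X2: arrays of shape (n_trials, n_channels, n_samples), one per
    attended-direction class. Returns (n_channels, 2*n_pairs) spatial
    filters maximizing the variance ratio between the two classes.
    """
    def mean_cov(X):
        return np.mean([np.cov(trial) for trial in X], axis=0)

    C1, C2 = mean_cov(X1), mean_cov(X2)
    # Solve C1 w = lambda (C1 + C2) w; eigenvalues come back ascending.
    evals, evecs = eigh(C1, C1 + C2)
    keep = np.r_[np.arange(n_pairs), np.arange(-n_pairs, 0)]  # both extremes
    return evecs[:, keep]

def log_var_features(X, W):
    """Normalized log-variance of CSP-filtered trials: (n_trials, 2*n_pairs)."""
    Y = np.einsum("ck,tcs->tks", W, np.asarray(X))
    v = Y.var(axis=2)
    return np.log(v / v.sum(axis=1, keepdims=True))
```

These features would then feed an LDA or k-means step; the point of the abstract is that such a pipeline can latch onto eye-gaze or trial fingerprints rather than attention, so evaluation must control for those confounds.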
Affiliation(s)
- Iustina Rotaru
- Department of Neurosciences, ExpORL, KU Leuven, Herestraat 49 bus 721, B-3000 Leuven, Belgium
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
- Simon Geirnaert
- Department of Neurosciences, ExpORL, KU Leuven, Herestraat 49 bus 721, B-3000 Leuven, Belgium
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
- Leuven.AI-KU Leuven Institute for AI, Leuven, Belgium
- Nicolas Heintz
- Department of Neurosciences, ExpORL, KU Leuven, Herestraat 49 bus 721, B-3000 Leuven, Belgium
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
- Leuven.AI-KU Leuven Institute for AI, Leuven, Belgium
- Iris Van de Ryck
- Department of Neurosciences, ExpORL, KU Leuven, Herestraat 49 bus 721, B-3000 Leuven, Belgium
- Alexander Bertrand
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
- Leuven.AI-KU Leuven Institute for AI, Leuven, Belgium
- Tom Francart
- Department of Neurosciences, ExpORL, KU Leuven, Herestraat 49 bus 721, B-3000 Leuven, Belgium
- Leuven.AI-KU Leuven Institute for AI, Leuven, Belgium
2
Wang Q, Luo L, Xu N, Wang J, Yang R, Chen G, Ren J, Luan G, Fang F. Neural response properties predict perceived contents and locations elicited by intracranial electrical stimulation of human auditory cortex. Cereb Cortex 2024; 34:bhad517. [PMID: 38185991 DOI: 10.1093/cercor/bhad517]
Abstract
Intracranial electrical stimulation (iES) of auditory cortex can elicit sound experiences with a variety of perceived contents (hallucination or illusion) and locations (contralateral or bilateral side), independent of actual acoustic inputs. However, the neural mechanisms underlying this elicitation heterogeneity remain unknown. Here, we collected subjective reports following iES at 3062 intracranial sites in 28 patients (both sexes) and identified 113 auditory cortical sites with iES-elicited sound experiences. We then decomposed the sound-induced intracranial electroencephalogram (iEEG) signals recorded from all 113 sites into time-frequency features. We found that the iES-elicited perceived contents can be predicted by the early high-γ features extracted from sound-induced iEEG. In contrast, the perceived locations elicited by stimulating hallucination sites and illusion sites are determined by the late high-γ and long-lasting α features, respectively. Our study unveils the crucial neural signatures of iES-elicited sound experiences in humans and presents a new strategy for hearing restoration in individuals suffering from deafness.
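The time-frequency features named here (e.g. high-γ) are commonly extracted as band-limited amplitude envelopes. Below is a generic SciPy sketch of that technique, not the authors' pipeline; the band edges and names are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_envelope(x, fs, lo=70.0, hi=150.0, order=4):
    """Band-limited amplitude envelope of one iEEG channel.

    Bandpass-filters x (zero-phase) into [lo, hi] Hz, e.g. the high-gamma
    range, then takes the magnitude of the analytic (Hilbert) signal.
    """
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="bandpass")
    return np.abs(hilbert(filtfilt(b, a, x)))
```

Averaging such envelopes over an early versus late post-stimulus window would yield the kind of "early high-γ" vs "late high-γ" features the abstract refers to.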
Affiliation(s)
- Qian Wang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China
- IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China
- National Key Laboratory of General Artificial Intelligence, Peking University, Beijing 100871, China
- Lu Luo
- School of Psychology, Beijing Sport University, Beijing 100084, China
- Na Xu
- Division of Brain Sciences, Changping Laboratory, Beijing 102206, China
- Jing Wang
- Department of Neurology, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China
- Ruolin Yang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China
- IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Guanpeng Chen
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China
- IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Jie Ren
- Department of Functional Neurosurgery, Beijing Key Laboratory of Epilepsy, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China
- Epilepsy Center, Kunming Sanbo Brain Hospital, Kunming 650100, China
- Guoming Luan
- Department of Functional Neurosurgery, Beijing Key Laboratory of Epilepsy, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China
- Beijing Institute for Brain Disorders, Beijing 100069, China
- Fang Fang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China
- IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
3
Trigeminal stimulation is required for neural representations of bimodal odor localization: A time-resolved multivariate EEG and fNIRS study. Neuroimage 2023; 269:119903. [PMID: 36708974 DOI: 10.1016/j.neuroimage.2023.119903]
Abstract
Whereas neural representations of spatial information are commonly studied in vision, olfactory stimuli might also be able to create such representations via the trigeminal system. In two independent multi-method electroencephalography-functional near-infrared spectroscopy (EEG+fNIRS) experiments (n1=18, n2=14), we explored whether monorhinal odor stimuli can evoke spatial representations in the brain. We tested whether this representation depends on trigeminal properties of the stimulus, and whether its retention in short-term memory follows the "sensorimotor recruitment theory", using multivariate representational similarity analysis (RSA). We demonstrate that the delta frequency band up to 5 Hz across the skull entails spatial information about which nostril has been stimulated. Delta frequencies were localized in a network involving primary and secondary olfactory, motor-sensory and occipital regions. RSA on fNIRS data showed that monorhinal stimulations evoke neuronal representations in motor-sensory regions and that this representation is kept stable beyond the time of perception. These effects disappeared when the odor stimulus did not sufficiently stimulate the trigeminal nerve. Our results are the first evidence that the trigeminal system can create spatial representations of bimodal odors in the brain and that these representations follow principles similar to those of other sensory systems.
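The core RSA step (comparing a neural representational dissimilarity matrix against a model RDM) can be sketched in a few lines. This is our own illustrative code under generic assumptions, not the study's pipeline:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_score(patterns, model_rdm):
    """Representational similarity: rank-correlate neural and model RDMs.

    patterns: (n_conditions, n_features) response patterns, one row per
    condition (e.g. left- vs right-nostril stimulation); model_rdm: condensed
    model dissimilarity vector (same ordering as scipy's pdist output).
    """
    neural_rdm = pdist(patterns, metric="correlation")
    rho, _ = spearmanr(neural_rdm, model_rdm)
    return rho
```

A model RDM encoding "same nostril = similar, different nostril = dissimilar" would score high only if the measured patterns actually carry the spatial information.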
4
Popov T, Gips B, Weisz N, Jensen O. Brain areas associated with visual spatial attention display topographic organization during auditory spatial attention. Cereb Cortex 2023; 33:3478-3489. [PMID: 35972419 PMCID: PMC10068281 DOI: 10.1093/cercor/bhac285]
Abstract
Spatially selective modulation of alpha power (8-14 Hz) is a robust finding in electrophysiological studies of visual attention, and has recently been generalized to auditory spatial attention. This modulation pattern is interpreted as reflecting a top-down mechanism for suppressing distracting input from unattended directions of sound origin. The present study on auditory spatial attention extends this interpretation by demonstrating that alpha power modulation is closely linked to oculomotor action. We designed an auditory paradigm in which participants were required to attend to upcoming sounds from one of 24 loudspeakers arranged in a circular array around the head. Maintaining the location of an auditory cue was associated with a topographically modulated distribution of posterior alpha power resembling the findings known from visual attention. Multivariate analyses allowed the prediction of the sound location in the horizontal plane. Importantly, this prediction was also possible when derived from signals capturing saccadic activity. A control experiment on auditory spatial attention confirmed that, in the absence of any visual/auditory input, lateralization of alpha power is linked to the lateralized direction of gaze. Attending to an auditory target engages oculomotor and visual cortical areas in a topographic manner akin to the retinotopic organization associated with visual attention.
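The lateralized alpha-power modulation described above is often summarized with a simple normalized asymmetry index. A trivial sketch (our illustrative convention, not the authors' analysis):

```python
import numpy as np

def alpha_lateralization(power_left_chs, power_right_chs):
    """Normalized left-right alpha-power asymmetry, one value per trial.

    power_*: (n_trials, n_channels) alpha-band power over left/right
    posterior channels. Positive values indicate more alpha power over the
    left hemisphere (by this sign convention).
    """
    L = power_left_chs.mean(axis=1)
    R = power_right_chs.mean(axis=1)
    return (L - R) / (L + R)
```

Tracking such an index as a function of cued loudspeaker direction (or gaze direction) is one way the topographic link reported here can be quantified.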
Affiliation(s)
- Tzvetan Popov
- Methods of Plasticity Research, Department of Psychology, University of Zurich, Zurich, Switzerland
- Department of Psychology, University of Konstanz, Konstanz, Germany
- Bart Gips
- NATO Science and Technology Organization Centre for Maritime Research and Experimentation (CMRE), La Spezia 19126, Italy
- Nathan Weisz
- Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Salzburg, Austria
- Ole Jensen
- School of Psychology, University of Birmingham, Birmingham, UK
5
Haro S, Rao HM, Quatieri TF, Smalt CJ. EEG Alpha and Pupil Diameter Reflect Endogenous Auditory Attention Switching and Listening Effort. Eur J Neurosci 2022; 55:1262-1277. [PMID: 35098604 PMCID: PMC9305413 DOI: 10.1111/ejn.15616]
Abstract
Everyday environments often contain distracting competing talkers and background noise, requiring listeners to focus their attention on one acoustic source and reject others. During this auditory attention task, listeners may naturally interrupt their sustained attention and switch attended sources. The effort required to perform this attention switch has not been well studied in the context of competing continuous speech. In this work, we developed two variants of endogenous attention switching and a sustained attention control. We characterized these three experimental conditions in the context of decoding auditory attention, while simultaneously evaluating listening effort and neural markers of spatial-audio cues. A least-squares, electroencephalography (EEG)-based attention decoding algorithm was implemented across all conditions. It achieved an accuracy of 69.4% and 64.0% when computed over nonoverlapping 10-s and 5-s correlation windows, respectively. Both decoders illustrated smooth transitions in the attended-talker prediction through switches at approximately half of the analysis window size (e.g., the mean lag taken across the two switch conditions was 2.2 s when the 5-s correlation window was used). Expended listening effort, as measured by simultaneous EEG and pupillometry, was also a strong indicator of whether the listeners sustained attention or performed an endogenous attention switch (peak pupil diameter measure [p = 0.034] and minimum parietal alpha power measure [p = 0.016]). We additionally found evidence of talker spatial cues in the form of centrotemporal alpha power lateralization (p = 0.0428). These results suggest that listener effort and spatial cues may be promising features to pursue in a decoding context, in addition to speech-based features.
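The nonoverlapping correlation-window decision rule can be sketched as follows, assuming an envelope reconstructed from EEG is already available (the least-squares reconstruction step itself is omitted; names and window length are illustrative):

```python
import numpy as np

def decode_windows(recon, env_a, env_b, fs, win_s=10.0):
    """Assign attention per nonoverlapping window by Pearson correlation.

    recon: envelope reconstructed from EEG; env_a / env_b: the two candidate
    talker envelopes, all sampled at fs Hz. Returns one label per window
    (0 -> talker A, 1 -> talker B), picking the higher-correlated envelope.
    """
    n = int(win_s * fs)
    labels = []
    for start in range(0, len(recon) - n + 1, n):
        s = slice(start, start + n)
        ra = np.corrcoef(recon[s], env_a[s])[0, 1]
        rb = np.corrcoef(recon[s], env_b[s])[0, 1]
        labels.append(0 if ra >= rb else 1)
    return np.array(labels)
```

With this rule, a true attention switch only becomes visible once roughly half a window of post-switch data has accumulated, which matches the ~half-window lag the abstract reports.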
Affiliation(s)
- Stephanie Haro
- Human Health and Performance Systems, MIT Lincoln Laboratory, Lexington, MA, USA
- Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
- Hrishikesh M. Rao
- Human Health and Performance Systems, MIT Lincoln Laboratory, Lexington, MA, USA
- Thomas F. Quatieri
- Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
6
Crosse MJ, Zuk NJ, Di Liberto GM, Nidiffer AR, Molholm S, Lalor EC. Linear Modeling of Neurophysiological Responses to Speech and Other Continuous Stimuli: Methodological Considerations for Applied Research. Front Neurosci 2021; 15:705621. [PMID: 34880719 PMCID: PMC8648261 DOI: 10.3389/fnins.2021.705621]
Abstract
Cognitive neuroscience, in particular research on speech and language, has seen an increase in the use of linear modeling techniques for studying the processing of natural, environmental stimuli. The availability of such computational tools has prompted similar investigations in many clinical domains, facilitating the study of cognitive and sensory deficits under more naturalistic conditions. However, studying clinical (and often highly heterogeneous) cohorts introduces an added layer of complexity to such modeling procedures, potentially leading to instability of such techniques and, as a result, inconsistent findings. Here, we outline some key methodological considerations for applied research, referring to a hypothetical clinical experiment involving speech processing and worked examples of simulated electrophysiological (EEG) data. In particular, we focus on experimental design, data preprocessing, stimulus feature extraction, model design, model training and evaluation, and interpretation of model weights. Throughout the paper, we demonstrate the implementation of each step in MATLAB using the mTRF-Toolbox and discuss how to address issues that could arise in applied research. In doing so, we hope to provide better intuition on these more technical points and provide a resource for applied and clinical researchers investigating sensory and cognitive processing using ecologically rich stimuli.
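The paper's worked examples use MATLAB and the mTRF-Toolbox; purely to illustrate the underlying linear model, here is a NumPy sketch of a ridge-regularized forward TRF fit with a time-lagged design matrix (our own names and regularization choice, not the mTRF-Toolbox API):

```python
import numpy as np

def lagged_design(stim, lags):
    """Time-lagged design matrix: column j holds stim delayed by lags[j] samples."""
    X = np.zeros((len(stim), len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[: len(stim) - lag]
        else:  # negative lag = acausal (stimulus leads response)
            X[:lag, j] = stim[-lag:]
    return X

def fit_trf(stim, eeg, lags, lam=1.0):
    """Ridge-regularized TRF mapping a stimulus feature to one EEG channel.

    Solves (X'X + lam*I) w = X'y; w is the TRF over the requested lags.
    """
    X = lagged_design(stim, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
```

The regularization parameter lam is exactly the kind of hyperparameter the paper argues should be cross-validated carefully in heterogeneous clinical cohorts rather than fixed ad hoc.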
Affiliation(s)
- Michael J Crosse
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- X, The Moonshot Factory, Mountain View, CA, United States
- Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States
- Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
- Nathaniel J Zuk
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
- Department of Neuroscience, University of Rochester, Rochester, NY, United States
- Giovanni M Di Liberto
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Centre for Biomedical Engineering, School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland
- School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
- Aaron R Nidiffer
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
- Department of Neuroscience, University of Rochester, Rochester, NY, United States
- Sophie Molholm
- Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States
- Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
- Edmund C Lalor
- Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
- Department of Neuroscience, University of Rochester, Rochester, NY, United States
7
Li J, Hong B, Nolte G, Engel AK, Zhang D. Preparatory delta phase response is correlated with naturalistic speech comprehension performance. Cogn Neurodyn 2021; 16:337-352. [PMID: 35401861 PMCID: PMC8934811 DOI: 10.1007/s11571-021-09711-z]
Abstract
While human speech comprehension is thought to be an active process that involves top-down predictions, it remains unclear how predictive information is used to prepare for the processing of upcoming speech information. We aimed to identify the neural signatures of the preparatory processing of upcoming speech. Participants selectively attended to one of two competing naturalistic, narrative speech streams, and a temporal response function (TRF) method was applied to derive event-related-like neural responses from electroencephalographic data. The phase responses to the attended speech at the delta band (1-4 Hz) were correlated with the comprehension performance of individual participants, with a latency of -200 to 0 ms relative to the onset of speech amplitude envelope fluctuations over the fronto-central and left-lateralized parietal electrodes. The phase responses to the attended speech at the alpha band also correlated with comprehension performance, but with a latency of 650-980 ms post-onset over the fronto-central electrodes. Distinct neural signatures were found for the attentional modulation, taking the form of TRF-based amplitude responses at a latency of 240-320 ms post-onset over the left-lateralized fronto-central and occipital electrodes. Our findings reveal how the brain prepares to process upcoming speech in a continuous, naturalistic speech context.
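One standard way to relate a circular quantity (phase) to a linear one (comprehension score) is the circular-linear correlation; the abstract does not specify which statistic was used, so the following SciPy sketch is an assumption on our part, purely to illustrate the kind of phase-behavior analysis described:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def delta_phase(x, fs, lo=1.0, hi=4.0):
    """Instantaneous delta-band (1-4 Hz) phase of one EEG channel."""
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="bandpass")
    return np.angle(hilbert(filtfilt(b, a, x)))

def circ_lin_corr(phases, scores):
    """Circular-linear correlation between per-subject phases and scores.

    Uses the standard formula based on correlations of the scores with
    cos(phase) and sin(phase).
    """
    rxc = np.corrcoef(scores, np.cos(phases))[0, 1]
    rxs = np.corrcoef(scores, np.sin(phases))[0, 1]
    rcs = np.corrcoef(np.cos(phases), np.sin(phases))[0, 1]
    return np.sqrt((rxc**2 + rxs**2 - 2 * rxc * rxs * rcs) / (1 - rcs**2))
```

Here `phases` would be one TRF phase value per participant at the latency of interest, and `scores` the comprehension performance.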
Affiliation(s)
- Jiawei Li
- Department of Psychology, School of Social Sciences, Tsinghua University, Room 334, Mingzhai Building, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
- Bo Hong
- Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
- Guido Nolte
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Andreas K. Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Dan Zhang
- Department of Psychology, School of Social Sciences, Tsinghua University, Room 334, Mingzhai Building, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
8
Vandecappelle S, Deckers L, Das N, Ansari AH, Bertrand A, Francart T. EEG-based detection of the locus of auditory attention with convolutional neural networks. eLife 2021; 10:e56481. [PMID: 33929315 PMCID: PMC8143791 DOI: 10.7554/eLife.56481]
Abstract
In a multi-speaker scenario, the human auditory system is able to attend to one particular speaker of interest and ignore the others. It has been demonstrated that it is possible to use electroencephalography (EEG) signals to infer to which speaker someone is attending by relating the neural activity to the speech signals. However, classifying auditory attention within a short time interval remains the main challenge. We present a convolutional neural network-based approach to extract the locus of auditory attention (left/right) without knowledge of the speech envelopes. Our results show that it is possible to decode the locus of attention within 1-2 s, with a median accuracy of around 81%. These results are promising for neuro-steered noise suppression in hearing aids, in particular in scenarios where per-speaker envelopes are unavailable.
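Purely as an architectural illustration of conv-ReLU-pool-sigmoid locus classification (the abstract does not give layer sizes, so every dimension and name below is an assumption; weights are untrained and nothing here reproduces the authors' network):

```python
import numpy as np

def cnn_forward(eeg, conv_w, conv_b, out_w, out_b):
    """Minimal CNN-style forward pass for left/right locus classification.

    eeg: (n_channels, n_samples); conv_w: (n_filters, n_channels, k) kernels
    spanning all channels. Returns a probability for one class (e.g. right).
    """
    n_f, _, k = conv_w.shape
    n_out = eeg.shape[1] - k + 1
    maps = np.empty((n_f, n_out))
    for f in range(n_f):
        # Valid cross-correlation over time, summed across channels.
        acc = np.zeros(n_out)
        for c in range(eeg.shape[0]):
            acc += np.convolve(eeg[c], conv_w[f, c][::-1], mode="valid")
        maps[f] = np.maximum(acc + conv_b[f], 0.0)  # ReLU
    feats = maps.mean(axis=1)                        # global average pooling
    z = feats @ out_w + out_b
    return 1.0 / (1.0 + np.exp(-z))                  # sigmoid
```

In practice such a model is written in a deep-learning framework and trained with cross-entropy loss; the point of the sketch is only the data flow from multichannel EEG to a single left/right probability without any speech envelope input.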
Affiliation(s)
- Servaas Vandecappelle
- Department of Neurosciences, Experimental Oto-rhino-laryngology, Leuven, Belgium
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Lucas Deckers
- Department of Neurosciences, Experimental Oto-rhino-laryngology, Leuven, Belgium
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Neetha Das
- Department of Neurosciences, Experimental Oto-rhino-laryngology, Leuven, Belgium
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Amir Hossein Ansari
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Alexander Bertrand
- Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Tom Francart
- Department of Neurosciences, Experimental Oto-rhino-laryngology, Leuven, Belgium
10
Geirnaert S, Francart T, Bertrand A. Fast EEG-Based Decoding of the Directional Focus of Auditory Attention Using Common Spatial Patterns. IEEE Trans Biomed Eng 2021; 68:1557-1568. [PMID: 33095706 DOI: 10.1109/tbme.2020.3033446]
Abstract
OBJECTIVE Noise reduction algorithms in current hearing devices lack information about the sound source a user attends to when multiple sources are present. To resolve this issue, they can be complemented with auditory attention decoding (AAD) algorithms, which decode the attention using electroencephalography (EEG) sensors. State-of-the-art AAD algorithms employ a stimulus reconstruction approach, in which the envelope of the attended source is reconstructed from the EEG and correlated with the envelopes of the individual sources. This approach, however, performs poorly on short signal segments, while longer segments yield impractically long detection delays when the user switches attention. METHODS We propose decoding the directional focus of attention using filterbank common spatial pattern filters (FB-CSP) as an alternative AAD paradigm, which does not require access to the clean source envelopes. RESULTS The proposed FB-CSP approach outperforms both the stimulus reconstruction approach on short signal segments and a convolutional neural network approach on the same task. We achieve a high accuracy (80% for [Formula: see text] windows and 70% for quasi-instantaneous decisions), which is sufficient to reach minimal expected switch durations below [Formula: see text]. We also demonstrate that the decoder can adapt to unlabeled data from an unseen subject and works with only a subset of EEG channels located around the ear to emulate a wearable EEG setup. CONCLUSION The proposed FB-CSP method provides fast and accurate decoding of the directional focus of auditory attention. SIGNIFICANCE The high accuracy on very short data segments is a major step forward towards practical neuro-steered hearing devices.
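The filterbank front end of FB-CSP can be sketched as follows; the band edges below are illustrative assumptions, and the per-band CSP fitting and classification stages (which would follow) are omitted:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

BANDS = [(1, 4), (4, 8), (8, 12), (12, 18), (18, 30)]  # example filterbank (Hz)

def filterbank(eeg, fs, bands=BANDS, order=4):
    """Decompose (n_channels, n_samples) EEG into band-limited copies.

    Returns (n_bands, n_channels, n_samples). In an FB-CSP pipeline, each
    band then gets its own set of CSP filters, and the per-band log-variance
    features are concatenated before classification.
    """
    out = np.empty((len(bands), *eeg.shape))
    for i, (lo, hi) in enumerate(bands):
        sos = butter(order, [lo / (fs / 2), hi / (fs / 2)],
                     btype="bandpass", output="sos")
        out[i] = sosfiltfilt(sos, eeg, axis=1)
    return out
```

Because the features are spatial variance patterns rather than reconstructed envelopes, decisions can be made on very short windows, which is the key contrast with stimulus reconstruction drawn in the abstract.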
11
Wang S, Lin M, Sun L, Chen X, Fu X, Yan L, Li C, Zhang X. Neural Mechanisms of Hearing Recovery for Cochlear-Implanted Patients: An Electroencephalogram Follow-Up Study. Front Neurosci 2021; 14:624484. [PMID: 33633529 PMCID: PMC7901906 DOI: 10.3389/fnins.2020.624484]
Abstract
Background: Patients with severe profound hearing loss can benefit from cochlear implantation (CI). However, the neural mechanism of such benefit is still unclear. Therefore, we analyzed the electroencephalogram (EEG) and behavioral indicators of auditory function remodeling in patients with CI. Both indicators were sampled at multiple time points after implantation (1, 90, and 180 days). Methods: First, speech perception ability was evaluated with the recording of a list of Chinese words and sentences in 15 healthy controls (HC group) and 10 patients with CI (CI group). EEG data were collected using an oddball paradigm. Then, the characteristics of event-related potentials (ERPs) and mismatch negativity (MMN) were compared between the CI group and the HC group. In addition, we analyzed the phase lag indices (PLI) in the CI group and the HC group and calculated the difference in functional connectivity between the two groups at different stages after implantation. Results: The behavioral indicator, speech recognition ability, improved in CI patients as the implantation time increased. The MMN analysis showed that CI patients could recognize the difference between standard and deviant stimuli just like the HCs 90 days after cochlear implantation. Comparing the latencies of N1/P2/MMN between the CI group and the HC group, we found that the latency of N1/P2 in CI patients was longer, while the latency of MMN in CI users was shorter. In addition, PLI-based whole-brain functional connectivity (PLI-FC) showed that the difference between the CI group and the HC group mainly exists in electrode pairs between the bilateral auditory area and the frontal area. Furthermore, all these differences gradually decreased with increasing implantation time. Conclusion: The N1 amplitude, N1/P2/MMN latency, and PLI-FC in the alpha band may reflect the process of auditory function remodeling and could be an objective index for the assessment of speech perception ability and the effect of cochlear implantation.
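The phase lag index underlying PLI-FC has a standard definition, |mean(sign(sin Δφ))|, computed between every electrode pair; a minimal SciPy sketch for one pair (illustrative, not the authors' code):

```python
import numpy as np
from scipy.signal import hilbert

def pli(x, y):
    """Phase lag index between two signals: |mean(sign(sin(phase diff)))|.

    Insensitive to zero-lag (volume-conduction) coupling: 0 means no
    consistent nonzero phase lag, 1 means a fixed nonzero lag.
    """
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.sign(np.sin(dphi))))
```

Band-limiting the signals first (e.g. to the alpha band, as in the PLI-FC result above) and evaluating all electrode pairs yields the whole-brain connectivity matrices compared between groups.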
Affiliation(s)
- Songjian Wang
- School of Biomedical Engineering, Capital Medical University, Beijing, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China
- Meng Lin
- School of Biomedical Engineering, Capital Medical University, Beijing, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China
- Liwei Sun
- School of Biomedical Engineering, Capital Medical University, Beijing, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China
- Xueqing Chen
- Key Laboratory of Otolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Ministry of Education, Beijing, China
- Xinxing Fu
- Key Laboratory of Otolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Ministry of Education, Beijing, China
- LiLi Yan
- Key Laboratory of Otolaryngology Head and Neck Surgery, Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Ministry of Education, Beijing, China
- Chunlin Li
- School of Biomedical Engineering, Capital Medical University, Beijing, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China
- Xu Zhang
- School of Biomedical Engineering, Capital Medical University, Beijing, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China
12
Bednar A, Lalor EC. Where is the cocktail party? Decoding locations of attended and unattended moving sound sources using EEG. Neuroimage 2019; 205:116283. [PMID: 31629828 DOI: 10.1016/j.neuroimage.2019.116283]
Abstract
Recently, we showed that in a simple acoustic scene with one sound source, auditory cortex tracks the time-varying location of a continuously moving sound. Specifically, we found that both the delta phase and alpha power of the electroencephalogram (EEG) can be used to reconstruct the sound source azimuth. However, in natural settings we are often presented with a mixture of multiple competing sounds, and so we must focus our attention on the relevant source in order to segregate it from the competing sources, i.e. the 'cocktail party effect'. While many studies have examined this phenomenon in the context of sound envelope tracking by the cortex, it is unclear how we process and utilize spatial information in complex acoustic scenes with multiple sound sources. To test this, we created an experiment in which subjects listened over headphones to two concurrent sound stimuli that were moving within the horizontal plane while we recorded their EEG. Participants were tasked with paying attention to one of the two presented stimuli. The data were analyzed by deriving linear mappings, temporal response functions (TRFs), between the EEG data and the attended as well as the unattended sound source trajectories. Next, we used these TRFs to reconstruct both trajectories from previously unseen EEG data. In a first experiment, we used noise stimuli and a task that involved spatially localizing embedded targets. Then, in a second experiment, we employed speech stimuli and a non-spatial speech comprehension task. Results showed that the trajectory of an attended sound source can be reliably reconstructed from both the delta phase and the alpha power of the EEG, even in the presence of distracting stimuli. Moreover, the reconstruction was robust to task and stimulus type. The cortical representation of the unattended source position was below detection level for the noise stimuli, but we observed weak tracking of the unattended source location by the delta phase of the EEG for the speech stimuli.
In addition, we demonstrated that the trajectory reconstruction method can in principle be used to decode selective attention on a single-trial basis; however, its performance was inferior to that of envelope-based decoders. These results suggest a possible dissociation of the delta phase and alpha power of EEG in the context of sound trajectory tracking. Moreover, the demonstrated ability to localize and determine the attended speaker in complex acoustic environments is particularly relevant for cognitively controlled hearing devices.
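Backward (decoding) models of this kind are commonly fit as ridge regression from time-lagged EEG channels to the stimulus feature (here, a source trajectory; elsewhere, a speech envelope). A minimal sketch under that assumption, illustrative only, with hypothetical names and an arbitrary regularization constant, not the paper's pipeline:

```python
import numpy as np

def lag_matrix(eeg, lags):
    """Stack time-lagged copies of each EEG channel (T x C) into a design matrix.
    A negative lag means the decoder looks at EEG samples after the stimulus."""
    T, C = eeg.shape
    X = np.zeros((T, C * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(eeg, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0.0   # zero the samples wrapped from the end
        elif lag < 0:
            shifted[lag:] = 0.0   # zero the samples wrapped from the start
        X[:, i * C:(i + 1) * C] = shifted
    return X

def fit_decoder(eeg, stim, lags, lam=1.0):
    """Ridge-regression backward model mapping lagged EEG to a stimulus feature."""
    X = lag_matrix(eeg, lags)
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ stim)

def reconstruct(eeg, weights, lags):
    """Apply a fitted decoder to (held-out) EEG to reconstruct the stimulus feature."""
    return lag_matrix(eeg, lags) @ weights
```

In a real analysis the decoder would be trained and evaluated on separate data, with the regularization strength chosen by cross-validation.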
Affiliation(s)
- Adam Bednar
- School of Engineering, Trinity College Dublin, Dublin, Ireland; Trinity Center for Bioengineering, Trinity College Dublin, Dublin, Ireland
- Edmund C Lalor
- School of Engineering, Trinity College Dublin, Dublin, Ireland; Trinity Center for Bioengineering, Trinity College Dublin, Dublin, Ireland; Department of Biomedical Engineering, Department of Neuroscience, University of Rochester, Rochester, NY, USA
13
Obleser J, Kayser C. Neural Entrainment and Attentional Selection in the Listening Brain. Trends Cogn Sci 2019; 23:913-926. [PMID: 31606386 DOI: 10.1016/j.tics.2019.08.004]
Abstract
The streams of sounds we typically attend to abound in acoustic regularities. Neural entrainment is seen as an important mechanism that the listening brain exploits to attune to these regularities and to enhance the representation of attended sounds. We delineate the neurophysiology underlying this mechanism and review entrainment alongside its more pragmatic signature, often called 'speech tracking'. The latter has become a popular analytical approach to trace the reflection of acoustic and linguistic information at different levels of granularity, from neurophysiology to neuroimaging. As we discuss, the concept of entrainment offers both a putative neurophysiological mechanism for selective listening and a versatile window onto the neural basis of hearing and speech comprehension.
Affiliation(s)
- Jonas Obleser
- Department of Psychology, University of Lübeck, 23562 Lübeck, Germany
- Christoph Kayser
- Department for Cognitive Neuroscience and Cognitive Interaction Technology, Center of Excellence, Bielefeld University, 33615 Bielefeld, Germany
14
O'Sullivan AE, Lim CY, Lalor EC. Look at me when I'm talking to you: Selective attention at a multisensory cocktail party can be decoded using stimulus reconstruction and alpha power modulations. Eur J Neurosci 2019; 50:3282-3295. [PMID: 31013361 DOI: 10.1111/ejn.14425]
Abstract
Recent work using electroencephalography has applied stimulus reconstruction techniques to identify the attended speaker in a cocktail party environment. The success of these approaches has been based primarily on the ability to detect cortical tracking of the acoustic envelope at the scalp level. However, most studies have ignored the effects of visual input, which is almost always present in naturalistic scenarios. In this study, we investigated the effects of visual input on envelope-based cocktail party decoding in two multisensory cocktail party situations: (a) congruent AV: facing the attended speaker while ignoring another speaker represented by an audio-only stream, and (b) incongruent AV (eavesdropping): attending to the audio-only speaker while looking at the unattended speaker. We trained and tested decoders for each condition separately and found that we can successfully decode attention to congruent audiovisual speech and can also decode attention when listeners are eavesdropping, i.e., looking at the face of the unattended talker. In addition, we found alpha power to be a reliable measure of attention to visual speech: using parieto-occipital alpha power, we can distinguish whether subjects are attending to or ignoring the speaker's face. Considering the practical applications of these methods, we demonstrate that with only six near-ear electrodes we can successfully determine the attended speech. This work extends the current framework for decoding attention to speech to more naturalistic scenarios, and in doing so provides additional neural measures that may be incorporated to improve decoding accuracy.
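Alpha-power features of the kind described above are typically obtained as band power in roughly the 8-12 Hz range at parieto-occipital channels. A minimal periodogram-based sketch (an assumption-laden illustration with invented names, not the study's method):

```python
import numpy as np

def bandpower(x, fs, f_lo=8.0, f_hi=12.0):
    """Mean periodogram power of signal x (sampled at fs Hz) in [f_lo, f_hi] Hz.
    Defaults to the alpha band often used as an attention marker."""
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(psd[band].mean())
```

Comparing this measure across conditions (or across channel pairs) yields the kind of attend/ignore alpha index the abstract describes.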
Affiliation(s)
- Aisling E O'Sullivan
- School of Engineering, Trinity Centre for Bioengineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Chantelle Y Lim
- Department of Biomedical Engineering, University of Rochester, Rochester, New York
- Edmund C Lalor
- School of Engineering, Trinity Centre for Bioengineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland; Department of Biomedical Engineering, University of Rochester, Rochester, New York; Department of Neuroscience, Del Monte Institute for Neuroscience, University of Rochester, Rochester, New York
15
Alickovic E, Lunner T, Gustafsson F, Ljung L. A Tutorial on Auditory Attention Identification Methods. Front Neurosci 2019; 13:153. [PMID: 30941002 PMCID: PMC6434370 DOI: 10.3389/fnins.2019.00153]
Abstract
Auditory attention identification methods attempt to identify the sound source of a listener's interest by analyzing measurements of electrophysiological data. We present a tutorial on the numerous techniques that have been developed in recent decades, and we present an overview of current trends in multivariate correlation-based and model-based learning frameworks. The focus is on the use of linear relations between electrophysiological and audio data. The way in which these relations are computed differs. For example, canonical correlation analysis (CCA) finds a linear subset of electrophysiological data that best correlates to audio data and a similar subset of audio data that best correlates to electrophysiological data. Model-based (encoding and decoding) approaches focus on either of these two sets. We investigate the similarities and differences between these linear model philosophies. We focus on (1) correlation-based approaches (CCA), (2) encoding/decoding models based on dense estimation, and (3) (adaptive) encoding/decoding models based on sparse estimation. The specific focus is on sparsity-driven adaptive encoding models and comparing the methodology with state-of-the-art models found in the auditory literature. Furthermore, we outline the main signal processing pipeline for identifying the attended sound source in a cocktail party environment from raw electrophysiological data, with all the necessary steps, complemented with the necessary MATLAB code and relevant references for each step. Our main aim is to compare the methodology of the available methods and to provide numerical illustrations of some of them to give a feeling for their potential. A thorough performance comparison is outside the scope of this tutorial.
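The CCA step the tutorial describes finds linear combinations of EEG and audio features with maximal correlation. The tutorial itself provides MATLAB code; this compact numpy-only Python sketch is a stand-in with invented names, showing only the canonical-correlation computation:

```python
import numpy as np

def cca_correlations(X, Y, k=1):
    """First k canonical correlations between data blocks X (T x p) and Y (T x q):
    the correlations of the maximally correlated linear combinations of each block."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    def inv_sqrt(C):
        # Inverse matrix square root via eigendecomposition (C symmetric PSD)
        d, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12))) @ V.T

    Cxx = X.T @ X / len(X)
    Cyy = Y.T @ Y / len(Y)
    Cxy = X.T @ Y / len(X)
    # Canonical correlations are the singular values of the whitened cross-covariance
    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    return np.linalg.svd(M, compute_uv=False)[:k]
```

In an AAD pipeline, X would hold (lagged) EEG channels and Y (lagged) audio features, with regularization added for high-dimensional blocks.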
Affiliation(s)
- Emina Alickovic
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark
- Thomas Lunner
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark
- Hearing Systems, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Swedish Institute for Disability Research, Linnaeus Centre HEAD, Linkoping University, Linkoping, Sweden
- Fredrik Gustafsson
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
- Lennart Ljung
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
16
Teoh ES, Lalor EC. EEG decoding of the target speaker in a cocktail party scenario: considerations regarding dynamic switching of talker location. J Neural Eng 2019; 16:036017. [PMID: 30836345 DOI: 10.1088/1741-2552/ab0cf1]
Abstract
OBJECTIVE It has been shown that attentional selection in a simple dichotic listening paradigm can be decoded offline by reconstructing the stimulus envelope from single-trial neural response data. Here, we test the efficacy of this approach in an environment with non-stationary talkers. We then look beyond the envelope reconstructions themselves and consider whether incorporating the decoder values (the weightings applied to the multichannel EEG data at different time lags and scalp locations when reconstructing the stimulus envelope) can improve decoding performance. APPROACH High-density EEG was recorded as subjects attended to one of two talkers. The two speech streams were filtered using head-related transfer functions (HRTFs), and the talkers were alternated between the left and right locations at varying intervals to simulate a dynamic environment. We trained spatio-temporal decoders mapping from EEG data to the attended and unattended stimulus envelopes. We then decoded auditory attention by (1) using the attended decoder to reconstruct the envelope and (2) exploiting the fact that the decoder weightings themselves contain signatures of attention, resulting in consistent patterns across subjects that can be classified. MAIN RESULTS The previously established decoding approach was found to be effective even with non-stationary talkers. Signatures of attentional selection and attended direction were found in the spatio-temporal structure of the decoders and were consistent across subjects. The inclusion of decoder weights in the decoding algorithm resulted in significantly improved decoding accuracies (from 61.07% to 65.31% for 4 s windows). An attempt was made to include alpha power lateralization as an additional feature to improve decoding, although this was unsuccessful at the single-trial level. SIGNIFICANCE This work suggests that the spatio-temporal decoder weights can be utilised to improve decoding. More generally, looking beyond envelope reconstruction and incorporating other signatures of attention is an avenue that should be explored to improve selective auditory attention decoding.
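The baseline envelope-reconstruction scheme that such decoder-weight features are compared against is typically a correlation contest: reconstruct the envelope from EEG, then pick the talker whose true envelope correlates best with the reconstruction. A minimal sketch of that final classification step (illustrative naming, not the paper's code):

```python
import numpy as np

def classify_attention(reconstruction, env_a, env_b):
    """Correlation-based attention decision: return 'A' or 'B' depending on which
    candidate speech envelope correlates best with the EEG-based reconstruction."""
    r_a = np.corrcoef(reconstruction, env_a)[0, 1]
    r_b = np.corrcoef(reconstruction, env_b)[0, 1]
    return 'A' if r_a > r_b else 'B'
```

Decoding accuracy is then the fraction of (e.g. 4 s) windows for which this decision matches the instructed attention target.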
Affiliation(s)
- Emily S Teoh
- School of Engineering, Trinity College Dublin, University of Dublin, Dublin, Ireland; Trinity Centre for Bioengineering, Trinity College Dublin, Dublin, Ireland