1. Mizokuchi K, Tanaka T, Sato TG, Shiraki Y. Alpha band modulation caused by selective attention to music enables EEG classification. Cogn Neurodyn 2024;18:1005-1020. PMID: 38826648; PMCID: PMC11143110; DOI: 10.1007/s11571-023-09955-x.
Abstract
Humans are able to pay selective attention to music or speech in the presence of multiple sounds. It has been reported that in the speech domain, selective attention enhances the cross-correlation between the envelope of speech and the electroencephalogram (EEG) while also affecting the spatial modulation of the alpha band. However, when multiple music pieces are performed at the same time, it is unclear how selective attention affects neural entrainment and spatial modulation. In this paper, we hypothesized that the entrainment to the attended music differs from that to the unattended music and that spatial modulation in the alpha band occurs in conjunction with attention. We conducted experiments in which we presented musical excerpts to 15 participants, each listening to two excerpts simultaneously but paying attention to one of the two. The results showed that the cross-correlation function between the EEG signal and the envelope of the unattended melody had a more prominent peak than that of the attended melody, contrary to the findings for speech. In addition, spatial modulation in the alpha band was found with a data-driven approach called the common spatial pattern method. Classification of the EEG signal with a support vector machine identified attended melodies and achieved an accuracy of 100% for 11 of the 15 participants. These results suggest that selective attention to music suppresses entrainment to the melody and that spatial modulation of the alpha band occurs in conjunction with attention. To the best of our knowledge, this is the first report of detecting attended music consisting of several types of musical notes using EEG alone.
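The CSP-plus-SVM pipeline described in this abstract can be sketched with synthetic data (an illustrative reconstruction, not the authors' code; the channel count and trial structure are invented). CSP finds spatial filters that maximize the variance of one class relative to the other; the log-variance of the filtered signals would then be fed to a classifier such as the SVM used in the paper:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(X_a, X_b, n_filters=2):
    """Compute CSP spatial filters from two sets of trials.

    X_a, X_b: arrays of shape (trials, channels, samples).
    Returns a (channels, 2*n_filters) filter matrix: filters maximizing
    variance for class A come first, then those for class B.
    """
    def mean_cov(X):
        return np.mean([np.cov(trial) for trial in X], axis=0)

    C_a, C_b = mean_cov(X_a), mean_cov(X_b)
    # Generalized eigenproblem: C_a w = lambda (C_a + C_b) w
    vals, vecs = eigh(C_a, C_a + C_b)
    order = np.argsort(vals)                      # ascending eigenvalues
    vecs = vecs[:, order]
    # Extremes of the spectrum give the most discriminative filters
    return np.hstack([vecs[:, -n_filters:], vecs[:, :n_filters]])

def log_var_features(X, W):
    """Log-variance of spatially filtered trials: shape (trials, filters)."""
    proj = np.einsum('ck,tcs->tks', W, X)
    return np.log(proj.var(axis=2))

# Synthetic demo: class A has high variance on channel 0, class B on channel 1
rng = np.random.default_rng(0)
X_a = rng.standard_normal((20, 4, 256)) * np.array([3.0, 1, 1, 1])[:, None]
X_b = rng.standard_normal((20, 4, 256)) * np.array([1.0, 3, 1, 1])[:, None]
W = csp_filters(X_a, X_b, n_filters=1)
feats_a = log_var_features(X_a, W)
feats_b = log_var_features(X_b, W)
# The first CSP feature should separate the two classes
assert feats_a[:, 0].mean() > feats_b[:, 0].mean()
```

In a real pipeline the EEG would first be band-passed to the alpha band, and the log-variance features would feed an SVM; both steps are omitted here for brevity.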
Affiliation(s)
- Kana Mizokuchi
- Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo, Japan
- Toshihisa Tanaka
- Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, Tokyo, Japan
- Takashi G. Sato
- NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Kanagawa, Japan
- Yoshifumi Shiraki
- NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Kanagawa, Japan
2. EskandariNasab M, Raeisi Z, Lashaki RA, Najafi H. A GRU-CNN model for auditory attention detection using microstate and recurrence quantification analysis. Sci Rep 2024;14:8861. PMID: 38632246; PMCID: PMC11024110; DOI: 10.1038/s41598-024-58886-y.
Abstract
Attention, as a cognitive ability, plays a crucial role in perception, helping humans concentrate on specific objects in the environment while discarding others. In this paper, auditory attention detection (AAD) is investigated using different dynamic features extracted from multichannel electroencephalography (EEG) signals when listeners attend to a target speaker in the presence of a competing talker. To this end, microstate and recurrence quantification analysis are utilized to extract different types of features that reflect changes in the brain state during cognitive tasks. Then, an optimized feature set is determined through significant-feature selection based on classification performance. The classifier model is developed by hybrid sequential learning that combines Gated Recurrent Units (GRU) and a Convolutional Neural Network (CNN) into a unified framework for accurate attention detection. The proposed AAD method shows that the selected feature set achieves the most discriminative features for the classification process. It also yields the best performance compared with state-of-the-art AAD approaches from the literature in terms of various measures. The current study is the first to validate the use of microstate and recurrence quantification parameters to differentiate auditory attention using machine learning without access to the stimuli.
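The recurrence quantification features mentioned above can be illustrated with a minimal sketch (synthetic signals, not the authors' pipeline; the embedding parameters and threshold are arbitrary): a signal is time-delay embedded, a recurrence plot is built by thresholding pairwise state distances, and the recurrence rate (RR) and determinism (DET) are read off the plot:

```python
import numpy as np

def embed(x, dim=3, delay=2):
    """Time-delay embedding of a 1-D signal into state vectors."""
    n = len(x) - (dim - 1) * delay
    return np.column_stack([x[i * delay : i * delay + n] for i in range(dim)])

def recurrence_plot(states, eps):
    """Binary recurrence matrix: 1 where states i and j are within eps."""
    d = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return (d < eps).astype(int)

def rqa_features(R, l_min=2):
    """Recurrence rate (RR) and determinism (DET) of a recurrence matrix."""
    n = R.shape[0]
    rr = R.sum() / n**2
    # Count points on off-diagonal lines of length >= l_min
    diag_points = total_points = 0
    for k in range(1, n):
        line = np.diag(R, k)
        total_points += 2 * line.sum()        # matrix is symmetric
        run = 0
        for v in np.append(line, 0):          # sentinel flushes the last run
            if v:
                run += 1
            else:
                if run >= l_min:
                    diag_points += 2 * run
                run = 0
    det = diag_points / total_points if total_points else 0.0
    return rr, det

# Demo: a periodic signal yields high determinism, white noise much lower
t = np.linspace(0, 8 * np.pi, 200)
rng = np.random.default_rng(1)
R_sine = recurrence_plot(embed(np.sin(t)), eps=0.3)
R_noise = recurrence_plot(embed(rng.standard_normal(200)), eps=0.3)
_, det_sine = rqa_features(R_sine)
_, det_noise = rqa_features(R_noise)
assert det_sine > det_noise
```

Per-channel RR and DET values like these (alongside microstate statistics) would form the feature vectors that the paper's GRU-CNN classifier consumes.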
Affiliation(s)
- Zahra Raeisi
- Department of Computer Science, Fairleigh Dickinson University, Vancouver Campus, Vancouver, Canada
- Reza Ahmadi Lashaki
- Department of Computer Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
- Hamidreza Najafi
- Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
3. Wikman P, Salmela V, Sjöblom E, Leminen M, Laine M, Alho K. Attention to audiovisual speech shapes neural processing through feedback-feedforward loops between different nodes of the speech network. PLoS Biol 2024;22:e3002534. PMID: 38466713; PMCID: PMC10957087; DOI: 10.1371/journal.pbio.3002534.
Abstract
Selective attention-related top-down modulation plays a significant role in separating relevant speech from irrelevant background speech when vocal attributes separating concurrent speakers are small and continuously evolving. Electrophysiological studies have shown that such top-down modulation enhances neural tracking of attended speech. Yet, the specific cortical regions involved remain unclear due to the limited spatial resolution of most electrophysiological techniques. To overcome such limitations, we collected both electroencephalography (EEG) (high temporal resolution) and functional magnetic resonance imaging (fMRI) (high spatial resolution), while human participants selectively attended to speakers in audiovisual scenes containing overlapping cocktail party speech. To utilise the advantages of the respective techniques, we analysed neural tracking of speech using the EEG data and performed representational dissimilarity-based EEG-fMRI fusion. We observed that attention enhanced neural tracking and modulated EEG correlates throughout the latencies studied. Further, attention-related enhancement of neural tracking fluctuated in predictable temporal profiles. We discuss how such temporal dynamics could arise from a combination of interactions between attention and prediction as well as plastic properties of the auditory cortex. EEG-fMRI fusion revealed attention-related iterative feedforward-feedback loops between hierarchically organised nodes of the ventral auditory object related processing stream. Our findings support models where attention facilitates dynamic neural changes in the auditory cortex, ultimately aiding discrimination of relevant sounds from irrelevant ones while conserving neural resources.
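The representational dissimilarity-based EEG-fMRI fusion can be sketched in miniature (synthetic data, illustrative only; condition counts and dimensions are invented): at each EEG time point, the EEG representational dissimilarity matrix (RDM) across stimulus conditions is correlated with an fMRI RDM from a region of interest, yielding a fusion time course for that region:

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import pdist

def rdm(patterns):
    """Condensed RDM: pairwise correlation distance between condition patterns."""
    return pdist(patterns, metric='correlation')

def fusion_timecourse(eeg, fmri_rdm):
    """Spearman correlation of the EEG RDM with an fMRI RDM at each time point.

    eeg: array of shape (conditions, channels, timepoints).
    """
    return np.array([
        spearmanr(rdm(eeg[:, :, t]), fmri_rdm)[0]
        for t in range(eeg.shape[2])
    ])

# Synthetic demo: the fMRI RDM is built from condition patterns that the
# EEG data expresses only in the second half of the epoch.
rng = np.random.default_rng(2)
patterns = rng.standard_normal((8, 16))            # 8 conditions, 16 features
roi_rdm = rdm(patterns)
eeg = rng.standard_normal((8, 16, 40)) * 0.1
eeg[:, :, 20:] += patterns[:, :, None]             # structure appears late
tc = fusion_timecourse(eeg, roi_rdm)
assert tc[20:].mean() > tc[:20].mean()
```

Repeating this against RDMs from different regions of interest is what yields the region-by-latency fusion maps the study bases its feedforward-feedback argument on.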
Affiliation(s)
- Patrik Wikman
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
- Viljami Salmela
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
- Eetu Sjöblom
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Miika Leminen
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- AI and Analytics Unit, Helsinki University Hospital, Helsinki, Finland
- Matti Laine
- Department of Psychology, Åbo Akademi University, Turku, Finland
- Kimmo Alho
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
4. Ha J, Baek SC, Lim Y, Chung JH. Validation of cost-efficient EEG experimental setup for neural tracking in an auditory attention task. Sci Rep 2023;13:22682. PMID: 38114579; PMCID: PMC10730561; DOI: 10.1038/s41598-023-49990-6.
Abstract
When individuals listen to speech, their neural activity phase-locks to the slow temporal rhythm of speech, a phenomenon commonly referred to as "neural tracking". The neural tracking mechanism allows the attended sound source in a multi-talker situation to be detected by decoding neural signals obtained by electroencephalography (EEG), known as auditory attention decoding (AAD). Neural tracking with AAD can be utilized as an objective measurement tool in diverse clinical contexts, and it has the potential to be applied to neuro-steered hearing devices. To effectively utilize this technology, it is essential to enhance the accessibility of the EEG experimental setup and analysis. The aim of this study was to develop a cost-efficient neural tracking system and validate the feasibility of neural tracking measurement by conducting an AAD task using offline and real-time decoder models outside a soundproof environment. We devised a neural tracking system capable of conducting AAD experiments using an OpenBCI and Arduino board. Nine participants were recruited to assess the performance of AAD using the developed system, which involved presenting competing speech signals in an experimental setting without soundproofing. The offline decoder model demonstrated an average performance of 90%, and the real-time decoder model achieved 78%. The present study demonstrates the feasibility of implementing neural tracking and AAD using cost-effective devices in a practical environment.
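The kind of linear decoder underlying such neural-tracking AAD systems can be sketched as a backward stimulus-reconstruction model (a generic illustration with synthetic data, not the authors' OpenBCI pipeline; channel counts, lags, and regularization are invented): ridge regression maps time-lagged EEG to a speech envelope, and attention is assigned to whichever competing envelope correlates better with the reconstruction:

```python
import numpy as np

def lagged(eeg, n_lags):
    """Stack time-lagged copies of EEG (samples, channels) into a design matrix."""
    n, c = eeg.shape
    X = np.zeros((n, c * n_lags))
    for k in range(n_lags):
        X[k:, k * c:(k + 1) * c] = eeg[:n - k]
    return X

def train_decoder(eeg, envelope, n_lags=8, lam=1.0):
    """Ridge regression: reconstruct the attended envelope from lagged EEG."""
    X = lagged(eeg, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

def decode_attention(eeg, env_a, env_b, w, n_lags=8):
    """Return 'A' or 'B': which envelope matches the reconstruction better."""
    rec = lagged(eeg, n_lags) @ w
    r_a = np.corrcoef(rec, env_a)[0, 1]
    r_b = np.corrcoef(rec, env_b)[0, 1]
    return 'A' if r_a > r_b else 'B'

# Synthetic demo: EEG channels carry a noisy copy of the attended envelope A
rng = np.random.default_rng(3)
env_a, env_b = rng.standard_normal((2, 2000))
eeg = env_a[:, None] * rng.uniform(0.5, 1.5, 8) + rng.standard_normal((2000, 8))
w = train_decoder(eeg[:1000], env_a[:1000])
assert decode_attention(eeg[1000:], env_a[1000:], env_b[1000:], w) == 'A'
```

A real-time variant would simply apply `decode_attention` to a sliding window of incoming samples; the decision-window length then trades accuracy against latency, which is why the paper's real-time accuracy (78%) falls below its offline accuracy (90%).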
Affiliation(s)
- Jiyeon Ha
- Department of HY-KIST Bio-Convergence, Hanyang University, Seoul, 04763, Korea
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul, 02792, Korea
| | - Seung-Cheol Baek
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul, 02792, Korea
- Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, 60322, Frankfurt\ Main, Germany
| | - Yoonseob Lim
- Department of HY-KIST Bio-Convergence, Hanyang University, Seoul, 04763, Korea.
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul, 02792, Korea.
| | - Jae Ho Chung
- Department of HY-KIST Bio-Convergence, Hanyang University, Seoul, 04763, Korea.
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul, 02792, Korea.
- Department of Otolaryngology-Head and Neck Surgery, College of Medicine, Hanyang University, Seoul, 04763, Korea.
- Department of Otolaryngology-Head and Neck Surgery, School of Medicine, Hanyang University, 222-Wangshimni-ro, Seongdong-gu, Seoul, 133-792, Korea.
| |
5. Pires F, Leitão P, Moreira AP, Ahmad B. Reinforcement learning based trustworthy recommendation model for digital twin-driven decision-support in manufacturing systems. Comput Ind 2023. DOI: 10.1016/j.compind.2023.103884.
6. Lee J, Hong H, Song JM, Yeom E. Neural network ensemble model for prediction of erythrocyte sedimentation rate (ESR) using partial least squares regression. Sci Rep 2022;12:19618. PMID: 36379969; PMCID: PMC9666533; DOI: 10.1038/s41598-022-23174-0.
Abstract
The erythrocyte sedimentation rate (ESR) is a non-specific blood test for determining inflammatory conditions. However, the long measurement time (60 min) needed to obtain the ESR is an obstacle to prompt evaluation. In this study, to reduce the measurement time of the ESR, deep neural networks (DNNs) were applied to the sedimentation tendency of blood samples. DNNs using a multilayer perceptron (MLP), long short-term memory (LSTM), and gated recurrent units (GRU) were assessed and compared to determine a suitable length for the input sequence. To avoid overfitting, stacking ensemble learning was adopted, which combines multiple models by using a meta model. Four meta models were compared: mean, median, least absolute shrinkage and selection operator, and partial least squares regression (PLSR) schemes. From the empirical results, the LSTM and GRU models gave better predictions than MLP over sequence lengths of 5 to 20 min. The decrease in [Formula: see text] and [Formula: see text] of GRU and LSTM was attenuated after a sequence length of 15 min, so the input sequence length was set to 15 min. In terms of the meta model, the statistical comparison suggests that GRU combined with PLSR (GRU-PLSR) is the best case. The GRU-PLSR was then tested for prediction of ESR data obtained from periodontitis patients to check its applicability to a specific disease. The Bland-Altman plot shows acceptable agreement between measured and predicted ESR values. Based on these results, the GRU-PLSR can predict the ESR with improved performance within 15 min and is potentially applicable to ESR data with inflammatory and non-inflammatory conditions.
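The stacking scheme with a PLSR meta model can be sketched as follows (synthetic data; a single-component PLS regression written in plain NumPy stands in for the full PLSR of the paper, and the "base models" are simulated as noisy predictors): base-model predictions form the meta-level design matrix, and the meta model learns how to combine them:

```python
import numpy as np

class PLSRMeta:
    """One-component PLS regression, used here as a stacking meta model."""

    def fit(self, Z, y):
        # Z: base-model predictions (samples, models); y: target values
        self.Z_mean, self.y_mean = Z.mean(axis=0), y.mean()
        Zc, yc = Z - self.Z_mean, y - self.y_mean
        w = Zc.T @ yc                         # covariance-based direction
        self.w = w / np.linalg.norm(w)        # unit weight vector
        t = Zc @ self.w                       # latent scores
        self.q = (yc @ t) / (t @ t)           # regress target on scores
        return self

    def predict(self, Z):
        return (Z - self.Z_mean) @ self.w * self.q + self.y_mean

# Synthetic demo: two imperfect base models, stacked via the PLSR meta model
rng = np.random.default_rng(4)
y = rng.standard_normal(300)
base = np.column_stack([y + 0.5 * rng.standard_normal(300),   # base model 1
                        y + 0.8 * rng.standard_normal(300)])  # base model 2
meta = PLSRMeta().fit(base[:200], y[:200])
pred = meta.predict(base[200:])
mse_meta = np.mean((pred - y[200:]) ** 2)
mse_best_base = min(np.mean((base[200:, j] - y[200:]) ** 2) for j in (0, 1))
assert mse_meta < mse_best_base               # stacking beats either base model
```

In the paper's setting the columns of `Z` would be the GRU and LSTM sedimentation-curve predictions, and PLSR's latent projection is what makes the combination robust when those base predictions are strongly correlated.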
Affiliation(s)
- Jaejin Lee
- School of Mechanical Engineering, Pusan National University, Busan, South Korea
- Hyeonji Hong
- School of Mechanical Engineering, Pusan National University, Busan, South Korea
- Jae Min Song
- Department of Oral and Maxillofacial Surgery, School of Dentistry, Pusan National University, Yangsan, South Korea
- Dental and Life Science Institute, School of Dentistry, Pusan National University, Yangsan, South Korea
- Eunseop Yeom
- School of Mechanical Engineering, Pusan National University, Busan, South Korea
7. Xie Y, Ma J. How to discern external acoustic waves in a piezoelectric neuron under noise? J Biol Phys 2022;48:339-353. PMID: 35948818; PMCID: PMC9411441; DOI: 10.1007/s10867-022-09611-1.
Abstract
Biological neurons remain sensitive to external stimuli, and appropriate firing modes can be triggered in effective response to external chemical and physical signals. A piezoelectric neural circuit can perceive external voice and nonlinear vibration by generating an equivalent piezoelectric voltage, which can produce an equivalent trans-membrane current for inducing a variety of firing modes in neural activity. Biological neurons can receive external stimuli from multiple ion channels and synapses synchronously, but the subsequent encoding and priority in mode selection are competitive. In particular, noisy disturbance and electromagnetic radiation make signal identification and mode selection more difficult in the firing patterns of neurons driven by multi-channel signals. In this paper, two different periodic signals accompanied by noise are used to excite the piezoelectric neural circuit, and the signal processing in the piezoelectric neuron driven by acoustic waves under noise is reproduced and explained. The physical energy of the piezoelectric neural circuit and the Hamilton energy in the neuron driven by mixed signals are calculated to explain the biophysical mechanism of the auditory neuron when external stimuli are applied. It is found that the neuron prefers to respond to the external stimulus with higher physical energy and to the signal that can increase the Hamilton energy of the neuron. For example, stronger inputs inject higher energy and are detected and responded to more sensitively. The involvement of noise helps detect the external signal through stochastic resonance, and additive noise changes the excitability of the neuron as an external stimulus does. The results indicate that energy controls the firing patterns and mode selection in neurons, providing clues for controlling neural activity by injecting appropriate energy into neurons and networks.
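The stochastic resonance effect invoked here can be demonstrated with a minimal threshold-detector model, a standard textbook illustration far simpler than the paper's piezoelectric neural circuit: a subthreshold periodic signal becomes visible in the detector's output only when an intermediate amount of noise is added:

```python
import numpy as np

def detector_output(signal, noise_std, threshold=1.0, rng=None):
    """Binary threshold detector: fires where signal + noise exceeds threshold."""
    rng = rng or np.random.default_rng(0)
    return (signal + rng.normal(0, noise_std, signal.size) > threshold).astype(float)

def signal_correlation(signal, noise_std, **kw):
    """Correlation between the periodic input and the detector's firing."""
    out = detector_output(signal, noise_std, **kw)
    if out.std() == 0:                    # detector never fires
        return 0.0
    return np.corrcoef(signal, out)[0, 1]

t = np.linspace(0, 20 * np.pi, 5000)
sub = 0.5 * np.sin(t)                     # subthreshold: never crosses 1.0 alone

r_no_noise = signal_correlation(sub, 1e-6)   # ~0: nothing is detected
r_moderate = signal_correlation(sub, 0.6)    # noise lifts peaks over threshold
r_heavy = signal_correlation(sub, 10.0)      # noise swamps the signal

assert r_moderate > r_no_noise
assert r_moderate > r_heavy
```

The non-monotonic dependence on noise intensity (poor detection at both extremes, best detection in between) is the signature of stochastic resonance that the paper exploits for discerning acoustic waves.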
Affiliation(s)
- Ying Xie
- Department of Physics, Lanzhou University of Technology, Lanzhou, 730050, China
| | - Jun Ma
- Department of Physics, Lanzhou University of Technology, Lanzhou, 730050, China.
- School of Science, Chongqing University of Posts and Telecommunications, Chongqing, 430065, China.
| |
8. Xia W, Zheng L, Fang J, Li F, Zhou Y, Zeng Z, Zhang B, Li Z, Li H, Zhu F. PFmulDL: a novel strategy enabling multi-class and multi-label protein function annotation by integrating diverse deep learning methods. Comput Biol Med 2022;145:105465. PMID: 35366467; DOI: 10.1016/j.compbiomed.2022.105465.
Abstract
Bioinformatic annotation of protein function is essential but extremely sophisticated, demanding extensive efforts to develop effective prediction methods. However, existing methods tend to amplify the representativeness of families with a large number of proteins by misclassifying the proteins in families with a small number of proteins. In other words, the ability of existing methods to annotate proteins in the 'rare classes' remains limited. Herein, a new protein function annotation strategy, PFmulDL, integrating multiple deep learning methods, was constructed. First, a recurrent neural network was integrated, for the first time, with a convolutional neural network to facilitate function annotation. Second, a transfer learning method was introduced into the model construction to further improve the prediction performance. Third, based on the latest Gene Ontology data, the newly constructed model can annotate the largest number of protein families compared with existing methods. Finally, this model was found capable of significantly elevating the prediction performance for the 'rare classes' without sacrificing that for the 'major classes'. Given the emerging requirement of improving prediction performance for proteins in 'rare classes', this new strategy is an essential complement to existing methods for protein function prediction. All models and source codes are freely available at: https://github.com/idrblab/PFmulDL.
Affiliation(s)
- Weiqi Xia
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Lingyan Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Jiebin Fang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Fengcheng Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Ying Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Zhenyu Zeng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Bing Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Zhaorong Li
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Honglin Li
- School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
9. Wang L, Wang Y, Liu Z, Wu EX, Chen F. A speech-level-based segmented model to decode the dynamic auditory attention states in the competing speaker scenes. Front Neurosci 2022;15:760611. PMID: 35221885; PMCID: PMC8866945; DOI: 10.3389/fnins.2021.760611.
Abstract
In the competing speaker environments, human listeners need to focus or switch their auditory attention according to dynamic intentions. The reliable cortical tracking ability to the speech envelope is an effective feature for decoding the target speech from the neural signals. Moreover, previous studies revealed that the root mean square (RMS)-level-based speech segmentation made a great contribution to the target speech perception with the modulation of sustained auditory attention. This study further investigated the effect of the RMS-level-based speech segmentation on the auditory attention decoding (AAD) performance with both sustained and switched attention in the competing speaker auditory scenes. Objective biomarkers derived from the cortical activities were also developed to index the dynamic auditory attention states. In the current study, subjects were asked to concentrate or switch their attention between two competing speaker streams. The neural responses to the higher- and lower-RMS-level speech segments were analyzed via the linear temporal response function (TRF) before and after the attention switching from one to the other speaker stream. Furthermore, the AAD performance decoded by the unified TRF decoding model was compared to that by the speech-RMS-level-based segmented decoding model with the dynamic change of the auditory attention states. The results showed that the weight of the typical TRF component at approximately 100-ms time lag was sensitive to the switching of the auditory attention. Compared to the unified AAD model, the segmented AAD model improved attention decoding performance under both the sustained and switched auditory attention modulations in a wide range of signal-to-masker ratios (SMRs). In the competing speaker scenes, the TRF weight and AAD accuracy could be used as effective indicators to detect changes in auditory attention. In addition, with a wide range of SMRs (i.e., from 6 to -6 dB in this study), the segmented AAD model showed robust decoding performance even with short decision window lengths, suggesting that this speech-RMS-level-based model has the potential to decode dynamic attention states in realistic auditory scenarios.
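The temporal response function referred to above is a linear forward model from the stimulus envelope to the EEG; a minimal ridge-regression estimate (synthetic data; the sampling rate and the 100-ms latency are illustrative) looks like:

```python
import numpy as np

def estimate_trf(envelope, eeg_channel, lags, lam=1.0):
    """Forward TRF: ridge regression from lagged envelope to one EEG channel.

    lags: sample offsets (e.g. 0..30 for 0-300 ms at 100 Hz).
    Returns one weight per lag.
    """
    n = len(envelope) - max(lags)
    X = np.column_stack([envelope[max(lags) - k : max(lags) - k + n] for k in lags])
    y = eeg_channel[max(lags):]
    return np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ y)

# Synthetic demo at 100 Hz: the EEG responds to the envelope with a 100-ms peak
rng = np.random.default_rng(5)
true_peak_lag = 10                                # 10 samples = 100 ms at 100 Hz
env = np.abs(rng.standard_normal(3000))           # toy speech envelope
eeg = np.roll(env, true_peak_lag) * 2.0 + rng.standard_normal(3000)
lags = np.arange(31)                              # 0-300 ms
trf = estimate_trf(env, eeg, lags)
assert np.argmax(trf) == true_peak_lag            # peak recovered at ~100 ms
```

Tracking the weight at the ~100-ms lag over time, as the study does, then gives a scalar marker of whether attention has switched between streams.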
Affiliation(s)
- Lei Wang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Yihan Wang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Zhixing Liu
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Ed X. Wu
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Correspondence: Fei Chen
10. Haro S, Rao HM, Quatieri TF, Smalt CJ. EEG alpha and pupil diameter reflect endogenous auditory attention switching and listening effort. Eur J Neurosci 2022;55:1262-1277. PMID: 35098604; PMCID: PMC9305413; DOI: 10.1111/ejn.15616.
Abstract
Everyday environments often contain distracting competing talkers and background noise, requiring listeners to focus their attention on one acoustic source and reject others. During this auditory attention task, listeners may naturally interrupt their sustained attention and switch attended sources. The effort required to perform this attention switch has not been well studied in the context of competing continuous speech. In this work, we developed two variants of endogenous attention switching and a sustained attention control. We characterized these three experimental conditions under the context of decoding auditory attention, while simultaneously evaluating listening effort and neural markers of spatial-audio cues. A least-squares, electroencephalography (EEG)-based, attention decoding algorithm was implemented across all conditions. It achieved an accuracy of 69.4% and 64.0% when computed over nonoverlapping 10- and 5-s correlation windows, respectively. Both decoders illustrated smooth transitions in the attended talker prediction through switches at approximately half of the analysis window size (e.g., the mean lag taken across the two switch conditions was 2.2 s when the 5-s correlation window was used). Expended listening effort, as measured by simultaneous EEG and pupillometry, was also a strong indicator of whether the listeners sustained attention or performed an endogenous attention switch (peak pupil diameter measure [p = 0.034] and minimum parietal alpha power measure [p = 0.016]). We additionally found evidence of talker spatial cues in the form of centrotemporal alpha power lateralization (p = 0.0428). These results suggest that listener effort and spatial cues may be promising features to pursue in a decoding context, in addition to speech-based features.
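The centrotemporal alpha lateralization used as a spatial marker can be computed with a simple band-power contrast (a generic sketch on synthetic channels, not the paper's analysis code): band-pass 8-12 Hz, take the power in left- and right-hemisphere channels, and form the index (L - R) / (L + R):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def alpha_power(x, fs):
    """Mean 8-12 Hz band power of a 1-D signal sampled at fs Hz."""
    b, a = butter(4, [8 / (fs / 2), 12 / (fs / 2)], btype='band')
    return np.mean(filtfilt(b, a, x) ** 2)

def lateralization_index(left, right, fs):
    """(L - R) / (L + R) alpha power; positive = more left-hemisphere alpha."""
    pl, pr = alpha_power(left, fs), alpha_power(right, fs)
    return (pl - pr) / (pl + pr)

# Synthetic demo: stronger 10-Hz alpha on the left-hemisphere channel
fs = 250
t = np.arange(fs * 4) / fs
rng = np.random.default_rng(6)
left = 2.0 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(t.size)
right = 0.5 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(t.size)
li = lateralization_index(left, right, fs)
assert li > 0
```

Since alpha power typically increases over the hemisphere ipsilateral to the attended side, the sign of this index tracks the spatial locus of attention, which is the cue the study reports at centrotemporal sites.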
Affiliation(s)
- Stephanie Haro
- Human Health and Performance Systems, MIT Lincoln Laboratory, Lexington, MA, USA
- Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
- Hrishikesh M. Rao
- Human Health and Performance Systems, MIT Lincoln Laboratory, Lexington, MA, USA
- Thomas F. Quatieri
- Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA