1. Wikman P, Salmela V, Sjöblom E, Leminen M, Laine M, Alho K. Attention to audiovisual speech shapes neural processing through feedback-feedforward loops between different nodes of the speech network. PLoS Biol 2024; 22:e3002534. PMID: 38466713. PMCID: PMC10957087. DOI: 10.1371/journal.pbio.3002534.
Abstract
Selective attention-related top-down modulation plays a significant role in separating relevant speech from irrelevant background speech when vocal attributes separating concurrent speakers are small and continuously evolving. Electrophysiological studies have shown that such top-down modulation enhances neural tracking of attended speech. Yet, the specific cortical regions involved remain unclear due to the limited spatial resolution of most electrophysiological techniques. To overcome such limitations, we collected both electroencephalography (EEG) (high temporal resolution) and functional magnetic resonance imaging (fMRI) (high spatial resolution), while human participants selectively attended to speakers in audiovisual scenes containing overlapping cocktail party speech. To utilise the advantages of the respective techniques, we analysed neural tracking of speech using the EEG data and performed representational dissimilarity-based EEG-fMRI fusion. We observed that attention enhanced neural tracking and modulated EEG correlates throughout the latencies studied. Further, attention-related enhancement of neural tracking fluctuated in predictable temporal profiles. We discuss how such temporal dynamics could arise from a combination of interactions between attention and prediction as well as plastic properties of the auditory cortex. EEG-fMRI fusion revealed attention-related iterative feedforward-feedback loops between hierarchically organised nodes of the ventral auditory object related processing stream. Our findings support models where attention facilitates dynamic neural changes in the auditory cortex, ultimately aiding discrimination of relevant sounds from irrelevant ones while conserving neural resources.
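The EEG-fMRI fusion described here rests on representational similarity analysis: dissimilarity matrices (RDMs) computed separately from EEG and fMRI response patterns are correlated with each other. A minimal Python sketch of that comparison, using made-up array shapes and random placeholder data rather than the study's actual pipeline:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical response patterns: rows = experimental conditions
eeg_patterns = rng.standard_normal((12, 64))    # e.g., EEG topographies at one latency
fmri_patterns = rng.standard_normal((12, 500))  # e.g., voxel patterns in one region

# Representational dissimilarity matrices (condensed form), correlation distance
eeg_rdm = pdist(eeg_patterns, metric="correlation")
fmri_rdm = pdist(fmri_patterns, metric="correlation")

# Fusion statistic: rank correlation between the two RDMs; repeating this across
# EEG latencies and fMRI regions yields a latency-by-region fusion map
rho, p = spearmanr(eeg_rdm, fmri_rdm)
print(f"EEG-fMRI RDM correlation: rho = {rho:.3f}")
```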
Affiliation(s)
- Patrik Wikman
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
- Viljami Salmela
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
- Eetu Sjöblom
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Miika Leminen
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- AI and Analytics Unit, Helsinki University Hospital, Helsinki, Finland
- Matti Laine
- Department of Psychology, Åbo Akademi University, Turku, Finland
- Kimmo Alho
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland
- Advanced Magnetic Imaging Centre, Aalto NeuroImaging, Aalto University, Espoo, Finland
2. Brown JA, Bidelman GM. Attention, Musicality, and Familiarity Shape Cortical Speech Tracking at the Musical Cocktail Party. bioRxiv [Preprint] 2023:2023.10.28.562773. PMID: 37961204. PMCID: PMC10634879. DOI: 10.1101/2023.10.28.562773.
Abstract
The "cocktail party problem" challenges our ability to understand speech in noisy environments, which often include background music. Here, we explored the role of background music in speech-in-noise listening. Participants listened to an audiobook in familiar and unfamiliar music while tracking keywords in either speech or song lyrics. We used EEG to measure neural tracking of the audiobook. When speech was masked by music, the modeled peak latency at 50 ms (P1TRF) was prolonged compared to unmasked. Additionally, P1TRF amplitude was larger in unfamiliar background music, suggesting improved speech tracking. We observed prolonged latencies at 100 ms (N1TRF) when speech was not the attended stimulus, though only in less musical listeners. Our results suggest early neural representations of speech are enhanced with both attention and concurrent unfamiliar music, indicating familiar music is more distracting. One's ability to perceptually filter "musical noise" at the cocktail party depends on objective musical abilities.
Affiliation(s)
- Jane A. Brown
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Institute for Intelligent Systems, University of Memphis, Memphis, TN 38152, USA
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA
3. Kulasingham JP, Simon JZ. Algorithms for Estimating Time-Locked Neural Response Components in Cortical Processing of Continuous Speech. IEEE Trans Biomed Eng 2023; 70:88-96. PMID: 35727788. PMCID: PMC9946293. DOI: 10.1109/tbme.2022.3185005.
Abstract
OBJECTIVE The Temporal Response Function (TRF) is a linear model of neural activity time-locked to continuous stimuli, including continuous speech. TRFs based on speech envelopes typically have distinct components that have provided remarkable insights into the cortical processing of speech. However, current methods may lead to less than reliable estimates of single-subject TRF components. Here, we compare two established methods of TRF component estimation and also propose novel algorithms that utilize prior knowledge of these components, bypassing full TRF estimation. METHODS We compared two established algorithms, ridge and boosting, and two novel algorithms based on Subspace Pursuit (SP) and Expectation Maximization (EM), which directly estimate TRF components given plausible assumptions regarding component characteristics. Single-channel, multi-channel, and source-localized TRFs were fit on simulations and real magnetoencephalographic data. Performance metrics included model fit and component estimation accuracy. RESULTS Boosting and ridge have comparable performance in component estimation. The novel algorithms outperformed the others in simulations, but not on real data, possibly because the assumed component characteristics did not hold. Ridge had slightly better model fits on real data compared to boosting, but also more spurious TRF activity. CONCLUSION Results indicate that both smooth (ridge) and sparse (boosting) algorithms perform comparably at TRF component estimation. The SP and EM algorithms may be accurate, but rely on assumptions about component characteristics. SIGNIFICANCE This systematic comparison establishes the suitability of widely used and novel algorithms for estimating robust TRF components, which is essential for improved subject-specific investigations into the cortical processing of speech.
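For orientation, ridge estimation of a TRF from a lagged stimulus matrix, one of the two established methods compared here, can be sketched in a few lines; this is a simplified single-channel illustration, not the authors' implementation:

```python
import numpy as np

def ridge_trf(stimulus, response, n_lags, lam=1.0):
    """Ridge-regularized TRF for one response channel.

    stimulus : (n_samples,) speech envelope
    response : (n_samples,) EEG/MEG channel
    n_lags   : number of time lags in the TRF
    lam      : regularization strength
    """
    n = len(stimulus)
    # Lagged design matrix: column k holds the stimulus delayed by k samples
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stimulus[:n - k]
    # Closed-form ridge solution: (X'X + lam*I)^-1 X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ response)

# Toy example: recover a known kernel from simulated data
rng = np.random.default_rng(1)
true_trf = np.hanning(32)
env = rng.standard_normal(5000)
eeg = np.convolve(env, true_trf)[:5000] + 0.5 * rng.standard_normal(5000)
trf_hat = ridge_trf(env, eeg, n_lags=32, lam=10.0)
```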
4. Huet MP, Micheyl C, Gaudrain E, Parizet E. Vocal and semantic cues for the segregation of long concurrent speech stimuli in diotic and dichotic listening-The Long-SWoRD test. J Acoust Soc Am 2022; 151:1557. PMID: 35364949. DOI: 10.1121/10.0007225.
Abstract
It is not always easy to follow a conversation in a noisy environment. To distinguish between two speakers, a listener must mobilize many perceptual and cognitive processes to maintain attention on a target voice and avoid shifting attention to the background noise. Here, the development of an intelligibility task with long stimuli, the Long-SWoRD test, is introduced. This protocol allows participants to fully benefit from cognitive resources, such as semantic knowledge, to separate two talkers in a realistic listening environment. Moreover, the task also provides experimenters with a means to infer fluctuations in auditory selective attention. Two experiments document the performance of normal-hearing listeners in situations where the perceptual separability of the competing voices ranges from easy to hard, using a combination of voice and binaural cues. The results show a strong effect of voice differences when the voices are presented diotically. In addition, analysis of the influence of the semantic context on the pattern of responses indicates that the semantic information induces a response bias in situations where the competing voices are distinguishable and indistinguishable from one another.
Affiliation(s)
- Moïra-Phoebé Huet
- Laboratory of Vibration and Acoustics, National Institute of Applied Sciences, University of Lyon, 20 Avenue Albert Einstein, Villeurbanne, 69100, France
- Etienne Gaudrain
- Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Centre National de la Recherche Scientifique UMR5292, Institut National de la Santé et de la Recherche Médicale U1028, Université Claude Bernard Lyon 1, Université de Lyon, Centre Hospitalier Le Vinatier, Neurocampus, 95 boulevard Pinel, Bron Cedex, 69675, France
- Etienne Parizet
- Laboratory of Vibration and Acoustics, National Institute of Applied Sciences, University of Lyon, 20 Avenue Albert Einstein, Villeurbanne, 69100, France
5. Wang L, Wang Y, Liu Z, Wu EX, Chen F. A Speech-Level-Based Segmented Model to Decode the Dynamic Auditory Attention States in the Competing Speaker Scenes. Front Neurosci 2022; 15:760611. PMID: 35221885. PMCID: PMC8866945. DOI: 10.3389/fnins.2021.760611.
Abstract
In competing-speaker environments, human listeners need to focus or switch their auditory attention according to their changing intentions. Reliable cortical tracking of the speech envelope is an effective feature for decoding the target speech from neural signals. Moreover, previous studies revealed that root-mean-square (RMS)-level-based speech segmentation contributes substantially to target speech perception under sustained auditory attention. This study further investigated the effect of RMS-level-based speech segmentation on auditory attention decoding (AAD) performance with both sustained and switched attention in competing-speaker auditory scenes. Objective biomarkers derived from cortical activity were also developed to index dynamic auditory attention states. In the current study, subjects were asked to concentrate on, or switch their attention between, two competing speaker streams. The neural responses to the higher- and lower-RMS-level speech segments were analyzed via the linear temporal response function (TRF) before and after attention switched from one speaker stream to the other. Furthermore, the AAD performance of a unified TRF decoding model was compared with that of a speech-RMS-level-based segmented decoding model as the auditory attention state changed. The results showed that the weight of the typical TRF component at an approximately 100-ms time lag was sensitive to switches of auditory attention. Compared with the unified AAD model, the segmented AAD model improved attention decoding performance under both sustained and switched attention across a wide range of signal-to-masker ratios (SMRs). In competing-speaker scenes, the TRF weight and AAD accuracy could thus be used as effective indicators of changes in auditory attention. In addition, across a wide range of SMRs (from 6 to -6 dB in this study), the segmented AAD model showed robust decoding performance even with short decision windows, suggesting that this speech-RMS-level-based model has the potential to decode dynamic attention states in realistic auditory scenarios.
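The decoding comparison reported here ultimately reduces, within each decision window, to comparing correlations between a reconstructed envelope and the two competing envelopes. A schematic version with hypothetical variable names:

```python
import numpy as np

def decode_attention(reconstructed, env1, env2, win):
    """Window-by-window attention decisions from a reconstructed envelope.

    Returns a list of 1/2 labels, one per decision window, choosing the
    speaker whose envelope correlates more strongly with the reconstruction.
    """
    labels = []
    for start in range(0, len(reconstructed) - win + 1, win):
        seg = slice(start, start + win)
        r1 = np.corrcoef(reconstructed[seg], env1[seg])[0, 1]
        r2 = np.corrcoef(reconstructed[seg], env2[seg])[0, 1]
        labels.append(1 if r1 > r2 else 2)
    return labels
```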
Affiliation(s)
- Lei Wang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Yihan Wang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Zhixing Liu
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Ed X. Wu
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
6. Huet MP, Micheyl C, Parizet E, Gaudrain E. Behavioral Account of Attended Stream Enhances Neural Tracking. Front Neurosci 2021; 15:674112. PMID: 34966252. PMCID: PMC8710602. DOI: 10.3389/fnins.2021.674112.
Abstract
During the past decade, several studies have identified electroencephalographic (EEG) correlates of selective auditory attention to speech. In these studies, listeners are typically instructed to focus on one of two concurrent speech streams (the "target"), while ignoring the other (the "masker"). EEG signals are recorded while participants are performing this task, and subsequently analyzed to recover the attended stream. An assumption often made in these studies is that the participant's attention can remain focused on the target throughout the test. To check this assumption, and to assess when a participant's attention in a concurrent speech listening task was directed toward the target, the masker, or neither, we designed a behavioral listen-then-recall task (the Long-SWoRD test). After listening to two simultaneous short stories, participants had to identify keywords from the target story, randomly interspersed among words from the masker story and words from neither story, on a computer screen. To modulate task difficulty, and hence the likelihood of attentional switches, masker stories were originally uttered by the same talker as the target stories. The masker voice parameters were then manipulated to parametrically control the similarity of the two streams, from clearly dissimilar to almost identical. While participants listened to the stories, EEG signals were measured and subsequently analyzed using a temporal response function (TRF) model to reconstruct the speech stimuli. Responses in the behavioral recall task were used to infer, retrospectively, when attention was directed toward the target, the masker, or neither. During the model-training phase, the results of these behavioral-data-driven inferences were used as inputs to the model in addition to the EEG signals, to determine whether this additional information would improve stimulus reconstruction accuracy relative to models trained under the assumption that the listener's attention was unwaveringly focused on the target. Results from 21 participants show that information regarding the actual, as opposed to assumed, attentional focus can be used advantageously during model training to enhance subsequent (test-phase) accuracy of auditory stimulus reconstruction based on EEG signals. This is the case especially in challenging listening situations, where the participants' attention is less likely to remain focused entirely on the target talker. In situations where the two competing voices are clearly distinct and easily separated perceptually, the assumption that listeners are able to stay focused on the target is reasonable. The behavioral recall protocol introduced here provides experimenters with a means to behaviorally track fluctuations in auditory selective attention, including in combined behavioral/neurophysiological studies.
Affiliation(s)
- Moïra-Phoebé Huet
- Laboratoire Vibrations Acoustique, Institut National des Sciences Appliquées de Lyon, Université de Lyon, Villeurbanne, France
- CNRS UMR 5292, INSERM U1028, Auditory Cognition and Psychoacoustics Team, Lyon Neuroscience Research Center, Lyon, France
- Etienne Parizet
- Laboratoire Vibrations Acoustique, Institut National des Sciences Appliquées de Lyon, Université de Lyon, Villeurbanne, France
- Etienne Gaudrain
- CNRS UMR 5292, INSERM U1028, Auditory Cognition and Psychoacoustics Team, Lyon Neuroscience Research Center, Lyon, France
- Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
7. Har-shai Yahav P, Zion Golumbic E. Linguistic processing of task-irrelevant speech at a cocktail party. eLife 2021; 10:e65096. PMID: 33942722. PMCID: PMC8163500. DOI: 10.7554/elife.65096.
Abstract
Paying attention to one speaker in a noisy place can be extremely difficult, because to-be-attended and task-irrelevant speech compete for processing resources. We tested whether this competition is restricted to acoustic-phonetic interference or if it extends to competition for linguistic processing as well. Neural activity was recorded using Magnetoencephalography as human participants were instructed to attend to natural speech presented to one ear, and task-irrelevant stimuli were presented to the other. Task-irrelevant stimuli consisted either of random sequences of syllables, or syllables structured to form coherent sentences, using hierarchical frequency-tagging. We find that the phrasal structure of structured task-irrelevant stimuli was represented in the neural response in left inferior frontal and posterior parietal regions, indicating that selective attention does not fully eliminate linguistic processing of task-irrelevant speech. Additionally, neural tracking of to-be-attended speech in left inferior frontal regions was enhanced when competing with structured task-irrelevant stimuli, suggesting inherent competition between them for linguistic processing.
Affiliation(s)
- Paz Har-shai Yahav
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan, Israel
- Elana Zion Golumbic
- The Gonda Center for Multidisciplinary Brain Research, Bar Ilan University, Ramat Gan, Israel
8. Kuruvila I, Can Demir K, Fischer E, Hoppe U. Inference of the Selective Auditory Attention Using Sequential LMMSE Estimation. IEEE Trans Biomed Eng 2021; 68:3501-3512. PMID: 33891545. DOI: 10.1109/tbme.2021.3075337.
Abstract
Attentive listening in a multispeaker environment such as a cocktail party requires suppression of the interfering speakers and of the surrounding noise. People with normal hearing perform remarkably well in such situations. Analysis of cortical signals using electroencephalography (EEG) has revealed that the EEG signals track the envelope of the attended speech more strongly than that of the interfering speech. This has enabled the development of algorithms that can decode the selective attention of a listener in controlled experimental settings. However, these algorithms often require long trial durations and computationally expensive calibration to obtain a reliable inference of attention. In this paper, we present a novel framework to decode the attention of a listener within trial durations on the order of two seconds. It comprises three modules: 1) dynamic estimation of the temporal response functions (TRFs) in every trial using a sequential linear minimum mean squared error (LMMSE) estimator, 2) extraction of the N1-P2 peak of the estimated TRF, which serves as a marker related to the attentional state, and 3) a probabilistic measure of the attentional state obtained using a support vector machine followed by a logistic regression. The efficacy of the proposed decoding framework was evaluated using EEG data collected from 27 subjects. The total number of electrodes required to infer attention was four: one for signal estimation, one for noise estimation, and the other two serving as the reference and ground electrodes. Our results make further progress towards the realization of neuro-steered hearing aids.
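The second and third modules, turning a TRF peak into a probabilistic attention estimate, can be sketched with scikit-learn as below; the latency windows, toy data, and classifier settings are illustrative assumptions rather than the authors' exact configuration:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

def n1_p2_amplitude(trf, times, n1_win=(0.08, 0.16), p2_win=(0.16, 0.30)):
    """Peak-to-peak N1-P2 amplitude of a single-trial TRF (illustrative windows, in s)."""
    n1 = trf[(times >= n1_win[0]) & (times < n1_win[1])].min()
    p2 = trf[(times >= p2_win[0]) & (times < p2_win[1])].max()
    return p2 - n1

# Toy markers: one N1-P2 feature per candidate speaker, with 0/1 attention labels
rng = np.random.default_rng(2)
markers = rng.standard_normal((200, 2))
y = (markers[:, 0] > markers[:, 1]).astype(int)

svm = SVC(kernel="linear").fit(markers, y)
scores = svm.decision_function(markers).reshape(-1, 1)
# Logistic regression on the SVM scores yields a probability of attending speaker 1
prob_model = LogisticRegression().fit(scores, y)
p_attend = prob_model.predict_proba(scores)[:, 1]
```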
9. Miran S, Presacco A, Simon JZ, Fu MC, Marcus SI, Babadi B. Dynamic estimation of auditory temporal response functions via state-space models with Gaussian mixture process noise. PLoS Comput Biol 2020; 16:e1008172. PMID: 32813712. PMCID: PMC7485982. DOI: 10.1371/journal.pcbi.1008172.
Abstract
Estimating the latent dynamics underlying biological processes is a central problem in computational biology. State-space models with Gaussian statistics are widely used for estimation of such latent dynamics and have been successfully utilized in the analysis of biological data. Gaussian statistics, however, fail to capture several key features of the dynamics of biological processes (e.g., brain dynamics) such as abrupt state changes and exogenous processes that affect the states in a structured fashion. Although Gaussian mixture process noise models have been considered as an alternative to capture such effects, data-driven inference of their parameters is not well-established in the literature. The objective of this paper is to develop efficient algorithms for inferring the parameters of a general class of Gaussian mixture process noise models from noisy and limited observations, and to utilize them in extracting the neural dynamics that underlie auditory processing from magnetoencephalography (MEG) data in a cocktail party setting. We develop an algorithm based on Expectation-Maximization to estimate the process noise parameters from state-space observations. We apply our algorithm to simulated and experimentally-recorded MEG data from auditory experiments in the cocktail party paradigm to estimate the underlying dynamic Temporal Response Functions (TRFs). Our simulation results show that the richer representation of the process noise as a Gaussian mixture significantly improves state estimation and capturing the heterogeneity of the TRF dynamics. Application to MEG data reveals improvements over existing TRF estimation techniques, and provides a reliable alternative to current approaches for probing neural dynamics in a cocktail party scenario, as well as attention decoding in emerging applications such as smart hearing aids. Our proposed methodology provides a framework for efficient inference of Gaussian mixture process noise models, with application to a wide range of biological data with underlying heterogeneous and latent dynamics.
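The motivation for Gaussian mixture process noise is easiest to see in the generative model: most state increments are small, but occasional draws from a wide mixture component produce the abrupt changes a single Gaussian cannot capture. A toy simulation with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

n_steps, dim = 500, 16                    # time steps and TRF length (illustrative)
p_jump, sigma_small, sigma_big = 0.05, 0.01, 0.5

trf = np.zeros((n_steps, dim))
for t in range(1, n_steps):
    # Two-component Gaussian mixture process noise: rare, large "jump" component
    sigma = sigma_big if rng.random() < p_jump else sigma_small
    trf[t] = trf[t - 1] + sigma * rng.standard_normal(dim)
# trf now drifts slowly with occasional abrupt changes, the kind of heterogeneous
# dynamics the paper's EM-based estimator is designed to recover from noisy data.
```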
Affiliation(s)
- Sina Miran
- Starkey Hearing Technologies, Eden Prairie, Minnesota, United States of America
- Alessandro Presacco
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Jonathan Z. Simon
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical & Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Department of Biology, University of Maryland, College Park, Maryland, United States of America
- Michael C. Fu
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Robert H. Smith School of Business, University of Maryland, College Park, Maryland, United States of America
- Steven I. Marcus
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical & Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Behtash Babadi
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical & Computer Engineering, University of Maryland, College Park, Maryland, United States of America
10. Hu M, Wang D, Ji X, Yu T, Shan Y, Fan X, Du J, Zhang X, Zhao G, Wang Y, Ren L, Liégeois-Chauvel C. Neural processes of auditory perception in Heschl's gyrus for upcoming acoustic stimuli in humans. Hear Res 2020; 388:107895. PMID: 31982643. DOI: 10.1016/j.heares.2020.107895.
Abstract
In the natural environment, attended sounds tend to be perceived much better than unattended sounds. However, the physiological mechanism of how our neural systems direct the state of perceptual attention to prepare for the detection of upcoming acoustic stimuli before auditory stream segregation remains elusive. In this study, based on the direct intracerebral recordings from the auditory cortex in eight epileptic patients with refractory focal seizures, we investigated the neural processing of auditory attention by comparing the local field potentials before 'attentional' and 'distracted' conditions. Here we first showed a distinct build-up of slow, negative cortical potential in Heschl's gyrus. The amplitude increased steadily, starting from 600 to 800 ms before presentation of the tone until the onset of the evoked component P/N 60-80 when the patients were in the attentional condition. Because of their specific topographical distribution and modality-specific properties, we named these 'auditory preparatory potentials', which are also associated with increased gamma oscillations (30-150 Hz) and desynchronized low frequency activity (below 30 Hz). Thus, our findings suggest that the auditory cortex is pre-activated to facilitate the perception of forthcoming sound events, and contribute to the understanding of the neurophysiological mechanisms of auditory perception from a new perspective.
Affiliation(s)
- Minjing Hu
- Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China; Department of Neurology, Affiliated Hospital of Nantong University, Nantong, China
- Di Wang
- Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China
- Xuanxiu Ji
- Second Department of Geriatric Division, General Hospital of Jinan Military Region, Jinan, China
- Tao Yu
- Beijing Institute of Functional Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Yongzhi Shan
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Xiaotong Fan
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Jialin Du
- Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China
- Xiaohua Zhang
- Beijing Institute of Functional Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Guoguang Zhao
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China.
- Yuping Wang
- Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China.
- Liankun Ren
- Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China.
- Catherine Liégeois-Chauvel
- Aix Marseille Université, Inserm, Institut des Neurosciences des Systemes, Marseille, France; Cleveland Clinic Neurological Institute, Epilepsy Center, Cleveland, OH, USA
11. Presacco A, Miran S, Babadi B, Simon JZ. Real-Time Tracking of Magnetoencephalographic Neuromarkers during a Dynamic Attention-Switching Task. Annu Int Conf IEEE Eng Med Biol Soc 2019:4148-4151. PMID: 31946783. DOI: 10.1109/embc.2019.8857953.
Abstract
In the last few years, a large number of experiments have been focused on exploring the possibility of using non-invasive techniques, such as electroencephalography (EEG) and magnetoencephalography (MEG), to identify auditory-related neuromarkers which are modulated by attention. Results from several studies where participants listen to a story narrated by one speaker, while trying to ignore a different story narrated by a competing speaker, suggest the feasibility of extracting neuromarkers that demonstrate enhanced phase locking to the attended speech stream. These promising findings have the potential to be used in clinical applications, such as EEG-driven hearing aids. One major challenge in achieving this goal is the need to devise an algorithm capable of tracking these neuromarkers in real-time when individuals are given the freedom to repeatedly switch attention among speakers at will. Here we present an algorithm pipeline that is designed to efficiently recognize changes of neural speech tracking during a dynamic-attention switching task and to use them as an input for a near real-time state-space model that translates these neuromarkers into attentional state estimates with a minimal delay. This algorithm pipeline was tested with MEG data collected from participants who had the freedom to change the focus of their attention between two speakers at will. Results suggest the feasibility of using our algorithm pipeline to track changes of attention in near-real time in a dynamic auditory scene.
12. Das P, Brodbeck C, Simon JZ, Babadi B. Neuro-current response functions: A unified approach to MEG source analysis under the continuous stimuli paradigm. Neuroimage 2020; 211:116528. PMID: 31945510. DOI: 10.1016/j.neuroimage.2020.116528.
Abstract
Characterizing the neural dynamics underlying sensory processing is one of the central areas of investigation in systems and cognitive neuroscience. Neuroimaging techniques such as magnetoencephalography (MEG) and Electroencephalography (EEG) have provided significant insights into the neural processing of continuous stimuli, such as speech, thanks to their high temporal resolution. Existing work in the context of auditory processing suggests that certain features of speech, such as the acoustic envelope, can be used as reliable linear predictors of the neural response manifested in M/EEG. The corresponding linear filters are referred to as temporal response functions (TRFs). While the functional roles of specific components of the TRF are well-studied and linked to behavioral attributes such as attention, the cortical origins of the underlying neural processes are not as well understood. In this work, we address this issue by estimating a linear filter representation of cortical sources directly from neuroimaging data in the context of continuous speech processing. To this end, we introduce Neuro-Current Response Functions (NCRFs), a set of linear filters, spatially distributed throughout the cortex, that predict the cortical currents giving rise to the observed ongoing MEG (or EEG) data in response to continuous speech. NCRF estimation is cast within a Bayesian framework, which allows unification of the TRF and source estimation problems, and also facilitates the incorporation of prior information on the structural properties of the NCRFs. To generalize this analysis to M/EEG recordings which lack individual structural magnetic resonance (MR) scans, NCRFs are extended to free-orientation dipoles and a novel regularizing scheme is put forward to lessen reliance on fine-tuned coordinate co-registration. We present a fast estimation algorithm, which we refer to as the Champ-Lasso algorithm, by leveraging recent advances in optimization, and demonstrate its utility through application to simulated and experimentally recorded MEG data under auditory experiments. Our simulation studies reveal significant improvements over existing methods that typically operate in a two-stage fashion, in terms of spatial resolution, response function reconstruction, and recovering dipole orientations. The analysis of experimentally-recorded MEG data without MR scans corroborates existing findings, but also delineates the distinct cortical distribution of the underlying neural processes at high spatiotemporal resolution. In summary, we provide a principled modeling and estimation paradigm for MEG source analysis tailored to extracting the cortical origin of electrophysiological responses to continuous stimuli.
Affiliation(s)
- Proloy Das
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 20742, USA; Institute for Systems Research, University of Maryland, College Park, MD, 20742, USA.
- Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, MD, 20742, USA.
- Jonathan Z Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 20742, USA; Institute for Systems Research, University of Maryland, College Park, MD, 20742, USA; Department of Biology, University of Maryland, College Park, MD, 20742, USA.
- Behtash Babadi
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 20742, USA; Institute for Systems Research, University of Maryland, College Park, MD, 20742, USA.
13. Das N, Vanthornhout J, Francart T, Bertrand A. Stimulus-aware spatial filtering for single-trial neural response and temporal response function estimation in high-density EEG with applications in auditory research. Neuroimage 2020; 204:116211. PMID: 31546052. PMCID: PMC7355237. DOI: 10.1016/j.neuroimage.2019.116211.
Abstract
A common problem in neural recordings is the low signal-to-noise ratio (SNR), particularly when using non-invasive techniques like magneto- or electroencephalography (M/EEG). To address this problem, experimental designs often include repeated trials, which are then averaged to improve the SNR or to infer statistics that can be used in the design of a denoising spatial filter. However, collecting enough repeated trials is often impractical and even impossible in some paradigms, while analyses on existing data sets may be hampered when these do not contain such repeated trials. Therefore, we present a data-driven method that takes advantage of the knowledge of the presented stimulus, to achieve a joint noise reduction and dimensionality reduction without the need for repeated trials. The method first estimates the stimulus-driven neural response using the given stimulus, which is then used to find a set of spatial filters that maximize the SNR based on a generalized eigenvalue decomposition. As the method is fully data-driven, the dimensionality reduction enables researchers to perform their analyses without having to rely on their knowledge of brain regions of interest, which increases accuracy and reduces the human factor in the results. In the context of neural tracking of a speech stimulus using EEG, our method resulted in more accurate short-term temporal response function (TRF) estimates, higher correlations between predicted and actual neural responses, and higher attention decoding accuracies compared to existing TRF-based decoding methods. We also provide an extensive discussion on the central role played by the generalized eigenvalue decomposition in various denoising methods in the literature, and address the conceptual similarities and differences with our proposed method.
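The core step is a generalized eigenvalue decomposition (GEVD) contrasting the stimulus-driven part of the EEG with the residual. A compact sketch, assuming the stimulus-driven response has already been estimated (e.g., from a forward model); practical use would add regularization of the covariance estimates:

```python
import numpy as np
from scipy.linalg import eigh

def gevd_spatial_filters(eeg, stim_driven, n_keep=4):
    """Spatial filters maximizing stimulus-driven vs. residual power.

    eeg, stim_driven : (n_samples, n_channels) arrays; stim_driven is the
    forward-model prediction of the stimulus-related EEG.
    Returns a (n_channels, n_keep) matrix whose columns are spatial filters.
    """
    residual = eeg - stim_driven
    S = np.cov(stim_driven, rowvar=False)   # "signal" covariance
    R = np.cov(residual, rowvar=False)      # "noise" covariance (may need diagonal loading)
    # Generalized eigenvalue problem: S w = lambda R w
    evals, evecs = eigh(S, R)
    order = np.argsort(evals)[::-1]
    return evecs[:, order[:n_keep]]
```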
Affiliation(s)
- Neetha Das
- Dept. Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium; Dept. Neurosciences, ExpORL, KU Leuven, Herestraat 49 Bus 721, B-3000, Leuven, Belgium.
- Jonas Vanthornhout
- Dept. Neurosciences, ExpORL, KU Leuven, Herestraat 49 Bus 721, B-3000, Leuven, Belgium
- Tom Francart
- Dept. Neurosciences, ExpORL, KU Leuven, Herestraat 49 Bus 721, B-3000, Leuven, Belgium
- Alexander Bertrand
- Dept. Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Kasteelpark Arenberg 10, B-3001, Leuven, Belgium.
14. Aroudi A, Mirkovic B, De Vos M, Doclo S. Impact of Different Acoustic Components on EEG-Based Auditory Attention Decoding in Noisy and Reverberant Conditions. IEEE Trans Neural Syst Rehabil Eng 2019; 27:652-663. DOI: 10.1109/tnsre.2019.2903404.
15. Alickovic E, Lunner T, Gustafsson F, Ljung L. A Tutorial on Auditory Attention Identification Methods. Front Neurosci 2019; 13:153. PMID: 30941002. PMCID: PMC6434370. DOI: 10.3389/fnins.2019.00153.
Abstract
Auditory attention identification methods attempt to identify the sound source of a listener's interest by analyzing measurements of electrophysiological data. We present a tutorial on the numerous techniques that have been developed in recent decades, and we present an overview of current trends in multivariate correlation-based and model-based learning frameworks. The focus is on the use of linear relations between electrophysiological and audio data. The way in which these relations are computed differs. For example, canonical correlation analysis (CCA) finds a linear subset of electrophysiological data that best correlates to audio data and a similar subset of audio data that best correlates to electrophysiological data. Model-based (encoding and decoding) approaches focus on either of these two sets. We investigate the similarities and differences between these linear model philosophies. We focus on (1) correlation-based approaches (CCA), (2) encoding/decoding models based on dense estimation, and (3) (adaptive) encoding/decoding models based on sparse estimation. The specific focus is on sparsity-driven adaptive encoding models and comparing the methodology in state-of-the-art models found in the auditory literature. Furthermore, we outline the main signal processing pipeline for how to identify the attended sound source in a cocktail party environment from the raw electrophysiological data with all the necessary steps, complemented with the necessary MATLAB code and the relevant references for each step. Our main aim is to compare the methodology of the available methods, and provide numerical illustrations to some of them to get a feeling for their potential. A thorough performance comparison is outside the scope of this tutorial.
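As a concrete anchor for the correlation-based family, the CCA step itself can be run with scikit-learn; the lagged feature matrices below are random placeholders (the tutorial itself provides MATLAB code for the full pipeline):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(4)

# Placeholder features: lagged EEG (samples x channel-lags) and lagged envelope features
eeg_feats = rng.standard_normal((2000, 128))
audio_feats = rng.standard_normal((2000, 16))
audio_feats[:, 0] += eeg_feats[:, 0]             # inject a shared component

cca = CCA(n_components=3)
cca.fit(eeg_feats[:1500], audio_feats[:1500])     # train on the first part
eeg_c, audio_c = cca.transform(eeg_feats[1500:], audio_feats[1500:])
canon_corrs = [np.corrcoef(eeg_c[:, k], audio_c[:, k])[0, 1] for k in range(3)]
# In attention identification, such canonical correlations are computed per candidate
# speaker, and the stream with the larger correlation is taken as attended.
```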
Affiliation(s)
- Emina Alickovic
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark
- Thomas Lunner
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark
- Hearing Systems, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Swedish Institute for Disability Research, Linnaeus Centre HEAD, Linkoping University, Linkoping, Sweden
- Fredrik Gustafsson
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
- Lennart Ljung
- Department of Electrical Engineering, Linkoping University, Linkoping, Sweden
16. Teoh ES, Lalor EC. EEG decoding of the target speaker in a cocktail party scenario: considerations regarding dynamic switching of talker location. J Neural Eng 2019; 16:036017. PMID: 30836345. DOI: 10.1088/1741-2552/ab0cf1.
Abstract
OBJECTIVE It has been shown that attentional selection in a simple dichotic listening paradigm can be decoded offline by reconstructing the stimulus envelope from single-trial neural response data. Here, we test the efficacy of this approach in an environment with non-stationary talkers. We then look beyond the envelope reconstructions themselves and consider whether incorporating the decoder values-which reflect the weightings applied to the multichannel EEG data at different time lags and scalp locations when reconstructing the stimulus envelope-can improve decoding performance. APPROACH High-density EEG was recorded as subjects attended to one of two talkers. The two speech streams were filtered using HRTFs, and the talkers were alternated between the left and right locations at varying intervals to simulate a dynamic environment. We trained spatio-temporal decoders mapping from EEG data to the attended and unattended stimulus envelopes. We then decoded auditory attention by (1) using the attended decoder to reconstruct the envelope and (2) exploiting the fact that decoder weightings themselves contain signatures of attention, resulting in consistent patterns across subjects that can be classified. MAIN RESULTS The previously established decoding approach was found to be effective even with non-stationary talkers. Signatures of attentional selection and attended direction were found in the spatio-temporal structure of the decoders and were consistent across subjects. The inclusion of decoder weights into the decoding algorithm resulted in significantly improved decoding accuracies (from 61.07% to 65.31% for 4 s windows). An attempt was made to include alpha power lateralization as another feature to improve decoding, although this was unsuccessful at the single-trial level. SIGNIFICANCE This work suggests that the spatial-temporal decoder weights can be utilised to improve decoding. More generally, looking beyond envelope reconstruction and incorporating other signatures of attention is an avenue that should be explored to improve selective auditory attention decoding.
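The second decoding route, classifying the decoder weights themselves, amounts to flattening each per-trial spatio-temporal decoder into a feature vector and feeding it to a standard classifier. A hypothetical sketch on toy data:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)

n_trials, n_channels, n_lags = 120, 64, 20
# Toy per-trial decoder weights; attend-right trials get a small injected pattern
W = rng.standard_normal((n_trials, n_channels, n_lags))
labels = rng.integers(0, 2, n_trials)            # 0 = attend left, 1 = attend right
W[labels == 1, :8, 5:10] += 0.4                  # hypothetical attention signature

X = W.reshape(n_trials, -1)                      # flatten spatio-temporal weights
clf = LinearDiscriminantAnalysis()
acc = cross_val_score(clf, X, labels, cv=5).mean()
print(f"Cross-validated accuracy from decoder weights: {acc:.2f}")
```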
Affiliation(s)
- Emily S Teoh
- School of Engineering, Trinity College Dublin, University of Dublin, Dublin, Ireland. Trinity Centre for Bioengineering, Trinity College Dublin, Dublin, Ireland
17. Brodbeck C, Presacco A, Simon JZ. Neural source dynamics of brain responses to continuous stimuli: Speech processing from acoustics to comprehension. Neuroimage 2018; 172:162-174. PMID: 29366698. PMCID: PMC5910254. DOI: 10.1016/j.neuroimage.2018.01.042.
Abstract
Human experience often involves continuous sensory information that unfolds over time. This is true in particular for speech comprehension, where continuous acoustic signals are processed over seconds or even minutes. We show that brain responses to such continuous stimuli can be investigated in detail, for magnetoencephalography (MEG) data, by combining linear kernel estimation with minimum norm source localization. Previous research has shown that the requirement to average data over many trials can be overcome by modeling the brain response as a linear convolution of the stimulus and a kernel, or response function, and estimating a kernel that predicts the response from the stimulus. However, such analysis has been typically restricted to sensor space. Here we demonstrate that this analysis can also be performed in neural source space. We first computed distributed minimum norm current source estimates for continuous MEG recordings, and then computed response functions for the current estimate at each source element, using the boosting algorithm with cross-validation. Permutation tests can then assess the significance of individual predictor variables, as well as features of the corresponding spatio-temporal response functions. We demonstrate the viability of this technique by computing spatio-temporal response functions for speech stimuli, using predictor variables reflecting acoustic, lexical and semantic processing. Results indicate that processes related to comprehension of continuous speech can be differentiated anatomically as well as temporally: acoustic information engaged auditory cortex at short latencies, followed by responses over the central sulcus and inferior frontal gyrus, possibly related to somatosensory/motor cortex involvement in speech perception; lexical frequency was associated with a left-lateralized response in auditory cortex and subsequent bilateral frontal activity; and semantic composition was associated with bilateral temporal and frontal brain activity. We conclude that this technique can be used to study the neural processing of continuous stimuli in time and anatomical space with the millisecond temporal resolution of MEG. This suggests new avenues for analyzing neural processing of naturalistic stimuli, without the necessity of averaging over artificially short or truncated stimuli.
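The boosting estimator used here builds the response function by repeatedly nudging the single coefficient that most reduces the residual, yielding sparse kernels. A stripped-down version of that idea, without the cross-validated early stopping used in the paper:

```python
import numpy as np

def boosting_trf(X, y, n_iter=500, delta=0.005):
    """Greedy coordinate-wise 'boosting' estimate of a response function.

    X : (n_samples, n_lags) lagged stimulus matrix, columns scaled to unit norm
    y : (n_samples,) neural response (sensor signal or source current)
    """
    w = np.zeros(X.shape[1])
    residual = y.astype(float).copy()
    for _ in range(n_iter):
        corr = X.T @ residual                  # fit improvement available per coordinate
        j = int(np.argmax(np.abs(corr)))
        step = delta * np.sign(corr[j])
        w[j] += step                           # small step on the best coordinate
        residual -= step * X[:, j]
    return w
```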
Affiliation(s)
- Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, MD, USA.
- Jonathan Z Simon
- Institute for Systems Research, University of Maryland, College Park, MD, USA; Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA; Department of Biology, University of Maryland, College Park, MD, USA
18. Miran S, Akram S, Sheikhattar A, Simon JZ, Zhang T, Babadi B. Real-Time Tracking of Selective Auditory Attention From M/EEG: A Bayesian Filtering Approach. Front Neurosci 2018; 12:262. PMID: 29765298. PMCID: PMC5938416. DOI: 10.3389/fnins.2018.00262.
Abstract
Humans are able to identify and track a target speaker amid a cacophony of acoustic interference, an ability which is often referred to as the cocktail party phenomenon. Results from several decades of studying this phenomenon have culminated in recent years in various promising attempts to decode the attentional state of a listener in a competing-speaker environment from non-invasive neuroimaging recordings such as magnetoencephalography (MEG) and electroencephalography (EEG). To this end, most existing approaches compute correlation-based measures by either regressing the features of each speech stream to the M/EEG channels (the decoding approach) or vice versa (the encoding approach). To produce robust results, these procedures require multiple trials for training purposes. Also, their decoding accuracy drops significantly when operating at high temporal resolutions. Thus, they are not well-suited for emerging real-time applications such as smart hearing aid devices or brain-computer interface systems, where training data might be limited and high temporal resolutions are desired. In this paper, we close this gap by developing an algorithmic pipeline for real-time decoding of the attentional state. Our proposed framework consists of three main modules: (1) Real-time and robust estimation of encoding or decoding coefficients, achieved by sparse adaptive filtering, (2) Extracting reliable markers of the attentional state, and thereby generalizing the widely-used correlation-based measures thereof, and (3) Devising a near real-time state-space estimator that translates the noisy and variable attention markers to robust and statistically interpretable estimates of the attentional state with minimal delay. Our proposed algorithms integrate various techniques including forgetting factor-based adaptive filtering, ℓ1-regularization, forward-backward splitting algorithms, fixed-lag smoothing, and Expectation Maximization. We validate the performance of our proposed framework using comprehensive simulations as well as application to experimentally acquired M/EEG data. Our results reveal that the proposed real-time algorithms perform nearly as accurately as the existing state-of-the-art offline techniques, while providing a significant degree of adaptivity, statistical robustness, and computational savings.
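Module (1), adaptive estimation with a forgetting factor, is essentially recursive least squares. A minimal update step, omitting the ℓ1 penalty and the state-space smoothing the paper layers on top:

```python
import numpy as np

def rls_step(w, P, x, d, lam=0.995):
    """One recursive-least-squares update with forgetting factor lam.

    w : (n_features,) current decoder/encoder coefficients
    P : (n_features, n_features) inverse correlation matrix estimate
    x : (n_features,) new regressor sample (e.g., lagged envelope or EEG features)
    d : scalar target sample (e.g., EEG channel value or envelope value)
    """
    Px = P @ x
    k = Px / (lam + x @ Px)                    # gain vector
    e = d - w @ x                              # a-priori prediction error
    w_new = w + k * e
    P_new = (P - np.outer(k, Px)) / lam
    return w_new, P_new, e

# Initialization: zero weights and a large multiple of the identity for P
n = 32
w, P = np.zeros(n), 1e3 * np.eye(n)
```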
Affiliation(s)
- Sina Miran
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States
- Alireza Sheikhattar
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States
- Jonathan Z Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States; Institute for Systems Research, University of Maryland, College Park, MD, United States; Department of Biology, University of Maryland, College Park, MD, United States
- Tao Zhang
- Starkey Hearing Technologies, Eden Prairie, MN, United States
- Behtash Babadi
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States; Institute for Systems Research, University of Maryland, College Park, MD, United States