1. Roebben A, Heintz N, Geirnaert S, Francart T, Bertrand A. 'Are you even listening?' - EEG-based decoding of absolute auditory attention to natural speech. J Neural Eng 2024; 21:036046. [PMID: 38834062 DOI: 10.1088/1741-2552/ad5403]
Abstract
Objective. In this study, we use electroencephalography (EEG) recordings to determine whether a subject is actively listening to a presented speech stimulus. More precisely, we aim to discriminate between an active listening condition and a distractor condition in which subjects focus on an unrelated distractor task while being exposed to a speech stimulus. We refer to this task as absolute auditory attention decoding. Approach. We re-use an existing EEG dataset in which the subjects watch a silent movie as a distractor condition, and introduce a new dataset with two distractor conditions (silently reading a text and performing arithmetic exercises). We focus on two EEG features, namely neural envelope tracking (NET) and spectral entropy (SE). Additionally, we investigate whether the detection of such an active listening condition can be combined with a selective auditory attention decoding (sAAD) task, in which the goal is to decide to which of multiple competing speakers the subject is attending. The latter is a key task in so-called neuro-steered hearing devices, which aim to suppress unattended audio while preserving the attended speaker. Main results. Contrary to a previous hypothesis that higher SE relates to active rather than passive listening (without any distractors), we find significantly lower SE in the active listening condition compared to the distractor conditions. The NET, by contrast, is consistently and significantly higher when actively listening. Similarly, we show that the accuracy of a sAAD task improves when it is evaluated only on the highest-NET segments, whereas the reverse is observed when it is evaluated only on the lowest-SE segments. Significance. We conclude that the NET is the more reliable feature for decoding absolute auditory attention, as it is consistently higher when actively listening, whereas the relation of the SE between active and passive listening appears to depend on the nature of the distractor.
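As a rough illustration of the two EEG features named above, the sketch below computes an envelope-tracking correlation and a normalized spectral entropy on synthetic data. The sampling rate, window length, and single-channel setup are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(0)
fs = 64                                   # assumed EEG sampling rate (Hz)
envelope = rng.standard_normal(60 * fs)   # stand-in speech envelope (1 min)
eeg = 0.3 * envelope + rng.standard_normal(60 * fs)  # one synthetic EEG channel

# Neural envelope tracking: correlation between speech envelope and EEG
net = np.corrcoef(envelope, eeg)[0, 1]

# Spectral entropy: Shannon entropy of the normalized power spectral density
_, psd = welch(eeg, fs=fs, nperseg=4 * fs)
p = psd / psd.sum()
se = -np.sum(p * np.log2(p)) / np.log2(len(p))  # normalized to [0, 1]

print(f"NET correlation: {net:.3f}, spectral entropy: {se:.3f}")
```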
Affiliation(s)
- Arnout Roebben
- KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Nicolas Heintz
- KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- KU Leuven, Department of Neurosciences, Experimental Oto-Rhino-Laryngology (ExpORL), Leuven, Belgium
- Leuven.AI-KU Leuven institute for AI, Leuven, Belgium
- Simon Geirnaert
- KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- KU Leuven, Department of Neurosciences, Experimental Oto-Rhino-Laryngology (ExpORL), Leuven, Belgium
- Leuven.AI-KU Leuven institute for AI, Leuven, Belgium
- Tom Francart
- KU Leuven, Department of Neurosciences, Experimental Oto-Rhino-Laryngology (ExpORL), Leuven, Belgium
- Leuven.AI-KU Leuven institute for AI, Leuven, Belgium
- Alexander Bertrand
- KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Leuven.AI-KU Leuven institute for AI, Leuven, Belgium

2. Mizokuchi K, Tanaka T, Sato TG, Shiraki Y. Alpha band modulation caused by selective attention to music enables EEG classification. Cogn Neurodyn 2024; 18:1005-1020. [PMID: 38826648 PMCID: PMC11143110 DOI: 10.1007/s11571-023-09955-x]
Abstract
Humans are able to pay selective attention to music or speech in the presence of multiple sounds. In the speech domain, selective attention has been reported to enhance the cross-correlation between the speech envelope and the electroencephalogram (EEG) while also affecting the spatial modulation of the alpha band. However, when multiple pieces of music are performed at the same time, it is unclear how selective attention affects neural entrainment and spatial modulation. In this paper, we hypothesized that entrainment to the attended music differs from that to the unattended music and that spatial modulation in the alpha band occurs in conjunction with attention. We conducted experiments in which we presented musical excerpts to 15 participants, each listening to two excerpts simultaneously but paying attention to one of the two. The results showed that the cross-correlation function between the EEG signal and the envelope of the unattended melody had a more prominent peak than that of the attended melody, contrary to the findings for speech. In addition, spatial modulation in the alpha band was found with a data-driven approach called the common spatial pattern (CSP) method. Classification of the EEG signal with a support vector machine identified the attended melodies and achieved an accuracy of 100% for 11 of the 15 participants. These results suggest that selective attention to music suppresses entrainment to the melody and that spatial modulation of the alpha band occurs in conjunction with attention. To the best of our knowledge, this is the first report of detecting attended music consisting of several types of musical notes using only EEG.
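A minimal sketch of the kind of CSP-plus-SVM classification described above, using MNE-Python and scikit-learn on random stand-in epochs. Epoch counts, channel counts, and the number of CSP components are assumptions, not the study's parameters.

```python
import numpy as np
from mne.decoding import CSP
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 60, 32, 512
X = rng.standard_normal((n_epochs, n_channels, n_times))  # alpha-band EEG epochs
y = rng.integers(0, 2, n_epochs)                          # attended melody (0 or 1)

# CSP finds spatial filters that maximize the variance ratio between classes;
# the log band power of the filtered signals feeds a nonlinear SVM.
clf = make_pipeline(CSP(n_components=4, log=True), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```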
Affiliation(s)
- Kana Mizokuchi
- Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo, Japan
- Toshihisa Tanaka
- Department of Electrical Engineering and Computer Science, Tokyo University of Agriculture and Technology, Tokyo, Japan
- Takashi G. Sato
- NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Kanagawa, Japan
- Yoshifumi Shiraki
- NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Kanagawa, Japan

3. Gao J, Chen H, Fang M, Ding N. Original speech and its echo are segregated and separately processed in the human brain. PLoS Biol 2024; 22:e3002498. [PMID: 38358954 PMCID: PMC10868781 DOI: 10.1371/journal.pbio.3002498]
Abstract
Speech recognition relies crucially on slow temporal modulations (<16 Hz) in speech. Recent studies, however, have demonstrated that long-delay echoes, which are common during online conferencing, can eliminate crucial temporal modulations in speech without affecting speech intelligibility. Here, we investigated the underlying neural mechanisms. MEG experiments demonstrated that cortical activity can effectively track the temporal modulations eliminated by an echo, and that this tracking cannot be fully explained by basic neural adaptation mechanisms. Furthermore, cortical responses to echoic speech were better explained by a model that segregates speech from its echo than by a model that encodes echoic speech as a whole. The speech segregation effect was observed even when attention was diverted, but disappeared when segregation cues, i.e., speech fine structure, were removed. These results strongly suggest that, through mechanisms such as stream segregation, the auditory system can build an echo-insensitive representation of the speech envelope, which can support reliable speech recognition.
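A toy illustration of why an echo can cancel specific temporal modulations: adding a delayed copy of a signal comb-filters its envelope, with a null near f = 1/(2·delay) for an equal-amplitude echo. The delay and modulation rates below are illustrative, not the stimulus parameters of the study, and summing envelopes is itself a simplification.

```python
import numpy as np

fs = 100                  # envelope sampling rate (Hz)
delay = 0.1               # echo delay (s) -> first modulation null near 5 Hz
t = np.arange(0, 10, 1 / fs)

for fm in (2.0, 5.0, 8.0):                           # modulation frequencies (Hz)
    direct = 1 + np.cos(2 * np.pi * fm * t)          # modulated speech envelope
    echo = 1 + np.cos(2 * np.pi * fm * (t - delay))  # envelope of the delayed copy
    mixed = direct + echo                            # envelope of speech + echo
    depth = (mixed.max() - mixed.min()) / 2          # residual modulation depth
    print(f"{fm:.0f} Hz modulation: depth {depth:.2f} after the echo")
```

At 5 Hz the two copies are in antiphase and the modulation vanishes, while neighboring rates survive, matching the idea that an echo removes some modulations entirely.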
Affiliation(s)
- Jiaxin Gao
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Honghua Chen
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Mingxuan Fang
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Nanhu Brain-computer Interface Institute, Hangzhou, China
- The State key Lab of Brain-Machine Intelligence; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou, China

4. Zhang X, Li J, Li Z, Hong B, Diao T, Ma X, Nolte G, Engel AK, Zhang D. Leading and following: Noise differently affects semantic and acoustic processing during naturalistic speech comprehension. Neuroimage 2023; 282:120404. [PMID: 37806465 DOI: 10.1016/j.neuroimage.2023.120404]
Abstract
Despite the distortion of speech signals caused by unavoidable noise in daily life, our ability to comprehend speech in noisy environments is relatively stable. However, the neural mechanisms underlying reliable speech-in-noise comprehension remain to be elucidated. The present study investigated the neural tracking of acoustic and semantic speech information during noisy naturalistic speech comprehension. Participants listened to narrative audio recordings mixed with spectrally matched stationary noise at three signal-to-noise ratio (SNR) levels (no noise, 3 dB, -3 dB), and 60-channel electroencephalography (EEG) signals were recorded. A temporal response function (TRF) method was employed to derive event-related-like responses to the continuous speech stream at both the acoustic and the semantic levels. Whereas the amplitude envelope of the naturalistic speech was taken as the acoustic feature, word entropy and word surprisal were extracted via natural language processing methods as two semantic features. Theta-band frontocentral TRF responses to the acoustic feature were observed at around 400 ms following speech fluctuation onset at all three SNR levels, and the response latencies were increasingly delayed with increasing noise. Delta-band frontal TRF responses to the semantic feature of word entropy were observed at around 200 to 600 ms preceding speech fluctuation onset at all three SNR levels. These response latencies became more leading with increasing noise and decreasing speech comprehension and intelligibility. While the following responses to speech acoustics were consistent with previous studies, our study revealed the robustness of leading responses to speech semantics, which suggests a possible predictive mechanism at the semantic level for maintaining reliable speech comprehension in noisy environments.
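A compact sketch of a forward TRF estimated by ridge regression on time-lagged copies of a stimulus feature, which is the generic technique behind analyses like the one above. The lag range, sampling rate, and regularization constant are arbitrary choices for the toy data, not the paper's settings.

```python
import numpy as np

def lagged_design(x, lags):
    """Stack time-shifted copies of x into a (time, n_lags) design matrix."""
    X = np.zeros((len(x), len(lags)))
    for j, lag in enumerate(lags):          # non-negative lags only, for brevity
        X[lag:, j] = x[:len(x) - lag]
    return X

rng = np.random.default_rng(0)
fs = 64
feature = rng.standard_normal(120 * fs)               # e.g., the speech envelope
lags = np.arange(int(0.5 * fs))                       # 0-500 ms lags
true_trf = np.exp(-((lags / fs - 0.1) ** 2) / 0.001)  # response peaking near 100 ms
eeg = lagged_design(feature, lags) @ true_trf + rng.standard_normal(120 * fs)

X = lagged_design(feature, lags)
lam = 100.0                                           # ridge regularization
trf = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg)
print("estimated TRF peak lag:", 1000 * lags[np.argmax(trf)] / fs, "ms")
```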
Affiliation(s)
- Xinmiao Zhang
- Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
- Jiawei Li
- Department of Education and Psychology, Freie Universität Berlin, Berlin 14195, Federal Republic of Germany
- Zhuoran Li
- Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China
- Bo Hong
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China; Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing 100084, China
- Tongxiang Diao
- Department of Otolaryngology, Head and Neck Surgery, Peking University, People's Hospital, Beijing 100044, China
- Xin Ma
- Department of Otolaryngology, Head and Neck Surgery, Peking University, People's Hospital, Beijing 100044, China
- Guido Nolte
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Federal Republic of Germany
- Andreas K Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Federal Republic of Germany
- Dan Zhang
- Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China.

5. Wang B, Xu X, Niu Y, Wu C, Wu X, Chen J. EEG-based auditory attention decoding with audiovisual speech for hearing-impaired listeners. Cereb Cortex 2023; 33:10972-10983. [PMID: 37750333 DOI: 10.1093/cercor/bhad325]
Abstract
Auditory attention decoding (AAD) can be used to determine the attended speaker during an auditory selective attention task. However, the auditory factors modulating AAD remain unclear for hearing-impaired (HI) listeners. In this study, scalp electroencephalography (EEG) was recorded with an auditory selective attention paradigm in which HI listeners were instructed to attend to one of two simultaneous speech streams, with or without congruent visual input (articulation movements), and at a high or low target-to-masker ratio (TMR). Meanwhile, behavioral hearing tests (i.e., audiogram, speech reception threshold, temporal modulation transfer function) were used to assess listeners' individual auditory abilities. The results showed that both visual input and increasing TMR significantly enhanced the cortical tracking of the attended speech and the AAD accuracy. Further analysis revealed that the audiovisual (AV) gain in attended-speech cortical tracking was significantly correlated with listeners' auditory amplitude modulation (AM) sensitivity, and the TMR gain in attended-speech cortical tracking was significantly correlated with listeners' hearing thresholds. Temporal response function analysis revealed that subjects with higher AM sensitivity demonstrated more AV gain over the right occipitotemporal and bilateral frontocentral scalp electrodes.
Affiliation(s)
- Bo Wang
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- Xiran Xu
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- Yadong Niu
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- Chao Wu
- School of Nursing, Peking University, Beijing 100191, China
- Xihong Wu
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- National Biomedical Imaging Center, College of Future Technology, Beijing 100871, China
- Jing Chen
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- National Biomedical Imaging Center, College of Future Technology, Beijing 100871, China

6. Jia Z, Xu C, Li J, Gao J, Ding N, Luo B, Zou J. Phase Property of Envelope-Tracking EEG Response Is Preserved in Patients with Disorders of Consciousness. eNeuro 2023; 10:ENEURO.0130-23.2023. [PMID: 37500493 PMCID: PMC10420405 DOI: 10.1523/eneuro.0130-23.2023]
Abstract
When listening to speech, the low-frequency cortical response below 10 Hz can track the speech envelope. Previous studies have demonstrated that the phase lag between the speech envelope and the cortical response can reflect the mechanism by which the envelope-tracking response is generated. Here, we analyze whether the mechanism generating the envelope-tracking response is modulated by the level of consciousness, by studying how the stimulus-response phase lag is modulated by disorders of consciousness (DoC). We observe that DoC patients in general show less reliable neural tracking of speech. Nevertheless, for DoC patients who do show reliable cortical tracking of speech, the stimulus-response phase lag changes linearly with frequency between 3.5 and 8 Hz, regardless of the consciousness state. The mean phase lag is also consistent across these DoC patients. These results suggest that the envelope-tracking response to speech can be generated by an automatic process that is barely modulated by the consciousness state.
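The linear phase-frequency relation reported above implies a fixed group delay, and the sketch below recovers a latency from a set of phase lags by fitting the slope. The 120-ms latency and the 0.5-Hz frequency grid are invented for illustration.

```python
import numpy as np

freqs = np.arange(3.5, 8.5, 0.5)              # Hz, the band analyzed above
latency_true = 0.120                          # assumed fixed response delay (s)
phase_lag = 2 * np.pi * freqs * latency_true  # linear phase-frequency relation

slope = np.polyfit(freqs, np.unwrap(phase_lag), 1)[0]  # radians per Hz
print(f"latency from phase slope: {1000 * slope / (2 * np.pi):.0f} ms")
```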
Affiliation(s)
- Ziting Jia
- The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan 250033, China
- Chuan Xu
- Department of Neurology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou 310019, China
- Jingqi Li
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Jian Gao
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Benyan Luo
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310003, China
- Jiajie Zou
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China

7. Deoisres S, Lu Y, Vanheusden FJ, Bell SL, Simpson DM. Continuous speech with pauses inserted between words increases cortical tracking of speech envelope. PLoS One 2023; 18:e0289288. [PMID: 37498891 PMCID: PMC10374040 DOI: 10.1371/journal.pone.0289288]
Abstract
The decoding multivariate temporal response function (decoder), or speech envelope reconstruction, approach is a well-known tool for assessing the cortical tracking of the speech envelope. It is used to analyse the correlation between the speech stimulus and the neural response. It is known that auditory late responses are enhanced with longer gaps between stimuli, but it is not clear whether this applies to the decoder, and whether adding gaps/pauses to continuous speech could be used to increase the envelope reconstruction accuracy. We investigated this in normal-hearing participants who listened to continuous speech with no added pauses (natural speech), and then with short (250 ms) or long (500 ms) silent pauses inserted between each word. The total durations of the continuous speech stimuli with no, short, and long pauses were approximately 10, 16, and 21 minutes, respectively. EEG and the speech envelope were simultaneously acquired and then filtered into the delta (1-4 Hz) and theta (4-8 Hz) frequency bands. In addition to analysing responses to the whole speech envelope, the envelope was also segmented to focus the response analysis on onset and non-onset regions of speech separately. Our results show that continuous speech with additional pauses inserted between words significantly increases the speech envelope reconstruction correlations compared to natural speech, in both the delta and theta frequency bands. These increases in speech envelope reconstruction also appear to be dominated by the onset regions of the speech envelope. Introducing pauses into speech stimuli has potential clinical benefit for increasing auditory evoked response detectability, though with the disadvantage of the speech sounding less natural. The strong effect of pauses and onsets on the decoder should be considered when comparing results from different speech corpora. Whether the increased cortical response, when longer pauses are introduced, reflects improved intelligibility requires further investigation.
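A small sketch of one way to split an envelope into onset and non-onset regions, in the spirit of the segmentation above: take the half-wave rectified envelope derivative and threshold it. The smoothing filter and the 90th-percentile threshold are assumptions, not the paper's rule.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

rng = np.random.default_rng(0)
fs = 64
sos = butter(2, 8, btype="low", fs=fs, output="sos")
envelope = sosfiltfilt(sos, np.abs(rng.standard_normal(120 * fs)))  # smooth stand-in

d_env = np.diff(envelope, prepend=envelope[0])
onset_rate = np.clip(d_env, 0, None)          # half-wave rectified derivative
onset_mask = onset_rate > np.percentile(onset_rate, 90)  # assumed threshold

print(f"onset samples: {onset_mask.sum()} of {len(envelope)}")
```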
Affiliation(s)
- Suwijak Deoisres
- Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom
- Yuhan Lu
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Frederique J Vanheusden
- Department of Engineering, School of Science and Technology, Nottingham Trent University, Nottingham, United Kingdom
- Steven L Bell
- Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom
- David M Simpson
- Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom

8. Takai S, Kanno A, Kawase T, Shirakura M, Suzuki J, Nakasato N, Kawashima R, Katori Y. Possibility of additive effects by the presentation of visual information related to distractor sounds on the contra-sound effects of the N100m responses. Hear Res 2023; 434:108778. [PMID: 37105052 DOI: 10.1016/j.heares.2023.108778]
Abstract
Auditory-evoked responses can be affected by different types of contralateral sounds or by attention modulation. The present study examined, in 16 subjects (12 males and 4 females), the additive effects of presenting visual information related to contralateral distractor sounds during dichotic listening tasks on the contralateral effects on auditory cortical N100m responses. In magnetoencephalography, a tone-burst of 500 ms duration at a frequency of 1000 Hz was presented to the left ear at a level of 70 dB as a stimulus to elicit the N100m response, and a movie clip was used as a distractor stimulus under audio-only, visual-only, and audio-visual conditions. Subjects were instructed to pay attention to the left ear and press the response button each time they heard a tone-burst stimulus in their left ear. The results suggest that the presentation of visual information related to the contralateral sound, which acted as a distractor, significantly suppressed the amplitude of the N100m response compared with the contralateral-sound-only condition. In contrast, the presentation of visual information related to the contralateral sound did not affect the latency of the N100m response. These results suggest that the integration of contralateral sounds and related movies may have produced a more perceptually loaded stimulus and reduced the intensity of attention to the tone-bursts. Our findings suggest that selective attention and saliency mechanisms may have cross-modal effects on other modes of perception.
Affiliation(s)
- Shunsuke Takai
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8574, Japan.
- Akitake Kanno
- Department of Advanced Spintronics Medical Engineering, Graduate School of Engineering, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan; Department of Epileptology, Tohoku University Graduate School of Medicine, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Tetsuaki Kawase
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8574, Japan; Laboratory of Rehabilitative Auditory Science, Tohoku University Graduate School of Biomedical Engineering, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8574, Japan; Department of Audiology, Tohoku University Graduate School of Medicine, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8574, Japan
- Masayuki Shirakura
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8574, Japan
- Jun Suzuki
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8574, Japan
- Nobukatsu Nakasato
- Department of Advanced Spintronics Medical Engineering, Graduate School of Engineering, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan; Department of Epileptology, Tohoku University Graduate School of Medicine, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Ryuta Kawashima
- Institute of Development, Aging and Cancer, Tohoku University, 4-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Yukio Katori
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8574, Japan

9. Slugocki C, Kuk F, Korhonen P. Left Lateralization of the Cortical Auditory-Evoked Potential Reflects Aided Processing and Speech-in-Noise Performance of Older Listeners With a Hearing Loss. Ear Hear 2023; 44:399-410. [PMID: 36331191 DOI: 10.1097/aud.0000000000001293]
Abstract
OBJECTIVES We analyzed the lateralization of the cortical auditory-evoked potential recorded previously from aided hearing-impaired listeners as part of a study on noise-mitigating hearing aid technologies. Specifically, we asked whether the degree of leftward lateralization in the magnitudes and latencies of these components was reduced by noise and, conversely, enhanced/restored by hearing aid technology. We further explored if individual differences in lateralization could predict speech-in-noise abilities in listeners when tested in the aided mode. DESIGN The study followed a double-blind within-subjects design. Nineteen older adults (8 females; mean age = 73.6 years, range = 56 to 86 years) with moderate to severe hearing loss participated. The cortical auditory-evoked potential was measured over 400 presentations of a synthetic /da/ stimulus which was delivered binaurally in a simulated aided mode using shielded ear-insert transducers. Sequences of the /da/ syllable were presented from the front at 75 dB SPL-C with continuous speech-shaped noise presented from the back at signal-to-noise ratios of 0, 5, and 10 dB. Four hearing aid conditions were tested: (1) omnidirectional microphone (OM) with noise reduction (NR) disabled, (2) OM with NR enabled, (3) directional microphone (DM) with NR disabled, and (4) DM with NR enabled. Lateralization of the P1 component and N1P2 complex was quantified across electrodes spanning the mid-coronal plane. Subsequently, listener speech-in-noise performance was assessed using the Repeat-Recall Test at the same signal-to-noise ratios and hearing aid conditions used to measure cortical activity. RESULTS As expected, both the P1 component and the N1P2 complex were of greater magnitude in electrodes over the left compared to the right hemisphere. In addition, N1 and P2 peaks tended to occur earlier over the left hemisphere, although the effect was mediated by an interaction of signal-to-noise ratio and hearing aid technology. At a group level, degrees of lateralization for the P1 component and the N1P2 complex were enhanced in the DM relative to the OM mode. Moreover, linear mixed-effects models suggested that the degree of leftward lateralization in the N1P2 complex, but not the P1 component, accounted for a significant portion of variability in speech-in-noise performance that was not related to age, hearing loss, hearing aid processing, or signal-to-noise ratio. CONCLUSIONS A robust leftward lateralization of cortical potentials was observed in older listeners when tested in the aided mode. Moreover, the degree of lateralization was enhanced by hearing aid technologies that improve the signal-to-noise ratio for speech. Accounting for the effects of signal-to-noise ratio, hearing aid technology, semantic context, and audiometric thresholds, individual differences in left-lateralized speech-evoked cortical activity were found to predict listeners' speech-in-noise abilities. Quantifying cortical auditory-evoked potential component lateralization may then be useful for profiling listeners' likelihood of communication success following clinical amplification.
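One common way to quantify the kind of left-right asymmetry reported above is a normalized lateralization index over homologous electrode groups. The sketch below shows the (L - R)/(L + R) form on made-up component magnitudes; the electrode grouping and the index itself are generic choices, not necessarily the paper's exact quantification.

```python
import numpy as np

# N1P2 magnitudes (arbitrary units) over homologous electrode groups
left = np.array([2.3, 2.1, 2.6])    # hypothetical left mid-coronal sites
right = np.array([1.8, 1.7, 2.0])   # hypothetical right mid-coronal sites

li = (left.mean() - right.mean()) / (left.mean() + right.mean())
print(f"lateralization index: {li:+.2f} (positive = leftward)")
```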
Affiliation(s)
- Christopher Slugocki
- Office of Research in Clinical Amplification (ORCA-USA), WS Audiology, Lisle, Illinois, USA

10. Aljarboa GS, Bell SL, Simpson DM. Detecting cortical responses to continuous running speech using EEG data from only one channel. Int J Audiol 2023; 62:199-208. [PMID: 35152811 DOI: 10.1080/14992027.2022.2035832]
Abstract
OBJECTIVE To explore the detection of cortical responses to continuous speech using a single EEG channel; in particular, to compare detection rates and times using a cross-correlation approach and parameters extracted from the temporal response function (TRF). DESIGN EEG from 32 channels was recorded whilst presenting 25 min of continuous English speech. The detection parameters were the cross-correlation between speech and EEG (XCOR), the peak value and power of the TRF filter (TRF-peak and TRF-power), and the correlation between the TRF-predicted and true EEG (TRF-COR). A bootstrap analysis was used to determine the statistical significance of responses. Different electrode configurations were compared: using single channels Cz or Fz, or selecting the channels with the highest correlation value. STUDY SAMPLE Seventeen native English-speaking subjects with mild-to-moderate hearing loss. RESULTS Significant cortical responses were detected from all subjects at channel Fz with XCOR and TRF-COR. Lower detection times were seen for XCOR (mean = 4.8 min) than for the TRF parameters (best TRF-COR, mean = 6.4 min), with significant time differences from XCOR to TRF-peak and TRF-power. Analysing multiple EEG channels and testing the channels with the highest correlation between envelope and EEG reduced detection sensitivity compared to Fz alone. CONCLUSIONS Cortical responses to continuous speech can be detected from a single channel with recording times that may be suitable for clinical application.
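A sketch of a shift-based null distribution for deciding whether a single-channel envelope-EEG correlation is significant, in the spirit of the bootstrap analysis above. The circular-shift scheme, lag range, and permutation count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 64
envelope = rng.standard_normal(300 * fs)                       # 5 min of "speech"
eeg = 0.05 * np.roll(envelope, int(0.1 * fs)) + rng.standard_normal(300 * fs)

def peak_xcorr(env, sig, max_lag):
    """Largest |Pearson r| between env and sig over lags 0..max_lag-1 samples."""
    vals = []
    for lag in range(max_lag):
        n = len(env) - lag
        vals.append(abs(np.corrcoef(env[:n], sig[lag:lag + n])[0, 1]))
    return max(vals)

max_lag = int(0.3 * fs)
observed = peak_xcorr(envelope, eeg, max_lag)
# Null distribution: circularly shift the envelope to break its alignment
null = [peak_xcorr(np.roll(envelope, int(rng.integers(fs, len(envelope) - fs))),
                   eeg, max_lag) for _ in range(200)]
p = (1 + sum(v >= observed for v in null)) / (1 + len(null))
print(f"observed peak r = {observed:.3f}, permutation p = {p:.3f}")
```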
Affiliation(s)
- Ghadah S Aljarboa
- Institute of Sound and Vibration Research, University of Southampton, Southampton, UK; Communication Sciences, Princess Nora bint Abdul Rahman University, Riyadh, Saudi Arabia
- Steve L Bell
- Institute of Sound and Vibration Research, University of Southampton, Southampton, UK
- David M Simpson
- Institute of Sound and Vibration Research, University of Southampton, Southampton, UK

11. Rosenkranz M, Cetin T, Uslar VN, Bleichner MG. Investigating the attentional focus to workplace-related soundscapes in a complex audio-visual-motor task using EEG. Frontiers in Neuroergonomics 2023; 3:1062227. [PMID: 38235454 PMCID: PMC10790850 DOI: 10.3389/fnrgo.2022.1062227]
Abstract
Introduction In demanding work situations (e.g., during surgery), the processing of complex soundscapes varies over time and can be a burden for medical personnel. Here we study, using mobile electroencephalography (EEG), how humans process workplace-related soundscapes while performing a complex audio-visual-motor task (3D Tetris). Specifically, we wanted to know how the attentional focus changes the processing of the soundscape as a whole. Method Participants played a game of 3D Tetris in which they had to use both hands to control falling blocks. At the same time, participants listened to a complex soundscape, similar to what is found in an operating room (i.e., the sound of machinery, people talking in the background, alarm sounds, and instructions). In this within-subject design, participants had to react to instructions (e.g., "place the next block in the upper left corner") and, depending on the experimental condition, to sounds: either to a specific alarm sound originating from a fixed location or to a beep sound that originated from varying locations. Attention to the alarm reflected a narrow attentional focus, as it was easy to detect and most of the soundscape could be ignored. Attention to the beep reflected a wide attentional focus, as it required the participants to monitor multiple different sound streams. Results and discussion The results show the robustness of the N1 and P3 event-related potential responses during this dynamic task with a complex auditory soundscape. Furthermore, we used temporal response functions to study auditory processing of the whole soundscape. This work is a step toward studying workplace-related sound processing in the operating room using mobile EEG.
Affiliation(s)
- Marc Rosenkranz
- Neurophysiology of Everyday Life Group, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Timur Cetin
- Pius-Hospital Oldenburg, University Hospital for Visceral Surgery, University of Oldenburg, Oldenburg, Germany
- Verena N. Uslar
- Pius-Hospital Oldenburg, University Hospital for Visceral Surgery, University of Oldenburg, Oldenburg, Germany
- Martin G. Bleichner
- Neurophysiology of Everyday Life Group, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Research Center for Neurosensory Science, University of Oldenburg, Oldenburg, Germany

12. Mesik J, Wojtczak M. The effects of data quantity on performance of temporal response function analyses of natural speech processing. Front Neurosci 2023; 16:963629. [PMID: 36711133 PMCID: PMC9878558 DOI: 10.3389/fnins.2022.963629]
Abstract
In recent years, temporal response function (TRF) analyses of neural activity recordings evoked by continuous naturalistic stimuli have become increasingly popular for characterizing response properties within the auditory hierarchy. However, despite this rise in TRF usage, relatively few educational resources for these tools exist. Here we use a dual-talker continuous speech paradigm to demonstrate how a key parameter of experimental design, the quantity of acquired data, influences TRF analyses fit to either individual data (subject-specific analyses), or group data (generic analyses). We show that although model prediction accuracy increases monotonically with data quantity, the amount of data required to achieve significant prediction accuracies can vary substantially based on whether the fitted model contains densely (e.g., acoustic envelope) or sparsely (e.g., lexical surprisal) spaced features, especially when the goal of the analyses is to capture the aspect of neural responses uniquely explained by specific features. Moreover, we demonstrate that generic models can exhibit high performance on small amounts of test data (2-8 min), if they are trained on a sufficiently large data set. As such, they may be particularly useful for clinical and multi-task study designs with limited recording time. Finally, we show that the regularization procedure used in fitting TRF models can interact with the quantity of data used to fit the models, with larger training quantities resulting in systematically larger TRF amplitudes. Together, demonstrations in this work should aid new users of TRF analyses, and in combination with other tools, such as piloting and power analyses, may serve as a detailed reference for choosing acquisition duration in future studies.

13. Luo C, Gao Y, Fan J, Liu Y, Yu Y, Zhang X. Compromised word-level neural tracking in the high-gamma band for children with attention deficit hyperactivity disorder. Front Hum Neurosci 2023; 17:1174720. [PMID: 37213926 PMCID: PMC10196181 DOI: 10.3389/fnhum.2023.1174720]
Abstract
Children with attention deficit hyperactivity disorder (ADHD) exhibit pervasive difficulties in speech perception. Given that speech processing involves both acoustic and linguistic stages, it remains unclear which stage of speech processing is impaired in children with ADHD. To investigate this issue, we measured neural tracking of speech at the syllable and word levels using electroencephalography (EEG), and evaluated the relationship between neural responses and ADHD symptoms in 6- to 8-year-old children. Twenty-three children participated in the current study, and their ADHD symptoms were assessed with SNAP-IV questionnaires. In the experiment, the children listened to hierarchical speech sequences in which syllables and words were repeated at 2.5 and 1.25 Hz, respectively. Using frequency-domain analyses, reliable neural tracking of syllables and words was observed in both the low-frequency band (<4 Hz) and the high-gamma band (70-160 Hz). However, the neural tracking of words in the high-gamma band was anti-correlated with the ADHD symptom scores of the children. These results indicate that ADHD prominently impairs cortical encoding of linguistic information (e.g., words) in speech perception.
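A toy frequency-tagging analysis in the spirit of the paradigm above: with words at 1.25 Hz and syllables at 2.5 Hz, tracking appears as peaks at exactly those frequencies in the spectrum of the trial-averaged response. Trial length, trial count, and amplitudes are invented; the trial duration is chosen so both rates fall on exact frequency bins.

```python
import numpy as np

fs, dur, n_trials = 128, 8.0, 40          # 8-s trials -> 0.125 Hz resolution
t = np.arange(0, dur, 1 / fs)
rng = np.random.default_rng(0)
trials = (0.4 * np.sin(2 * np.pi * 1.25 * t)    # word-rate response
          + 0.6 * np.sin(2 * np.pi * 2.5 * t)   # syllable-rate response
          + rng.standard_normal((n_trials, len(t))))

spectrum = np.abs(np.fft.rfft(trials.mean(axis=0))) / len(t)
freqs = np.fft.rfftfreq(len(t), 1 / fs)
for f0 in (1.25, 2.5):
    amp = spectrum[np.argmin(np.abs(freqs - f0))]  # expect ~half the sine amplitude
    print(f"amplitude at {f0} Hz: {amp:.3f}")
```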
Affiliation(s)
- Cheng Luo
- Research Center for Applied Mathematics and Machine Intelligence, Research Institute of Basic Theories, Zhejiang Lab, Hangzhou, China
- Yayue Gao
- Department of Psychology, School of Humanities and Social Sciences, Beihang University, Beijing, China
- Jianing Fan
- Department of Psychology, School of Humanities and Social Sciences, Beihang University, Beijing, China
- Yang Liu
- Department of Psychology, School of Humanities and Social Sciences, Beihang University, Beijing, China
- Yonglin Yu
- Department of Rehabilitation, The Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China
- Xin Zhang
- Department of Neurology, The Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China

14. Wang S, Zhang X, Zhang J, Zong C. A synchronized multimodal neuroimaging dataset for studying brain language processing. Sci Data 2022; 9:590. [PMID: 36180444 PMCID: PMC9525723 DOI: 10.1038/s41597-022-01708-5]
Abstract
We present a synchronized multimodal neuroimaging dataset for studying brain language processing (SMN4Lang) that contains functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) data from the same 12 healthy volunteers while they listened to 6 hours of naturalistic stories, as well as high-resolution structural (T1, T2), diffusion MRI and resting-state fMRI data for each participant. We also provide rich linguistic annotations for the stimuli, including word frequencies, syntactic tree structures, time-aligned characters and words, and various types of word and character embeddings. Quality assessment indicators verify that this is a high-quality neuroimaging dataset. The synchronized data were collected from the same group of participants, who listened to the story materials first during fMRI and then during MEG; such data are well suited to studying the dynamic processing of language comprehension, such as the time and location of different linguistic features encoded in the brain. In addition, this dataset, comprising a large vocabulary from stories on various topics, can serve as a brain benchmark to evaluate and improve computational language models.
Measurement(s): functional brain measurement; magnetoencephalography
Technology Type(s): functional magnetic resonance imaging; magnetoencephalography
Factor Type(s): naturalistic stimuli listening
Sample Characteristic - Organism: human beings
Affiliation(s)
- Shaonan Wang
- National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Xiaohan Zhang
- National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Jiajun Zhang
- National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Chengqing Zong
- National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China

15. Gillis M, Van Canneyt J, Francart T, Vanthornhout J. Neural tracking as a diagnostic tool to assess the auditory pathway. Hear Res 2022; 426:108607. [PMID: 36137861 DOI: 10.1016/j.heares.2022.108607]
Abstract
When a person listens to sound, the brain time-locks to specific aspects of the sound. This is called neural tracking and it can be investigated by analysing neural responses (e.g., measured by electroencephalography) to continuous natural speech. Measures of neural tracking allow for an objective investigation of a range of auditory and linguistic processes in the brain during natural speech perception. This approach is more ecologically valid than traditional auditory evoked responses and has great potential for research and clinical applications. This article reviews the neural tracking framework and highlights three prominent examples of neural tracking analyses: neural tracking of the fundamental frequency of the voice (f0), the speech envelope and linguistic features. Each of these analyses provides a unique point of view into the human brain's hierarchical stages of speech processing. F0-tracking assesses the encoding of fine temporal information in the early stages of the auditory pathway, i.e., from the auditory periphery up to early processing in the primary auditory cortex. Envelope tracking reflects bottom-up and top-down speech-related processes in the auditory cortex and is likely necessary but not sufficient for speech intelligibility. Linguistic feature tracking (e.g. word or phoneme surprisal) relates to neural processes more directly related to speech intelligibility. Together these analyses form a multi-faceted objective assessment of an individual's auditory and linguistic processing.
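As a concrete companion to the envelope-tracking measures reviewed above, the sketch below extracts a low-frequency speech envelope (Hilbert magnitude, low-pass filtering, downsampling). The cutoffs and rates are common choices in this literature, not prescriptions from the review.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

fs_audio = 16000
rng = np.random.default_rng(0)
speech = rng.standard_normal(10 * fs_audio)        # stand-in for a speech recording

envelope = np.abs(hilbert(speech))                 # broadband amplitude envelope
sos = butter(4, 8, btype="low", fs=fs_audio, output="sos")
env_low = sosfiltfilt(sos, envelope)               # keep slow (<8 Hz) modulations
env_64 = env_low[:: fs_audio // 64]                # crude downsample to 64 Hz
print(env_64.shape)                                # (640,) for 10 s of audio
```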
Affiliation(s)
- Marlies Gillis
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium.
- Jana Van Canneyt
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Tom Francart
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium
- Jonas Vanthornhout
- Experimental Oto-Rhino-Laryngology, Department of Neurosciences, Leuven Brain Institute, KU Leuven, Belgium

16. Na Y, Joo H, Trang LT, Quan LDA, Woo J. Objective speech intelligibility prediction using a deep learning model with continuous speech-evoked cortical auditory responses. Front Neurosci 2022; 16:906616. [PMID: 36061597 PMCID: PMC9433707 DOI: 10.3389/fnins.2022.906616]
Abstract
Auditory prostheses provide an opportunity for the rehabilitation of hearing-impaired patients. Speech intelligibility can be used to estimate the extent to which an auditory prosthesis improves the user's speech comprehension. Although behavior-based speech intelligibility is the gold standard, precise evaluation is limited due to its subjectiveness. Here, we used a convolutional neural network to predict speech intelligibility from electroencephalography (EEG). Sixty-four-channel EEGs were recorded from 87 adult participants with normal hearing. Sentences spectrally degraded by a 2-, 3-, 4-, 5-, or 8-channel vocoder were used to create relatively low speech intelligibility conditions. A Korean sentence recognition test was used. The speech intelligibility scores were divided into 41 discrete levels ranging from 0 to 100%, with a step of 2.5%; three scores, namely 30.0, 37.5, and 40.0%, were not collected. Speech features, i.e., the speech temporal envelope (ENV) and phoneme (PH) onset, were used to extract continuous-speech EEGs for speech intelligibility prediction. The deep learning model was trained on datasets of event-related potentials (ERPs), or of correlation coefficients between the ERPs and the ENV, between the ERPs and the PH onsets, or between the ERPs and the product of PH and ENV (PHENV). The speech intelligibility prediction accuracies were 97.33% (ERP), 99.42% (ENV), 99.55% (PH), and 99.91% (PHENV). The models were interpreted using the occlusion sensitivity approach. While the informative electrodes of the ENV model were located in the occipital area, the informative electrodes of the phoneme models (PH and PHENV), identified from the occlusion sensitivity maps, were located in the language-processing area. Of the models tested, the PHENV model obtained the best speech intelligibility prediction accuracy. This model may enable more comfortable, objective clinical prediction of speech intelligibility.
Affiliation(s)
- Youngmin Na
- Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea
- Hyosung Joo
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Le Thi Trang
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Luong Do Anh Quan
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea
- Jihwan Woo
- Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea
- Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea

17. Muncke J, Kuruvila I, Hoppe U. Prediction of Speech Intelligibility by Means of EEG Responses to Sentences in Noise. Front Neurosci 2022; 16:876421. [PMID: 35720724 PMCID: PMC9198593 DOI: 10.3389/fnins.2022.876421]
Abstract
Objective Understanding speech in noisy conditions is challenging even for people with mild hearing loss, and intelligibility for an individual person is usually evaluated using several subjective test methods. In the last few years, a method has been developed to determine a temporal response function (TRF) between the speech envelope and simultaneous electroencephalographic (EEG) measurements. Using this TRF, it is possible to predict the EEG signal for any speech signal. Recent studies have suggested that the accuracy of this prediction varies with the level of noise added to the speech signal and can objectively predict individual speech intelligibility. Here we assess the variations of the TRF itself when it is calculated from measurements at different signal-to-noise ratios and apply these variations to predict speech intelligibility. Methods For 18 normal-hearing subjects, the individual threshold of 50% speech intelligibility was determined using a speech-in-noise test. Additionally, subjects listened passively to speech material from the speech-in-noise test at different signal-to-noise ratios close to the individual threshold of 50% speech intelligibility while EEG was recorded. Afterwards, the shapes of the TRFs for each signal-to-noise ratio and subject were compared with the derived intelligibility. Results The strongest effect of variations in stimulus signal-to-noise ratio on the TRF shape occurred close to 100 ms after stimulus presentation, and was located in the left central scalp region. The investigated variations in TRF morphology showed a strong correlation with speech intelligibility, and we were able to predict the individual threshold of 50% speech intelligibility with a mean deviation of less than 1.5 dB. Conclusion The intelligibility of speech in noise can be predicted by analyzing the shape of TRFs derived at different stimulus signal-to-noise ratios. Because TRFs are interpretable, in a manner similar to auditory evoked potentials, this method offers new options for clinical diagnostics.
Affiliation(s)
- Jan Muncke
- Department of Audiology, ENT-Clinic, University Hospital Erlangen, Erlangen, Germany
- Ivine Kuruvila
- Department of Audiology, ENT-Clinic, University Hospital Erlangen, Erlangen, Germany; WS Audiology, Erlangen, Germany
- Ulrich Hoppe
- Department of Audiology, ENT-Clinic, University Hospital Erlangen, Erlangen, Germany

18. Nogueira W, Dolhopiatenko H. Predicting speech intelligibility from a selective attention decoding paradigm in cochlear implant users. J Neural Eng 2022; 19. [PMID: 35234663 DOI: 10.1088/1741-2552/ac599f]
Abstract
OBJECTIVES Electroencephalography (EEG) can be used to decode selective attention in cochlear implant (CI) users. This work investigates whether selective attention to an attended speech source in the presence of a concurrent speech source can predict speech understanding in CI users. APPROACH CI users were instructed to attend to one of two speech streams while EEG was recorded. Both speech streams were presented to the same ear and at different signal-to-interference ratios (SIRs). Speech envelope reconstruction of the to-be-attended speech from EEG was obtained by training decoders using regularized least squares. The correlation coefficient between the reconstructed stream and the attended (ρ_A(SIR)) or the unattended (ρ_U(SIR)) speech stream at each SIR was computed. Additionally, we computed the difference correlation coefficient at the same SIR (ρ_Diff = ρ_A(SIR) − ρ_U(SIR)) and at the opposite SIR (ρ_DiffOpp = ρ_A(SIR) − ρ_U(−SIR)). ρ_Diff compares the attended and unattended correlation coefficients for speech sources presented at different presentation levels depending on SIR. In contrast, ρ_DiffOpp compares the attended and unattended correlation coefficients for speech sources presented at the same presentation level irrespective of SIR. MAIN RESULTS Selective attention decoding in CI users is possible even if both speech streams are presented monaurally. A significant effect of SIR on ρ_A(SIR), ρ_Diff and ρ_DiffOpp, but not on ρ_U(SIR), was observed. Finally, the results show a significant correlation between speech understanding performance and ρ_A(SIR), as well as with ρ_U(SIR), across subjects. Moreover, ρ_DiffOpp, which is less affected by the CI artifact, also demonstrated a significant correlation with speech understanding. SIGNIFICANCE Selective attention decoding in CI users is possible; however, care needs to be taken with the CI artifact and the speech material used to train the decoders. These results are important for the future development of objective speech understanding measures for CI users.
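A minimal sketch of the correlation measures defined above, with synthetic vectors standing in for the decoder-reconstructed envelope and the two speech streams; the mixing weights are invented. ρ_DiffOpp would additionally pair the attended correlation at one SIR with the unattended correlation at the mirrored SIR, which needs data from two conditions and is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
attended = rng.standard_normal(n)      # envelope of the attended stream
unattended = rng.standard_normal(n)    # envelope of the unattended stream
reconstructed = 0.4 * attended + 0.1 * unattended + rng.standard_normal(n)

rho_a = np.corrcoef(reconstructed, attended)[0, 1]
rho_u = np.corrcoef(reconstructed, unattended)[0, 1]
rho_diff = rho_a - rho_u               # the same-SIR contrast from the abstract
print(f"rho_A = {rho_a:.2f}, rho_U = {rho_u:.2f}, rho_Diff = {rho_diff:.2f}")
```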
Affiliation(s)
- Waldo Nogueira
- Department of Otolaryngology and Cluster of Excellence "Hearing4all", Hannover Medical School, Karl-Wiechert Allee 3, Hannover, Niedersachsen, 30625, Germany
- Hanna Dolhopiatenko
- Department of Otolaryngology and Cluster of Excellence "Hearing4all", Hannover Medical School, Karl-Wiechert Allee 3, Hannover, Niedersachsen, 30625, Germany

19. Wang L, Wang Y, Liu Z, Wu EX, Chen F. A Speech-Level-Based Segmented Model to Decode the Dynamic Auditory Attention States in the Competing Speaker Scenes. Front Neurosci 2022; 15:760611. [PMID: 35221885 PMCID: PMC8866945 DOI: 10.3389/fnins.2021.760611]
Abstract
In competing-speaker environments, human listeners need to focus or switch their auditory attention according to their dynamic intentions. Reliable cortical tracking of the speech envelope is an effective feature for decoding the target speech from neural signals. Moreover, previous studies revealed that root mean square (RMS)-level-based speech segmentation makes a great contribution to target speech perception under the modulation of sustained auditory attention. This study further investigated the effect of RMS-level-based speech segmentation on auditory attention decoding (AAD) performance with both sustained and switched attention in competing-speaker auditory scenes. Objective biomarkers derived from cortical activities were also developed to index the dynamic auditory attention states. In the current study, subjects were asked to concentrate on, or switch their attention between, two competing speaker streams. The neural responses to the higher- and lower-RMS-level speech segments were analyzed via the linear temporal response function (TRF) before and after attention switched from one speaker stream to the other. Furthermore, the AAD performance decoded by a unified TRF decoding model was compared to that decoded by a speech-RMS-level-based segmented decoding model under dynamic changes of the auditory attention states. The results showed that the weight of the typical TRF component at an approximately 100-ms time lag was sensitive to the switching of auditory attention. Compared to the unified AAD model, the segmented AAD model improved attention decoding performance under both sustained and switched auditory attention modulations over a wide range of signal-to-masker ratios (SMRs). In competing-speaker scenes, the TRF weight and AAD accuracy could be used as effective indicators to detect changes of auditory attention. In addition, over a wide range of SMRs (i.e., from 6 to -6 dB in this study), the segmented AAD model showed robust decoding performance even with short decision window lengths, suggesting that this speech-RMS-level-based model has the potential to decode dynamic attention states in realistic auditory scenarios.
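A small sketch of RMS-level-based segmentation as described above: frame the signal, compute per-frame RMS, and split frames into higher- and lower-RMS classes. The 20-ms frame and mean-RMS split are assumptions; the study's exact segmentation rule may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 16000
t = np.arange(5 * fs) / fs
speech = rng.standard_normal(5 * fs) * (1 + np.sin(2 * np.pi * 0.5 * t))  # level-varying

frame = int(0.02 * fs)                                   # 20-ms frames
frames = speech[: len(speech) // frame * frame].reshape(-1, frame)
rms = np.sqrt((frames ** 2).mean(axis=1))                # per-frame RMS level
higher = rms > rms.mean()                                # higher- vs lower-RMS labels
print(f"higher-RMS frames: {higher.sum()} / {len(rms)}")
```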
Affiliation(s)
- Lei Wang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Yihan Wang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Zhixing Liu
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Ed X. Wu
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Fei Chen (correspondence)
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China

20
Li J, Hong B, Nolte G, Engel AK, Zhang D. Preparatory delta phase response is correlated with naturalistic speech comprehension performance. Cogn Neurodyn 2021; 16:337-352. [PMID: 35401861 PMCID: PMC8934811 DOI: 10.1007/s11571-021-09711-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 07/09/2021] [Accepted: 08/12/2021] [Indexed: 01/07/2023] Open
Abstract
While human speech comprehension is thought to be an active process that involves top-down predictions, it remains unclear how predictive information is used to prepare for the processing of upcoming speech information. We aimed to identify the neural signatures of the preparatory processing of upcoming speech. Participants selectively attended to one of two competing naturalistic, narrative speech streams, and a temporal response function (TRF) method was applied to derive event-related-like neural responses from electroencephalographic data. The phase responses to the attended speech in the delta band (1-4 Hz) were correlated with the comprehension performance of individual participants, at a latency of −200 to 0 ms relative to the onset of speech amplitude envelope fluctuations over the fronto-central and left-lateralized parietal electrodes. The phase responses to the attended speech in the alpha band also correlated with comprehension performance, but at a latency of 650 to 980 ms post-onset over the fronto-central electrodes. Distinct neural signatures were found for attentional modulation, taking the form of TRF-based amplitude responses at a latency of 240 to 320 ms post-onset over the left-lateralized fronto-central and occipital electrodes. Our findings reveal how the brain prepares to process upcoming speech in a continuous, naturalistic speech context.
Affiliation(s)
- Jiawei Li
- Department of Psychology, School of Social Sciences, Tsinghua University, Room 334, Mingzhai Building, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
- Bo Hong
- Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China
- Guido Nolte
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Andreas K. Engel
- Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Dan Zhang
- Department of Psychology, School of Social Sciences, Tsinghua University, Room 334, Mingzhai Building, Beijing, China
- Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China

21
Sokoliuk R, Degano G, Melloni L, Noppeney U, Cruse D. The Influence of Auditory Attention on Rhythmic Speech Tracking: Implications for Studies of Unresponsive Patients. Front Hum Neurosci 2021; 15:702768. [PMID: 34456697 PMCID: PMC8385206 DOI: 10.3389/fnhum.2021.702768] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2021] [Accepted: 07/21/2021] [Indexed: 11/13/2022] Open
Abstract
Language comprehension relies on integrating words into progressively more complex structures, like phrases and sentences. This hierarchical structure-building is reflected in rhythmic neural activity across multiple timescales in E/MEG in healthy, awake participants. However, recent studies have shown evidence for this “cortical tracking” of higher-level linguistic structures also in a proportion of unresponsive patients. What does this tell us about these patients’ residual levels of cognition and consciousness? Must the listener direct their attention toward higher-level speech structures to exhibit cortical tracking, and would selective attention across levels of the hierarchy influence the expression of these rhythms? We investigated these questions in an EEG study of 72 healthy human volunteers listening to streams of monosyllabic isochronous English words that were either unrelated (scrambled condition) or composed of four-word sequences building meaningful sentences (sentential condition). Importantly, there were no physical cues between four-word sentences. Rather, boundaries were marked by syntactic structure and thematic role assignment. Participants were divided into three attention groups: from passive listening (passive group) to attending to individual words (word group) or sentences (sentence group). The passive and word groups were initially naïve to the sentential stimulus structure, while the sentence group was not. We found significant tracking at the word and sentence rates across all three groups, with sentence tracking linked to the left middle temporal gyrus and right superior temporal gyrus. Goal-directed attention to words did not enhance word-rate tracking, suggesting that word tracking here reflects largely automatic mechanisms, as was previously shown for tracking at the syllable rate. Importantly, goal-directed attention to sentences relative to words significantly increased sentence-rate tracking over the left inferior frontal gyrus. This attentional modulation of rhythmic EEG activity at the sentential rate highlights the role of attention in integrating individual words into complex linguistic structures. Nevertheless, given the presence of high-level cortical tracking under conditions of lower attentional effort, our findings underline the suitability of the paradigm for clinical application in patients after brain injury. The neural dissociation between passive tracking of sentences and directed attention to sentences provides a potential means to further characterise the cognitive state of each unresponsive patient.
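Tracking at fixed word and sentence rates is typically quantified as a spectral peak at the target frequency relative to neighboring bins. The sketch below assumes a 2 Hz word rate (and hence a 0.5 Hz rate for four-word sentences) purely for illustration; the study's actual presentation rates are not stated in this abstract.

```python
import numpy as np

def tagged_snr(eeg, fs, f_target, n_neighbors=4):
    """Spectral amplitude at f_target divided by the mean of neighboring bins."""
    spec = np.abs(np.fft.rfft(eeg, axis=-1)).mean(axis=0)  # average over trials
    freqs = np.fft.rfftfreq(eeg.shape[-1], 1 / fs)
    k = np.argmin(np.abs(freqs - f_target))
    nb = list(range(k - n_neighbors, k)) + list(range(k + 1, k + 1 + n_neighbors))
    return spec[k] / spec[nb].mean()

rng = np.random.default_rng(1)
fs, dur, n_trials = 128, 16, 30
t = np.arange(fs * dur) / fs
word_rate, sent_rate = 2.0, 0.5                  # assumed rates (4-word sentences)
eeg = rng.standard_normal((n_trials, len(t)))
eeg += 0.3 * np.sin(2 * np.pi * word_rate * t)   # simulated word-rate tracking
eeg += 0.15 * np.sin(2 * np.pi * sent_rate * t)  # weaker sentence-rate tracking
for f in (word_rate, sent_rate):
    print(f"{f} Hz tracking SNR: {tagged_snr(eeg, fs, f):.2f}")
```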
Affiliation(s)
- Rodika Sokoliuk
- School of Psychology, University of Birmingham, Birmingham, United Kingdom; Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom
- Giulio Degano
- School of Psychology, University of Birmingham, Birmingham, United Kingdom; Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom; Brain and Language Lab, Department of Psychology, Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland
- Lucia Melloni
- Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany; Department of Neurology, New York University, New York City, NY, United States
- Uta Noppeney
- Donders Centre for Cognitive Neuroimaging, Nijmegen, Netherlands; Department of Biophysics, Radboud University, Nijmegen, Netherlands
- Damian Cruse
- School of Psychology, University of Birmingham, Birmingham, United Kingdom; Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom

22
Chang A, Bedoin N, Canette LH, Nozaradan S, Thompson D, Corneyllie A, Tillmann B, Trainor LJ. Atypical beta power fluctuation while listening to an isochronous sequence in dyslexia. Clin Neurophysiol 2021; 132:2384-2390. [PMID: 34454265 DOI: 10.1016/j.clinph.2021.05.037] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2020] [Revised: 04/22/2021] [Accepted: 05/31/2021] [Indexed: 11/29/2022]
Abstract
OBJECTIVE Developmental dyslexia is a reading disorder that features difficulties in perceiving and tracking rhythmic regularities in auditory streams, such as speech and music. Studies on typical healthy participants have shown that power fluctuations of neural oscillations in the beta band (15-25 Hz) reflect an essential mechanism for tracking rhythm or entrainment and relate to predictive timing and attentional processes. Here we investigated whether adults with dyslexia have atypical beta power fluctuation. METHODS The electroencephalographic activities of individuals with dyslexia (n = 13) and typical control participants (n = 13) were measured while they passively listened to an isochronous tone sequence (2 Hz presentation rate). The time-frequency neural activities generated from the auditory cortices were analyzed. RESULTS The phase of the beta power fluctuation at the 2 Hz stimulus presentation rate differed and appeared opposite between individuals with dyslexia and controls. CONCLUSIONS Atypical beta power fluctuation might reflect deficits in perceiving and tracking auditory rhythm in dyslexia. SIGNIFICANCE These findings extend our understanding of atypical neural activities for tracking rhythm in dyslexia and could inspire novel methods to objectively measure the benefits of training and to predict the potential benefit of auditory rhythmic rehabilitation programs on an individual basis.
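The dependent measure here, the phase of slow beta-power fluctuations at the 2 Hz stimulus rate, can be estimated by band-passing the EEG, taking the power envelope, and reading off the Fourier phase at 2 Hz. A self-contained sketch with simulated data follows; the filter order and band edges are illustrative choices, not the paper's exact analysis.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def beta_fluctuation_phase(eeg, fs, rate_hz=2.0, band=(15, 25)):
    """Phase of the beta power envelope at the stimulus presentation rate."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    beta_power = np.abs(hilbert(filtfilt(b, a, eeg))) ** 2
    spec = np.fft.rfft(beta_power - beta_power.mean())
    freqs = np.fft.rfftfreq(len(beta_power), 1 / fs)
    return np.angle(spec[np.argmin(np.abs(freqs - rate_hz))])

rng = np.random.default_rng(2)
fs, dur = 256, 20
t = np.arange(fs * dur) / fs
# Simulate a 20 Hz beta oscillation whose amplitude waxes and wanes at 2 Hz.
carrier = np.sin(2 * np.pi * 20 * t)
eeg = (1 + 0.5 * np.sin(2 * np.pi * 2 * t)) * carrier + rng.standard_normal(len(t))
print(f"beta power phase at 2 Hz: {beta_fluctuation_phase(eeg, fs):.2f} rad")
```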
Affiliation(s)
- Andrew Chang
- Department of Psychology, Neuroscience and Behaviour, McMaster University, Hamilton, ON L8S 4K1, Canada
- Nathalie Bedoin
- CNRS, UMR5292, INSERM, U1028, Lyon Neuroscience Research Center, IMPACT Team, Bron, France; University Lyon 1, Villeurbanne, France; University Lyon 2, Bron, France
- Laure-Helene Canette
- University Lyon 1, Villeurbanne, France; CNRS, UMR5292, INSERM, U1028, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics Team, Bron, France
- Sylvie Nozaradan
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Locked Bag 1797, Penrith, NSW 2751, Australia; Institute of Neuroscience (IONS), Université catholique de Louvain (UCL), Avenue Mounier 53, Woluwe-Saint-Lambert, 1200, Belgium
- Dave Thompson
- Department of Psychology, Neuroscience and Behaviour, McMaster University, Hamilton, ON L8S 4K1, Canada; McMaster Institute for Music and the Mind, McMaster University, Hamilton, ON L8S 4K1, Canada; Rotman Research Institute, Baycrest Hospital, Toronto, ON M6A 2E1, Canada
- Alexandra Corneyllie
- University Lyon 1, Villeurbanne, France; CNRS, UMR5292, INSERM, U1028, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics Team, Bron, France
- Barbara Tillmann
- University Lyon 1, Villeurbanne, France; CNRS, UMR5292, INSERM, U1028, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics Team, Bron, France
- Laurel J Trainor
- Department of Psychology, Neuroscience and Behaviour, McMaster University, Hamilton, ON L8S 4K1, Canada; McMaster Institute for Music and the Mind, McMaster University, Hamilton, ON L8S 4K1, Canada; Rotman Research Institute, Baycrest Hospital, Toronto, ON M6A 2E1, Canada

23
Bonacci LM, Bressler S, Shinn-Cunningham BG. Nonspatial Features Reduce the Reliance on Sustained Spatial Auditory Attention. Ear Hear 2021; 41:1635-1647. [PMID: 33136638 PMCID: PMC9831360 DOI: 10.1097/aud.0000000000000879] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]
Abstract
OBJECTIVE Top-down spatial attention is effective at selecting a target sound from a mixture. However, nonspatial features often distinguish sources in addition to location. This study explores whether redundant nonspatial features are used to maintain selective auditory attention for a spatially defined target. DESIGN We recorded electroencephalography while subjects focused attention on one of three simultaneous melodies. In one experiment, subjects (n = 17) were given an auditory cue indicating both the location and pitch of the target melody. In a second experiment (n = 17 subjects), the cue only indicated target location, and we compared two conditions: one in which the pitch separation of competing melodies was large, and one in which this separation was small. RESULTS In both experiments, responses evoked by onsets of events in sound streams were modulated by attention, and we found no significant difference in this modulation between small and large pitch separation conditions. Therefore, the evoked response reflected that target stimuli were the focus of attention, and distractors were suppressed successfully for all experimental conditions. In all cases, parietal alpha was lateralized following the cue, but before melody onset, indicating that subjects initially focused attention in space. During the stimulus presentation, this lateralization disappeared when pitch cues were strong but remained significant when pitch cues were weak, suggesting that strong pitch cues reduced reliance on sustained spatial attention. CONCLUSIONS These results demonstrate that once a well-defined target stream at a known location is selected, top-down spatial attention plays a weak role in filtering out a segregated competing stream.
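Parietal alpha lateralization of the kind reported here is commonly summarized by a normalized left/right power index. The sketch below computes such an index from two simulated parietal channels; the band limits, Welch parameters, and sign convention are assumptions for illustration, not the paper's exact analysis.

```python
import numpy as np
from scipy.signal import welch

def alpha_lateralization(eeg_left, eeg_right, fs, band=(8, 14)):
    """Normalized right-minus-left alpha power; the sign indicates the
    lateralization direction (conventions vary across studies)."""
    def band_power(x):
        f, p = welch(x, fs, nperseg=fs * 2)
        return p[(f >= band[0]) & (f <= band[1])].mean()
    pl, pr = band_power(eeg_left), band_power(eeg_right)
    return (pr - pl) / (pr + pl)

rng = np.random.default_rng(3)
fs, dur = 256, 10
t = np.arange(fs * dur) / fs
# Simulated attend-left trial with stronger alpha over left parietal sites.
left = 2.0 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(len(t))
right = 0.5 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(len(t))
print(f"lateralization index: {alpha_lateralization(left, right, fs):+.2f}")
```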
Affiliation(s)
- Lia M. Bonacci
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Scott Bressler
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Barbara G. Shinn-Cunningham
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

24
Verschueren E, Vanthornhout J, Francart T. The Effect of Stimulus Choice on an EEG-Based Objective Measure of Speech Intelligibility. Ear Hear 2021; 41:1586-1597. [PMID: 33136634 DOI: 10.1097/aud.0000000000000875] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
OBJECTIVES Recently, an objective measure of speech intelligibility (SI), based on brain responses derived from the electroencephalogram (EEG), has been developed using isolated Matrix sentences as a stimulus. We investigated whether this objective measure of SI can also be used with natural speech as a stimulus, as this would be beneficial for clinical applications. DESIGN We recorded the EEG in 19 normal-hearing participants while they listened to two types of stimuli: Matrix sentences and a natural story. Each stimulus was presented at different levels of SI by adding speech-weighted noise. SI was assessed in two ways for both stimuli: (1) behaviorally and (2) objectively, by reconstructing the speech envelope from the EEG using a linear decoder and correlating it with the acoustic envelope. We also calculated temporal response functions (TRFs) to investigate the temporal characteristics of the brain responses in the EEG channels covering different brain areas. RESULTS For both stimulus types, the correlation between the speech envelope and the reconstructed envelope increased with increasing SI. In addition, correlations were higher for the natural story than for the Matrix sentences. Similar to the linear decoder analysis, TRF amplitudes increased with increasing SI for both stimuli. Remarkably, although SI remained unchanged between the no-noise and +2.5 dB SNR conditions, neural speech processing was affected by the addition of this small amount of noise: TRF amplitudes across the entire scalp decreased between 0 and 150 ms, while amplitudes between 150 and 200 ms increased in the presence of noise. TRF latency changes as a function of SI appeared to be stimulus specific: the latency of the prominent negative peak in the early responses (50 to 300 ms) increased with increasing SI for the Matrix sentences, but remained unchanged for the natural story. CONCLUSIONS These results show (1) the feasibility of natural speech as a stimulus for the objective measurement of SI; (2) that neural tracking of speech is enhanced using a natural story compared to Matrix sentences; and (3) that noise and the stimulus type can change the temporal characteristics of the brain responses. These results might reflect the integration of incoming acoustic features and top-down information, suggesting that the choice of stimulus has to be considered based on the intended purpose of the measurement.
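Envelope reconstruction measures like the one above presuppose an acoustic envelope. One common extraction recipe (among several; some studies use gammatone subbands or a power law instead) is the Hilbert magnitude, low-passed and resampled to the EEG rate, as sketched here with assumed cutoff and rates.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, resample_poly

def speech_envelope(audio, fs_audio, fs_eeg=64, cutoff=8.0):
    """Hilbert-magnitude envelope, low-passed and resampled to the EEG rate.

    One common recipe; studies differ (e.g., gammatone subbands, power law).
    """
    env = np.abs(hilbert(audio))
    b, a = butter(4, cutoff / (fs_audio / 2), btype="low")
    env = filtfilt(b, a, env)
    return resample_poly(env, fs_eeg, fs_audio)

fs_audio = 16000
t = np.arange(fs_audio * 2) / fs_audio
# Amplitude-modulated tone as a stand-in for recorded speech.
audio = np.sin(2 * np.pi * 220 * t) * (0.3 + np.abs(np.sin(2 * np.pi * 4 * t)))
env = speech_envelope(audio, fs_audio)
print(env.shape)  # 2 s at 64 Hz -> (128,)
```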
Affiliation(s)
- Eline Verschueren
- Research Group Experimental Oto-rhino-laryngology (ExpORL), Department of Neurosciences, KU Leuven-University of Leuven, Leuven, Belgium

25
Rosenkranz M, Holtze B, Jaeger M, Debener S. EEG-Based Intersubject Correlations Reflect Selective Attention in a Competing Speaker Scenario. Front Neurosci 2021; 15:685774. [PMID: 34194296 PMCID: PMC8236636 DOI: 10.3389/fnins.2021.685774] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Accepted: 05/18/2021] [Indexed: 11/13/2022] Open
Abstract
Several solutions have been proposed to study the relationship between ongoing brain activity and natural sensory stimuli, such as running speech. Computing the intersubject correlation (ISC) has been proposed as one possible approach. Previous evidence suggests that ISCs between the participants' electroencephalogram (EEG) may be modulated by attention. The current study addressed this question in a competing-speaker paradigm, where participants (N = 41) had to attend to one of two concurrently presented speech streams. ISCs between participants' EEG were higher for participants attending to the same story compared to participants attending to different stories. Furthermore, we found that ISCs between individual and group data predicted whether an individual attended to the left or right speech stream. Interestingly, the magnitude of the shared neural response with others attending to the same story was related to the individual neural representation of the attended and ignored speech envelope. Overall, our findings indicate that ISC differences reflect the magnitude of selective attentional engagement to speech.
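A simplified stand-in for the ISC computation is shown below: each subject's EEG is correlated with the average of all other subjects, channel by channel. The study itself uses a component-based method, so treat this leave-one-out channel-wise version as a conceptual sketch only.

```python
import numpy as np

def channelwise_isc(eeg):
    """Leave-one-out ISC per subject: correlate each subject's signal with
    the average of all other subjects, then average over channels.

    eeg: (n_subjects, n_channels, n_times). A simplified stand-in for the
    correlated-component approach used in the literature.
    """
    n_subj = eeg.shape[0]
    iscs = []
    for s in range(n_subj):
        others = eeg[np.arange(n_subj) != s].mean(axis=0)
        r = [np.corrcoef(eeg[s, c], others[c])[0, 1] for c in range(eeg.shape[1])]
        iscs.append(np.mean(r))
    return np.array(iscs)

rng = np.random.default_rng(4)
shared = rng.standard_normal((8, 1000))             # stimulus-driven component
eeg = 0.4 * shared[None] + rng.standard_normal((12, 8, 1000))
print(channelwise_isc(eeg).round(2))
```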
Affiliation(s)
- Marc Rosenkranz
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Björn Holtze
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Manuela Jaeger
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Division Hearing, Fraunhofer Institute for Digital Media Technology IDMT, Speech and Audio Technology, Oldenburg, Germany
- Stefan Debener
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany; Research Center for Neurosensory Science, University of Oldenburg, Oldenburg, Germany

26
Mesik J, Ray L, Wojtczak M. Effects of Age on Cortical Tracking of Word-Level Features of Continuous Competing Speech. Front Neurosci 2021; 15:635126. [PMID: 33867920 PMCID: PMC8047075 DOI: 10.3389/fnins.2021.635126] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2020] [Accepted: 03/12/2021] [Indexed: 01/17/2023] Open
Abstract
Speech-in-noise comprehension difficulties are common among the elderly population, yet traditional objective measures of speech perception are largely insensitive to this deficit, particularly in the absence of clinical hearing loss. In recent years, a growing body of research in young normal-hearing adults has demonstrated that high-level features related to speech semantics and lexical predictability elicit strong centro-parietal negativity in the EEG signal around 400 ms following the word onset. Here we investigate effects of age on cortical tracking of these word-level features within a two-talker speech mixture, and their relationship with self-reported difficulties with speech-in-noise understanding. While undergoing EEG recordings, younger and older adult participants listened to a continuous narrative story in the presence of a distractor story. We then utilized forward encoding models to estimate cortical tracking of four speech features: (1) word onsets, (2) "semantic" dissimilarity of each word relative to the preceding context, (3) lexical surprisal for each word, and (4) overall word audibility. Our results revealed robust tracking of all features for attended speech, with surprisal and word audibility showing significantly stronger contributions to neural activity than dissimilarity. Additionally, older adults exhibited significantly stronger tracking of word-level features than younger adults, especially over frontal electrode sites, potentially reflecting increased listening effort. Finally, neuro-behavioral analyses revealed trends of a negative relationship between subjective speech-in-noise perception difficulties and the model goodness-of-fit for attended speech, as well as a positive relationship between task performance and the goodness-of-fit, indicating behavioral relevance of these measures. Together, our results demonstrate the utility of modeling cortical responses to multi-talker speech using complex, word-level features and the potential for their use to study changes in speech processing due to aging and hearing loss.
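Forward encoding models of this kind regress EEG onto a time-lagged matrix of stimulus features. The sketch below builds such a design matrix for four hypothetical word-level regressors and fits a ridge-regularized temporal response function; the feature names, lag range, and regularization constant are illustrative assumptions.

```python
import numpy as np

def design_matrix(features, lags):
    """Time-lagged design matrix for a forward (encoding) model.

    features: (n_times, n_features); lags: iterable of sample lags >= 0.
    """
    n_t, n_f = features.shape
    X = np.zeros((n_t, n_f * len(lags)))
    for i, lag in enumerate(lags):
        X[lag:, i * n_f:(i + 1) * n_f] = features[:n_t - lag]
    return X

def fit_trf(features, eeg_chan, lags, lam=1.0):
    X = design_matrix(features, lags)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg_chan)
    return w.reshape(len(lags), -1)  # (n_lags, n_features)

rng = np.random.default_rng(5)
fs, n_t = 64, 64 * 120
# Four word-level regressors: onset, dissimilarity, surprisal, audibility.
feats = rng.standard_normal((n_t, 4)) * (rng.random((n_t, 1)) < 0.05)
eeg = np.convolve(feats[:, 2], np.hanning(32), mode="same")  # surprisal-driven
eeg += rng.standard_normal(n_t)
trf = fit_trf(feats, eeg, lags=range(48))
names = ["onset", "dissimilarity", "surprisal", "audibility"]
print("strongest regressor:", names[np.abs(trf).sum(axis=0).argmax()])
```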
Affiliation(s)
- Juraj Mesik
- Department of Psychology, University of Minnesota, Minneapolis, MN, United States

27
Holtze B, Jaeger M, Debener S, Adiloğlu K, Mirkovic B. Are They Calling My Name? Attention Capture Is Reflected in the Neural Tracking of Attended and Ignored Speech. Front Neurosci 2021; 15:643705. [PMID: 33828451 PMCID: PMC8019946 DOI: 10.3389/fnins.2021.643705] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2020] [Accepted: 02/19/2021] [Indexed: 11/15/2022] Open
Abstract
Difficulties in selectively attending to one among several speakers have mainly been associated with the distraction caused by ignored speech. Thus, in the current study, we investigated the neural processing of ignored speech in a two-competing-speaker paradigm. For this, we recorded the participants’ brain activity using electroencephalography (EEG) to track the neural representation of the attended and ignored speech envelopes. To provoke distraction, we occasionally embedded the participant’s first name in the ignored speech stream. Retrospective reports as well as the presence of a P3 component in response to the name indicate that participants noticed the occurrence of their name. As predicted, the neural representation of the ignored speech envelope increased after the name was presented therein, suggesting that the name had attracted the participant’s attention. Interestingly, in contrast to our hypothesis, the neural tracking of the attended speech envelope also increased after the name occurrence. On this account, we conclude that the name might have distracted the participants only briefly, if at all, and instead alerted them to refocus on their actual task. These observations remained robust even when the sound intensity of the ignored speech stream, and thus of the name, was attenuated.
Affiliation(s)
- Björn Holtze
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Manuela Jaeger
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Fraunhofer Institute for Digital Media Technology IDMT, Division Hearing, Speech and Audio Technology, Oldenburg, Germany
- Stefan Debener
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Research Center for Neurosensory Science, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany
- Kamil Adiloğlu
- Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany; HörTech gGmbH, Oldenburg, Germany
- Bojana Mirkovic
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany

28
Luo C, Ding N. Cortical encoding of acoustic and linguistic rhythms in spoken narratives. eLife 2020; 9:60433. [PMID: 33345775 PMCID: PMC7775109 DOI: 10.7554/elife.60433] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2020] [Accepted: 12/20/2020] [Indexed: 11/13/2022] Open
Abstract
Speech contains rich acoustic and linguistic information. Using highly controlled speech materials, previous studies have demonstrated that cortical activity is synchronous to the rhythms of perceived linguistic units, for example, words and phrases, on top of basic acoustic features, for example, the speech envelope. When listening to natural speech, it remains unclear, however, how cortical activity jointly encodes acoustic and linguistic information. Here we investigate the neural encoding of words using electroencephalography and observe neural activity synchronous to multi-syllabic words when participants naturally listen to narratives. An amplitude modulation (AM) cue for word rhythm enhances the word-level response, but the effect is only observed during passive listening. Furthermore, words and the AM cue are encoded by spatially separable neural responses that are differentially modulated by attention. These results suggest that bottom-up acoustic cues and top-down linguistic knowledge separately contribute to cortical encoding of linguistic units in spoken narratives.
Affiliation(s)
- Cheng Luo
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China; Research Center for Advanced Artificial Intelligence Theory, Zhejiang Lab, Hangzhou, China

29
Wang L, Wu EX, Chen F. Robust EEG-Based Decoding of Auditory Attention With High-RMS-Level Speech Segments in Noisy Conditions. Front Hum Neurosci 2020; 14:557534. [PMID: 33132874 PMCID: PMC7576187 DOI: 10.3389/fnhum.2020.557534] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2020] [Accepted: 09/09/2020] [Indexed: 11/25/2022] Open
Abstract
With auditory attentional modulation, the attended speech stream can be detected robustly from electroencephalographic (EEG) data, even in adverse auditory scenarios. Speech segmentation based on the relative root-mean-square (RMS) intensity can be used to estimate segmental contributions to perception in noisy conditions. High-RMS-level segments contain crucial information for speech perception. Hence, this study aimed to investigate the effect of high-RMS-level speech segments on auditory attention decoding performance under various signal-to-noise ratio (SNR) conditions. Scalp EEG signals were recorded while subjects listened to the attended speech stream within mixed speech narrated concurrently by two Mandarin speakers. The temporal response function was used to identify the attended speech from EEG responses tracking the temporal envelope of intact speech or of the high-RMS-level speech segments alone. Auditory decoding performance was then analyzed under various SNR conditions by comparing EEG correlations to the attended and ignored speech streams. The accuracy of auditory attention decoding based on the temporal envelope of the high-RMS-level speech segments was not inferior to that based on the temporal envelope of intact speech. Cortical activity correlated more strongly with attended than with ignored speech under different SNR conditions. These results suggest that EEG recordings corresponding to high-RMS-level speech segments carry crucial information for the identification and tracking of attended speech in the presence of background noise. This study also showed that, with the modulation of auditory attention, attended speech can be decoded more robustly from neural activity than from behavioral measures across a wide range of SNRs.
Affiliation(s)
- Lei Wang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, China
- Ed X Wu
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, China
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China

30
Alickovic E, Lunner T, Wendt D, Fiedler L, Hietkamp R, Ng EHN, Graversen C. Neural Representation Enhanced for Speech and Reduced for Background Noise With a Hearing Aid Noise Reduction Scheme During a Selective Attention Task. Front Neurosci 2020; 14:846. [PMID: 33071722 PMCID: PMC7533612 DOI: 10.3389/fnins.2020.00846] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2020] [Accepted: 07/20/2020] [Indexed: 12/23/2022] Open
Abstract
Objectives Selectively attending to a target talker while ignoring multiple interferers (competing talkers and background noise) is more difficult for hearing-impaired (HI) individuals compared to normal-hearing (NH) listeners. Such tasks also become more difficult as background noise levels increase. To overcome these difficulties, hearing aids (HAs) offer noise reduction (NR) schemes. The objective of this study was to investigate the effect of NR processing (inactive, where the NR feature was switched off, vs. active, where the NR feature was switched on) on the neural representation of speech envelopes across two different background noise levels [+3 dB signal-to-noise ratio (SNR) and +8 dB SNR] by using a stimulus reconstruction (SR) method. Design To explore how NR processing supports the listeners’ selective auditory attention, we recruited 22 HI participants fitted with HAs. To investigate the interplay between NR schemes, background noise, and neural representation of the speech envelopes, we used electroencephalography (EEG). The participants were instructed to listen to a target talker in front while ignoring a competing talker in front in the presence of multi-talker background babble noise. Results The results show that the neural representation of the attended speech envelope was enhanced by the active NR scheme for both background noise levels. The neural representation of the attended speech envelope at the lower (+3 dB) SNR was shifted by approximately 5 dB toward that at the higher (+8 dB) SNR when the NR scheme was turned on. The neural representation of the ignored speech envelope was modulated by the NR scheme and was mostly enhanced in the conditions with more background noise. The neural representation of the background noise was modulated (i.e., reduced) by the NR scheme and was significantly reduced in the conditions with more background noise. The neural representation of the net sum of the ignored acoustic scene (ignored talker and background babble) was not modulated by the NR scheme but was significantly reduced in the conditions with a reduced level of background noise. Taken together, we showed that the active NR scheme enhanced the neural representation of both the attended and the ignored speakers and reduced the neural representation of background noise, while the net sum of the ignored acoustic scene was not enhanced. Conclusion Altogether, our results support the hypothesis that the NR schemes in HAs serve to enhance the neural representation of speech and reduce the neural representation of background noise during a selective attention task. We contend that these results provide a neural index that could be useful for assessing the effects of HAs on auditory and cognitive processing in HI populations.
Affiliation(s)
- Emina Alickovic
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; Department of Electrical Engineering, Linkoping University, Linköping, Sweden
- Thomas Lunner
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; Department of Electrical Engineering, Linkoping University, Linköping, Sweden; Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Department of Behavioral Sciences and Learning, Linkoping University, Linköping, Sweden
- Dorothea Wendt
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Lorenz Fiedler
- Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark
- Elaine Hoi Ning Ng
- Department of Behavioral Sciences and Learning, Linkoping University, Linköping, Sweden; Oticon A/S, Smørum, Denmark

31
Jaeger M, Mirkovic B, Bleichner MG, Debener S. Decoding the Attended Speaker From EEG Using Adaptive Evaluation Intervals Captures Fluctuations in Attentional Listening. Front Neurosci 2020; 14:603. [PMID: 32612507 PMCID: PMC7308709 DOI: 10.3389/fnins.2020.00603] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2019] [Accepted: 05/15/2020] [Indexed: 11/13/2022] Open
Abstract
Listeners differ in their ability to attend to a speech stream in the presence of a competing sound. Differences in speech intelligibility in noise cannot be fully explained by hearing ability, which suggests the involvement of additional cognitive factors. A better understanding of the temporal fluctuations in the ability to pay selective auditory attention to a desired speech stream may help in explaining these variabilities. In order to better understand the temporal dynamics of selective auditory attention, we developed an online auditory attention decoding (AAD) processing pipeline based on speech envelope tracking in the electroencephalogram (EEG). Participants had to attend to one audiobook story while a second one had to be ignored. Online AAD was applied to track attention toward the target speech signal. Individual temporal attention profiles were computed by combining an established AAD method with an adaptive staircase procedure. The individual decoding performance over time was analyzed and linked to behavioral performance as well as to subjective ratings of listening effort, motivation, and fatigue. The grand average attended speaker decoding profile derived in the online experiment indicated performance above chance level. Parameters describing the individual AAD performance in each testing block showed that differences in decoding performance over time were closely related to behavioral performance in the selective listening task. Further, an exploratory analysis indicated that subjects with poor decoding performance reported higher listening effort and fatigue compared to good performers. Taken together, our results show that online EEG-based AAD in a complex listening situation is feasible. Adaptive attended speaker decoding profiles over time could be used as an objective measure of behavioral performance and listening effort. The developed online processing pipeline could also serve as a basis for future EEG-based near-real-time auditory neurofeedback systems.
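The core of an online AAD pipeline is a windowed decision rule: reconstruct the envelope from the most recent EEG, correlate it with both speakers' envelopes, and pick the larger. A minimal sketch follows, assuming a pre-trained linear decoder and fixed window lengths; the adaptive staircase over window length used in the study is omitted here.

```python
import numpy as np

def online_aad(eeg, env_a, env_b, decoder, fs, win_s=20, step_s=5):
    """Sliding-window attended-speaker decisions from a pre-trained decoder.

    Returns one A/B decision per window based on which envelope correlates
    better with the reconstruction. Window lengths are illustrative.
    """
    win, step = int(win_s * fs), int(step_s * fs)
    decisions = []
    for start in range(0, eeg.shape[0] - win + 1, step):
        sl = slice(start, start + win)
        rec = eeg[sl] @ decoder
        r_a = np.corrcoef(rec, env_a[sl])[0, 1]
        r_b = np.corrcoef(rec, env_b[sl])[0, 1]
        decisions.append("A" if r_a > r_b else "B")
    return decisions

rng = np.random.default_rng(6)
fs, n_t, n_ch = 64, 64 * 120, 16
env_a, env_b = rng.standard_normal((2, n_t))
eeg = rng.standard_normal((n_t, n_ch))
eeg[:, 0] += 0.8 * env_a                       # subject attends speaker A
decoder = np.zeros(n_ch); decoder[0] = 1.0     # stand-in for a trained decoder
d = online_aad(eeg, env_a, env_b, decoder, fs)
print(f"{d.count('A')}/{len(d)} windows decoded as speaker A")
```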
Affiliation(s)
- Manuela Jaeger
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Fraunhofer Institute for Digital Media Technology IDMT, Division Hearing, Speech and Audio Technology, Oldenburg, Germany
- Bojana Mirkovic
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany
- Martin G Bleichner
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Neurophysiology of Everyday Life Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Stefan Debener
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany; Research Center for Neurosensory Science, University of Oldenburg, Oldenburg, Germany

32
Wang Y, Zhang J, Zou J, Luo H, Ding N. Prior Knowledge Guides Speech Segregation in Human Auditory Cortex. Cereb Cortex 2020; 29:1561-1571. [PMID: 29788144 DOI: 10.1093/cercor/bhy052] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2017] [Revised: 01/22/2018] [Accepted: 02/15/2018] [Indexed: 11/12/2022] Open
Abstract
Segregating concurrent sound streams is a computationally challenging task that requires integrating bottom-up acoustic cues (e.g. pitch) and top-down prior knowledge about sound streams. In a multi-talker environment, the brain can segregate different speakers in about 100 ms in auditory cortex. Here, we used magnetoencephalographic (MEG) recordings to investigate the temporal and spatial signature of how the brain utilizes prior knowledge to segregate 2 speech streams from the same speaker, which can hardly be separated based on bottom-up acoustic cues. In a primed condition, the participants know the target speech stream in advance, while in an unprimed condition no such prior knowledge is available. Neural encoding of each speech stream is characterized by the MEG responses tracking the speech envelope. We demonstrate an effect in bilateral superior temporal gyrus and superior temporal sulcus that is much stronger in the primed condition than in the unprimed condition. Priming effects are observed at about 100 ms latency and last more than 600 ms. Interestingly, prior knowledge about the target stream facilitates speech segregation mainly by suppressing the neural tracking of the non-target speech stream. In sum, prior knowledge leads to reliable speech segregation in auditory cortex, even in the absence of reliable bottom-up speech segregation cues.
Affiliation(s)
- Yuanye Wang
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China; McGovern Institute for Brain Research, Peking University, Beijing, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Jianfeng Zhang
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Jiajie Zou
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Huan Luo
- School of Psychological and Cognitive Sciences, Peking University, Beijing, China; McGovern Institute for Brain Research, Peking University, Beijing, China; Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China
- Nai Ding
- College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, Zhejiang, China; Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, Zhejiang, China; State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, Zhejiang, China; Interdisciplinary Center for Social Sciences, Zhejiang University, Hangzhou, Zhejiang, China

33
Paul BT, Uzelac M, Chan E, Dimitrijevic A. Poor early cortical differentiation of speech predicts perceptual difficulties of severely hearing-impaired listeners in multi-talker environments. Sci Rep 2020; 10:6141. [PMID: 32273536 PMCID: PMC7145807 DOI: 10.1038/s41598-020-63103-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2020] [Accepted: 03/24/2020] [Indexed: 11/23/2022] Open
Abstract
Hearing impairment disrupts processes of selective attention that help listeners attend to one sound source over competing sounds in the environment. Hearing prostheses (hearing aids and cochlear implants, CIs) do not fully remedy these issues. In normal hearing, mechanisms of selective attention arise through the facilitation and suppression of neural activity that represents sound sources. However, it is unclear how hearing impairment affects these neural processes, which is key to understanding why listening difficulty remains. Here, severely impaired listeners treated with a CI, and age-matched normal-hearing controls, attended to one of two identical but spatially separated talkers while multichannel EEG was recorded. Whereas neural representations of attended and ignored speech were differentiated at early (~150 ms) cortical processing stages in controls, differentiation of talker representations only occurred later (~250 ms) in CI users. CI users, but not controls, also showed evidence for spatial suppression of the ignored talker through lateralized alpha (7-14 Hz) oscillations. However, CI users' perceptual performance was only predicted by early-stage talker differentiation. We conclude that multi-talker listening difficulty remains for impaired listeners due to deficits in early-stage separation of cortical speech representations, despite neural evidence that they use spatial information to guide selective attention.
Affiliation(s)
- Brandon T Paul
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, ON, M4N 3M5, Canada
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, ON, M4N 3M5, Canada
- Mila Uzelac
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, ON, M4N 3M5, Canada
- Emmanuel Chan
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, ON, M4N 3M5, Canada
- Andrew Dimitrijevic
- Evaluative Clinical Sciences Platform, Sunnybrook Research Institute, Toronto, ON, M4N 3M5, Canada
- Otolaryngology-Head and Neck Surgery, Sunnybrook Health Sciences Centre, Toronto, ON, M4N 3M5, Canada
- Faculty of Medicine, Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, ON, M5S 1A1, Canada

34
Cortical auditory responses index the contributions of different RMS-level-dependent segments to speech intelligibility. Hear Res 2019; 383:107808. [DOI: 10.1016/j.heares.2019.107808] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/23/2019] [Revised: 09/17/2019] [Accepted: 10/01/2019] [Indexed: 10/25/2022]

35
Vanthornhout J, Decruy L, Francart T. Effect of Task and Attention on Neural Tracking of Speech. Front Neurosci 2019; 13:977. [PMID: 31607841 PMCID: PMC6756133 DOI: 10.3389/fnins.2019.00977] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2019] [Accepted: 08/30/2019] [Indexed: 12/02/2022] Open
Abstract
EEG-based measures of neural tracking of natural running speech are becoming increasingly popular for investigating the neural processing of speech and have applications in audiology. When the stimulus is a single speaker, it is usually assumed that the listener actively attends to and understands the stimulus. However, as the level of attention of the listener is inherently variable, we investigated how this affects neural envelope tracking. Using a movie as a distractor, we varied the level of attention while we estimated neural envelope tracking. We varied the intelligibility level by adding stationary noise. We found a significant difference in neural envelope tracking between the condition with maximal attention and the movie condition. This difference was most pronounced in the right frontal region of the brain. The degree of neural envelope tracking was highly correlated with the stimulus signal-to-noise ratio, even in the movie condition. This could be due to residual neural resources that continue to passively process the stimulus. When envelope tracking is used to measure speech understanding objectively, this means that the procedure can be made more enjoyable and feasible by letting participants watch a movie during stimulus presentation.
Affiliation(s)
- Lien Decruy
- Department of Neurosciences, ExpORL, KU Leuven, Leuven, Belgium
- Tom Francart
- Department of Neurosciences, ExpORL, KU Leuven, Leuven, Belgium

36
Lesenfants D, Vanthornhout J, Verschueren E, Decruy L, Francart T. Predicting individual speech intelligibility from the cortical tracking of acoustic- and phonetic-level speech representations. Hear Res 2019; 380:1-9. [DOI: 10.1016/j.heares.2019.05.006] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/12/2018] [Revised: 05/20/2019] [Accepted: 05/21/2019] [Indexed: 10/26/2022]

37
Xie Z, Reetzke R, Chandrasekaran B. Machine Learning Approaches to Analyze Speech-Evoked Neurophysiological Responses. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:587-601. [PMID: 30950746 PMCID: PMC6802895 DOI: 10.1044/2018_jslhr-s-astm-18-0244] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/18/2018] [Revised: 10/28/2018] [Accepted: 11/26/2018] [Indexed: 05/27/2023]
Abstract
Purpose Speech-evoked neurophysiological responses are often collected to answer clinically and theoretically driven questions concerning speech and language processing. Here, we highlight the practical application of machine learning (ML)-based approaches to analyzing speech-evoked neurophysiological responses. Method Two categories of ML-based approaches are introduced: decoding models, which generate a speech stimulus output using the features from the neurophysiological responses, and encoding models, which use speech stimulus features to predict neurophysiological responses. In this review, we focus on (a) a decoding model classification approach, wherein speech-evoked neurophysiological responses are classified as belonging to 1 of a finite set of possible speech events (e.g., phonological categories), and (b) an encoding model temporal response function approach, which quantifies the transformation of a speech stimulus feature to continuous neural activity. Results We illustrate the utility of the classification approach to analyze early electroencephalographic (EEG) responses to Mandarin lexical tone categories from a traditional experimental design, and to classify EEG responses to English phonemes evoked by natural continuous speech (i.e., an audiobook) into phonological categories (plosive, fricative, nasal, and vowel). We also demonstrate the utility of temporal response function to predict EEG responses to natural continuous speech from acoustic features. Neural metrics from the 3 examples all exhibit statistically significant effects at the individual level. Conclusion We propose that ML-based approaches can complement traditional analysis approaches to analyze neurophysiological responses to speech signals and provide a deeper understanding of natural speech and language processing using ecologically valid paradigms in both typical and clinical populations.
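The decoding-model classification approach described here reduces, in its simplest form, to training a classifier on EEG-derived feature vectors labeled by phonological category. The scikit-learn sketch below uses synthetic features for four hypothetical classes; it illustrates the cross-validated workflow, not the paper's actual features or classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n_per_class, n_feat = 60, 40
classes = ["plosive", "fricative", "nasal", "vowel"]
X, y = [], []
for c in classes:
    centroid = rng.standard_normal(n_feat)          # class-specific EEG pattern
    X.append(centroid + 1.5 * rng.standard_normal((n_per_class, n_feat)))
    y += [c] * n_per_class
X = np.vstack(X)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X, np.array(y), cv=5)
print(f"decoding accuracy: {scores.mean():.2f} (chance = {1 / len(classes):.2f})")
```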
Affiliation(s)
- Zilong Xie
- Department of Communication Sciences and Disorders, The University of Texas at Austin
- Rachel Reetzke
- Department of Communication Sciences and Disorders, The University of Texas at Austin
- Bharath Chandrasekaran
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh

38
Müller JA, Wendt D, Kollmeier B, Debener S, Brand T. Effect of Speech Rate on Neural Tracking of Speech. Front Psychol 2019; 10:449. [PMID: 30906273 PMCID: PMC6418035 DOI: 10.3389/fpsyg.2019.00449] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2018] [Accepted: 02/14/2019] [Indexed: 12/03/2022] Open
Abstract
Speech comprehension requires effort in demanding listening situations. Selective attention may be required for focusing on a specific talker in a multi-talker environment, may enhance effort by requiring additional cognitive resources, and is known to enhance the neural representation of the attended talker in the listener's neural response. The aim of the study was to investigate the relation of listening effort, as quantified by subjective effort ratings and pupil dilation, and neural speech tracking during sentence recognition. Task demands were varied using sentences with varying levels of linguistic complexity and using two different speech rates in a picture-matching paradigm with 20 normal-hearing listeners. The participants' task was to match the acoustically presented sentence with a picture presented before the acoustic stimulus. Afterwards they rated their perceived effort on a categorical effort scale. During each trial, pupil dilation (as an indicator of listening effort) and electroencephalogram (as an indicator of neural speech tracking) were recorded. Neither measure was significantly affected by linguistic complexity. However, speech rate showed a strong influence on subjectively rated effort, pupil dilation, and neural tracking. The neural tracking analysis revealed a shorter latency for faster sentences, which may reflect a neural adaptation to the rate of the input. No relation was found between neural tracking and listening effort, even though both measures were clearly influenced by speech rate. This is probably due to factors that influence both measures differently. Consequently, the amount of listening effort is not clearly represented in the neural tracking.
Affiliation(s)
- Jana Annina Müller
- Cluster of Excellence ‘Hearing4all’, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Medizinische Physik, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Dorothea Wendt
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
- Eriksholm Research Centre, Snekkersten, Denmark
- Birger Kollmeier
- Cluster of Excellence ‘Hearing4all’, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Medizinische Physik, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Stefan Debener
- Cluster of Excellence ‘Hearing4all’, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Neuropsychology Lab, Department of Psychology, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Thomas Brand
- Cluster of Excellence ‘Hearing4all’, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
- Medizinische Physik, Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany

39
Hambrook DA, Tata MS. The effects of distractor set-size on neural tracking of attended speech. BRAIN AND LANGUAGE 2019; 190:1-9. [PMID: 30616147 DOI: 10.1016/j.bandl.2018.12.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/23/2018] [Revised: 11/19/2018] [Accepted: 12/19/2018] [Indexed: 06/09/2023]
Abstract
Attention is crucial to speech comprehension in real-world, noisy environments. Selective phase-tracking between low-frequency brain dynamics and the envelope of target speech is a proposed mechanism for rejecting competing distractors. Studies have supported this theory in the case of a single distractor, but have not considered how tracking is systematically affected by varying distractor set sizes. We recorded electroencephalography (EEG) during selective listening to both natural and vocoded speech as the distractor set size varied from two to six voices. Increasing set size reduced performance and attenuated EEG tracking of target speech. Further, we found that intrusions of distractor speech into perception were not accompanied by sustained tracking of the distractor stream. Our results support the theory that tracking of speech dynamics is a mechanism for selective attention, and that distraction does not arise from simple stimulus-driven capture of sustained auditory entrainment by the acoustics of distracting speech.
Affiliation(s)
- Dillon A Hambrook
- The University of Lethbridge, 4401 University Drive, Lethbridge, Alberta T1K 3M4, Canada
- Matthew S Tata
- The University of Lethbridge, 4401 University Drive, Lethbridge, Alberta T1K 3M4, Canada

40
Kumagai Y, Matsui R, Tanaka T. Music Familiarity Affects EEG Entrainment When Little Attention Is Paid. Front Hum Neurosci 2018; 12:444. [PMID: 30459583 PMCID: PMC6232314 DOI: 10.3389/fnhum.2018.00444] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2018] [Accepted: 10/16/2018] [Indexed: 11/21/2022] Open
Abstract
To investigate the brain's response to music, many researchers have examined cortical entrainment to periodic tunes, periodic beats, and music. Music familiarity is another factor that affects cortical entrainment, and electroencephalogram (EEG) studies have shown that stronger entrainment occurs while listening to unfamiliar music than while listening to familiar music. In the present study, we hypothesized that not only the level of familiarity but also the level of attention affects the level of entrainment. We simultaneously presented music and a silent movie to participants and recorded EEG while participants paid attention to either the music or the movie, in order to investigate whether cortical entrainment is related to attention and music familiarity. The average cross-correlation function across channels, trials, and participants exhibited a pronounced positive peak at time lags around 130 ms and a negative peak at time lags around 260 ms. The statistical analysis of the two peaks revealed that the level of attention did not affect the level of entrainment and, moreover, that in both the auditory-active and visual-active conditions the entrainment level was stronger when listening to unfamiliar music than when listening to familiar music. This may indicate that familiarity with music affects cortical activity even when attention is not fully devoted to listening to music.
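A cross-correlation analysis like the one described, with a positive peak near 130 ms, can be reproduced on synthetic data in a few lines: z-score the music envelope and the EEG, then correlate at increasing EEG lags. The lag range and sampling rate below are illustrative assumptions.

```python
import numpy as np

def envelope_eeg_xcorr(envelope, eeg, fs, max_lag_s=0.5):
    """Normalized cross-correlation; positive lags mean EEG trails the music."""
    env = (envelope - envelope.mean()) / envelope.std()
    sig = (eeg - eeg.mean()) / eeg.std()
    n = len(env)
    lags = np.arange(int(max_lag_s * fs) + 1)
    r = np.array([np.mean(env[:n - k] * sig[k:]) for k in lags])
    return lags / fs, r

rng = np.random.default_rng(8)
fs, n_t = 128, 128 * 60
env = np.convolve(rng.standard_normal(n_t), np.ones(16) / 16, mode="same")
eeg = np.roll(env, int(0.13 * fs)) + 2 * rng.standard_normal(n_t)  # ~130 ms delay
lag_s, r = envelope_eeg_xcorr(env, eeg, fs)
print(f"peak at {lag_s[np.argmax(r)] * 1000:.0f} ms, r = {r.max():.2f}")
```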
Affiliation(s)
- Yuiko Kumagai: Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo, Japan
- Ryosuke Matsui: Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo, Japan
- Toshihisa Tanaka: Department of Electrical and Electronic Engineering, Tokyo University of Agriculture and Technology, Tokyo, Japan; RIKEN Center for Brain Science, Saitama, Japan; RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
41
Fiedler L, Wöstmann M, Herbst SK, Obleser J. Late cortical tracking of ignored speech facilitates neural selectivity in acoustically challenging conditions. Neuroimage 2018; 186:33-42. [PMID: 30367953 DOI: 10.1016/j.neuroimage.2018.10.057] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2018] [Revised: 09/12/2018] [Accepted: 10/21/2018] [Indexed: 11/25/2022] Open
Abstract
Listening requires selective neural processing of the incoming sound mixture, which in humans is borne out by a surprisingly clean representation of attended-only speech in auditory cortex. How this neural selectivity is achieved even at negative signal-to-noise ratios (SNR) remains unclear. We show that, under such conditions, a late cortical representation (i.e., neural tracking) of the ignored acoustic signal is key to successful separation of attended and distracting talkers (i.e., neural selectivity). We recorded and modeled the electroencephalographic response of 18 participants who attended to one of two simultaneously presented stories, while the SNR between the two talkers varied dynamically between +6 and -6 dB. The neural tracking showed an increasing early-to-late attention-biased selectivity. Importantly, acoustically dominant (i.e., louder) ignored talkers were tracked neurally by late involvement of fronto-parietal regions, which contributed to enhanced neural selectivity. This neural selectivity, by way of representing the ignored talker, poses a mechanistic neural account of attention under real-life acoustic conditions.
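Neural tracking of attended and ignored talkers in studies like this one is often estimated with a forward model, i.e. a temporal response function (TRF) regressed from time-lagged copies of each talker's envelope. A minimal ridge-regression sketch under that assumption (the function names, lag window, and regularization are ours, not the paper's):

```python
# Forward (encoding) model sketch: TRF via ridge regression on lagged envelopes.
import numpy as np

def lagged_design(envelope, fs, tmin_s=0.0, tmax_s=0.5):
    """Stack time-lagged copies of the envelope into a design matrix."""
    lags = np.arange(int(tmin_s * fs), int(tmax_s * fs) + 1)
    X = np.zeros((len(envelope), len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = envelope[: len(envelope) - lag]
    return X, lags / fs

def fit_trf(envelope, eeg, fs, lam=1.0):
    """Ridge solution w = (X'X + lam*I)^-1 X'y; w is the TRF over lags."""
    X, lag_s = lagged_design(envelope, fs)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
    return lag_s, w

fs = 64
rng = np.random.default_rng(1)
env = rng.standard_normal(fs * 60)
eeg = np.convolve(env, np.hanning(10), mode="same") + rng.standard_normal(len(env))
lag_s, w = fit_trf(env, eeg, fs)   # w approximates the toy smoothing kernel
```

Fitting one TRF per talker and per lag window is one way the early-to-late selectivity described above can be made explicit.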
Affiliation(s)
- Lorenz Fiedler: Department of Psychology, University of Lübeck, Lübeck, Germany
- Malte Wöstmann: Department of Psychology, University of Lübeck, Lübeck, Germany
- Sophie K Herbst: Department of Psychology, University of Lübeck, Lübeck, Germany
- Jonas Obleser: Department of Psychology, University of Lübeck, Lübeck, Germany
42
Gandras K, Grimm S, Bendixen A. Electrophysiological Correlates of Speaker Segregation and Foreground-Background Selection in Ambiguous Listening Situations. Neuroscience 2018; 389:19-29. [PMID: 28735101 DOI: 10.1016/j.neuroscience.2017.07.021] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2017] [Revised: 07/10/2017] [Accepted: 07/10/2017] [Indexed: 11/15/2022]
Abstract
In everyday listening environments, a main task for our auditory system is to follow one out of multiple speakers talking simultaneously. The present study was designed to find electrophysiological indicators of two central processes involved - segregating the speech mixture into distinct speech sequences corresponding to the two speakers, and then attending to one of the speech sequences. We generated multistable speech stimuli that were set up to create ambiguity as to whether only one or two speakers are talking. Thereby we were able to investigate three perceptual alternatives (no segregation, segregated - speaker A in the foreground, segregated - speaker B in the foreground) without any confounding stimulus changes. Participants listened to a continuously repeating sequence of syllables, which were uttered alternately by two human speakers, and indicated whether they perceived the sequence as an inseparable mixture or as originating from two separate speakers. In the latter case, they distinguished which speaker was in their attentional foreground. Our data show a long-lasting event-related potential (ERP) modulation starting at 130 ms after stimulus onset, which can be explained by the perceptual organization of the two speech sequences into attended foreground and ignored background streams. Our paradigm extends previous work with pure-tone sequences toward speech stimuli and adds the possibility to obtain neural correlates of the difficulty to segregate a speech mixture into distinct streams.
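The reported ERP modulation from roughly 130 ms onward implies epoching the continuous EEG around syllable onsets and averaging within each reported percept. A schematic sketch of that averaging step, with entirely hypothetical condition labels standing in for the three perceptual alternatives:

```python
# Schematic ERP averaging by reported percept; labels/names are hypothetical.
import numpy as np

def erp_by_condition(eeg, onsets, labels, fs, tmin=-0.1, tmax=0.5):
    """Mean epoch per condition label; eeg is 1-D, onsets in samples."""
    n0, n1 = int(tmin * fs), int(tmax * fs)
    erps = {}
    for cond in set(labels):
        epochs = [eeg[o + n0 : o + n1] for o, l in zip(onsets, labels)
                  if l == cond and o + n0 >= 0 and o + n1 <= len(eeg)]
        erps[cond] = np.mean(epochs, axis=0)
    return erps

fs = 250
eeg = np.random.randn(fs * 300)
onsets = np.arange(fs, len(eeg) - fs, int(0.4 * fs))   # syllables every 400 ms
labels = np.random.choice(["integrated", "A_foreground", "B_foreground"],
                          size=len(onsets))
erps = erp_by_condition(eeg, onsets, labels, fs)
```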
Affiliation(s)
- Katharina Gandras: Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, D-26111 Oldenburg, Germany
- Sabine Grimm: Department of Physics, School of Natural Sciences, Chemnitz University of Technology, D-09126 Chemnitz, Germany
- Alexandra Bendixen: Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, D-26111 Oldenburg, Germany; Department of Physics, School of Natural Sciences, Chemnitz University of Technology, D-09126 Chemnitz, Germany
43
Holt LL, Tierney AT, Guerra G, Laffere A, Dick F. Dimension-selective attention as a possible driver of dynamic, context-dependent re-weighting in speech processing. Hear Res 2018; 366:50-64. [PMID: 30131109 PMCID: PMC6107307 DOI: 10.1016/j.heares.2018.06.014] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/18/2018] [Revised: 06/10/2018] [Accepted: 06/19/2018] [Indexed: 12/24/2022]
Abstract
The contribution of acoustic dimensions to an auditory percept is dynamically adjusted and reweighted based on prior experience about how informative these dimensions are across the long-term and short-term environment. This is especially evident in speech perception, where listeners differentially weight information across multiple acoustic dimensions and use this information selectively to update expectations about future sounds. The dynamic and selective adjustment of how acoustic input dimensions contribute to perception has made it tempting to conceive of this as a form of non-spatial auditory selective attention. Here, we review several human speech perception phenomena that might be consistent with auditory selective attention, although, as yet, the literature does not definitively support a mechanistic tie. We relate these human perceptual phenomena to illustrative nonhuman animal neurobiological findings that offer informative guideposts for testing mechanistic connections. We next present a novel empirical approach that can serve as a methodological bridge from human research to animal neurobiological studies. Finally, we describe four preliminary results that demonstrate its utility in advancing understanding of human non-spatial, dimension-based auditory selective attention.
Affiliation(s)
- Lori L Holt: Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Adam T Tierney: Department of Psychological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London WC1E 7HX, UK
- Giada Guerra: Department of Psychological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London WC1E 7HX, UK
- Aeron Laffere: Department of Psychological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK
- Frederic Dick: Department of Psychological Sciences, Birkbeck College, University of London, London WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London WC1E 7HX, UK; Department of Experimental Psychology, University College London, London WC1H 0AP, UK
44
Olguin A, Bekinschtein TA, Bozic M. Neural Encoding of Attended Continuous Speech under Different Types of Interference. J Cogn Neurosci 2018; 30:1606-1619. [PMID: 30004849 DOI: 10.1162/jocn_a_01303] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
We examined how attention modulates the neural encoding of continuous speech under different types of interference. In an EEG experiment, participants attended to a narrative in English while ignoring a competing stream in the other ear. Four different types of interference were presented to the unattended ear: a different English narrative, a narrative in a language unknown to the listener (Spanish), a well-matched nonlinguistic acoustic interference (Musical Rain), and no interference. Neural encoding of attended and unattended signals was assessed by calculating cross-correlations between their respective envelopes and the EEG recordings. Findings revealed more robust neural encoding for the attended envelopes compared with the ignored ones. Critically, however, the type of the interfering stream significantly modulated this process: the fully intelligible distractor (English) caused the strongest encoding of both attended and unattended streams and the latest dissociation between them, whereas nonintelligible distractors caused weaker encoding and an earlier dissociation between attended and unattended streams. The results were consistent over the time course of the spoken narrative. These findings suggest that attended and unattended information can be differentiated at different depths of processing analysis, with the locus of selective attention determined by the nature of the competing stream. They provide strong support to flexible accounts of auditory selective attention.
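The envelope cross-correlation measures in this entry and the ones above presuppose an envelope extracted from the audio. One common recipe (an assumption on our part; these authors may have used a different front end) is the magnitude of the analytic signal, low-pass filtered and resampled to the EEG rate:

```python
# Common speech-envelope extraction recipe; not necessarily the authors' one.
import numpy as np
from scipy.signal import butter, hilbert, resample, sosfiltfilt

def speech_envelope(audio, fs_audio, fs_eeg, cutoff=8.0):
    """Amplitude envelope: |analytic signal|, low-passed, resampled."""
    env = np.abs(hilbert(audio))                     # amplitude envelope
    sos = butter(4, cutoff, btype="low", fs=fs_audio, output="sos")
    env = sosfiltfilt(sos, env)                      # keep slow modulations
    return resample(env, int(len(env) * fs_eeg / fs_audio))

fs_audio, fs_eeg = 16000, 64
audio = np.random.randn(fs_audio * 10)               # stand-in speech waveform
env = speech_envelope(audio, fs_audio, fs_eeg)       # 640 samples at EEG rate
```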
45
Haghighi M, Moghadamfalahi M, Akcakaya M, Shinn-Cunningham BG, Erdogmus D. A Graphical Model for Online Auditory Scene Modulation Using EEG Evidence for Attention. IEEE Trans Neural Syst Rehabil Eng 2017; 25:1970-1977. [PMID: 28600256 PMCID: PMC5681401 DOI: 10.1109/tnsre.2017.2712419] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Recent findings indicate that brain interfaces have the potential to enable attention-guided auditory scene analysis and manipulation in applications such as hearing aids and augmented/virtual environments. Specifically, noninvasively acquired electroencephalography (EEG) signals have been demonstrated to carry some evidence regarding which of multiple synchronous speech waveforms the subject attends to. In this paper, we demonstrate that: 1) using data- and model-driven cross-correlation features yields competitive binary auditory attention classification results with at most 20 s of EEG from 16 channels or even a single well-positioned channel; 2) a model calibrated using equal-energy speech waveforms competing for attention could perform well on estimating attention in closed-loop unbalanced-energy speech waveform situations, where the speech amplitudes are modulated by the estimated attention posterior probability distribution; 3) such a model would perform even better if it is corrected (linearly, in this instance) based on EEG evidence dependence on speech weights in the mixture; and 4) calibrating a model based on population EEG could result in acceptable performance for new individuals/users; therefore, EEG-based auditory attention classifiers may generalize across individuals, leading to reduced or eliminated calibration time and effort.
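Finding 2) hinges on turning per-segment classifier outputs into an attention posterior that then drives the speech amplitudes. A toy sketch of such a recursive posterior update for two talkers (our simplified two-state model, not the paper's full graphical model):

```python
# Toy two-state attention posterior update; a simplification, not the paper's model.
import numpy as np

def update_posterior(prior, p_seg, transition=0.05):
    """Mix in a small attention-switching probability, multiply by the
    segment likelihoods, and renormalize.

    prior : shape (2,), current P(attending talker 0 / talker 1).
    p_seg : shape (2,), classifier likelihoods for the latest EEG segment.
    """
    T = np.array([[1 - transition, transition],
                  [transition, 1 - transition]])
    post = (T @ prior) * p_seg
    return post / post.sum()

posterior = np.array([0.5, 0.5])
for p_seg in [np.array([0.7, 0.3]), np.array([0.6, 0.4]), np.array([0.8, 0.2])]:
    posterior = update_posterior(posterior, p_seg)
print(posterior)   # accumulates evidence toward talker 0
```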
46
Haghighi M, Moghadamfalahi M, Akcakaya M, Erdogmus D. EEG-assisted Modulation of Sound Sources in the Auditory Scene. Biomed Signal Process Control 2017; 39:263-270. [PMID: 31118975 DOI: 10.1016/j.bspc.2017.08.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
Noninvasive EEG (electroencephalography)-based auditory attention detection could be useful for improved hearing aids in the future. This work is a novel attempt to investigate the feasibility of online modulation of sound sources by probabilistic detection of auditory attention, using a noninvasive EEG-based brain-computer interface. The proposed online system modulates the upcoming sound sources through gain adaptation, which employs probabilistic decisions (soft decisions) from a classifier trained on offline calibration data. In this work, calibration EEG data were collected in sessions where the participants listened to two sound sources (one attended and one unattended). Cross-correlation coefficients between the EEG measurements and the attended and unattended sound-source envelope estimates are used to show differences in sharpness and delays of neural responses to attended versus unattended sound sources. Salient features that distinguish attended sources from unattended ones in the correlation patterns were identified and later used to train an auditory attention classifier. Using this classifier, we show high offline detection performance with single-channel EEG measurements, compared to existing approaches in the literature that employ large numbers of channels. In addition, using the classifier trained offline in the calibration session, we demonstrate the performance of the online sound-source modulation system. We observe that the online system is able to keep the level of the attended sound source higher than that of the unattended source.
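The gain-adaptation step, in which soft classifier decisions modulate the sound sources, can be illustrated with a smoothed mapping from posterior to per-source gain. This is a hypothetical rule in the spirit of the description above, not the authors' controller:

```python
# Hypothetical soft gain-adaptation rule; illustrative only.
import numpy as np

def adapt_gains(gains, posterior, floor=0.2, alpha=0.1):
    """Move gains toward the posterior, keeping a minimum audibility floor
    so the unattended source is attenuated but never fully muted."""
    target = floor + (1.0 - floor) * posterior
    return (1 - alpha) * gains + alpha * target

gains = np.array([0.6, 0.6])
posterior = np.array([0.9, 0.1])      # classifier strongly favors source 0
for _ in range(20):                   # repeated segments of EEG evidence
    gains = adapt_gains(gains, posterior)
print(gains)                          # source 0 louder, source 1 near floor
```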
Affiliation(s)
- Murat Akcakaya: University of Pittsburgh, 4200 Fifth Ave, Pittsburgh, PA 15260
- Deniz Erdogmus: Northeastern University, 360 Huntington Ave, Boston, MA 02115
47
Fiedler L, Obleser J, Lunner T, Graversen C. Ear-EEG allows extraction of neural responses in challenging listening scenarios - A future technology for hearing aids? Annu Int Conf IEEE Eng Med Biol Soc 2017; 2016:5697-5700. [PMID: 28269548 DOI: 10.1109/embc.2016.7592020] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Advances in brain-computer interface research have recently empowered the development of wearable sensors to record mobile electroencephalography (EEG) as an unobtrusive and easy-to-use alternative to conventional scalp EEG. One such mobile solution is to record EEG from the ear canal, which has been validated for auditory steady-state responses and discrete event-related potentials (ERPs). However, it is still under discussion where to place recording and reference electrodes to best capture responses to auditory stimuli. Furthermore, the technology has not yet been tested and validated for ecologically relevant auditory stimuli such as speech. In this study, Ear-EEG and conventional scalp EEG were recorded simultaneously in a discrete-tone as well as a continuous-speech design. The discrete stimuli were applied in a dichotic oddball paradigm, while continuous stimuli were presented diotically as two simultaneous talkers. Cross-correlation of stimulus envelope and Ear-EEG was assessed as a measure of ongoing neural tracking. The ERPs extracted from Ear-EEG revealed typical auditory components, yet depended critically on the chosen reference electrode. Reliable neural-tracking responses were extracted from the Ear-EEG for both paradigms, albeit weaker in amplitude than from scalp EEG. In conclusion, this study shows the feasibility of extracting relevant neural features from ear-canal-recorded "Ear-EEG", which might augment future hearing technology.
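Because the extracted ERPs depended critically on the chosen reference electrode, any Ear-EEG analysis starts with a re-referencing step. A minimal sketch of that step (the channel layout and names are hypothetical):

```python
# Minimal re-referencing sketch; channel layout is hypothetical.
import numpy as np

def rereference(eeg, ref_idx):
    """Subtract one channel (or the mean of several) from all channels.

    eeg     : array (n_channels, n_samples)
    ref_idx : int or list of ints selecting the reference channel(s)
    """
    ref = np.atleast_2d(eeg[ref_idx]).mean(axis=0)
    return eeg - ref

eeg = np.random.randn(4, 1000)      # e.g. 3 ear-canal + 1 scalp electrode
ear_vs_scalp = rereference(eeg, 3)  # ear channels referenced to the scalp site
```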
48
Kumagai Y, Arvaneh M, Okawa H, Wada T, Tanaka T. Classification of familiarity based on cross-correlation features between EEG and music. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017:2879-2882. [PMID: 29060499 DOI: 10.1109/embc.2017.8037458] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
An approach to recognizing a listener's familiarity with music, using both the electroencephalogram (EEG) signals and the music signal, is proposed in this paper. Eight participants listened to melodies produced by piano sounds as simple natural stimuli. We classified the familiarity of each participant using cross-correlation values between the EEG and the envelope of the music signal as features for a support vector machine (SVM) or a neural network. Here, we report that the maximum classification accuracy, obtained with the SVM, was 100%. These results suggest that the familiarity of music can be classified from cross-correlation values. The proposed approach could be used to recognize high-level brain states such as familiarity, preference, and emotion.
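The classification step, cross-correlation values fed to an SVM, can be sketched with a standard scikit-learn pipeline. The features below are random placeholders standing in for per-lag cross-correlation values, so the code illustrates only the plumbing, not the reported 100% accuracy:

```python
# SVM on cross-correlation features; random placeholder data, plumbing only.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.standard_normal((80, 16))          # trials x cross-correlation lags
y = rng.integers(0, 2, size=80)            # 0 = unfamiliar, 1 = familiar
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)  # chance-level on random features
print(scores.mean())
```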
49
Fuglsang SA, Dau T, Hjortkjær J. Noise-robust cortical tracking of attended speech in real-world acoustic scenes. Neuroimage 2017; 156:435-444. [PMID: 28412441 DOI: 10.1016/j.neuroimage.2017.04.026] [Citation(s) in RCA: 97] [Impact Index Per Article: 13.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2016] [Revised: 04/07/2017] [Accepted: 04/10/2017] [Indexed: 11/30/2022] Open
Abstract
Selectively attending to one speaker in a multi-speaker scenario is thought to synchronize low-frequency cortical activity to the attended speech signal. In recent studies, reconstruction of speech from single-trial electroencephalogram (EEG) data has been used to decode which talker a listener is attending to in a two-talker situation. It is currently unclear how this generalizes to more complex sound environments. Behaviorally, speech perception is robust to the acoustic distortions that listeners typically encounter in everyday life, but it is unknown whether this is mirrored by a noise-robust neural tracking of attended speech. Here we used advanced acoustic simulations to recreate real-world acoustic scenes in the laboratory. In virtual acoustic realities with varying amounts of reverberation and number of interfering talkers, listeners selectively attended to the speech stream of a particular talker. Across the different listening environments, we found that the attended talker could be accurately decoded from single-trial EEG data irrespective of the different distortions in the acoustic input. For highly reverberant environments, speech envelopes reconstructed from neural responses to the distorted stimuli resembled the original clean signal more than the distorted input. With reverberant speech, we observed a late cortical response to the attended speech stream that encoded temporal modulations in the speech signal without its reverberant distortion. Single-trial attention decoding accuracies based on 40-50 s long blocks of data from 64 scalp electrodes were equally high (80-90% correct) in all considered listening environments and remained statistically significant using down to 10 scalp electrodes and short (<30 s) unaveraged EEG segments. In contrast to the robust decoding of the attended talker, we found that decoding of the unattended talker deteriorated with the acoustic distortions. These results suggest that cortical activity tracks an attended speech signal in a way that is invariant to acoustic distortions encountered in real-life sound environments. Noise-robust attention decoding additionally suggests a potential utility of stimulus reconstruction techniques in attention-controlled brain-computer interfaces.
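Stimulus reconstruction of the kind used here maps time-lagged multichannel EEG back to an envelope estimate and labels the talker whose envelope matches the reconstruction best. A compact backward-model sketch under those assumptions (illustrative names, lags, and regularization; real pipelines train and evaluate on separate data):

```python
# Backward (stimulus-reconstruction) attention-decoding sketch; illustrative.
import numpy as np

def lagged_eeg(eeg, max_lag):
    """Stack 0..max_lag sample delays of every channel into one matrix."""
    n_ch, n_t = eeg.shape
    X = np.zeros((n_t, n_ch * (max_lag + 1)))
    for lag in range(max_lag + 1):
        X[lag:, lag * n_ch : (lag + 1) * n_ch] = eeg[:, : n_t - lag].T
    return X

def train_decoder(eeg, attended_env, max_lag=16, lam=1e2):
    X = lagged_eeg(eeg, max_lag)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ attended_env)

def decode(eeg, env_a, env_b, w, max_lag=16):
    rec = lagged_eeg(eeg, max_lag) @ w       # reconstructed envelope
    r_a = np.corrcoef(rec, env_a)[0, 1]
    r_b = np.corrcoef(rec, env_b)[0, 1]
    return "A" if r_a > r_b else "B"

rng = np.random.default_rng(3)
fs = 64
env_a = rng.standard_normal(fs * 60)
env_b = rng.standard_normal(fs * 60)
eeg = np.vstack([np.roll(env_a, 8), rng.standard_normal(fs * 60)])  # ch0 tracks A
w = train_decoder(eeg, env_a)
print(decode(eeg, env_a, env_b, w))          # "A" on this toy example
```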
Affiliation(s)
- Søren Asp Fuglsang: Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark
- Torsten Dau: Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark
- Jens Hjortkjær: Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Ørsteds Plads, Building 352, 2800 Kgs. Lyngby, Denmark; Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Hvidovre, Kettegaard Allé 30, 2650 Hvidovre, Denmark
50
Fiedler L, Wöstmann M, Graversen C, Brandmeyer A, Lunner T, Obleser J. Single-channel in-ear-EEG detects the focus of auditory attention to concurrent tone streams and mixed speech. J Neural Eng 2017; 14:036020. [PMID: 28384124 DOI: 10.1088/1741-2552/aa66dd] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
OBJECTIVE Conventional, multi-channel scalp electroencephalography (EEG) allows the identification of the attended speaker in concurrent-listening ('cocktail party') scenarios. This implies that EEG might provide valuable information to complement hearing aids and to enable a level of neuro-feedback. APPROACH To investigate whether a listener's attentional focus can be detected from single-channel, hearing-aid-compatible EEG configurations, we recorded EEG from three electrodes inside the ear canal ('in-Ear-EEG') and additionally from 64 electrodes on the scalp. In two different concurrent listening tasks, participants (n = 7) were fitted with individualized in-Ear-EEG pieces and were asked to attend either to one of two dichotically presented, concurrent tone streams or to one of two diotically presented, concurrent audiobooks. A forward encoding model was trained to predict the EEG response at single EEG channels. MAIN RESULTS Each individual participant's attentional focus could be detected from the single-channel EEG response recorded from short-distance configurations consisting only of a single in-Ear-EEG electrode and an adjacent scalp-EEG electrode. The differences in neural responses to attended and ignored stimuli were consistent in morphology (i.e. polarity and latency of components) across subjects. SIGNIFICANCE In sum, our findings show that the EEG response from a single-channel, hearing-aid-compatible configuration provides valuable information to identify a listener's focus of attention.
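A forward encoding model for single-channel attention detection can be sketched as two envelope-to-EEG regressions, one per stream, with the better-predicting stream taken as attended. This is our minimal reading of the approach, not the authors' implementation; a real analysis would fit and evaluate on separate segments:

```python
# Forward-model attention detection on one channel; a minimal sketch.
import numpy as np

def lagged(env, n_lags):
    """Design matrix of 0..n_lags-1 sample delays of the envelope."""
    X = np.zeros((len(env), n_lags))
    for j in range(n_lags):
        X[j:, j] = env[: len(env) - j]
    return X

def fit_forward(env, eeg_ch, n_lags=32, lam=1.0):
    """Ridge-regularized forward model from envelope to one EEG channel."""
    X = lagged(env, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ eeg_ch)

def attended_stream(eeg_ch, env_a, env_b, n_lags=32):
    """Label the stream whose forward model predicts the channel best.
    (A real analysis would fit and test on separate data segments.)"""
    w_a = fit_forward(env_a, eeg_ch, n_lags)
    w_b = fit_forward(env_b, eeg_ch, n_lags)
    r_a = np.corrcoef(lagged(env_a, n_lags) @ w_a, eeg_ch)[0, 1]
    r_b = np.corrcoef(lagged(env_b, n_lags) @ w_b, eeg_ch)[0, 1]
    return "A" if r_a > r_b else "B"

rng = np.random.default_rng(4)
env_a, env_b = rng.standard_normal(4000), rng.standard_normal(4000)
eeg_ch = np.roll(env_a, 5) + rng.standard_normal(4000)   # channel tracks A
print(attended_stream(eeg_ch, env_a, env_b))             # "A"
```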
Affiliation(s)
- Lorenz Fiedler: Department of Psychology, University of Lübeck, Lübeck, Germany; Max Planck Research Group 'Auditory Cognition', Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany