Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: O'Sullivan J, Herrero J, Smith E, Schevon C, McKhann GM, Sheth SA, Mehta AD, Mesgarani N. Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception. Neuron 2019;104:1195-1209.e3. [PMID: 31648900 DOI: 10.1016/j.neuron.2019.09.007] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 07/11/2019] [Accepted: 09/06/2019] [Indexed: 11/15/2022]

For:	O'Sullivan J, Herrero J, Smith E, Schevon C, McKhann GM, Sheth SA, Mehta AD, Mesgarani N. Hierarchical Encoding of Attended Auditory Objects in Multi-talker Speech Perception. Neuron 2019;104:1195-1209.e3. [PMID: 31648900 DOI: 10.1016/j.neuron.2019.09.007] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2019] [Revised: 07/11/2019] [Accepted: 09/06/2019] [Indexed: 11/15/2022]

Number

Cited by Other Article(s)

Joshi N, Ng Y, Thakkar K, Duque D, Yin P, Fritz J, Elhilali M, Shamma S. Temporal Coherence Shapes Cortical Responses to Speech Mixtures in a Ferret Cocktail Party. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.21.595171. [PMID: 38915590 PMCID: PMC11195067 DOI: 10.1101/2024.05.21.595171] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/26/2024]

Abstract

Segregation of complex sounds such as speech, music and animal vocalizations as they simultaneously emanate from multiple sources (referred to as the "cocktail party problem") is a remarkable ability that is common in humans and animals alike. The neural underpinnings of this process have been extensively studied behaviorally and physiologically in non-human animals primarily with simplified sounds (tones and noise sequences). In humans, segregation experiments utilizing more complex speech mixtures are common; but physiological experiments have relied on EEG/MEG/ECoG recordings that sample activity from thousands of neurons, often obscuring the detailed processes that give rise to the observed segregation. The present study combines the insights from animal single-unit physiology with segregation of speech-like mixtures. Ferrets were trained to attend to a female voice and detect a target word, both in presence or absence of a concurrent, equally salient male voice. Single neuron recordings were obtained from primary and secondary ferret auditory cortical fields, as well as frontal cortex. During task performance, representation of the female words became more enhanced relative to those of the (distractor) male in all cortical regions, especially in the higher auditory cortical field. Analysis of the temporal and spectral response characteristics during task performance reveals how speech segregation gradually emerges in the auditory cortex. A computational model evaluated on the same voice mixtures replicates and extends these results to different attentional targets (attention to female or male voices). These findings are consistent with the temporal coherence theory whereby attention to a target voice anchors neural activity in cortical networks hence binding together channels that are coherently temporally-modulated with the target, and ultimately forming a common auditory stream.

Collapse

Lu K, Dutta K, Mohammed A, Elhilali M, Shamma S. Temporal-Coherence Induces Binding of Responses to Sound Sequences in Ferret Auditory Cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.21.595170. [PMID: 38854125 PMCID: PMC11160575 DOI: 10.1101/2024.05.21.595170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2024]

van der Heijden K, Patel P, Bickel S, Herrero JL, Mehta AD, Mesgarani N. Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.13.593814. [PMID: 38798551 PMCID: PMC11118436 DOI: 10.1101/2024.05.13.593814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2024]

Abstract

Listeners readily extract multi-dimensional auditory objects such as a 'localized talker' from complex acoustic scenes with multiple talkers. Yet, the neural mechanisms underlying simultaneous encoding and linking of different sound features - for example, a talker's voice and location - are poorly understood. We analyzed invasive intracranial recordings in neurosurgical patients attending to a localized talker in real-life cocktail party scenarios. We found that sensitivity to an individual talker's voice and location features was distributed throughout auditory cortex and that neural sites exhibited a gradient from sensitivity to a single feature to joint sensitivity to both features. On a population level, cortical response patterns of both dual-feature sensitive sites but also single-feature sensitive sites revealed simultaneous encoding of an attended talker's voice and location features. However, for single-feature sensitive sites, the representation of the primary feature was more precise. Further, sites which selective tracked an attended speech stream concurrently encoded an attended talker's voice and location features, indicating that such sites combine selective tracking of an attended auditory object with encoding of the object's features. Finally, we found that attending a localized talker selectively enhanced temporal coherence between single-feature voice sensitive sites and single-feature location sensitive sites, providing an additional mechanism for linking voice and location in multi-talker scenes. These results demonstrate that a talker's voice and location features are linked during multi-dimensional object formation in naturalistic multi-talker scenes by joint population coding as well as by temporal coherence between neural sites.

SIGNIFICANCE STATEMENT

Listeners effortlessly extract auditory objects from complex acoustic scenes consisting of multiple sound sources in naturalistic, spatial sound scenes. Yet, how the brain links different sound features to form a multi-dimensional auditory object is poorly understood. We investigated how neural responses encode and integrate an attended talker's voice and location features in spatial multi-talker sound scenes to elucidate which neural mechanisms underlie simultaneous encoding and linking of different auditory features. Our results show that joint population coding as well as temporal coherence mechanisms contribute to distributed multi-dimensional auditory object encoding. These findings shed new light on cortical functional specialization and multidimensional auditory object formation in complex, naturalistic listening scenes.

HIGHLIGHTS

Cortical responses to an single talker exhibit a distributed gradient, ranging from sites that are sensitive to both a talker's voice and location (dual-feature sensitive sites) to sites that are sensitive to either voice or location (single-feature sensitive sites).Population response patterns of dual-feature sensitive sites encode voice and location features of the attended talker in multi-talker scenes jointly and with equal precision.Despite their sensitivity to a single feature at the level of individual cortical sites, population response patterns of single-feature sensitive sites also encode location and voice features of a talker jointly, but with higher precision for the feature they are primarily sensitive to.Neural sites which selectively track an attended speech stream concurrently encode the attended talker's voice and location features.Attention selectively enhances temporal coherence between voice and location selective sites over time.Joint population coding as well as temporal coherence mechanisms underlie distributed multi-dimensional auditory object encoding in auditory cortex.

Collapse

Puschmann S, Regev M, Fakhar K, Zatorre RJ, Thiel CM. Attention-Driven Modulation of Auditory Cortex Activity during Selective Listening in a Multispeaker Setting. J Neurosci 2024;44:e1157232023. [PMID: 38388426 PMCID: PMC11007309 DOI: 10.1523/jneurosci.1157-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2023] [Revised: 10/30/2023] [Accepted: 11/05/2023] [Indexed: 02/24/2024] Open

Viswanathan V, Rupp KM, Hect JL, Harford EE, Holt LL, Abel TJ. Intracranial Mapping of Response Latencies and Task Effects for Spoken Syllable Processing in the Human Brain. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.05.588349. [PMID: 38617227 PMCID: PMC11014624 DOI: 10.1101/2024.04.05.588349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/16/2024]

Tune S, Obleser J. Neural attentional filters and behavioural outcome follow independent individual trajectories over the adult lifespan. eLife 2024;12:RP92079. [PMID: 38470243 DOI: 10.7554/elife.92079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/13/2024] Open

Wikman P, Salmela V, Sjöblom E, Leminen M, Laine M, Alho K. Attention to audiovisual speech shapes neural processing through feedback-feedforward loops between different nodes of the speech network. PLoS Biol 2024;22:e3002534. [PMID: 38466713 PMCID: PMC10957087 DOI: 10.1371/journal.pbio.3002534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2023] [Revised: 03/21/2024] [Accepted: 01/30/2024] [Indexed: 03/13/2024] Open

Gao J, Chen H, Fang M, Ding N. Original speech and its echo are segregated and separately processed in the human brain. PLoS Biol 2024;22:e3002498. [PMID: 38358954 PMCID: PMC10868781 DOI: 10.1371/journal.pbio.3002498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2023] [Accepted: 01/15/2024] [Indexed: 02/17/2024] Open

Wang Q, Luo L, Xu N, Wang J, Yang R, Chen G, Ren J, Luan G, Fang F. Neural response properties predict perceived contents and locations elicited by intracranial electrical stimulation of human auditory cortex. Cereb Cortex 2024;34:bhad517. [PMID: 38185991 DOI: 10.1093/cercor/bhad517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2023] [Revised: 12/09/2023] [Accepted: 12/10/2023] [Indexed: 01/09/2024] Open

Affiliation(s)

Qian Wang School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China National Key Laboratory of General Artificial Intelligence, Peking University, Beijing 100871, China
Lu Luo School of Psychology, Beijing Sport University, Beijing 100084, China
Na Xu Division of Brain Sciences, Changping Laboratory, Beijing 102206, China
Jing Wang Department of Neurology, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China
Ruolin Yang School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
Guanpeng Chen School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
Jie Ren Department of Functional Neurosurgery, Beijing Key Laboratory of Epilepsy, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China Epilepsy Center, Kunming Sanbo Brain Hospital, Kunming 650100 China
Guoming Luan Department of Functional Neurosurgery, Beijing Key Laboratory of Epilepsy, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China Beijing Institute for Brain Disorders, Beijing 100069, China
Fang Fang School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing 100871, China IDG/McGovern Institute for Brain Research, Peking University, Beijing 100871, China Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China

Collapse

Commuri V, Kulasingham JP, Simon JZ. Cortical responses time-locked to continuous speech in the high-gamma band depend on selective attention. Front Neurosci 2023;17:1264453. [PMID: 38156264 PMCID: PMC10752935 DOI: 10.3389/fnins.2023.1264453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2023] [Accepted: 11/21/2023] [Indexed: 12/30/2023] Open

Schüller A, Schilling A, Krauss P, Rampp S, Reichenbach T. Attentional Modulation of the Cortical Contribution to the Frequency-Following Response Evoked by Continuous Speech. J Neurosci 2023;43:7429-7440. [PMID: 37793908 PMCID: PMC10621774 DOI: 10.1523/jneurosci.1247-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2023] [Revised: 09/07/2023] [Accepted: 09/21/2023] [Indexed: 10/06/2023] Open

Abstract

Selective attention to one of several competing speakers is required for comprehending a target speaker among other voices and for successful communication with them. It moreover has been found to involve the neural tracking of low-frequency speech rhythms in the auditory cortex. Effects of selective attention have also been found in subcortical neural activities, in particular regarding the frequency-following response related to the fundamental frequency of speech (speech-FFR). Recent investigations have, however, shown that the speech-FFR contains cortical contributions as well. It remains unclear whether these are also modulated by selective attention. Here we used magnetoencephalography to assess the attentional modulation of the cortical contributions to the speech-FFR. We presented both male and female participants with two competing speech signals and analyzed the cortical responses during attentional switching between the two speakers. Our findings revealed robust attentional modulation of the cortical contribution to the speech-FFR: the neural responses were higher when the speaker was attended than when they were ignored. We also found that, regardless of attention, a voice with a lower fundamental frequency elicited a larger cortical contribution to the speech-FFR than a voice with a higher fundamental frequency. Our results show that the attentional modulation of the speech-FFR does not only occur subcortically but extends to the auditory cortex as well.SIGNIFICANCE STATEMENT Understanding speech in noise requires attention to a target speaker. One of the speech features that a listener can use to identify a target voice among others and attend it is the fundamental frequency, together with its higher harmonics. The fundamental frequency arises from the opening and closing of the vocal folds and is tracked by high-frequency neural activity in the auditory brainstem and in the cortex. Previous investigations showed that the subcortical neural tracking is modulated by selective attention. Here we show that attention affects the cortical tracking of the fundamental frequency as well: it is stronger when a particular voice is attended than when it is ignored.

Collapse

Commuri V, Kulasingham JP, Simon JZ. Cortical Responses Time-Locked to Continuous Speech in the High-Gamma Band Depend on Selective Attention. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.20.549567. [PMID: 37546895 PMCID: PMC10401961 DOI: 10.1101/2023.07.20.549567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/08/2023]

Ying R, Hamlette L, Nikoobakht L, Balaji R, Miko N, Caras ML. Organization of orbitofrontal-auditory pathways in the Mongolian gerbil. J Comp Neurol 2023;531:1459-1481. [PMID: 37477903 PMCID: PMC10529810 DOI: 10.1002/cne.25525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2023] [Revised: 06/11/2023] [Accepted: 06/26/2023] [Indexed: 07/22/2023]

Grijseels DM, Prendergast BJ, Gorman JC, Miller CT. The neurobiology of vocal communication in marmosets. Ann N Y Acad Sci 2023;1528:13-28. [PMID: 37615212 PMCID: PMC10592205 DOI: 10.1111/nyas.15057] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/25/2023]

Raghavan VS, O’Sullivan J, Bickel S, Mehta AD, Mesgarani N. Distinct neural encoding of glimpsed and masked speech in multitalker situations. PLoS Biol 2023;21:e3002128. [PMID: 37279203 PMCID: PMC10243639 DOI: 10.1371/journal.pbio.3002128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Accepted: 04/19/2023] [Indexed: 06/08/2023] Open

Ahmed F, Nidiffer AR, O'Sullivan AE, Zuk NJ, Lalor EC. The integration of continuous audio and visual speech in a cocktail-party environment depends on attention. Neuroimage 2023;274:120143. [PMID: 37121375 DOI: 10.1016/j.neuroimage.2023.120143] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2022] [Revised: 03/17/2023] [Accepted: 04/27/2023] [Indexed: 05/02/2023] Open

Manting CL, Gulyas B, Ullén F, Lundqvist D. Steady-state responses to concurrent melodies: source distribution, top-down, and bottom-up attention. Cereb Cortex 2023;33:3053-3066. [PMID: 35858223 PMCID: PMC10016039 DOI: 10.1093/cercor/bhac260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Revised: 06/03/2022] [Accepted: 06/03/2022] [Indexed: 11/13/2022] Open

Makov S, Pinto D, Har-Shai Yahav P, Miller LM, Zion Golumbic E. "Unattended, distracting or irrelevant": Theoretical implications of terminological choices in auditory selective attention research. Cognition 2023;231:105313. [PMID: 36344304 DOI: 10.1016/j.cognition.2022.105313] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2022] [Revised: 09/30/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022]

Simon JZ, Commuri V, Kulasingham JP. Time-locked auditory cortical responses in the high-gamma band: A window into primary auditory cortex. Front Neurosci 2022;16:1075369. [PMID: 36570848 PMCID: PMC9773383 DOI: 10.3389/fnins.2022.1075369] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2022] [Accepted: 11/24/2022] [Indexed: 12/13/2022] Open

Abstract

Primary auditory cortex is a critical stage in the human auditory pathway, a gateway between subcortical and higher-level cortical areas. Receiving the output of all subcortical processing, it sends its output on to higher-level cortex. Non-invasive physiological recordings of primary auditory cortex using electroencephalography (EEG) and magnetoencephalography (MEG), however, may not have sufficient specificity to separate responses generated in primary auditory cortex from those generated in underlying subcortical areas or neighboring cortical areas. This limitation is important for investigations of effects of top-down processing (e.g., selective-attention-based) on primary auditory cortex: higher-level areas are known to be strongly influenced by top-down processes, but subcortical areas are often assumed to perform strictly bottom-up processing. Fortunately, recent advances have made it easier to isolate the neural activity of primary auditory cortex from other areas. In this perspective, we focus on time-locked responses to stimulus features in the high gamma band (70-150 Hz) and with early cortical latency (∼40 ms), intermediate between subcortical and higher-level areas. We review recent findings from physiological studies employing either repeated simple sounds or continuous speech, obtaining either a frequency following response (FFR) or temporal response function (TRF). The potential roles of top-down processing are underscored, and comparisons with invasive intracranial EEG (iEEG) and animal model recordings are made. We argue that MEG studies employing continuous speech stimuli may offer particular benefits, in that only a few minutes of speech generates robust high gamma responses from bilateral primary auditory cortex, and without measurable interference from subcortical or higher-level areas.

Collapse

Lee JH, Shim H, Gantz B, Choi I. Strength of Attentional Modulation on Cortical Auditory Evoked Responses Correlates with Speech-in-Noise Performance in Bimodal Cochlear Implant Users. Trends Hear 2022;26:23312165221141143. [PMID: 36464791 PMCID: PMC9726851 DOI: 10.1177/23312165221141143] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Gillis M, Van Canneyt J, Francart T, Vanthornhout J. Neural tracking as a diagnostic tool to assess the auditory pathway. Hear Res 2022;426:108607. [PMID: 36137861 DOI: 10.1016/j.heares.2022.108607] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/26/2021] [Revised: 08/11/2022] [Accepted: 09/12/2022] [Indexed: 11/20/2022]

Zhou LF, Zhao D, Cui X, Guo B, Zhu F, Feng C, Wang J, Meng M. Separate neural subsystems support goal-directed speech listening. Neuroimage 2022;263:119613. [PMID: 36075539 DOI: 10.1016/j.neuroimage.2022.119613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2022] [Revised: 08/30/2022] [Accepted: 09/04/2022] [Indexed: 10/31/2022] Open

Abstract

How do humans excel at tracking the narrative of a particular speaker with a distracting noisy background? This feat places great demands on the collaboration between speech processing and goal-related regulatory functions. Here, we propose that separate subsystems with different cross-task dynamic activity properties and distinct functional purposes support goal-directed speech listening. We adopted a naturalistic dichotic speech listening paradigm in which listeners were instructed to attend to only one narrative from two competing inputs. Using functional magnetic resonance imaging with inter- and intra-subject correlation techniques, we discovered a dissociation in response consistency in temporal, parietal and frontal brain areas as the task demand varied. Specifically, some areas in the bilateral temporal cortex (SomMotB_Aud and TempPar) and lateral prefrontal cortex (DefaultB_PFCl and ContA_PFCl) always showed consistent activation across subjects and across scan runs, regardless of the task demand. In contrast, some areas in the parietal cortex (DefaultA_pCunPCC and ContC_pCun) responded reliably only when the task goal remained the same. These results suggested two dissociated functional neural networks that were independently validated by performing a data-driven clustering analysis of voxelwise functional connectivity patterns. A subsequent meta-analysis revealed distinct functional profiles for these two brain correlation maps. The different-task correlation map was strongly associated with language-related processes (e.g., listening, speech and sentences), whereas the same-task versus different-task correlation map was linked to self-referencing functions (e.g., default mode, theory of mind and autobiographical topics). Altogether, the three-pronged findings revealed two anatomically and functionally dissociated subsystems supporting goal-directed speech listening.

Collapse

Affiliation(s)

Liu-Fang Zhou Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, China; Key Lab of BI-AI Collaborated Information Behavior, School of Business and Management, Shanghai International Studies University, Shanghai, 201620, China
Dan Zhao Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, China; Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, South China Normal University, Guangzhou, 510631, China
Xuan Cui Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, South China Normal University, Guangzhou, 510631, China
Bingbing Guo Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, South China Normal University, Guangzhou, 510631, China; School of Teacher Education, Nanjing Xiaozhuang University, Nanjing, 211171, China
Fangwei Zhu Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, China; Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, South China Normal University, Guangzhou, 510631, China
Chunliang Feng Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, South China Normal University, Guangzhou, 510631, China
Jinhui Wang Key Laboratory of Brain, Cognition and Education Sciences (South China Normal University), Ministry of Education, China; Center for Studies of Psychological Application, Guangdong Key Laboratory of Mental Health and Cognitive Science, Institute for Brain Research and Rehabilitation, South China Normal University, Guangzhou, 510631, China.
Ming Meng Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, China.

Collapse

Xie Y, Ma J. How to discern external acoustic waves in a piezoelectric neuron under noise? J Biol Phys 2022;48:339-353. [PMID: 35948818 PMCID: PMC9411441 DOI: 10.1007/s10867-022-09611-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2022] [Accepted: 07/27/2022] [Indexed: 10/15/2022] Open

Abstract

Biological neurons keep sensitive to external stimuli and appropriate firing modes can be triggered to give effective response to external chemical and physical signals. A piezoelectric neural circuit can perceive external voice and nonlinear vibration by generating equivalent piezoelectric voltage, which can generate an equivalent trans-membrane current for inducing a variety of firing modes in the neural activities. Biological neurons can receive external stimuli from more ion channels and synapse synchronously, but the further encoding and priority in mode selection are competitive. In particular, noisy disturbance and electromagnetic radiation make it more difficult in signals identification and mode selection in the firing patterns of neurons driven by multi-channel signals. In this paper, two different periodic signals accompanied by noise are used to excite the piezoelectric neural circuit, and the signal processing in the piezoelectric neuron driven by acoustic waves under noise is reproduced and explained. The physical energy of the piezoelectric neural circuit and Hamilton energy in the neuron driven by mixed signals are calculated to explain the biophysical mechanism of auditory neuron when external stimuli are applied. It is found that the neuron prefers to respond to the external stimulus with higher physical energy and the signal which can increase the Hamilton energy of the neuron. For example, stronger inputs used to inject higher energy and it is detected and responded more sensitively. The involvement of noise is helpful to detect the external signal under stochastic resonance, and the additive noise changes the excitability of neuron as the external stimulus. The results indicate that energy controls the firing patterns and mode selection in neurons, and it provides clues to control the neural activities by injecting appropriate energy into the neurons and network.

Collapse

Xu Z, Bai Y, Zhao R, Zheng Q, Ni G, Ming D. Auditory attention decoding from EEG-based Mandarin speech envelope reconstruction. Hear Res 2022;422:108552. [PMID: 35714555 DOI: 10.1016/j.heares.2022.108552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Revised: 06/01/2022] [Accepted: 06/08/2022] [Indexed: 11/23/2022]

Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception. Curr Biol 2022;32:3971-3986.e4. [PMID: 35973430 DOI: 10.1016/j.cub.2022.07.047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Revised: 06/08/2022] [Accepted: 07/19/2022] [Indexed: 11/20/2022]

Myers JC, Smith EH, Leszczynski M, O'Sullivan J, Yates MJ, McKhann G, Mesgarani N, Schroeder C, Schevon C, Sheth SA. The Spatial Reach of Neuronal Coherence and Spike-Field Coupling across the Human Neocortex. J Neurosci 2022;42:6285-6294. [PMID: 35790403 PMCID: PMC9374135 DOI: 10.1523/jneurosci.0050-22.2022] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2022] [Revised: 04/21/2022] [Accepted: 05/25/2022] [Indexed: 11/21/2022] Open

Abstract

Neuronal coherence is thought to be a fundamental mechanism of communication in the brain, where synchronized field potentials coordinate synaptic and spiking events to support plasticity and learning. Although the spread of field potentials has garnered great interest, little is known about the spatial reach of phase synchronization, or neuronal coherence. Functional connectivity between different brain regions is known to occur across long distances, but the locality of synchronization across the neocortex is understudied. Here we used simultaneous recordings from electrocorticography (ECoG) grids and high-density microelectrode arrays to estimate the spatial reach of neuronal coherence and spike-field coherence (SFC) across frontal, temporal, and occipital cortices during cognitive tasks in humans. We observed the strongest coherence within a 2-3 cm distance from the microelectrode arrays, potentially defining an effective range for local communication. This range was relatively consistent across brain regions, spectral frequencies, and cognitive tasks. The magnitude of coherence showed power law decay with increasing distance from the microelectrode arrays, where the highest coherence occurred between ECoG contacts, followed by coherence between ECoG and deep cortical local field potential (LFP), and then SFC (i.e., ECoG > LFP > SFC). The spectral frequency of coherence also affected its magnitude. Alpha coherence (8-14 Hz) was generally higher than other frequencies for signals nearest the microelectrode arrays, whereas delta coherence (1-3 Hz) was higher for signals that were farther away. Action potentials in all brain regions were most coherent with the phase of alpha oscillations, which suggests that alpha waves could play a larger, more spatially local role in spike timing than other frequencies. These findings provide a deeper understanding of the spatial and spectral dynamics of neuronal synchronization, further advancing knowledge about how activity propagates across the human brain.SIGNIFICANCE STATEMENT Coherence is theorized to facilitate information transfer across cerebral space by providing a convenient electrophysiological mechanism to modulate membrane potentials in spatiotemporally complex patterns. Our work uses a multiscale approach to evaluate the spatial reach of phase coherence and spike-field coherence during cognitive tasks in humans. Locally, coherence can reach up to 3 cm around a given area of neocortex. The spectral properties of coherence revealed that alpha phase-field and spike-field coherence were higher within ranges <2 cm, whereas lower-frequency delta coherence was higher for contacts farther away. Spatiotemporally shared information (i.e., coherence) across neocortex seems to reach farther than field potentials alone.

Collapse

Brodbeck C, Simon JZ. Cortical tracking of voice pitch in the presence of multiple speakers depends on selective attention. Front Neurosci 2022;16:828546. [PMID: 36003957 PMCID: PMC9393379 DOI: 10.3389/fnins.2022.828546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2021] [Accepted: 07/08/2022] [Indexed: 11/13/2022] Open

Decoding Selective Auditory Attention with EEG using A Transformer Model. Methods 2022;204:410-417. [DOI: 10.1016/j.ymeth.2022.04.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2022] [Revised: 03/20/2022] [Accepted: 04/14/2022] [Indexed: 11/23/2022] Open

Wang L, Liu H, Zhang X, Zhao S, Guo L, Han J, Hu X. Exploring Hierarchical Auditory Representation via a Neural Encoding Model. Front Neurosci 2022;16:843988. [PMID: 35401085 PMCID: PMC8987159 DOI: 10.3389/fnins.2022.843988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/27/2021] [Accepted: 02/16/2022] [Indexed: 11/13/2022] Open

Di Liberto GM, Hjortkjær J, Mesgarani N. Editorial: Neural Tracking: Closing the Gap Between Neurophysiology and Translational Medicine. Front Neurosci 2022;16:872600. [PMID: 35368278 PMCID: PMC8966872 DOI: 10.3389/fnins.2022.872600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2022] [Accepted: 02/17/2022] [Indexed: 11/25/2022] Open

Teoh ES, Ahmed F, Lalor EC. Attention Differentially Affects Acoustic and Phonetic Feature Encoding in a Multispeaker Environment. J Neurosci 2022;42:682-691. [PMID: 34893546 PMCID: PMC8805628 DOI: 10.1523/jneurosci.1455-20.2021] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2020] [Revised: 09/28/2021] [Accepted: 09/29/2021] [Indexed: 11/21/2022] Open

Abstract

Humans have the remarkable ability to selectively focus on a single talker in the midst of other competing talkers. The neural mechanisms that underlie this phenomenon remain incompletely understood. In particular, there has been longstanding debate over whether attention operates at an early or late stage in the speech processing hierarchy. One way to better understand this is to examine how attention might differentially affect neurophysiological indices of hierarchical acoustic and linguistic speech representations. In this study, we do this by using encoding models to identify neural correlates of speech processing at various levels of representation. Specifically, we recorded EEG from fourteen human subjects (nine female and five male) during a "cocktail party" attention experiment. Model comparisons based on these data revealed phonetic feature processing for attended, but not unattended speech. Furthermore, we show that attention specifically enhances isolated indices of phonetic feature processing, but that such attention effects are not apparent for isolated measures of acoustic processing. These results provide new insights into the effects of attention on different prelexical representations of speech, insights that complement recent anatomic accounts of the hierarchical encoding of attended speech. Furthermore, our findings support the notion that, for attended speech, phonetic features are processed as a distinct stage, separate from the processing of the speech acoustics.SIGNIFICANCE STATEMENT Humans are very good at paying attention to one speaker in an environment with multiple speakers. However, the details of how attended and unattended speech are processed differently by the brain is not completely clear. Here, we explore how attention affects the processing of the acoustic sounds of speech as well as the mapping of those sounds onto categorical phonetic features. We find evidence of categorical phonetic feature processing for attended, but not unattended speech. Furthermore, we find evidence that categorical phonetic feature processing is enhanced by attention, but acoustic processing is not. These findings add an important new layer in our understanding of how the human brain solves the cocktail party problem.

Collapse

Ylinen A, Wikman P, Leminen M, Alho K. Task-dependent cortical activations during selective attention to audiovisual speech. Brain Res 2022;1775:147739. [PMID: 34843702 DOI: 10.1016/j.brainres.2021.147739] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2021] [Revised: 10/21/2021] [Accepted: 11/21/2021] [Indexed: 11/28/2022]

Petersen EB. Hearing-Aid Directionality Improves Neural Speech Tracking in Older Hearing-Impaired Listeners. Trends Hear 2022;26:23312165221099894. [PMID: 35730193 PMCID: PMC9228639 DOI: 10.1177/23312165221099894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Effect of Noise Reduction on Cortical Speech-in-Noise Processing and Its Variance due to Individual Noise Tolerance. Ear Hear 2021;43:849-861. [PMID: 34751679 PMCID: PMC9010348 DOI: 10.1097/aud.0000000000001144] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

Abstract

OBJECTIVES

Despite the widespread use of noise reduction (NR) in modern digital hearing aids, our neurophysiological understanding of how NR affects speech-in-noise perception and why its effect is variable is limited. The current study aimed to (1) characterize the effect of NR on the neural processing of target speech and (2) seek neural determinants of individual differences in the NR effect on speech-in-noise performance, hypothesizing that an individual's own capability to inhibit background noise would inversely predict NR benefits in speech-in-noise perception.

DESIGN

Thirty-six adult listeners with normal hearing participated in the study. Behavioral and electroencephalographic responses were simultaneously obtained during a speech-in-noise task in which natural monosyllabic words were presented at three different signal-to-noise ratios, each with NR off and on. A within-subject analysis assessed the effect of NR on cortical evoked responses to target speech in the temporal-frontal speech and language brain regions, including supramarginal gyrus and inferior frontal gyrus in the left hemisphere. In addition, an across-subject analysis related an individual's tolerance to noise, measured as the amplitude ratio of auditory-cortical responses to target speech and background noise, to their speech-in-noise performance.

RESULTS

At the group level, in the poorest signal-to-noise ratio condition, NR significantly increased early supramarginal gyrus activity and decreased late inferior frontal gyrus activity, indicating a switch to more immediate lexical access and less effortful cognitive processing, although no improvement in behavioral performance was found. The across-subject analysis revealed that the cortical index of individual noise tolerance significantly correlated with NR-driven changes in speech-in-noise performance.

CONCLUSIONS

NR can facilitate speech-in-noise processing despite no improvement in behavioral performance. Findings from the current study also indicate that people with lower noise tolerance are more likely to get more benefits from NR. Overall, results suggest that future research should take a mechanistic approach to NR outcomes and individual noise tolerance.

Collapse

Decoding Object-Based Auditory Attention from Source-Reconstructed MEG Alpha Oscillations. J Neurosci 2021;41:8603-8617. [PMID: 34429378 DOI: 10.1523/jneurosci.0583-21.2021] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2021] [Revised: 08/08/2021] [Accepted: 08/11/2021] [Indexed: 11/21/2022] Open

Abstract

How do we attend to relevant auditory information in complex naturalistic scenes? Much research has focused on detecting which information is attended, without regarding underlying top-down control mechanisms. Studies investigating attentional control generally manipulate and cue specific features in simple stimuli. However, in naturalistic scenes it is impossible to dissociate relevant from irrelevant information based on low-level features. Instead, the brain has to parse and select auditory objects of interest. The neural underpinnings of object-based auditory attention remain not well understood. Here we recorded MEG while 15 healthy human subjects (9 female) prepared for the repetition of an auditory object presented in one of two overlapping naturalistic auditory streams. The stream containing the repetition was prospectively cued with 70% validity. Crucially, this task could not be solved by attending low-level features, but only by processing the objects fully. We trained a linear classifier on the cortical distribution of source-reconstructed oscillatory activity to distinguish which auditory stream was attended. We could successfully classify the attended stream from alpha (8-14 Hz) activity in anticipation of repetition onset. Importantly, attention could only be classified from trials in which subjects subsequently detected the repetition, but not from miss trials. Behavioral relevance was further supported by a correlation between classification accuracy and detection performance. Decodability was not sustained throughout stimulus presentation, but peaked shortly before repetition onset, suggesting that attention acted transiently according to temporal expectations. We thus demonstrate anticipatory alpha oscillations to underlie top-down control of object-based auditory attention in complex naturalistic scenes.SIGNIFICANCE STATEMENT In everyday life, we often find ourselves bombarded with auditory information, from which we need to select what is relevant to our current goals. Previous research has highlighted how we attend to specific highly controlled aspects of the auditory input. Although invaluable, it is still unclear how this relates to attentional control in naturalistic auditory scenes. Here we used the high precision of magnetoencephalography in space and time to investigate the brain mechanisms underlying top-down control of object-based attention in ecologically valid sound scenes. We show that rhythmic activity in auditory association cortex at a frequency of ∼10 Hz (alpha waves) controls attention to currently relevant segments within the auditory scene and predicts whether these segments are subsequently detected.

Collapse

Rezaeizadeh M, Shamma S. Binding the Acoustic Features of an Auditory Source through Temporal Coherence. Cereb Cortex Commun 2021;2:tgab060. [PMID: 34746791 PMCID: PMC8567849 DOI: 10.1093/texcom/tgab060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2021] [Revised: 09/29/2021] [Accepted: 09/30/2021] [Indexed: 11/25/2022] Open

Creating Clarity in Noisy Environments by Using Deep Learning in Hearing Aids. Semin Hear 2021;42:260-281. [PMID: 34594089 PMCID: PMC8463126 DOI: 10.1055/s-0041-1735134] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Lowe MX, Mohsenzadeh Y, Lahner B, Charest I, Oliva A, Teng S. Cochlea to categories: The spatiotemporal dynamics of semantic auditory representations. Cogn Neuropsychol 2021;38:468-489. [PMID: 35729704 PMCID: PMC10589059 DOI: 10.1080/02643294.2022.2085085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2021] [Revised: 03/31/2022] [Accepted: 05/25/2022] [Indexed: 10/17/2022]

Kiremitçi I, Yilmaz Ö, Çelik E, Shahdloo M, Huth AG, Çukur T. Attentional Modulation of Hierarchical Speech Representations in a Multitalker Environment. Cereb Cortex 2021;31:4986-5005. [PMID: 34115102 PMCID: PMC8491717 DOI: 10.1093/cercor/bhab136] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2020] [Revised: 04/01/2021] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open

Zuk NJ, Murphy JW, Reilly RB, Lalor EC. Envelope reconstruction of speech and music highlights stronger tracking of speech at low frequencies. PLoS Comput Biol 2021;17:e1009358. [PMID: 34534211 PMCID: PMC8480853 DOI: 10.1371/journal.pcbi.1009358] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2021] [Revised: 09/29/2021] [Accepted: 08/18/2021] [Indexed: 11/19/2022] Open

Homma NY, Bajo VM. Lemniscal Corticothalamic Feedback in Auditory Scene Analysis. Front Neurosci 2021;15:723893. [PMID: 34489635 PMCID: PMC8417129 DOI: 10.3389/fnins.2021.723893] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2021] [Accepted: 07/30/2021] [Indexed: 12/15/2022] Open

Keshavarzi M, Varano E, Reichenbach T. Cortical Tracking of a Background Speaker Modulates the Comprehension of a Foreground Speech Signal. J Neurosci 2021;41:5093-5101. [PMID: 33926996 PMCID: PMC8197648 DOI: 10.1523/jneurosci.3200-20.2021] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2020] [Revised: 02/23/2021] [Accepted: 04/12/2021] [Indexed: 11/21/2022] Open

Abstract

Understanding speech in background noise is a difficult task. The tracking of speech rhythms such as the rate of syllables and words by cortical activity has emerged as a key neural mechanism for speech-in-noise comprehension. In particular, recent investigations have used transcranial alternating current stimulation (tACS) with the envelope of a speech signal to influence the cortical speech tracking, demonstrating that this type of stimulation modulates comprehension and therefore providing evidence of a functional role of the cortical tracking in speech processing. Cortical activity has been found to track the rhythms of a background speaker as well, but the functional significance of this neural response remains unclear. Here we use a speech-comprehension task with a target speaker in the presence of a distractor voice to show that tACS with the speech envelope of the target voice as well as tACS with the envelope of the distractor speaker both modulate the comprehension of the target speech. Because the envelope of the distractor speech does not carry information about the target speech stream, the modulation of speech comprehension through tACS with this envelope provides evidence that the cortical tracking of the background speaker affects the comprehension of the foreground speech signal. The phase dependency of the resulting modulation of speech comprehension is, however, opposite to that obtained from tACS with the envelope of the target speech signal. This suggests that the cortical tracking of the ignored speech stream and that of the attended speech stream may compete for neural resources.SIGNIFICANCE STATEMENT Loud environments such as busy pubs or restaurants can make conversation difficult. However, they also allow us to eavesdrop into other conversations that occur in the background. In particular, we often notice when somebody else mentions our name, even if we have not been listening to that person. However, the neural mechanisms by which background speech is processed remain poorly understood. Here we use transcranial alternating current stimulation, a technique through which neural activity in the cerebral cortex can be influenced, to show that cortical responses to rhythms in the distractor speech modulate the comprehension of the target speaker. Our results provide evidence that the cortical tracking of background speech rhythms plays a functional role in speech processing.

Collapse

Bröhl F, Kayser C. Delta/theta band EEG differentially tracks low and high frequency speech-derived envelopes. Neuroimage 2021;233:117958. [PMID: 33744458 PMCID: PMC8204264 DOI: 10.1016/j.neuroimage.2021.117958] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2020] [Revised: 03/08/2021] [Accepted: 03/09/2021] [Indexed: 11/01/2022] Open

Wang L, Wu EX, Chen F. EEG-based auditory attention decoding using speech-level-based segmented computational models. J Neural Eng 2021;18. [PMID: 33957606 DOI: 10.1088/1741-2552/abfeba] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2021] [Accepted: 05/06/2021] [Indexed: 11/11/2022]

Abstract

Objective.Auditory attention in complex scenarios can be decoded by electroencephalography (EEG)-based cortical speech-envelope tracking. The relative root-mean-square (RMS) intensity is a valuable cue for the decomposition of speech into distinct characteristic segments. To improve auditory attention decoding (AAD) performance, this work proposed a novel segmented AAD approach to decode target speech envelopes from different RMS-level-based speech segments.Approach.Speech was decomposed into higher- and lower-RMS-level speech segments with a threshold of -10 dB relative RMS level. A support vector machine classifier was designed to identify higher- and lower-RMS-level speech segments, using clean target and mixed speech as reference signals based on corresponding EEG signals recorded when subjects listened to target auditory streams in competing two-speaker auditory scenes. Segmented computational models were developed with the classification results of higher- and lower-RMS-level speech segments. Speech envelopes were reconstructed based on segmented decoding models for either higher- or lower-RMS-level speech segments. AAD accuracies were calculated according to the correlations between actual and reconstructed speech envelopes. The performance of the proposed segmented AAD computational model was compared to those of traditional AAD methods with unified decoding functions.Main results.Higher- and lower-RMS-level speech segments in continuous sentences could be identified robustly with classification accuracies that approximated or exceeded 80% based on corresponding EEG signals at 6 dB, 3 dB, 0 dB, -3 dB and -6 dB signal-to-mask ratios (SMRs). Compared with unified AAD decoding methods, the proposed segmented AAD approach achieved more accurate results in the reconstruction of target speech envelopes and in the detection of attentional directions. Moreover, the proposed segmented decoding method had higher information transfer rates (ITRs) and shorter minimum expected switch times compared with the unified decoder.Significance.This study revealed that EEG signals may be used to classify higher- and lower-RMS-level-based speech segments across a wide range of SMR conditions (from 6 dB to -6 dB). A novel finding was that the specific information in different RMS-level-based speech segments facilitated EEG-based decoding of auditory attention. The significantly improved AAD accuracies and ITRs of the segmented decoding method suggests that this proposed computational model may be an effective method for the application of neuro-controlled brain-computer interfaces in complex auditory scenes.

Collapse

Boos M, Lücke J, Rieger JW. Generalizable dimensions of human cortical auditory processing of speech in natural soundscapes: A data-driven ultra high field fMRI approach. Neuroimage 2021;237:118106. [PMID: 33991696 DOI: 10.1016/j.neuroimage.2021.118106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Accepted: 04/25/2021] [Indexed: 11/27/2022] Open

de Cheveigné A, Slaney M, Fuglsang SA, Hjortkjaer J. Auditory stimulus-response modeling with a match-mismatch task. J Neural Eng 2021;18. [PMID: 33849003 DOI: 10.1088/1741-2552/abf771] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Accepted: 04/13/2021] [Indexed: 11/12/2022]

Mesik J, Ray L, Wojtczak M. Effects of Age on Cortical Tracking of Word-Level Features of Continuous Competing Speech. Front Neurosci 2021;15:635126. [PMID: 33867920 PMCID: PMC8047075 DOI: 10.3389/fnins.2021.635126] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2020] [Accepted: 03/12/2021] [Indexed: 01/17/2023] Open

Abstract

Speech-in-noise comprehension difficulties are common among the elderly population, yet traditional objective measures of speech perception are largely insensitive to this deficit, particularly in the absence of clinical hearing loss. In recent years, a growing body of research in young normal-hearing adults has demonstrated that high-level features related to speech semantics and lexical predictability elicit strong centro-parietal negativity in the EEG signal around 400 ms following the word onset. Here we investigate effects of age on cortical tracking of these word-level features within a two-talker speech mixture, and their relationship with self-reported difficulties with speech-in-noise understanding. While undergoing EEG recordings, younger and older adult participants listened to a continuous narrative story in the presence of a distractor story. We then utilized forward encoding models to estimate cortical tracking of four speech features: (1) word onsets, (2) "semantic" dissimilarity of each word relative to the preceding context, (3) lexical surprisal for each word, and (4) overall word audibility. Our results revealed robust tracking of all features for attended speech, with surprisal and word audibility showing significantly stronger contributions to neural activity than dissimilarity. Additionally, older adults exhibited significantly stronger tracking of word-level features than younger adults, especially over frontal electrode sites, potentially reflecting increased listening effort. Finally, neuro-behavioral analyses revealed trends of a negative relationship between subjective speech-in-noise perception difficulties and the model goodness-of-fit for attended speech, as well as a positive relationship between task performance and the goodness-of-fit, indicating behavioral relevance of these measures. Together, our results demonstrate the utility of modeling cortical responses to multi-talker speech using complex, word-level features and the potential for their use to study changes in speech processing due to aging and hearing loss.

Collapse

Roswandowitz C, Swanborough H, Frühholz S. Categorizing human vocal signals depends on an integrated auditory-frontal cortical network. Hum Brain Mapp 2021;42:1503-1517. [PMID: 33615612 PMCID: PMC7927295 DOI: 10.1002/hbm.25309] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2020] [Revised: 11/20/2020] [Accepted: 11/25/2020] [Indexed: 11/30/2022] Open

Alickovic E, Ng EHN, Fiedler L, Santurette S, Innes-Brown H, Graversen C. Effects of Hearing Aid Noise Reduction on Early and Late Cortical Representations of Competing Talkers in Noise. Front Neurosci 2021;15:636060. [PMID: 33841081 PMCID: PMC8032942 DOI: 10.3389/fnins.2021.636060] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Accepted: 02/26/2021] [Indexed: 11/13/2022] Open

Abstract

OBJECTIVES

Previous research using non-invasive (magnetoencephalography, MEG) and invasive (electrocorticography, ECoG) neural recordings has demonstrated the progressive and hierarchical representation and processing of complex multi-talker auditory scenes in the auditory cortex. Early responses (<85 ms) in primary-like areas appear to represent the individual talkers with almost equal fidelity and are independent of attention in normal-hearing (NH) listeners. However, late responses (>85 ms) in higher-order non-primary areas selectively represent the attended talker with significantly higher fidelity than unattended talkers in NH and hearing-impaired (HI) listeners. Motivated by these findings, the objective of this study was to investigate the effect of a noise reduction scheme (NR) in a commercial hearing aid (HA) on the representation of complex multi-talker auditory scenes in distinct hierarchical stages of the auditory cortex by using high-density electroencephalography (EEG).

DESIGN

We addressed this issue by investigating early (<85 ms) and late (>85 ms) EEG responses recorded in 34 HI subjects fitted with HAs. The HA noise reduction (NR) was either on or off while the participants listened to a complex auditory scene. Participants were instructed to attend to one of two simultaneous talkers in the foreground while multi-talker babble noise played in the background (+3 dB SNR). After each trial, a two-choice question about the content of the attended speech was presented.

RESULTS

Using a stimulus reconstruction approach, our results suggest that the attention-related enhancement of neural representations of target and masker talkers located in the foreground, as well as suppression of the background noise in distinct hierarchical stages is significantly affected by the NR scheme. We found that the NR scheme contributed to the enhancement of the foreground and of the entire acoustic scene in the early responses, and that this enhancement was driven by better representation of the target speech. We found that the target talker in HI listeners was selectively represented in late responses. We found that use of the NR scheme resulted in enhanced representations of the target and masker speech in the foreground and a suppressed representation of the noise in the background in late responses. We found a significant effect of EEG time window on the strengths of the cortical representation of the target and masker.

CONCLUSION

Together, our analyses of the early and late responses obtained from HI listeners support the existing view of hierarchical processing in the auditory cortex. Our findings demonstrate the benefits of a NR scheme on the representation of complex multi-talker auditory scenes in different areas of the auditory cortex in HI listeners.

Collapse

Kim S, Schwalje AT, Liu AS, Gander PE, McMurray B, Griffiths TD, Choi I. Pre- and post-target cortical processes predict speech-in-noise performance. Neuroimage 2021;228:117699. [PMID: 33387631 PMCID: PMC8291856 DOI: 10.1016/j.neuroimage.2020.117699] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2020] [Revised: 11/06/2020] [Accepted: 12/23/2020] [Indexed: 12/19/2022] Open