Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Brodbeck C, Hong LE, Simon JZ. Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech. Curr Biol 2018;28:3976-3983.e5. [PMID: 30503620 DOI: 10.1016/j.cub.2018.10.042] [Citation(s) in RCA: 140] [Impact Index Per Article: 23.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2018] [Revised: 09/06/2018] [Accepted: 10/16/2018] [Indexed: 01/06/2023]

For:	Brodbeck C, Hong LE, Simon JZ. Rapid Transformation from Auditory to Linguistic Representations of Continuous Speech. Curr Biol 2018;28:3976-3983.e5. [PMID: 30503620 DOI: 10.1016/j.cub.2018.10.042] [Citation(s) in RCA: 140] [Impact Index Per Article: 23.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2018] [Revised: 09/06/2018] [Accepted: 10/16/2018] [Indexed: 01/06/2023]

Number

Cited by Other Article(s)

Kulasingham JP, Simon JZ. Algorithms for Estimating Time-Locked Neural Response Components in Cortical Processing of Continuous Speech. IEEE Trans Biomed Eng 2023;70:88-96. [PMID: 35727788 PMCID: PMC9946293 DOI: 10.1109/tbme.2022.3185005] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Abstract

OBJECTIVE

The Temporal Response Function (TRF) is a linear model of neural activity time-locked to continuous stimuli, including continuous speech. TRFs based on speech envelopes typically have distinct components that have provided remarkable insights into the cortical processing of speech. However, current methods may lead to less than reliable estimates of single-subject TRF components. Here, we compare two established methods, in TRF component estimation, and also propose novel algorithms that utilize prior knowledge of these components, bypassing the full TRF estimation.

METHODS

We compared two established algorithms, ridge and boosting, and two novel algorithms based on Subspace Pursuit (SP) and Expectation Maximization (EM), which directly estimate TRF components given plausible assumptions regarding component characteristics. Single-channel, multi-channel, and source-localized TRFs were fit on simulations and real magnetoencephalographic data. Performance metrics included model fit and component estimation accuracy.

RESULTS

Boosting and ridge have comparable performance in component estimation. The novel algorithms outperformed the others in simulations, but not on real data, possibly due to the plausible assumptions not actually being met. Ridge had slightly better model fits on real data compared to boosting, but also more spurious TRF activity.

CONCLUSION

Results indicate that both smooth (ridge) and sparse (boosting) algorithms perform comparably at TRF component estimation. The SP and EM algorithms may be accurate, but rely on assumptions of component characteristics.

SIGNIFICANCE

This systematic comparison establishes the suitability of widely used and novel algorithms for estimating robust TRF components, which is essential for improved subject-specific investigations into the cortical processing of speech.

Collapse

Luo C, Gao Y, Fan J, Liu Y, Yu Y, Zhang X. Compromised word-level neural tracking in the high-gamma band for children with attention deficit hyperactivity disorder. Front Hum Neurosci 2023;17:1174720. [PMID: 37213926 PMCID: PMC10196181 DOI: 10.3389/fnhum.2023.1174720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2023] [Accepted: 04/18/2023] [Indexed: 05/23/2023] Open

Simon JZ, Commuri V, Kulasingham JP. Time-locked auditory cortical responses in the high-gamma band: A window into primary auditory cortex. Front Neurosci 2022;16:1075369. [PMID: 36570848 PMCID: PMC9773383 DOI: 10.3389/fnins.2022.1075369] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2022] [Accepted: 11/24/2022] [Indexed: 12/13/2022] Open

Abstract

Primary auditory cortex is a critical stage in the human auditory pathway, a gateway between subcortical and higher-level cortical areas. Receiving the output of all subcortical processing, it sends its output on to higher-level cortex. Non-invasive physiological recordings of primary auditory cortex using electroencephalography (EEG) and magnetoencephalography (MEG), however, may not have sufficient specificity to separate responses generated in primary auditory cortex from those generated in underlying subcortical areas or neighboring cortical areas. This limitation is important for investigations of effects of top-down processing (e.g., selective-attention-based) on primary auditory cortex: higher-level areas are known to be strongly influenced by top-down processes, but subcortical areas are often assumed to perform strictly bottom-up processing. Fortunately, recent advances have made it easier to isolate the neural activity of primary auditory cortex from other areas. In this perspective, we focus on time-locked responses to stimulus features in the high gamma band (70-150 Hz) and with early cortical latency (∼40 ms), intermediate between subcortical and higher-level areas. We review recent findings from physiological studies employing either repeated simple sounds or continuous speech, obtaining either a frequency following response (FFR) or temporal response function (TRF). The potential roles of top-down processing are underscored, and comparisons with invasive intracranial EEG (iEEG) and animal model recordings are made. We argue that MEG studies employing continuous speech stimuli may offer particular benefits, in that only a few minutes of speech generates robust high gamma responses from bilateral primary auditory cortex, and without measurable interference from subcortical or higher-level areas.

Collapse

Youssofzadeh V, Conant L, Stout J, Ustine C, Humphries C, Gross WL, Shah-Basak P, Mathis J, Awe E, Allen L, DeYoe EA, Carlson C, Anderson CT, Maganti R, Hermann B, Nair VA, Prabhakaran V, Meyerand B, Binder JR, Raghavan M. Late dominance of the right hemisphere during narrative comprehension. Neuroimage 2022;264:119749. [PMID: 36379420 PMCID: PMC9772156 DOI: 10.1016/j.neuroimage.2022.119749] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2022] [Revised: 10/12/2022] [Accepted: 11/11/2022] [Indexed: 11/15/2022] Open

Abstract

PET and fMRI studies suggest that auditory narrative comprehension is supported by a bilateral multilobar cortical network. The superior temporal resolution of magnetoencephalography (MEG) makes it an attractive tool to investigate the dynamics of how different neuroanatomic substrates engage during narrative comprehension. Using beta-band power changes as a marker of cortical engagement, we studied MEG responses during an auditory story comprehension task in 31 healthy adults. The protocol consisted of two runs, each interleaving 7 blocks of the story comprehension task with 15 blocks of an auditorily presented math task as a control for phonological processing, working memory, and attention processes. Sources at the cortical surface were estimated with a frequency-resolved beamformer. Beta-band power was estimated in the frequency range of 16-24 Hz over 1-sec epochs starting from 400 msec after stimulus onset until the end of a story or math problem presentation. These power estimates were compared to 1-second epochs of data before the stimulus block onset. The task-related cortical engagement was inferred from beta-band power decrements. Group-level source activations were statistically compared using non-parametric permutation testing. A story-math contrast of beta-band power changes showed greater bilateral cortical engagement within the fusiform gyrus, inferior and middle temporal gyri, parahippocampal gyrus, and left inferior frontal gyrus (IFG) during story comprehension. A math-story contrast of beta power decrements showed greater bilateral but left-lateralized engagement of the middle frontal gyrus and superior parietal lobule. The evolution of cortical engagement during five temporal windows across the presentation of stories showed significant involvement during the first interval of the narrative of bilateral opercular and insular regions as well as the ventral and lateral temporal cortex, extending more posteriorly on the left and medially on the right. Over time, there continued to be sustained right anterior ventral temporal engagement, with increasing involvement of the right anterior parahippocampal gyrus, STG, MTG, posterior superior temporal sulcus, inferior parietal lobule, frontal operculum, and insula, while left hemisphere engagement decreased. Our findings are consistent with prior imaging studies of narrative comprehension, but in addition, they demonstrate increasing right-lateralized engagement over the course of narratives, suggesting an important role for these right-hemispheric regions in semantic integration as well as social and pragmatic inference processing.

Collapse

Affiliation(s)

Vahab Youssofzadeh Neurology, Medical College of Wisconsin, Milwaukee, WI, USA,*Corresponding author. (V. Youssofzadeh)
Lisa Conant Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Jeffrey Stout Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Candida Ustine Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Colin Humphries Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
William L. Gross Neurology, Medical College of Wisconsin, Milwaukee, WI, USA,bAnesthesiology, Medical College of Wisconsin, Milwaukee, WI, USA
Priyanka Shah-Basak Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Jed Mathis Neurology, Medical College of Wisconsin, Milwaukee, WI, USA,cRadiology, Medical College of Wisconsin, Milwaukee, WI, USA
Elizabeth Awe Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Linda Allen Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Edgar A. DeYoe Radiology, Medical College of Wisconsin, Milwaukee, WI, USA
Chad Carlson Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Christopher T. Anderson Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Rama Maganti Neurology, University of Wisconsin-Madison, Madison, WI, USA
Bruce Hermann Neurology, University of Wisconsin-Madison, Madison, WI, USA
Veena A. Nair Radiology, University of Wisconsin-Madison, Madison, WI, USA
Vivek Prabhakaran Radiology, University of Wisconsin-Madison, Madison, WI, USA,fMedical Physics, University of Wisconsin-Madison, Madison, WI, USA,gPsychiatry, University of Wisconsin-Madison, Madison, WI, USA
Beth Meyerand Radiology, University of Wisconsin-Madison, Madison, WI, USA,fMedical Physics, University of Wisconsin-Madison, Madison, WI, USA,hBiomedical Engineering, University of Wisconsin-Madison, Madison, WI, USA
Jeffrey R. Binder Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
Manoj Raghavan Neurology, Medical College of Wisconsin, Milwaukee, WI, USA

Collapse

Liu Y, Luo C, Zheng J, Liang J, Ding N. Working memory asymmetrically modulates auditory and linguistic processing of speech. Neuroimage 2022;264:119698. [PMID: 36270622 DOI: 10.1016/j.neuroimage.2022.119698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2022] [Revised: 10/11/2022] [Accepted: 10/17/2022] [Indexed: 11/09/2022] Open

Gwilliams L, King JR, Marantz A, Poeppel D. Neural dynamics of phoneme sequences reveal position-invariant code for content and order. Nat Commun 2022;13:6606. [PMID: 36329058 PMCID: PMC9633780 DOI: 10.1038/s41467-022-34326-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2020] [Accepted: 10/19/2022] [Indexed: 11/06/2022] Open

McMurray B, Sarrett ME, Chiu S, Black AK, Wang A, Canale R, Aslin RN. Decoding the temporal dynamics of spoken word and nonword processing from EEG. Neuroimage 2022;260:119457. [PMID: 35842096 PMCID: PMC10875705 DOI: 10.1016/j.neuroimage.2022.119457] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2022] [Revised: 07/02/2022] [Accepted: 07/06/2022] [Indexed: 11/23/2022] Open

Verschueren E, Gillis M, Decruy L, Vanthornhout J, Francart T. Speech Understanding Oppositely Affects Acoustic and Linguistic Neural Tracking in a Speech Rate Manipulation Paradigm. J Neurosci 2022;42:7442-7453. [PMID: 36041851 PMCID: PMC9525161 DOI: 10.1523/jneurosci.0259-22.2022] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Revised: 06/29/2022] [Accepted: 07/17/2022] [Indexed: 11/21/2022] Open

Abstract

When listening to continuous speech, the human brain can track features of the presented speech signal. It has been shown that neural tracking of acoustic features is a prerequisite for speech understanding and can predict speech understanding in controlled circumstances. However, the brain also tracks linguistic features of speech, which may be more directly related to speech understanding. We investigated acoustic and linguistic speech processing as a function of varying speech understanding by manipulating the speech rate. In this paradigm, acoustic and linguistic speech processing is affected simultaneously but in opposite directions: When the speech rate increases, more acoustic information per second is present. In contrast, the tracking of linguistic information becomes more challenging when speech is less intelligible at higher speech rates. We measured the EEG of 18 participants (4 male) who listened to speech at various speech rates. As expected and confirmed by the behavioral results, speech understanding decreased with increasing speech rate. Accordingly, linguistic neural tracking decreased with increasing speech rate, but acoustic neural tracking increased. This indicates that neural tracking of linguistic representations can capture the gradual effect of decreasing speech understanding. In addition, increased acoustic neural tracking does not necessarily imply better speech understanding. This suggests that, although more challenging to measure because of the low signal-to-noise ratio, linguistic neural tracking may be a more direct predictor of speech understanding.SIGNIFICANCE STATEMENT An increasingly popular method to investigate neural speech processing is to measure neural tracking. Although much research has been done on how the brain tracks acoustic speech features, linguistic speech features have received less attention. In this study, we disentangled acoustic and linguistic characteristics of neural speech tracking via manipulating the speech rate. A proper way of objectively measuring auditory and language processing paves the way toward clinical applications: An objective measure of speech understanding would allow for behavioral-free evaluation of speech understanding, which allows to evaluate hearing loss and adjust hearing aids based on brain responses. This objective measure would benefit populations from whom obtaining behavioral measures may be complex, such as young children or people with cognitive impairments.

Collapse

Kim SG. On the encoding of natural music in computational models and human brains. Front Neurosci 2022;16:928841. [PMID: 36203808 PMCID: PMC9531138 DOI: 10.3389/fnins.2022.928841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Accepted: 08/15/2022] [Indexed: 11/13/2022] Open

Gillis M, Van Canneyt J, Francart T, Vanthornhout J. Neural tracking as a diagnostic tool to assess the auditory pathway. Hear Res 2022;426:108607. [PMID: 36137861 DOI: 10.1016/j.heares.2022.108607] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/26/2021] [Revised: 08/11/2022] [Accepted: 09/12/2022] [Indexed: 11/20/2022]

Gugnowska K, Novembre G, Kohler N, Villringer A, Keller PE, Sammler D. Endogenous sources of interbrain synchrony in duetting pianists. Cereb Cortex 2022;32:4110-4127. [PMID: 35029645 PMCID: PMC9476614 DOI: 10.1093/cercor/bhab469] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2021] [Revised: 11/16/2021] [Accepted: 11/17/2021] [Indexed: 11/12/2022] Open

Stewart HJ, Cash EK, Hunter LL, Maloney T, Vannest J, Moore DR. Speech cortical activation and connectivity in typically developing children and those with listening difficulties. Neuroimage Clin 2022;36:103172. [PMID: 36087559 PMCID: PMC9467868 DOI: 10.1016/j.nicl.2022.103172] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2021] [Revised: 08/23/2022] [Accepted: 08/24/2022] [Indexed: 12/14/2022]

Chai X, Liu M, Huang T, Wu M, Li J, Zhao X, Yan T, Song Y, Zhang YX. Neurophysiological evidence for goal-oriented modulation of speech perception. Cereb Cortex 2022;33:3910-3921. [PMID: 35972410 DOI: 10.1093/cercor/bhac315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2022] [Revised: 07/20/2022] [Accepted: 07/21/2022] [Indexed: 11/14/2022] Open

Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception. Curr Biol 2022;32:3971-3986.e4. [PMID: 35973430 DOI: 10.1016/j.cub.2022.07.047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Revised: 06/08/2022] [Accepted: 07/19/2022] [Indexed: 11/20/2022]

Heilbron M, Armeni K, Schoffelen JM, Hagoort P, de Lange FP. A hierarchy of linguistic predictions during natural language comprehension. Proc Natl Acad Sci U S A 2022;119:e2201968119. [PMID: 35921434 PMCID: PMC9371745 DOI: 10.1073/pnas.2201968119] [Citation(s) in RCA: 75] [Impact Index Per Article: 37.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2022] [Accepted: 06/28/2022] [Indexed: 02/05/2023] Open

Heilbron M, Armeni K, Schoffelen JM, Hagoort P, de Lange FP. A hierarchy of linguistic predictions during natural language comprehension. Proc Natl Acad Sci U S A 2022;119:e2201968119. [PMID: 35921434 DOI: 10.1101/2020.12.03.410399] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/21/2023] Open

Brodbeck C, Simon JZ. Cortical tracking of voice pitch in the presence of multiple speakers depends on selective attention. Front Neurosci 2022;16:828546. [PMID: 36003957 PMCID: PMC9393379 DOI: 10.3389/fnins.2022.828546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2021] [Accepted: 07/08/2022] [Indexed: 11/13/2022] Open

Zhou D, Zhang G, Dang J, Unoki M, Liu X. Detection of Brain Network Communities During Natural Speech Comprehension From Functionally Aligned EEG Sources. Front Comput Neurosci 2022;16:919215. [PMID: 35874316 PMCID: PMC9301328 DOI: 10.3389/fncom.2022.919215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Accepted: 06/14/2022] [Indexed: 11/30/2022] Open

Abstract

In recent years, electroencephalograph (EEG) studies on speech comprehension have been extended from a controlled paradigm to a natural paradigm. Under the hypothesis that the brain can be approximated as a linear time-invariant system, the neural response to natural speech has been investigated extensively using temporal response functions (TRFs). However, most studies have modeled TRFs in the electrode space, which is a mixture of brain sources and thus cannot fully reveal the functional mechanism underlying speech comprehension. In this paper, we propose methods for investigating the brain networks of natural speech comprehension using TRFs on the basis of EEG source reconstruction. We first propose a functional hyper-alignment method with an additive average method to reduce EEG noise. Then, we reconstruct neural sources within the brain based on the EEG signals to estimate TRFs from speech stimuli to source areas, and then investigate the brain networks in the neural source space on the basis of the community detection method. To evaluate TRF-based brain networks, EEG data were recorded in story listening tasks with normal speech and time-reversed speech. To obtain reliable structures of brain networks, we detected TRF-based communities from multiple scales. As a result, the proposed functional hyper-alignment method could effectively reduce the noise caused by individual settings in an EEG experiment and thus improve the accuracy of source reconstruction. The detected brain networks for normal speech comprehension were clearly distinctive from those for non-semantically driven (time-reversed speech) audio processing. Our result indicates that the proposed source TRFs can reflect the cognitive processing of spoken language and that the multi-scale community detection method is powerful for investigating brain networks.

Collapse

Bai F, Meyer AS, Martin AE. Neural dynamics differentially encode phrases and sentences during spoken language comprehension. PLoS Biol 2022;20:e3001713. [PMID: 35834569 PMCID: PMC9282610 DOI: 10.1371/journal.pbio.3001713] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Accepted: 06/14/2022] [Indexed: 11/19/2022] Open

Abstract

Human language stands out in the natural world as a biological signal that uses a structured system to combine the meanings of small linguistic units (e.g., words) into larger constituents (e.g., phrases and sentences). However, the physical dynamics of speech (or sign) do not stand in a one-to-one relationship with the meanings listeners perceive. Instead, listeners infer meaning based on their knowledge of the language. The neural readouts of the perceptual and cognitive processes underlying these inferences are still poorly understood. In the present study, we used scalp electroencephalography (EEG) to compare the neural response to phrases (e.g., the red vase) and sentences (e.g., the vase is red), which were close in semantic meaning and had been synthesized to be physically indistinguishable. Differences in structure were well captured in the reorganization of neural phase responses in delta (approximately <2 Hz) and theta bands (approximately 2 to 7 Hz),and in power and power connectivity changes in the alpha band (approximately 7.5 to 13.5 Hz). Consistent with predictions from a computational model, sentences showed more power, more power connectivity, and more phase synchronization than phrases did. Theta-gamma phase-amplitude coupling occurred, but did not differ between the syntactic structures. Spectral-temporal response function (STRF) modeling revealed different encoding states for phrases and sentences, over and above the acoustically driven neural response. Our findings provide a comprehensive description of how the brain encodes and separates linguistic structures in the dynamics of neural responses. They imply that phase synchronization and strength of connectivity are readouts for the constituent structure of language. The results provide a novel basis for future neurophysiological research on linguistic structure representation in the brain, and, together with our simulations, support time-based binding as a mechanism of structure encoding in neural dynamics.

Collapse

Pérez A, Davis MH, Ince RAA, Zhang H, Fu Z, Lamarca M, Lambon Ralph MA, Monahan PJ. Timing of brain entrainment to the speech envelope during speaking, listening and self-listening. Cognition 2022;224:105051. [PMID: 35219954 PMCID: PMC9112165 DOI: 10.1016/j.cognition.2022.105051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2021] [Revised: 01/24/2022] [Accepted: 01/26/2022] [Indexed: 11/17/2022]

Chalas N, Daube C, Kluger DS, Abbasi O, Nitsch R, Gross J. Multivariate analysis of speech envelope tracking reveals coupling beyond auditory cortex. Neuroimage 2022;258:119395. [PMID: 35718023 DOI: 10.1016/j.neuroimage.2022.119395] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2022] [Revised: 05/16/2022] [Accepted: 06/14/2022] [Indexed: 11/19/2022] Open

Abstract

The systematic alignment of low-frequency brain oscillations with the acoustic speech envelope signal is well established and has been proposed to be crucial for actively perceiving speech. Previous studies investigating speech-brain coupling in source space are restricted to univariate pairwise approaches between brain and speech signals, and therefore speech tracking information in frequency-specific communication channels might be lacking. To address this, we propose a novel multivariate framework for estimating speech-brain coupling where neural variability from source-derived activity is taken into account along with the rate of envelope's amplitude change (derivative). We applied it in magnetoencephalographic (MEG) recordings while human participants (male and female) listened to one hour of continuous naturalistic speech, showing that a multivariate approach outperforms the corresponding univariate method in low- and high frequencies across frontal, motor, and temporal areas. Systematic comparisons revealed that the gain in low frequencies (0.6 - 0.8 Hz) was related to the envelope's rate of change whereas in higher frequencies (from 0.8 to 10 Hz) it was mostly related to the increased neural variability from source-derived cortical areas. Furthermore, following a non-negative matrix factorization approach we found distinct speech-brain components across time and cortical space related to speech processing. We confirm that speech envelope tracking operates mainly in two timescales (δ and θ frequency bands) and we extend those findings showing shorter coupling delays in auditory-related components and longer delays in higher-association frontal and motor components, indicating temporal differences of speech tracking and providing implications for hierarchical stimulus-driven speech processing.

Collapse

Preisig BC, Riecke L, Hervais-Adelman A. Speech sound categorization: The contribution of non-auditory and auditory cortical regions. Neuroimage 2022;258:119375. [PMID: 35700949 DOI: 10.1016/j.neuroimage.2022.119375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Revised: 05/13/2022] [Accepted: 06/10/2022] [Indexed: 11/26/2022] Open

Muncke J, Kuruvila I, Hoppe U. Prediction of Speech Intelligibility by Means of EEG Responses to Sentences in Noise. Front Neurosci 2022;16:876421. [PMID: 35720724 PMCID: PMC9198593 DOI: 10.3389/fnins.2022.876421] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2022] [Accepted: 03/13/2022] [Indexed: 11/13/2022] Open

Abstract

Objective

Understanding speech in noisy conditions is challenging even for people with mild hearing loss, and intelligibility for an individual person is usually evaluated by using several subjective test methods. In the last few years, a method has been developed to determine a temporal response function (TRF) between speech envelope and simultaneous electroencephalographic (EEG) measurements. By using this TRF it is possible to predict the EEG signal for any speech signal. Recent studies have suggested that the accuracy of this prediction varies with the level of noise added to the speech signal and can predict objectively the individual speech intelligibility. Here we assess the variations of the TRF itself when it is calculated for measurements with different signal-to-noise ratios and apply these variations to predict speech intelligibility.

Methods

For 18 normal hearing subjects the individual threshold of 50% speech intelligibility was determined by using a speech in noise test. Additionally, subjects listened passively to speech material of the speech in noise test at different signal-to-noise ratios close to individual threshold of 50% speech intelligibility while an EEG was recorded. Afterwards the shape of TRFs for each signal-to-noise ratio and subject were compared with the derived intelligibility.

Results

The strongest effect of variations in stimulus signal-to-noise ratio on the TRF shape occurred close to 100 ms after the stimulus presentation, and was located in the left central scalp region. The investigated variations in TRF morphology showed a strong correlation with speech intelligibility, and we were able to predict the individual threshold of 50% speech intelligibility with a mean deviation of less then 1.5 dB.

Conclusion

The intelligibility of speech in noise can be predicted by analyzing the shape of the TRF derived from different stimulus signal-to-noise ratios. Because TRFs are interpretable, in a manner similar to auditory evoked potentials, this method offers new options for clinical diagnostics.

Collapse

Haider CL, Suess N, Hauswald A, Park H, Weisz N. Masking of the mouth area impairs reconstruction of acoustic speech features and higher-level segmentational features in the presence of a distractor speaker. Neuroimage 2022;252:119044. [PMID: 35240298 DOI: 10.1016/j.neuroimage.2022.119044] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 02/26/2022] [Accepted: 02/27/2022] [Indexed: 11/29/2022] Open

Norman-Haignere SV, Feather J, Boebinger D, Brunner P, Ritaccio A, McDermott JH, Schalk G, Kanwisher N. A neural population selective for song in human auditory cortex. Curr Biol 2022;32:1470-1484.e12. [PMID: 35196507 PMCID: PMC9092957 DOI: 10.1016/j.cub.2022.01.069] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2021] [Revised: 10/26/2021] [Accepted: 01/24/2022] [Indexed: 12/18/2022]

Affiliation(s)

Sam V Norman-Haignere Zuckerman Institute, Columbia University, New York, NY, USA; HHMI Fellow of the Life Sciences Research Foundation, Chevy Chase, MD, USA; Laboratoire des Sytèmes Perceptifs, Département d'Études Cognitives, ENS, PSL University, CNRS, Paris, France; Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY, USA; Department of Neuroscience, University of Rochester Medical Center, Rochester, NY, USA; Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
Jenelle Feather Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; Center for Brains, Minds and Machines, Cambridge, MA, USA
Dana Boebinger Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; Program in Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, MA, USA
Peter Brunner Department of Neurology, Albany Medical College, Albany, NY, USA; National Center for Adaptive Neurotechnologies, Albany, NY, USA; Department of Neurosurgery, Washington University School of Medicine, St. Louis, MO, USA
Anthony Ritaccio Department of Neurology, Albany Medical College, Albany, NY, USA; Department of Neurology, Mayo Clinic, Jacksonville, FL, USA
Josh H McDermott Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; Center for Brains, Minds and Machines, Cambridge, MA, USA; Program in Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, MA, USA
Gerwin Schalk Department of Neurology, Albany Medical College, Albany, NY, USA
Nancy Kanwisher Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA; Center for Brains, Minds and Machines, Cambridge, MA, USA

Collapse

Di Liberto GM, Hjortkjær J, Mesgarani N. Editorial: Neural Tracking: Closing the Gap Between Neurophysiology and Translational Medicine. Front Neurosci 2022;16:872600. [PMID: 35368278 PMCID: PMC8966872 DOI: 10.3389/fnins.2022.872600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2022] [Accepted: 02/17/2022] [Indexed: 11/25/2022] Open

Gillis M, Decruy L, Vanthornhout J, Francart T. Hearing loss is associated with delayed neural responses to continuous speech. Eur J Neurosci 2022;55:1671-1690. [PMID: 35263814 DOI: 10.1111/ejn.15644] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2021] [Revised: 02/21/2022] [Accepted: 02/23/2022] [Indexed: 11/28/2022]

Norman-Haignere SV, Long LK, Devinsky O, Doyle W, Irobunda I, Merricks EM, Feldstein NA, McKhann GM, Schevon CA, Flinker A, Mesgarani N. Multiscale temporal integration organizes hierarchical computation in human auditory cortex. Nat Hum Behav 2022;6:455-469. [PMID: 35145280 PMCID: PMC8957490 DOI: 10.1038/s41562-021-01261-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Accepted: 11/18/2021] [Indexed: 01/11/2023]

Caucheteux C, King JR. Brains and algorithms partially converge in natural language processing. Commun Biol 2022;5:134. [PMID: 35173264 PMCID: PMC8850612 DOI: 10.1038/s42003-022-03036-1] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2021] [Accepted: 12/29/2021] [Indexed: 11/29/2022] Open

Towards real-world neuroscience using mobile EEG and augmented reality. Sci Rep 2022;12:2291. [PMID: 35145166 PMCID: PMC8831466 DOI: 10.1038/s41598-022-06296-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2021] [Accepted: 01/25/2022] [Indexed: 01/10/2023] Open

Taylor C, Hall S, Manivannan S, Mundil N, Border S. The neuroanatomical consequences and pathological implications of bilingualism. J Anat 2022;240:410-427. [PMID: 34486112 PMCID: PMC8742975 DOI: 10.1111/joa.13542] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Revised: 07/26/2021] [Accepted: 08/23/2021] [Indexed: 01/17/2023] Open

Teoh ES, Ahmed F, Lalor EC. Attention Differentially Affects Acoustic and Phonetic Feature Encoding in a Multispeaker Environment. J Neurosci 2022;42:682-691. [PMID: 34893546 PMCID: PMC8805628 DOI: 10.1523/jneurosci.1455-20.2021] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2020] [Revised: 09/28/2021] [Accepted: 09/29/2021] [Indexed: 11/21/2022] Open

Abstract

Humans have the remarkable ability to selectively focus on a single talker in the midst of other competing talkers. The neural mechanisms that underlie this phenomenon remain incompletely understood. In particular, there has been longstanding debate over whether attention operates at an early or late stage in the speech processing hierarchy. One way to better understand this is to examine how attention might differentially affect neurophysiological indices of hierarchical acoustic and linguistic speech representations. In this study, we do this by using encoding models to identify neural correlates of speech processing at various levels of representation. Specifically, we recorded EEG from fourteen human subjects (nine female and five male) during a "cocktail party" attention experiment. Model comparisons based on these data revealed phonetic feature processing for attended, but not unattended speech. Furthermore, we show that attention specifically enhances isolated indices of phonetic feature processing, but that such attention effects are not apparent for isolated measures of acoustic processing. These results provide new insights into the effects of attention on different prelexical representations of speech, insights that complement recent anatomic accounts of the hierarchical encoding of attended speech. Furthermore, our findings support the notion that, for attended speech, phonetic features are processed as a distinct stage, separate from the processing of the speech acoustics.SIGNIFICANCE STATEMENT Humans are very good at paying attention to one speaker in an environment with multiple speakers. However, the details of how attended and unattended speech are processed differently by the brain is not completely clear. Here, we explore how attention affects the processing of the acoustic sounds of speech as well as the mapping of those sounds onto categorical phonetic features. We find evidence of categorical phonetic feature processing for attended, but not unattended speech. Furthermore, we find evidence that categorical phonetic feature processing is enhanced by attention, but acoustic processing is not. These findings add an important new layer in our understanding of how the human brain solves the cocktail party problem.

Collapse

Brodbeck C, Bhattasali S, Cruz Heredia AAL, Resnik P, Simon JZ, Lau E. Parallel processing in speech perception with local and global representations of linguistic context. eLife 2022;11:72056. [PMID: 35060904 PMCID: PMC8830882 DOI: 10.7554/elife.72056] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 01/16/2022] [Indexed: 12/03/2022] Open

Abstract

Speech processing is highly incremental. It is widely accepted that human listeners continuously use the linguistic context to anticipate upcoming concepts, words, and phonemes. However, previous evidence supports two seemingly contradictory models of how a predictive context is integrated with the bottom-up sensory input: Classic psycholinguistic paradigms suggest a two-stage process, in which acoustic input initially leads to local, context-independent representations, which are then quickly integrated with contextual constraints. This contrasts with the view that the brain constructs a single coherent, unified interpretation of the input, which fully integrates available information across representational hierarchies, and thus uses contextual constraints to modulate even the earliest sensory representations. To distinguish these hypotheses, we tested magnetoencephalography responses to continuous narrative speech for signatures of local and unified predictive models. Results provide evidence that listeners employ both types of models in parallel. Two local context models uniquely predict some part of early neural responses, one based on sublexical phoneme sequences, and one based on the phonemes in the current word alone; at the same time, even early responses to phonemes also reflect a unified model that incorporates sentence-level constraints to predict upcoming phonemes. Neural source localization places the anatomical origins of the different predictive models in nonidentical parts of the superior temporal lobes bilaterally, with the right hemisphere showing a relative preference for more local models. These results suggest that speech processing recruits both local and unified predictive models in parallel, reconciling previous disparate findings. Parallel models might make the perceptual system more robust, facilitate processing of unexpected inputs, and serve a function in language acquisition.

Collapse

Piazza EA, Nencheva ML, Lew-Williams C. THE DEVELOPMENT OF COMMUNICATION ACROSS TIMESCALES. CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE 2021;30:459-467. [PMID: 35177881 PMCID: PMC8849573 DOI: 10.1177/09637214211037665] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/12/2023]

Jessen S, Obleser J, Tune S. Neural tracking in infants - An analytical tool for multisensory social processing in development. Dev Cogn Neurosci 2021;52:101034. [PMID: 34781250 PMCID: PMC8593584 DOI: 10.1016/j.dcn.2021.101034] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2021] [Revised: 10/09/2021] [Accepted: 11/07/2021] [Indexed: 11/18/2022] Open

Crosse MJ, Zuk NJ, Di Liberto GM, Nidiffer AR, Molholm S, Lalor EC. Linear Modeling of Neurophysiological Responses to Speech and Other Continuous Stimuli: Methodological Considerations for Applied Research. Front Neurosci 2021;15:705621. [PMID: 34880719 PMCID: PMC8648261 DOI: 10.3389/fnins.2021.705621] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Accepted: 09/21/2021] [Indexed: 01/01/2023] Open

Affiliation(s)

Michael J. Crosse Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland X, The Moonshot Factory, Mountain View, CA, United States Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
Nathaniel J. Zuk Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States Department of Neuroscience, University of Rochester, Rochester, NY, United States
Giovanni M. Di Liberto Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland Centre for Biomedical Engineering, School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Aaron R. Nidiffer Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States Department of Neuroscience, University of Rochester, Rochester, NY, United States
Sophie Molholm Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
Edmund C. Lalor Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States Department of Neuroscience, University of Rochester, Rochester, NY, United States

Collapse

Landemard A, Bimbard C, Demené C, Shamma S, Norman-Haignere S, Boubenec Y. Distinct higher-order representations of natural sounds in human and ferret auditory cortex. eLife 2021;10:e65566. [PMID: 34792467 PMCID: PMC8601661 DOI: 10.7554/elife.65566] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2020] [Accepted: 10/22/2021] [Indexed: 11/29/2022] Open

Attaheri A, Choisdealbha ÁN, Di Liberto GM, Rocha S, Brusini P, Mead N, Olawole-Scott H, Boutris P, Gibbon S, Williams I, Grey C, Flanagan S, Goswami U. Delta- and theta-band cortical tracking and phase-amplitude coupling to sung speech by infants. Neuroimage 2021;247:118698. [PMID: 34798233 DOI: 10.1016/j.neuroimage.2021.118698] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2021] [Revised: 10/15/2021] [Accepted: 10/30/2021] [Indexed: 01/13/2023] Open

Affiliation(s)

Adam Attaheri Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Áine Ní Choisdealbha Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Giovanni M Di Liberto Laboratoire des Systèmes Perceptifs, UMR 8248, CNRS, France; Ecole Normale Supérieure, PSL University, France; Department of Mechanical, Trinity Centre for Biomedical Engineering and Trinity Institute of Neuroscience, Manufacturing and Biomedical Engineering, Trinity College, The University of Dublin, Ireland; School of Electrical and Electronic Engineering and UCD Centre for Biomedical Engineering, University College Dublin, Ireland.
Sinead Rocha Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Perrine Brusini Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom; Institute of Population Health, Waterhouse Building, Block B, Brownlow Street, Liverpool L69 3GF, United Kingdom.
Natasha Mead Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Helen Olawole-Scott Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Panagiotis Boutris Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Samuel Gibbon Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Isabel Williams Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Christina Grey Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Sheila Flanagan Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.
Usha Goswami Department of Psychology, Centre for Neuroscience in Education, University of Cambridge, Downing Street, Cambridge CB2 3 EB, United Kingdom.

Collapse

Generalizable EEG Encoding Models with Naturalistic Audiovisual Stimuli. J Neurosci 2021;41:8946-8962. [PMID: 34503996 DOI: 10.1523/jneurosci.2891-20.2021] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2020] [Revised: 08/24/2021] [Accepted: 08/29/2021] [Indexed: 11/21/2022] Open

Abstract

In natural conversations, listeners must attend to what others are saying while ignoring extraneous background sounds. Recent studies have used encoding models to predict electroencephalography (EEG) responses to speech in noise-free listening situations, sometimes referred to as "speech tracking." Researchers have analyzed how speech tracking changes with different types of background noise. It is unclear, however, whether neural responses from acoustically rich, naturalistic environments with and without background noise can be generalized to more controlled stimuli. If encoding models for acoustically rich, naturalistic stimuli are generalizable to other tasks, this could aid in data collection from populations of individuals who may not tolerate listening to more controlled and less engaging stimuli for long periods of time. We recorded noninvasive scalp EEG while 17 human participants (8 male/9 female) listened to speech without noise and audiovisual speech stimuli containing overlapping speakers and background sounds. We fit multivariate temporal receptive field encoding models to predict EEG responses to pitch, the acoustic envelope, phonological features, and visual cues in both stimulus conditions. Our results suggested that neural responses to naturalistic stimuli were generalizable to more controlled datasets. EEG responses to speech in isolation were predicted accurately using phonological features alone, while responses to speech in a rich acoustic background were more accurate when including both phonological and acoustic features. Our findings suggest that naturalistic audiovisual stimuli can be used to measure receptive fields that are comparable and generalizable to more controlled audio-only stimuli.SIGNIFICANCE STATEMENT Understanding spoken language in natural environments requires listeners to parse acoustic and linguistic information in the presence of other distracting stimuli. However, most studies of auditory processing rely on highly controlled stimuli with no background noise, or with background noise inserted at specific times. Here, we compare models where EEG data are predicted based on a combination of acoustic, phonetic, and visual features in highly disparate stimuli-sentences from a speech corpus and speech embedded within movie trailers. We show that modeling neural responses to highly noisy, audiovisual movies can uncover tuning for acoustic and phonetic information that generalizes to simpler stimuli typically used in sensory neuroscience experiments.

Collapse

Lenc T, Merchant H, Keller PE, Honing H, Varlet M, Nozaradan S. Mapping between sound, brain and behaviour: four-level framework for understanding rhythm processing in humans and non-human primates. Philos Trans R Soc Lond B Biol Sci 2021;376:20200325. [PMID: 34420381 PMCID: PMC8380981 DOI: 10.1098/rstb.2020.0325] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/14/2021] [Indexed: 12/16/2022] Open

Lowe MX, Mohsenzadeh Y, Lahner B, Charest I, Oliva A, Teng S. Cochlea to categories: The spatiotemporal dynamics of semantic auditory representations. Cogn Neuropsychol 2021;38:468-489. [PMID: 35729704 PMCID: PMC10589059 DOI: 10.1080/02643294.2022.2085085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2021] [Revised: 03/31/2022] [Accepted: 05/25/2022] [Indexed: 10/17/2022]

Kiremitçi I, Yilmaz Ö, Çelik E, Shahdloo M, Huth AG, Çukur T. Attentional Modulation of Hierarchical Speech Representations in a Multitalker Environment. Cereb Cortex 2021;31:4986-5005. [PMID: 34115102 PMCID: PMC8491717 DOI: 10.1093/cercor/bhab136] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2020] [Revised: 04/01/2021] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open

Devaraju DS, Kemp A, Eddins DA, Shrivastav R, Chandrasekaran B, Hampton Wray A. Effects of Task Demands on Neural Correlates of Acoustic and Semantic Processing in Challenging Listening Conditions. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021;64:3697-3706. [PMID: 34403278 DOI: 10.1044/2021_jslhr-21-00006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]

Abstract

Purpose Listeners shift their listening strategies between lower level acoustic information and higher level semantic information to prioritize maximum speech intelligibility in challenging listening conditions. Although increasing task demands via acoustic degradation modulates lexical-semantic processing, the neural mechanisms underlying different listening strategies are unclear. The current study examined the extent to which encoding of lower level acoustic cues is modulated by task demand and associations with lexical-semantic processes. Method Electroencephalography was acquired while participants listened to sentences in the presence of four-talker babble that contained either higher or lower probability final words. Task difficulty was modulated by time available to process responses. Cortical tracking of speech-neural correlates of acoustic temporal envelope processing-were estimated using temporal response functions. Results Task difficulty did not affect cortical tracking of temporal envelope of speech under challenging listening conditions. Neural indices of lexical-semantic processing (N400 amplitudes) were larger with increased task difficulty. No correlations were observed between the cortical tracking of temporal envelope of speech and lexical-semantic processes, even after controlling for the effect of individualized signal-to-noise ratios. Conclusions Cortical tracking of the temporal envelope of speech and semantic processing are differentially influenced by task difficulty. While increased task demands modulated higher level semantic processing, cortical tracking of the temporal envelope of speech may be influenced by task difficulty primarily when the demand is manipulated in terms of acoustic properties of the stimulus, consistent with an emerging perspective in speech perception.

Collapse

Di Liberto GM, Marion G, Shamma SA. Accurate Decoding of Imagined and Heard Melodies. Front Neurosci 2021;15:673401. [PMID: 34421512 PMCID: PMC8375770 DOI: 10.3389/fnins.2021.673401] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2021] [Accepted: 06/17/2021] [Indexed: 11/16/2022] Open

Wang YC, Sohoglu E, Gilbert RA, Henson RN, Davis MH. Predictive Neural Computations Support Spoken Word Recognition: Evidence from MEG and Competitor Priming. J Neurosci 2021;41:6919-6932. [PMID: 34210777 PMCID: PMC8360690 DOI: 10.1523/jneurosci.1685-20.2021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2020] [Revised: 05/22/2021] [Accepted: 05/25/2021] [Indexed: 11/24/2022] Open

Abstract

Human listeners achieve quick and effortless speech comprehension through computations of conditional probability using Bayes rule. However, the neural implementation of Bayesian perceptual inference remains unclear. Competitive-selection accounts (e.g., TRACE) propose that word recognition is achieved through direct inhibitory connections between units representing candidate words that share segments (e.g., hygiene and hijack share /haidʒ/). Manipulations that increase lexical uncertainty should increase neural responses associated with word recognition when words cannot be uniquely identified. In contrast, predictive-selection accounts (e.g., Predictive-Coding) propose that spoken word recognition involves comparing heard and predicted speech sounds and using prediction error to update lexical representations. Increased lexical uncertainty in words, such as hygiene and hijack, will increase prediction error and hence neural activity only at later time points when different segments are predicted. We collected MEG data from male and female listeners to test these two Bayesian mechanisms and used a competitor priming manipulation to change the prior probability of specific words. Lexical decision responses showed delayed recognition of target words (hygiene) following presentation of a neighboring prime word (hijack) several minutes earlier. However, this effect was not observed with pseudoword primes (higent) or targets (hijure). Crucially, MEG responses in the STG showed greater neural responses for word-primed words after the point at which they were uniquely identified (after /haidʒ/ in hygiene) but not before while similar changes were again absent for pseudowords. These findings are consistent with accounts of spoken word recognition in which neural computations of prediction error play a central role.SIGNIFICANCE STATEMENT Effective speech perception is critical to daily life and involves computations that combine speech signals with prior knowledge of spoken words (i.e., Bayesian perceptual inference). This study specifies the neural mechanisms that support spoken word recognition by testing two distinct implementations of Bayes perceptual inference. Most established theories propose direct competition between lexical units such that inhibition of irrelevant candidates leads to selection of critical words. Our results instead support predictive-selection theories (e.g., Predictive-Coding): by comparing heard and predicted speech sounds, neural computations of prediction error can help listeners continuously update lexical probabilities, allowing for more rapid word identification.

Collapse

The Music of Silence: Part II: Music Listening Induces Imagery Responses. J Neurosci 2021;41:7449-7460. [PMID: 34341154 DOI: 10.1523/jneurosci.0184-21.2021] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2021] [Revised: 06/22/2021] [Accepted: 06/24/2021] [Indexed: 01/22/2023] Open

Abstract

During music listening, humans routinely acquire the regularities of the acoustic sequences and use them to anticipate and interpret the ongoing melody. Specifically, in line with this predictive framework, it is thought that brain responses during such listening reflect a comparison between the bottom-up sensory responses and top-down prediction signals generated by an internal model that embodies the music exposure and expectations of the listener. To attain a clear view of these predictive responses, previous work has eliminated the sensory inputs by inserting artificial silences (or sound omissions) that leave behind only the corresponding predictions of the thwarted expectations. Here, we demonstrate a new alternate approach in which we decode the predictive electroencephalography (EEG) responses to the silent intervals that are naturally interspersed within the music. We did this as participants (experiment 1, 20 participants, 10 female; experiment 2, 21 participants, 6 female) listened or imagined Bach piano melodies. Prediction signals were quantified and assessed via a computational model of the melodic structure of the music and were shown to exhibit the same response characteristics when measured during listening or imagining. These include an inverted polarity for both silence and imagined responses relative to listening, as well as response magnitude modulations that precisely reflect the expectations of notes and silences in both listening and imagery conditions. These findings therefore provide a unifying view that links results from many previous paradigms, including omission reactions and the expectation modulation of sensory responses, all in the context of naturalistic music listening.SIGNIFICANCE STATEMENT Music perception depends on our ability to learn and detect melodic structures. It has been suggested that our brain does so by actively predicting upcoming music notes, a process inducing instantaneous neural responses as the music confronts these expectations. Here, we studied this prediction process using EEGs recorded while participants listen to and imagine Bach melodies. Specifically, we examined neural signals during the ubiquitous musical pauses (or silent intervals) in a music stream and analyzed them in contrast to the imagery responses. We find that imagined predictive responses are routinely co-opted during ongoing music listening. These conclusions are revealed by a new paradigm using listening and imagery of naturalistic melodies.

Collapse

Tune S, Alavash M, Fiedler L, Obleser J. Neural attentional-filter mechanisms of listening success in middle-aged and older individuals. Nat Commun 2021;12:4533. [PMID: 34312388 PMCID: PMC8313676 DOI: 10.1038/s41467-021-24771-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2020] [Accepted: 07/01/2021] [Indexed: 12/12/2022] Open

Lunner T, Alickovic E, Graversen C, Ng EHN, Wendt D, Keidser G. Three New Outcome Measures That Tap Into Cognitive Processes Required for Real-Life Communication. Ear Hear 2021;41 Suppl 1:39S-47S. [PMID: 33105258 PMCID: PMC7676869 DOI: 10.1097/aud.0000000000000941] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2020] [Accepted: 07/11/2020] [Indexed: 11/29/2022]

O'Sullivan AE, Crosse MJ, Liberto GMD, de Cheveigné A, Lalor EC. Neurophysiological Indices of Audiovisual Speech Processing Reveal a Hierarchy of Multisensory Integration Effects. J Neurosci 2021;41:4991-5003. [PMID: 33824190 PMCID: PMC8197638 DOI: 10.1523/jneurosci.0906-20.2021] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2020] [Revised: 03/16/2021] [Accepted: 03/22/2021] [Indexed: 12/27/2022] Open

Abstract

Seeing a speaker's face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker's face provides temporal cues to auditory cortex, and articulatory information from the speaker's mouth can aid recognizing specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here, we sought to provide insight on these questions by examining EEG responses in humans (males and females) to natural audiovisual (AV), audio, and visual speech in quiet and in noise. We represented our speech stimuli in terms of their spectrograms and their phonetic features and then quantified the strength of the encoding of those features in the EEG using canonical correlation analysis (CCA). The encoding of both spectrotemporal and phonetic features was shown to be more robust in AV speech responses than what would have been expected from the summation of the audio and visual speech responses, suggesting that multisensory integration occurs at both spectrotemporal and phonetic stages of speech processing. We also found evidence to suggest that the integration effects may change with listening conditions; however, this was an exploratory analysis and future work will be required to examine this effect using a within-subject design. These findings demonstrate that integration of audio and visual speech occurs at multiple stages along the speech processing hierarchy.SIGNIFICANCE STATEMENT During conversation, visual cues impact our perception of speech. Integration of auditory and visual speech is thought to occur at multiple stages of speech processing and vary flexibly depending on the listening conditions. Here, we examine audiovisual (AV) integration at two stages of speech processing using the speech spectrogram and a phonetic representation, and test how AV integration adapts to degraded listening conditions. We find significant integration at both of these stages regardless of listening conditions. These findings reveal neural indices of multisensory interactions at different stages of processing and provide support for the multistage integration framework.

Collapse

100

Keshavarzi M, Varano E, Reichenbach T. Cortical Tracking of a Background Speaker Modulates the Comprehension of a Foreground Speech Signal. J Neurosci 2021;41:5093-5101. [PMID: 33926996 PMCID: PMC8197648 DOI: 10.1523/jneurosci.3200-20.2021] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2020] [Revised: 02/23/2021] [Accepted: 04/12/2021] [Indexed: 11/21/2022] Open

Abstract

Understanding speech in background noise is a difficult task. The tracking of speech rhythms such as the rate of syllables and words by cortical activity has emerged as a key neural mechanism for speech-in-noise comprehension. In particular, recent investigations have used transcranial alternating current stimulation (tACS) with the envelope of a speech signal to influence the cortical speech tracking, demonstrating that this type of stimulation modulates comprehension and therefore providing evidence of a functional role of the cortical tracking in speech processing. Cortical activity has been found to track the rhythms of a background speaker as well, but the functional significance of this neural response remains unclear. Here we use a speech-comprehension task with a target speaker in the presence of a distractor voice to show that tACS with the speech envelope of the target voice as well as tACS with the envelope of the distractor speaker both modulate the comprehension of the target speech. Because the envelope of the distractor speech does not carry information about the target speech stream, the modulation of speech comprehension through tACS with this envelope provides evidence that the cortical tracking of the background speaker affects the comprehension of the foreground speech signal. The phase dependency of the resulting modulation of speech comprehension is, however, opposite to that obtained from tACS with the envelope of the target speech signal. This suggests that the cortical tracking of the ignored speech stream and that of the attended speech stream may compete for neural resources.SIGNIFICANCE STATEMENT Loud environments such as busy pubs or restaurants can make conversation difficult. However, they also allow us to eavesdrop into other conversations that occur in the background. In particular, we often notice when somebody else mentions our name, even if we have not been listening to that person. However, the neural mechanisms by which background speech is processed remain poorly understood. Here we use transcranial alternating current stimulation, a technique through which neural activity in the cerebral cortex can be influenced, to show that cortical responses to rhythms in the distractor speech modulate the comprehension of the target speaker. Our results provide evidence that the cortical tracking of background speech rhythms plays a functional role in speech processing.

Collapse