1
Corsini A, Tomassini A, Pastore A, Delis I, Fadiga L, D'Ausilio A. Speech perception difficulty modulates theta-band encoding of articulatory synergies. J Neurophysiol 2024; 131:480-491. [PMID: 38323331] [DOI: 10.1152/jn.00388.2023]
Abstract
The human brain tracks available speech acoustics and extrapolates missing information such as the speaker's articulatory patterns. However, the extent to which articulatory reconstruction supports speech perception remains unclear. This study explores the relationship between articulatory reconstruction and task difficulty. Participants listened to sentences and performed a speech-rhyming task. Real kinematic data of the speaker's vocal tract were recorded via electromagnetic articulography (EMA) and aligned to corresponding acoustic outputs. We extracted articulatory synergies from the EMA data with principal component analysis (PCA) and employed partial information decomposition (PID) to separate the electroencephalographic (EEG) encoding of acoustic and articulatory features into unique, redundant, and synergistic atoms of information. We median-split sentences into easy (ES) and hard (HS) based on participants' performance and found that greater task difficulty involved greater encoding of unique articulatory information in the theta band. We conclude that fine-grained articulatory reconstruction plays a complementary role in the encoding of speech acoustics, lending further support to the claim that motor processes support speech perception.

NEW & NOTEWORTHY: Top-down processes originating from the motor system contribute to speech perception through the reconstruction of the speaker's articulatory movement. This study investigates the role of such articulatory simulation under variable task difficulty. We show that more challenging listening tasks lead to increased encoding of articulatory kinematics in the theta band and suggest that, in such situations, fine-grained articulatory reconstruction complements acoustic encoding.
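The synergy-extraction step described above (PCA on multichannel EMA kinematics) can be sketched as follows. This is a minimal illustration on synthetic data, with made-up dimensions, noise levels, and variance threshold; it shows the technique, not the authors' pipeline.

```python
import numpy as np

# Synthetic stand-in for EMA recordings: 1000 time samples x 12
# articulator coordinates, driven by 3 underlying "synergies".
rng = np.random.default_rng(0)
latent = rng.standard_normal((1000, 3))           # hidden synergy time courses
mixing = rng.standard_normal((3, 12))             # how synergies drive sensors
ema = latent @ mixing + 0.1 * rng.standard_normal((1000, 12))

# PCA via SVD on the mean-centered data
ema_centered = ema - ema.mean(axis=0)
U, S, Vt = np.linalg.svd(ema_centered, full_matrices=False)
explained = S**2 / np.sum(S**2)                   # variance ratio per component

# Keep enough components to explain 95% of the variance (arbitrary cutoff)
k = int(np.searchsorted(np.cumsum(explained), 0.95)) + 1
synergies = ema_centered @ Vt[:k].T               # time courses of k synergies
print(k, synergies.shape)
```

With only three latent sources and weak noise, the retained components recover the low-dimensional synergy structure hidden in the twelve sensor channels.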
Affiliation(s)
- Alessandro Corsini
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Alice Tomassini
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Aldo Pastore
- Laboratorio NEST, Scuola Normale Superiore, Pisa, Italy
- Ioannis Delis
- School of Biomedical Sciences, University of Leeds, Leeds, United Kingdom
- Luciano Fadiga
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
- Alessandro D'Ausilio
- Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy
- Department of Neuroscience and Rehabilitation, Università di Ferrara, Ferrara, Italy
2
Haider CL, Park H, Hauswald A, Weisz N. Neural Speech Tracking Highlights the Importance of Visual Speech in Multi-speaker Situations. J Cogn Neurosci 2024; 36:128-142. [PMID: 37977156] [DOI: 10.1162/jocn_a_02059]
Abstract
Visual speech plays a powerful role in facilitating auditory speech processing, and it became a publicly visible topic with the widespread use of face masks during the COVID-19 pandemic. In a previous magnetoencephalography study, we showed that occluding the mouth area significantly impairs neural speech tracking. To rule out the possibility that this deterioration is due to degraded sound quality, in the present follow-up study we presented participants with audiovisual (AV) and audio-only (A) speech. We further independently manipulated the trials by adding a face mask and a distractor speaker. Our results clearly show that face masks only affect speech tracking in AV conditions, not in A conditions. This shows that face masks primarily impact speech processing by blocking visual speech rather than by acoustic degradation. We further characterize how the spectrogram, lip movements, and lexical units are tracked at the sensor level, and find visual benefits for tracking the spectrogram, especially in the multi-speaker condition. While lip movements show an additional improvement and visual benefit over tracking of the spectrogram only in clear speech conditions, lexical units (phonemes and word onsets) show no visual enhancement at all. We hypothesize that in young, normal-hearing individuals, information from visual input is used less for specific feature extraction and acts more as a general resource for guiding attention.
Affiliation(s)
- Nathan Weisz
- Paris Lodron Universität Salzburg
- Paracelsus Medical University Salzburg
3
Ahmed F, Nidiffer AR, Lalor EC. The effect of gaze on EEG measures of multisensory integration in a cocktail party scenario. Front Hum Neurosci 2023; 17:1283206. [PMID: 38162285] [PMCID: PMC10754997] [DOI: 10.3389/fnhum.2023.1283206]
Abstract
Seeing the speaker's face greatly improves our speech comprehension in noisy environments. This is due to the brain's ability to combine the auditory and the visual information around us, a process known as multisensory integration. Selective attention also strongly influences what we comprehend in scenarios with multiple speakers, an effect known as the cocktail-party phenomenon. However, the interaction between attention and multisensory integration is not fully understood, especially when it comes to natural, continuous speech. In a recent electroencephalography (EEG) study, we explored this issue and showed that multisensory integration is enhanced when an audiovisual speaker is attended compared to when that speaker is unattended. Here, we extend that work to investigate how this interaction varies depending on a person's gaze behavior, which affects the quality of the visual information they have access to. To do so, we recorded EEG from 31 healthy adults as they performed selective attention tasks in several paradigms involving two concurrently presented audiovisual speakers. We then modeled how the recorded EEG related to the audio speech (envelope) of the presented speakers. Crucially, we compared two classes of model: one that assumed underlying multisensory integration (AV) and another that assumed two independent unisensory audio and visual processes (A+V). This comparison revealed evidence of strong attentional effects on multisensory integration when participants were looking directly at the face of an audiovisual speaker. This effect was not apparent when the speaker's face was in the peripheral vision of the participants. Overall, our findings suggest a strong influence of attention on multisensory integration when high fidelity visual (articulatory) speech information is available. More generally, this suggests that the interplay between attention and multisensory integration during natural audiovisual speech is dynamic and is adaptable based on the specific task and environment.
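The AV versus A+V comparison can be illustrated with a toy lagged-regression example. Everything here is synthetic and assumed (signal sizes, an explicit audio-by-visual interaction term standing in for multisensory integration); it demonstrates the comparison logic, not the study's actual model definitions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_lags = 2000, 10

def lagged(x, n_lags):
    """Design matrix of x at lags 0..n_lags-1 (circular, for simplicity)."""
    return np.column_stack([np.roll(x, k) for k in range(n_lags)])

audio = rng.standard_normal(n)
visual = rng.standard_normal(n)
# Synthetic "EEG" with an audio response, a visual response, and a genuine
# multisensory (audio*visual) component that only a joint model can capture.
eeg = (lagged(audio, n_lags) @ rng.standard_normal(n_lags)
       + lagged(visual, n_lags) @ rng.standard_normal(n_lags)
       + 0.5 * lagged(audio * visual, n_lags) @ rng.standard_normal(n_lags)
       + rng.standard_normal(n))

def fit_predict(X, y):
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ w

# A+V: two independent unisensory fits, predictions summed.
pred_apv = (fit_predict(lagged(audio, n_lags), eeg)
            + fit_predict(lagged(visual, n_lags), eeg))

# AV: one joint fit that also includes the interaction term.
X_av = np.hstack([lagged(audio, n_lags), lagged(visual, n_lags),
                  lagged(audio * visual, n_lags)])
pred_av = fit_predict(X_av, eeg)

r_apv = np.corrcoef(pred_apv, eeg)[0, 1]
r_av = np.corrcoef(pred_av, eeg)[0, 1]
print(r_av > r_apv)   # the joint model captures the multisensory component
```

In this toy setup the AV model predicts the signal better than A+V precisely because the data contain a genuine multisensory component; when no such component exists, the two models perform comparably on held-out data, which is the diagnostic the comparison relies on.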
Affiliation(s)
- Edmund C. Lalor
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY, United States
4
Di Liberto GM, Attaheri A, Cantisani G, Reilly RB, Ní Choisdealbha Á, Rocha S, Brusini P, Goswami U. Emergence of the cortical encoding of phonetic features in the first year of life. Nat Commun 2023; 14:7789. [PMID: 38040720] [PMCID: PMC10692113] [DOI: 10.1038/s41467-023-43490-x]
Abstract
Even prior to producing their first words, infants are developing a sophisticated speech processing system, with robust word recognition present by 4-6 months of age. These emergent linguistic skills, observed in behavioural investigations, are likely to rely on increasingly sophisticated neural underpinnings. The infant brain is known to robustly track the speech envelope; however, previous cortical tracking studies were unable to demonstrate the presence of phonetic feature encoding. Here we utilise temporal response functions computed from electrophysiological responses to nursery rhymes to investigate the cortical encoding of phonetic features in a longitudinal cohort of infants aged 4, 7, and 11 months, as well as in adults. The analyses reveal an increasingly detailed and acoustically invariant phonetic encoding emerging over the first year of life, providing neurophysiological evidence that the pre-verbal human cortex learns phonetic categories. By contrast, we found no credible evidence for age-related increases in cortical tracking of the acoustic spectrogram.
Affiliation(s)
- Giovanni M Di Liberto
- ADAPT Centre, School of Computer Science and Statistics, Trinity College, The University of Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College, The University of Dublin, Dublin, Ireland
- Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Adam Attaheri
- Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Giorgia Cantisani
- ADAPT Centre, School of Computer Science and Statistics, Trinity College, The University of Dublin, Dublin, Ireland
- Laboratoire des Systémes Perceptifs, Département d'études Cognitives, École normale supérieure, PSL University, CNRS, 75005, Paris, France
- Richard B Reilly
- Trinity College Institute of Neuroscience, Trinity College, The University of Dublin, Dublin, Ireland
- School of Engineering, Trinity Centre for Biomedical Engineering, Trinity College, The University of Dublin, Dublin, Ireland
- School of Medicine, Trinity College, The University of Dublin, Dublin, Ireland
- Áine Ní Choisdealbha
- Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Sinead Rocha
- Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Perrine Brusini
- Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
- Usha Goswami
- Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Cambridge, United Kingdom
5
Brodbeck C, Das P, Gillis M, Kulasingham JP, Bhattasali S, Gaston P, Resnik P, Simon JZ. Eelbrain, a Python toolkit for time-continuous analysis with temporal response functions. eLife 2023; 12:e85012. [PMID: 38018501] [PMCID: PMC10783870] [DOI: 10.7554/eLife.85012]
Abstract
Even though human experience unfolds continuously in time, it is not strictly linear; instead, it entails cascading processes building hierarchical cognitive structures. For instance, during speech perception, humans transform a continuously varying acoustic signal into phonemes, words, and meaning, and these levels all have distinct but interdependent temporal structures. Time-lagged regression using temporal response functions (TRFs) has recently emerged as a promising tool for disentangling electrophysiological brain responses related to such complex models of perception. Here, we introduce the Eelbrain Python toolkit, which makes this kind of analysis easy and accessible. We demonstrate its use, using continuous speech as a sample paradigm, with a freely available EEG dataset of audiobook listening. A companion GitHub repository provides the complete source code for the analysis, from raw data to group-level statistics. More generally, we advocate a hypothesis-driven approach in which the experimenter specifies a hierarchy of time-continuous representations that are hypothesized to have contributed to brain responses, and uses those as predictor variables for the electrophysiological signal. This is analogous to a multiple regression problem, but with the addition of a time dimension. TRF analysis decomposes the brain signal into distinct responses associated with the different predictor variables by estimating a multivariate TRF (mTRF), quantifying the influence of each predictor on brain responses as a function of time lags. This allows asking two questions about the predictor variables: (1) Is there a significant neural representation corresponding to this predictor variable? And if so, (2) what are the temporal characteristics of the neural response associated with it? Thus, different predictor variables can be systematically combined and evaluated to jointly model neural processing at multiple hierarchical levels. We discuss applications of this approach, including the potential for linking algorithmic/representational theories at different cognitive levels to brain responses through computational models with appropriate linking hypotheses.
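The core estimation step the toolkit wraps, time-lagged ridge regression of a stimulus feature onto a brain signal, can be sketched in plain NumPy. This is a minimal illustration on simulated data (sampling rate, lag range, noise level, and regularization are all made up), not Eelbrain's API.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 64                                   # sampling rate, Hz
n = 4096
envelope = rng.standard_normal(n)         # stand-in for a speech envelope

# Ground-truth TRF: a response peaking roughly 100 ms after the stimulus
lags = np.arange(0, 24)                   # 0 to ~360 ms at 64 Hz
true_trf = np.exp(-0.5 * ((lags / fs - 0.1) / 0.03) ** 2)

# Time-lagged design matrix: envelope at each lag (circular, for simplicity)
X = np.column_stack([np.roll(envelope, k) for k in lags])
eeg = X @ true_trf + 0.5 * rng.standard_normal(n)   # simulated brain signal

# Ridge estimate: w = (X'X + lambda*I)^-1 X'y
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg)

peak_lag_ms = lags[np.argmax(w)] / fs * 1000
print(round(peak_lag_ms))                 # recovers the ~100 ms response peak
```

The estimated weight vector `w` is the TRF: the brain's impulse response to the predictor. Stacking lagged copies of several predictors side by side in `X` yields the multivariate (mTRF) case described above.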
Affiliation(s)
- Proloy Das
- Stanford University, Stanford, United States
- Philip Resnik
- University of Maryland, College Park, College Park, United States
6
Zhang Y, Ding R, Frassinelli D, Tuomainen J, Klavinskis-Whiting S, Vigliocco G. The role of multimodal cues in second language comprehension. Sci Rep 2023; 13:20824. [PMID: 38012193] [PMCID: PMC10682458] [DOI: 10.1038/s41598-023-47643-2]
Abstract
In face-to-face communication, multimodal cues such as prosody, gestures, and mouth movements can play a crucial role in language processing. While several studies have addressed how these cues contribute to native (L1) language processing, their impact on non-native (L2) comprehension is largely unknown. Comprehension of naturalistic language by L2 comprehenders may be supported by the presence of (at least some) multimodal cues, as these provide correlated and convergent information that may aid linguistic processing. However, it is also the case that multimodal cues may be less used by L2 comprehenders because linguistic processing is more demanding than for L1 comprehenders, leaving more limited resources for the processing of multimodal cues. In this study, we investigated how L2 comprehenders use multimodal cues in naturalistic stimuli (while participants watched videos of a speaker), as measured by electrophysiological responses (N400) to words, and whether there are differences between L1 and L2 comprehenders. We found that prosody, gestures, and informative mouth movements each reduced the N400 in L2, indexing easier comprehension. Nevertheless, L2 participants showed weaker effects for each cue compared to L1 comprehenders, with the exception of meaningful gestures and informative mouth movements. These results show that L2 comprehenders focus on specific multimodal cues - meaningful gestures that support meaningful interpretation and mouth movements that enhance the acoustic signal - while using multimodal cues to a lesser extent than L1 comprehenders overall.
Affiliation(s)
- Ye Zhang
- Experimental Psychology, University College London, London, UK
- Rong Ding
- Language and Computation in Neural Systems, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Diego Frassinelli
- Department of Linguistics, University of Konstanz, Konstanz, Germany
- Jyrki Tuomainen
- Speech, Hearing and Phonetic Sciences, University College London, London, UK
7
Wang B, Xu X, Niu Y, Wu C, Wu X, Chen J. EEG-based auditory attention decoding with audiovisual speech for hearing-impaired listeners. Cereb Cortex 2023; 33:10972-10983. [PMID: 37750333] [DOI: 10.1093/cercor/bhad325]
Abstract
Auditory attention decoding (AAD) is used to determine the attended speaker during an auditory selective attention task. However, the auditory factors modulating AAD remain unclear for hearing-impaired (HI) listeners. In this study, scalp electroencephalography (EEG) was recorded with an auditory selective attention paradigm, in which HI listeners were instructed to attend to one of two simultaneous speech streams with or without congruent visual input (articulation movements), and at a high or low target-to-masker ratio (TMR). Meanwhile, behavioral hearing tests (i.e., audiogram, speech reception threshold, temporal modulation transfer function) were used to assess listeners' individual auditory abilities. The results showed that both visual input and increasing TMR significantly enhanced the cortical tracking of the attended speech and AAD accuracy. Further analysis revealed that the audiovisual (AV) gain in attended-speech cortical tracking was significantly correlated with listeners' auditory amplitude modulation (AM) sensitivity, and the TMR gain in attended-speech cortical tracking was significantly correlated with listeners' hearing thresholds. Temporal response function analysis revealed that subjects with higher AM sensitivity demonstrated more AV gain over the right occipitotemporal and bilateral frontocentral scalp electrodes.
Affiliation(s)
- Bo Wang
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- Xiran Xu
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- Yadong Niu
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- Chao Wu
- School of Nursing, Peking University, Beijing 100191, China
- Xihong Wu
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- National Biomedical Imaging Center, College of Future Technology, Beijing 100871, China
- Jing Chen
- Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), School of Intelligence Science and Technology, Peking University, Beijing 100871, China
- National Biomedical Imaging Center, College of Future Technology, Beijing 100871, China
8
Tan SHJ, Kalashnikova M, Di Liberto GM, Crosse MJ, Burnham D. Seeing a Talking Face Matters: Gaze Behavior and the Auditory-Visual Speech Benefit in Adults' Cortical Tracking of Infant-directed Speech. J Cogn Neurosci 2023; 35:1741-1759. [PMID: 37677057] [DOI: 10.1162/jocn_a_02044]
Abstract
In face-to-face conversations, listeners gather visual speech information from a speaker's talking face that enhances their perception of the incoming auditory speech signal. This auditory-visual (AV) speech benefit is evident even in quiet environments but is stronger in situations that require greater listening effort, such as when the speech signal itself deviates from listeners' expectations. One example is infant-directed speech (IDS) presented to adults. IDS has exaggerated acoustic properties that are easily discriminable from adult-directed speech (ADS). Although IDS is a speech register that adults typically use with infants, no previous neurophysiological study has directly examined whether adult listeners process IDS differently from ADS. To address this, the current study simultaneously recorded EEG and eye-tracking data from adult participants as they were presented with auditory-only (AO), visual-only, and AV recordings of IDS and ADS. Eye-tracking data were recorded because looking behavior toward the speaker's eyes and mouth modulates the extent of AV speech benefit experienced. Analyses of cortical tracking accuracy revealed that cortical tracking of the speech envelope was significant in AO and AV modalities for IDS and ADS. However, the AV speech benefit [i.e., AV > (A + V)] was only present for IDS trials. Gaze behavior analyses indicated differences in looking behavior during IDS and ADS trials. Surprisingly, looking behavior toward the speaker's eyes and mouth was not correlated with cortical tracking accuracy. Additional exploratory analyses indicated that attention to the whole display was negatively correlated with cortical tracking accuracy of AO and visual-only trials in IDS. Our results underscore the nuances involved in the relationship between neurophysiological AV speech benefit and looking behavior.
Affiliation(s)
- Sok Hui Jessica Tan
- The MARCS Institute of Brain, Behaviour and Development, Western Sydney University, Australia
- Science of Learning in Education Centre, Office of Education Research, National Institute of Education, Nanyang Technological University, Singapore
- Marina Kalashnikova
- The Basque Center on Cognition, Brain and Language
- IKERBASQUE, Basque Foundation for Science
- Giovanni M Di Liberto
- ADAPT Centre, School of Computer Science and Statistics, Trinity College Institute of Neuroscience, Trinity College, The University of Dublin, Ireland
- Michael J Crosse
- SEGOTIA, Galway, Ireland
- Trinity Center for Biomedical Engineering, Department of Mechanical, Manufacturing & Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Denis Burnham
- The MARCS Institute of Brain, Behaviour and Development, Western Sydney University, Australia
9
Guilleminot P, Graef C, Butters E, Reichenbach T. Audiotactile Stimulation Can Improve Syllable Discrimination through Multisensory Integration in the Theta Frequency Band. J Cogn Neurosci 2023; 35:1760-1772. [PMID: 37677062] [DOI: 10.1162/jocn_a_02045]
Abstract
Syllables are an essential building block of speech. We recently showed that tactile stimuli linked to the perceptual centers of syllables in continuous speech can improve speech comprehension. The rate of syllables lies in the theta frequency range, between 4 and 8 Hz, and the behavioral effect appears linked to multisensory integration in this frequency band. Because this neural activity may be oscillatory, we hypothesized that a behavioral effect may also occur not only while but also after this activity has been evoked or entrained through vibrotactile pulses. Here, we show that audiotactile integration regarding the perception of single syllables, both on the neural and on the behavioral level, is consistent with this hypothesis. We first stimulated participants with a series of vibrotactile pulses and then presented them with a syllable in background noise. We show that, at a delay of 200 msec after the last vibrotactile pulse, audiotactile integration still occurred in the theta band and syllable discrimination was enhanced. Moreover, the dependence of both the neural multisensory integration as well as of the behavioral discrimination on the delay of the audio signal with respect to the last tactile pulse was consistent with a damped oscillation. In addition, the multisensory gain is correlated with the syllable discrimination score. Our results therefore evidence the role of the theta band in audiotactile integration and provide evidence that these effects may involve oscillatory activity that still persists after the tactile stimulation.
10
Jiang Z, An X, Liu S, Yin E, Yan Y, Ming D. Neural oscillations reflect the individual differences in the temporal perception of audiovisual speech. Cereb Cortex 2023; 33:10575-10583. [PMID: 37727958] [DOI: 10.1093/cercor/bhad304]
Abstract
Multisensory integration occurs within a limited time interval between multimodal stimuli. Multisensory temporal perception varies widely among individuals and involves perceptual synchrony and temporal sensitivity processes. Previous studies explored the neural mechanisms of individual differences with beep-flash stimuli, but none had examined speech. In this study, 28 subjects (16 male) performed an audiovisual speech (/ba/) simultaneity judgment task while their electroencephalography was recorded. We examined the relationship between prestimulus neural oscillations (i.e., the pre-pronunciation movement-related oscillations) and temporal perception. Perceptual synchrony was quantified using the Point of Subjective Simultaneity, and temporal sensitivity using the Temporal Binding Window. Our results revealed dissociated neural mechanisms for individual differences in the Temporal Binding Window and the Point of Subjective Simultaneity. Frontocentral delta power, reflecting top-down attention control, is positively related to the magnitude of individual auditory leading Temporal Binding Windows (LTBWs), whereas parieto-occipital theta power, indexing bottom-up visual temporal attention specific to speech, is negatively associated with the magnitude of individual visual leading Temporal Binding Windows (RTBWs). In addition, increased left frontal and bilateral temporoparietal-occipital alpha power, reflecting general attentional states, is associated with increased Points of Subjective Simultaneity. Strengthening attention abilities might improve the audiovisual temporal perception of speech and further impact speech integration.
Affiliation(s)
- Zeliang Jiang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Xingwei An
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Shuang Liu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Erwei Yin
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Defense Innovation Institute, Academy of Military Sciences (AMS), 100071 Beijing, China
- Tianjin Artificial Intelligence Innovation Center (TAIIC), 300457 Tianjin, China
- Ye Yan
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
- Defense Innovation Institute, Academy of Military Sciences (AMS), 100071 Beijing, China
- Tianjin Artificial Intelligence Innovation Center (TAIIC), 300457 Tianjin, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, 300072 Tianjin, China
11
Ahmed F, Nidiffer AR, Lalor EC. The effect of gaze on EEG measures of multisensory integration in a cocktail party scenario. bioRxiv [Preprint] 2023:2023.08.23.554451. [PMID: 37662393] [PMCID: PMC10473711] [DOI: 10.1101/2023.08.23.554451]
Affiliation(s)
- Farhin Ahmed
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY 14627, USA
- Aaron R. Nidiffer
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY 14627, USA
- Edmund C. Lalor
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY 14627, USA
12
Jia Z, Xu C, Li J, Gao J, Ding N, Luo B, Zou J. Phase Property of Envelope-Tracking EEG Response Is Preserved in Patients with Disorders of Consciousness. eNeuro 2023; 10:ENEURO.0130-23.2023. [PMID: 37500493 PMCID: PMC10420405 DOI: 10.1523/eneuro.0130-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2023] [Revised: 07/16/2023] [Accepted: 07/20/2023] [Indexed: 07/29/2023] Open
Abstract
When listening to speech, the low-frequency cortical response below 10 Hz can track the speech envelope. Previous studies have demonstrated that the phase lag between speech envelope and cortical response can reflect the mechanism by which the envelope-tracking response is generated. Here, we analyze whether the mechanism to generate the envelope-tracking response is modulated by the level of consciousness, by studying how the stimulus-response phase lag is modulated by the disorder of consciousness (DoC). It is observed that DoC patients in general show less reliable neural tracking of speech. Nevertheless, the stimulus-response phase lag changes linearly with frequency between 3.5 and 8 Hz, for DoC patients who show reliable cortical tracking to speech, regardless of the consciousness state. The mean phase lag is also consistent across these DoC patients. These results suggest that the envelope-tracking response to speech can be generated by an automatic process that is barely modulated by the consciousness state.
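The linear change of phase lag with frequency reported above is the signature of a constant response latency, since a fixed delay tau produces a phase lag of 2*pi*f*tau at frequency f. A minimal sketch of how such a latency could be recovered from measured phase lags (synthetic values only, not the authors' code; assumes NumPy):

```python
import numpy as np

def latency_from_phase_lags(freqs_hz, phase_lags_rad):
    """If the envelope-tracking response lags the stimulus by a fixed time tau,
    the stimulus-response phase lag grows linearly with frequency:
    phi(f) = 2 * pi * f * tau. Fitting a line to the unwrapped phase lags
    therefore recovers tau (in seconds) from the slope."""
    phases = np.unwrap(np.asarray(phase_lags_rad, dtype=float))
    slope, _intercept = np.polyfit(np.asarray(freqs_hz, dtype=float), phases, deg=1)
    return slope / (2 * np.pi)

# Simulated phase lags over the 3.5-8 Hz range for a 120 ms latency
freqs = np.linspace(3.5, 8.0, 10)
tau_true = 0.120
lags = 2 * np.pi * freqs * tau_true
tau_hat = latency_from_phase_lags(freqs, lags)
```

In the study's logic, a latency that is consistent across patients regardless of consciousness state is what points to an automatic generation mechanism.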
Affiliation(s)
- Ziting Jia
- The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan 250033, China
- Chuan Xu
- Department of Neurology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou 310019, China
- Jingqi Li
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Jian Gao
- Department of Rehabilitation, Hangzhou Mingzhou Brain Rehabilitation Hospital, Hangzhou 311215, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
- Benyan Luo
- Department of Neurology, First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310003, China
- Jiajie Zou
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China
13
Ahmed F, Nidiffer AR, O'Sullivan AE, Zuk NJ, Lalor EC. The integration of continuous audio and visual speech in a cocktail-party environment depends on attention. Neuroimage 2023; 274:120143. [PMID: 37121375 DOI: 10.1016/j.neuroimage.2023.120143] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2022] [Revised: 03/17/2023] [Accepted: 04/27/2023] [Indexed: 05/02/2023] Open
Abstract
In noisy environments, our ability to understand speech benefits greatly from seeing the speaker's face. This is attributed to the brain's ability to integrate audio and visual information, a process known as multisensory integration. In addition, selective attention plays an enormous role in what we understand, the so-called cocktail-party phenomenon. But how attention and multisensory integration interact remains incompletely understood, particularly in the case of natural, continuous speech. Here, we addressed this issue by analyzing EEG data recorded from participants who undertook a multisensory cocktail-party task using natural speech. To assess multisensory integration, we modeled the EEG responses to the speech in two ways. The first assumed that audiovisual speech processing is simply a linear combination of audio speech processing and visual speech processing (i.e., an A + V model), while the second allows for the possibility of audiovisual interactions (i.e., an AV model). Applying these models to the data revealed that EEG responses to attended audiovisual speech were better explained by an AV model, providing evidence for multisensory integration. In contrast, unattended audiovisual speech responses were best captured using an A + V model, suggesting that multisensory integration is suppressed for unattended speech. Follow up analyses revealed some limited evidence for early multisensory integration of unattended AV speech, with no integration occurring at later levels of processing. We take these findings as evidence that the integration of natural audio and visual speech occurs at multiple levels of processing in the brain, each of which can be differentially affected by attention.
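The A + V versus AV comparison described above can be illustrated with synthetic data: when the response contains a genuine audiovisual interaction, a jointly fitted AV model predicts it better than summed unisensory fits. A toy sketch, not the study's pipeline; the signals, lag structure, and explicit interaction term are all made up for illustration (assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)

def lagged(x, n_lags):
    """Design matrix of delayed copies of a stimulus (a crude TRF-style model)."""
    X = np.zeros((len(x), n_lags))
    for k in range(n_lags):
        X[k:, k] = x[:len(x) - k]
    return X

def fit(X, y, lam=1e-3):
    """Ridge-regularized least squares."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

n, n_lags = 5000, 8
audio = rng.standard_normal(n)     # synthetic audio envelope
visual = rng.standard_normal(n)    # synthetic visual speech signal
# Simulated EEG: unisensory responses plus a genuine audiovisual interaction
eeg = (lagged(audio, n_lags) @ rng.standard_normal(n_lags)
       + lagged(visual, n_lags) @ rng.standard_normal(n_lags)
       + 0.5 * audio * visual
       + 0.1 * rng.standard_normal(n))

Xa, Xv = lagged(audio, n_lags), lagged(visual, n_lags)
# A + V: two independent unisensory models, predictions summed
pred_a_plus_v = Xa @ fit(Xa, eeg) + Xv @ fit(Xv, eeg)
# AV: one joint model that also carries an interaction regressor
Xav = np.column_stack([Xa, Xv, audio * visual])
pred_av = Xav @ fit(Xav, eeg)

def score(pred, y):
    return np.corrcoef(pred, y)[0, 1]

r_a_plus_v, r_av = score(pred_a_plus_v, eeg), score(pred_av, eeg)
```

For brevity the models here are fit and scored on the same data; the study compared cross-validated prediction accuracy.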
Affiliation(s)
- Farhin Ahmed
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA
- Aaron R Nidiffer
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA
- Aisling E O'Sullivan
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA; School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Nathaniel J Zuk
- Edmond & Lily Safra Center for Brain Sciences, Hebrew University, Jerusalem, Israel
- Edmund C Lalor
- Department of Biomedical Engineering, Department of Neuroscience, and Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY 14627, USA; School of Engineering, Trinity Centre for Biomedical Engineering, and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland.
14
Fu X, Riecke L. Effects of continuous tactile stimulation on auditory-evoked cortical responses depend on the audio-tactile phase. Neuroimage 2023; 274:120140. [PMID: 37120042 DOI: 10.1016/j.neuroimage.2023.120140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2023] [Accepted: 04/27/2023] [Indexed: 05/01/2023] Open
Abstract
Auditory perception can benefit from stimuli in non-auditory sensory modalities, as for example in lip-reading. Compared with such visual influences, tactile influences are still poorly understood. It has been shown that single tactile pulses can enhance the perception of auditory stimuli depending on their relative timing, but whether and how such brief auditory enhancements can be stretched in time with more sustained, phase-specific periodic tactile stimulation is still unclear. To address this question, we presented tactile stimulation that fluctuated coherently and continuously at 4Hz with an auditory noise (either in-phase or anti-phase) and assessed its effect on the cortical processing and perception of an auditory signal embedded in that noise. Scalp-electroencephalography recordings revealed an enhancing effect of in-phase tactile stimulation on cortical responses phase-locked to the noise and a suppressive effect of anti-phase tactile stimulation on responses evoked by the auditory signal. Although these effects appeared to follow well-known principles of multisensory integration of discrete audio-tactile events, they were not accompanied by corresponding effects on behavioral measures of auditory signal perception. Our results indicate that continuous periodic tactile stimulation can enhance cortical processing of acoustically-induced fluctuations and mask cortical responses to an ongoing auditory signal. They further suggest that such sustained cortical effects can be insufficient for inducing sustained bottom-up auditory benefits.
Affiliation(s)
- Xueying Fu
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands.
- Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands
15
Chalas N, Omigie D, Poeppel D, van Wassenhove V. Hierarchically nested networks optimize the analysis of audiovisual speech. iScience 2023; 26:106257. [PMID: 36909667 PMCID: PMC9993032 DOI: 10.1016/j.isci.2023.106257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2022] [Revised: 12/22/2022] [Accepted: 02/17/2023] [Indexed: 02/22/2023] Open
Abstract
In conversational settings, seeing the speaker's face elicits internal predictions about the upcoming acoustic utterance. Understanding how the listener's cortical dynamics tune to the temporal statistics of audiovisual (AV) speech is thus essential. Using magnetoencephalography, we explored how large-scale frequency-specific dynamics of human brain activity adapt to AV speech delays. First, we show that the amplitude of phase-locked responses parametrically decreases with natural AV speech synchrony, a pattern that is consistent with predictive coding. Second, we show that the temporal statistics of AV speech affect large-scale oscillatory networks at multiple spatial and temporal resolutions. We demonstrate a spatial nestedness of oscillatory networks during the processing of AV speech: these oscillatory hierarchies are such that high-frequency activity (beta, gamma) is contingent on the phase response of low-frequency (delta, theta) networks. Our findings suggest that the endogenous temporal multiplexing of speech processing confers adaptability within the temporal regimes that are essential for speech comprehension.
Affiliation(s)
- Nikos Chalas
- Institute for Biomagnetism and Biosignal Analysis, University of Münster, 48149 Münster, Germany
- CEA, DRF/Joliot, NeuroSpin, INSERM, Cognitive Neuroimaging Unit; CNRS; Université Paris-Saclay, 91191 Gif/Yvette, France
- School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
- Corresponding author
- Diana Omigie
- Department of Psychology, Goldsmiths University London, London, UK
- David Poeppel
- Department of Psychology, New York University, New York, NY 10003, USA
- Ernst Struengmann Institute for Neuroscience, 60528 Frankfurt am Main, Frankfurt, Germany
- Virginie van Wassenhove
- CEA, DRF/Joliot, NeuroSpin, INSERM, Cognitive Neuroimaging Unit; CNRS; Université Paris-Saclay, 91191 Gif/Yvette, France
- Corresponding author
16
Ryumin D, Ivanko D, Ryumina E. Audio-Visual Speech and Gesture Recognition by Sensors of Mobile Devices. SENSORS (BASEL, SWITZERLAND) 2023; 23:s23042284. [PMID: 36850882 PMCID: PMC9967234 DOI: 10.3390/s23042284] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Revised: 02/06/2023] [Accepted: 02/14/2023] [Indexed: 05/27/2023]
Abstract
Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable speech recognition, particularly when audio is corrupted by noise. Additional visual information can be used for both automatic lip-reading and gesture recognition. Hand gestures are a form of non-verbal communication and can be an important part of modern human-computer interaction systems. Currently, audio and video modalities are easily accessible by the sensors of mobile devices. However, there is no out-of-the-box solution for automatic audio-visual speech and gesture recognition. This study introduces two deep neural network-based model architectures: one for AVSR and one for gesture recognition. The main novelty regarding audio-visual speech recognition lies in the fine-tuning strategies for both visual and acoustic features and in the proposed end-to-end model, which considers three modality fusion approaches: prediction-level, feature-level, and model-level. The main novelty in gesture recognition lies in a unique set of spatio-temporal features, including those that capture lip articulation information. As no datasets are available for the combined task, we evaluated our methods on two different large-scale corpora, LRW and AUTSL, and outperformed existing methods on both the audio-visual speech recognition and gesture recognition tasks. We achieved an AVSR accuracy of 98.76% on the LRW dataset and a gesture recognition rate of 98.56% on the AUTSL dataset. The results demonstrate not only the high performance of the proposed methodology but also the fundamental possibility of recognizing audio-visual speech and gestures with the sensors of mobile devices.
17
Jiang Z, An X, Liu S, Wang L, Yin E, Yan Y, Ming D. The effect of prestimulus low-frequency neural oscillations on the temporal perception of audiovisual speech. Front Neurosci 2023; 17:1067632. [PMID: 36816126 PMCID: PMC9935937 DOI: 10.3389/fnins.2023.1067632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2022] [Accepted: 01/17/2023] [Indexed: 02/05/2023] Open
Abstract
Objective Perceptual integration and segregation are modulated by the phase of ongoing neural oscillations whose period is longer than the temporal binding window (TBW). Studies have shown that abstract beep-flash stimuli, with a TBW of about 100 ms, are modulated by the alpha-band phase. We therefore hypothesized that the temporal perception of speech, whose TBW spans several hundred milliseconds, might be affected by the delta-theta phase. Methods We conducted a speech-stimuli-based audiovisual simultaneity judgment (SJ) experiment. Twenty participants (12 female) took part in this study while 62-channel EEG was recorded. Results Behaviorally, visual-leading TBWs were broader than auditory-leading ones [273.37 ± 24.24 ms vs. 198.05 ± 19.28 ms (mean ± SEM)]. We used the Phase Opposition Sum (POS) to quantify differences in mean phase angles and phase concentrations between synchronous and asynchronous responses. The POS results indicated that the delta-theta phase differed significantly between synchronous and asynchronous responses in the A50V condition (50% synchronous responses at the auditory-leading SOA), whereas in the V50A condition (50% synchronous responses at the visual-leading SOA) we found only a delta-band effect. In neither condition did post hoc Rayleigh tests reveal a consistent phase angle across subjects for either perceptual response (all ps > 0.05), arguing against the neuronal-excitability account, which predicts that phases for a given perceptual response should concentrate on the same angle across subjects rather than being uniformly distributed. However, a V-test showed that the phase difference between synchronous and asynchronous responses had a significant phase opposition across subjects (all ps < 0.05), consistent with the POS result.
Conclusion These results indicate that speech temporal perception depends on the alignment of stimulus onset with an optimal phase of neural oscillations whose period may exceed the TBW. Rather than indexing neuronal excitability, the oscillatory phase may encode temporal information that varies across subjects. Given the rich temporal structure of spoken language, the conclusion that phase encodes temporal information is plausible and valuable for future research.
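The Phase Opposition Sum used above contrasts the phase concentration within each response group against the concentration of the pooled trials. A minimal sketch with synthetic phase data (not the authors' code; the cluster parameters are made up for illustration, and NumPy is assumed):

```python
import numpy as np

def itc(phases):
    """Inter-trial coherence: length of the mean resultant vector of the phases."""
    return np.abs(np.mean(np.exp(1j * np.asarray(phases))))

def phase_opposition_sum(phases_a, phases_b):
    """POS = ITC(A) + ITC(B) - 2 * ITC(pooled). It is large when each response
    group clusters tightly but around opposite phase angles, and near zero
    when phase carries no information about the perceptual outcome."""
    pooled = np.concatenate([np.asarray(phases_a), np.asarray(phases_b)])
    return itc(phases_a) + itc(phases_b) - 2 * itc(pooled)

rng = np.random.default_rng(1)
# Synchronous / asynchronous trials clustered at opposite phases (0 vs. pi)
sync_phases = rng.vonmises(0.0, 5.0, size=200)
async_phases = rng.vonmises(np.pi, 5.0, size=200)
pos_opposed = phase_opposition_sum(sync_phases, async_phases)
# Null case: phases unrelated to the response
pos_null = phase_opposition_sum(rng.uniform(-np.pi, np.pi, 200),
                                rng.uniform(-np.pi, np.pi, 200))
```

In practice, the significance of a POS value is assessed with permutation tests that shuffle the trial labels.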
Affiliation(s)
- Zeliang Jiang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Xingwei An
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Shuang Liu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Lu Wang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Erwei Yin
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Ye Yan
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Defense Innovation Institute, Academy of Military Sciences (AMS), Beijing, China; Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
18
Saalasti S, Alho J, Lahnakoski JM, Bacha-Trams M, Glerean E, Jääskeläinen IP, Hasson U, Sams M. Lipreading a naturalistic narrative in a female population: Neural characteristics shared with listening and reading. Brain Behav 2023; 13:e2869. [PMID: 36579557 PMCID: PMC9927859 DOI: 10.1002/brb3.2869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Revised: 11/29/2022] [Accepted: 12/06/2022] [Indexed: 12/30/2022] Open
Abstract
INTRODUCTION Few of us are skilled lipreaders while most struggle with the task. The neural substrates that enable comprehension of connected natural speech via lipreading are not yet well understood. METHODS We used a data-driven approach to identify brain areas underlying the lipreading of an 8-min narrative with participants whose lipreading skills varied extensively (range 6-100%, mean = 50.7%). The participants also listened to and read the same narrative. The similarity between individual participants' brain activity during the whole narrative, within and between conditions, was estimated by a voxel-wise comparison of the Blood Oxygenation Level Dependent (BOLD) signal time courses. RESULTS Inter-subject correlation (ISC) of the time courses revealed that lipreading, listening to, and reading the narrative were largely supported by the same brain areas in the temporal, parietal and frontal cortices, precuneus, and cerebellum. Additionally, listening to and reading connected naturalistic speech activated higher-level linguistic processing in the parietal and frontal cortices more consistently than lipreading did, probably paralleling the limited understanding obtained via lipreading. Importantly, higher lipreading test scores and subjective estimates of comprehension of the lipread narrative were associated with activity in the superior and middle temporal cortex. CONCLUSIONS Our new data illustrate that findings from prior studies using well-controlled repetitive speech stimuli and stimulus-driven data analyses are also valid for naturalistic connected speech. Our results may indicate an efficient use of brain areas dealing with phonological processing in skilled lipreaders.
Affiliation(s)
- Satu Saalasti
- Department of Psychology and Logopedics, University of Helsinki, Helsinki, Finland; Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Advanced Magnetic Imaging (AMI) Centre, Aalto NeuroImaging, School of Science, Aalto University, Espoo, Finland
- Jussi Alho
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Juha M Lahnakoski
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Independent Max Planck Research Group for Social Neuroscience, Max Planck Institute of Psychiatry, Munich, Germany; Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Center Jülich, Jülich, Germany; Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Mareike Bacha-Trams
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Enrico Glerean
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Department of Psychology and the Neuroscience Institute, Princeton University, Princeton, USA
- Iiro P Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Uri Hasson
- Department of Psychology and the Neuroscience Institute, Princeton University, Princeton, USA
- Mikko Sams
- Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland; Aalto Studios - MAGICS, Aalto University, Espoo, Finland
19
Neurodevelopmental oscillatory basis of speech processing in noise. Dev Cogn Neurosci 2022; 59:101181. [PMID: 36549148 PMCID: PMC9792357 DOI: 10.1016/j.dcn.2022.101181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Revised: 10/31/2022] [Accepted: 11/25/2022] [Indexed: 11/27/2022] Open
Abstract
Humans' extraordinary ability to understand speech in noise relies on multiple processes that develop with age. Using magnetoencephalography (MEG), we characterize the underlying neuromaturational basis by quantifying how cortical oscillations in 144 participants (aged 5-27 years) track phrasal and syllabic structures in connected speech mixed with different types of noise. While the extraction of prosodic cues from clear speech was stable during development, its maintenance in a multi-talker background matured rapidly up to age 9 and was associated with speech comprehension. Furthermore, while the extraction of subtler information provided by syllables matured at age 9, its maintenance in noisy backgrounds progressively matured until adulthood. Altogether, these results highlight distinct behaviorally relevant maturational trajectories for the neuronal signatures of speech perception. In accordance with grain-size proposals, neuromaturational milestones are reached increasingly late for linguistic units of decreasing size, with further delays incurred by noise.
20
Suess N, Hauswald A, Reisinger P, Rösch S, Keitel A, Weisz N. Cortical tracking of formant modulations derived from silently presented lip movements and its decline with age. Cereb Cortex 2022; 32:4818-4833. [PMID: 35062025 PMCID: PMC9627034 DOI: 10.1093/cercor/bhab518] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2021] [Revised: 12/15/2021] [Accepted: 12/16/2021] [Indexed: 11/26/2022] Open
Abstract
The integration of visual and auditory cues is crucial for successful processing of speech, especially under adverse conditions. Recent reports have shown that when participants watch muted videos of speakers, the phonological information about the acoustic speech envelope, which is associated with but independent from the speakers' lip movements, is tracked by the visual cortex. However, the speech signal also carries richer acoustic details, for example, about the fundamental frequency and the resonant frequencies, whose visuophonological transformation could aid speech processing. Here, we investigated the neural basis of the visuo-phonological transformation processes of these more fine-grained acoustic details and assessed how they change as a function of age. We recorded whole-head magnetoencephalographic (MEG) data while the participants watched silent normal (i.e., natural) and reversed videos of a speaker and paid attention to their lip movements. We found that the visual cortex is able to track the unheard natural modulations of resonant frequencies (or formants) and the pitch (or fundamental frequency) linked to lip movements. Importantly, only the processing of natural unheard formants decreases significantly with age in the visual and also in the cingulate cortex. This is not the case for the processing of the unheard speech envelope, the fundamental frequency, or the purely visual information carried by lip movements. These results show that unheard spectral fine details (along with the unheard acoustic envelope) are transformed from a mere visual to a phonological representation. Aging affects especially the ability to derive spectral dynamics at formant frequencies. As listening in noisy environments should capitalize on the ability to track spectral fine details, our results provide a novel focus on compensatory processes in such challenging situations.
Affiliation(s)
- Nina Suess
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Anne Hauswald
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Patrick Reisinger
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Sebastian Rösch
- Department of Otorhinolaryngology, Head and Neck Surgery, Paracelsus Medical University Salzburg, University Hospital Salzburg, Salzburg 5020, Austria
- Anne Keitel
- School of Social Sciences, University of Dundee, Dundee DD1 4HN, UK
- Nathan Weisz
- Department of Psychology, Centre for Cognitive Neuroscience, University of Salzburg, Salzburg 5020, Austria
- Department of Psychology, Neuroscience Institute, Christian Doppler University Hospital, Paracelsus Medical University, Salzburg 5020, Austria
21
Influence of linguistic properties and hearing impairment on visual speech perception skills in the German language. PLoS One 2022; 17:e0275585. [PMID: 36178907 PMCID: PMC9524625 DOI: 10.1371/journal.pone.0275585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2021] [Accepted: 09/20/2022] [Indexed: 11/19/2022] Open
Abstract
Visual input is crucial for understanding speech under noisy conditions, but there are hardly any tools to assess the individual ability to lipread. With this study, we wanted to (1) investigate how linguistic characteristics of language on the one hand and hearing impairment on the other hand have an impact on lipreading abilities and (2) provide a tool to assess lipreading abilities for German speakers. 170 participants (22 prelingually deaf) completed the online assessment, which consisted of a subjective hearing impairment scale and silent videos in which different item categories (numbers, words, and sentences) were spoken. The task for our participants was to recognize the spoken stimuli just by visual inspection. We used different versions of one test and investigated the impact of item categories, word frequency in the spoken language, articulation, sentence frequency in the spoken language, sentence length, and differences between speakers on the recognition score. We found an effect of item categories, articulation, sentence frequency, and sentence length on the recognition score. With respect to hearing impairment we found that higher subjective hearing impairment is associated with higher test score. We did not find any evidence that prelingually deaf individuals show enhanced lipreading skills over people with postlingual acquired hearing impairment. However, we see an interaction with education only in the prelingual deaf, but not in the population with postlingual acquired hearing loss. This points to the fact that there are different factors contributing to enhanced lipreading abilities depending on the onset of hearing impairment (prelingual vs. postlingual). Overall, lipreading skills vary strongly in the general population independent of hearing impairment. Based on our findings we constructed a new and efficient lipreading assessment tool (SaLT) that can be used to test behavioral lipreading abilities in the German speaking population.
22
Ross LA, Molholm S, Butler JS, Bene VAD, Foxe JJ. Neural correlates of multisensory enhancement in audiovisual narrative speech perception: a fMRI investigation. Neuroimage 2022; 263:119598. [PMID: 36049699 DOI: 10.1016/j.neuroimage.2022.119598] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2022] [Revised: 08/26/2022] [Accepted: 08/28/2022] [Indexed: 11/25/2022] Open
Abstract
This fMRI study investigated the effect of seeing articulatory movements of a speaker while listening to a naturalistic narrative stimulus. It had the goal to identify regions of the language network showing multisensory enhancement under synchronous audiovisual conditions. We expected this enhancement to emerge in regions known to underlie the integration of auditory and visual information such as the posterior superior temporal gyrus as well as parts of the broader language network, including the semantic system. To this end we presented 53 participants with a continuous narration of a story in auditory alone, visual alone, and both synchronous and asynchronous audiovisual speech conditions while recording brain activity using BOLD fMRI. We found multisensory enhancement in an extensive network of regions underlying multisensory integration and parts of the semantic network as well as extralinguistic regions not usually associated with multisensory integration, namely the primary visual cortex and the bilateral amygdala. Analysis also revealed involvement of thalamic brain regions along the visual and auditory pathways more commonly associated with early sensory processing. We conclude that under natural listening conditions, multisensory enhancement not only involves sites of multisensory integration but many regions of the wider semantic network and includes regions associated with extralinguistic sensory, perceptual and cognitive processing.
Affiliation(s)
- Lars A Ross
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; Department of Imaging Sciences, University of Rochester Medical Center, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA.
- Sophie Molholm
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA
- John S Butler
- The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA; School of Mathematical Sciences, Technological University Dublin, Kevin Street Campus, Dublin, Ireland
- Victor A Del Bene
- The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA; University of Alabama at Birmingham, Heersink School of Medicine, Department of Neurology, Birmingham, Alabama, 35233, USA
- John J Foxe
- The Frederick J. and Marion A. Schindler Cognitive Neurophysiology Laboratory, The Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, New York, 14642, USA; The Cognitive Neurophysiology Laboratory, Departments of Pediatrics and Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, New York, 10461, USA.
23
Yu W, Zeiler S, Kolossa D. Reliability-Based Large-Vocabulary Audio-Visual Speech Recognition. SENSORS (BASEL, SWITZERLAND) 2022; 22:5501. [PMID: 35898005 PMCID: PMC9370936 DOI: 10.3390/s22155501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/06/2022] [Revised: 07/15/2022] [Accepted: 07/17/2022] [Indexed: 06/15/2023]
Abstract
Audio-visual speech recognition (AVSR) can significantly improve performance over audio-only recognition for small or medium vocabularies. However, current AVSR systems, whether hybrid or end-to-end (E2E), still do not appear to make optimal use of this secondary information stream, as performance remains clearly diminished in noisy conditions for large-vocabulary systems. We therefore propose a new fusion architecture: the decision fusion net (DFN). A broad range of time-variant reliability measures are used as an auxiliary input to improve performance. The DFN is used in both hybrid and E2E models. Our experiments on two large-vocabulary datasets, the Lip Reading Sentences 2 and 3 (LRS2 and LRS3) corpora, show highly significant improvements in performance over previous AVSR systems for large-vocabulary datasets. The hybrid model with the proposed DFN integration component even outperforms oracle dynamic stream weighting, which is considered to be the theoretical upper bound for conventional dynamic stream-weighting approaches. Compared to the hybrid audio-only model, the proposed DFN achieves a relative word error rate reduction of 51% on average, while the E2E-DFN model, with its more competitive audio-only baseline system, achieves a relative word error rate reduction of 43%, both showing the efficacy of our proposed fusion architecture.
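The oracle dynamic stream weighting used as a baseline above is, in essence, a per-frame convex combination of the two streams' class posteriors. A minimal sketch of that conventional approach on toy posteriors (the function name and example values are ours for illustration; this is not the DFN itself):

```python
import numpy as np

def fuse_stream_weighted(log_post_audio, log_post_video, weights):
    """Per-frame convex combination of audio and video log-posteriors.
    weights[t] in [0, 1] is the time-variant reliability of the audio stream."""
    w = weights[:, None]                       # (T, 1), broadcast over classes
    fused = w * log_post_audio + (1.0 - w) * log_post_video
    return fused.argmax(axis=1)                # per-frame class decision

# Toy example: 3 frames, 4 classes; audio is reliable in frame 0, video in frame 2
la = np.log(np.array([[0.70, 0.10, 0.10, 0.10],
                      [0.25, 0.25, 0.25, 0.25],
                      [0.40, 0.30, 0.20, 0.10]]))
lv = np.log(np.array([[0.30, 0.30, 0.20, 0.20],
                      [0.10, 0.70, 0.10, 0.10],
                      [0.10, 0.10, 0.10, 0.70]]))
w = np.array([0.9, 0.5, 0.1])                  # audio reliability per frame

print(fuse_stream_weighted(la, lv, w))         # prints [0 1 3]
```

The "oracle" variant would choose the weights per frame with knowledge of the ground truth, which is why it bounds what any conventional weighting scheme can achieve.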
24
Crosse MJ, Foxe JJ, Tarrit K, Freedman EG, Molholm S. Resolution of impaired multisensory processing in autism and the cost of switching sensory modality. Commun Biol 2022; 5:601. [PMID: 35773473 PMCID: PMC9246932 DOI: 10.1038/s42003-022-03519-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2021] [Accepted: 05/23/2022] [Indexed: 11/09/2022] Open
Abstract
Children with autism spectrum disorders (ASD) exhibit alterations in multisensory processing, which may contribute to the prevalence of social and communicative deficits in this population. Resolution of multisensory deficits has been observed in teenagers with ASD for complex, social speech stimuli; however, whether this resolution extends to more basic multisensory processing deficits remains unclear. Here, in a cohort of 364 participants we show using simple, non-social audiovisual stimuli that deficits in multisensory processing observed in high-functioning children and teenagers with ASD are not evident in adults with the disorder. Computational modelling indicated that multisensory processing transitions from a default state of competition to one of facilitation, and that this transition is delayed in ASD. Further analysis revealed group differences in how sensory channels are weighted, and how this is impacted by preceding cross-sensory inputs. Our findings indicate that there is a complex and dynamic interplay among the sensory systems that differs considerably in individuals with ASD.

Crosse et al. study a cohort of 364 participants with autism spectrum disorders (ASD) and matched controls, and show that deficits in multisensory processing observed in high-functioning children and teenagers with ASD are not evident in adults with the disorder. Using computational modelling they go on to demonstrate that there is a delayed transition of multisensory processing from a default state of competition to one of facilitation in ASD, as well as differences in sensory weighting and the ability to switch between sensory modalities, which sheds light on the interplay among sensory systems that differ in ASD individuals.
Affiliation(s)
- Michael J Crosse
- The Cognitive Neurophysiology Laboratory, Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY, USA; The Dominick P. Purpura Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY, USA; Trinity Centre for Biomedical Engineering, Department of Mechanical, Manufacturing & Biomedical Engineering, Trinity College Dublin, Dublin, Ireland.
- John J Foxe
- The Cognitive Neurophysiology Laboratory, Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY, USA; The Dominick P. Purpura Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY, USA; The Cognitive Neurophysiology Laboratory, Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- Katy Tarrit
- The Cognitive Neurophysiology Laboratory, Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- Edward G Freedman
- The Cognitive Neurophysiology Laboratory, Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- Sophie Molholm
- The Cognitive Neurophysiology Laboratory, Department of Pediatrics, Albert Einstein College of Medicine, Bronx, NY, USA; The Dominick P. Purpura Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY, USA; The Cognitive Neurophysiology Laboratory, Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA.
25
Jia J, Wang T, Chen S, Ding N, Fang F. Ensemble size perception: Its neural signature and the role of global interaction over individual items. Neuropsychologia 2022; 173:108290. [PMID: 35697088 DOI: 10.1016/j.neuropsychologia.2022.108290] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2021] [Revised: 06/02/2022] [Accepted: 06/07/2022] [Indexed: 10/18/2022]
Abstract
To efficiently process complex visual scenes, the visual system often summarizes statistical information across individual items and represents them as an ensemble. However, due to the lack of techniques to disentangle the representation of the ensemble from that of the individual items constituting the ensemble, whether there exists a specialized neural mechanism for ensemble processing and how ensemble perception is computed in the brain remain unknown. To address these issues, we used a frequency-tagging EEG approach to track brain responses to periodically updated ensemble sizes. Neural responses tracking the ensemble size were detected in parieto-occipital electrodes, revealing a global and specialized neural mechanism of ensemble size perception. We then used the temporal response function to isolate neural responses to the individual sizes and their interactions. Notably, while the individual sizes and their local and global interactions were encoded in the EEG signals, only the global interaction contributed directly to the ensemble size perception. Finally, distributed attention to the global stimulus pattern enhanced the neural signature of the ensemble size, mainly by modulating the neural representation of the global interaction between all individual sizes. These findings advocate a specialized, global neural mechanism of ensemble size perception and suggest that global interaction between individual items contributes to ensemble perception.
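The frequency-tagging approach described here reads out the brain's response at the tagged presentation rate from the EEG amplitude spectrum, typically as a signal-to-noise ratio against neighboring frequency bins. A minimal sketch of that readout on synthetic data (the function name, parameters, and simulated signal are our own illustration, not the authors' pipeline):

```python
import numpy as np

def tagged_snr(eeg, fs, f_tag, n_neighbors=10):
    """SNR at the tagging frequency: spectral amplitude at f_tag divided by
    the mean amplitude of the surrounding bins (excluding f_tag itself)."""
    n = len(eeg)
    amp = np.abs(np.fft.rfft(eeg)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - f_tag)))   # bin closest to the tag
    lo, hi = max(k - n_neighbors, 1), min(k + n_neighbors + 1, len(amp))
    neighbors = np.r_[amp[lo:k], amp[k + 1:hi]]
    return amp[k] / neighbors.mean()

# Synthetic "EEG": a 2 Hz tagged response buried in noise
fs, dur, f_tag = 250, 20.0, 2.0
t = np.arange(0, dur, 1.0 / fs)
rng = np.random.default_rng(0)
eeg = 0.5 * np.sin(2 * np.pi * f_tag * t) + rng.normal(0.0, 1.0, t.size)

print(tagged_snr(eeg, fs, f_tag))   # well above 1 when a tagged response exists
```

Choosing a recording length that puts the tag frequency exactly on an FFT bin (here 0.05 Hz resolution) avoids spectral leakage inflating the neighboring bins.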
Affiliation(s)
- Jianrong Jia
- Center for Cognition and Brain Disorders, The Affiliated Hospital of Hangzhou Normal University, Hangzhou, 311121, China; Institute of Psychological Sciences, Hangzhou Normal University, Hangzhou, 311121, China
- Tongyu Wang
- Center for Cognition and Brain Disorders, The Affiliated Hospital of Hangzhou Normal University, Hangzhou, 311121, China; Institute of Psychological Sciences, Hangzhou Normal University, Hangzhou, 311121, China
- Siqi Chen
- Center for Cognition and Brain Disorders, The Affiliated Hospital of Hangzhou Normal University, Hangzhou, 311121, China; Institute of Psychological Sciences, Hangzhou Normal University, Hangzhou, 311121, China
- Nai Ding
- Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, 311121, China; Research Center for Advanced Artificial Intelligence Theory, Zhejiang Lab, Hangzhou, 311121, China
- Fang Fang
- School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, 100871, China; IDG/McGovern Institute for Brain Research, Peking University, Beijing, 100871, China; Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, 100871, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China.
26
Bigelow J, Morrill RJ, Olsen T, Hasenstaub AR. Visual modulation of firing and spectrotemporal receptive fields in mouse auditory cortex. CURRENT RESEARCH IN NEUROBIOLOGY 2022; 3:100040. [PMID: 36518337 PMCID: PMC9743056 DOI: 10.1016/j.crneur.2022.100040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2022] [Revised: 04/26/2022] [Accepted: 05/06/2022] [Indexed: 10/18/2022] Open
Abstract
Recent studies have established significant anatomical and functional connections between visual areas and primary auditory cortex (A1), which may be important for cognitive processes such as communication and spatial perception. These studies have raised two important questions: First, which cell populations in A1 respond to visual input and/or are influenced by visual context? Second, which aspects of sound encoding are affected by visual context? To address these questions, we recorded single-unit activity across cortical layers in awake mice during exposure to auditory and visual stimuli. Neurons responsive to visual stimuli were most prevalent in the deep cortical layers and included both excitatory and inhibitory cells. The overwhelming majority of these neurons also responded to sound, indicating unimodal visual neurons are rare in A1. Other neurons for which sound-evoked responses were modulated by visual context were similarly excitatory or inhibitory but more evenly distributed across cortical layers. These modulatory influences almost exclusively affected sustained sound-evoked firing rate (FR) responses or spectrotemporal receptive fields (STRFs); transient FR changes at stimulus onset were rarely modified by visual context. Neuron populations with visually modulated STRFs and sustained FR responses were mostly non-overlapping, suggesting spectrotemporal feature selectivity and overall excitability may be differentially sensitive to visual context. The effects of visual modulation were heterogeneous, increasing and decreasing STRF gain in roughly equal proportions of neurons. Our results indicate visual influences are surprisingly common and diversely expressed throughout layers and cell types in A1, affecting nearly one in five neurons overall.
Affiliation(s)
- James Bigelow
- Coleman Memorial Laboratory, University of California, San Francisco, USA
- Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
- Ryan J. Morrill
- Coleman Memorial Laboratory, University of California, San Francisco, USA
- Neuroscience Graduate Program, University of California, San Francisco, USA
- Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
- Timothy Olsen
- Coleman Memorial Laboratory, University of California, San Francisco, USA
- Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
- Andrea R. Hasenstaub
- Coleman Memorial Laboratory, University of California, San Francisco, USA
- Neuroscience Graduate Program, University of California, San Francisco, USA
- Department of Otolaryngology–Head and Neck Surgery, University of California, San Francisco, 94143, USA
27
Zhang L, Du Y. Lip movements enhance speech representations and effective connectivity in auditory dorsal stream. Neuroimage 2022; 257:119311. [PMID: 35589000 DOI: 10.1016/j.neuroimage.2022.119311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2022] [Revised: 05/09/2022] [Accepted: 05/11/2022] [Indexed: 11/25/2022] Open
Abstract
Viewing a speaker's lip movements facilitates speech perception, especially under adverse listening conditions, but the neural mechanisms of this perceptual benefit at the phonemic and feature levels remain unclear. This fMRI study addressed this question by quantifying regional multivariate representation and network organization underlying audiovisual speech-in-noise perception. Behaviorally, valid lip movements improved recognition of place of articulation to aid phoneme identification. Meanwhile, lip movements enhanced neural representations of phonemes in left auditory dorsal stream regions, including frontal speech motor areas and the supramarginal gyrus (SMG). Moreover, neural representations of place of articulation and voicing features were promoted differentially by lip movements in these regions, with voicing enhanced in Broca's area and place of articulation better encoded in left ventral premotor cortex and SMG. Next, dynamic causal modeling (DCM) analysis showed that such local changes were accompanied by strengthened effective connectivity along the dorsal stream. Moreover, the neurite orientation dispersion of the left arcuate fasciculus, the structural backbone of the auditory dorsal stream, predicted the visual enhancements of neural representations and effective connectivity. Our findings provide novel insights for speech science: lip movements promote both local phonemic and feature encoding and network connectivity in the dorsal pathway, and this functional enhancement is mediated by the microstructural architecture of the circuit.
Affiliation(s)
- Lei Zhang
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China 100049
- Yi Du
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China 100101; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China 100049; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China 200031; Chinese Institute for Brain Research, Beijing, China 102206.
28
Haider CL, Suess N, Hauswald A, Park H, Weisz N. Masking of the mouth area impairs reconstruction of acoustic speech features and higher-level segmentational features in the presence of a distractor speaker. Neuroimage 2022; 252:119044. [PMID: 35240298 DOI: 10.1016/j.neuroimage.2022.119044] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 02/26/2022] [Accepted: 02/27/2022] [Indexed: 11/29/2022] Open
Abstract
Multisensory integration enables stimulus representation even when the sensory input in a single modality is weak. In the context of speech, when confronted with a degraded acoustic signal, congruent visual inputs promote comprehension. When this input is masked, speech comprehension consequently becomes more difficult. However, it remains unclear which levels of speech processing are affected, and under which circumstances, by occluding the mouth area. To answer this question, we conducted an audiovisual (AV) multi-speaker experiment using naturalistic speech. In half of the trials, the target speaker wore a (surgical) face mask, while we measured the brain activity of normal-hearing participants via magnetoencephalography (MEG). We additionally added a distractor speaker in half of the trials in order to create an ecologically difficult listening situation. A decoding model trained on the clear AV speech was used to reconstruct crucial speech features in each condition. We found significant main effects of face masks on the reconstruction of acoustic features, such as the speech envelope and spectral speech features (i.e., pitch and formant frequencies), while reconstruction of higher-level features of speech segmentation (phoneme and word onsets) was especially impaired by masks in difficult listening situations. As we used surgical face masks in our study, which have only mild effects on speech acoustics, we interpret our findings as the result of the missing visual input. Our findings extend previous behavioural results by demonstrating the complex contextual effects of occluding relevant visual information on speech processing.
Affiliation(s)
- Chandra Leon Haider
- Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria.
- Nina Suess
- Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria
- Anne Hauswald
- Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria
- Hyojin Park
- School of Psychology & Centre for Human Brain Health (CHBH), University of Birmingham, Birmingham, UK
- Nathan Weisz
- Centre for Cognitive Neuroscience and Department of Psychology, University of Salzburg, Austria; Neuroscience Institute, Christian Doppler University Hospital, Paracelsus Medical University, Salzburg, Austria
29
Jessica Tan SH, Kalashnikova M, Di Liberto GM, Crosse MJ, Burnham D. Seeing a Talking Face Matters: The Relationship between Cortical Tracking of Continuous Auditory-Visual Speech and Gaze Behaviour in Infants, Children and Adults. Neuroimage 2022; 256:119217. [PMID: 35436614 DOI: 10.1016/j.neuroimage.2022.119217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2021] [Revised: 04/09/2022] [Accepted: 04/14/2022] [Indexed: 11/24/2022] Open
Abstract
An auditory-visual speech benefit, the benefit that visual speech cues bring to auditory speech perception, is experienced from early on in infancy and continues to be experienced to an increasing degree with age. While there is both behavioural and neurophysiological evidence for children and adults, only behavioural evidence exists for infants, as no neurophysiological study has provided a comprehensive examination of the auditory-visual speech benefit in infants. It is also surprising that most studies on the auditory-visual speech benefit do not concurrently report looking behaviour, especially since the auditory-visual speech benefit rests on the assumption that listeners attend to a speaker's talking face and since there are meaningful individual differences in looking behaviour. To address these gaps, we simultaneously recorded electroencephalographic (EEG) and eye-tracking data of 5-month-olds, 4-year-olds and adults as they were presented with a speaker in auditory-only (AO), visual-only (VO), and auditory-visual (AV) modes. Cortical tracking analyses that involved forward encoding models of the speech envelope revealed that there was an auditory-visual speech benefit [i.e., AV > (A+V)], evident in 5-month-olds and adults but not 4-year-olds. Examination of cortical tracking accuracy in relation to looking behaviour showed that infants' relative attention to the speaker's mouth (vs. eyes) was positively correlated with cortical tracking accuracy of VO speech, whereas adults' attention to the display overall was negatively correlated with cortical tracking accuracy of VO speech. This study provides the first neurophysiological evidence of an auditory-visual speech benefit in infants, and our results suggest ways in which current models of speech processing can be fine-tuned.
Affiliation(s)
- S H Jessica Tan
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University.
- Marina Kalashnikova
- The Basque Center on Cognition, Brain and Language; IKERBASQUE, Basque Foundation for Science
- Michael J Crosse
- Trinity Center for Biomedical Engineering, Department of Mechanical, Manufacturing & Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
- Denis Burnham
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University
30
Asymmetrical cross-modal influence on neural encoding of auditory and visual features in natural scenes. Neuroimage 2022; 255:119182. [PMID: 35395403 DOI: 10.1016/j.neuroimage.2022.119182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2021] [Revised: 03/24/2022] [Accepted: 04/04/2022] [Indexed: 11/22/2022] Open
Abstract
Natural scenes contain multi-modal information, which is integrated to form a coherent perception. Previous studies have demonstrated that cross-modal information can modulate neural encoding of low-level sensory features. These studies, however, mostly focus on the processing of single sensory events or rhythmic sensory sequences. Here, we investigate how the neural encoding of basic auditory and visual features is modulated by cross-modal information when participants watch movie clips primarily composed of non-rhythmic events. We presented audiovisual congruent and audiovisual incongruent movie clips, and since attention can modulate cross-modal interactions, we separately analyzed high- and low-arousal movie clips. We recorded neural responses using electroencephalography (EEG) and employed the temporal response function (TRF) to quantify the neural encoding of auditory and visual features. The neural encoding of the sound envelope is enhanced in the audiovisual congruent condition relative to the incongruent condition, but this effect is only significant for high-arousal movie clips. In contrast, audiovisual congruency does not significantly modulate the neural encoding of visual features, e.g., luminance or visual motion. In summary, our findings demonstrate asymmetrical cross-modal interactions during the processing of natural scenes that lack rhythmicity: congruent visual information enhances low-level auditory processing, while congruent auditory information does not significantly modulate low-level visual processing.
31
Enhancement of speech-in-noise comprehension through vibrotactile stimulation at the syllabic rate. Proc Natl Acad Sci U S A 2022; 119:e2117000119. [PMID: 35312362 PMCID: PMC9060510 DOI: 10.1073/pnas.2117000119] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Syllables are important building blocks of speech. They occur at a rate between 4 and 8 Hz, corresponding to the theta frequency range of neural activity in the cerebral cortex. When listening to speech, the theta activity becomes aligned to the syllabic rhythm, presumably aiding in parsing a speech signal into distinct syllables. However, this neural activity can be influenced not only by sound, but also by somatosensory information. Here, we show that the presentation of vibrotactile signals at the syllabic rate can enhance the comprehension of speech in background noise. We further provide evidence that this multisensory enhancement of speech comprehension reflects the multisensory integration of auditory and tactile information in the auditory cortex.

Speech unfolds over distinct temporal scales, in particular those related to the rhythm of phonemes, syllables, and words. When a person listens to continuous speech, the syllabic rhythm is tracked by neural activity in the theta frequency range. The tracking plays a functional role in speech processing: influencing the theta activity through transcranial current stimulation, for instance, can impact speech perception. The theta-band activity in the auditory cortex can also be modulated through the somatosensory system, but the effect on speech processing has remained unclear. Here, we show that vibrotactile feedback presented at the rate of syllables can modulate and, in fact, enhance the comprehension of a speech signal in background noise. The enhancement occurs when vibrotactile pulses occur at the perceptual center of the syllables, whereas a temporal delay between the vibrotactile signals and the speech stream can lead to a lower level of speech comprehension. We further investigate the neural mechanisms underlying the audiotactile integration through electroencephalographic (EEG) recordings. We find that the audiotactile stimulation modulates the neural response to the speech rhythm, as well as the neural response to the vibrotactile pulses. The modulations of these neural activities reflect the behavioral effects on speech comprehension. Moreover, we demonstrate that speech comprehension can be predicted by particular aspects of the neural responses. Our results provide evidence for a role of vibrotactile information in speech processing and may have applications in future auditory prostheses.
32
Di Liberto GM, Hjortkjær J, Mesgarani N. Editorial: Neural Tracking: Closing the Gap Between Neurophysiology and Translational Medicine. Front Neurosci 2022; 16:872600. [PMID: 35368278 PMCID: PMC8966872 DOI: 10.3389/fnins.2022.872600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2022] [Accepted: 02/17/2022] [Indexed: 11/25/2022] Open
Affiliation(s)
- Giovanni M. Di Liberto
- School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
- ADAPT Centre, d-real, Trinity College Institute for Neuroscience, Dublin, Ireland
- Jens Hjortkjær
- Hearing Systems Group, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Nima Mesgarani
- Electrical Engineering Department, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States
33
Varano E, Vougioukas K, Ma P, Petridis S, Pantic M, Reichenbach T. Speech-Driven Facial Animations Improve Speech-in-Noise Comprehension of Humans. Front Neurosci 2022; 15:781196. [PMID: 35069100 PMCID: PMC8766421 DOI: 10.3389/fnins.2021.781196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Accepted: 11/29/2021] [Indexed: 12/02/2022] Open
Abstract
Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speaker's face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have made it possible to synthesize photorealistic talking faces from a speech recording and a still image of a person's face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although natural facial motions yield an even higher audiovisual benefit. We further find that an audiovisual speech recognizer (AVSR) benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.
Affiliation(s)
- Enrico Varano
- Department of Bioengineering and Centre for Neurotechnology, Imperial College London, London, United Kingdom
- Pingchuan Ma
- Department of Computing, Imperial College London, London, United Kingdom
- Stavros Petridis
- Department of Computing, Imperial College London, London, United Kingdom
- Maja Pantic
- Department of Computing, Imperial College London, London, United Kingdom
- Tobias Reichenbach
- Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany
34
Bilinguals Show Proportionally Greater Benefit From Visual Speech Cues and Sentence Context in Their Second Compared to Their First Language. Ear Hear 2021; 43:1316-1326. [PMID: 34966162 DOI: 10.1097/aud.0000000000001182] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
OBJECTIVES Speech perception in noise is challenging, but evidence suggests that it may be facilitated by visual speech cues (e.g., lip movements) and supportive sentence context in native speakers. Comparatively few studies have investigated speech perception in noise in bilinguals, and little is known about the impact of visual speech cues and supportive sentence context in a first language compared to a second language within the same individual. The current study addresses this gap by directly investigating the extent to which bilinguals benefit from visual speech cues and supportive sentence context under similarly noisy conditions in their first and second language.
DESIGN Thirty young adult English-French/French-English bilinguals were recruited from the undergraduate psychology program at Concordia University and from the Montreal community. They completed a speech perception in noise task during which they were presented with video-recorded sentences and instructed to repeat the last word of each sentence out loud. Sentences were presented in three different modalities: visual-only, auditory-only, and audiovisual. Additionally, sentences had one of two levels of context: moderate (e.g., "In the woods, the hiker saw a bear.") and low (e.g., "I had not thought about that bear."). Each participant completed this task in both their first and second language; crucially, the level of background noise was calibrated individually for each participant and was the same throughout the first language and second language (L2) portions of the experimental task.
RESULTS Overall, speech perception in noise was more accurate in bilinguals' first language compared to the second. However, participants benefited from visual speech cues and supportive sentence context to a proportionally greater extent in their second language compared to their first. At the individual level, performance during the speech perception in noise task was related to aspects of bilinguals' experience in their second language (i.e., age of acquisition, relative balance between the first and the second language).
CONCLUSIONS Bilinguals benefit from visual speech cues and sentence context in their second language during speech in noise and do so to a greater extent than in their first language given the same level of background noise. Together, this indicates that L2 speech perception can be conceptualized within an inverse effectiveness hypothesis framework with a complex interplay of sensory factors (i.e., the quality of the auditory speech signal and visual speech cues) and linguistic factors (i.e., presence or absence of supportive context and L2 experience of the listener).
|
35
|
Crosse MJ, Zuk NJ, Di Liberto GM, Nidiffer AR, Molholm S, Lalor EC. Linear Modeling of Neurophysiological Responses to Speech and Other Continuous Stimuli: Methodological Considerations for Applied Research. Front Neurosci 2021; 15:705621. [PMID: 34880719 PMCID: PMC8648261 DOI: 10.3389/fnins.2021.705621] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Accepted: 09/21/2021] [Indexed: 01/01/2023] Open
Abstract
Cognitive neuroscience, in particular research on speech and language, has seen an increase in the use of linear modeling techniques for studying the processing of natural, environmental stimuli. The availability of such computational tools has prompted similar investigations in many clinical domains, facilitating the study of cognitive and sensory deficits under more naturalistic conditions. However, studying clinical (and often highly heterogeneous) cohorts introduces an added layer of complexity to such modeling procedures, potentially leading to instability of such techniques and, as a result, inconsistent findings. Here, we outline some key methodological considerations for applied research, referring to a hypothetical clinical experiment involving speech processing and worked examples of simulated electrophysiological (EEG) data. In particular, we focus on experimental design, data preprocessing, stimulus feature extraction, model design, model training and evaluation, and interpretation of model weights. Throughout the paper, we demonstrate the implementation of each step in MATLAB using the mTRF-Toolbox and discuss how to address issues that could arise in applied research. In doing so, we hope to provide better intuition on these more technical points and provide a resource for applied and clinical researchers investigating sensory and cognitive processing using ecologically rich stimuli.
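The pipeline outlined above centres on the temporal response function (TRF): the EEG is regressed onto time-lagged copies of a stimulus feature, with ridge regularisation to stabilise the estimate. The paper's worked examples use the mTRF-Toolbox in MATLAB; as a language-neutral illustration only, the core computation can be sketched in NumPy (the function names and single-feature, single-channel setup are illustrative assumptions, not the toolbox API):

```python
import numpy as np

def lagged_design(stimulus, lags):
    """Build a time-lagged design matrix from a 1-D stimulus feature.
    Positive lags model the response occurring after the stimulus."""
    n = len(stimulus)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stimulus[:n - lag]
        else:
            X[:lag, j] = stimulus[-lag:]
    return X

def fit_trf(stimulus, eeg, lags, alpha=1.0):
    """Ridge-regularised forward model (TRF): eeg ≈ X @ w."""
    X = lagged_design(stimulus, lags)
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ eeg)
```

In practice the ridge parameter `alpha` would be tuned by cross-validation per subject, which is one of the stability issues the paper discusses for heterogeneous clinical cohorts.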
Affiliation(s)
- Michael J. Crosse
  - Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
  - X, The Moonshot Factory, Mountain View, CA, United States
  - Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States
  - Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
- Nathaniel J. Zuk
  - Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
  - Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
  - Department of Neuroscience, University of Rochester, Rochester, NY, United States
- Giovanni M. Di Liberto
  - Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
  - Centre for Biomedical Engineering, School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland
  - School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
- Aaron R. Nidiffer
  - Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
  - Department of Neuroscience, University of Rochester, Rochester, NY, United States
- Sophie Molholm
  - Department of Pediatrics, Albert Einstein College of Medicine, New York, NY, United States
  - Department of Neuroscience, Albert Einstein College of Medicine, New York, NY, United States
- Edmund C. Lalor
  - Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity College Dublin, Dublin, Ireland
  - Department of Biomedical Engineering, University of Rochester, Rochester, NY, United States
  - Department of Neuroscience, University of Rochester, Rochester, NY, United States
|
36
|
Fleming JT, Maddox RK, Shinn-Cunningham BG. Spatial alignment between faces and voices improves selective attention to audio-visual speech. J Acoust Soc Am 2021; 150:3085. [PMID: 34717460 DOI: 10.1121/10.0006415] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2021] [Accepted: 09/01/2021] [Indexed: 06/13/2023]
Abstract
The ability to see a talker's face improves speech intelligibility in noise, provided that the auditory and visual speech signals are approximately aligned in time. However, the importance of spatial alignment between corresponding faces and voices remains unresolved, particularly in multi-talker environments. In a series of online experiments, we investigated this using a task that required participants to selectively attend a target talker in noise while ignoring a distractor talker. In experiment 1, we found improved task performance when the talkers' faces were visible, but only when corresponding faces and voices were presented in the same hemifield (spatially aligned). In experiment 2, we tested for possible influences of eye position on this result. In auditory-only conditions, directing gaze toward the distractor voice reduced performance, but this effect could not fully explain the cost of audio-visual (AV) spatial misalignment. Lowering the signal-to-noise ratio (SNR) of the speech from +4 to -4 dB increased the magnitude of the AV spatial alignment effect (experiment 3), but accurate closed-set lipreading caused a floor effect that influenced results at lower SNRs (experiment 4). Taken together, these results demonstrate that spatial alignment between faces and voices contributes to the ability to selectively attend AV speech.
Affiliation(s)
- Justin T Fleming
  - Speech and Hearing Bioscience and Technology Program, Harvard University, 243 Charles Street, Boston, Massachusetts 02114, USA
- Ross K Maddox
  - Department of Biomedical Engineering, University of Rochester, 430 Elmwood Avenue, Rochester, New York 14620, USA
- Barbara G Shinn-Cunningham
  - Neuroscience Institute, Carnegie Mellon University, 4825 Frew Street, Pittsburgh, Pennsylvania 15213, USA
|
37
|
Chen L, Liao HI. Microsaccadic Eye Movements but not Pupillary Dilation Response Characterizes the Crossmodal Freezing Effect. Cereb Cortex Commun 2021; 1:tgaa072. [PMID: 34296132 PMCID: PMC8153075 DOI: 10.1093/texcom/tgaa072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2020] [Revised: 09/24/2020] [Accepted: 09/25/2020] [Indexed: 11/14/2022] Open
Abstract
In typical spatial orienting tasks, the perception of crossmodal (e.g., audiovisual) stimuli evokes greater pupil dilation and microsaccade inhibition than unisensory stimuli (e.g., visual). These characteristic pupil dilation and microsaccade inhibition responses have been observed in response to "salient" events/stimuli. Although the "saliency" account is appealing in the spatial domain, whether it holds in the temporal context remains largely unknown. Here, on a brief temporal scale (within 1 s) engaging involuntary temporal attention, we investigated how eye-metric characteristics reflect the temporal dynamics of perceptual organization, with and without multisensory integration. We adopted the crossmodal freezing paradigm using the classical Ternus apparent motion. Results showed that synchronous beeps biased the perceptual report toward group motion and triggered prolonged sound-induced oculomotor inhibition (OMI), whereas the sound-induced OMI was not obvious in a crossmodal task-free scenario (visual localization without audiovisual integration). A general pupil dilation response was observed in the presence of sounds in both the visual Ternus motion categorization and visual localization tasks. This study provides the first empirical account of crossmodal integration captured by microsaccades on a brief temporal scale; OMI, but not the pupillary dilation response, characterizes task-specific audiovisual integration (shown by the crossmodal freezing effect).
Affiliation(s)
- Lihan Chen
  - Department of Brain and Cognitive Sciences, Schools of Psychological and Cognitive Sciences, Peking University, Beijing, 100871, China
- Hsin-I Liao
  - NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa, 243-0198, Japan
|
38
|
O'Sullivan AE, Crosse MJ, Liberto GMD, de Cheveigné A, Lalor EC. Neurophysiological Indices of Audiovisual Speech Processing Reveal a Hierarchy of Multisensory Integration Effects. J Neurosci 2021; 41:4991-5003. [PMID: 33824190 PMCID: PMC8197638 DOI: 10.1523/jneurosci.0906-20.2021] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2020] [Revised: 03/16/2021] [Accepted: 03/22/2021] [Indexed: 12/27/2022] Open
Abstract
Seeing a speaker's face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker's face provides temporal cues to auditory cortex, and articulatory information from the speaker's mouth can aid the recognition of specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here, we sought to provide insight into these questions by examining EEG responses in humans (males and females) to natural audiovisual (AV), audio, and visual speech in quiet and in noise. We represented our speech stimuli in terms of their spectrograms and their phonetic features and then quantified the strength of the encoding of those features in the EEG using canonical correlation analysis (CCA). The encoding of both spectrotemporal and phonetic features was more robust in AV speech responses than would be expected from the summation of the audio and visual speech responses, suggesting that multisensory integration occurs at both spectrotemporal and phonetic stages of speech processing. We also found evidence suggesting that the integration effects may change with listening conditions; however, this was an exploratory analysis, and future work will be required to examine this effect using a within-subject design. These findings demonstrate that integration of audio and visual speech occurs at multiple stages along the speech processing hierarchy. SIGNIFICANCE STATEMENT During conversation, visual cues impact our perception of speech. Integration of auditory and visual speech is thought to occur at multiple stages of speech processing and vary flexibly depending on the listening conditions. 
Here, we examine audiovisual (AV) integration at two stages of speech processing using the speech spectrogram and a phonetic representation, and test how AV integration adapts to degraded listening conditions. We find significant integration at both of these stages regardless of listening conditions. These findings reveal neural indices of multisensory interactions at different stages of processing and provide support for the multistage integration framework.
Affiliation(s)
- Aisling E O'Sullivan
  - School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Michael J Crosse
  - X, The Moonshot Factory, Mountain View, CA and Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
- Giovanni M Di Liberto
  - Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Paris Sciences et Lettres University, Centre National de la Recherche Scientifique, Paris 75005, France
- Alain de Cheveigné
  - Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Paris Sciences et Lettres University, Centre National de la Recherche Scientifique, Paris 75005, France
  - University College London Ear Institute, University College London, London WC1X 8EE, United Kingdom
- Edmund C Lalor
  - School of Engineering, Trinity Centre for Biomedical Engineering and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
  - Department of Biomedical Engineering and Department of Neuroscience, University of Rochester, Rochester, New York 14627
|
39
|
Effects of stimulus intensity on audiovisual integration in aging across the temporal dynamics of processing. Int J Psychophysiol 2021; 162:95-103. [PMID: 33529642 DOI: 10.1016/j.ijpsycho.2021.01.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2020] [Revised: 10/26/2020] [Accepted: 01/24/2021] [Indexed: 11/24/2022]
Abstract
Previous studies have drawn different conclusions about whether older adults benefit more from audiovisual integration, and such conflicts may be due to the stimulus features investigated in those studies, such as stimulus intensity. In the current study, using ERPs, we compared the effects of stimulus intensity on audiovisual integration between young and older adults. The results showed that inverse effectiveness, the phenomenon whereby lowering the effectiveness of sensory stimuli increases the benefits of multisensory integration, was observed in young adults at earlier processing stages but was absent in older adults. Moreover, at the earlier processing stages (60-90 ms and 110-140 ms), older adults exhibited significantly greater audiovisual integration than young adults (all ps < 0.05). However, at the later processing stages (220-250 ms and 340-370 ms), young adults exhibited significantly greater audiovisual integration than older adults (all ps < 0.001). The results suggest an age-related dissociation between early and late integration, indicating that different audiovisual processing mechanisms are at play in older versus young adults.
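ERP integration effects of this kind are commonly quantified with the additive model: the audiovisual response is compared against the sum of the unisensory responses within each time window, so a non-zero AV − (A + V) difference indicates super- or sub-additive integration. A schematic sketch of that comparison (array shapes, window convention, and the function name are assumptions for illustration):

```python
import numpy as np

def av_integration_index(erp_av, erp_a, erp_v, window):
    """Mean AV - (A + V) amplitude over a window of samples.
    Positive values = super-additive, negative = sub-additive."""
    diff = erp_av - (erp_a + erp_v)
    return diff[window].mean()
```

With a sampling rate of 1000 Hz, the 60-90 ms window reported above would correspond to `slice(60, 90)` relative to stimulus onset.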
|
40
|
Mégevand P, Mercier MR, Groppe DM, Zion Golumbic E, Mesgarani N, Beauchamp MS, Schroeder CE, Mehta AD. Crossmodal Phase Reset and Evoked Responses Provide Complementary Mechanisms for the Influence of Visual Speech in Auditory Cortex. J Neurosci 2020; 40:8530-8542. [PMID: 33023923 PMCID: PMC7605423 DOI: 10.1523/jneurosci.0555-20.2020] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2020] [Revised: 07/27/2020] [Accepted: 08/31/2020] [Indexed: 12/26/2022] Open
Abstract
Natural conversation is multisensory: when we can see the speaker's face, visual speech cues improve our comprehension. The neuronal mechanisms underlying this phenomenon remain unclear. The two main alternatives are visually mediated phase modulation of neuronal oscillations (excitability fluctuations) in auditory neurons and visual input-evoked responses in auditory neurons. Investigating this question using naturalistic audiovisual speech with intracranial recordings in humans of both sexes, we find evidence for both mechanisms. Remarkably, auditory cortical neurons track the temporal dynamics of purely visual speech using the phase of their slow oscillations and phase-related modulations in broadband high-frequency activity. Consistent with known perceptual enhancement effects, the visual phase reset amplifies the cortical representation of concomitant auditory speech. In contrast to this, and in line with earlier reports, visual input reduces the amplitude of evoked responses to concomitant auditory input. We interpret the combination of improved phase tracking and reduced response amplitude as evidence for more efficient and reliable stimulus processing in the presence of congruent auditory and visual speech inputs.SIGNIFICANCE STATEMENT Watching the speaker can facilitate our understanding of what is being said. The mechanisms responsible for this influence of visual cues on the processing of speech remain incompletely understood. We studied these mechanisms by recording the electrical activity of the human brain through electrodes implanted surgically inside the brain. We found that visual inputs can operate by directly activating auditory cortical areas, and also indirectly by modulating the strength of cortical responses to auditory input. Our results help to understand the mechanisms by which the brain merges auditory and visual speech into a unitary perception.
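Phase tracking and phase reset of the kind reported here are typically quantified with inter-trial phase coherence (ITC): the length of the mean resultant vector of single-trial phases at a given time-frequency point, which approaches 1 when visual input consistently resets the oscillation. A minimal sketch (illustrative only, not the authors' intracranial analysis):

```python
import numpy as np

def inter_trial_coherence(phases):
    """ITC across trials: length of the mean resultant vector of
    per-trial phase angles (radians) at one time-frequency point.
    1 = perfectly consistent phase (strong reset), 0 = uniform phase."""
    return np.abs(np.mean(np.exp(1j * np.asarray(phases))))
```

In practice the phases come from a wavelet or Hilbert decomposition of each trial's signal, and ITC is tested against a trial-shuffled baseline.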
Affiliation(s)
- Pierre Mégevand
  - Department of Neurosurgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York 11549
  - Feinstein Institutes for Medical Research, Manhasset, New York 11030
  - Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1211 Geneva, Switzerland
- Manuel R Mercier
  - Department of Neurology, Montefiore Medical Center, Bronx, New York 10467
  - Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York 10461
  - Institut de Neurosciences des Systèmes, Aix Marseille University, INSERM, 13005 Marseille, France
- David M Groppe
  - Department of Neurosurgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York 11549
  - Feinstein Institutes for Medical Research, Manhasset, New York 11030
  - The Krembil Neuroscience Centre, University Health Network, Toronto, Ontario M5T 1M8, Canada
- Elana Zion Golumbic
  - The Gonda Brain Research Center, Bar Ilan University, Ramat Gan 5290002, Israel
- Nima Mesgarani
  - Department of Electrical Engineering, Columbia University, New York, New York 10027
- Michael S Beauchamp
  - Department of Neurosurgery, Baylor College of Medicine, Houston, Texas 77030
- Charles E Schroeder
  - Nathan S. Kline Institute, Orangeburg, New York 10962
  - Department of Psychiatry, Columbia University, New York, New York 10032
- Ashesh D Mehta
  - Department of Neurosurgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York 11549
  - Feinstein Institutes for Medical Research, Manhasset, New York 11030
|
41
|
Destoky F, Bertels J, Niesen M, Wens V, Vander Ghinst M, Leybaert J, Lallier M, Ince RAA, Gross J, De Tiège X, Bourguignon M. Cortical tracking of speech in noise accounts for reading strategies in children. PLoS Biol 2020; 18:e3000840. [PMID: 32845876 PMCID: PMC7478533 DOI: 10.1371/journal.pbio.3000840] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2020] [Revised: 09/08/2020] [Accepted: 08/12/2020] [Indexed: 11/29/2022] Open
Abstract
Humans' propensity to acquire literacy relates to several factors, including the ability to understand speech in noise (SiN). Still, the nature of the relation between reading and SiN perception abilities remains poorly understood. Here, we dissect the interplay between (1) reading abilities, (2) classical behavioral predictors of reading (phonological awareness, phonological memory, and rapid automatized naming), and (3) electrophysiological markers of SiN perception in 99 elementary school children (26 with dyslexia). We demonstrate that, in typical readers, cortical representation of the phrasal content of SiN relates to the degree of development of the lexical (but not sublexical) reading strategy. In contrast, classical behavioral predictors of reading abilities and the ability to benefit from visual speech to represent the syllabic content of SiN account for global reading performance (i.e., speed and accuracy of lexical and sublexical reading). In individuals with dyslexia, we found preserved integration of visual speech information to optimize processing of syntactic information but not to sustain acoustic/phonemic processing. Finally, within children with dyslexia, measures of cortical representation of the phrasal content of SiN were negatively related to reading speed and positively related to the compromise between reading precision and reading speed, potentially owing to compensatory attentional mechanisms. These results clarify the nature of the relation between SiN perception and reading abilities in typical child readers and children with dyslexia and identify novel electrophysiological markers of emergent literacy.
Affiliation(s)
- Florian Destoky
  - Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Julie Bertels
  - Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
  - Consciousness, Cognition and Computation group, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Maxime Niesen
  - Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
  - Service d'ORL et de chirurgie cervico-faciale, ULB-Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Vincent Wens
  - Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
  - Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Marc Vander Ghinst
  - Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Jacqueline Leybaert
  - Laboratoire Cognition Langage et Développement, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
- Marie Lallier
  - BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
- Robin A. A. Ince
  - Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
- Joachim Gross
  - Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom
  - Institute for Biomagnetism and Biosignal analysis, University of Muenster, Muenster, Germany
- Xavier De Tiège
  - Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
  - Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium
- Mathieu Bourguignon
  - Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
  - Laboratoire Cognition Langage et Développement, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium
  - BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain
|
42
|
Fletcher MD, Song H, Perry SW. Electro-haptic stimulation enhances speech recognition in spatially separated noise for cochlear implant users. Sci Rep 2020; 10:12723. [PMID: 32728109 PMCID: PMC7391652 DOI: 10.1038/s41598-020-69697-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Accepted: 07/14/2020] [Indexed: 11/10/2022] Open
Abstract
Hundreds of thousands of profoundly hearing-impaired people perceive sounds through electrical stimulation of the auditory nerve using a cochlear implant (CI). However, CI users are often poor at understanding speech in noisy environments and separating sounds that come from different locations. We provided missing speech and spatial hearing cues through haptic stimulation to augment the electrical CI signal. After just 30 min of training, we found this “electro-haptic” stimulation substantially improved speech recognition in multi-talker noise when the speech and noise came from different locations. Our haptic stimulus was delivered to the wrists at an intensity that can be produced by a compact, low-cost, wearable device. These findings represent a significant step towards the production of a non-invasive neuroprosthetic that can improve CI users’ ability to understand speech in realistic noisy environments.
Affiliation(s)
- Mark D Fletcher
  - University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Haoheng Song
  - Faculty of Engineering and Physical Sciences, University of Southampton, University Road, Southampton, SO17 1BJ, UK
- Samuel W Perry
  - University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK
|
43
|
de Boer MJ, Başkent D, Cornelissen FW. Eyes on Emotion: Dynamic Gaze Allocation During Emotion Perception From Speech-Like Stimuli. Multisens Res 2020; 34:17-47. [PMID: 33706278 DOI: 10.1163/22134808-bja10029] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2019] [Accepted: 05/29/2020] [Indexed: 11/19/2022]
Abstract
The majority of emotional expressions used in daily communication are multimodal and dynamic in nature. Consequently, one would expect that human observers utilize specific perceptual strategies to process emotions and to handle the multimodal and dynamic nature of emotions. However, our present knowledge on these strategies is scarce, primarily because most studies on emotion perception have not fully covered this variation, and instead used static and/or unimodal stimuli with few emotion categories. To resolve this knowledge gap, the present study examined how dynamic emotional auditory and visual information is integrated into a unified percept. Since there is a broad spectrum of possible forms of integration, both eye movements and accuracy of emotion identification were evaluated while observers performed an emotion identification task in one of three conditions: audio-only, visual-only video, or audiovisual video. In terms of adaptations of perceptual strategies, eye movement results showed a shift in fixations toward the eyes and away from the nose and mouth when audio is added. Notably, in terms of task performance, audio-only performance was mostly significantly worse than video-only and audiovisual performances, but performance in the latter two conditions was often not different. These results suggest that individuals flexibly and momentarily adapt their perceptual strategies to changes in the available information for emotion recognition, and these changes can be comprehensively quantified with eye tracking.
Affiliation(s)
- Minke J de Boer
  - Research School of Behavioural and Cognitive Neurosciences (BCN), University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Laboratory for Experimental Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
  - Research School of Behavioural and Cognitive Neurosciences (BCN), University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Frans W Cornelissen
  - Research School of Behavioural and Cognitive Neurosciences (BCN), University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Laboratory for Experimental Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
|
44
|
Greenlaw KM, Puschmann S, Coffey EBJ. Decoding of Envelope vs. Fundamental Frequency During Complex Auditory Stream Segregation. Neurobiol Lang (Camb) 2020; 1:268-287. [PMID: 37215227 PMCID: PMC10158587 DOI: 10.1162/nol_a_00013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/26/2019] [Accepted: 04/25/2020] [Indexed: 05/24/2023]
Abstract
Hearing-in-noise perception is a challenging task that is critical to human function, but how the brain accomplishes it is not well understood. A candidate mechanism proposes that the neural representation of an attended auditory stream is enhanced relative to background sound via a combination of bottom-up and top-down mechanisms. To date, few studies have compared neural representation and its task-related enhancement across frequency bands that carry different auditory information, such as a sound's amplitude envelope (i.e., syllabic rate or rhythm; 1-9 Hz), and the fundamental frequency of periodic stimuli (i.e., pitch; >40 Hz). Furthermore, hearing-in-noise in the real world is frequently both messier and richer than the majority of tasks used in its study. In the present study, we use continuous sound excerpts that simultaneously offer predictive, visual, and spatial cues to help listeners separate the target from four acoustically similar simultaneously presented sound streams. We show that while both lower and higher frequency information about the entire sound stream is represented in the brain's response, the to-be-attended sound stream is strongly enhanced only in the slower, lower frequency sound representations. These results are consistent with the hypothesis that attended sound representations are strengthened progressively at higher level, later processing stages, and that the interaction of multiple brain systems can aid in this process. Our findings contribute to our understanding of auditory stream separation in difficult, naturalistic listening conditions and demonstrate that pitch and envelope information can be decoded from single-channel EEG data.
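The two frequency ranges contrasted here carry different information: slow amplitude-envelope modulations (1-9 Hz, syllabic rate) versus the fundamental frequency (>40 Hz, pitch). An envelope regressor of the kind commonly used in such single-channel decoding analyses can be sketched as follows (a generic recipe, not the authors' exact pipeline; the 9 Hz cutoff and filter order are assumptions matching the band named in the abstract):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def amplitude_envelope(audio, fs, lp_cutoff=9.0):
    """Hilbert envelope of an audio signal, zero-phase low-pass
    filtered to keep the slow (syllabic-rate) modulations."""
    env = np.abs(hilbert(audio))                        # instantaneous amplitude
    b, a = butter(3, lp_cutoff / (fs / 2), btype="low")  # 9 Hz Butterworth
    return filtfilt(b, a, env)                           # zero-phase filtering
```

The resulting envelope (and, analogously, a band-passed >40 Hz representation for F0) would then serve as the stimulus feature in a decoding or encoding model.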
Affiliation(s)
- Keelin M. Greenlaw
  - Department of Psychology, Concordia University, Montreal, QC, Canada
  - International Laboratory for Brain, Music and Sound Research (BRAMS)
  - The Centre for Research on Brain, Language and Music (CRBLM)
|
45
|
Shaw LH, Freedman EG, Crosse MJ, Nicholas E, Chen AM, Braiman MS, Molholm S, Foxe JJ. Operating in a Multisensory Context: Assessing the Interplay Between Multisensory Reaction Time Facilitation and Inter-sensory Task-switching Effects. Neuroscience 2020; 436:122-135. [PMID: 32325100 DOI: 10.1016/j.neuroscience.2020.04.013] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2019] [Revised: 04/03/2020] [Accepted: 04/06/2020] [Indexed: 11/28/2022]
Abstract
Individuals respond faster to presentations of bisensory stimuli (e.g. audio-visual targets) than to presentations of either unisensory constituent in isolation (i.e. to the auditory-alone or visual-alone components of an audio-visual stimulus). This well-established multisensory speeding effect, termed the redundant signals effect (RSE), is not predicted by simple linear summation of the unisensory response time probability distributions. Rather, responses are typically faster than this prediction allows, leading researchers to ascribe the RSE to a so-called co-activation account. According to this account, multisensory neural processing occurs whereby the unisensory inputs are integrated to produce more effective sensory-motor activation. However, the typical paradigm used to test for the RSE involves random sequencing of unisensory and bisensory inputs in a mixed design, raising the possibility of an alternative attention-switching account. This intermixed design requires participants to switch between sensory modalities on many task trials (e.g. from responding to a visual stimulus to responding to an auditory stimulus). Here we show that much, if not all, of the RSE under this paradigm can be attributed to slowing of reaction times to unisensory stimuli resulting from modality switching, and is not in fact due to speeding of responses to audio-visual stimuli. As such, the present data do not support a co-activation account, but rather suggest that switching and mixing costs akin to those observed during classic task-switching paradigms account for the observed RSE.
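The race-model prediction discussed above (the bound obtained by summing the unisensory response-time distributions) can be sketched as follows. The synthetic reaction times and evaluation grid are hypothetical, not data from the study:

```python
import numpy as np

def ecdf(rts, t):
    """Empirical cumulative RT distribution evaluated at times t."""
    rts = np.sort(np.asarray(rts))
    return np.searchsorted(rts, t, side="right") / len(rts)

def race_model_violation(rt_a, rt_v, rt_av, t):
    """Positive values where P(RT_AV <= t) exceeds the race-model bound
    P(RT_A <= t) + P(RT_V <= t), i.e. evidence against a simple race."""
    bound = np.minimum(ecdf(rt_a, t) + ecdf(rt_v, t), 1.0)
    return ecdf(rt_av, t) - bound

# toy data: audiovisual RTs faster than either unisensory condition
rng = np.random.default_rng(0)
rt_a = rng.normal(350, 40, 500)   # auditory-alone RTs (ms)
rt_v = rng.normal(370, 40, 500)   # visual-alone RTs (ms)
rt_av = rng.normal(280, 30, 500)  # audio-visual RTs (ms)
t = np.linspace(200, 500, 61)
viol = race_model_violation(rt_a, rt_v, rt_av, t)
```

With these toy numbers the bound is violated at fast response times, the pattern classically taken to support co-activation; the abstract's point is that modality-switching costs in mixed designs can produce the same apparent speeding.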
Affiliation(s)
- Luke H Shaw
- The Cognitive Neurophysiology Laboratory, The Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Edward G Freedman
- The Cognitive Neurophysiology Laboratory, The Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Michael J Crosse
- The Cognitive Neurophysiology Laboratory, Department of Pediatrics & Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Eric Nicholas
- The Cognitive Neurophysiology Laboratory, The Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Allen M Chen
- The Cognitive Neurophysiology Laboratory, The Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Matthew S Braiman
- The Cognitive Neurophysiology Laboratory, The Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Sophie Molholm
- The Cognitive Neurophysiology Laboratory, The Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA; The Cognitive Neurophysiology Laboratory, Department of Pediatrics & Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- John J Foxe
- The Cognitive Neurophysiology Laboratory, The Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA; The Cognitive Neurophysiology Laboratory, Department of Pediatrics & Neuroscience, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA.
46
Vanheusden FJ, Kegler M, Ireland K, Georga C, Simpson DM, Reichenbach T, Bell SL. Hearing Aids Do Not Alter Cortical Entrainment to Speech at Audible Levels in Mild-to-Moderately Hearing-Impaired Subjects. Front Hum Neurosci 2020; 14:109. [PMID: 32317951 PMCID: PMC7147120 DOI: 10.3389/fnhum.2020.00109] [Received: 11/21/2019] [Accepted: 03/11/2020] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Cortical entrainment to speech correlates with speech intelligibility and with attention to a speech stream in noisy environments. However, there is a lack of data on whether cortical entrainment can help in evaluating hearing aid fittings for subjects with mild to moderate hearing loss. One particular problem is that hearing aids may alter the speech stimulus during (pre-)processing steps, which might alter cortical entrainment to the speech. Here, the effect of hearing aid processing on cortical entrainment to running speech in hearing-impaired subjects was investigated. METHODOLOGY Seventeen native English-speaking subjects with mild-to-moderate hearing loss participated in the study. Hearing function and hearing aid fitting were evaluated using standard clinical procedures. Participants then listened to a 25-min audiobook under aided and unaided conditions at 70 dBA sound pressure level (SPL) in quiet. EEG data were collected using a 32-channel system. Cortical entrainment to speech was evaluated using decoders reconstructing the speech envelope from the EEG data. Null decoders, obtained from EEG and the time-reversed speech envelope, were used to assess chance-level reconstructions. Entrainment in the delta (1-4 Hz) and theta (4-8 Hz) bands, as well as in wideband (1-20 Hz) EEG data, was investigated. RESULTS Significant cortical responses could be detected for all but one subject in all three frequency bands under both aided and unaided conditions. However, no significant differences were found between the two conditions, either in the number of responses detected or in the strength of cortical entrainment. The results show that the relatively small change in speech input provided by the hearing aid was not sufficient to elicit a detectable change in cortical entrainment. CONCLUSION For subjects with mild to moderate hearing loss, cortical entrainment to speech in quiet at an audible level is not affected by hearing aids. These results clear the way for exploring the use of cortical entrainment to running speech for evaluating hearing aid fitting at lower speech intensities (which could be inaudible when unaided) or under speech-in-noise conditions.
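The envelope-reconstruction decoders and time-reversed null decoders described above can be sketched as a lagged ridge regression. The lag window, regularization value, and toy two-channel signal are illustrative assumptions, not the study's parameters:

```python
import numpy as np

def lag_matrix(eeg, lags):
    """Design matrix for a backward model: the stimulus at time t is
    reconstructed from EEG samples at t + L (EEG lags the sound)."""
    T, C = eeg.shape
    X = np.zeros((T, C * len(lags)))
    for i, L in enumerate(lags):
        shifted = np.roll(eeg, -L, axis=0)
        if L > 0:
            shifted[-L:] = 0  # drop samples that wrapped around
        X[:, i * C:(i + 1) * C] = shifted
    return X

def train_decoder(eeg, envelope, lags, lam=1.0):
    """Ridge regression mapping lagged EEG onto the speech envelope."""
    X = lag_matrix(eeg, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

# toy data: channel 0 carries the envelope delayed by 5 samples, channel 1 is noise
rng = np.random.default_rng(1)
env = rng.standard_normal(2000)
eeg = np.stack([np.roll(env, 5), rng.standard_normal(2000)], axis=1)
eeg = eeg + 0.1 * rng.standard_normal(eeg.shape)

lags = list(range(0, 11))  # reconstruct from 0-10 samples of later EEG
w = train_decoder(eeg, env, lags)
recon = lag_matrix(eeg, lags) @ w
r = np.corrcoef(recon, env)[0, 1]  # reconstruction accuracy

# "null" decoder trained on the time-reversed envelope, as a chance-level control
w_null = train_decoder(eeg, env[::-1], lags)
r_null = np.corrcoef(lag_matrix(eeg, lags) @ w_null, env[::-1])[0, 1]
```

In this toy setting the true decoder reconstructs the envelope well while the time-reversed null decoder stays near chance, mirroring how the study separates genuine entrainment from spurious fits.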
Affiliation(s)
- Frederique J. Vanheusden
- Department of Engineering, School of Science and Technology, Nottingham Trent University, Nottingham, United Kingdom
- Institute of Sound and Vibration Research, Faculty of Engineering and Physical Sciences, University of Southampton, Southampton, United Kingdom
- Mikolaj Kegler
- Department of Bioengineering and Centre for Neurotechnology, Imperial College London, South Kensington Campus, London, United Kingdom
- Katie Ireland
- Audiology Department, Royal Berkshire NHS Foundation Trust, Reading, United Kingdom
- Constantina Georga
- Audiology Department, Royal Berkshire NHS Foundation Trust, Reading, United Kingdom
- David M. Simpson
- Institute of Sound and Vibration Research, Faculty of Engineering and Physical Sciences, University of Southampton, Southampton, United Kingdom
- Tobias Reichenbach
- Department of Bioengineering and Centre for Neurotechnology, Imperial College London, South Kensington Campus, London, United Kingdom
- Steven L. Bell
- Institute of Sound and Vibration Research, Faculty of Engineering and Physical Sciences, University of Southampton, Southampton, United Kingdom
47
Micheli C, Schepers IM, Ozker M, Yoshor D, Beauchamp MS, Rieger JW. Electrocorticography reveals continuous auditory and visual speech tracking in temporal and occipital cortex. Eur J Neurosci 2020; 51:1364-1376. [PMID: 29888819 PMCID: PMC6289876 DOI: 10.1111/ejn.13992] [Received: 08/14/2017] [Revised: 05/19/2018] [Accepted: 05/29/2018] [Indexed: 12/11/2022]
Abstract
During natural speech perception, humans must parse temporally continuous auditory and visual speech signals into sequences of words. However, most studies of speech perception present only single words or syllables. We used electrocorticography (subdural electrodes implanted on the brains of epileptic patients) to investigate the neural mechanisms for processing continuous audiovisual speech signals consisting of individual sentences. Using partial correlation analysis, we found that posterior superior temporal gyrus (pSTG) and medial occipital cortex tracked both the auditory and the visual speech envelopes. These same regions, as well as inferior temporal cortex, responded more strongly to a dynamic video of a talking face compared to auditory speech paired with a static face. Occipital cortex and pSTG carry temporal information about both auditory and visual speech dynamics. Visual speech tracking in pSTG may be a mechanism for enhancing perception of degraded auditory speech.
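The partial correlation analysis mentioned above (relating a neural signal to one speech envelope while controlling for the other) can be sketched as follows; the toy signals are hypothetical, not the study's ECoG data:

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after regressing out z from both."""
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

# toy data: a neural signal driven by the auditory envelope only,
# while the auditory and visual envelopes are themselves correlated
rng = np.random.default_rng(2)
aud = rng.standard_normal(1000)
vis = 0.7 * aud + 0.3 * rng.standard_normal(1000)
neural = aud + 0.5 * rng.standard_normal(1000)

r_vis = np.corrcoef(neural, vis)[0, 1]    # inflated by variance shared with aud
r_vis_p = partial_corr(neural, vis, aud)  # near zero once aud is removed
```

This illustrates why partial correlation matters here: auditory and visual speech envelopes co-vary, so raw correlations can falsely suggest visual tracking in a region driven only by the acoustics.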
Affiliation(s)
- Cristiano Micheli
- Department of Psychology, Carl von Ossietzky University, Oldenburg, Germany
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, The Netherlands
- Inga M Schepers
- Department of Psychology, Carl von Ossietzky University, Oldenburg, Germany
- Research Center Neurosensory Science, Carl von Ossietzky University, Oldenburg, Germany
- Müge Ozker
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas
- Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas
- Michael E. DeBakey Veterans Affairs Medical Center, Houston, Texas
- Jochem W Rieger
- Department of Psychology, Carl von Ossietzky University, Oldenburg, Germany
- Research Center Neurosensory Science, Carl von Ossietzky University, Oldenburg, Germany
48
Yuan Y, Wayland R, Oh Y. Visual analog of the acoustic amplitude envelope benefits speech perception in noise. J Acoust Soc Am 2020; 147:EL246. [PMID: 32237828 DOI: 10.1121/10.0000737] [Received: 10/14/2019] [Accepted: 01/29/2020] [Indexed: 06/11/2023]
Abstract
The nature of the visual input that integrates with the audio signal to yield speech processing advantages remains controversial. This study tests the hypothesis that the information extracted for audiovisual integration includes co-occurring suprasegmental dynamic changes in the acoustic and visual signal. English sentences embedded in multi-talker babble noise were presented to native English listeners in audio-only and audiovisual modalities. A significant intelligibility enhancement with the visual analogs congruent to the acoustic amplitude envelopes was observed. These results suggest that dynamic visual modulation provides speech rhythmic information that can be integrated online with the audio signal to enhance speech intelligibility.
Affiliation(s)
- Yi Yuan
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, Florida 32610, USA
- Ratree Wayland
- Department of Linguistics, University of Florida, Gainesville, Florida 32611, USA
- Yonghee Oh
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville, Florida 32610, USA
49
Shavit-Cohen K, Zion Golumbic E. The Dynamics of Attention Shifts Among Concurrent Speech in a Naturalistic Multi-speaker Virtual Environment. Front Hum Neurosci 2019; 13:386. [PMID: 31780911 PMCID: PMC6857110 DOI: 10.3389/fnhum.2019.00386] [Received: 05/02/2019] [Accepted: 10/16/2019] [Indexed: 12/18/2022] Open
Abstract
Focusing attention on one speaker against a background of other, irrelevant speech can be a challenging feat. A longstanding question in attention research is whether and how frequently individuals shift their attention towards task-irrelevant speech, arguably leading to occasional detection of words in a so-called unattended message. However, this has been difficult to gauge empirically, particularly when participants attend to continuous natural speech, due to the lack of appropriate metrics for detecting shifts in internal attention. Here we introduce a new experimental platform for studying the dynamic deployment of attention among concurrent speakers, utilizing a unique combination of Virtual Reality (VR) and eye-tracking technology. We created a Virtual Café in which participants sit across from and attend to the narrative of a target speaker. We manipulated the number and location of distractor speakers by placing additional characters throughout the Virtual Café. By monitoring participants' eye-gaze dynamics, we studied the patterns of overt attention shifts among concurrent speakers as well as the consequences of these shifts on speech comprehension. Our results reveal important individual differences in the gaze patterns displayed during selective attention to speech. While some participants stayed fixated on the target speaker throughout the entire experiment, approximately 30% of participants frequently shifted their gaze toward distractor speakers or other locations in the environment, regardless of the severity of audiovisual distraction. Critically, performing frequent gaze shifts negatively impacted the comprehension of target speech, and participants made more mistakes when looking away from the target speaker.
We also found that gaze-shifts occurred primarily during gaps in the acoustic input, suggesting that momentary reductions in acoustic masking prompt attention-shifts between competing speakers, in line with "glimpsing" theories of processing speech in noise. These results open a new window into understanding the dynamics of attention as they wax and wane over time, and the different listening patterns employed for dealing with the influx of sensory input in multisensory environments. Moreover, the novel approach developed here for tracking the locus of momentary attention in a naturalistic virtual-reality environment holds high promise for extending the study of human behavior and cognition and bridging the gap between the laboratory and real-life.
Affiliation(s)
- Elana Zion Golumbic
- The Gonda Multidisciplinary Brain Research Center, Bar Ilan University, Ramat Gan, Israel
50
Fu Z, Wu X, Chen J. Congruent audiovisual speech enhances auditory attention decoding with EEG. J Neural Eng 2019; 16:066033. [PMID: 31505476 DOI: 10.1088/1741-2552/ab4340] [Indexed: 11/12/2022]
Abstract
OBJECTIVE The auditory attention decoding (AAD) approach can be used to determine the identity of the attended speaker during an auditory selective attention task, by analyzing measurements of electroencephalography (EEG) data. The AAD approach has the potential to guide the design of speech enhancement algorithms in hearing aids, i.e. to identify the speech stream of the listener's interest so that hearing aid algorithms can amplify the target speech and attenuate other distracting sounds. This would consequently result in improved speech understanding and communication and reduced cognitive load. The present work aimed to investigate whether additional visual input (i.e. lipreading) would enhance the AAD performance for normal-hearing listeners. APPROACH In a two-talker scenario, where auditory stimuli of audiobooks narrated by two speakers were presented, multi-channel EEG signals were recorded while participants selectively attended to one speaker and ignored the other. Speakers' mouth movements were recorded during narration to provide the visual stimuli. Stimulus conditions included audio-only, visual input congruent with either (i.e. attended or unattended) speaker, and visual input incongruent with either speaker. The AAD approach was performed separately for each condition to evaluate the effect of additional visual input on AAD. MAIN RESULTS Relative to the audio-only condition, AAD performance improved with visual input only when it was congruent with the attended speech stream, and the improvement was about 14 percentage points in decoding accuracy. Cortical envelope-tracking activity in both auditory and visual cortex was stronger for the congruent audiovisual speech condition than for the other conditions. In addition, the congruent audiovisual condition showed higher AAD robustness, achieving higher accuracy than the audio-only condition with fewer channels and shorter trial durations. SIGNIFICANCE The present work complements previous studies and further demonstrates the feasibility of AAD-guided design of hearing aids for daily face-to-face conversations. It also provides guidance for designing a low-density EEG setup for the AAD approach.
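The AAD decision rule implied above, labeling the attended talker as the stream whose envelope best matches an EEG-based reconstruction, can be sketched as follows. The trial structure and noise level are illustrative assumptions, not the study's setup:

```python
import numpy as np

def decode_attention(recon, env_a, env_b):
    """Label the attended stream as the candidate envelope whose
    correlation with the EEG-reconstructed envelope is higher."""
    ra = np.corrcoef(recon, env_a)[0, 1]
    rb = np.corrcoef(recon, env_b)[0, 1]
    return "A" if ra > rb else "B"

# toy trials: the reconstruction is the attended envelope plus noise
rng = np.random.default_rng(3)
n_trials = 100
correct = 0
for _ in range(n_trials):
    env_a = rng.standard_normal(500)
    env_b = rng.standard_normal(500)
    recon = env_a + 1.5 * rng.standard_normal(500)  # listener attends A
    correct += decode_attention(recon, env_a, env_b) == "A"
accuracy = correct / n_trials
```

Decoding accuracy in this scheme depends on trial duration and reconstruction quality, which is why the abstract's finding of higher accuracy with shorter trials and fewer channels under congruent audiovisual input matters for practical hearing-aid use.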
Affiliation(s)
- Zhen Fu
- Department of Machine Intelligence, Speech and Hearing Research Center, and Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing 100871, People's Republic of China