1. Fantoni M, Federici A, Camponogara I, Handjaras G, Martinelli A, Bednaya E, Ricciardi E, Pavani F, Bottari D. The impact of face masks on face-to-face neural tracking of speech: Auditory and visual obstacles. Heliyon 2024; 10:e34860. [PMID: 39157360 PMCID: PMC11328033 DOI: 10.1016/j.heliyon.2024.e34860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 07/17/2024] [Accepted: 07/17/2024] [Indexed: 08/20/2024] Open
Abstract
Face masks provide fundamental protection against the transmission of respiratory viruses but hamper communication. We estimated auditory and visual obstacles generated by face masks on communication by measuring the neural tracking of speech. To this end, we recorded the EEG while participants were exposed to naturalistic audio-visual speech, embedded in 5-talker noise, in three contexts: (i) no-mask (audio-visual information was fully available), (ii) virtual mask (occluded lips, but intact audio), and (iii) real mask (occluded lips and degraded audio). Neural tracking of lip movements and of the sound envelope of speech was measured through backward modeling, that is, by reconstructing stimulus properties from neural activity. Behaviorally, face masks increased perceived listening difficulty and phonological errors in speech content retrieval. At the neural level, we observed that the occlusion of the mouth abolished lip tracking and dampened neural tracking of the speech envelope at the earliest processing stages. By contrast, degraded acoustic information related to face mask filtering altered neural tracking of speech envelope at later processing stages. Finally, a consistent link emerged between the increment of perceived listening difficulty and the drop in reconstruction performance of speech envelope when attending to a speaker wearing a face mask. Results clearly dissociated the visual and auditory impact of face masks on the neural tracking of speech. While the visual obstacle related to face masks hampered the ability to predict and integrate audio-visual speech, the auditory filter generated by face masks impacted neural processing stages typically associated with auditory selective attention. The link between perceived difficulty and neural tracking drop also provides evidence of the impact of face masks on the metacognitive levels subtending face-to-face communication.
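As a rough illustration of the backward modeling mentioned in the abstract above, the sketch below reconstructs a stimulus feature from time-lagged multichannel "EEG" with ridge regression and scores the reconstruction by Pearson correlation on held-out samples. It is a toy under stated assumptions, not the authors' pipeline: the data are simulated (white noise stands in for the speech envelope), and the channel delays, gains, lag window, ridge value, and all function names are hypothetical.

```python
import math
import random


def lagged_design(eeg, lags):
    """Row t holds eeg[ch][t + lag] for every channel and lag (EEG follows the stimulus)."""
    n_rows = len(eeg[0]) - max(lags)
    return [[ch[t + lag] for ch in eeg for lag in lags] for t in range(n_rows)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_decoder(X, y, ridge=1.0):
    """Ridge regression weights: w = (X'X + ridge * I)^-1 X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_and_score(seed=0, n=600, n_train=400):
    """Simulate 3 noisy channels that echo the stimulus, then decode it back."""
    rng = random.Random(seed)
    envelope = [rng.gauss(0, 1) for _ in range(n)]  # stand-in for a speech envelope
    eeg = []
    for delay, gain in [(1, 1.0), (2, 0.8), (3, 0.6)]:  # hypothetical delays and gains
        eeg.append([(gain * envelope[t - delay] if t >= delay else 0.0) + rng.gauss(0, 1)
                    for t in range(n)])
    X = lagged_design(eeg, range(5))  # lags of 0..4 samples after the stimulus
    y = envelope[:len(X)]
    w = fit_decoder(X[:n_train], y[:n_train])
    recon = [sum(wi * xi for wi, xi in zip(w, row)) for row in X[n_train:]]
    return pearson(recon, y[n_train:])  # reconstruction performance on held-out data


if __name__ == "__main__":
    print(f"held-out reconstruction r = {simulate_and_score():.2f}")
```

In this framing, a drop in the held-out correlation between conditions is what a "drop in reconstruction performance" would look like; real analyses typically add cross-validation over the ridge parameter and band-pass filtering, which are omitted here.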
Affiliation(s)
- M. Fantoni
  - MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- A. Federici
  - MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- G. Handjaras
  - MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- E. Bednaya
  - MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- E. Ricciardi
  - MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
- F. Pavani
  - Centro Interdipartimentale Mente/Cervello–CIMEC, University of Trento, Italy
  - Centro Interuniversitario di Ricerca “Cognizione Linguaggio e Sordità”–CIRCLeS, University of Trento, Italy
- D. Bottari
  - MoMiLab, IMT School for Advanced Studies Lucca, Lucca, Italy
2. Gutierrez-Sigut E, Lamarche VM, Rowley K, Lago EF, Pardo-Guijarro MJ, Saenz I, Frigola B, Frigola S, Aliaga D, Goldberg L. How do face masks impact communication amongst deaf/HoH people? Cogn Res Princ Implic 2022; 7:81. [PMID: 36063244 PMCID: PMC9443624 DOI: 10.1186/s41235-022-00431-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 08/11/2022] [Indexed: 11/10/2022] Open
Abstract
Face coverings have been key in reducing the spread of COVID-19. At the same time, they have hindered interpersonal communication, particularly for those who rely on speechreading to aid communication. The available research indicated that deaf/hard of hearing (HoH) people experienced great difficulty communicating with people wearing masks and negative effects on wellbeing. Here we extended these findings by exploring which factors predict deaf/HoH people’s communication difficulties, loss of information, and wellbeing. We also explored the factors predicting perceived usefulness of transparent face coverings and alternative ways of communicating. We report the findings from an accessible survey study, released in two written and three signed languages. Responses from 395 deaf/HoH UK and Spanish residents were collected online at a time when masks were mandatory. We investigated whether onset and level of deafness, knowledge of sign language, speechreading fluency, and country of residence predicted communication difficulties, wellbeing, and degree to which transparent face coverings were considered useful. Overall, deaf/HoH people and their relatives used masks most of the time despite greater communication difficulties. Late-onset deaf people were the group that experienced more difficulties in communication, and also reported lower wellbeing. However, both early- and late-onset deaf people reported missing more information and feeling more disconnected from society than HoH people. Finally, signers valued transparent face shields more positively than non-signers. The latter suggests that, while seeing the lips is positive to everyone, signers appreciate seeing the whole facial expression. Importantly, our data also revealed the importance of visual communication other than speechreading to facilitate face-to-face interactions.
Late-onset deaf people experienced more difficulties in communication and low wellbeing. Severely/profoundly deaf people missed more information and felt disconnected from society. Signers preferred completely transparent face coverings. More frequent use of masks doesn’t necessarily imply more difficulty communicating. Visual communication, pro-social behaviour, and societal structure might help easing communication.
3. Cieśla K, Wolak T, Lorens A, Mentzel M, Skarżyński H, Amedi A. Effects of training and using an audio-tactile sensory substitution device on speech-in-noise understanding. Sci Rep 2022; 12:3206. [PMID: 35217676 PMCID: PMC8881456 DOI: 10.1038/s41598-022-06855-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/16/2021] [Accepted: 01/28/2022] [Indexed: 11/09/2022] Open
Abstract
Understanding speech in background noise is challenging. Wearing face-masks, imposed by the COVID19-pandemics, makes it even harder. We developed a multi-sensory setup, including a sensory substitution device (SSD) that can deliver speech simultaneously through audition and as vibrations on the fingertips. The vibrations correspond to low frequencies extracted from the speech input. We trained two groups of non-native English speakers in understanding distorted speech in noise. After a short session (30-45 min) of repeating sentences, with or without concurrent matching vibrations, we showed comparable mean group improvement of 14-16 dB in Speech Reception Threshold (SRT) in two test conditions, i.e., when the participants were asked to repeat sentences only from hearing and also when matching vibrations on fingertips were present. This is a very strong effect, if one considers that a 10 dB difference corresponds to doubling of the perceived loudness. The number of sentence repetitions needed for both types of training to complete the task was comparable. Meanwhile, the mean group SNR for the audio-tactile training (14.7 ± 8.7) was significantly lower (harder) than for the auditory training (23.9 ± 11.8), which indicates a potential facilitating effect of the added vibrations. In addition, both before and after training most of the participants (70-80%) showed better performance (by mean 4-6 dB) in speech-in-noise understanding when the audio sentences were accompanied with matching vibrations. This is the same magnitude of multisensory benefit that we reported, with no training at all, in our previous study using the same experimental procedures. After training, performance in this test condition was also best in both groups (SRT ~ 2 dB). The least significant effect of both training types was found in the third test condition, i.e. when participants were repeating sentences accompanied with non-matching tactile vibrations and the performance in this condition was also poorest after training. The results indicate that both types of training may remove some level of difficulty in sound perception, which might enable a more proper use of speech inputs delivered via vibrotactile stimulation. We discuss the implications of these novel findings with respect to basic science. In particular, we show that even in adulthood, i.e. long after the classical "critical periods" of development have passed, a new pairing between a certain computation (here, speech processing) and an atypical sensory modality (here, touch) can be established and trained, and that this process can be rapid and intuitive. We further present possible applications of our training program and the SSD for auditory rehabilitation in patients with hearing (and sight) deficits, as well as healthy individuals in suboptimal acoustic situations.
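The abstract's comparison between the SRT improvement and perceived loudness can be worked through numerically. The sketch below simply applies the rule of thumb quoted in the abstract (10 dB corresponds to a doubling of perceived loudness) to the reported 14-16 dB range; the function name and its `db_per_doubling` parameter are hypothetical, and the calculation is an illustration of the stated rule rather than a psychoacoustic model.

```python
def loudness_ratio(delta_db, db_per_doubling=10.0):
    """Perceived-loudness ratio implied by a level change, assuming the
    rule of thumb that every `db_per_doubling` dB doubles loudness."""
    return 2.0 ** (delta_db / db_per_doubling)


if __name__ == "__main__":
    for delta in (10, 14, 16):
        print(f"{delta} dB -> about {loudness_ratio(delta):.2f}x perceived loudness")
```

Under that rule, the 14-16 dB improvement would correspond to roughly a 2.6-fold to 3.0-fold change on the loudness scale.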
Affiliation(s)
- K Cieśla
  - The Baruch Ivcher Institute for Brain, Cognition & Technology, The Baruch Ivcher School of Psychology and the Ruth and Meir Rosental Brain Imaging Center, Reichman University, Herzliya, Israel
  - World Hearing Centre, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- T Wolak
  - World Hearing Centre, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- A Lorens
  - World Hearing Centre, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- M Mentzel
  - The Baruch Ivcher Institute for Brain, Cognition & Technology, The Baruch Ivcher School of Psychology and the Ruth and Meir Rosental Brain Imaging Center, Reichman University, Herzliya, Israel
- H Skarżyński
  - World Hearing Centre, Institute of Physiology and Pathology of Hearing, Warsaw, Poland
- A Amedi
  - The Baruch Ivcher Institute for Brain, Cognition & Technology, The Baruch Ivcher School of Psychology and the Ruth and Meir Rosental Brain Imaging Center, Reichman University, Herzliya, Israel
4. Banks B, Gowen E, Munro KJ, Adank P. Eye Gaze and Perceptual Adaptation to Audiovisual Degraded Speech. J Speech Lang Hear Res 2021; 64:3432-3445. [PMID: 34463528 DOI: 10.1044/2021_jslhr-21-00106] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
Purpose: Visual cues from a speaker's face may benefit perceptual adaptation to degraded speech, but current evidence is limited. We aimed to replicate results from previous studies to establish the extent to which visual speech cues can lead to greater adaptation over time, extending existing results to a real-time adaptation paradigm (i.e., without a separate training period). A second aim was to investigate whether eye gaze patterns toward the speaker's mouth were related to better perception, hypothesizing that listeners who looked more at the speaker's mouth would show greater adaptation. Method: A group of listeners (n = 30) was presented with 90 noise-vocoded sentences in audiovisual format, whereas a control group (n = 29) was presented with the audio signal only. Recognition accuracy was measured throughout and eye tracking was used to measure fixations toward the speaker's eyes and mouth in the audiovisual group. Results: Previous studies were partially replicated: The audiovisual group had better recognition throughout and adapted slightly more rapidly, but both groups showed an equal amount of improvement overall. Longer fixations on the speaker's mouth in the audiovisual group were related to better overall accuracy. An exploratory analysis further demonstrated that the duration of fixations to the speaker's mouth decreased over time. Conclusions: The results suggest that visual cues may not benefit adaptation to degraded speech as much as previously thought. Longer fixations on a speaker's mouth may play a role in successfully decoding visual speech cues; however, this will need to be confirmed in future research to fully understand how patterns of eye gaze are related to audiovisual speech recognition. All materials, data, and code are available at https://osf.io/2wqkf/.
Affiliation(s)
- Briony Banks
  - Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
- Emma Gowen
  - Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
- Kevin J Munro
  - Manchester Centre for Audiology and Deafness, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
  - Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, United Kingdom
- Patti Adank
  - Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
5. Bernstein JGW, Venezia JH, Grant KW. Auditory and auditory-visual frequency-band importance functions for consonant recognition. J Acoust Soc Am 2020; 147:3712. [PMID: 32486805 DOI: 10.1121/10.0001301] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/29/2019] [Accepted: 05/05/2020] [Indexed: 06/11/2023]
Abstract
The relative importance of individual frequency regions for speech intelligibility has been firmly established for broadband auditory-only (AO) conditions. Yet, speech communication often takes place face-to-face. This study tested the hypothesis that under auditory-visual (AV) conditions, where visual information is redundant with high-frequency auditory cues, lower frequency regions will increase in relative importance compared to AO conditions. Frequency band-importance functions for consonants were measured for eight hearing-impaired and four normal-hearing listeners. Speech was filtered into four 1/3-octave bands each separated by an octave to minimize energetic masking. On each trial, the signal-to-noise ratio (SNR) in each band was selected randomly from a 10-dB range. AO and AV band-importance functions were estimated using three logistic-regression analyses: a primary model relating performance to the four independent SNRs; a control model that also included band-interaction terms; and a different set of four control models, each examining one band at a time. For both listener groups, the relative importance of the low-frequency bands increased under AV conditions, consistent with earlier studies using isolated speech bands. All three analyses showed similar results, indicating the absence of cross-band interactions. These results suggest that accurate prediction of AV speech intelligibility may require different frequency-importance functions than for AO conditions.
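The primary analysis described in the abstract above (a logistic regression relating trial-by-trial performance to four independent band SNRs) can be sketched on simulated data. Everything below is an assumption made for illustration: the "true" band weights, the trial count, the plain gradient-ascent fit, and the choice to express relative importance as each weight's share of the weight sum are hypothetical and are not taken from the study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_trials(true_weights, n_trials, rng):
    """Each trial draws one SNR per band from a 10-dB range (-5..+5 dB here)
    and a correct/incorrect response from a logistic model."""
    trials = []
    for _ in range(n_trials):
        snrs = [rng.uniform(-5.0, 5.0) for _ in true_weights]
        p_correct = sigmoid(sum(w * s for w, s in zip(true_weights, snrs)))
        trials.append((snrs, 1 if rng.random() < p_correct else 0))
    return trials


def fit_logistic(trials, n_iter=300, lr=0.05):
    """Maximum-likelihood logistic regression via batch gradient ascent."""
    n_bands = len(trials[0][0])
    weights, bias = [0.0] * n_bands, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * n_bands, 0.0
        for snrs, correct in trials:
            err = correct - sigmoid(bias + sum(w * s for w, s in zip(weights, snrs)))
            grad_b += err
            for i, s in enumerate(snrs):
                grad_w[i] += err * s
        bias += lr * grad_b / len(trials)
        weights = [w + lr * g / len(trials) for w, g in zip(weights, grad_w)]
    return weights, bias


def band_importance(weights):
    """Relative importance: each band's weight as a share of the total."""
    total = sum(weights)
    return [w / total for w in weights]


def demo(seed=0):
    rng = random.Random(seed)
    true_weights = [0.1, 0.2, 0.3, 0.6]  # hypothetical per-dB weights for 4 bands
    trials = simulate_trials(true_weights, n_trials=1000, rng=rng)
    weights, _ = fit_logistic(trials)
    return band_importance(weights)


if __name__ == "__main__":
    print([round(v, 2) for v in demo()])
```

Comparing importance profiles fitted separately to auditory-only and auditory-visual trials is the kind of contrast the abstract reports; the control models with band-interaction terms are not sketched here.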
Affiliation(s)
- Joshua G W Bernstein
  - National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, 4954 North Palmer Road, Bethesda, Maryland 20889, USA
- Jonathan H Venezia
  - Veterans Affairs Loma Linda Healthcare System, 11201 Benton Street, Loma Linda, California 92357, USA
- Ken W Grant
  - National Military Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, 4954 North Palmer Road, Bethesda, Maryland 20889, USA