1
Gaultier C, Goehring T. Recovering speech intelligibility with deep learning and multiple microphones in noisy-reverberant situations for people using cochlear implants. J Acoust Soc Am 2024; 155:3833-3847. [PMID: 38884525] [DOI: 10.1121/10.0026218]
Abstract
For cochlear implant (CI) listeners, holding a conversation in noisy and reverberant environments is often challenging. Deep-learning algorithms can potentially mitigate these difficulties by enhancing speech in everyday listening environments. This study compared several deep-learning algorithms with access to one, two unilateral, or six bilateral microphones that were trained to recover speech signals by jointly removing noise and reverberation. The noisy-reverberant speech and an ideal noise reduction algorithm served as lower and upper references, respectively. Objective signal metrics were compared with results from two listening tests, including 15 typical hearing listeners with CI simulations and 12 CI listeners. Large and statistically significant improvements in speech reception thresholds of 7.4 and 10.3 dB were found for the multi-microphone algorithms. For the single-microphone algorithm, there was an improvement of 2.3 dB but only for the CI listener group. The objective signal metrics correctly predicted the rank order of results for CI listeners, and there was an overall agreement for most effects and variances between results for CI simulations and CI listeners. These algorithms hold promise to improve speech intelligibility for CI listeners in environments with noise and reverberation and benefit from a boost in performance when using features extracted from multiple microphones.
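The abstract does not specify the "ideal noise reduction algorithm" used as the upper reference; one common oracle in this literature is an ideal ratio mask (IRM) computed from the known clean and noise signals in the short-time Fourier transform domain. A minimal sketch under that assumption (all signals and parameters below are illustrative, not the study's method):

```python
import numpy as np
from scipy.signal import stft, istft

def ideal_ratio_mask_enhance(clean, noise, fs=16000, nperseg=512):
    """Oracle enhancement: apply an ideal ratio mask, computed from the
    known clean and noise signals, to the noisy mixture."""
    mixture = clean + noise
    _, _, S_clean = stft(clean, fs=fs, nperseg=nperseg)
    _, _, S_noise = stft(noise, fs=fs, nperseg=nperseg)
    _, _, S_mix = stft(mixture, fs=fs, nperseg=nperseg)
    # IRM: clean power over total power in each time-frequency bin
    irm = np.abs(S_clean)**2 / (np.abs(S_clean)**2 + np.abs(S_noise)**2 + 1e-12)
    _, enhanced = istft(irm * S_mix, fs=fs, nperseg=nperseg)
    return enhanced

# Toy usage with synthetic stand-ins for speech and babble noise
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
clean = np.sin(2 * np.pi * 440 * t)
noise = 0.5 * rng.standard_normal(t.size)
enhanced = ideal_ratio_mask_enhance(clean, noise)
```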
Affiliation(s)
- Clément Gaultier
- Cambridge Hearing Group, Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Tobias Goehring
- Cambridge Hearing Group, Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
2
Malone AK, Hungerford ME, Smith SB, Chang NYN, Uchanski RM, Oh YH, Lewis RF, Hullar TE. Age-Related Changes in Temporal Binding Involving Auditory and Vestibular Inputs. Semin Hear 2024; 45:110-122. [PMID: 38370520] [PMCID: PMC10872654] [DOI: 10.1055/s-0043-1770137]
Abstract
Maintaining balance involves the combination of sensory signals from the visual, vestibular, proprioceptive, and auditory systems. However, physical and biological constraints ensure that these signals are perceived slightly asynchronously. The brain only recognizes them as simultaneous when they occur within a period of time called the temporal binding window (TBW). Aging can prolong the TBW, leading to temporal uncertainty during multisensory integration. This effect might contribute to imbalance in the elderly but has not been examined with respect to vestibular inputs. Here, we compared the vestibular-related TBW in 13 younger and 12 older subjects undergoing 0.5 Hz sinusoidal rotations about the earth-vertical axis. An alternating dichotic auditory stimulus was presented at the same frequency but with the phase varied to determine the temporal range over which the two stimuli were perceived as simultaneous at least 75% of the time, defined as the TBW. The mean TBW among younger subjects was 286 ms (SEM ± 56 ms) and among older subjects was 560 ms (SEM ± 52 ms). TBW was related to vestibular sensitivity among younger but not older subjects, suggesting that a prolonged TBW could be a mechanism for imbalance in the elderly person independent of changes in peripheral vestibular function.
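The TBW here is defined operationally as the range of auditory-vestibular offsets judged simultaneous at least 75% of the time. A minimal sketch of one way to extract such a window, assuming a Gaussian fit to proportion-simultaneous data (the offsets and proportions below are illustrative, not the study's data):

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative data: stimulus offsets (ms) and proportion judged "simultaneous"
offsets = np.array([-400, -300, -200, -100, 0, 100, 200, 300, 400], float)
p_simult = np.array([0.10, 0.30, 0.65, 0.90, 0.95, 0.88, 0.60, 0.25, 0.08])

def gauss(soa, amp, mu, sigma):
    """Bell-shaped synchrony curve centered on the point of subjective simultaneity."""
    return amp * np.exp(-0.5 * ((soa - mu) / sigma) ** 2)

(amp, mu, sigma), _ = curve_fit(gauss, offsets, p_simult, p0=[1.0, 0.0, 150.0])

# TBW: width of the offset range where the fitted curve exceeds the 75% criterion
criterion = 0.75
if amp > criterion:
    half_width = sigma * np.sqrt(2 * np.log(amp / criterion))
    print(f"PSS = {mu:.0f} ms, TBW = {2 * half_width:.0f} ms")
```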
Affiliation(s)
- Michelle E. Hungerford
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, Oregon
- Department of Otolaryngology—Head and Neck Surgery, Oregon Health and Science University, Portland, Oregon
- Spencer B. Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas, Austin, Texas
- Nai-Yuan N. Chang
- Department of Oral and Maxillofacial Surgery, Oregon Health and Science University, Portland, Oregon
- Rosalie M. Uchanski
- Department of Otolaryngology—Head and Neck Surgery, Washington University in St. Louis, St. Louis, Missouri
- Yong-Hee Oh
- University of Louisville, Louisville, Kentucky
- Richard F. Lewis
- Departments of Otolaryngology and Neurology, Harvard Medical School, Boston, Massachusetts
- Timothy E. Hullar
- VA RR&D National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, Oregon
- Department of Otolaryngology—Head and Neck Surgery, Oregon Health and Science University, Portland, Oregon
3
Gordon-Salant S, Schwartz MS, Oppler KA, Yeni-Komshian GH. Detection and Recognition of Asynchronous Auditory/Visual Speech: Effects of Age, Hearing Loss, and Talker Accent. Front Psychol 2022; 12:772867. [PMID: 35153900] [PMCID: PMC8832148] [DOI: 10.3389/fpsyg.2021.772867]
Abstract
This investigation examined age-related differences in auditory-visual (AV) integration as reflected on perceptual judgments of temporally misaligned AV English sentences spoken by native English and native Spanish talkers. In the detection task, it was expected that slowed auditory temporal processing of older participants, relative to younger participants, would be manifest as a shift in the range over which participants would judge asynchronous stimuli as synchronous (referred to as the "AV simultaneity window"). The older participants were also expected to exhibit greater declines in speech recognition for asynchronous AV stimuli than younger participants. Talker accent was hypothesized to influence listener performance, with older listeners exhibiting a greater narrowing of the AV simultaneity window and much poorer recognition of asynchronous AV foreign-accented speech compared to younger listeners. Participant groups included younger and older participants with normal hearing and older participants with hearing loss. Stimuli were video recordings of sentences produced by native English and native Spanish talkers. The video recordings were altered in 50 ms steps by delaying either the audio or video onset. Participants performed a detection task in which they judged whether the sentences were synchronous or asynchronous, and performed a recognition task for multiple synchronous and asynchronous conditions. Both the detection and recognition tasks were conducted at the individualized signal-to-noise ratio (SNR) corresponding to approximately 70% correct speech recognition performance for synchronous AV sentences. Older listeners with and without hearing loss generally showed wider AV simultaneity windows than younger listeners, possibly reflecting slowed auditory temporal processing in auditory lead conditions and reduced sensitivity to asynchrony in auditory lag conditions. However, older and younger listeners were affected similarly by misalignment of auditory and visual signal onsets on the speech recognition task. This suggests that older listeners are negatively impacted by temporal misalignments for speech recognition, even when they do not notice that the stimuli are asynchronous. Overall, the findings show that when listener performance is equated for simultaneous AV speech signals, age effects are apparent in detection judgments but not in recognition of asynchronous speech.
Affiliation(s)
- Sandra Gordon-Salant
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, United States
4
Sandhya, Vinay, Manchaiah V. Perception of Incongruent Audiovisual Speech: Distribution of Modality-Specific Responses. Am J Audiol 2021; 30:968-979. [PMID: 34499528] [DOI: 10.1044/2021_aja-20-00213]
Abstract
PURPOSE Multimodal sensory integration in audiovisual (AV) speech perception is a naturally occurring phenomenon. Modality-specific responses to dichotic incongruent AV speech stimuli (auditory left, auditory right, and visual) help in understanding AV speech processing through each input modality. The distribution of activity in the frontal motor areas involved in speech production has been shown to correlate with how subjects perceive the same syllable differently or perceive different syllables. This study investigated the distribution of modality-specific responses to dichotic incongruent AV speech stimuli by simultaneously presenting consonant-vowel (CV) syllables with different places of articulation to the participant's left and right ears and visually. DESIGN A dichotic experimental design was adopted. Six stop CV syllables /pa/, /ta/, /ka/, /ba/, /da/, and /ga/ were assembled to create dichotic incongruent AV speech material. Participants included 40 native speakers of Norwegian (20 women, M age = 22.6 years, SD = 2.43 years; 20 men, M age = 23.7 years, SD = 2.08 years). RESULTS Findings of this study showed that, under dichotic listening conditions, velar CV syllables resulted in the highest scores in the respective ears, and this might be explained by stimulus dominance of velar consonants, as shown in previous studies. However, this study, with dichotic auditory stimuli accompanied by an incongruent video segment, demonstrated that the presentation of a visually distinct video segment possibly draws attention to the video segment in some participants, thereby reducing the overall recognition of the dominant syllable. Furthermore, the findings here suggest the possibility of shorter response times to incongruent AV stimuli in females compared with males. CONCLUSION The identification of the left audio, right audio, and visual segments in dichotic incongruent AV stimuli depends on place of articulation, stimulus dominance, and voice onset time of the CV syllables.
Affiliation(s)
- Sandhya
- Department of Neuromedicine and Movement Science, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Vinay
- Department of Neuromedicine and Movement Science, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- V. Manchaiah
- Department of Speech and Hearing Sciences, Lamar University, Beaumont, TX
5
Generalizable EEG Encoding Models with Naturalistic Audiovisual Stimuli. J Neurosci 2021; 41:8946-8962. [PMID: 34503996] [DOI: 10.1523/jneurosci.2891-20.2021]
Abstract
In natural conversations, listeners must attend to what others are saying while ignoring extraneous background sounds. Recent studies have used encoding models to predict electroencephalography (EEG) responses to speech in noise-free listening situations, sometimes referred to as "speech tracking." Researchers have analyzed how speech tracking changes with different types of background noise. It is unclear, however, whether neural responses from acoustically rich, naturalistic environments with and without background noise can be generalized to more controlled stimuli. If encoding models for acoustically rich, naturalistic stimuli are generalizable to other tasks, this could aid in data collection from populations of individuals who may not tolerate listening to more controlled and less engaging stimuli for long periods of time. We recorded noninvasive scalp EEG while 17 human participants (8 male/9 female) listened to speech without noise and audiovisual speech stimuli containing overlapping speakers and background sounds. We fit multivariate temporal receptive field encoding models to predict EEG responses to pitch, the acoustic envelope, phonological features, and visual cues in both stimulus conditions. Our results suggested that neural responses to naturalistic stimuli were generalizable to more controlled datasets. EEG responses to speech in isolation were predicted accurately using phonological features alone, while responses to speech in a rich acoustic background were more accurate when including both phonological and acoustic features. Our findings suggest that naturalistic audiovisual stimuli can be used to measure receptive fields that are comparable and generalizable to more controlled audio-only stimuli. SIGNIFICANCE STATEMENT Understanding spoken language in natural environments requires listeners to parse acoustic and linguistic information in the presence of other distracting stimuli. However, most studies of auditory processing rely on highly controlled stimuli with no background noise, or with background noise inserted at specific times. Here, we compare models where EEG data are predicted based on a combination of acoustic, phonetic, and visual features in highly disparate stimuli: sentences from a speech corpus and speech embedded within movie trailers. We show that modeling neural responses to highly noisy, audiovisual movies can uncover tuning for acoustic and phonetic information that generalizes to simpler stimuli typically used in sensory neuroscience experiments.
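Multivariate temporal receptive field encoding models of this kind are commonly fit as time-lagged ridge regressions from stimulus features to each EEG channel. A minimal single-feature, single-channel sketch under that assumption (random stand-in data; the lag range and regularization strength are illustrative):

```python
import numpy as np

def lagged_design(features, min_lag, max_lag):
    """Stack time-shifted copies of a (samples x features) matrix."""
    n, d = features.shape
    lags = range(min_lag, max_lag + 1)
    X = np.zeros((n, d * len(lags)))
    for j, lag in enumerate(lags):
        shifted = np.roll(features, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0       # zero-pad samples rolled in from the end
        elif lag < 0:
            shifted[lag:] = 0
        X[:, j * d:(j + 1) * d] = shifted
    return X

def fit_trf(features, eeg, min_lag=0, max_lag=40, alpha=1e2):
    """Ridge-regression TRF: w = (X'X + alpha*I)^-1 X'y."""
    X = lagged_design(features, min_lag, max_lag)
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ eeg)

rng = np.random.default_rng(1)
envelope = rng.standard_normal((5000, 1))     # e.g., acoustic envelope feature
eeg = np.convolve(envelope[:, 0], [0, 0.5, 1.0, 0.5, 0], "same")
eeg += 0.5 * rng.standard_normal(5000)        # one noisy EEG channel
weights = fit_trf(envelope, eeg)              # TRF weights across time lags
```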
6
Abstract
OBJECTIVES When auditory and visual speech information are presented together, listeners obtain an audiovisual (AV) benefit or a speech understanding improvement compared with auditory-only (AO) or visual-only (VO) presentations. Cochlear-implant (CI) listeners, who receive degraded speech input and therefore understand speech using primarily temporal information, seem to readily use visual cues and can achieve a larger AV benefit than normal-hearing (NH) listeners. It is unclear, however, if the AV benefit remains relatively large for CI listeners when trying to understand foreign-accented speech when compared with unaccented speech. Accented speech can introduce changes to temporal auditory cues and visual cues, which could decrease the usefulness of AV information. Furthermore, we sought to determine if the AV benefit was relatively larger in CI compared with NH listeners for both unaccented and accented speech. DESIGN AV benefit was investigated for unaccented and Spanish-accented speech by presenting English sentences in AO, VO, and AV conditions to 15 CI and 15 age- and performance-matched NH listeners. Performance matching between NH and CI listeners was achieved by varying the number of channels of a noise vocoder for the NH listeners. Because of the differences in age and hearing history of the CI listeners, the effects of listener-related variables on speech understanding performance and AV benefit were also examined. RESULTS AV benefit was observed for both unaccented and accented conditions and for both CI and NH listeners. The two groups showed similar performance for the AO and AV conditions, and the normalized AV benefit was relatively smaller for the accented than the unaccented conditions. In the CI listeners, older age was associated with significantly poorer performance with the accented speaker compared with the unaccented speaker. The negative impact of age was somewhat reduced by a significant improvement in performance with access to AV information. CONCLUSIONS When auditory speech information is degraded by CI sound processing, visual cues can be used to improve speech understanding, even in the presence of a Spanish accent. The AV benefit of the CI listeners closely matched that of the NH listeners presented with vocoded speech, which was unexpected given that CI listeners appear to rely more on visual information to communicate. This result is perhaps due to the one-to-one age and performance matching of the listeners. While aging decreased CI listener performance with the accented speaker, access to visual cues boosted performance and could partially overcome the age-related speech understanding deficits for the older CI listeners.
7
Llorach G, Kirschner F, Grimm G, Zokoll MA, Wagener KC, Hohmann V. Development and evaluation of video recordings for the OLSA matrix sentence test. Int J Audiol 2021; 61:311-321. [PMID: 34109902] [DOI: 10.1080/14992027.2021.1930205]
Abstract
OBJECTIVE The aim was to create and validate an audiovisual version of the German matrix sentence test (MST), which uses the existing audio-only speech material. DESIGN Video recordings were recorded and dubbed with the audio of the existing German MST. The current study evaluates the MST in conditions including audio and visual modalities, speech in quiet and noise, and open and closed-set response formats. SAMPLE One female talker recorded repetitions of the German MST sentences. Twenty-eight young normal-hearing participants completed the evaluation study. RESULTS The audiovisual benefit in quiet was 7.0 dB in sound pressure level (SPL). In noise, the audiovisual benefit was 4.9 dB in signal-to-noise ratio (SNR). Speechreading scores ranged from 0% to 84% speech reception in visual-only sentences (mean = 50%). Audiovisual speech reception thresholds (SRTs) had a larger standard deviation than audio-only SRTs. Audiovisual SRTs improved successively with increasing number of lists performed. The final video recordings are openly available. CONCLUSIONS The video material achieved similar results as the literature in terms of gross speech intelligibility, despite the inherent asynchronies of dubbing. Due to ceiling effects, adaptive procedures targeting 80% intelligibility should be used. At least one or two training lists should be performed.
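Matrix-test SRTs are measured adaptively; the exact OLSA level rule is not reproduced here. Below is a generic 1-down/1-up staircase converging on 50% sentence intelligibility, with a simulated listener standing in for a participant (all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def simulated_listener(snr_db, srt_true=-7.0, slope=0.15):
    """Logistic psychometric function: probability of a correct response."""
    p = 1.0 / (1.0 + np.exp(-slope * 10 * (snr_db - srt_true)))
    return rng.random() < p

snr, step = 0.0, 4.0
track = []
for trial in range(30):
    correct = simulated_listener(snr)
    track.append(snr)
    snr += -step if correct else step   # 1-down/1-up rule targets 50% correct
    if trial in (5, 10):
        step /= 2                       # shrink step size as the track settles

srt_estimate = np.mean(track[-10:])     # average level over the final trials
print(f"Estimated SRT ~ {srt_estimate:.1f} dB SNR")
```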
Affiliation(s)
- Gerard Llorach
- Hörzentrum Oldenburg GmbH, Oldenburg, Germany
- Cluster of Excellence Hearing4All, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Auditory Signal Processing, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Frederike Kirschner
- Cluster of Excellence Hearing4All, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Auditory Signal Processing, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Giso Grimm
- Cluster of Excellence Hearing4All, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Auditory Signal Processing, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Melanie A Zokoll
- Hörzentrum Oldenburg GmbH, Oldenburg, Germany
- Cluster of Excellence Hearing4All, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Kirsten C Wagener
- Hörzentrum Oldenburg GmbH, Oldenburg, Germany
- Cluster of Excellence Hearing4All, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Hörtech gGmbH, Oldenburg, Germany
- Volker Hohmann
- Hörzentrum Oldenburg GmbH, Oldenburg, Germany
- Cluster of Excellence Hearing4All, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Auditory Signal Processing, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
8
Dias JW, McClaskey CM, Harris KC. Audiovisual speech is more than the sum of its parts: Auditory-visual superadditivity compensates for age-related declines in audible and lipread speech intelligibility. Psychol Aging 2021; 36:520-530. [PMID: 34124922] [PMCID: PMC8427734] [DOI: 10.1037/pag0000613]
Abstract
Multisensory input can improve perception of ambiguous unisensory information. For example, speech heard in noise can be more accurately identified when listeners see a speaker's articulating face. Importantly, these multisensory effects can be superadditive to listeners' ability to process unisensory speech, such that audiovisual speech identification is better than the sum of auditory-only and visual-only speech identification. Age-related declines in auditory and visual speech perception have been hypothesized to be concomitant with stronger cross-sensory influences on audiovisual speech identification, but little evidence exists to support this. Currently, studies do not account for the multisensory superadditive benefit of auditory-visual input in their metrics of the auditory or visual influence on audiovisual speech perception. Here we treat multisensory superadditivity as independent from unisensory auditory and visual processing. In the current investigation, older and younger adults identified auditory, visual, and audiovisual speech in noisy listening conditions. Performance across these conditions was used to compute conventional metrics of the auditory and visual influence on audiovisual speech identification and a metric of auditory-visual superadditivity. Consistent with past work, auditory and visual speech identification declined with age, audiovisual speech identification was preserved, and no age-related differences in the auditory or visual influence on audiovisual speech identification were observed. However, we found that auditory-visual superadditivity improved with age. The novel findings suggest that multisensory superadditivity is independent of unisensory processing. As auditory and visual speech identification decline with age, compensatory changes in multisensory superadditivity may preserve audiovisual speech identification in older adults.
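The conventional visual-influence metric and the superadditivity metric contrasted in this abstract can be computed directly from condition accuracies. A sketch under common definitions (visual enhancement normalized by available headroom; superadditivity as AV minus the sum of the unisensory scores), with made-up scores:

```python
def visual_enhancement(a_only, av):
    """Gain from adding vision, normalized by the headroom above auditory-only
    performance; a conventional metric of visual influence."""
    return (av - a_only) / (1.0 - a_only)

def superadditivity(a_only, v_only, av):
    """Positive when AV accuracy exceeds the sum of the unisensory accuracies."""
    return av - (a_only + v_only)

# Illustrative proportions correct for one listener
a_only, v_only, av = 0.35, 0.20, 0.70
print(f"Visual enhancement = {visual_enhancement(a_only, av):.2f}")
print(f"Superadditivity    = {superadditivity(a_only, v_only, av):+.2f}")
```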
Affiliation(s)
- James W Dias
- Department of Otolaryngology-Head and Neck Surgery
9
Stawicki M, Majdak P, Başkent D. Ventriloquist Illusion Produced With Virtual Acoustic Spatial Cues and Asynchronous Audiovisual Stimuli in Both Young and Older Individuals. Multisens Res 2019; 32:745-770. [DOI: 10.1163/22134808-20191430]
Abstract
The ventriloquist illusion, the change in perceived location of an auditory stimulus when a synchronously presented but spatially discordant visual stimulus is added, has previously been shown in young healthy populations to be a robust paradigm that mainly relies on automatic processes. Here, we propose the ventriloquist illusion as a potential simple test to assess audiovisual (AV) integration in young and older individuals. We used a modified version of the illusion paradigm that was adaptive, nearly bias-free, relied on binaural stimulus representation using generic head-related transfer functions (HRTFs) instead of multiple loudspeakers, and tested with synchronous and asynchronous presentation of AV stimuli (both tone and speech). The minimum audible angle (MAA), the smallest perceptible difference in angle between two sound sources, was compared with or without the visual stimuli in young and older adults with no or minimal sensory deficits. The illusion effect, measured by means of MAAs implemented with HRTFs, was observed with both synchronous and asynchronous visual stimuli, but only with tone and not speech stimuli. The patterns were similar between young and older individuals, indicating the versatility of the modified ventriloquist illusion paradigm.
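Binaural presentation with generic HRTFs amounts to convolving the source signal with left- and right-ear head-related impulse responses (HRIRs) for the target azimuth. A minimal sketch assuming the HRIRs are already available as arrays; the placeholder impulse responses below only mimic interaural time and level differences and are not a real measurement set:

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Spatialize a mono signal by convolution with per-ear HRIRs."""
    left = fftconvolve(mono, hrir_left, mode="full")
    right = fftconvolve(mono, hrir_right, mode="full")
    return np.stack([left, right], axis=0)   # (2, samples) headphone signal

fs = 44100
t = np.arange(int(0.5 * fs)) / fs
tone = np.sin(2 * np.pi * 500 * t)

# Placeholder HRIRs: a real experiment would load measured responses for
# each tested azimuth (e.g., from a SOFA file).
hrir_left = np.zeros(256)
hrir_left[10] = 1.0     # earlier and louder at the near (left) ear
hrir_right = np.zeros(256)
hrir_right[25] = 0.7    # delayed and attenuated at the far (right) ear
binaural = render_binaural(tone, hrir_left, hrir_right)
```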
Affiliation(s)
- Marnix Stawicki
- Department of Otorhinolaryngology / Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Graduate School of Medical Sciences, Research School of Behavioral and Cognitive Neurosciences (BCN), University of Groningen, Groningen, The Netherlands
- Piotr Majdak
- Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria
- Deniz Başkent
- Department of Otorhinolaryngology / Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Graduate School of Medical Sciences, Research School of Behavioral and Cognitive Neurosciences (BCN), University of Groningen, Groningen, The Netherlands
10
Compensatory Plasticity in the Lateral Extrastriate Visual Cortex Preserves Audiovisual Temporal Processing following Adult-Onset Hearing Loss. Neural Plast 2019; 2019:7946987. [PMID: 31223309] [PMCID: PMC6541963] [DOI: 10.1155/2019/7946987]
Abstract
Partial hearing loss can cause neurons in the auditory and audiovisual cortices to increase their responsiveness to visual stimuli; however, behavioral studies in hearing-impaired humans and rats have found that the perceptual ability to accurately judge the relative timing of auditory and visual stimuli is largely unaffected. To investigate the neurophysiological basis of how audiovisual temporal acuity may be preserved in the presence of hearing loss-induced crossmodal plasticity, we exposed adult rats to loud noise and two weeks later performed in vivo electrophysiological recordings in two neighboring regions within the lateral extrastriate visual (V2L) cortex—a multisensory zone known to be responsive to audiovisual stimuli (V2L-Mz) and a predominantly auditory zone (V2L-Az). To examine the cortical layer-specific effects at the level of postsynaptic potentials, a current source density (CSD) analysis was applied to the local field potential (LFP) data recorded in response to auditory and visual stimuli presented at various stimulus onset asynchronies (SOAs). As predicted, differential effects were observed in the neighboring cortical regions post-noise exposure. Most notably, an analysis of the strength of multisensory response interactions revealed that V2L-Mz lost its sensitivity to the relative timing of the auditory and visual stimuli, due to an increased responsiveness to visual stimulation that produced a prominent audiovisual response irrespective of the SOA. In contrast, not only did the V2L-Az in noise-exposed rats become more responsive to visual stimuli but neurons in this region also inherited the capacity to process audiovisual stimuli with the temporal precision and specificity that was previously restricted to the V2L-Mz. Thus, the present study provides the first demonstration that audiovisual temporal processing can be preserved following moderate hearing loss via compensatory plasticity in the higher-order sensory cortices that is ultimately characterized by a functional transition in the cortical region capable of temporal sensitivity.
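CSD analysis conventionally estimates net transmembrane current as the negative second spatial derivative of the LFP across equally spaced laminar contacts, CSD(z) ≈ -σ · d²φ/dz². A minimal sketch of that standard estimator (conductivity and contact spacing values are illustrative):

```python
import numpy as np

def csd_second_derivative(lfp, spacing_um=100.0, conductivity=0.3):
    """CSD from laminar LFPs (channels x time) via the second spatial
    difference: -sigma * (phi[z-1] - 2*phi[z] + phi[z+1]) / dz**2."""
    dz = spacing_um * 1e-6                      # contact spacing in meters
    phi = np.asarray(lfp, dtype=float)
    csd = -conductivity * (phi[:-2] - 2 * phi[1:-1] + phi[2:]) / dz**2
    return csd                                  # one channel lost at each edge

rng = np.random.default_rng(3)
lfp = rng.standard_normal((16, 1000))           # 16 contacts, 1000 time samples
csd = csd_second_derivative(lfp)                # shape (14, 1000)
```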
11
Schormans AL, Typlt M, Allman BL. Adult-Onset Hearing Impairment Induces Layer-Specific Cortical Reorganization: Evidence of Crossmodal Plasticity and Central Gain Enhancement. Cereb Cortex 2019; 29:1875-1888. [PMID: 29668848] [PMCID: PMC6458918] [DOI: 10.1093/cercor/bhy067]
Abstract
Adult-onset hearing impairment can lead to hyperactivity in the auditory pathway (i.e., central gain enhancement) as well as increased cortical responsiveness to nonauditory stimuli (i.e., crossmodal plasticity). However, it remained unclear to what extent hearing loss-induced hyperactivity is relayed beyond the auditory cortex, and thus, whether central gain enhancement competes or coexists with crossmodal plasticity throughout the distinct layers of the audiovisual cortex. To that end, we investigated the effects of partial hearing loss on laminar processing in the auditory, visual and audiovisual cortices of adult rats using extracellular electrophysiological recordings performed 2 weeks after loud noise exposure. Current-source density analyses revealed that central gain enhancement was not relayed to the audiovisual cortex (V2L), and was instead restricted to the granular layer of the higher order auditory area, AuD. In contrast, crossmodal plasticity was evident across multiple cortical layers within V2L, and also manifested in AuD. Surprisingly, despite this coexistence of central gain enhancement and crossmodal plasticity, noise exposure did not disrupt the responsiveness of these neighboring cortical regions to combined audiovisual stimuli. Overall, we have shown for the first time that adult-onset hearing impairment causes a complex assortment of intramodal and crossmodal changes across the layers of higher order sensory cortices.
Affiliation(s)
- Ashley L Schormans
- Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
- Marei Typlt
- Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
- Brian L Allman
- Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
12
Hearing-impaired listeners show increased audiovisual benefit when listening to speech in noise. Neuroimage 2019; 196:261-268. [PMID: 30978494] [DOI: 10.1016/j.neuroimage.2019.04.017]
Abstract
Recent studies provide evidence for changes in audiovisual perception as well as for adaptive cross-modal auditory cortex plasticity in older individuals with high-frequency hearing impairments (presbycusis). We here investigated whether these changes facilitate the use of visual information, leading to an increased audiovisual benefit of hearing-impaired individuals when listening to speech in noise. We used a naturalistic design in which older participants with a varying degree of high-frequency hearing loss attended to running auditory or audiovisual speech in noise and detected rare target words. Passages containing only visual speech served as a control condition. Simultaneously acquired scalp electroencephalography (EEG) data were used to study cortical speech tracking. Target word detection accuracy was significantly increased in the audiovisual as compared to the auditory listening condition. The degree of this audiovisual enhancement was positively related to individual high-frequency hearing loss and subjectively reported listening effort in challenging daily life situations, which served as a subjective marker of hearing problems. On the neural level, the early cortical tracking of the speech envelope was enhanced in the audiovisual condition. Similar to the behavioral findings, individual differences in the magnitude of the enhancement were positively associated with listening effort ratings. Our results therefore suggest that hearing-impaired older individuals make increased use of congruent visual information to compensate for the degraded auditory input.
13
Schormans AL, Allman BL. Behavioral Plasticity of Audiovisual Perception: Rapid Recalibration of Temporal Sensitivity but Not Perceptual Binding Following Adult-Onset Hearing Loss. Front Behav Neurosci 2018; 12:256. [PMID: 30429780] [PMCID: PMC6220077] [DOI: 10.3389/fnbeh.2018.00256]
Abstract
The ability to accurately integrate or bind stimuli from more than one sensory modality is highly dependent on the features of the stimuli, such as their intensity and relative timing. Previous studies have demonstrated that the ability to perceptually bind stimuli is impaired in various clinical conditions such as autism, dyslexia, schizophrenia, as well as aging. However, it remains unknown if adult-onset hearing loss, separate from aging, influences audiovisual temporal acuity. In the present study, rats were trained using appetitive operant conditioning to perform an audiovisual temporal order judgment (TOJ) task or synchrony judgment (SJ) task in order to investigate the nature and extent that audiovisual temporal acuity is affected by adult-onset hearing loss, with a specific focus on the time-course of perceptual changes following loud noise exposure. In our first series of experiments, we found that audiovisual temporal acuity in normal-hearing rats was influenced by sound intensity, such that when a quieter sound was presented, the rats were biased to perceive the audiovisual stimuli as asynchronous (SJ task), or as though the visual stimulus was presented first (TOJ task). Psychophysical testing demonstrated that noise-induced hearing loss did not alter the rats' temporal sensitivity 2-3 weeks post-noise exposure, despite rats showing an initial difficulty in differentiating the temporal order of audiovisual stimuli. Furthermore, consistent with normal-hearing rats, the timing at which the stimuli were perceived as simultaneous (i.e., the point of subjective simultaneity, PSS) remained sensitive to sound intensity following hearing loss. Contrary to the TOJ task, hearing loss resulted in persistent impairments in asynchrony detection during the SJ task, such that a greater proportion of trials were now perceived as synchronous. Moreover, psychophysical testing found that noise-exposed rats had altered audiovisual synchrony perception, consistent with impaired audiovisual perceptual binding (e.g., an increase in the temporal window of integration on the right side of simultaneity; right temporal binding window (TBW)). Ultimately, our collective results show for the first time that adult-onset hearing loss leads to behavioral plasticity of audiovisual perception, characterized by a rapid recalibration of temporal sensitivity but a persistent impairment in the perceptual binding of audiovisual stimuli.
Affiliation(s)
- Ashley L Schormans
- Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON, Canada
- Brian L Allman
- Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON, Canada
14
Butera IM, Stevenson RA, Mangus BD, Woynaroski TG, Gifford RH, Wallace MT. Audiovisual Temporal Processing in Postlingually Deafened Adults with Cochlear Implants. Sci Rep 2018; 8:11345. [PMID: 30054512] [PMCID: PMC6063927] [DOI: 10.1038/s41598-018-29598-x]
Abstract
For many cochlear implant (CI) users, visual cues are vitally important for interpreting the impoverished auditory speech information that an implant conveys. Although the temporal relationship between auditory and visual stimuli is crucial for how this information is integrated, audiovisual temporal processing in CI users is poorly understood. In this study, we tested unisensory (auditory alone, visual alone) and multisensory (audiovisual) temporal processing in postlingually deafened CI users (n = 48) and normal-hearing controls (n = 54) using simultaneity judgment (SJ) and temporal order judgment (TOJ) tasks. We varied the timing onsets between the auditory and visual components of either a syllable/viseme or a simple flash/beep pairing, and participants indicated either which stimulus appeared first (TOJ) or if the pair occurred simultaneously (SJ). Results indicate that temporal binding windows (the interval within which stimuli are likely to be perceptually 'bound') are not significantly different between groups for either speech or non-speech stimuli. However, the point of subjective simultaneity for speech was less visually leading in CI users, who, interestingly, also had improved visual-only TOJ thresholds. Further signal detection analysis suggests that this SJ shift may be due to greater visual bias within the CI group, perhaps reflecting heightened attentional allocation to visual cues.
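The signal detection analysis mentioned can be reproduced with standard equal-variance formulas, d' = z(H) - z(FA) and criterion c = -0.5·[z(H) + z(FA)], treating "synchronous" responses on synchronous trials as hits. A sketch with made-up counts:

```python
from scipy.stats import norm

def dprime_criterion(hits, misses, fas, crs):
    """Equal-variance SDT with a log-linear correction to avoid 0/1 rates."""
    h = (hits + 0.5) / (hits + misses + 1.0)
    f = (fas + 0.5) / (fas + crs + 1.0)
    d_prime = norm.ppf(h) - norm.ppf(f)
    criterion = -0.5 * (norm.ppf(h) + norm.ppf(f))
    return d_prime, criterion

# Illustrative counts of "synchronous" responses to sync vs. async trials
d, c = dprime_criterion(hits=42, misses=8, fas=20, crs=30)
print(f"d' = {d:.2f}, criterion c = {c:+.2f}  (negative c = bias toward 'sync')")
```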
Affiliation(s)
- Iliza M Butera
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Ryan A Stevenson
- Department of Psychology, University of Western Ontario, London, ON, Canada
- Brain and Mind Institute, University of Western Ontario, London, ON, Canada
- Brannon D Mangus
- Murfreesboro Medical Clinic and Surgicenter, Murfreesboro, TN, USA
- Tiffany G Woynaroski
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- René H Gifford
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Mark T Wallace
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
15
Shayman CS, Seo JH, Oh Y, Lewis RF, Peterka RJ, Hullar TE. Relationship between vestibular sensitivity and multisensory temporal integration. J Neurophysiol 2018; 120:1572-1577. [PMID: 30020839] [DOI: 10.1152/jn.00379.2018]
Abstract
A single event can generate asynchronous sensory cues due to variable encoding, transmission, and processing delays. To be interpreted as being associated in time, these cues must occur within a limited time window, referred to as a "temporal binding window" (TBW). We investigated the hypothesis that vestibular deficits could disrupt temporal visual-vestibular integration by determining the relationships between vestibular threshold and TBW in participants with normal vestibular function and with vestibular hypofunction. Vestibular perceptual thresholds to yaw rotation were characterized and compared with the TBWs obtained from participants who judged whether a suprathreshold rotation occurred before or after a brief visual stimulus. Vestibular thresholds ranged from 0.7 to 16.5 deg/s and TBWs ranged from 13.8 to 395 ms. Among all participants, TBW and vestibular thresholds were well correlated (R² = 0.674, P < 0.001), with vestibular-deficient patients having higher thresholds and wider TBWs. Participants reported that the rotation onset needed to lead the light flash by an average of 80 ms for the visual and vestibular cues to be perceived as occurring simultaneously. The wide TBWs in vestibular-deficient participants compared with normal functioning participants indicate that peripheral sensory loss can lead to abnormal multisensory integration. A reduced ability to temporally combine sensory cues appropriately may provide a novel explanation for some symptoms reported by patients with vestibular deficits. Even among normal functioning participants, a high correlation between TBW and vestibular thresholds was observed, suggesting that these perceptual measurements are sensitive to small differences in vestibular function. NEW & NOTEWORTHY While spatial visual-vestibular integration has been well characterized, the temporal integration of these cues is not well understood. The relationship between sensitivity to whole body rotation and duration of the temporal window of visual-vestibular integration was examined using psychophysical techniques. These parameters were highly correlated for those with normal vestibular function and for patients with vestibular hypofunction. Reduced temporal integration performance in patients with vestibular hypofunction may explain some symptoms associated with vestibular loss.
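Estimates like the 80-ms lead reported here typically come from fitting a cumulative Gaussian to the proportion of "rotation first" responses as a function of stimulus onset asynchrony; the fitted mean is the point of subjective simultaneity and the spread indexes temporal sensitivity. A sketch with illustrative data (not the study's):

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# Illustrative SOAs (ms, rotation onset minus flash onset) and response rates
soa = np.array([-200, -120, -40, 0, 40, 120, 200, 300], float)
p_rotation_first = np.array([0.05, 0.15, 0.30, 0.40, 0.48, 0.70, 0.90, 0.97])

def cum_gauss(x, pss, sigma):
    """Cumulative Gaussian psychometric function for a TOJ task."""
    return norm.cdf(x, loc=pss, scale=sigma)

(pss, sigma), _ = curve_fit(cum_gauss, soa, p_rotation_first, p0=[50.0, 100.0])
jnd = sigma * norm.ppf(0.75)   # offset from PSS to the 75% point
print(f"PSS = {pss:.0f} ms, JND = {jnd:.0f} ms")
```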
Affiliation(s)
- Corey S Shayman
- Department of Otolaryngology-Head and Neck Surgery, Oregon Health and Science University, Portland, Oregon
- Jae-Hyun Seo
- Department of Otolaryngology-Head and Neck Surgery, Oregon Health and Science University, Portland, Oregon
- Department of Otolaryngology-Head and Neck Surgery, The Catholic University of Korea, Seoul, Republic of Korea
- Yonghee Oh
- Department of Otolaryngology-Head and Neck Surgery, Oregon Health and Science University, Portland, Oregon
- Richard F Lewis
- Department of Otolaryngology, Harvard Medical School, Boston, Massachusetts
- Department of Neurology, Harvard Medical School, Boston, Massachusetts
- Jenks Vestibular Physiology Laboratory, Massachusetts Eye and Ear Infirmary, Boston, Massachusetts
- Robert J Peterka
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, Oregon
- Department of Neurology, Oregon Health and Science University, Portland, Oregon
- Timothy E Hullar
- Department of Otolaryngology-Head and Neck Surgery, Oregon Health and Science University, Portland, Oregon
16
Stevenson RA, Sheffield SW, Butera IM, Gifford RH, Wallace MT. Multisensory Integration in Cochlear Implant Recipients. Ear Hear 2018; 38:521-538. [PMID: 28399064] [DOI: 10.1097/aud.0000000000000435]
Abstract
Speech perception is inherently a multisensory process involving integration of auditory and visual cues. Multisensory integration in cochlear implant (CI) recipients is a unique circumstance in that the integration occurs after auditory deprivation and the provision of hearing via the CI. Despite the clear importance of multisensory cues for perception, in general, and for speech intelligibility, specifically, the topic of multisensory perceptual benefits in CI users has only recently begun to emerge as an area of inquiry. We review the research that has been conducted on multisensory integration in CI users to date and suggest a number of areas needing further research. The overall pattern of results indicates that many CI recipients show at least some perceptual gain that can be attributable to multisensory integration. The extent of this gain, however, varies based on a number of factors, including age of implantation and specific task being assessed (e.g., stimulus detection, phoneme perception, word recognition). Although both children and adults with CIs obtain audiovisual benefits for phoneme, word, and sentence stimuli, neither group shows demonstrable gain for suprasegmental feature perception. Additionally, only early-implanted children and the highest performing adults obtain audiovisual integration benefits similar to individuals with normal hearing. Increasing age of implantation in children is associated with poorer gains resultant from audiovisual integration, suggesting a sensitive period in development for the brain networks that subserve these integrative functions, as well as length of auditory experience. This finding highlights the need for early detection of and intervention for hearing loss, not only in terms of auditory perception, but also in terms of the behavioral and perceptual benefits of audiovisual processing. Importantly, patterns of auditory, visual, and audiovisual responses suggest that underlying integrative processes may be fundamentally different between CI users and typical-hearing listeners. Future research, particularly in low-level processing tasks such as signal detection will help to further assess mechanisms of multisensory integration for individuals with hearing loss, both with and without CIs.
Affiliation(s)
- Ryan A Stevenson
- Department of Psychology, University of Western Ontario, London, Ontario, Canada
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
- Walter Reed National Military Medical Center, Audiology and Speech Pathology Center, Bethesda, Maryland
- Vanderbilt Brain Institute, Nashville, Tennessee
- Vanderbilt Kennedy Center, Nashville, Tennessee
- Department of Psychology, Vanderbilt University, Nashville, Tennessee
- Department of Psychiatry, Vanderbilt University Medical Center, Nashville, Tennessee
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee
17
Brooks CJ, Chan YM, Anderson AJ, McKendrick AM. Audiovisual Temporal Perception in Aging: The Role of Multisensory Integration and Age-Related Sensory Loss. Front Hum Neurosci 2018; 12:192. [PMID: 29867415] [PMCID: PMC5954093] [DOI: 10.3389/fnhum.2018.00192]
Abstract
Within each sensory modality, age-related deficits in temporal perception contribute to the difficulties older adults experience when performing everyday tasks. Since perceptual experience is inherently multisensory, older adults also face the added challenge of appropriately integrating or segregating the auditory and visual cues present in our dynamic environment into coherent representations of distinct objects. As such, many studies have investigated how older adults perform when integrating temporal information across audition and vision. This review covers both direct judgments about temporal information (the sound-induced flash illusion, temporal order, perceived synchrony, and temporal rate discrimination) and judgments regarding stimuli containing temporal information (the audiovisual bounce effect and speech perception). Although an age-related increase in integration has been demonstrated on a variety of tasks, research specifically investigating the ability of older adults to integrate temporal auditory and visual cues has produced disparate results. In this short review, we explore what factors could underlie these divergent findings. We conclude that both task-specific differences and age-related sensory loss play a role in the reported disparity in age-related effects on the integration of auditory and visual temporal information.
Affiliation(s)
- Cassandra J Brooks
- Department of Optometry and Vision Sciences, The University of Melbourne, Melbourne, VIC, Australia
- Yu Man Chan
- Department of Optometry and Vision Sciences, The University of Melbourne, Melbourne, VIC, Australia
- Andrew J Anderson
- Department of Optometry and Vision Sciences, The University of Melbourne, Melbourne, VIC, Australia
- Allison M McKendrick
- Department of Optometry and Vision Sciences, The University of Melbourne, Melbourne, VIC, Australia
18
Does hearing aid use affect audiovisual integration in mild hearing impairment? Exp Brain Res 2018; 236:1161-1179. [PMID: 29453491] [DOI: 10.1007/s00221-018-5206-6]
Abstract
There is converging evidence for altered audiovisual integration abilities in hearing-impaired individuals and those with profound hearing loss who are provided with cochlear implants, compared to normal-hearing adults. Still, little is known on the effects of hearing aid use on audiovisual integration in mild hearing loss, although this constitutes one of the most prevalent conditions in the elderly and, yet, often remains untreated in its early stages. This study investigated differences in the strength of audiovisual integration between elderly hearing aid users and those with the same degree of mild hearing loss who were not using hearing aids, the non-users, by measuring their susceptibility to the sound-induced flash illusion. We also explored the corresponding window of integration by varying the stimulus onset asynchronies. To examine general group differences that are not attributable to specific hearing aid settings but rather reflect overall changes associated with habitual hearing aid use, the group of hearing aid users was tested unaided while individually controlling for audibility. We found greater audiovisual integration together with a wider window of integration in hearing aid users compared to their age-matched untreated peers. Signal detection analyses indicate that a change in perceptual sensitivity as well as in bias may underlie the observed effects. Our results and comparisons with other studies in normal-hearing older adults suggest that both mild hearing impairment and hearing aid use seem to affect audiovisual integration, possibly in the sense that hearing aid use may reverse the effects of hearing loss on audiovisual integration. We suggest that these findings may be particularly important for auditory rehabilitation and call for a longitudinal study.
19
Spatio-temporal patterns of event-related potentials related to audiovisual synchrony judgments in older adults. Neurobiol Aging 2017; 55:38-48. [DOI: 10.1016/j.neurobiolaging.2017.03.011]
20
Gordon-Salant S, Yeni-Komshian GH, Fitzgibbons PJ, Willison HM, Freund MS. Recognition of asynchronous auditory-visual speech by younger and older listeners: A preliminary study. J Acoust Soc Am 2017; 142:151. [PMID: 28764460] [PMCID: PMC5507703] [DOI: 10.1121/1.4992026]
Abstract
This study examined the effects of age and hearing loss on recognition of speech presented when the auditory and visual speech information was misaligned in time (i.e., asynchronous). Prior research suggests that older listeners are less sensitive than younger listeners in detecting the presence of asynchronous speech for auditory-lead conditions, but recognition of speech in auditory-lead conditions has not yet been examined. Recognition performance was assessed for sentences and words presented in the auditory-visual modalities with varying degrees of auditory lead and lag. Detection of auditory-visual asynchrony for sentences was assessed to verify that listeners detected these asynchronies. The listeners were younger and older normal-hearing adults and older hearing-impaired adults. Older listeners (regardless of hearing status) exhibited a significant decline in performance in auditory-lead conditions relative to visual lead, unlike younger listeners whose recognition performance was relatively stable across asynchronies. Recognition performance was not correlated with asynchrony detection. However, one of the two cognitive measures assessed, processing speed, was identified in multiple regression analyses as contributing significantly to the variance in auditory-visual speech recognition scores. The findings indicate that, particularly in auditory-lead conditions, listener age has an impact on the ability to recognize asynchronous auditory-visual speech signals.
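The multiple regression step, relating a cognitive predictor such as processing speed to auditory-visual recognition scores, can be sketched with ordinary least squares; the data and variable names below are stand-ins, not the study's measurements:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
processing_speed = rng.normal(50, 10, n)   # e.g., a digit-symbol-type score
age = rng.normal(70, 8, n)
av_recognition = 0.6 * processing_speed - 0.2 * age + rng.normal(0, 5, n)

# Design matrix with intercept; solve OLS via least squares
X = np.column_stack([np.ones(n), processing_speed, age])
beta, *_ = np.linalg.lstsq(X, av_recognition, rcond=None)

predicted = X @ beta
r_squared = 1 - (np.sum((av_recognition - predicted) ** 2)
                 / np.sum((av_recognition - av_recognition.mean()) ** 2))
print(f"betas = {np.round(beta, 2)}, R^2 = {r_squared:.2f}")
```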
Affiliation(s)
- Sandra Gordon-Salant
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Grace H Yeni-Komshian
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Peter J Fitzgibbons
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Hannah M Willison
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Maya S Freund
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
21
Shahin AJ, Shen S, Kerlin JR. Tolerance for audiovisual asynchrony is enhanced by the spectrotemporal fidelity of the speaker's mouth movements and speech. Lang Cogn Neurosci 2017; 32:1102-1118. [PMID: 28966930] [PMCID: PMC5617130] [DOI: 10.1080/23273798.2017.1283428]
Abstract
We examined the relationship between tolerance for audiovisual onset asynchrony (AVOA) and the spectrotemporal fidelity of the spoken words and the speaker's mouth movements. In two experiments that only varied in the temporal order of sensory modality, visual speech leading (exp1) or lagging (exp2) acoustic speech, participants watched intact and blurred videos of a speaker uttering trisyllabic words and nonwords that were noise vocoded with 4-, 8-, 16-, and 32-channels. They judged whether the speaker's mouth movements and the speech sounds were in-sync or out-of-sync. Individuals perceived synchrony (tolerated AVOA) on more trials when the acoustic speech was more speech-like (8 channels and higher vs. 4 channels), and when visual speech was intact than blurred (exp1 only). These findings suggest that enhanced spectrotemporal fidelity of the audiovisual (AV) signal prompts the brain to widen the window of integration promoting the fusion of temporally distant AV percepts.
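Channel noise vocoding of the kind described splits speech into bandpass channels, extracts each channel's envelope, and uses it to modulate bandlimited noise before summing. A minimal sketch under common parameter choices (filter order, band edges, and the Hilbert-envelope step are illustrative; published vocoders often lowpass-filter the envelope instead):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_channels=8, f_lo=100.0, f_hi=7000.0):
    """Replace each band's temporal fine structure with envelope-modulated noise."""
    rng = np.random.default_rng(5)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
    out = np.zeros_like(speech, dtype=float)
    noise = rng.standard_normal(speech.size)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        envelope = np.abs(hilbert(band))   # Hilbert envelope of the speech band
        carrier = sosfiltfilt(sos, noise)  # noise limited to the same band
        out += envelope * carrier
    return out

fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 200 * t) * (1 + np.sin(2 * np.pi * 3 * t))  # toy input
vocoded = noise_vocode(speech, fs, n_channels=8)
```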
Affiliation(s)
- Antoine J Shahin
- Center for Mind and Brain, University of California, Davis, CA 95618
- Stanley Shen
- Center for Mind and Brain, University of California, Davis, CA 95618
- Jess R Kerlin
- Center for Mind and Brain, University of California, Davis, CA 95618
22
Francisco AA, Jesse A, Groen MA, McQueen JM. A General Audiovisual Temporal Processing Deficit in Adult Readers With Dyslexia. J Speech Lang Hear Res 2017; 60:144-158. [PMID: 28056152] [DOI: 10.1044/2016_jslhr-h-15-0375]
Abstract
PURPOSE Because reading is an audiovisual process, reading impairment may reflect an audiovisual processing deficit. The aim of the present study was to test the existence and scope of such a deficit in adult readers with dyslexia. METHOD We tested 39 typical readers and 51 adult readers with dyslexia on their sensitivity to the simultaneity of audiovisual speech and nonspeech stimuli, their time window of audiovisual integration for speech (using incongruent /aCa/ syllables), and their audiovisual perception of phonetic categories. RESULTS Adult readers with dyslexia showed less sensitivity to audiovisual simultaneity than typical readers for both speech and nonspeech events. We found no differences between readers with dyslexia and typical readers in the temporal window of integration for audiovisual speech or in the audiovisual perception of phonetic categories. CONCLUSIONS The results suggest an audiovisual temporal deficit in dyslexia that is not specific to speech-related events. However, the differences found for audiovisual temporal sensitivity did not translate into a deficit in audiovisual speech perception. Hence, there seems to be a hiatus between simultaneity judgment and perception, suggesting a multisensory system that uses different mechanisms across tasks. Alternatively, it is possible that the audiovisual deficit in dyslexia is only observable when explicit judgments about audiovisual simultaneity are required.
Collapse
Affiliation(s)
- Ana A Francisco
- Behavioural Science Institute, Radboud University, Nijmegen, the Netherlands
| | - Alexandra Jesse
- Department of Psychological and Brain Sciences, University of Massachusetts, Amherst
| | - Margriet A Groen
- Behavioural Science Institute, Radboud University, Nijmegen, the Netherlands
| | - James M McQueen
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
| |
Collapse
|
23
|
Puschmann S, Thiel CM. Changed crossmodal functional connectivity in older adults with hearing loss. Cortex 2016; 86:109-122. [PMID: 27930898 DOI: 10.1016/j.cortex.2016.10.014] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2016] [Revised: 09/01/2016] [Accepted: 10/19/2016] [Indexed: 12/21/2022]
Abstract
Previous work compellingly demonstrates a crossmodal plastic reorganization of auditory cortex in deaf individuals, leading to increased neural responses to non-auditory sensory input. Recent data indicate that crossmodal adaptive plasticity is not restricted to severe hearing impairments, but may also occur as a result of high-frequency hearing loss in older adults and affect audiovisual processing in these subjects. Here, we used functional magnetic resonance imaging (fMRI) to study the effect of hearing loss in older adults on auditory cortex response patterns as well as on functional connectivity between auditory and visual cortex during audiovisual processing. Older participants with varying degrees of high-frequency hearing loss performed an auditory stimulus categorization task, in which they had to categorize frequency-modulated (FM) tones presented alone or in the context of matching or non-matching visual motion. A motion-only condition served as a control for a visual takeover of auditory cortex. While individual hearing status did not affect auditory cortex responses to auditory, visual, or audiovisual stimuli, we observed a significant hearing loss-related increase in functional connectivity between auditory cortex and the right motion-sensitive visual area MT+ when processing matching audiovisual input. Hearing loss also modulated resting-state connectivity between right area MT+ and parts of the left auditory cortex, suggesting the existence of permanent, task-independent changes in coupling between visual and auditory sensory areas with increasing degrees of hearing loss. Our data thus indicate that hearing loss affects functional connectivity between sensory cortices in older adults.
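For orientation, functional connectivity between two regions is often operationalised as the correlation of their fMRI time series. The sketch below shows this simple seed-based version; it does not reproduce the study's actual pipeline, and all names are illustrative.

    import numpy as np

    def seed_connectivity(seed_ts, region_ts):
        """Pearson correlation between a seed time series (shape (T,)) and
        each column of region_ts (shape (T, n_regions)) -- one simple way
        to operationalise functional connectivity."""
        seed = (seed_ts - seed_ts.mean()) / seed_ts.std()
        reg = (region_ts - region_ts.mean(axis=0)) / region_ts.std(axis=0)
        return (reg * seed[:, None]).mean(axis=0)   # r per region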
Collapse
Affiliation(s)
- Sebastian Puschmann
- Biological Psychology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany.
| | - Christiane M Thiel
- Biological Psychology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany; Research Center Neurosensory Science, Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
| |
Collapse
|
24
|
Crossmodal plasticity in auditory, visual and multisensory cortical areas following noise-induced hearing loss in adulthood. Hear Res 2016; 343:92-107. [PMID: 27387138 DOI: 10.1016/j.heares.2016.06.017] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/19/2016] [Revised: 06/21/2016] [Accepted: 06/27/2016] [Indexed: 11/21/2022]
Abstract
Complete or partial hearing loss results in an increased responsiveness of neurons in the core auditory cortex of numerous species to visual and/or tactile stimuli (i.e., crossmodal plasticity). At present, however, it remains uncertain how adult-onset partial hearing loss affects higher-order cortical areas that normally integrate audiovisual information. To that end, extracellular electrophysiological recordings were performed under anesthesia in noise-exposed rats two weeks post-exposure (0.8-20 kHz at 120 dB SPL for 2 h) and in age-matched controls to characterize the nature and extent of crossmodal plasticity in the dorsal auditory cortex (AuD), an area outside of the auditory core, as well as in the neighboring lateral extrastriate visual cortex (V2L), an area known to contribute to audiovisual processing. Computer-generated auditory (noise burst), visual (light flash) and combined audiovisual stimuli were delivered, and the associated spiking activity was used to determine the response profile of each neuron sampled (i.e., unisensory, subthreshold multisensory or bimodal). In both the AuD cortex and the multisensory zone of the V2L cortex, maximum firing rates were unchanged following noise exposure, and there was a relative increase in the proportion of neurons responsive to visual stimuli and a concomitant decrease in the number of neurons solely responsive to auditory stimuli, even though the sound intensity had been adjusted to account for each rat's hearing threshold. These neighboring cortical areas differed, however, in how noise-induced hearing loss affected audiovisual processing; the total proportion of multisensory neurons significantly decreased in the V2L cortex (control 38.8 ± 3.3% vs. noise-exposed 27.1 ± 3.4%) and dramatically increased in the AuD cortex (control 23.9 ± 3.3% vs. noise-exposed 49.8 ± 6.1%). Thus, following noise exposure, the cortical area showing the greatest relative degree of multisensory convergence transitioned ventrally, away from the audiovisual area, V2L, toward the predominantly auditory area, AuD. Overall, the collective findings of the present study support the suggestion that crossmodal plasticity induced by adult-onset hearing impairment manifests in higher-order cortical areas as a transition in the functional border of the audiovisual cortex.
Collapse
|
25
|
Moradi S, Lidestam B, Rönnberg J. Comparison of Gated Audiovisual Speech Identification in Elderly Hearing Aid Users and Elderly Normal-Hearing Individuals: Effects of Adding Visual Cues to Auditory Speech Stimuli. Trends Hear 2016; 20:2331216516653355. [PMID: 27317667 PMCID: PMC5562342 DOI: 10.1177/2331216516653355] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022] Open
Abstract
The present study compared elderly hearing aid (EHA) users (n = 20) with elderly normal-hearing (ENH) listeners (n = 20) in terms of isolation points (IPs; the shortest time required for correct identification of a speech stimulus) and accuracy for audiovisual gated speech stimuli (consonants, words, and final words in highly and less predictable sentences) presented in silence. In addition, we compared the IPs of the audiovisual speech stimuli from the present study with auditory-only IPs extracted from a previous study, to determine the impact of adding visual cues. Both participant groups achieved ceiling levels of accuracy in the audiovisual identification of gated speech stimuli; however, the EHA group needed longer IPs for the audiovisual identification of consonants and words. The benefit of adding visual cues to auditory speech stimuli was more evident in the EHA group, as audiovisual presentation significantly shortened the IPs for consonants, words, and final words in less predictable sentences; in the ENH group, audiovisual presentation only shortened the IPs for consonants and words. In conclusion, although the audiovisual benefit was greater for the EHA group, this group performed worse than the ENH group in terms of IPs when supportive semantic context was lacking. Consequently, EHA users needed the initial part of the audiovisual speech signal to be longer than did their counterparts with normal hearing to reach the same level of accuracy in the absence of a semantic context.
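A minimal sketch of how an isolation point can be computed from gated responses under the standard definition (the shortest gate from which identification is correct and remains correct at all longer gates); this scoring rule is an assumption for illustration, not code from the study.

    from typing import Optional, Sequence

    def isolation_point(gate_ms: Sequence[int],
                        correct: Sequence[bool]) -> Optional[int]:
        """Return the gate duration (ms) from which responses are correct at
        every subsequent gate, or None if identification never stabilises."""
        ip = None
        for gate, ok in zip(gate_ms, correct):
            if ok and ip is None:
                ip = gate        # first correct response: candidate IP
            elif not ok:
                ip = None        # a later error invalidates the candidate
        return ip

    # Example: responses become stably correct from 400 ms onward -> IP = 400
    print(isolation_point([100, 200, 300, 400, 500],
                          [False, True, False, True, True]))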
Collapse
Affiliation(s)
- Shahram Moradi
- Linnaeus Centre HEAD, Department of Behavioral Sciences and Learning, Linköping University, Sweden
| | - Björn Lidestam
- Department of Behavioral Sciences and Learning, Linköping University, Linköping, Sweden
| | - Jerker Rönnberg
- Linnaeus Centre HEAD, Department of Behavioral Sciences and Learning, Linköping University, Sweden
| |
Collapse
|
26
|
The effect of visual cues on top-down restoration of temporally interrupted speech, with and without further degradations. Hear Res 2015; 328:24-33. [PMID: 26117407 DOI: 10.1016/j.heares.2015.06.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/20/2015] [Revised: 06/15/2015] [Accepted: 06/22/2015] [Indexed: 11/21/2022]
Abstract
In complex listening situations, cognitive restoration mechanisms are commonly used to enhance perception of degraded speech with inaudible segments. Profoundly hearing-impaired people with a cochlear implant (CI) show less benefit from such mechanisms. However, both normal-hearing (NH) listeners and CI users do benefit from visual speech cues in these listening situations. In this study we investigated whether an accompanying video of the speaker can enhance the intelligibility of interrupted sentences and the phonemic restoration benefit, measured as an increase in intelligibility when the silent intervals are filled with noise. Similar to previous studies, a restoration benefit was observed with interrupted speech without spectral degradations (Experiment 1), but it was absent in acoustic simulations of CIs (Experiment 2) and present again in simulations of electric-acoustic stimulation (Experiment 3). In all experiments, the additional speech information provided by the complementary visual cues led to overall higher intelligibility; however, these cues did not influence the occurrence or extent of the phonemic restoration benefit of filler noise. The results imply that visual cues do not show a synergistic effect with the filler noise, as adding them equally increased the intelligibility of interrupted sentences with or without the filler noise.
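To illustrate the stimulus manipulation, here is a minimal sketch for generating periodically interrupted speech with optional filler noise in the silent intervals; the interruption rate, duty cycle, and speech-to-noise ratio are illustrative assumptions, not the study's parameters.

    import numpy as np

    def interrupt_speech(x, fs, rate_hz=1.5, duty=0.5,
                         fill_noise=False, snr_db=0.0):
        """Periodically silence the speech signal x; optionally fill the
        silent intervals with white noise at a given speech-to-noise ratio."""
        t = np.arange(len(x)) / fs
        on = (t * rate_hz) % 1.0 < duty          # square-wave gating mask
        y = x * on
        if fill_noise:
            noise = np.random.randn(len(x))
            speech_rms = np.sqrt(np.mean(x ** 2))
            noise_rms = np.sqrt(np.mean(noise ** 2))
            noise *= speech_rms / (noise_rms * 10 ** (snr_db / 20))
            y[~on] = noise[~on]                  # noise only in the gaps
        return y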
Collapse
|
27
|
Alm M, Behne D. Age mitigates the correlation between cognitive processing speed and audio-visual asynchrony detection in speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:2816-2826. [PMID: 25373981 DOI: 10.1121/1.4896464] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
Cognitive processing speed, hearing acuity, and audio-visual (AV) experience have been suggested to influence AV asynchrony detection. Whereas the influences of hearing acuity and AV experience have been explored to some extent, the influence of cognitive processing speed on perceived AV asynchrony has not been directly tested. Therefore, the current study investigated the relationship between cognitive processing speed and AV asynchrony detection in speech and, with hearing acuity controlled, assessed whether age-related AV experience mitigates the strength of this relationship. Cognitive processing speed and AV asynchrony detection were measured in 20 young adults (20-30 years) and 20 middle-aged adults (50-60 years) using auditory, visual, and AV recognition reaction-time tasks and an AV synchrony judgment task. Strong correlations between auditory, visual, and AV reaction times and AV synchrony window size were found for young adults, but not for middle-aged adults. These findings suggest that although cognitive processing speed influences AV asynchrony detection in speech, the strength of the relationship is seemingly reduced by AV experience.
Collapse
Affiliation(s)
- Magnus Alm
- Department of Psychology, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
| | - Dawn Behne
- Department of Psychology, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
| |
Collapse
|
28
|
Age-related hearing loss increases cross-modal distractibility. Hear Res 2014; 316:28-36. [DOI: 10.1016/j.heares.2014.07.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/22/2014] [Revised: 07/11/2014] [Accepted: 07/16/2014] [Indexed: 12/11/2022]
|
29
|
Gordon-Salant S. Aging, Hearing Loss, and Speech Recognition: Stop Shouting, I Can’t Understand You. PERSPECTIVES ON AUDITORY RESEARCH 2014. [DOI: 10.1007/978-1-4614-9102-6_12] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
|
30
|
Magnotti JF, Ma WJ, Beauchamp MS. Causal inference of asynchronous audiovisual speech. Front Psychol 2013; 4:798. [PMID: 24294207 PMCID: PMC3826594 DOI: 10.3389/fpsyg.2013.00798] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2013] [Accepted: 10/10/2013] [Indexed: 11/13/2022] Open
Abstract
During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between the two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects who performed a synchrony judgment task while viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
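The following is a simplified sketch of the causal-inference idea for synchrony judgments: the observer compares the likelihood of a common cause against separate causes given a noisy internal measurement of the asynchrony. The parameter values, priors, and decision rule here are illustrative simplifications, not the authors' exact model.

    import numpy as np
    from scipy.stats import norm

    def p_report_sync(soa, sigma_s=0.08, sigma_c=0.05, p_common=0.5, window=0.5):
        """Probability of reporting 'synchronous' for a true asynchrony soa (s),
        given sensory noise sigma_s, common-cause jitter sigma_c, a prior on a
        common cause, and a uniform separate-cause asynchrony range (+/- window)."""
        # Grid of possible internal measurements of the asynchrony
        m = np.linspace(soa - 5 * sigma_s, soa + 5 * sigma_s, 2001)
        # Likelihood of m under a common cause: asynchrony ~ N(0, sigma_c)
        like_c1 = norm.pdf(m, 0.0, np.hypot(sigma_s, sigma_c))
        # Likelihood under separate causes: asynchrony ~ Uniform(-window, window)
        like_c2 = np.full_like(m, 1.0 / (2.0 * window))
        post_c1 = p_common * like_c1 / (p_common * like_c1
                                        + (1 - p_common) * like_c2)
        report_sync = post_c1 > 0.5      # pick whichever cause is more probable
        # Integrate the measurement density over the 'report sync' region
        dens = norm.pdf(m, soa, sigma_s)
        return float(np.sum(dens * report_sync) * (m[1] - m[0]))

Evaluating p_report_sync over a range of SOAs traces out a synchrony-judgment curve: near 1 at physical synchrony and falling off as the asynchrony grows.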
Collapse
Affiliation(s)
- John F Magnotti
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston TX, USA
| | | | | |
Collapse
|
31
|
Alm M, Behne D. Audio-visual speech experience with age influences perceived audio-visual asynchrony in speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:3001-3010. [PMID: 24116435 DOI: 10.1121/1.4820798] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Previous research indicates that perception of audio-visual (AV) synchrony changes in adulthood. Possible explanations for these age differences include a decline in hearing acuity, a decline in cognitive processing speed, and increased experience with AV binding. The current study aims to isolate the effect of AV experience by comparing synchrony judgments from 20 young adults (20 to 30 yrs) and 20 normal-hearing middle-aged adults (50 to 60 yrs), an age range for which a decline of cognitive processing speed is expected to be minimal. When presented with AV stop consonant syllables with asynchronies ranging from 440 ms audio-lead to 440 ms visual-lead, middle-aged adults showed significantly less tolerance for audio-lead than young adults. Middle-aged adults also showed a greater shift in their point of subjective simultaneity than young adults. Natural audio-lead asynchronies are arguably more predictable than natural visual-lead asynchronies, and this predictability may render audio-lead thresholds more prone to experience-related fine-tuning.
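For comparison, the conventional analysis of such data fits a Gaussian to the proportion of "synchronous" responses as a function of stimulus onset asynchrony: the fitted mean gives the point of subjective simultaneity (PSS) and the width indexes the synchrony window. A minimal sketch with made-up illustrative responses (not the study's data):

    import numpy as np
    from scipy.optimize import curve_fit

    def gauss(soa, amp, pss, sigma):
        return amp * np.exp(-(soa - pss) ** 2 / (2 * sigma ** 2))

    # SOA in ms (negative = audio lead) and made-up proportions of 'sync' responses
    soa = np.array([-440, -330, -220, -110, 0, 110, 220, 330, 440], float)
    p_sync = np.array([0.05, 0.15, 0.45, 0.80, 0.95, 0.90, 0.70, 0.35, 0.10])

    (amp, pss, sigma), _ = curve_fit(gauss, soa, p_sync, p0=[1.0, 50.0, 150.0])
    print(f"PSS = {pss:.0f} ms (positive = visual-lead shift); "
          f"window FWHM = {2.355 * sigma:.0f} ms")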
Collapse
Affiliation(s)
- Magnus Alm
- Department of Psychology, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
| | | |
Collapse
|
32
|
Internet video telephony allows speech reading by deaf individuals and improves speech perception by cochlear implant users. PLoS One 2013; 8:e54770. [PMID: 23359119 PMCID: PMC3554620 DOI: 10.1371/journal.pone.0054770] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2012] [Accepted: 12/14/2012] [Indexed: 11/19/2022] Open
Abstract
Objective To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Methods Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair-Schulz-Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280×720, 640×480, 320×240, 160×120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcams (Logitech Pro9000, C600 and C500) and image/sound delays (0–500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for a live Skype™ video connection and live face-to-face communication were assessed. Results Higher frame rates (>7 fps), higher camera resolutions (>640×480 px) and shorter image/sound delays (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by the physical properties of the camera optics or by full-screen mode. There was a significant median gain of +8.5 percentage points (p = 0.009) in speech perception for all 21 CI users when visual cues were additionally shown. CI users with poor open-set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception gain +11.8 percentage points, p = 0.032). Conclusion Webcams have the potential to improve telecommunication for hearing-impaired individuals.
Collapse
|
33
|
Valkenier B, Duyne JY, Andringa TC, Baskent D. Audiovisual perception of congruent and incongruent Dutch front vowels. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2012; 55:1788-1801. [PMID: 22992710 DOI: 10.1044/1092-4388(2012/11-0227)] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
PURPOSE Auditory perception of vowels in background noise is enhanced when combined with visually perceived speech features. The objective of this study was to investigate whether the influence of visual cues on vowel perception extends to incongruent vowels, in a manner similar to the McGurk effect observed with consonants. METHOD Identification of the Dutch front vowels /i, y, e, Y/, which share all features other than height and lip-rounding, was measured for congruent and incongruent audiovisual conditions. The audio channel was systematically degraded by adding noise, increasing the reliance on visual cues. RESULTS The height feature was carried more robustly through the auditory channel, and the lip-rounding feature through the visual channel. Hence, congruent audiovisual presentation enhanced identification, while incongruent presentation led to perceptual fusions and thus decreased identification. CONCLUSIONS Visual cues influence the identification of congruent as well as incongruent audiovisual vowels. Incongruent visual information results in perceptual fusions, demonstrating that the McGurk effect can be instigated by long phonemes such as vowels. This result extends to the incongruent presentation of height, the feature that is perceived less reliably through the visual channel. The findings stress the importance of audiovisual congruency in communication devices, such as cochlear implants and videoconferencing tools, where the auditory signal may be degraded.
Collapse
|