1. Salagovic CA, Stevenson RA, Butler BE. Behavioral Response Modeling to Resolve Listener- and Stimulus-Related Influences on Audiovisual Speech Integration in Cochlear Implant Users. Ear Hear 2024. [PMID: 39660814] [DOI: 10.1097/aud.0000000000001607]
Abstract
OBJECTIVES: Speech intelligibility is supported by the sound of a talker's voice and by visual cues related to articulatory movements. The relative contribution of auditory and visual cues to an integrated audiovisual percept varies depending on a listener's environment and sensory acuity. Cochlear implant users rely more on visual cues than listeners with acoustic hearing to compensate for the fact that the auditory signal produced by their implant is poorly resolved relative to that of the typically developed cochlea. The relative weight placed on auditory and visual speech cues can be measured by presenting discordant cues across the two modalities and assessing the resulting percept (the McGurk effect). The current literature is mixed with regard to how cochlear implant users respond to McGurk stimuli; some studies suggest they report hearing syllables that represent a fusion of the auditory and visual cues more frequently than typical-hearing controls, while others report less frequent fusion. However, several of these studies compared implant users to younger control samples despite evidence that the likelihood and strength of audiovisual integration increase with age. Thus, the present study sought to clarify the impacts of hearing status and age on multisensory speech integration using a combination of behavioral analyses and response modeling.
DESIGN: Cochlear implant users (mean age = 58.9 years), age-matched controls (mean age = 61.5 years), and younger controls (mean age = 25.9 years) completed an online audiovisual speech task. Participants saw and/or heard four different talkers producing syllables in auditory-alone, visual-alone, and incongruent audiovisual conditions. After each trial, participants reported the syllable they heard or saw from a list of four possible options.
RESULTS: The younger and older control groups performed similarly in both unisensory conditions. The cochlear implant users performed significantly better than either control group in the visual-alone condition. When responding to the incongruent audiovisual trials, cochlear implant users and age-matched controls experienced significantly more fusion than younger controls. When fusion was not experienced, younger controls were more likely to report the auditorily presented syllable than either implant users or age-matched controls. Conversely, implant users were more likely to report the visually presented syllable than either age-matched controls or younger controls. Modeling of the relationship between stimuli and behavioral responses revealed that younger controls had lower disparity thresholds (i.e., were less likely to experience a fused audiovisual percept) than either the implant users or older controls, while implant users had higher levels of sensory noise (i.e., more variability in the way a given stimulus pair is perceived across multiple presentations) than age-matched controls.
CONCLUSIONS: Our findings suggest that age and cochlear implantation may have independent effects on McGurk effect perception. Noisy encoding of disparity modeling confirms that age is a strong predictor of an individual's prior likelihood of experiencing audiovisual integration, but suggests that hearing status modulates this relationship due to differences in sensory noise during speech encoding. Together, these findings demonstrate that different groups of listeners can arrive at similar levels of performance in different ways, and highlight the need for careful consideration of stimulus- and group-related effects on multisensory speech perception.
Affiliation(s)
- Cailey A Salagovic
- Graduate Program in Psychology, University of Western Ontario, London, Ontario, Canada
- Ryan A Stevenson
- Department of Psychology, University of Western Ontario, London, Ontario, Canada
- Western Institute for Neuroscience, University of Western Ontario, London, Ontario, Canada
- Blake E Butler
- Department of Psychology, University of Western Ontario, London, Ontario, Canada
- Western Institute for Neuroscience, University of Western Ontario, London, Ontario, Canada
- National Centre for Audiology, University of Western Ontario, London, Ontario, Canada
2. Magnotti JF, Lado A, Beauchamp MS. The noisy encoding of disparity model predicts perception of the McGurk effect in native Japanese speakers. Front Neurosci 2024; 18:1421713. [PMID: 38988770] [PMCID: PMC11233445] [DOI: 10.3389/fnins.2024.1421713]
Abstract
In the McGurk effect, visual speech from the face of the talker alters the perception of auditory speech. The diversity of human languages has prompted many intercultural studies of the effect in both Western and non-Western cultures, including native Japanese speakers. Studies of large samples of native English speakers have shown that the McGurk effect is characterized by high variability in the susceptibility of different individuals to the illusion and in the strength of different experimental stimuli to induce the illusion. The noisy encoding of disparity (NED) model of the McGurk effect uses principles from Bayesian causal inference to account for this variability, separately estimating the susceptibility and sensory noise for each individual and the strength of each stimulus. To determine whether variation in McGurk perception is similar between Western and non-Western cultures, we applied the NED model to data collected from 80 native Japanese-speaking participants. Fifteen different McGurk stimuli that varied in syllable content (unvoiced auditory "pa" + visual "ka" or voiced auditory "ba" + visual "ga") were presented interleaved with audiovisual congruent stimuli. The McGurk effect was highly variable across stimuli and participants, with the percentage of illusory fusion responses ranging from 3 to 78% across stimuli and from 0 to 91% across participants. Despite this variability, the NED model accurately predicted perception, predicting fusion rates for individual stimuli with 2.1% error and for individual participants with 2.4% error. Stimuli containing the unvoiced pa/ka pairing evoked more fusion responses than the voiced ba/ga pairing. Model estimates of sensory noise were correlated with participant age, with greater sensory noise in older participants. The NED model of the McGurk effect offers a principled way to account for individual and stimulus differences when examining the McGurk effect in different cultures.
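As a rough illustration of the model described above, a NED-style prediction can be sketched as follows: each observer has a disparity threshold and a sensory-noise parameter, each stimulus has a disparity (strength), and the probability of an illusory fusion response is the probability that the noisily encoded disparity falls below the observer's threshold. The cumulative-Gaussian link and the parameter values below are assumptions for illustration, not the authors' published code.

```python
from scipy.stats import norm

def p_fusion(stimulus_disparity: float, threshold: float, sensory_noise: float) -> float:
    """Probability of a McGurk fusion response under a NED-style model.

    Assumed form: the audiovisual disparity of a stimulus is encoded with Gaussian
    noise; fusion occurs when the encoded disparity falls below the observer's
    disparity threshold (i.e., the two cues are judged to share a common cause).
    """
    return norm.cdf((threshold - stimulus_disparity) / sensory_noise)

# Hypothetical example: one weak-disparity stimulus, one observer threshold, and
# two noise levels (greater sensory noise pulls the prediction toward chance).
print(p_fusion(stimulus_disparity=0.8, threshold=1.0, sensory_noise=0.5))  # ~0.66
print(p_fusion(stimulus_disparity=0.8, threshold=1.0, sensory_noise=1.5))  # ~0.55
```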
Affiliation(s)
- John F Magnotti
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Anastasia Lado
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Michael S Beauchamp
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
3. Layer N, Abdel-Latif KHA, Radecke JO, Müller V, Weglage A, Lang-Roth R, Walger M, Sandmann P. Effects of noise and noise reduction on audiovisual speech perception in cochlear implant users: An ERP study. Clin Neurophysiol 2023; 154:141-156. [PMID: 37611325] [DOI: 10.1016/j.clinph.2023.07.009]
Abstract
OBJECTIVE: Hearing with a cochlear implant (CI) is difficult in noisy environments, but the use of noise reduction algorithms, specifically ForwardFocus, can improve speech intelligibility. The current event-related potential (ERP) study examined the electrophysiological correlates of this perceptual improvement.
METHODS: Ten bimodal CI users performed a syllable-identification task in auditory and audiovisual conditions, with syllables presented from the front and stationary noise presented from the sides. Brainstorm was used for spatio-temporal evaluation of the ERPs.
RESULTS: CI users revealed an audiovisual benefit, as reflected by shorter response times and greater activation in temporal and occipital regions at P2 latency. However, in both auditory and audiovisual conditions, background noise hampered speech processing, leading to longer response times and delayed auditory cortex activation at N1 latency. Nevertheless, activating ForwardFocus resulted in shorter response times, reduced listening effort, and enhanced superior frontal cortex activation at P2 latency, particularly in audiovisual conditions.
CONCLUSIONS: ForwardFocus enhances speech intelligibility in audiovisual speech conditions, potentially by allowing the reallocation of attentional resources to relevant auditory speech cues.
SIGNIFICANCE: This study shows for CI users that background noise and ForwardFocus differentially affect spatio-temporal cortical response patterns, in both auditory and audiovisual speech conditions.
Affiliation(s)
- Natalie Layer
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Germany.
- Jan-Ole Radecke
- Dept. of Psychiatry and Psychotherapy, University of Lübeck, Germany; Center for Brain, Behaviour and Metabolism (CBBM), University of Lübeck, Germany
- Verena Müller
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Germany
- Anna Weglage
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Germany
- Ruth Lang-Roth
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Germany
- Martin Walger
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Germany; Jean-Uhrmacher-Institute for Clinical ENT Research, University of Cologne, Germany
- Pascale Sandmann
- University of Cologne, Faculty of Medicine and University Hospital Cologne, Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Germany; Department of Otolaryngology, Head and Neck Surgery, University of Oldenburg, Oldenburg, Germany
4. Ma KST, Schnupp JWH. The unity hypothesis revisited: can the male/female incongruent McGurk effect be disrupted by familiarization and priming? Front Psychol 2023; 14:1106562. [PMID: 37705948] [PMCID: PMC10495566] [DOI: 10.3389/fpsyg.2023.1106562]
Abstract
The unity assumption hypothesis contends that higher-level factors, such as a perceiver's beliefs and prior experience, modulate multisensory integration. The McGurk illusion exemplifies such integration. When a visual velar consonant /ga/ is dubbed with an auditory bilabial /ba/, listeners unify the discrepant signals with the knowledge that open lips cannot produce /ba/, and a fusion percept /da/ is perceived. Previous research claimed to have falsified the unity assumption hypothesis by demonstrating that the McGurk effect occurs even when a face is dubbed with a voice of the opposite sex, and thus violates expectations from prior experience. But perhaps stronger counter-evidence is needed to prevent perceptual unity than just an apparent incongruence between unfamiliar faces and voices. Here we investigated whether the McGurk illusion with male/female incongruent stimuli can be disrupted by familiarization and priming with an appropriate pairing of face and voice. In an online experiment, the susceptibility of participants to the McGurk illusion was tested with stimuli containing either a male or female face with a voice of incongruent gender. The number of times participants experienced a McGurk illusion was measured before and after a familiarization block, which familiarized them with the true pairings of face and voice. After familiarization and priming, susceptibility to the McGurk effect decreased significantly on average. The findings support the notion that unity assumptions modulate intersensory bias, and confirm and extend previous studies using male/female incongruent McGurk stimuli.
Affiliation(s)
- Kennis S. T. Ma
- The School of Psychology & Counselling, The Open University (UK), Milton Keynes, United Kingdom
- Jan W. H. Schnupp
- Department of Neuroscience, City University of Hong Kong, Kowloon, Hong Kong SAR, China
5. Dorsi J, Ostrand R, Rosenblum LD. Semantic priming from McGurk words: Priming depends on perception. Atten Percept Psychophys 2023; 85:1219-1237. [PMID: 37155085] [DOI: 10.3758/s13414-023-02689-2]
Abstract
The McGurk effect is an illusion in which visible articulations alter the perception of auditory speech (e.g., video 'da' dubbed with audio 'ba' may be heard as 'da'). To test the timing of the multisensory processes that underlie the McGurk effect, Ostrand et al. (Cognition, 151, 96-107, 2016) used incongruent stimuli, such as auditory 'bait' + visual 'date', as primes in a lexical decision task. These authors reported that the auditory word, but not the perceived (visual) word, induced semantic priming, suggesting that the auditory signal alone can provide the input for lexical access, before multisensory integration is complete. Here, we conceptually replicate the design of Ostrand et al. (2016), using different stimuli chosen to optimize the success of the McGurk illusion. In contrast to the results of Ostrand et al. (2016), we find that the perceived (i.e., visual) word of the incongruent stimulus usually induced semantic priming. We further find that the strength of this priming corresponded to the magnitude of the McGurk effect for each word combination. These findings suggest, in contrast to those of Ostrand et al. (2016), that lexical access makes use of the integrated multisensory information that is perceived by the listener. They further suggest that which unimodal signal of a multisensory stimulus is used in lexical access depends on the perception of that stimulus.
Affiliation(s)
- Josh Dorsi
- Department of Psychology, University of California, Riverside, 900 University Ave, Riverside, CA, 92521, USA.
- Penn State University, College of Medicine, State College, PA, USA.
- Lawrence D Rosenblum
- Department of Psychology, University of California, Riverside, 900 University Ave, Riverside, CA, 92521, USA
6. Schulze M, Aslan B, Farrher E, Grinberg F, Shah N, Schirmer M, Radbruch A, Stöcker T, Lux S, Philipsen A. Network-Based Differences in Top-Down Multisensory Integration between Adult ADHD and Healthy Controls: A Diffusion MRI Study. Brain Sci 2023; 13:388. [PMID: 36979198] [PMCID: PMC10046412] [DOI: 10.3390/brainsci13030388]
Abstract
BACKGROUND: Attention-deficit/hyperactivity disorder (ADHD) is a neurodevelopmental disorder neurobiologically conceptualized as a network disorder in white and gray matter. A relatively new branch of ADHD research is sensory processing, where altered sensory processing, i.e., sensory hypersensitivity, has been reported, especially in the auditory domain. However, our perception is driven by a complex interplay across different sensory modalities, and the brain is specialized in binding those different sensory modalities into a unified percept, a process called multisensory integration (MI) that is mediated through fronto-temporal and fronto-parietal networks. MI has recently been described to be impaired for complex stimuli in adult patients with ADHD. The current study relates MI in adult ADHD to diffusion-weighted imaging. Connectome-based and graph-theoretic analyses were applied to investigate a possible relationship between the ability to integrate multimodal input and network-based ADHD pathophysiology.
METHODS: Multishell, high-angular-resolution diffusion-weighted imaging was performed on twenty-five patients with ADHD (six females, age: 30.08 (SD: 9.3) years) and twenty-four healthy controls (nine females; age: 26.88 (SD: 6.3) years). The structural connectome was created and graph theory was applied to investigate ADHD pathophysiology. Additionally, MI scores, i.e., the percentage of successful multisensory integration derived from the McGurk paradigm, were correlated groupwise with the structural connectome.
RESULTS: Structural connectivity was elevated in patients with ADHD in network hubs, mirroring the altered default-mode network activity typically reported for patients with ADHD. Compared to controls, MI in ADHD was associated with higher connectivity between Heschl's gyrus and auditory parabelt regions, along with altered fronto-temporal network integrity.
CONCLUSION: Alterations in structural network integrity in adult ADHD can be extended to multisensory behavior. MI and the respective network integration in ADHD might reflect the maturational cortical delay that extends into adulthood with respect to sensory processing.
Affiliation(s)
- Marcel Schulze
- Department of Psychiatry and Psychotherapy, University of Bonn, 53113 Bonn, Germany
- Faculty of Psychology and Sports Science, Bielefeld University, 33615 Bielefeld, Germany
- Behrem Aslan
- Department of Psychiatry and Psychotherapy, University of Bonn, 53113 Bonn, Germany
- Ezequiel Farrher
- Institute of Neuroscience and Medicine 4, INM-4, Forschungszentrum Jülich, 52425 Jülich, Germany
- Farida Grinberg
- Institute of Neuroscience and Medicine 4, INM-4, Forschungszentrum Jülich, 52425 Jülich, Germany
- Nadim Shah
- Institute of Neuroscience and Medicine 4, INM-4, Forschungszentrum Jülich, 52425 Jülich, Germany
- Department of Neurology, RWTH Aachen University, 50264 Aachen, Germany
- JARA-BRAIN-Translational Medicine, 52056 Aachen, Germany
- Institute of Neuroscience and Medicine 11, INM–11, JARA, Forschungszentrum Jülich, 52425 Jülich, Germany
- Markus Schirmer
- Clinic for Neuroradiology, University Hospital Bonn, 53127 Bonn, Germany
- J. Philip Kistler Stroke Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02115, USA
- Alexander Radbruch
- Clinic for Neuroradiology, University Hospital Bonn, 53127 Bonn, Germany
- Tony Stöcker
- German Center for Neurodegenerative Diseases (DZNE), 53127 Bonn, Germany
- Silke Lux
- Department of Psychiatry and Psychotherapy, University of Bonn, 53113 Bonn, Germany
- Alexandra Philipsen
- Department of Psychiatry and Psychotherapy, University of Bonn, 53113 Bonn, Germany
7. Butera IM, Stevenson RA, Gifford RH, Wallace MT. Visually Biased Perception in Cochlear Implant Users: A Study of the McGurk and Sound-Induced Flash Illusions. Trends Hear 2023; 27:23312165221076681. [PMID: 37377212] [PMCID: PMC10334005] [DOI: 10.1177/23312165221076681]
Abstract
The reduction in spectral resolution by cochlear implants often requires complementary visual speech cues to facilitate understanding. Despite substantial clinical characterization of auditory-only speech measures, relatively little is known about the audiovisual (AV) integrative abilities that most cochlear implant (CI) users rely on for daily speech comprehension. In this study, we tested AV integration in 63 CI users and 69 normal-hearing (NH) controls using the McGurk and sound-induced flash illusions. To our knowledge, this study is the largest to date measuring the McGurk effect in this population and the first to test the sound-induced flash illusion (SIFI). When presented with conflicting AV speech stimuli (i.e., the phoneme "ba" dubbed onto the viseme "ga"), we found that 55 CI users (87%) reported a fused percept of "da" or "tha" on at least one trial. After applying an error correction based on unisensory responses, we found that among those susceptible to the illusion, CI users experienced lower fusion than controls, a result concordant with the SIFI, where the pairing of a single circle flashing on the screen with multiple beeps resulted in fewer illusory flashes for CI users. While illusion perception in these two tasks appears to be uncorrelated among CI users, we identified a negative correlation in the NH group. Because neither illusion appears to provide further explanation of variability in CI outcome measures, further research is needed to determine how these findings relate to CI users' speech understanding, particularly in ecological listening conditions that are naturally multisensory.
Affiliation(s)
- Iliza M. Butera
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Ryan A. Stevenson
- Department of Psychology, University of Western Ontario, London, ON, Canada
- Brain and Mind Institute, University of Western Ontario, London, ON, Canada
- René H. Gifford
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Mark T. Wallace
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
8. Burkhardt P, Müller V, Meister H, Weglage A, Lang-Roth R, Walger M, Sandmann P. Age effects on cognitive functions and speech-in-noise processing: An event-related potential study with cochlear-implant users and normal-hearing listeners. Front Neurosci 2022; 16:1005859. [PMID: 36620447] [PMCID: PMC9815545] [DOI: 10.3389/fnins.2022.1005859]
Abstract
A cochlear implant (CI) can partially restore hearing in individuals with profound sensorineural hearing loss. However, electrical hearing with a CI is limited and highly variable. The current study aimed to better understand the different factors contributing to this variability by examining how age affects cognitive functions and cortical speech processing in CI users. Electroencephalography (EEG) was applied while two groups of CI users (young and elderly; N = 13 each) and normal-hearing (NH) listeners (young and elderly; N = 13 each) performed an auditory sentence categorization task, including semantically correct and incorrect sentences presented either with or without background noise. Event-related potentials (ERPs) representing earlier, sensory-driven processes (N1-P2 complex to sentence onset) and later, cognitive-linguistic integration processes (N400 to semantically correct/incorrect sentence-final words) were compared between the different groups and speech conditions. The results revealed reduced amplitudes and prolonged latencies of auditory ERPs in CI users compared to NH listeners, both at earlier (N1, P2) and later processing stages (N400 effect). In addition to this hearing-group effect, CI users and NH listeners showed a comparable background-noise effect, as indicated by reduced hit rates and reduced (P2) and delayed (N1/P2) ERPs in conditions with background noise. Moreover, we observed an age effect in CI users and NH listeners, with young individuals showing improved specific cognitive functions (working memory capacity, cognitive flexibility and verbal learning/retrieval), reduced latencies (N1/P2), decreased N1 amplitudes and an increased N400 effect when compared to the elderly. In sum, our findings extend previous research by showing that the CI users' speech processing is impaired not only at earlier (sensory) but also at later (semantic integration) processing stages, both in conditions with and without background noise. Using objective ERP measures, our study provides further evidence of strong age effects on cortical speech processing, which can be observed in both the NH listeners and the CI users. We conclude that elderly individuals require more effortful processing at sensory stages of speech processing, which however seems to be at the cost of the limited resources available for the later semantic integration processes.
Affiliation(s)
- Pauline Burkhardt
- Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Correspondence: Pauline Burkhardt, orcid.org/0000-0001-9850-9881
- Verena Müller
- Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Hartmut Meister
- Jean-Uhrmacher-Institute for Clinical ENT-Research, University of Cologne, Cologne, Germany
- Anna Weglage
- Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Ruth Lang-Roth
- Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Martin Walger
- Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Jean-Uhrmacher-Institute for Clinical ENT-Research, University of Cologne, Cologne, Germany
- Pascale Sandmann
- Department of Otorhinolaryngology, Head and Neck Surgery, Audiology and Pediatric Audiology, Cochlear Implant Center, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
9. Xiu B, Paul BT, Chen JM, Le TN, Lin VY, Dimitrijevic A. Neural responses to naturalistic audiovisual speech are related to listening demand in cochlear implant users. Front Hum Neurosci 2022; 16:1043499. [DOI: 10.3389/fnhum.2022.1043499]
Abstract
There is a weak relationship between clinical and self-reported speech perception outcomes in cochlear implant (CI) listeners. Such poor correspondence may be due to differences between clinical and "real-world" listening environments and stimuli. Speech in the real world is often accompanied by visual cues and background environmental noise, and is generally in a conversational context, all factors that could affect listening demand. Thus, our objectives were to determine whether brain responses to naturalistic speech could index speech perception and listening demand in CI users. Accordingly, we recorded high-density electroencephalography (EEG) while CI users listened to and watched a naturalistic stimulus (the television show "The Office"). We used continuous EEG to quantify "speech neural tracking" (i.e., TRFs, temporal response functions) to the show's soundtrack and 8–12 Hz (alpha) brain rhythms commonly related to listening effort. Background noise at three different signal-to-noise ratios (SNRs), +5, +10, and +15 dB, was presented to vary the difficulty of following the television show, mimicking a natural noisy environment. The task also included an audio-only (no video) condition. After each condition, participants subjectively rated listening demand and the degree of words and conversations they felt they understood. Fifteen CI users reported progressively higher listening demand and lower word and conversation understanding with increasing background noise. Listening demand and conversation understanding in the audio-only condition were comparable to those in the highest noise condition (+5 dB). Increasing background noise affected speech neural tracking at the group level, in addition to eliciting strong individual differences. Mixed-effects modeling showed that listening demand and conversation understanding were correlated with early cortical speech tracking, such that high demand and low conversation understanding occurred with lower-amplitude TRFs. In the high-noise condition, greater listening demand was negatively correlated with parietal alpha power, such that higher demand was related to lower alpha power. No significant correlations were observed between TRFs/alpha and clinical speech perception scores. These results are similar to previous findings showing little relationship between clinical speech perception and quality of life in CI users. However, physiological responses to complex natural speech may provide an objective measure of aspects of quality-of-life measures such as self-perceived listening demand.
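To make the "speech neural tracking" analysis mentioned above concrete, a temporal response function can be estimated as a regularized linear mapping from a lagged speech-envelope matrix to the EEG at each channel; stronger, earlier TRF peaks would correspond to better cortical tracking. The lag range, ridge parameter, edge handling, and simulated data below are illustrative assumptions, not the authors' analysis pipeline.

```python
import numpy as np

def estimate_trf(envelope, eeg, fs, t_min=-0.1, t_max=0.4, ridge=1e3):
    """Estimate a temporal response function (TRF) by ridge regression.

    envelope : 1-D speech envelope, shape (n_samples,)
    eeg      : single EEG channel, shape (n_samples,)
    Returns the lag times (s) and TRF weights from t_min to t_max seconds.
    """
    lags = np.arange(int(t_min * fs), int(t_max * fs) + 1)
    # Lagged design matrix: one column per stimulus-response lag.
    X = np.column_stack([np.roll(envelope, lag) for lag in lags])
    X[: max(lags.max(), 0), :] = 0  # crude edge handling, for illustration only
    # Ridge-regularized least squares: w = (X'X + aI)^-1 X'y
    w = np.linalg.solve(X.T @ X + ridge * np.eye(len(lags)), X.T @ eeg)
    return lags / fs, w

# Hypothetical use with two minutes of simulated 128-Hz data
fs = 128
rng = np.random.default_rng(0)
envelope = rng.random(2 * 60 * fs)
eeg = np.convolve(envelope, [0, 0.5, 1.0, 0.5], mode="same") + rng.normal(size=envelope.size)
lag_times, trf = estimate_trf(envelope, eeg, fs)
```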
10. Butera IM, Larson ED, DeFreese AJ, Lee AK, Gifford RH, Wallace MT. Functional localization of audiovisual speech using near infrared spectroscopy. Brain Topogr 2022; 35:416-430. [PMID: 35821542] [PMCID: PMC9334437] [DOI: 10.1007/s10548-022-00904-1]
Abstract
Visual cues are especially vital for hearing impaired individuals such as cochlear implant (CI) users to understand speech in noise. Functional Near Infrared Spectroscopy (fNIRS) is a light-based imaging technology that is ideally suited for measuring the brain activity of CI users due to its compatibility with both the ferromagnetic and electrical components of these implants. In a preliminary step toward better elucidating the behavioral and neural correlates of audiovisual (AV) speech integration in CI users, we designed a speech-in-noise task and measured the extent to which 24 normal hearing individuals could integrate the audio of spoken monosyllabic words with the corresponding visual signals of a female speaker. In our behavioral task, we found that audiovisual pairings provided average improvements of 103% and 197% over auditory-alone listening conditions in -6 and -9 dB signal-to-noise ratios consisting of multi-talker background noise. In an fNIRS task using similar stimuli, we measured activity during auditory-only listening, visual-only lipreading, and AV listening conditions. We identified cortical activity in all three conditions over regions of middle and superior temporal cortex typically associated with speech processing and audiovisual integration. In addition, three channels active during the lipreading condition showed uncorrected correlations associated with behavioral measures of audiovisual gain as well as with the McGurk effect. Further work focusing primarily on the regions of interest identified in this study could test how AV speech integration may differ for CI users who rely on this mechanism for daily communication.
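For context, the audiovisual benefit figures quoted above are consistent with a standard relative-gain calculation; the formula below is the conventional definition of gain over auditory-alone performance, and the example scores are hypothetical, not taken from the study.

```python
def av_gain_percent(av_correct: float, a_correct: float) -> float:
    """Relative audiovisual gain: improvement of AV over auditory-alone, in percent."""
    return 100.0 * (av_correct - a_correct) / a_correct

# Hypothetical word-recognition proportions at a difficult SNR:
# auditory-alone 0.30 correct vs. audiovisual 0.61 correct -> roughly 103% gain.
print(av_gain_percent(av_correct=0.61, a_correct=0.30))
```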
Affiliation(s)
- Iliza M Butera
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA.
- Eric D Larson
- Institute for Learning & Brain Sciences, University of Washington, Seattle, Washington, USA
- Andrea J DeFreese
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Adrian KC Lee
- Institute for Learning & Brain Sciences, University of Washington, Seattle, Washington, USA
- Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, USA
- René H Gifford
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Mark T Wallace
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
11. Event-related potentials reveal early visual-tactile integration in the deaf. Psihologija 2022. [DOI: 10.2298/psi210407003l]
Abstract
This study examined visual-tactile perceptual integration in deaf and normal-hearing individuals. Participants were presented with photos of faces or pictures of an oval in either a visual mode or a visual-tactile mode in a recognition learning task. Event-related potentials (ERPs) were recorded while participants recognized real faces and pictures of ovals in the learning stage. Results from the parietal-occipital region showed that photos of faces accompanied by vibration elicited more positive-going ERP responses than photos of faces without vibration, as indicated in the P1 and N170 components, in both deaf and hearing individuals. However, pictures of ovals accompanied by vibration produced more positive-going ERP responses than pictures of ovals without vibration in the N170 only in deaf individuals. A reversed pattern was observed in the temporal region, where real faces with vibration elicited less positive ERPs than photos of faces without vibration in both the N170 and N300 for deaf individuals, but no such pattern appeared in the N170 and N300 for normal-hearing individuals. The results suggest that multisensory integration across the visual and tactile modalities involves more fundamental perceptual regions than auditory regions. Moreover, auditory deprivation played an essential role at the perceptual encoding stage of multisensory integration.
12. Wahn B, Schmitz L, Kingstone A, Böckler-Raettig A. When eyes beat lips: speaker gaze affects audiovisual integration in the McGurk illusion. Psychol Res 2021; 86:1930-1943. [PMID: 34854983] [PMCID: PMC9363401] [DOI: 10.1007/s00426-021-01618-y]
Abstract
Eye contact is a dynamic social signal that captures attention and plays a critical role in human communication. In particular, direct gaze often accompanies communicative acts in an ostensive function: a speaker directs her gaze towards the addressee to highlight the fact that this message is being intentionally communicated to her. The addressee, in turn, integrates the speaker’s auditory and visual speech signals (i.e., her vocal sounds and lip movements) into a unitary percept. It is an open question whether the speaker’s gaze affects how the addressee integrates the speaker’s multisensory speech signals. We investigated this question using the classic McGurk illusion, an illusory percept created by presenting mismatching auditory (vocal sounds) and visual information (speaker’s lip movements). Specifically, we manipulated whether the speaker (a) moved his eyelids up/down (i.e., open/closed his eyes) prior to speaking or did not show any eye motion, and (b) spoke with open or closed eyes. When the speaker’s eyes moved (i.e., opened or closed) before an utterance, and when the speaker spoke with closed eyes, the McGurk illusion was weakened (i.e., addressees reported significantly fewer illusory percepts). In line with previous research, this suggests that motion (opening or closing), as well as the closed state of the speaker’s eyes, captured addressees’ attention, thereby reducing the influence of the speaker’s lip movements on the addressees’ audiovisual integration process. Our findings reaffirm the power of speaker gaze to guide attention, showing that its dynamics can modulate low-level processes such as the integration of multisensory speech signals.
Affiliation(s)
- Basil Wahn
- Department of Psychology, Leibniz Universität Hannover, Hannover, Germany.
- Laura Schmitz
- Institute of Sports Science, Leibniz Universität Hannover, Hannover, Germany
- Alan Kingstone
- Department of Psychology, University of British Columbia, Vancouver, BC, Canada
13. Jenson D. Audiovisual incongruence differentially impacts left and right hemisphere sensorimotor oscillations: Potential applications to production. PLoS One 2021; 16:e0258335. [PMID: 34618866] [PMCID: PMC8496780] [DOI: 10.1371/journal.pone.0258335]
Abstract
Speech production gives rise to distinct auditory and somatosensory feedback signals, which are dynamically integrated to enable online monitoring and error correction, though it remains unclear how the sensorimotor system supports the integration of these multimodal signals. Capitalizing on the parity of sensorimotor processes supporting perception and production, the current study employed the McGurk paradigm to induce multimodal sensory congruence/incongruence. EEG data from a cohort of 39 typical speakers were decomposed with independent component analysis to identify bilateral mu rhythms, indices of sensorimotor activity. Subsequent time-frequency analyses revealed bilateral patterns of event-related desynchronization (ERD) across alpha and beta frequency ranges over the time course of perceptual events. Right mu activity was characterized by reduced ERD during all cases of audiovisual incongruence, while left mu activity was attenuated and protracted in McGurk trials eliciting sensory fusion. The results were interpreted to suggest distinct hemispheric contributions, with right-hemisphere mu activity supporting a coarse incongruence detection process and left-hemisphere mu activity reflecting a more granular level of analysis, including phonological identification and incongruence resolution. The findings are also considered with regard to incongruence detection and resolution processes during production.
Affiliation(s)
- David Jenson
- Department of Speech and Hearing Sciences, Washington State University, Spokane, Washington, United States of America
14. Schulze M, Aslan B, Stöcker T, Stirnberg R, Lux S, Philipsen A. Disentangling early versus late audiovisual integration in adult ADHD: a combined behavioural and resting-state connectivity study. J Psychiatry Neurosci 2021; 46:E528-E537. [PMID: 34548387] [PMCID: PMC8526154] [DOI: 10.1503/jpn.210017]
Abstract
BACKGROUND: Studies investigating sensory processing in attention-deficit/hyperactivity disorder (ADHD) have shown altered visual and auditory processing. However, evidence is lacking for audiovisual interplay, namely multisensory integration. In addition, neuronal dysregulation at rest (e.g., aberrant within- or between-network functional connectivity) may account for difficulties with integration across the senses in ADHD. We investigated whether sensory processing was altered at the multimodal level in adult ADHD and included resting-state functional connectivity to illustrate a possible overlap between deficient network connectivity and the ability to integrate stimuli.
METHODS: We tested 25 patients with ADHD and 24 healthy controls using 2 illusory paradigms: the sound-induced flash illusion and the McGurk illusion. We applied the Mann-Whitney U test to assess statistical differences between groups. We acquired resting-state functional MRIs on a 3.0 T Siemens magnetic resonance scanner, using a highly accelerated 3-dimensional echo planar imaging sequence.
RESULTS: For the sound-induced flash illusion, susceptibility and reaction time did not differ between the 2 groups. For the McGurk illusion, susceptibility was significantly lower and reaction times were significantly longer for patients with ADHD. At a neuronal level, resting-state functional connectivity in the ADHD group was more highly regulated in polymodal regions that play a role in binding unimodal sensory inputs from different modalities and enabling sensory-to-cognition integration.
LIMITATIONS: We did not explicitly screen for autism spectrum disorder, which has high rates of comorbidity with ADHD and also involves impairments in multisensory integration. Although the patients were carefully screened by our outpatient department, we could not rule out the possibility of autism spectrum disorder in some participants.
CONCLUSION: Unimodal hypersensitivity seems to have no influence on the integration of basal stimuli, but it might have negative consequences for the multisensory integration of complex stimuli. This finding was supported by observations of higher resting-state functional connectivity between unimodal sensory areas and polymodal multisensory integration convergence zones for complex stimuli.
Affiliation(s)
- Marcel Schulze
- From the Department of Psychiatry and Psychotherapy, University of Bonn, Bonn, Germany (Schulze, Aslan, Lux, Philipsen); Biopsychology and Cognitive Neuroscience, Faculty of Psychology and Sports Science, Bielefeld University, Bielefeld, Germany (Schulze); the German Centre for Neurodegenerative Diseases (DZNE), Bonn, Germany (Stöcker, Stirnberg); and the Department of Physics and Astronomy, University of Bonn, Bonn, Germany (Stöcker)
15. Audio-visual integration in noise: Influence of auditory and visual stimulus degradation on eye movements and perception of the McGurk effect. Atten Percept Psychophys 2020; 82:3544-3557. [PMID: 32533526] [PMCID: PMC7788022] [DOI: 10.3758/s13414-020-02042-x]
Abstract
Seeing a talker’s face can aid audiovisual (AV) integration when speech is presented in noise. However, few studies have simultaneously manipulated auditory and visual degradation. We aimed to establish how degrading the auditory and visual signal affected AV integration. Where people look on the face in this context is also of interest; Buchan, Paré and Munhall (Brain Research, 1242, 162–171, 2008) found fixations on the mouth increased in the presence of auditory noise whilst Wilson, Alsius, Paré and Munhall (Journal of Speech, Language, and Hearing Research, 59(4), 601–615, 2016) found mouth fixations decreased with decreasing visual resolution. In Condition 1, participants listened to clear speech, and in Condition 2, participants listened to vocoded speech designed to simulate the information provided by a cochlear implant. Speech was presented in three levels of auditory noise and three levels of visual blurring. Adding noise to the auditory signal increased McGurk responses, while blurring the visual signal decreased McGurk responses. Participants fixated the mouth more on trials when the McGurk effect was perceived. Adding auditory noise led to people fixating the mouth more, while visual degradation led to people fixating the mouth less. Combined, the results suggest that modality preference and where people look during AV integration of incongruent syllables varies according to the quality of information available.
16. Magnotti JF, Dzeda KB, Wegner-Clemens K, Rennig J, Beauchamp MS. Weak observer-level correlation and strong stimulus-level correlation between the McGurk effect and audiovisual speech-in-noise: A causal inference explanation. Cortex 2020; 133:371-383. [PMID: 33221701] [DOI: 10.1016/j.cortex.2020.10.002]
Abstract
The McGurk effect is a widely used measure of multisensory integration during speech perception. Two observations have raised questions about the validity of the effect as a tool for understanding speech perception. First, there is high variability in perception of the McGurk effect across different stimuli and observers. Second, across observers there is low correlation between McGurk susceptibility and recognition of visual speech paired with auditory speech-in-noise, another common measure of multisensory integration. Using the framework of the causal inference of multisensory speech (CIMS) model, we explored the relationship between the McGurk effect, syllable perception, and sentence perception in seven experiments with a total of 296 different participants. Perceptual reports revealed a relationship between the efficacy of different McGurk stimuli created from the same talker and perception of the auditory component of the McGurk stimuli presented in isolation, both with and without added noise. The CIMS model explained this strong stimulus-level correlation using the principles of noisy sensory encoding followed by optimal cue combination within a common representational space across speech types. Because the McGurk effect (but not speech-in-noise) requires the resolution of conflicting cues between modalities, there is an additional source of individual variability that can explain the weak observer-level correlation between McGurk and noisy speech. Power calculations show that detecting this weak correlation requires studies with many more participants than those conducted to date. Perception of the McGurk effect and other types of speech can be explained by a common theoretical framework that includes causal inference, suggesting that the McGurk effect is a valid and useful experimental tool.
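A minimal sketch of the two ingredients named above, noisy sensory encoding followed by reliability-weighted cue combination in a shared representational space, is given below. The Gaussian encoding, the specific reliabilities, and the one-dimensional "representational space" are simplifying assumptions for illustration; the full CIMS model additionally performs causal inference over whether the two cues share a source.

```python
import numpy as np

rng = np.random.default_rng(1)

def encode(true_value: float, noise_sd: float, n: int):
    """Noisy sensory encoding: repeated noisy measurements of the same cue."""
    return true_value + noise_sd * rng.normal(size=n)

def fuse(auditory, visual, a_sd, v_sd):
    """Optimal (inverse-variance-weighted) combination of two cues."""
    w_a = (1 / a_sd**2) / (1 / a_sd**2 + 1 / v_sd**2)
    return w_a * auditory + (1 - w_a) * visual

# Hypothetical syllable features on an arbitrary 1-D axis: auditory "ba" = 0, visual "ga" = 1.
aud = encode(0.0, noise_sd=0.6, n=5)   # noisier auditory cue (e.g., speech in noise)
vis = encode(1.0, noise_sd=0.3, n=5)   # more reliable visual cue
fused = fuse(aud, vis, a_sd=0.6, v_sd=0.3)
print(fused)  # fused estimates sit closer to the visual value because it is more reliable
```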
17. Zhou X, Innes-Brown H, McKay CM. Audio-visual integration in cochlear implant listeners and the effect of age difference. J Acoust Soc Am 2019; 146:4144. [PMID: 31893708] [DOI: 10.1121/1.5134783]
Abstract
This study aimed to investigate differences in audio-visual (AV) integration between cochlear implant (CI) listeners and normal-hearing (NH) adults. A secondary aim was to investigate the effect of age differences by examining AV integration in groups of older and younger NH adults. Seventeen CI listeners, 13 similarly aged NH adults, and 16 younger NH adults were recruited. Two speech identification experiments were conducted to evaluate AV integration of speech cues. In the first experiment, reaction times in audio-alone (A-alone), visual-alone (V-alone), and AV conditions were measured during a speeded task in which participants were asked to identify a target sound /aSa/ among 11 alternatives. A race model was applied to evaluate AV integration. In the second experiment, identification accuracies were measured using a closed set of consonants and an open set of consonant-nucleus-consonant words. The authors quantified AV integration using a combination of a probability model and a cue integration model (which model participants' AV accuracy by assuming no integration or optimal integration, respectively). The results showed that experienced CI listeners exhibited no better AV integration than similarly aged NH adults. Further, there was no significant difference in AV integration between the younger and older NH adults.
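The race-model analysis mentioned above is commonly tested with Miller's inequality: at each time point, the cumulative distribution of audiovisual reaction times should not exceed the sum of the two unisensory cumulative distributions if no integration occurs. The implementation below is a generic illustration of that test, not the authors' analysis code; the quantile grid and the example data are assumptions.

```python
import numpy as np

def race_model_violation(rt_av, rt_a, rt_v, n_points=100):
    """Test Miller's race-model inequality: P(RT_AV <= t) <= P(RT_A <= t) + P(RT_V <= t).

    Returns the time grid and the violation, CDF_AV(t) - min(CDF_A(t) + CDF_V(t), 1);
    positive values indicate audiovisual responses faster than any race model allows,
    which is commonly taken as evidence of audiovisual integration.
    """
    t = np.linspace(min(map(np.min, (rt_av, rt_a, rt_v))),
                    max(map(np.max, (rt_av, rt_a, rt_v))), n_points)
    cdf = lambda rts: np.searchsorted(np.sort(rts), t, side="right") / len(rts)
    violation = cdf(rt_av) - np.minimum(cdf(rt_a) + cdf(rt_v), 1.0)
    return t, violation

# Hypothetical reaction times (ms) for one participant
rng = np.random.default_rng(2)
rt_a = rng.normal(520, 60, 200)
rt_v = rng.normal(560, 60, 200)
rt_av = rng.normal(470, 55, 200)   # faster audiovisual responses
t, violation = race_model_violation(rt_av, rt_a, rt_v)
print(violation.max() > 0)  # True if the race-model bound is violated at some time point
```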
Affiliation(s)
- Xin Zhou
- Bionics Institute of Australia, 384-388 East Melbourne, Melbourne, Victoria 3002, Australia
- Hamish Innes-Brown
- Bionics Institute of Australia, 384-388 East Melbourne, Melbourne, Victoria 3002, Australia
- Colette M McKay
- Bionics Institute of Australia, 384-388 East Melbourne, Melbourne, Victoria 3002, Australia
18. Ritter C, Vongpaisal T. Multimodal and Spectral Degradation Effects on Speech and Emotion Recognition in Adult Listeners. Trends Hear 2019; 22:2331216518804966. [PMID: 30378469] [PMCID: PMC6236866] [DOI: 10.1177/2331216518804966]
Abstract
For cochlear implant (CI) users, degraded spectral input hampers the understanding of prosodic vocal emotion, especially in difficult listening conditions. Using a vocoder simulation of CI hearing, we examined the extent to which informative multimodal cues in a talker's spoken expressions improve normal-hearing (NH) adults' speech and emotion perception under different levels of spectral degradation (two, three, four, and eight spectral bands). Participants repeated the words verbatim and identified emotions (among four alternative options: happy, sad, angry, and neutral) in meaningful sentences that were semantically congruent with the expression of the intended emotion. Sentences were presented in their natural speech form and in speech sampled through a noise-band vocoder, in sound (auditory-only) and video (auditory-visual) recordings of a female talker. Visual information had a more pronounced benefit in enhancing speech recognition in the lower spectral band conditions. Spectral degradation, however, did not interfere with emotion recognition performance when dynamic visual cues in a talker's expression were provided, as participants scored at ceiling levels across all spectral band conditions. Our use of familiar sentences containing congruent semantic and prosodic information has high ecological validity, which likely optimized listener performance under simulated CI hearing and may better predict CI users' outcomes in everyday listening contexts.
Affiliation(s)
- Chantel Ritter
- Department of Psychology, MacEwan University, Alberta, Canada
- Tara Vongpaisal
- Department of Psychology, MacEwan University, Alberta, Canada
19. Magnotti JF, Smith KB, Salinas M, Mays J, Zhu LL, Beauchamp MS. A causal inference explanation for enhancement of multisensory integration by co-articulation. Sci Rep 2018; 8:18032. [PMID: 30575791] [PMCID: PMC6303389] [DOI: 10.1038/s41598-018-36772-8]
Abstract
The McGurk effect is a popular assay of multisensory integration in which participants report the illusory percept of "da" when presented with incongruent auditory "ba" and visual "ga" (AbaVga). While the original publication describing the effect found that 98% of participants perceived it, later studies reported much lower prevalence, ranging from 17% to 81%. Understanding the source of this variability is important for interpreting the panoply of studies that examine McGurk prevalence between groups, including clinical populations such as individuals with autism or schizophrenia. The original publication used stimuli consisting of multiple repetitions of a co-articulated syllable (three repetitions, AgagaVbaba). Later studies used stimuli without repetition or co-articulation (AbaVga) and used congruent syllables from the same talker as a control. In three experiments, we tested how stimulus repetition, co-articulation, and talker repetition affect McGurk prevalence. Repetition with co-articulation increased prevalence by 20%, while repetition without co-articulation and talker repetition had no effect. A fourth experiment compared the effect of the on-line testing used in the first three experiments with the in-person testing used in the original publication; no differences were observed. We interpret our results in the framework of causal inference: co-articulation increases the evidence that auditory and visual speech tokens arise from the same talker, increasing tolerance for content disparity and likelihood of integration. The results provide a principled explanation for how co-articulation aids multisensory integration and can explain the high prevalence of the McGurk effect in the initial publication.
Affiliation(s)
- John F Magnotti
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA.
- Kristen B Smith
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Marcelo Salinas
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Jacqunae Mays
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA
- Lin L Zhu
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA
- Michael S Beauchamp
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA.
- Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA.
20. Published estimates of group differences in multisensory integration are inflated. PLoS One 2018; 13:e0202908. [PMID: 30231054] [PMCID: PMC6145544] [DOI: 10.1371/journal.pone.0202908]
Abstract
A common measure of multisensory integration is the McGurk effect, an illusion in which incongruent auditory and visual speech are integrated to produce an entirely different percept. Published studies report that participants who differ in age, gender, culture, native language, or traits related to neurological or psychiatric disorders also differ in their susceptibility to the McGurk effect. These group-level differences are used as evidence for fundamental alterations in sensory processing between populations. Using empirical data and statistical simulations tested under a range of conditions, we show that published estimates of group differences in the McGurk effect are inflated when only statistically significant (p < 0.05) results are published. With a sample size typical of published studies, a group difference of 10% would be reported as 31%. As a consequence of this inflation, follow-up studies often fail to replicate published reports of large between-group differences. Inaccurate estimates of effect sizes and replication failures are especially problematic in studies of clinical populations involving expensive and time-consuming interventions, such as training paradigms to improve sensory processing. Reducing effect size inflation and increasing replicability requires increasing the number of participants by an order of magnitude compared with current practice.
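The inflation mechanism described above can be reproduced with a small simulation: draw many studies with a modest true between-group difference in McGurk susceptibility, keep only those reaching p < 0.05, and compare the average reported difference with the true one. The sample size, true effect, and between-subject variability below are illustrative assumptions, not the values used in the paper.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)

def simulate_reported_difference(true_diff=0.10, sd=0.30, n_per_group=20, n_studies=5000):
    """Average group difference among only those studies reaching p < 0.05 (publication filter)."""
    significant = []
    for _ in range(n_studies):
        group_a = np.clip(rng.normal(0.50, sd, n_per_group), 0, 1)              # e.g., controls
        group_b = np.clip(rng.normal(0.50 + true_diff, sd, n_per_group), 0, 1)  # e.g., clinical group
        if ttest_ind(group_a, group_b).pvalue < 0.05:
            significant.append(group_b.mean() - group_a.mean())
    return np.mean(significant)

# With a true 10% difference in fusion rate, reporting only significant results
# inflates the average published estimate well above 0.10.
print(simulate_reported_difference())
```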
21
|
Stevenson RA, Sheffield SW, Butera IM, Gifford RH, Wallace MT. Multisensory Integration in Cochlear Implant Recipients. Ear Hear 2018; 38:521-538. [PMID: 28399064 DOI: 10.1097/aud.0000000000000435] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Speech perception is inherently a multisensory process involving integration of auditory and visual cues. Multisensory integration in cochlear implant (CI) recipients is a unique circumstance in that the integration occurs after auditory deprivation and the provision of hearing via the CI. Despite the clear importance of multisensory cues for perception, in general, and for speech intelligibility, specifically, the topic of multisensory perceptual benefits in CI users has only recently begun to emerge as an area of inquiry. We review the research that has been conducted on multisensory integration in CI users to date and suggest a number of areas needing further research. The overall pattern of results indicates that many CI recipients show at least some perceptual gain that can be attributable to multisensory integration. The extent of this gain, however, varies based on a number of factors, including age of implantation and the specific task being assessed (e.g., stimulus detection, phoneme perception, word recognition). Although both children and adults with CIs obtain audiovisual benefits for phoneme, word, and sentence stimuli, neither group shows demonstrable gain for suprasegmental feature perception. Additionally, only early-implanted children and the highest-performing adults obtain audiovisual integration benefits similar to individuals with normal hearing. Increasing age of implantation in children is associated with poorer gains from audiovisual integration, suggesting both a sensitive period in development for the brain networks that subserve these integrative functions and a role for length of auditory experience. This finding highlights the need for early detection of and intervention for hearing loss, not only in terms of auditory perception, but also in terms of the behavioral and perceptual benefits of audiovisual processing. Importantly, patterns of auditory, visual, and audiovisual responses suggest that underlying integrative processes may be fundamentally different between CI users and typical-hearing listeners. Future research, particularly in low-level processing tasks such as signal detection, will help to further assess mechanisms of multisensory integration in individuals with hearing loss, both with and without CIs.
Affiliation(s)
- Ryan A Stevenson
- Department of Psychology, University of Western Ontario, London, Ontario, Canada
- Brain and Mind Institute, University of Western Ontario, London, Ontario, Canada
- Audiology and Speech Pathology Center, Walter Reed National Military Medical Center, Bethesda, Maryland, USA
- Vanderbilt Brain Institute, Nashville, Tennessee
- Vanderbilt Kennedy Center, Nashville, Tennessee
- Department of Psychology, Vanderbilt University, Nashville, Tennessee
- Department of Psychiatry, Vanderbilt University Medical Center, Nashville, Tennessee
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee
22
|
Magnotti JF, Basu Mallick D, Beauchamp MS. Reducing Playback Rate of Audiovisual Speech Leads to a Surprising Decrease in the McGurk Effect. Multisens Res 2018; 31:19-38. [DOI: 10.1163/22134808-00002586] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2016] [Accepted: 06/03/2017] [Indexed: 11/19/2022]
Abstract
We report the unexpected finding that slowing video playback decreases perception of the McGurk effect. This reduction is counter-intuitive because the illusion depends on visual speech influencing the perception of auditory speech, and slowing speech should increase the amount of visual information available to observers. We recorded perceptual data from 110 subjects viewing audiovisual syllables (either McGurk or congruent control stimuli) played back at one of three rates: the rate used by the talker during recording (the natural rate), a slow rate (50% of natural), or a fast rate (200% of natural). We replicated previous studies showing dramatic variability in McGurk susceptibility at the natural rate, ranging from 0% to 100% across subjects and from 26% to 76% across the eight McGurk stimuli tested. Relative to the natural rate, slowed playback reduced the frequency of McGurk responses by 11% (79% of subjects showed a reduction) and reduced congruent accuracy by 3% (25% of subjects showed a reduction). Fast playback rate had little effect on McGurk responses or congruent accuracy. To determine whether our results are consistent with Bayesian integration, we constructed a Bayes-optimal model that incorporated two assumptions: individuals combine auditory and visual information according to their reliability, and changing playback rate affects sensory reliability. The model reproduced both our findings of large individual differences and the playback rate effect. This work illustrates that surprises remain in the McGurk effect and that Bayesian integration provides a useful framework for understanding audiovisual speech perception.
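The Bayes-optimal model is described here only at a high level, so the sketch below simply illustrates reliability-weighted cue combination on a one-dimensional phonetic axis. The placement of "ba", "da", and "ga", the noise values, and the assumption that slowed playback degrades the reliability of the visual cue are illustrative choices, not parameters from the study.

```python
# Reliability-weighted (Bayes-optimal) cue combination sketch; all values are illustrative.
import numpy as np

def fuse(mu_a, sigma_a, mu_v, sigma_v):
    """Combine auditory and visual estimates weighted by their reliabilities
    (inverse variances); returns the fused mean and its standard deviation."""
    w_a, w_v = 1 / sigma_a**2, 1 / sigma_v**2
    mu = (w_a * mu_a + w_v * mu_v) / (w_a + w_v)
    return mu, np.sqrt(1 / (w_a + w_v))

# Place auditory "ba" at 0.0 and visual "ga" at 1.0, with the fused "da" percept
# expected near the middle of this assumed one-dimensional phonetic axis.
mu_natural, _ = fuse(mu_a=0.0, sigma_a=0.5, mu_v=1.0, sigma_v=0.5)
# Assumption: slowed playback makes the visual cue less reliable (larger sigma_v).
mu_slow, _ = fuse(mu_a=0.0, sigma_a=0.5, mu_v=1.0, sigma_v=0.9)
print(f"Fused percept, natural rate: {mu_natural:.2f} (near 'da')")
print(f"Fused percept, slow rate:    {mu_slow:.2f} (pulled toward auditory 'ba')")
```

Because the fused estimate is pulled toward whichever cue is more reliable, lowering the visual cue's reliability moves the percept back toward the auditory "ba" and would reduce the proportion of "da" (McGurk) responses, consistent with the direction of the playback-rate effect.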
Affiliation(s)
- John F. Magnotti
- Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, USA
- Michael S. Beauchamp
- Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, USA
23
|
Stropahl M, Debener S. Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration. Neuroimage Clin 2017; 16:514-523. [PMID: 28971005 PMCID: PMC5609862 DOI: 10.1016/j.nicl.2017.09.001] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2017] [Revised: 08/15/2017] [Accepted: 09/02/2017] [Indexed: 11/28/2022]
Abstract
There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensorineural hearing loss is already sufficient to initiate corresponding cortical changes. To what extent these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influence of cross-modal reorganization on CI benefit is also still unclear. While reorganization during deafness may impede speech recovery, reorganization also has beneficial influences on face recognition and lip-reading. As CI users have been observed to show differences in multisensory integration, the question arises whether cross-modal reorganization is related to audio-visual integration skills. The current electroencephalography study investigated cortical reorganization in experienced post-lingually deafened CI users (n = 18), untreated mild to moderately hearing-impaired individuals (n = 18), and normal-hearing controls (n = 17). Cross-modal activation of the auditory cortex in response to human faces, measured by means of EEG source localization, and audio-visual integration, quantified with the McGurk illusion, were assessed. CI users revealed stronger cross-modal activations than age-matched normal-hearing individuals. Furthermore, CI users showed a relationship between cross-modal activation and audio-visual integration strength. This may further support a beneficial relationship between cross-modal activation and daily-life communication skills that may not be fully captured by laboratory-based speech perception tests. Interestingly, hearing-impaired individuals showed behavioral and neurophysiological results that fell numerically between those of the other two groups, and they showed a moderate relationship between cross-modal activation and the degree of hearing loss. This further supports the notion that auditory deprivation evokes a reorganization of the auditory system even at early stages of hearing loss.
Affiliation(s)
- Maren Stropahl
- Neuropsychology Lab, Department of Psychology, European Medical School, Carl von Ossietzky University Oldenburg, Germany
- Stefan Debener
- Neuropsychology Lab, Department of Psychology, European Medical School, Carl von Ossietzky University Oldenburg, Germany
- Cluster of Excellence Hearing4all, Oldenburg, Germany