1
|
Magnotti JF, Lado A, Beauchamp MS. The noisy encoding of disparity model predicts perception of the McGurk effect in native Japanese speakers. Front Neurosci 2024; 18:1421713. [PMID: 38988770 PMCID: PMC11233445 DOI: 10.3389/fnins.2024.1421713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Accepted: 05/28/2024] [Indexed: 07/12/2024] Open
Abstract
In the McGurk effect, visual speech from the face of the talker alters the perception of auditory speech. The diversity of human languages has prompted many intercultural studies of the effect in both Western and non-Western cultures, including native Japanese speakers. Studies of large samples of native English speakers have shown that the McGurk effect is characterized by high variability in the susceptibility of different individuals to the illusion and in the strength of different experimental stimuli to induce the illusion. The noisy encoding of disparity (NED) model of the McGurk effect uses principles from Bayesian causal inference to account for this variability, separately estimating the susceptibility and sensory noise for each individual and the strength of each stimulus. To determine whether variation in McGurk perception is similar between Western and non-Western cultures, we applied the NED model to data collected from 80 native Japanese-speaking participants. Fifteen different McGurk stimuli that varied in syllable content (unvoiced auditory "pa" + visual "ka" or voiced auditory "ba" + visual "ga") were presented interleaved with audiovisual congruent stimuli. The McGurk effect was highly variable across stimuli and participants, with the percentage of illusory fusion responses ranging from 3 to 78% across stimuli and from 0 to 91% across participants. Despite this variability, the NED model accurately predicted perception, predicting fusion rates for individual stimuli with 2.1% error and for individual participants with 2.4% error. Stimuli containing the unvoiced pa/ka pairing evoked more fusion responses than the voiced ba/ga pairing. Model estimates of sensory noise were correlated with participant age, with greater sensory noise in older participants. The NED model of the McGurk effect offers a principled way to account for individual and stimulus differences when examining the McGurk effect in different cultures.
Collapse
Affiliation(s)
- John F Magnotti
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Anastasia Lado
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Michael S Beauchamp
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| |
Collapse
|
2
|
Harwood V, Baron A, Kleinman D, Campanelli L, Irwin J, Landi N. Event-Related Potentials in Assessing Visual Speech Cues in the Broader Autism Phenotype: Evidence from a Phonemic Restoration Paradigm. Brain Sci 2023; 13:1011. [PMID: 37508944 PMCID: PMC10377560 DOI: 10.3390/brainsci13071011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Revised: 06/26/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023] Open
Abstract
Audiovisual speech perception includes the simultaneous processing of auditory and visual speech. Deficits in audiovisual speech perception are reported in autistic individuals; however, less is known regarding audiovisual speech perception within the broader autism phenotype (BAP), which includes individuals with elevated, yet subclinical, levels of autistic traits. We investigate the neural indices of audiovisual speech perception in adults exhibiting a range of autism-like traits using event-related potentials (ERPs) in a phonemic restoration paradigm. In this paradigm, we consider conditions where speech articulators (mouth and jaw) are present (AV condition) and obscured by a pixelated mask (PX condition). These two face conditions were included in both passive (simply viewing a speaking face) and active (participants were required to press a button for a specific consonant-vowel stimulus) experiments. The results revealed an N100 ERP component which was present for all listening contexts and conditions; however, it was attenuated in the active AV condition where participants were able to view the speaker's face, including the mouth and jaw. The P300 ERP component was present within the active experiment only, and significantly greater within the AV condition compared to the PX condition. This suggests increased neural effort for detecting deviant stimuli when visible articulation was present and visual influence on perception. Finally, the P300 response was negatively correlated with autism-like traits, suggesting that higher autistic traits were associated with generally smaller P300 responses in the active AV and PX conditions. The conclusions support the finding that atypical audiovisual processing may be characteristic of the BAP in adults.
Collapse
Affiliation(s)
- Vanessa Harwood
- Department of Communicative Disorders, University of Rhode Island, Kingston, RI 02881, USA
| | - Alisa Baron
- Department of Communicative Disorders, University of Rhode Island, Kingston, RI 02881, USA
| | | | - Luca Campanelli
- Department of Communicative Disorders, University of Alabama, Tuscaloosa, AL 35487, USA
| | - Julia Irwin
- Haskins Laboratories, New Haven, CT 06519, USA
- Department of Psychology, Southern Connecticut State University, New Haven, CT 06515, USA
| | - Nicole Landi
- Haskins Laboratories, New Haven, CT 06519, USA
- Department of Psychological Sciences, University of Connecticut, Storrs, CT 06269, USA
| |
Collapse
|
3
|
Magnotti JF, Dzeda KB, Wegner-Clemens K, Rennig J, Beauchamp MS. Weak observer-level correlation and strong stimulus-level correlation between the McGurk effect and audiovisual speech-in-noise: A causal inference explanation. Cortex 2020; 133:371-383. [PMID: 33221701 DOI: 10.1016/j.cortex.2020.10.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2020] [Revised: 08/05/2020] [Accepted: 10/05/2020] [Indexed: 11/25/2022]
Abstract
The McGurk effect is a widely used measure of multisensory integration during speech perception. Two observations have raised questions about the validity of the effect as a tool for understanding speech perception. First, there is high variability in perception of the McGurk effect across different stimuli and observers. Second, across observers there is low correlation between McGurk susceptibility and recognition of visual speech paired with auditory speech-in-noise, another common measure of multisensory integration. Using the framework of the causal inference of multisensory speech (CIMS) model, we explored the relationship between the McGurk effect, syllable perception, and sentence perception in seven experiments with a total of 296 different participants. Perceptual reports revealed a relationship between the efficacy of different McGurk stimuli created from the same talker and perception of the auditory component of the McGurk stimuli presented in isolation, both with and without added noise. The CIMS model explained this strong stimulus-level correlation using the principles of noisy sensory encoding followed by optimal cue combination within a common representational space across speech types. Because the McGurk effect (but not speech-in-noise) requires the resolution of conflicting cues between modalities, there is an additional source of individual variability that can explain the weak observer-level correlation between McGurk and noisy speech. Power calculations show that detecting this weak correlation requires studies with many more participants than those conducted to-date. Perception of the McGurk effect and other types of speech can be explained by a common theoretical framework that includes causal inference, suggesting that the McGurk effect is a valid and useful experimental tool.
Collapse
|
4
|
Feng G, Zhou B, Zhou W, Beauchamp MS, Magnotti JF. A Laboratory Study of the McGurk Effect in 324 Monozygotic and Dizygotic Twins. Front Neurosci 2019; 13:1029. [PMID: 31636529 PMCID: PMC6787151 DOI: 10.3389/fnins.2019.01029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2019] [Accepted: 09/10/2019] [Indexed: 11/13/2022] Open
Abstract
Multisensory integration of information from the talker's voice and the talker's mouth facilitates human speech perception. A popular assay of audiovisual integration is the McGurk effect, an illusion in which incongruent visual speech information categorically changes the percept of auditory speech. There is substantial interindividual variability in susceptibility to the McGurk effect. To better understand possible sources of this variability, we examined the McGurk effect in 324 native Mandarin speakers, consisting of 73 monozygotic (MZ) and 89 dizygotic (DZ) twin pairs. When tested with 9 different McGurk stimuli, some participants never perceived the illusion and others always perceived it. Within participants, perception was similar across time (r = 0.55 at a 2-year retest in 150 participants) suggesting that McGurk susceptibility reflects a stable trait rather than short-term perceptual fluctuations. To examine the effects of shared genetics and prenatal environment, we compared McGurk susceptibility between MZ and DZ twins. Both twin types had significantly greater correlation than unrelated pairs (r = 0.28 for MZ twins and r = 0.21 for DZ twins) suggesting that the genes and environmental factors shared by twins contribute to individual differences in multisensory speech perception. Conversely, the existence of substantial differences within twin pairs (even MZ co-twins) and the overall low percentage of explained variance (5.5%) argues against a deterministic view of individual differences in multisensory integration.
Collapse
Affiliation(s)
- Guo Feng
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Psychological Research and Counseling Center, Southwest Jiaotong University, Chengdu, China
| | - Bin Zhou
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
| | - Wen Zhou
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
| | - Michael S. Beauchamp
- Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, United States
| | - John F. Magnotti
- Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, United States
| |
Collapse
|
5
|
Brown VA, Hedayati M, Zanger A, Mayn S, Ray L, Dillman-Hasso N, Strand JF. What accounts for individual differences in susceptibility to the McGurk effect? PLoS One 2018; 13:e0207160. [PMID: 30418995 PMCID: PMC6231656 DOI: 10.1371/journal.pone.0207160] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2018] [Accepted: 10/25/2018] [Indexed: 11/29/2022] Open
Abstract
The McGurk effect is a classic audiovisual speech illusion in which discrepant auditory and visual syllables can lead to a fused percept (e.g., an auditory /bɑ/ paired with a visual /gɑ/ often leads to the perception of /dɑ/). The McGurk effect is robust and easily replicated in pooled group data, but there is tremendous variability in the extent to which individual participants are susceptible to it. In some studies, the rate at which individuals report fusion responses ranges from 0% to 100%. Despite its widespread use in the audiovisual speech perception literature, the roots of the wide variability in McGurk susceptibility are largely unknown. This study evaluated whether several perceptual and cognitive traits are related to McGurk susceptibility through correlational analyses and mixed effects modeling. We found that an individual's susceptibility to the McGurk effect was related to their ability to extract place of articulation information from the visual signal (i.e., a more fine-grained analysis of lipreading ability), but not to scores on tasks measuring attentional control, processing speed, working memory capacity, or auditory perceptual gradiency. These results provide support for the claim that a small amount of the variability in susceptibility to the McGurk effect is attributable to lipreading skill. In contrast, cognitive and perceptual abilities that are commonly used predictors in individual differences studies do not appear to underlie susceptibility to the McGurk effect.
Collapse
Affiliation(s)
- Violet A. Brown
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
| | - Maryam Hedayati
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
| | - Annie Zanger
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
| | - Sasha Mayn
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
| | - Lucia Ray
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
| | - Naseem Dillman-Hasso
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
| | - Julia F. Strand
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
| |
Collapse
|