1
|
Magnotti JF, Lado A, Beauchamp MS. The noisy encoding of disparity model predicts perception of the McGurk effect in native Japanese speakers. Front Neurosci 2024; 18:1421713. [PMID: 38988770 PMCID: PMC11233445 DOI: 10.3389/fnins.2024.1421713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Accepted: 05/28/2024] [Indexed: 07/12/2024] Open
Abstract
In the McGurk effect, visual speech from the face of the talker alters the perception of auditory speech. The diversity of human languages has prompted many intercultural studies of the effect in both Western and non-Western cultures, including native Japanese speakers. Studies of large samples of native English speakers have shown that the McGurk effect is characterized by high variability in the susceptibility of different individuals to the illusion and in the strength of different experimental stimuli to induce the illusion. The noisy encoding of disparity (NED) model of the McGurk effect uses principles from Bayesian causal inference to account for this variability, separately estimating the susceptibility and sensory noise for each individual and the strength of each stimulus. To determine whether variation in McGurk perception is similar between Western and non-Western cultures, we applied the NED model to data collected from 80 native Japanese-speaking participants. Fifteen different McGurk stimuli that varied in syllable content (unvoiced auditory "pa" + visual "ka" or voiced auditory "ba" + visual "ga") were presented interleaved with audiovisual congruent stimuli. The McGurk effect was highly variable across stimuli and participants, with the percentage of illusory fusion responses ranging from 3 to 78% across stimuli and from 0 to 91% across participants. Despite this variability, the NED model accurately predicted perception, predicting fusion rates for individual stimuli with 2.1% error and for individual participants with 2.4% error. Stimuli containing the unvoiced pa/ka pairing evoked more fusion responses than the voiced ba/ga pairing. Model estimates of sensory noise were correlated with participant age, with greater sensory noise in older participants. The NED model of the McGurk effect offers a principled way to account for individual and stimulus differences when examining the McGurk effect in different cultures.
Collapse
Affiliation(s)
- John F Magnotti
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Anastasia Lado
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Michael S Beauchamp
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| |
Collapse
|
2
|
Zou T, Li L, Huang X, Deng C, Wang X, Gao Q, Chen H, Li R. Dynamic causal modeling analysis reveals the modulation of motor cortex and integration in superior temporal gyrus during multisensory speech perception. Cogn Neurodyn 2024; 18:931-946. [PMID: 38826672 PMCID: PMC11143173 DOI: 10.1007/s11571-023-09945-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Revised: 02/03/2023] [Accepted: 02/10/2023] [Indexed: 03/06/2023] Open
Abstract
The processing of speech information from various sensory modalities is crucial for human communication. Both left posterior superior temporal gyrus (pSTG) and motor cortex importantly involve in the multisensory speech perception. However, the dynamic integration of primary sensory regions to pSTG and the motor cortex remain unclear. Here, we implemented a behavioral experiment of classical McGurk effect paradigm and acquired the task functional magnetic resonance imaging (fMRI) data during synchronized audiovisual syllabic perception from 63 normal adults. We conducted dynamic causal modeling (DCM) analysis to explore the cross-modal interactions among the left pSTG, left precentral gyrus (PrG), left middle superior temporal gyrus (mSTG), and left fusiform gyrus (FuG). Bayesian model selection favored a winning model that included modulations of connections to PrG (mSTG → PrG, FuG → PrG), from PrG (PrG → mSTG, PrG → FuG), and to pSTG (mSTG → pSTG, FuG → pSTG). Moreover, the coupling strength of the above connections correlated with behavioral McGurk susceptibility. In addition, significant differences were found in the coupling strength of these connections between strong and weak McGurk perceivers. Strong perceivers modulated less inhibitory visual influence, allowed less excitatory auditory information flowing into PrG, but integrated more audiovisual information in pSTG. Taken together, our findings show that the PrG and pSTG interact dynamically with primary cortices during audiovisual speech, and support the motor cortex plays a specifically functional role in modulating the gain and salience between auditory and visual modalities. Supplementary Information The online version contains supplementary material available at 10.1007/s11571-023-09945-z.
Collapse
Affiliation(s)
- Ting Zou
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| | - Liyuan Li
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| | - Xinju Huang
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| | - Chijun Deng
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| | - Xuyang Wang
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| | - Qing Gao
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| | - Huafu Chen
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| | - Rong Li
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Laboratory for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 People’s Republic of China
| |
Collapse
|
3
|
Dong C, Noppeney U, Wang S. Perceptual uncertainty explains activation differences between audiovisual congruent speech and McGurk stimuli. Hum Brain Mapp 2024; 45:e26653. [PMID: 38488460 DOI: 10.1002/hbm.26653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2023] [Revised: 02/20/2024] [Accepted: 02/26/2024] [Indexed: 03/19/2024] Open
Abstract
Face-to-face communication relies on the integration of acoustic speech signals with the corresponding facial articulations. In the McGurk illusion, an auditory /ba/ phoneme presented simultaneously with a facial articulation of a /ga/ (i.e., viseme), is typically fused into an illusory 'da' percept. Despite its widespread use as an index of audiovisual speech integration, critics argue that it arises from perceptual processes that differ categorically from natural speech recognition. Conversely, Bayesian theoretical frameworks suggest that both the illusory McGurk and the veridical audiovisual congruent speech percepts result from probabilistic inference based on noisy sensory signals. According to these models, the inter-sensory conflict in McGurk stimuli may only increase observers' perceptual uncertainty. This functional magnetic resonance imaging (fMRI) study presented participants (20 male and 24 female) with audiovisual congruent, McGurk (i.e., auditory /ba/ + visual /ga/), and incongruent (i.e., auditory /ga/ + visual /ba/) stimuli along with their unisensory counterparts in a syllable categorization task. Behaviorally, observers' response entropy was greater for McGurk compared to congruent audiovisual stimuli. At the neural level, McGurk stimuli increased activations in a widespread neural system, extending from the inferior frontal sulci (IFS) to the pre-supplementary motor area (pre-SMA) and insulae, typically involved in cognitive control processes. Crucially, in line with Bayesian theories these activation increases were fully accounted for by observers' perceptual uncertainty as measured by their response entropy. Our findings suggest that McGurk and congruent speech processing rely on shared neural mechanisms, thereby supporting the McGurk illusion as a valid measure of natural audiovisual speech perception.
Collapse
Affiliation(s)
- Chenjie Dong
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
- Donders Institute for Brain, Cognition, and Behavior, Radboud University, Nijmegen, the Netherlands
| | - Uta Noppeney
- Donders Institute for Brain, Cognition, and Behavior, Radboud University, Nijmegen, the Netherlands
| | - Suiping Wang
- Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
| |
Collapse
|
4
|
Lee HH, Groves K, Ripollés P, Carrasco M. Audiovisual integration in the McGurk effect is impervious to music training. Sci Rep 2024; 14:3262. [PMID: 38332159 PMCID: PMC10853564 DOI: 10.1038/s41598-024-53593-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Accepted: 02/01/2024] [Indexed: 02/10/2024] Open
Abstract
The McGurk effect refers to an audiovisual speech illusion where the discrepant auditory and visual syllables produce a fused percept between the visual and auditory component. However, little is known about how individual differences contribute to the McGurk effect. Here, we examined whether music training experience-which involves audiovisual integration-can modulate the McGurk effect. Seventy-three participants completed the Goldsmiths Musical Sophistication Index (Gold-MSI) questionnaire to evaluate their music expertise on a continuous scale. Gold-MSI considers participants' daily-life exposure to music learning experiences (formal and informal), instead of merely classifying people into different groups according to how many years they have been trained in music. Participants were instructed to report, via a 3-alternative forced choice task, "what a person said": /Ba/, /Ga/ or /Da/. The experiment consisted of 96 audiovisual congruent trials and 96 audiovisual incongruent (McGurk) trials. We observed no significant correlations between the susceptibility of the McGurk effect and the different subscales of the Gold-MSI (active engagement, perceptual abilities, music training, singing abilities, emotion) or the general musical sophistication composite score. Together, these findings suggest that music training experience does not modulate audiovisual integration in speech as reflected by the McGurk effect.
Collapse
Affiliation(s)
- Hsing-Hao Lee
- Department of Psychology, New York University, New York, USA.
| | - Karleigh Groves
- Department of Psychology, New York University, New York, USA
- Center for Language, Music, and Emotion (CLaME), New York University, New York, USA
- Music and Audio Research Lab (MARL), New York University, New York, USA
| | - Pablo Ripollés
- Department of Psychology, New York University, New York, USA
- Center for Language, Music, and Emotion (CLaME), New York University, New York, USA
- Music and Audio Research Lab (MARL), New York University, New York, USA
| | - Marisa Carrasco
- Department of Psychology, New York University, New York, USA
- Center for Neural Science, New York University, New York, USA
| |
Collapse
|
5
|
Choi I, Demir I, Oh S, Lee SH. Multisensory integration in the mammalian brain: diversity and flexibility in health and disease. Philos Trans R Soc Lond B Biol Sci 2023; 378:20220338. [PMID: 37545309 PMCID: PMC10404930 DOI: 10.1098/rstb.2022.0338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2023] [Accepted: 04/30/2023] [Indexed: 08/08/2023] Open
Abstract
Multisensory integration (MSI) occurs in a variety of brain areas, spanning cortical and subcortical regions. In traditional studies on sensory processing, the sensory cortices have been considered for processing sensory information in a modality-specific manner. The sensory cortices, however, send the information to other cortical and subcortical areas, including the higher association cortices and the other sensory cortices, where the multiple modality inputs converge and integrate to generate a meaningful percept. This integration process is neither simple nor fixed because these brain areas interact with each other via complicated circuits, which can be modulated by numerous internal and external conditions. As a result, dynamic MSI makes multisensory decisions flexible and adaptive in behaving animals. Impairments in MSI occur in many psychiatric disorders, which may result in an altered perception of the multisensory stimuli and an abnormal reaction to them. This review discusses the diversity and flexibility of MSI in mammals, including humans, primates and rodents, as well as the brain areas involved. It further explains how such flexibility influences perceptual experiences in behaving animals in both health and disease. This article is part of the theme issue 'Decision and control processes in multisensory perception'.
Collapse
Affiliation(s)
- Ilsong Choi
- Center for Synaptic Brain Dysfunctions, Institute for Basic Science (IBS), Daejeon 34141, Republic of Korea
| | - Ilayda Demir
- Department of biological sciences, KAIST, Daejeon 34141, Republic of Korea
| | - Seungmi Oh
- Department of biological sciences, KAIST, Daejeon 34141, Republic of Korea
| | - Seung-Hee Lee
- Center for Synaptic Brain Dysfunctions, Institute for Basic Science (IBS), Daejeon 34141, Republic of Korea
- Department of biological sciences, KAIST, Daejeon 34141, Republic of Korea
| |
Collapse
|
6
|
Odell SR, Zito N, Clark D, Mathew D. Stability of olfactory behavior syndromes in the Drosophila larva. Sci Rep 2023; 13:2398. [PMID: 36765192 PMCID: PMC9918538 DOI: 10.1038/s41598-023-29523-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 02/06/2023] [Indexed: 02/12/2023] Open
Abstract
Individuals of many animal populations exhibit idiosyncratic behaviors. One measure of idiosyncratic behavior is a behavior syndrome, defined as the stability of one or more behavior traits in an individual across different situations. While behavior syndromes have been described in various animal systems, their properties and the circuit mechanisms that generate them are poorly understood. We thus have an incomplete understanding of how circuit properties influence animal behavior. Here, we characterize olfactory behavior syndromes in the Drosophila larva. We show that larvae exhibit idiosyncrasies in their olfactory behavior over short time scales. They are influenced by the larva's satiety state and odor environment. Additionally, we identified a group of antennal lobe local neurons that influence the larva's idiosyncratic behavior. These findings reveal previously unsuspected influences on idiosyncratic behavior. They further affirm the idea that idiosyncrasies are not simply statistical phenomena but manifestations of neural mechanisms. In light of these findings, we discuss more broadly the importance of idiosyncrasies to animal survival and how they might be studied.
Collapse
Affiliation(s)
- Seth R Odell
- Integrative Neuroscience Program, University of Nevada, Reno, NV, 89557, USA
| | - Nicholas Zito
- Integrative Neuroscience Program, University of Nevada, Reno, NV, 89557, USA
| | - David Clark
- Integrative Neuroscience Program, University of Nevada, Reno, NV, 89557, USA
| | - Dennis Mathew
- Integrative Neuroscience Program, University of Nevada, Reno, NV, 89557, USA. .,Department of Biology, University of Nevada, Reno, NV, 89557, USA.
| |
Collapse
|