1
|
Sato M. Audiovisual speech asynchrony asymmetrically modulates neural binding. Neuropsychologia 2024; 198:108866. [PMID: 38518889 DOI: 10.1016/j.neuropsychologia.2024.108866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2023] [Revised: 01/09/2024] [Accepted: 03/19/2024] [Indexed: 03/24/2024]
Abstract
Previous psychophysical and neurophysiological studies in young healthy adults have provided evidence that audiovisual speech integration occurs with a large degree of temporal tolerance around true simultaneity. To further determine whether audiovisual speech asynchrony modulates auditory cortical processing and neural binding in young healthy adults, N1/P2 auditory evoked responses were compared using an additive model during a syllable categorization task, without or with an audiovisual asynchrony ranging from 240 ms visual lead to 240 ms auditory lead. Consistent with previous psychophysical findings, the observed results converge in favor of an asymmetric temporal integration window. Three main findings were observed: 1) predictive temporal and phonetic cues from pre-phonatory visual movements before the acoustic onset appeared essential for neural binding to occur, 2) audiovisual synchrony, with visual pre-phonatory movements predictive of the onset of the acoustic signal, was a prerequisite for N1 latency facilitation, and 3) P2 amplitude suppression and latency facilitation occurred even when visual pre-phonatory movements were not predictive of the acoustic onset but the syllable to come. Taken together, these findings help further clarify how audiovisual speech integration partly operates through two stages of visually-based temporal and phonetic predictions.
Collapse
Affiliation(s)
- Marc Sato
- Laboratoire Parole et Langage, Centre National de la Recherche Scientifique, Aix-Marseille Université, Aix-en-Provence, France.
| |
Collapse
|
2
|
Tremblay P, Sato M. Movement-related cortical potential and speech-induced suppression during speech production in younger and older adults. BRAIN AND LANGUAGE 2024; 253:105415. [PMID: 38692095 DOI: 10.1016/j.bandl.2024.105415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/13/2023] [Revised: 04/16/2024] [Accepted: 04/18/2024] [Indexed: 05/03/2024]
Abstract
With age, the speech system undergoes important changes that render speech production more laborious, slower and often less intelligible. And yet, the neural mechanisms that underlie these age-related changes remain unclear. In this EEG study, we examined two important mechanisms in speech motor control: pre-speech movement-related cortical potential (MRCP), which reflects speech motor planning, and speaking-induced suppression (SIS), which indexes auditory predictions of speech motor commands, in 20 healthy young and 20 healthy older adults. Participants undertook a vowel production task which was followed by passive listening of their own recorded vowels. Our results revealed extensive differences in MRCP in older compared to younger adults. Further, while longer latencies were observed in older adults on N1 and P2, in contrast, the SIS was preserved. The observed reduced MRCP appears as a potential explanatory mechanism for the known age-related slowing of speech production, while preserved SIS suggests intact motor-to-auditory integration.
Collapse
Affiliation(s)
- Pascale Tremblay
- Université Laval, Faculté de Médecine, Département de Réadaptation, Quebec City G1V 0A6, Canada; CERVO Brain Research Center, Quebec City G1J 2G3, Canada.
| | - Marc Sato
- Laboratoire Parole et Langage, Centre National de la Recherche Scientifique, Aix-Marseille Université, Aix-en-Provence, France
| |
Collapse
|
3
|
Sato M. Competing influence of visual speech on auditory neural adaptation. BRAIN AND LANGUAGE 2023; 247:105359. [PMID: 37951157 DOI: 10.1016/j.bandl.2023.105359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 09/25/2023] [Accepted: 11/06/2023] [Indexed: 11/13/2023]
Abstract
Visual information from a speaker's face enhances auditory neural processing and speech recognition. To determine whether auditory memory can be influenced by visual speech, the degree of auditory neural adaptation of an auditory syllable preceded by an auditory, visual, or audiovisual syllable was examined using EEG. Consistent with previous findings and additional adaptation of auditory neurons tuned to acoustic features, stronger adaptation of N1, P2 and N2 auditory evoked responses was observed when the auditory syllable was preceded by an auditory compared to a visual syllable. However, although stronger than when preceded by a visual syllable, lower adaptation was observed when the auditory syllable was preceded by an audiovisual compared to an auditory syllable. In addition, longer N1 and P2 latencies were then observed. These results further demonstrate that visual speech acts on auditory memory but suggest competing visual influences in the case of audiovisual stimulation.
Collapse
Affiliation(s)
- Marc Sato
- Laboratoire Parole et Langage, Centre National de la Recherche Scientifique, UMR 7309 CNRS & Aix-Marseille Université, Aix-Marseille Université, 5 avenue Pasteur, Aix-en-Provence, France.
| |
Collapse
|
4
|
Hansmann D, Derrick D, Theys C. Hearing, seeing, and feeling speech: the neurophysiological correlates of trimodal speech perception. Front Hum Neurosci 2023; 17:1225976. [PMID: 37706173 PMCID: PMC10495990 DOI: 10.3389/fnhum.2023.1225976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2023] [Accepted: 08/08/2023] [Indexed: 09/15/2023] Open
Abstract
Introduction To perceive speech, our brains process information from different sensory modalities. Previous electroencephalography (EEG) research has established that audio-visual information provides an advantage compared to auditory-only information during early auditory processing. In addition, behavioral research showed that auditory speech perception is not only enhanced by visual information but also by tactile information, transmitted by puffs of air arriving at the skin and aligned with speech. The current EEG study aimed to investigate whether the behavioral benefits of bimodal audio-aerotactile and trimodal audio-visual-aerotactile speech presentation are reflected in cortical auditory event-related neurophysiological responses. Methods To examine the influence of multimodal information on speech perception, 20 listeners conducted a two-alternative forced-choice syllable identification task at three different signal-to-noise levels. Results Behavioral results showed increased syllable identification accuracy when auditory information was complemented with visual information, but did not show the same effect for the addition of tactile information. Similarly, EEG results showed an amplitude suppression for the auditory N1 and P2 event-related potentials for the audio-visual and audio-visual-aerotactile modalities compared to auditory and audio-aerotactile presentations of the syllable/pa/. No statistically significant difference was present between audio-aerotactile and auditory-only modalities. Discussion Current findings are consistent with past EEG research showing a visually induced amplitude suppression during early auditory processing. In addition, the significant neurophysiological effect of audio-visual but not audio-aerotactile presentation is in line with the large benefit of visual information but comparatively much smaller effect of aerotactile information on auditory speech perception previously identified in behavioral research.
Collapse
Affiliation(s)
- Doreen Hansmann
- School of Psychology, Speech and Hearing, University of Canterbury, Christchurch, New Zealand
| | - Donald Derrick
- New Zealand Institute of Language, Brain and Behaviour, University of Canterbury, Christchurch, New Zealand
| | - Catherine Theys
- School of Psychology, Speech and Hearing, University of Canterbury, Christchurch, New Zealand
- New Zealand Institute of Language, Brain and Behaviour, University of Canterbury, Christchurch, New Zealand
| |
Collapse
|
5
|
Sato M. The timing of visual speech modulates auditory neural processing. BRAIN AND LANGUAGE 2022; 235:105196. [PMID: 36343508 DOI: 10.1016/j.bandl.2022.105196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/17/2022] [Revised: 10/15/2022] [Accepted: 10/19/2022] [Indexed: 06/16/2023]
Abstract
In face-to-face communication, visual information from a speaker's face and time-varying kinematics of articulatory movements have been shown to fine-tune auditory neural processing and improve speech recognition. To further determine whether the timing of visual gestures modulates auditory cortical processing, three sets of syllables only differing in the onset and duration of silent prephonatory movements, before the acoustic speech signal, were contrasted using EEG. Despite similar visual recognition rates, an increase in the amplitude of P2 auditory evoked responses was observed from the longest to the shortest movements. Taken together, these results clarify how audiovisual speech perception partly operates through visually-based predictions and related processing time, with acoustic-phonetic neural processing paralleling the timing of visual prephonatory gestures.
Collapse
Affiliation(s)
- Marc Sato
- Laboratoire Parole et Langage, Centre National de la Recherche Scientifique, Aix-Marseille Université, Aix-en-Provence, France.
| |
Collapse
|
6
|
Franken MK, Liu BC, Ostry DJ. Towards a somatosensory theory of speech perception. J Neurophysiol 2022; 128:1683-1695. [PMID: 36416451 PMCID: PMC9762980 DOI: 10.1152/jn.00381.2022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2022] [Revised: 11/19/2022] [Accepted: 11/19/2022] [Indexed: 11/24/2022] Open
Abstract
Speech perception is known to be a multimodal process, relying not only on auditory input but also on the visual system and possibly on the motor system as well. To date there has been little work on the potential involvement of the somatosensory system in speech perception. In the present review, we identify the somatosensory system as another contributor to speech perception. First, we argue that evidence in favor of a motor contribution to speech perception can just as easily be interpreted as showing somatosensory involvement. Second, physiological and neuroanatomical evidence for auditory-somatosensory interactions across the auditory hierarchy indicates the availability of a neural infrastructure that supports somatosensory involvement in auditory processing in general. Third, there is accumulating evidence for somatosensory involvement in the context of speech specifically. In particular, tactile stimulation modifies speech perception, and speech auditory input elicits activity in somatosensory cortical areas. Moreover, speech sounds can be decoded from activity in somatosensory cortex; lesions to this region affect perception, and vowels can be identified based on somatic input alone. We suggest that the somatosensory involvement in speech perception derives from the somatosensory-auditory pairing that occurs during speech production and learning. By bringing together findings from a set of studies that have not been previously linked, the present article identifies the somatosensory system as a presently unrecognized contributor to speech perception.
Collapse
Affiliation(s)
| | | | - David J Ostry
- McGill University, Montreal, Quebec, Canada
- Haskins Laboratories, New Haven, Connecticut
| |
Collapse
|
7
|
Sato M. Motor and visual influences on auditory neural processing during speaking and listening. Cortex 2022; 152:21-35. [DOI: 10.1016/j.cortex.2022.03.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2021] [Revised: 02/02/2022] [Accepted: 03/15/2022] [Indexed: 11/03/2022]
|
8
|
Tremblay P, Basirat A, Pinto S, Sato M. Visual prediction cues can facilitate behavioural and neural speech processing in young and older adults. Neuropsychologia 2021; 159:107949. [PMID: 34228997 DOI: 10.1016/j.neuropsychologia.2021.107949] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2020] [Revised: 06/16/2021] [Accepted: 07/01/2021] [Indexed: 02/06/2023]
Abstract
The ability to process speech evolves over the course of the lifespan. Understanding speech at low acoustic intensity and in the presence of background noise becomes harder, and the ability for older adults to benefit from audiovisual speech also appears to decline. These difficulties can have important consequences on quality of life. Yet, a consensus on the cause of these difficulties is still lacking. The objective of this study was to examine the processing of speech in young and older adults under different modalities (i.e. auditory [A], visual [V], audiovisual [AV]) and in the presence of different visual prediction cues (i.e., no predictive cue (control), temporal predictive cue, phonetic predictive cue, and combined temporal and phonetic predictive cues). We focused on recognition accuracy and four auditory evoked potential (AEP) components: P1-N1-P2 and N2. Thirty-four right-handed French-speaking adults were recruited, including 17 younger adults (28 ± 2 years; 20-42 years) and 17 older adults (67 ± 3.77 years; 60-73 years). Participants completed a forced-choice speech identification task. The main findings of the study are: (1) The faciliatory effect of visual information was reduced, but present, in older compared to younger adults, (2) visual predictive cues facilitated speech recognition in younger and older adults alike, (3) age differences in AEPs were localized to later components (P2 and N2), suggesting that aging predominantly affects higher-order cortical processes related to speech processing rather than lower-level auditory processes. (4) Specifically, AV facilitation on P2 amplitude was lower in older adults, there was a reduced effect of the temporal predictive cue on N2 amplitude for older compared to younger adults, and P2 and N2 latencies were longer for older adults. Finally (5) behavioural performance was associated with P2 amplitude in older adults. Our results indicate that aging affects speech processing at multiple levels, including audiovisual integration (P2) and auditory attentional processes (N2). These findings have important implications for understanding barriers to communication in older ages, as well as for the development of compensation strategies for those with speech processing difficulties.
Collapse
Affiliation(s)
- Pascale Tremblay
- Département de Réadaptation, Faculté de Médecine, Université Laval, Quebec City, Canada; Cervo Brain Research Centre, Quebec City, Canada.
| | - Anahita Basirat
- Univ. Lille, CNRS, UMR 9193 - SCALab - Sciences Cognitives et Sciences Affectives, Lille, France
| | - Serge Pinto
- France Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France
| | - Marc Sato
- France Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France
| |
Collapse
|
9
|
Stuckenberg MV, Schröger E, Widmann A. Presentation Probability of Visual–Auditory Pairs Modulates Visually Induced Auditory Predictions. J Cogn Neurosci 2019; 31:1110-1125. [DOI: 10.1162/jocn_a_01398] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Predictions about forthcoming auditory events can be established on the basis of preceding visual information. Sounds being incongruent to predictive visual information have been found to elicit an enhanced negative ERP in the latency range of the auditory N1 compared with physically identical sounds being preceded by congruent visual information. This so-called incongruency response (IR) is interpreted as reduced prediction error for predicted sounds at a sensory level. The main purpose of this study was to examine the impact of probability manipulations on the IR. We manipulated the probability with which particular congruent visual–auditory pairs were presented (83/17 vs. 50/50 condition). This manipulation led to two conditions with different strengths of the association of visual with auditory information. A visual cue was presented either above or below a fixation cross and was followed by either a high- or low-pitched sound. In 90% of trials, the visual cue correctly predicted the subsequent sound. In one condition, one of the sounds was presented more frequently (83% of trials), whereas in the other condition both sounds were presented with equal probability (50% of trials). Therefore, in the 83/17 condition, one congruent combination of visual cue and corresponding sound was presented more frequently than the other combinations presumably leading to a stronger visual–auditory association. A significant IR for unpredicted compared with predicted but otherwise identical sounds was observed only in the 83/17 condition, but not in the 50/50 condition, where both congruent visual cue–sound combinations were presented with equal probability. We also tested whether the processing of the prediction violation is dependent on the task relevance of the visual information. Therefore, we contrasted a visual–auditory matching task with a pitch discrimination task. It turned out that the task only had an impact on the behavioral performance but not on the prediction error signals. Results suggest that the generation of visual-to-auditory sensory predictions is facilitated by a strong association between the visual cue and the predicted sound (83/17 condition) but is not influenced by the task relevance of the visual information.
Collapse
Affiliation(s)
- Maria V. Stuckenberg
- Institute of Psychology, University of Leipzig
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
| | | | - Andreas Widmann
- Institute of Psychology, University of Leipzig
- Leibniz Institute for Neurobiology, Magdeburg, Germany
| |
Collapse
|
10
|
Electrophysiological evidence for a self-processing advantage during audiovisual speech integration. Exp Brain Res 2017; 235:2867-2876. [PMID: 28676921 DOI: 10.1007/s00221-017-5018-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2017] [Accepted: 06/23/2017] [Indexed: 10/19/2022]
Abstract
Previous electrophysiological studies have provided strong evidence for early multisensory integrative mechanisms during audiovisual speech perception. From these studies, one unanswered issue is whether hearing our own voice and seeing our own articulatory gestures facilitate speech perception, possibly through a better processing and integration of sensory inputs with our own sensory-motor knowledge. The present EEG study examined the impact of self-knowledge during the perception of auditory (A), visual (V) and audiovisual (AV) speech stimuli that were previously recorded from the participant or from a speaker he/she had never met. Audiovisual interactions were estimated by comparing N1 and P2 auditory evoked potentials during the bimodal condition (AV) with the sum of those observed in the unimodal conditions (A + V). In line with previous EEG studies, our results revealed an amplitude decrease of P2 auditory evoked potentials in AV compared to A + V conditions. Crucially, a temporal facilitation of N1 responses was observed during the visual perception of self speech movements compared to those of another speaker. This facilitation was negatively correlated with the saliency of visual stimuli. These results provide evidence for a temporal facilitation of the integration of auditory and visual speech signals when the visual situation involves our own speech gestures.
Collapse
|
11
|
Kokinous J, Tavano A, Kotz SA, Schröger E. Perceptual integration of faces and voices depends on the interaction of emotional content and spatial frequency. Biol Psychol 2017; 123:155-165. [DOI: 10.1016/j.biopsycho.2016.12.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2016] [Revised: 10/11/2016] [Accepted: 12/11/2016] [Indexed: 10/20/2022]
|
12
|
Rosenblum LD, Dorsi J, Dias JW. The Impact and Status of Carol Fowler's Supramodal Theory of Multisensory Speech Perception. ECOLOGICAL PSYCHOLOGY 2016. [DOI: 10.1080/10407413.2016.1230373] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
|
13
|
Hisanaga S, Sekiyama K, Igasaki T, Murayama N. Language/Culture Modulates Brain and Gaze Processes in Audiovisual Speech Perception. Sci Rep 2016; 6:35265. [PMID: 27734953 PMCID: PMC5062344 DOI: 10.1038/srep35265] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2015] [Accepted: 09/27/2016] [Indexed: 11/25/2022] Open
Abstract
Several behavioural studies have shown that the interplay between voice and face information in audiovisual speech perception is not universal. Native English speakers (ESs) are influenced by visual mouth movement to a greater degree than native Japanese speakers (JSs) when listening to speech. However, the biological basis of these group differences is unknown. Here, we demonstrate the time-varying processes of group differences in terms of event-related brain potentials (ERP) and eye gaze for audiovisual and audio-only speech perception. On a behavioural level, while congruent mouth movement shortened the ESs’ response time for speech perception, the opposite effect was observed in JSs. Eye-tracking data revealed a gaze bias to the mouth for the ESs but not the JSs, especially before the audio onset. Additionally, the ERP P2 amplitude indicated that ESs processed multisensory speech more efficiently than auditory-only speech; however, the JSs exhibited the opposite pattern. Taken together, the ESs’ early visual attention to the mouth was likely to promote phonetic anticipation, which was not the case for the JSs. These results clearly indicate the impact of language and/or culture on multisensory speech processing, suggesting that linguistic/cultural experiences lead to the development of unique neural systems for audiovisual speech perception.
Collapse
Affiliation(s)
- Satoko Hisanaga
- Division of Cognitive Psychology, Faculty of Letters, Kumamoto University 2-40-1, Kurokami, Chuo-ku, Kumamoto, 860-8555, Japan
| | - Kaoru Sekiyama
- Division of Cognitive Psychology, Faculty of Letters, Kumamoto University 2-40-1, Kurokami, Chuo-ku, Kumamoto, 860-8555, Japan
| | - Tomohiko Igasaki
- Department of Information Technology on Human and Environmental Science, Graduate School of Science and Technology, Kumamoto University, 2-39-1, Kurokami, Chuo-ku, Kumamoto, 860-8555, Japan
| | - Nobuki Murayama
- Department of Information Technology on Human and Environmental Science, Graduate School of Science and Technology, Kumamoto University, 2-39-1, Kurokami, Chuo-ku, Kumamoto, 860-8555, Japan
| |
Collapse
|
14
|
Baart M. Quantifying lip-read-induced suppression and facilitation of the auditory N1 and P2 reveals peak enhancements and delays. Psychophysiology 2016; 53:1295-306. [DOI: 10.1111/psyp.12683] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2015] [Accepted: 05/09/2016] [Indexed: 11/29/2022]
Affiliation(s)
- Martijn Baart
- BCBL. Basque Center on Cognition, Brain and Language; Donostia-San Sebastián Spain
- Department of Cognitive Neuropsychology; Tilburg University; Tilburg The Netherlands
| |
Collapse
|
15
|
Rosenblum LD, Dias JW, Dorsi J. The supramodal brain: implications for auditory perception. JOURNAL OF COGNITIVE PSYCHOLOGY 2016. [DOI: 10.1080/20445911.2016.1181691] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
|
16
|
Tiippana K, Möttönen R, Schwartz JL. Multisensory and sensorimotor interactions in speech perception. Front Psychol 2015; 6:458. [PMID: 25941506 PMCID: PMC4403297 DOI: 10.3389/fpsyg.2015.00458] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2015] [Accepted: 03/30/2015] [Indexed: 11/29/2022] Open
Affiliation(s)
- Kaisa Tiippana
- Institute of Behavioural Sciences, University of Helsinki Helsinki, Finland
| | - Riikka Möttönen
- Department of Experimental Psychology, University of Oxford Oxford, UK
| | - Jean-Luc Schwartz
- Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, Centre National de la Recherche Scientifique, Grenoble University Grenoble, France
| |
Collapse
|
17
|
Ganesh AC, Berthommier F, Vilain C, Sato M, Schwartz JL. A possible neurophysiological correlate of audiovisual binding and unbinding in speech perception. Front Psychol 2014; 5:1340. [PMID: 25505438 PMCID: PMC4244540 DOI: 10.3389/fpsyg.2014.01340] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2014] [Accepted: 11/03/2014] [Indexed: 11/13/2022] Open
Abstract
Audiovisual (AV) speech integration of auditory and visual streams generally ends up in a fusion into a single percept. One classical example is the McGurk effect in which incongruent auditory and visual speech signals may lead to a fused percept different from either visual or auditory inputs. In a previous set of experiments, we showed that if a McGurk stimulus is preceded by an incongruent AV context (composed of incongruent auditory and visual speech materials) the amount of McGurk fusion is largely decreased. We interpreted this result in the framework of a two-stage "binding and fusion" model of AV speech perception, with an early AV binding stage controlling the fusion/decision process and likely to produce "unbinding" with less fusion if the context is incoherent. In order to provide further electrophysiological evidence for this binding/unbinding stage, early auditory evoked N1/P2 responses were here compared during auditory, congruent and incongruent AV speech perception, according to either prior coherent or incoherent AV contexts. Following the coherent context, in line with previous electroencephalographic/magnetoencephalographic studies, visual information in the congruent AV condition was found to modify auditory evoked potentials, with a latency decrease of P2 responses compared to the auditory condition. Importantly, both P2 amplitude and latency in the congruent AV condition increased from the coherent to the incoherent context. Although potential contamination by visual responses from the visual cortex cannot be discarded, our results might provide a possible neurophysiological correlate of early binding/unbinding process applied on AV interactions.
Collapse
Affiliation(s)
- Attigodu C Ganesh
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble France
| | - Frédéric Berthommier
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble France
| | - Coriandre Vilain
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble France
| | - Marc Sato
- CNRS, Laboratoire Parole et Langage, Brain and Language Research Institute, UMR 7309, Aix-Marseille University, Aix-en-Provence France
| | - Jean-Luc Schwartz
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble France
| |
Collapse
|