1
Teramoto W, Ernst MO. Effects of invisible lip movements on phonetic perception. Sci Rep 2023; 13:6478. [PMID: 37081084] [PMCID: PMC10119180] [DOI: 10.1038/s41598-023-33791-y]
Abstract
We investigated whether 'invisible' visual information, i.e., visual information that is not consciously perceived, could affect auditory speech perception. Repeated exposure to McGurk stimuli (auditory /ba/ with visual [ga]) temporarily changes the perception of the auditory /ba/ into a 'da' or 'ga'. This altered auditory percept persists even after the McGurk stimuli have been presented, when the auditory stimulus is presented alone (McGurk aftereffect). Exploiting this aftereffect, we presented the auditory /ba/ either alone (No Face) or together with a masked face articulating a visual [ba] (Congruent Invisible) or a visual [ga] (Incongruent Invisible), and measured the extent to which the invisible faces could undo or prolong the McGurk aftereffects. In a further control condition, the incongruent faces remained unmasked and thus visible, resulting in four conditions in total. Visibility was defined by the participants' subjective dichotomous reports ('visible' or 'invisible'). The results showed that the Congruent Invisible condition reduced the McGurk aftereffects compared with the other conditions, whereas the Incongruent Invisible condition did not differ from the No Face condition. These results suggest that 'invisible' visual information that is not consciously perceived can affect phonetic perception, but only when it is congruent with the auditory information.
Affiliation(s)
- W Teramoto
- Faculty of Humanities and Cultural Sciences (Psychology), Kumamoto University, 2-40-1 Kurokami, Kumamoto, 860-8555, Japan.
- M O Ernst
- Applied Cognitive Psychology, Ulm University, Albert-Einstein-Allee 43, 89081, Ulm, Germany
2
Wahn B, Schmitz L, Kingstone A, Böckler-Raettig A. When eyes beat lips: speaker gaze affects audiovisual integration in the McGurk illusion. Psychol Res 2021; 86:1930-1943. [PMID: 34854983] [PMCID: PMC9363401] [DOI: 10.1007/s00426-021-01618-y]
Abstract
Eye contact is a dynamic social signal that captures attention and plays a critical role in human communication. In particular, direct gaze often accompanies communicative acts in an ostensive function: a speaker directs her gaze towards the addressee to highlight the fact that this message is being intentionally communicated to her. The addressee, in turn, integrates the speaker's auditory and visual speech signals (i.e., her vocal sounds and lip movements) into a unitary percept. It is an open question whether the speaker's gaze affects how the addressee integrates the speaker's multisensory speech signals. We investigated this question using the classic McGurk illusion, an illusory percept created by presenting mismatching auditory (vocal sounds) and visual information (speaker's lip movements). Specifically, we manipulated whether the speaker (a) moved his eyelids up/down (i.e., opened/closed his eyes) prior to speaking or did not show any eye motion, and (b) spoke with open or closed eyes. When the speaker's eyes moved (i.e., opened or closed) before an utterance, and when the speaker spoke with closed eyes, the McGurk illusion was weakened (i.e., addressees reported significantly fewer illusory percepts). In line with previous research, this suggests that motion (opening or closing), as well as the closed state of the speaker's eyes, captured addressees' attention, thereby reducing the influence of the speaker's lip movements on the addressees' audiovisual integration process. Our findings reaffirm the power of speaker gaze to guide attention, showing that its dynamics can modulate low-level processes such as the integration of multisensory speech signals.
Affiliation(s)
- Basil Wahn
- Department of Psychology, Leibniz Universität Hannover, Hannover, Germany.
- Laura Schmitz
- Institute of Sports Science, Leibniz Universität Hannover, Hannover, Germany
- Alan Kingstone
- Department of Psychology, University of British Columbia, Vancouver, BC, Canada
3
Ujiie Y, Takahashi K. Weaker McGurk Effect for Rubin's Vase-Type Speech in People With High Autistic Traits. Multisens Res 2021; 34:1-17. [PMID: 33873157] [DOI: 10.1163/22134808-bja10047]
Abstract
While visual information from facial speech modulates auditory speech perception, it is less influential on audiovisual speech perception among autistic individuals than among typically developed individuals. In this study, we investigated the relationship between autistic traits (Autism-Spectrum Quotient; AQ) and the influence of visual speech on the recognition of Rubin's vase-type speech stimuli with degraded facial speech information. Participants were 31 university students (13 males and 18 females; mean age: 19.2 years, SD: 1.13) who reported normal (or corrected-to-normal) hearing and vision. All participants completed three speech recognition tasks (visual, auditory, and audiovisual stimuli) and the Japanese version of the AQ. The results showed that accuracies of speech recognition for visual (i.e., lip-reading) and auditory stimuli were not significantly related to participants' AQ. In contrast, audiovisual speech perception was less susceptible to facial speech information among individuals with high rather than low autistic traits. The weaker influence of visual information on audiovisual speech perception in autism spectrum disorder (ASD) was robust regardless of the clarity of the visual information, suggesting a difficulty in the process of audiovisual integration rather than in the visual processing of facial speech.
Affiliation(s)
- Yuta Ujiie
- Graduate School of Psychology, Chukyo University, 101-2 Yagoto Honmachi, Showa-ku, Nagoya-shi, Aichi, 466-8666, Japan
- Japan Society for the Promotion of Science, Kojimachi Business Center Building, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
- Research and Development Initiative, Chuo University, 1-13-27, Kasuga, Bunkyo-ku, Tokyo, 112-8551, Japan
- Kohske Takahashi
- School of Psychology, Chukyo University, 101-2 Yagoto Honmachi, Showa-ku, Nagoya-shi, Aichi, 466-8666, Japan
4
Ching ASM, Kim J, Davis C. Auditory-visual integration during nonconscious perception. Cortex 2019; 117:1-15. [PMID: 30925308] [DOI: 10.1016/j.cortex.2019.02.014]
Abstract
Our study proposes a test of a key assumption of the most prominent model of consciousness - the global workspace (GWS) model (e.g., Baars, 2002, 2005, 2007; Dehaene & Naccache, 2001; Mudrik, Faivre, & Koch, 2014). This assumption is that multimodal integration requires consciousness; however, few studies have explicitly tested whether integration can occur between nonconscious information from different modalities. The present study examined whether a classic indicator of multimodal integration - the McGurk effect - can be elicited with subliminal auditory-visual speech stimuli. We used a masked speech priming paradigm developed by Kouider and Dupoux (2005) in conjunction with continuous flash suppression (CFS; Tsuchiya & Koch, 2005), a binocular rivalry technique for presenting video stimuli subliminally. Applying these techniques together, we carried out two experiments in which participants categorised auditory syllable targets that were preceded by subliminal auditory-visual (AV) speech primes. Subliminal AV primes were either illusion-inducing (McGurk) or illusion-neutral (Incongruent) combinations of speech stimuli. In Experiment 1, the categorisation of the syllable target ("pa") was facilitated by the same syllable prime when it was part of a McGurk combination (auditory "pa" and visual "ka") but not when part of an Incongruent combination (auditory "pa" and visual "wa"). This dependency on specific AV combinations indicated a nonconscious AV interaction. Experiment 2 presented a different syllable target ("ta") which matched the predicted illusory outcome of the McGurk combination - here, both the McGurk combination (auditory "pa" and visual "ka") and the Incongruent combination (auditory "ta" and visual "ka") failed to facilitate target categorisation. The combined results of both experiments demonstrate a type of nonconscious multimodal interaction that is distinct from integration - it allows unimodal information that is compatible for integration (i.e., McGurk combinations) to persist and influence later processes, but does not actually combine and alter that information. As the GWS model does not account for non-integrative multimodal interactions, this places some pressure on such models of consciousness.
Affiliation(s)
- Jeesun Kim
- The MARCS Institute, Western Sydney University, Australia
- Chris Davis
- The MARCS Institute, Western Sydney University, Australia
5
Delong P, Aller M, Giani AS, Rohe T, Conrad V, Watanabe M, Noppeney U. Invisible Flashes Alter Perceived Sound Location. Sci Rep 2018; 8:12376. [PMID: 30120294] [PMCID: PMC6098122] [DOI: 10.1038/s41598-018-30773-3]
Abstract
Information integration across the senses is fundamental for effective interactions with our environment. The extent to which signals from different senses can interact in the absence of awareness is controversial. Combining the spatial ventriloquist illusion and dynamic continuous flash suppression (dCFS), we investigated in two experiments whether visual signals that observers do not consciously perceive can influence the spatial perception of sounds. Importantly, dCFS obliterated visual awareness only on a fraction of trials, allowing us to compare spatial ventriloquism for physically identical flashes that were judged as visible or invisible. Our results show a stronger ventriloquist effect for visible than for invisible flashes. Critically, a robust ventriloquist effect emerged also for invisible flashes, even when participants were at chance in locating the flash. Collectively, our findings demonstrate that signals we are not aware of in one sensory modality can alter the spatial perception of signals in another sensory modality.
Affiliation(s)
- Patrycja Delong
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, B15 2TT, Birmingham, UK.
- Máté Aller
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, B15 2TT, Birmingham, UK
- Anette S Giani
- Max Planck Institute for Biological Cybernetics, 72076, Tübingen, Germany
- Tim Rohe
- Max Planck Institute for Biological Cybernetics, 72076, Tübingen, Germany
- Verena Conrad
- Max Planck Institute for Biological Cybernetics, 72076, Tübingen, Germany
- Masataka Watanabe
- Max Planck Institute for Biological Cybernetics, 72076, Tübingen, Germany
- Uta Noppeney
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, B15 2TT, Birmingham, UK
- Max Planck Institute for Biological Cybernetics, 72076, Tübingen, Germany
6
Deroy O, Faivre N, Lunghi C, Spence C, Aller M, Noppeney U. The Complex Interplay Between Multisensory Integration and Perceptual Awareness. Multisens Res 2018; 29:585-606. [PMID: 27795942] [DOI: 10.1163/22134808-00002529]
Abstract
The integration of information has been considered a hallmark of human consciousness, as it requires information being globally available via widespread neural interactions. Yet the complex interdependencies between multisensory integration and perceptual awareness, or consciousness, remain to be defined. While perceptual awareness has traditionally been studied in a single sense, in recent years we have witnessed a surge of interest in the role of multisensory integration in perceptual awareness. Based on a recent IMRF symposium on multisensory awareness, this review discusses three key questions from conceptual, methodological and experimental perspectives: (1) What do we study when we study multisensory awareness? (2) What is the relationship between multisensory integration and perceptual awareness? (3) Which experimental approaches are most promising to characterize multisensory awareness? We hope that this review paper will provoke lively discussions, novel experiments, and conceptual considerations to advance our understanding of the multifaceted interplay between multisensory integration and consciousness.
Affiliation(s)
- O Deroy
- Centre for the Study of the Senses, Institute of Philosophy, School of Advanced Study, University of London, London, UK
- N Faivre
- Laboratory of Cognitive Neuroscience, Brain Mind Institute, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- C Lunghi
- Department of Translational Research on New Technologies in Medicine and Surgery, University of Pisa, Pisa, Italy
- C Spence
- Crossmodal Research Laboratory, Department of Experimental Psychology, Oxford University, Oxford, UK
- M Aller
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, Birmingham, UK
- U Noppeney
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, Birmingham, UK
7
Rodríguez-Martínez GA, Castillo-Parra H. Bistable perception: neural bases and usefulness in psychological research. Int J Psychol Res (Medellin) 2018; 11:63-76. [PMID: 32612780] [PMCID: PMC7110285] [DOI: 10.21500/20112084.3375]
Abstract
Bistable images can be perceived in two different ways. Due to their physical characteristics, these visual stimuli allow two different perceptions, associated with top-down and bottom-up modulating processes. Based on an extensive literature review, the present article gathers the conceptual models and foundations of perceptual bistability. This theoretical article compiles not only the notions involved in understanding this perceptual phenomenon, but also the diverse classifications and uses of bistable images in psychological research, along with a detailed explanation of the neural correlates involved in perceptual reversibility. We conclude that bistable images can be used extensively as a paradigmatic resource in psychological research. In addition, due to their characteristics, bistable visual stimuli can serve as a resource in experimental tasks that probe attention and sensory, perceptual, and memory processes.
Affiliation(s)
- Guillermo Andrés Rodríguez-Martínez
- Escuela de Publicidad, Universidad de Bogotá Jorge Tadeo Lozano, Bogotá, Colombia
- Facultad de Psicología, Universidad de San Buenaventura de Medellín, Medellín, Colombia
- Henry Castillo-Parra
- Facultad de Psicología, Universidad de San Buenaventura de Medellín, Medellín, Colombia
8
Morís Fernández L, Torralba M, Soto-Faraco S. Theta oscillations reflect conflict processing in the perception of the McGurk illusion. Eur J Neurosci 2018; 48:2630-2641. [DOI: 10.1111/ejn.13804]
Affiliation(s)
- Luis Morís Fernández
- Multisensory Research Group, Center for Brain and Cognition, Dept. de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Office 55.128, Roc Boronat 138, 08018 Barcelona, Spain
- Mireia Torralba
- Multisensory Research Group, Center for Brain and Cognition, Dept. de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Office 55.128, Roc Boronat 138, 08018 Barcelona, Spain
- Salvador Soto-Faraco
- Multisensory Research Group, Center for Brain and Cognition, Dept. de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Office 55.128, Roc Boronat 138, 08018 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
9
Alsius A, Paré M, Munhall KG. Forty Years After Hearing Lips and Seeing Voices: the McGurk Effect Revisited. Multisens Res 2018; 31:111-144. [PMID: 31264597] [DOI: 10.1163/22134808-00002565]
Abstract
Since its discovery 40 years ago, the McGurk illusion has usually been cited as a prototypical case of multisensory binding in humans, and has been extensively used in speech perception studies as a proxy measure for audiovisual integration mechanisms. Despite the well-established practice of using the McGurk illusion as a tool for studying the mechanisms underlying audiovisual speech integration, the magnitude of the illusion varies enormously across studies. Furthermore, the processing of McGurk stimuli differs from congruent audiovisual processing at both phenomenological and neural levels. This calls into question the suitability of this illusion as a tool to quantify the necessary and sufficient conditions under which audiovisual integration occurs in natural conditions. In this paper, we review some of the practical and theoretical issues related to the use of the McGurk illusion as an experimental paradigm. We believe that, without a richer understanding of the mechanisms involved in the processing of the McGurk effect, experimenters should be cautious when generalizing data generated by McGurk stimuli to matching audiovisual speech events.
Affiliation(s)
- Agnès Alsius
- Psychology Department, Queen's University, Humphrey Hall, 62 Arch St., Kingston, Ontario, K7L 3N6, Canada
- Martin Paré
- Psychology Department, Queen's University, Humphrey Hall, 62 Arch St., Kingston, Ontario, K7L 3N6, Canada
- Kevin G Munhall
- Psychology Department, Queen's University, Humphrey Hall, 62 Arch St., Kingston, Ontario, K7L 3N6, Canada
10
Ahveninen J, Huang S, Ahlfors SP, Hämäläinen M, Rossi S, Sams M, Jääskeläinen IP. Interacting parallel pathways associate sounds with visual identity in auditory cortices. Neuroimage 2015; 124:858-868. [PMID: 26419388] [DOI: 10.1016/j.neuroimage.2015.09.044]
Abstract
Spatial and non-spatial information of sound events is presumably processed in parallel auditory cortex (AC) "what" and "where" streams, which are modulated by inputs from the respective visual-cortex subsystems. How these parallel processes are integrated into perceptual objects that remain stable across time and the source agent's movements is unknown. We recorded magneto- and electroencephalography (MEG/EEG) data while subjects viewed animated video clips featuring two audiovisual objects, a black cat and a gray cat. Adaptor-probe events were either linked to the same object (the black cat meowed twice in a row in the same location) or included a visually conveyed identity change (the black and then the gray cat meowed with identical voices in the same location). In addition to effects in visual (including fusiform and middle temporal, or MT, areas) and frontoparietal association areas, the visually conveyed object-identity change was associated with a release from adaptation of early (50-150 ms) activity in posterior ACs, spreading to left anterior ACs at 250-450 ms in our combined MEG/EEG source estimates. Repetition of events belonging to the same object resulted in increased theta-band (4-8 Hz) synchronization within the "what" and "where" pathways (e.g., between anterior AC and fusiform areas). In contrast, the visually conveyed identity changes resulted in distributed synchronization at higher frequencies (alpha and beta bands, 8-32 Hz) across different auditory, visual, and association areas. The results suggest that sound events become initially linked to perceptual objects in posterior AC, followed by modulations of representations in anterior AC. Hierarchical "what" and "where" pathways seem to operate in parallel after repeating audiovisual associations, whereas the resetting of such associations engages a distributed network across auditory, visual, and multisensory areas.
Affiliation(s)
- Jyrki Ahveninen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Charlestown, MA, USA.
- Samantha Huang
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Charlestown, MA, USA
- Seppo P Ahlfors
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Charlestown, MA, USA
- Matti Hämäläinen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Charlestown, MA, USA
- Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA
- Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Espoo, Finland
- Stephanie Rossi
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Charlestown, MA, USA
- Mikko Sams
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Espoo, Finland
- Iiro P Jääskeläinen
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University School of Science, Espoo, Finland
11
Aller M, Giani A, Conrad V, Watanabe M, Noppeney U. A spatially collocated sound thrusts a flash into awareness. Front Integr Neurosci 2015; 9:16. [PMID: 25774126] [PMCID: PMC4343005] [DOI: 10.3389/fnint.2015.00016]
Abstract
To interact effectively with the environment, the brain integrates signals from multiple senses. It is currently unclear to what extent spatial information can be integrated across different senses in the absence of awareness. Combining dynamic continuous flash suppression (CFS) and spatial audiovisual stimulation, the current study investigated whether a sound helps a concurrent visual flash elude flash suppression and enter perceptual awareness, depending on audiovisual spatial congruency. Our results demonstrate that a concurrent sound boosts unaware visual signals into perceptual awareness. Critically, this process depended on the spatial congruency of the auditory and visual signals, pointing towards low-level mechanisms of audiovisual integration. Moreover, the concurrent sound biased the reported location of the flash as a function of flash visibility, and this spatial bias was strongest for flashes that were judged invisible. Our results suggest that multisensory integration is a critical mechanism that enables signals to enter conscious perception.
Affiliation(s)
- Máté Aller
- Computational Cognitive Neuroimaging Laboratory, Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, Birmingham, UK
- Anette Giani
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Verena Conrad
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Uta Noppeney
- Computational Cognitive Neuroimaging Laboratory, Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, Birmingham, UK
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
12
Vidal M, Barrès V. Hearing (rivaling) lips and seeing voices: how audiovisual interactions modulate perceptual stabilization in binocular rivalry. Front Hum Neurosci 2014; 8:677. [PMID: 25237302] [PMCID: PMC4154468] [DOI: 10.3389/fnhum.2014.00677]
Abstract
In binocular rivalry (BR), sensory input remains the same, yet subjective experience fluctuates irremediably between two mutually exclusive representations. We investigated the perceptual stabilization effect of an additional sound on BR dynamics using speech stimuli known to involve robust audiovisual (AV) interactions at several cortical levels. Subjects sensitive to the McGurk effect were presented with looping videos of rivaling faces uttering /aba/ and /aga/, respectively, while synchronously hearing the voice /aba/. They continuously reported the dominant percept, either observing passively or actively trying to promote one of the faces. The few studies that have investigated the influence of information from an external modality on perceptual competition reported results that seem inconsistent at first sight. Since these differences could stem from how well the modalities matched, we addressed this by comparing two levels of AV congruence: real (/aba/ viseme) vs. illusory (/aga/ viseme producing the /ada/ McGurk fusion). First, adding the voice /aba/ stabilized both the real and the illusory congruent lip percepts. Second, real congruence of the added voice improved volitional control, whereas illusory congruence did not, suggesting a graded contribution to the top-down sensitivity control of selective attention. In conclusion, a congruent sound considerably enhanced attentional control over the selection of the perceptual outcome; however, differences between passive stabilization and active control according to AV congruence suggest that these are governed by two distinct mechanisms. Based on existing theoretical models of BR, selective attention, and AV interaction in speech perception, we provide a general interpretation of our findings.
Affiliation(s)
- Manuel Vidal
- Laboratoire de Physiologie de la Perception et de l'Action, UMR7152, Collège de France, CNRS, Paris, France
- Institut de Neurosciences de la Timone, UMR7289, Aix-Marseille Université, CNRS, Marseille, France
- Victor Barrès
- Laboratoire de Physiologie de la Perception et de l'Action, UMR7152, Collège de France, CNRS, Paris, France
13
Shestopalova L, Bőhm TM, Bendixen A, Andreou AG, Georgiou J, Garreau G, Hajdu B, Denham SL, Winkler I. Do audio-visual motion cues promote segregation of auditory streams? Front Neurosci 2014; 8:64. [PMID: 24778604] [PMCID: PMC3985028] [DOI: 10.3389/fnins.2014.00064]
Abstract
An audio-visual experiment using moving sound sources was designed to investigate whether the analysis of auditory scenes is modulated by synchronous presentation of visual information. Listeners were presented with an alternating sequence of two pure tones delivered by two separate sound sources. In different conditions, the two sound sources were either stationary or moving on random trajectories around the listener. Both the sounds and the movement trajectories were derived from recordings in which two humans were moving with loudspeakers attached to their heads. Visualized movement trajectories modeled by a computer animation were presented together with the sounds. In the main experiment, behavioral reports on sound organization were collected from young healthy volunteers. The proportion and stability of the different sound organizations were compared between the conditions in which the visualized trajectories matched the movement of the sound sources and when the two were independent of each other. The results corroborate earlier findings that separation of sound sources in space promotes segregation. However, no additional effect of auditory movement per se on the perceptual organization of sounds was obtained. Surprisingly, the presentation of movement-congruent visual cues did not strengthen the effects of spatial separation on segregating auditory streams. Our findings are consistent with the view that bistability in the auditory modality can occur independently from other modalities.
Affiliation(s)
- Lidia Shestopalova
- Pavlov Institute of Physiology, Russian Academy of Sciences, St. Petersburg, Russia
- Tamás M Bőhm
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Alexandra Bendixen
- Auditory Psychophysiology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Andreas G Andreou
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Julius Georgiou
- Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Guillaume Garreau
- Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Botond Hajdu
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary
- Susan L Denham
- School of Psychology, Cognition Institute, University of Plymouth, Plymouth, UK
- István Winkler
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary
- Department of Cognitive and Neuropsychology, Institute of Psychology, University of Szeged, Szeged, Hungary
14
Deroy O, Chen YC, Spence C. Multisensory constraints on awareness. Philos Trans R Soc Lond B Biol Sci 2014; 369:20130207. [PMID: 24639579] [DOI: 10.1098/rstb.2013.0207]
Abstract
Given that multiple senses are often stimulated at the same time, perceptual awareness is most likely to take place in multisensory situations. However, theories of awareness are based on studies and models established for a single sense (mostly vision). Here, we consider the methodological and theoretical challenges raised by taking a multisensory perspective on perceptual awareness. First, we consider how well tasks designed to study unisensory awareness perform when used in multisensory settings, stressing that studies using binocular rivalry, bistable figure perception, continuous flash suppression, the attentional blink, repetition blindness and backward masking can demonstrate multisensory influences on unisensory awareness, but fall short of tackling multisensory awareness directly. Studies interested in the latter phenomenon rely on a method of subjective contrast and can, at best, delineate conditions under which individuals report experiencing a multisensory object or two unisensory objects. As there is not a perfect match between these conditions and those in which multisensory integration and binding occur, the link between awareness and binding advocated for visual information processing needs to be revised for multisensory cases. These challenges point at the need to question the very idea of multisensory awareness.
Affiliation(s)
- Ophelia Deroy
- Centre for the Study of the Senses and Institute of Philosophy, School of Advanced Study, University of London, London, UK
15
Abstract
Resolution of perceptual ambiguity is one function of cross-modal interactions. Here we investigate whether auditory and tactile stimuli can influence binocular rivalry generated by interocular temporal conflict in human subjects. Using dichoptic visual stimuli modulating at different temporal frequencies, we added modulating sounds or vibrations congruent with one or the other visual temporal frequency. Auditory and tactile stimulation both interacted with binocular rivalry by promoting dominance of the congruent visual stimulus. This effect depended on the cross-modal modulation strength and was absent when modulation depth declined to 33%. However, when auditory and tactile stimuli that were too weak on their own to bias binocular rivalry were combined, their influence over vision was very strong, suggesting the auditory and tactile temporal signals combined to influence vision. Similarly, interleaving discrete pulses of auditory and tactile stimuli also promoted dominance of the visual stimulus congruent with the supramodal frequency. When auditory and tactile stimuli were presented at maximum strength, but in antiphase, they had no influence over vision for low temporal frequencies, a null effect again suggesting audio-tactile combination. We also found that the cross-modal interaction was frequency-sensitive at low temporal frequencies, when information about temporal phase alignment can be perceptually tracked. These results show that auditory and tactile temporal processing is functionally linked, suggesting a common neural substrate for the two sensory modalities and that at low temporal frequencies visual activity can be synchronized by a congruent cross-modal signal in a frequency-selective way, suggesting the existence of a supramodal temporal binding mechanism.
16
Conrad V, Kleiner M, Bartels A, Hartcher O'Brien J, Bülthoff HH, Noppeney U. Naturalistic stimulus structure determines the integration of audiovisual looming signals in binocular rivalry. PLoS One 2013; 8:e70710. [PMID: 24015176] [PMCID: PMC3754975] [DOI: 10.1371/journal.pone.0070710]
Abstract
Rapid integration of biologically relevant information is crucial for the survival of an organism. Most prominently, humans should be biased to attend and respond to looming stimuli that signal approaching danger (e.g., predator) and hence require rapid action. This psychophysics study used binocular rivalry to investigate the perceptual advantage of looming (relative to receding) visual signals (i.e., looming bias) and how this bias can be influenced by concurrent auditory looming/receding stimuli and the statistical structure of the auditory and visual signals. Subjects were dichoptically presented with looming/receding visual stimuli that were paired with looming or receding sounds. The visual signals conformed to two different statistical structures: (1) a 'simple' random-dot kinematogram showing a starfield and (2) a 'naturalistic' visual Shepard stimulus. Likewise, the looming/receding sound was (1) a simple amplitude- and frequency-modulated (AM-FM) tone or (2) a complex Shepard tone. Our results show that the perceptual looming bias (i.e., the increase in dominance times for looming versus receding percepts) is amplified by looming sounds, yet reduced and even converted into a receding bias by receding sounds. Moreover, the influence of looming/receding sounds on the visual looming bias depends on the statistical structure of both the visual and auditory signals. It is enhanced when audiovisual signals are Shepard stimuli. In conclusion, visual perception prioritizes processing of biologically significant looming stimuli especially when paired with looming auditory signals. Critically, these audiovisual interactions are amplified for statistically complex signals that are more naturalistic and known to engage neural processing at multiple levels of the cortical hierarchy.
Affiliation(s)
- Verena Conrad
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Mario Kleiner
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Andreas Bartels
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Werner Reichardt Centre for Integrative Neurosciences, Tübingen, Germany
- Heinrich H. Bülthoff
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul, Korea
- Uta Noppeney
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, Birmingham, United Kingdom
17
Mochida T, Kimura T, Hiroya S, Kitagawa N, Gomi H, Kondo T. Speech misperception: speaking and seeing interfere differently with hearing. PLoS One 2013; 8:e68619. [PMID: 23844227] [PMCID: PMC3701087] [DOI: 10.1371/journal.pone.0068619]
Abstract
Speech perception is thought to be linked to speech motor production. This linkage is considered to mediate multimodal aspects of speech perception, such as audio-visual and audio-tactile integration. However, direct coupling between articulatory movement and auditory perception has been little studied. The present study reveals a clear dissociation between the effects of a listener's own speech action and the effects of viewing another's speech movements on the perception of auditory phonemes. We assessed the intelligibility of the syllables [pa], [ta], and [ka] when listeners silently and simultaneously articulated syllables that were congruent/incongruent with the syllables they heard. The intelligibility was compared with that in a condition where the listeners simultaneously watched another's mouth producing congruent/incongruent syllables, but did not articulate. The intelligibility of [ta] and [ka] was degraded by articulating [ka] and [ta], respectively, which are associated with the same primary articulator (the tongue) as the heard syllables, but it was not affected by articulating [pa], which is associated with a different primary articulator (the lips). In contrast, the intelligibility of [ta] and [ka] was degraded by watching the production of [pa]. These results indicate that the articulatory-induced distortion of speech perception occurs in an articulator-specific manner, while visually induced distortion does not. The articulator-specific nature of the auditory-motor interaction in speech perception suggests that speech motor processing directly contributes to our ability to hear speech.
Affiliation(s)
- Takemi Mochida
- NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Atsugi, Japan.
18
Touch interacts with vision during binocular rivalry with a tight orientation tuning. PLoS One 2013; 8:e58754. [PMID: 23472219] [PMCID: PMC3589364] [DOI: 10.1371/journal.pone.0058754]
Abstract
Multisensory integration is a common feature of the mammalian brain that allows it to deal more efficiently with the ambiguity of sensory input by combining complementary signals from several sensory sources. Growing evidence suggests that multisensory interactions can occur as early as primary sensory cortices. Here we present incompatible visual signals (orthogonal gratings) to each eye to create visual competition between monocular inputs in primary visual cortex, where binocular combination would normally take place. The incompatibility prevents binocular fusion and triggers an ambiguous perceptual response in which the two images are perceived one at a time in an irregular alternation. One key function of multisensory integration is to minimize perceptual ambiguity by exploiting cross-sensory congruence. We show that a haptic signal matching one of the visual alternatives helps disambiguate visual perception during binocular rivalry, both by prolonging the dominance period of the congruent visual stimulus and by shortening its suppression period. Importantly, this interaction is strictly tuned for orientation, with a mismatch as small as 7.5° between visual and haptic orientations sufficient to annul the interaction. These results support two important conclusions: first, that vision and touch interact at early levels of visual processing, where interocular conflicts are first detected and orientation tunings are narrow; and second, that haptic input can influence visual signals outside of visual awareness, bringing a stimulus made invisible by binocular rivalry suppression back to awareness sooner than would occur without congruent haptic input.
19
Alsius A, Munhall KG. Detection of Audiovisual Speech Correspondences Without Visual Awareness. Psychol Sci 2013; 24:423-431. [DOI: 10.1177/0956797612457378]
Abstract
Mounting physiological and behavioral evidence has shown that the detectability of a visual stimulus can be enhanced by a simultaneously presented sound. The mechanisms underlying these cross-sensory effects, however, remain largely unknown. Using continuous flash suppression (CFS), we rendered a complex, dynamic visual stimulus (i.e., a talking face) consciously invisible to participants. We presented the visual stimulus together with a suprathreshold auditory stimulus (i.e., a voice speaking a sentence) that either matched or mismatched the lip movements of the talking face. We compared how long it took for the talking face to overcome interocular suppression and become visible to participants in the matched and mismatched conditions. Our results showed that the detection of the face was facilitated by the presentation of a matching auditory sentence, in comparison with the presentation of a mismatching sentence. This finding indicates that the registration of audiovisual correspondences occurs at an early stage of processing, even when the visual information is blocked from conscious awareness.
20
Schwartz JL, Grimault N, Hupé JM, Moore BCJ, Pressnitzer D. Multistability in perception: binding sensory modalities, an overview. Philos Trans R Soc Lond B Biol Sci 2012; 367:896-905. [PMID: 22371612] [DOI: 10.1098/rstb.2011.0254]
Abstract
This special issue presents research concerning multistable perception in different sensory modalities. Multistability occurs when a single physical stimulus produces alternations between different subjective percepts. Multistability was first described for vision, where it occurs, for example, when different stimuli are presented to the two eyes or for certain ambiguous figures. It has since been described for other sensory modalities, including audition, touch and olfaction. The key features of multistability are: (i) stimuli have more than one plausible perceptual organization; (ii) these organizations are not compatible with each other. We argue here that most if not all cases of multistability are based on competition in selecting and binding stimulus information. Binding refers to the process whereby the different attributes of objects in the environment, as represented in the sensory array, are bound together within our perceptual systems, to provide a coherent interpretation of the world around us. We argue that multistability can be used as a method for studying binding processes within and across sensory modalities. We emphasize this theme while presenting an outline of the papers in this issue. We end with some thoughts about open directions and avenues for further research.
Affiliation(s)
- Jean-Luc Schwartz
- Gipsa-lab, UMR 5216 CNRS, Grenoble INP, Université Joseph Fourier, Université Stendhal, Grenoble, France
21
Person identification through faces and voices: An ERP study. Brain Res 2011; 1407:13-26. [DOI: 10.1016/j.brainres.2011.03.029]
22
Buchan JN, Munhall KG. The Influence of Selective Attention to Auditory and Visual Speech on the Integration of Audiovisual Speech Information. Perception 2011; 40:1164-1182. [DOI: 10.1068/p6939]
Abstract
Conflicting visual speech information can influence the perception of acoustic speech, causing an illusory percept of a sound not present in the actual acoustic speech (the McGurk effect). We examined whether participants can voluntarily selectively attend to either the auditory or visual modality by instructing participants to pay attention to the information in one modality and to ignore competing information from the other modality. We also examined how performance under these instructions was affected by weakening the influence of the visual information by manipulating the temporal offset between the audio and video channels (experiment 1), and the spatial frequency information present in the video (experiment 2). Gaze behaviour was also monitored to examine whether attentional instructions influenced the gathering of visual information. While task instructions did have an influence on the observed integration of auditory and visual speech information, participants were unable to completely ignore conflicting information, particularly information from the visual stream. Manipulating temporal offset had a more pronounced interaction with task instructions than manipulating the amount of visual information. Participants' gaze behaviour suggests that the attended modality influences the gathering of visual information in audiovisual speech perception.
Affiliation(s)
- Kevin G Munhall
- Department of Otolaryngology, Queen's University, Kingston, Ontario, Canada