1. Nguyen T, Lagacé-Cusiac R, Everling JC, Henry MJ, Grahn JA. Audiovisual integration of rhythm in musicians and dancers. Atten Percept Psychophys 2024; 86:1400-1416. [PMID: 38557941] [DOI: 10.3758/s13414-024-02874-x]
Abstract
Music training is associated with better beat processing in the auditory modality. However, it is unknown how rhythmic training that emphasizes visual rhythms, such as dance training, might affect beat processing, or whether training effects in general are modality specific. Here we examined how music and dance training interacted with modality during audiovisual integration and synchronization to auditory and visual isochronous sequences. In two experiments, musicians, dancers, and controls completed an audiovisual integration task and an audiovisual target-distractor synchronization task using dynamic visual stimuli (a bouncing figure). The groups performed similarly on the audiovisual integration tasks (Experiments 1 and 2). However, in the finger-tapping synchronization task (Experiment 1), musicians were more influenced by auditory distractors when synchronizing to visual sequences, while dancers were more influenced by visual distractors when synchronizing to auditory sequences. When participants synchronized with whole-body movements instead of finger tapping (Experiment 2), all groups were more influenced by the visual distractor than the auditory distractor. Taken together, these results highlight how training is associated with audiovisual processing, and how different types of visual rhythmic stimuli and different movements alter beat perception and production outcome measures. Implications for the modality appropriateness hypothesis are discussed.
Affiliation(s)
- Tram Nguyen
- Brain and Mind Institute and Department of Psychology, University of Western Ontario, London, Ontario, Canada
- Rebekka Lagacé-Cusiac
- Brain and Mind Institute and Department of Psychology, University of Western Ontario, London, Ontario, Canada
- J Celina Everling
- Brain and Mind Institute and Department of Psychology, University of Western Ontario, London, Ontario, Canada
- Molly J Henry
- Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany
- Department of Psychology, Toronto Metropolitan University, Toronto, Ontario, Canada
- Jessica A Grahn
- Brain and Mind Institute and Department of Psychology, University of Western Ontario, London, Ontario, Canada
2. Cornelio P, Velasco C, Obrist M. Multisensory Integration as per Technological Advances: A Review. Front Neurosci 2021; 15:652611. [PMID: 34239410] [PMCID: PMC8257956] [DOI: 10.3389/fnins.2021.652611]
Abstract
Multisensory integration research has allowed us to better understand how humans integrate sensory information to produce a unitary experience of the external world. However, this field is often challenged by the limited ability to deliver and control sensory stimuli, especially when going beyond audio-visual events and outside laboratory settings. In this review, we examine the scope and challenges of new technology in the study of multisensory integration in a world that is increasingly characterized by a fusion of physical and digital/virtual events. We discuss multisensory integration research through the lens of novel multisensory technologies and, thus, bring research in human-computer interaction, experimental psychology, and neuroscience closer together. Today, for instance, displays have become volumetric so that visual content is no longer limited to 2D screens, new haptic devices enable tactile stimulation without physical contact, olfactory interfaces provide users with smells precisely synchronized with events in virtual environments, and novel gustatory interfaces enable taste perception through levitating stimuli. These technological advances offer new ways to control and deliver sensory stimulation for multisensory integration research beyond traditional laboratory settings, and they open up new ways to experiment on naturally occurring events in everyday life. Our review summarizes these multisensory technologies and discusses initial insights, building a bridge between the disciplines in order to advance the study of multisensory integration.
Affiliation(s)
- Patricia Cornelio
- Department of Computer Science, University College London, London, United Kingdom
- Carlos Velasco
- Centre for Multisensory Marketing, Department of Marketing, BI Norwegian Business School, Oslo, Norway
- Marianna Obrist
- Department of Computer Science, University College London, London, United Kingdom
3. Keni RR, Radhakrishnan A. Using McGurk effect to detect speech-perceptional abnormalities in refractory epilepsy. Epilepsy Behav 2021; 114:107600. [PMID: 33248941] [DOI: 10.1016/j.yebeh.2020.107600]
Abstract
BACKGROUND: The McGurk effect is a perceptual phenomenon that demonstrates an interaction between hearing and vision in speech perception. A wide range of neuropsychological deficits that affect multimodal integration in speech perception have been described in people with long-standing epilepsy, making patients with refractory epilepsy ideal for testing the McGurk effect. MATERIALS AND METHODS: We studied the McGurk effect in 50 patients diagnosed with medically refractory left or right hemispheric epilepsy based on clinical, radiological, and electrophysiological data. RESULTS: The McGurk effect was better perceived (p = 0.006) in patients with left hemispheric epilepsy (n = 12, 71%) than in those with right hemispheric epilepsy (n = 5, 29%). Other factors that compromised perception of the McGurk effect were impairments in visual memory (p = 0.041), facial emotion recognition (p = 0.001), and lip-reading (p = 0.006). Perception of the McGurk effect was significantly reduced (p = 0.006) when the epilepsy duration was 10 years or longer. CONCLUSION: The McGurk effect can be used in patients with refractory epilepsy to detect subtle abnormalities in speech perception before significant, irreversible speech and language dysfunction becomes evident.
Affiliation(s)
- Ravish R Keni
- R. Madhavan Nayar Centre for Comprehensive Epilepsy Care, Department of Neurology, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Kerala, India
- Ashalatha Radhakrishnan
- R. Madhavan Nayar Centre for Comprehensive Epilepsy Care, Department of Neurology, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Kerala, India
4. Thézé R, Gadiri MA, Albert L, Provost A, Giraud AL, Mégevand P. Animated virtual characters to explore audio-visual speech in controlled and naturalistic environments. Sci Rep 2020; 10:15540. [PMID: 32968127] [PMCID: PMC7511320] [DOI: 10.1038/s41598-020-72375-y]
Abstract
Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has widely been applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and the quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized with computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e., /v/) with a bilabial occlusive phoneme (i.e., /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results conclusively demonstrate that computer-generated speech stimuli are a judicious choice, and that they can supplement natural speech with higher control over stimulus timing and content.
Affiliation(s)
- Raphaël Thézé
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202 Geneva, Switzerland
- Mehdi Ali Gadiri
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202 Geneva, Switzerland
- Louis Albert
- Human Neuroscience Platform, Fondation Campus Biotech Geneva, Geneva, Switzerland
- Antoine Provost
- Human Neuroscience Platform, Fondation Campus Biotech Geneva, Geneva, Switzerland
- Anne-Lise Giraud
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202 Geneva, Switzerland
- Pierre Mégevand
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202 Geneva, Switzerland
- Division of Neurology, Geneva University Hospitals, Geneva, Switzerland
5. Feng G, Zhou B, Zhou W, Beauchamp MS, Magnotti JF. A Laboratory Study of the McGurk Effect in 324 Monozygotic and Dizygotic Twins. Front Neurosci 2019; 13:1029. [PMID: 31636529] [PMCID: PMC6787151] [DOI: 10.3389/fnins.2019.01029]
Abstract
Multisensory integration of information from the talker's voice and the talker's mouth facilitates human speech perception. A popular assay of audiovisual integration is the McGurk effect, an illusion in which incongruent visual speech information categorically changes the percept of auditory speech. There is substantial interindividual variability in susceptibility to the McGurk effect. To better understand possible sources of this variability, we examined the McGurk effect in 324 native Mandarin speakers, consisting of 73 monozygotic (MZ) and 89 dizygotic (DZ) twin pairs. When tested with 9 different McGurk stimuli, some participants never perceived the illusion and others always perceived it. Within participants, perception was similar across time (r = 0.55 at a 2-year retest in 150 participants) suggesting that McGurk susceptibility reflects a stable trait rather than short-term perceptual fluctuations. To examine the effects of shared genetics and prenatal environment, we compared McGurk susceptibility between MZ and DZ twins. Both twin types had significantly greater correlation than unrelated pairs (r = 0.28 for MZ twins and r = 0.21 for DZ twins) suggesting that the genes and environmental factors shared by twins contribute to individual differences in multisensory speech perception. Conversely, the existence of substantial differences within twin pairs (even MZ co-twins) and the overall low percentage of explained variance (5.5%) argues against a deterministic view of individual differences in multisensory integration.
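As an illustration only (not an analysis reported in the abstract), the reported twin correlations can be plugged into the classical Falconer decomposition, where $h^2$ is the additive genetic component, $c^2$ the shared-environment component, and $e^2$ the nonshared component:

$h^2 = 2(r_{MZ} - r_{DZ}) = 2(0.28 - 0.21) = 0.14$

$c^2 = 2r_{DZ} - r_{MZ} = 0.42 - 0.28 = 0.14$

$e^2 = 1 - r_{MZ} = 1 - 0.28 = 0.72$

On this back-of-the-envelope reading, most of the variance in McGurk susceptibility is attributable to nonshared factors, consistent with the authors' argument against a deterministic view of individual differences.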
Affiliation(s)
- Guo Feng
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Psychological Research and Counseling Center, Southwest Jiaotong University, Chengdu, China
- Bin Zhou
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Wen Zhou
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Michael S. Beauchamp
- Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, United States
- John F. Magnotti
- Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, United States
6. "Paying" attention to audiovisual speech: Do incongruent stimuli incur greater costs? Atten Percept Psychophys 2019; 81:1743-1756. [PMID: 31197661] [DOI: 10.3758/s13414-019-01772-x]
Abstract
The McGurk effect is a multisensory phenomenon in which discrepant auditory and visual speech signals typically result in an illusory percept. McGurk stimuli are often used in studies assessing the attentional requirements of audiovisual integration, but no study has directly compared the costs associated with integrating congruent versus incongruent audiovisual speech. Some evidence suggests that the McGurk effect may not be representative of naturalistic audiovisual speech processing: susceptibility to the McGurk effect is not associated with the ability to derive benefit from the addition of the visual signal, and distinct cortical regions are recruited when processing congruent versus incongruent speech. In two experiments, one using response times to identify congruent and incongruent syllables and one using a dual-task paradigm, we assessed whether congruent and incongruent audiovisual speech incur different attentional costs. We demonstrated that response times to both the speech task (Experiment 1) and a secondary vibrotactile task (Experiment 2) were indistinguishable for congruent compared to incongruent syllables, but McGurk fusions were responded to more quickly than McGurk non-fusions. These results suggest that despite documented differences in how congruent and incongruent stimuli are processed, they do not appear to differ in terms of processing time or effort, at least in the open-set speech task used here. However, responses that result in McGurk fusions are processed more quickly than those that result in non-fusions, though attentional cost is comparable for the two response types.
7. Brown VA, Hedayati M, Zanger A, Mayn S, Ray L, Dillman-Hasso N, Strand JF. What accounts for individual differences in susceptibility to the McGurk effect? PLoS One 2018; 13:e0207160. [PMID: 30418995] [PMCID: PMC6231656] [DOI: 10.1371/journal.pone.0207160]
Abstract
The McGurk effect is a classic audiovisual speech illusion in which discrepant auditory and visual syllables can lead to a fused percept (e.g., an auditory /bɑ/ paired with a visual /gɑ/ often leads to the perception of /dɑ/). The McGurk effect is robust and easily replicated in pooled group data, but there is tremendous variability in the extent to which individual participants are susceptible to it. In some studies, the rate at which individuals report fusion responses ranges from 0% to 100%. Despite its widespread use in the audiovisual speech perception literature, the roots of the wide variability in McGurk susceptibility are largely unknown. This study evaluated whether several perceptual and cognitive traits are related to McGurk susceptibility through correlational analyses and mixed effects modeling. We found that an individual's susceptibility to the McGurk effect was related to their ability to extract place of articulation information from the visual signal (i.e., a more fine-grained analysis of lipreading ability), but not to scores on tasks measuring attentional control, processing speed, working memory capacity, or auditory perceptual gradiency. These results provide support for the claim that a small amount of the variability in susceptibility to the McGurk effect is attributable to lipreading skill. In contrast, cognitive and perceptual abilities that are commonly used predictors in individual differences studies do not appear to underlie susceptibility to the McGurk effect.
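To make the analysis logic concrete, here is a minimal sketch (in Python; not the authors' code) of the first step of such an individual-differences analysis: computing each participant's fusion rate and correlating it with a lipreading score. All data and column names are hypothetical.

import pandas as pd
from scipy.stats import pearsonr

# Hypothetical trial-level data: one row per McGurk trial,
# fused = 1 when the participant reported the fused percept.
trials = pd.DataFrame({
    "participant": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "fused":       [1, 0, 1, 0, 0, 0, 1, 1, 1],
})
# Hypothetical lipreading scores (proportion correct) per participant.
lipreading = pd.Series({1: 0.62, 2: 0.35, 3: 0.80})

# McGurk susceptibility = proportion of fusion responses per participant.
fusion_rate = trials.groupby("participant")["fused"].mean()

# Correlate susceptibility with lipreading ability.
r, p = pearsonr(fusion_rate, lipreading[fusion_rate.index])
print(f"r = {r:.2f}, p = {p:.3f}")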
Affiliation(s)
- Violet A. Brown
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
- Maryam Hedayati
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
- Annie Zanger
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
- Sasha Mayn
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
- Lucia Ray
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
- Naseem Dillman-Hasso
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
- Julia F. Strand
- Department of Psychology, Carleton College, Northfield, Minnesota, United States of America
8. Proverbio AM, Raso G, Zani A. Electrophysiological Indexes of Incongruent Audiovisual Phonemic Processing: Unraveling the McGurk Effect. Neuroscience 2018; 385:215-226. [PMID: 29932985] [DOI: 10.1016/j.neuroscience.2018.06.021]
Abstract
In this study, the timing of electromagnetic signals recorded during incongruent and congruent audiovisual (AV) stimulation was examined in 14 healthy Italian volunteers. In a previous study (Proverbio et al., 2016) we investigated the McGurk effect in the Italian language and determined which visual and auditory inputs provided the most compelling illusory effects (e.g., bilabial phonemes presented acoustically and paired with non-labials, especially alveolar-nasal and velar-occlusive phonemes). In the present study, EEG was recorded from 128 scalp sites while participants observed a female and a male actor uttering 288 syllables (each lasting approximately 600 ms), selected on the basis of the previous investigation, and responded to rare targets (/re/, /ri/, /ro/, /ru/). In half of the cases the AV information was incongruent, except for targets, which were always congruent. A pMMN (phonological Mismatch Negativity) to incongruent AV stimuli was identified 500 ms after voice onset time. This automatic response indexed the detection of an incongruity between the labial and phonetic information. SwLORETA (standardized weighted Low-Resolution Electromagnetic Tomography) analysis applied to the incongruent-minus-congruent difference voltage in the same time window revealed that the strongest sources of this activity were the right superior temporal (STG) and superior frontal gyri, which supports their involvement in AV integration.
Affiliation(s)
- Alice Mado Proverbio
- Neuro-Mi Center for Neuroscience, Dept. of Psychology, University of Milano-Bicocca, Italy
- Giulia Raso
- Neuro-Mi Center for Neuroscience, Dept. of Psychology, University of Milano-Bicocca, Italy
9. Alsius A, Paré M, Munhall KG. Forty Years After Hearing Lips and Seeing Voices: the McGurk Effect Revisited. Multisens Res 2018; 31:111-144. [PMID: 31264597] [DOI: 10.1163/22134808-00002565]
Abstract
Since its discovery 40 years ago, the McGurk illusion has usually been cited as a prototypical case of multisensory binding in humans, and it has been extensively used in speech perception studies as a proxy measure for audiovisual integration mechanisms. Despite the well-established practice of using the McGurk illusion as a tool for studying the mechanisms underlying audiovisual speech integration, the magnitude of the illusion varies enormously across studies. Furthermore, the processing of McGurk stimuli differs from congruent audiovisual processing at both the phenomenological and neural levels. This calls into question the suitability of the illusion as a tool to quantify the necessary and sufficient conditions under which audiovisual integration occurs in natural conditions. In this paper, we review some of the practical and theoretical issues related to the use of the McGurk illusion as an experimental paradigm. We believe that, without a richer understanding of the mechanisms involved in the processing of the McGurk effect, experimenters should be cautious when generalizing data generated by McGurk stimuli to matching audiovisual speech events.
Affiliation(s)
- Agnès Alsius
- Psychology Department, Queen's University, Humphrey Hall, 62 Arch St., Kingston, Ontario, K7L 3N6, Canada
- Martin Paré
- Psychology Department, Queen's University, Humphrey Hall, 62 Arch St., Kingston, Ontario, K7L 3N6, Canada
- Kevin G Munhall
- Psychology Department, Queen's University, Humphrey Hall, 62 Arch St., Kingston, Ontario, K7L 3N6, Canada
10. Ross LA, Del Bene VA, Molholm S, Woo YJ, Andrade GN, Abrahams BS, Foxe JJ. Common variation in the autism risk gene CNTNAP2, brain structural connectivity and multisensory speech integration. Brain Lang 2017; 174:50-60. [PMID: 28738218] [DOI: 10.1016/j.bandl.2017.07.005]
Abstract
Three lines of evidence motivated this study. 1) CNTNAP2 variation is associated with autism risk and speech-language development. 2) CNTNAP2 variation is associated with differences in white matter (WM) tracts comprising the speech-language circuitry. 3) Children with autism show impairment in multisensory speech perception. Here, we asked whether an autism risk-associated CNTNAP2 single nucleotide polymorphism in neurotypical adults was associated with multisensory speech perception performance, and whether such a genotype-phenotype association was mediated through white matter tract integrity in speech-language circuitry. The risk genotype at rs7794745 was associated with decreased benefit from visual speech and with lower fractional anisotropy (FA) in several WM tracts (right precentral gyrus, left anterior corona radiata, right retrolenticular internal capsule). These structural connectivity differences were found to mediate the effect of genotype on audiovisual speech perception, shedding light on possible pathogenic pathways in autism and on biological sources of inter-individual variation in audiovisual speech processing in neurotypicals.
Affiliation(s)
- Lars A Ross
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Victor A Del Bene
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Ferkauf Graduate School of Psychology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Sophie Molholm
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Young Jae Woo
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Gizely N Andrade
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Brett S Abrahams
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- John J Foxe
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
11. Festa EK, Katz AP, Ott BR, Tremont G, Heindel WC. Dissociable Effects of Aging and Mild Cognitive Impairment on Bottom-Up Audiovisual Integration. J Alzheimers Dis 2017; 59:155-167. [DOI: 10.3233/jad-161062]
Affiliation(s)
- Elena K. Festa
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Andrew P. Katz
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
- Brian R. Ott
- Department of Neurology, Alpert Medical School, Brown University, Providence, RI, USA
- Department of Neurology, Rhode Island Hospital, Providence, RI, USA
- Geoffrey Tremont
- Department of Psychiatry and Human Behavior, Alpert Medical School, Brown University, Providence, RI, USA
- Department of Psychiatry, Rhode Island Hospital, Providence, RI, USA
- William C. Heindel
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA
12. Scarbel L, Beautemps D, Schwartz JL, Sato M. Sensory-motor relationships in speech production in post-lingually deaf cochlear-implanted adults and normal-hearing seniors: Evidence from phonetic convergence and speech imitation. Neuropsychologia 2017; 101:39-46. [PMID: 28483485] [DOI: 10.1016/j.neuropsychologia.2017.05.005]
Abstract
Speech communication can be viewed as an interactive process involving a functional coupling between sensory and motor systems. One striking example comes from phonetic convergence, whereby speakers automatically tend to mimic their interlocutor's speech during communicative interaction. The goal of this study was to investigate sensory-motor linkage in speech production in post-lingually deaf cochlear-implanted participants and normal-hearing elderly adults through phonetic convergence and imitation. To this aim, two vowel production tasks, with or without instruction to imitate an acoustic vowel, were given to three groups: young adults with normal hearing, elderly adults with normal hearing, and post-lingually deaf cochlear-implanted patients. The deviation of each participant's f0 from their own mean f0 was measured to evaluate the ability to converge toward each acoustic target. Results showed that cochlear-implanted participants were able to converge toward an acoustic target, both intentionally and unintentionally, albeit to a lesser degree than young and elderly participants with normal hearing. By providing evidence for phonetic convergence and speech imitation, these results suggest that, as in young adults, perceptuo-motor relationships are efficient in elderly adults with normal hearing, and that cochlear-implanted adults recovered significant perceptuo-motor abilities following cochlear implantation.
Affiliation(s)
- Lucie Scarbel
- GIPSA-LAB, Département Parole & Cognition, CNRS & Grenoble Université, Grenoble, France
- Denis Beautemps
- GIPSA-LAB, Département Parole & Cognition, CNRS & Grenoble Université, Grenoble, France
- Jean-Luc Schwartz
- GIPSA-LAB, Département Parole & Cognition, CNRS & Grenoble Université, Grenoble, France
- Marc Sato
- Laboratoire Parole & Langage, CNRS & Aix-Marseille Université, Aix-en-Provence, France
13. Skilled musicians are not subject to the McGurk effect. Sci Rep 2016; 6:30423. [PMID: 27453363] [PMCID: PMC4958963] [DOI: 10.1038/srep30423]
Abstract
The McGurk effect is a compelling illusion in which humans auditorily perceive mismatched audiovisual speech as a completely different syllable. In this study, evidence is provided that professional musicians are not subject to this illusion, possibly because of their finer auditory or attentional abilities. Eighty healthy, age-matched graduate students volunteered for the study; 40 were musicians from the Luca Marenzio Conservatory of Music in Brescia with 8–13 years of academic musical training. The phonemes /la/, /da/, /ta/, /ga/, /ka/, /na/, /ba/, and /pa/ were presented to participants in audiovisual congruent and incongruent conditions, or in unimodal (visual-only or auditory-only) conditions, while they were engaged in syllable recognition tasks. Overall, musicians showed no significant McGurk effect for any of the phonemes. Controls showed a marked McGurk effect for several phonemes (including alveolar-nasal, velar-occlusive, and bilabial ones). The results indicate that early and intensive musical training might affect the way the auditory cortex processes phonetic information.
14. Poliva O. From Mimicry to Language: A Neuroanatomically Based Evolutionary Model of the Emergence of Vocal Language. Front Neurosci 2016; 10:307. [PMID: 27445676] [PMCID: PMC4928493] [DOI: 10.3389/fnins.2016.00307]
Abstract
The auditory cortex communicates with the frontal lobe via the middle temporal gyrus (auditory ventral stream; AVS) or the inferior parietal lobule (auditory dorsal stream; ADS). Whereas the AVS is ascribed only with sound recognition, the ADS is ascribed with sound localization, voice detection, prosodic perception/production, lip-speech integration, phoneme discrimination, articulation, repetition, phonological long-term memory and working memory. Previously, I interpreted the juxtaposition of sound localization, voice detection, audio-visual integration and prosodic analysis, as evidence that the behavioral precursor to human speech is the exchange of contact calls in non-human primates. Herein, I interpret the remaining ADS functions as evidence of additional stages in language evolution. According to this model, the role of the ADS in vocal control enabled early Homo (Hominans) to name objects using monosyllabic calls, and allowed children to learn their parents' calls by imitating their lip movements. Initially, the calls were forgotten quickly but gradually were remembered for longer periods. Once the representations of the calls became permanent, mimicry was limited to infancy, and older individuals encoded in the ADS a lexicon for the names of objects (phonological lexicon). Consequently, sound recognition in the AVS was sufficient for activating the phonological representations in the ADS and mimicry became independent of lip-reading. Later, by developing inhibitory connections between acoustic-syllabic representations in the AVS and phonological representations of subsequent syllables in the ADS, Hominans became capable of concatenating the monosyllabic calls for repeating polysyllabic words (i.e., developed working memory). Finally, due to strengthening of connections between phonological representations in the ADS, Hominans became capable of encoding several syllables as a single representation (chunking). Consequently, Hominans began vocalizing and mimicking/rehearsing lists of words (sentences).
15. Rosenblum LD, Dias JW, Dorsi J. The supramodal brain: implications for auditory perception. J Cogn Psychol 2016. [DOI: 10.1080/20445911.2016.1181691]
16. Bourguignon NJ, Baum SR, Shiller DM. Please say what this word is: Vowel-extrinsic normalization in the sensorimotor control of speech. J Exp Psychol Hum Percept Perform 2016; 42:1039-47. [PMID: 26820250] [DOI: 10.1037/xhp0000209]
Abstract
The extent to which the adaptive nature of speech perception influences the acoustic targets underlying speech production is not well understood. For example, listeners can rapidly accommodate to talker-dependent phonetic properties, a process known as vowel-extrinsic normalization, without altering their speech output. Recent evidence, however, shows that reinforcement-based learning in vowel perception alters the processing of speech auditory feedback, impacting sensorimotor control during vowel production. This suggests that more automatic and ubiquitous forms of perceptual plasticity, such as those characterizing perceptual talker normalization, may also impact the sensorimotor control of speech. To test this hypothesis, we set out to examine the possible effects of vowel-extrinsic normalization on experimental subjects' interpretation of their own speech outcomes. By combining a well-known manipulation of vowel-extrinsic normalization with speech auditory-motor adaptation, we show that exposure to different vowel spectral properties subsequently alters auditory feedback processing during speech production, thereby influencing speech motor adaptation. These findings extend the scope of perceptual normalization processes to include auditory feedback and support the idea that naturally occurring adaptations found in speech perception impact speech production.
Affiliation(s)
- Nicolas J Bourguignon
- École d'orthophonie et d'audiologie, University of Montreal, Centre de recherche, CHU Sainte-Justine
- Shari R Baum
- School of Communication Sciences and Disorders, McGill University
- Douglas M Shiller
- École d'orthophonie et d'audiologie, University of Montreal, Centre de recherche, CHU Sainte-Justine
17. Woynaroski TG, Kwakye LD, Foss-Feig JH, Stevenson RA, Stone WL, Wallace MT. Multisensory speech perception in children with autism spectrum disorders. J Autism Dev Disord 2013; 43:2891-902. [PMID: 23624833] [DOI: 10.1007/s10803-013-1836-5]
Abstract
This study examined unisensory and multisensory speech perception in 8-17 year old children with autism spectrum disorders (ASD) and typically developing controls matched on chronological age, sex, and IQ. Consonant-vowel syllables were presented in visual only, auditory only, matched audiovisual, and mismatched audiovisual ("McGurk") conditions. Participants with ASD displayed deficits in visual only and matched audiovisual speech perception. Additionally, children with ASD reported a visual influence on heard speech in response to mismatched audiovisual syllables over a wider window of time relative to controls. Correlational analyses revealed associations between multisensory speech perception, communicative characteristics, and responses to sensory stimuli in ASD. Results suggest atypical speech perception is linked to broader behavioral characteristics of ASD.
Affiliation(s)
- Tiffany G Woynaroski
- Department of Hearing and Speech Sciences, Vanderbilt University, 1211 Medical Center Drive, Nashville, TN, 37232, USA
18. Sato M, Grabski K, Garnier M, Granjon L, Schwartz JL, Nguyen N. Converging toward a common speech code: imitative and perceptuo-motor recalibration processes in speech production. Front Psychol 2013; 4:422. [PMID: 23874316] [PMCID: PMC3708162] [DOI: 10.3389/fpsyg.2013.00422]
Abstract
Auditory and somatosensory systems play a key role in speech motor control. In the act of speaking, segmental speech movements are programmed to reach phonemic sensory goals, which in turn are used to estimate actual sensory feedback in order to further control production. The adult's tendency to automatically imitate a number of acoustic-phonetic characteristics in another speaker's speech, however, suggests that speech production relies not only on the intended phonemic sensory goals and actual sensory feedback but also on the processing of external speech inputs. These online adaptive changes in speech production, or phonetic convergence effects, are thought to facilitate conversational exchange by contributing to setting a common perceptuo-motor ground between the speaker and the listener. In line with previous studies on phonetic convergence, we here demonstrate, in a non-interactive situation of communication, online unintentional and voluntary imitative changes in relevant acoustic features of acoustic vowel targets (fundamental and first formant frequencies) during speech production and imitation. In addition, perceptuo-motor recalibration processes, or after-effects, occurred not only after vowel production and imitation but also after auditory categorization of the acoustic vowel targets. Altogether, these findings demonstrate adaptive plasticity of phonemic sensory-motor goals and suggest that, apart from sensory-motor knowledge, speech production continuously draws on perceptual learning from the external speech environment.
Affiliation(s)
- Marc Sato
- Grenoble Images Parole Signal Automatique-LAB, Département Parole and Cognition, Centre National de la Recherche Scientifique, Grenoble Université, Grenoble, France
- Krystyna Grabski
- Centre for Research on Brain, Language and Music, McGill University, Montreal, QC, Canada
- Maëva Garnier
- Grenoble Images Parole Signal Automatique-LAB, Département Parole and Cognition, Centre National de la Recherche Scientifique, Grenoble Université, Grenoble, France
- Lionel Granjon
- Laboratoire Psychologie de la Perception, Centre National de la Recherche Scientifique, École Normale Supérieure, Paris, France
- Jean-Luc Schwartz
- Grenoble Images Parole Signal Automatique-LAB, Département Parole and Cognition, Centre National de la Recherche Scientifique, Grenoble Université, Grenoble, France
- Noël Nguyen
- Laboratoire Parole and Langage, Centre National de la Recherche Scientifique, Aix-Marseille Université, Aix-en-Provence, France
19. Ten Oever S, Sack AT, Wheat KL, Bien N, van Atteveldt N. Audio-visual onset differences are used to determine syllable identity for ambiguous audio-visual stimulus pairs. Front Psychol 2013; 4:331. [PMID: 23805110] [PMCID: PMC3693065] [DOI: 10.3389/fpsyg.2013.00331]
Abstract
Content and temporal cues have been shown to interact during audio-visual (AV) speech identification. Typically, the most reliable unimodal cue is used more strongly to identify specific speech features; however, visual cues are only used if the AV stimuli are presented within a certain temporal window of integration (TWI). This suggests that temporal cues denote whether unimodal stimuli belong together, that is, whether they should be integrated. It is not known whether temporal cues also provide information about the identity of a syllable. Since spoken syllables have naturally varying AV onset asynchronies, we hypothesized that for suboptimal AV cues presented within the TWI, information about the natural AV onset differences can aid in speech identification. To test this, we presented low-intensity auditory syllables concurrently with visual speech signals, and varied the stimulus onset asynchrony (SOA) of the AV pair, while participants were instructed to identify the auditory syllables. We found that specific speech features (e.g., voicing) were identified by relying primarily on one modality (e.g., auditory). Additionally, we observed a wide window in which visual information influenced auditory perception, which seemed even wider for congruent stimulus pairs. Finally, we found a specific response pattern across the SOA range for syllables that were not reliably identified by the unimodal cues, which we explained as the result of the use of natural onset differences between AV speech signals. This indicates that temporal cues not only provide information about the temporal integration of AV stimuli, but additionally convey information about the identity of AV pairs. These results provide a detailed behavioral basis for further neuroimaging and stimulation studies to unravel the neurofunctional mechanisms of the audio-visual-temporal interplay within speech perception.
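As a sketch of how such a temporal window of integration can be characterized (illustrative only, with made-up numbers; the authors' actual analysis may differ), one can fit a Gaussian to the proportion of trials on which vision influenced the auditory report as a function of SOA:

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical visual-influence rates across SOAs (negative = visual lead).
soa_ms = np.array([-300.0, -200.0, -100.0, 0.0, 100.0, 200.0, 300.0])
p_influence = np.array([0.15, 0.35, 0.60, 0.72, 0.55, 0.30, 0.12])

def gaussian(soa, amp, center, width, base):
    """Gaussian window: peak visual influence at `center`, spread `width`."""
    return base + amp * np.exp(-((soa - center) ** 2) / (2.0 * width ** 2))

params, _ = curve_fit(gaussian, soa_ms, p_influence, p0=[0.6, 0.0, 100.0, 0.1])
amp, center, width, base = params
print(f"window centered at {center:.0f} ms with SD {width:.0f} ms")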
Affiliation(s)
- Sanne Ten Oever
- Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
20. Silent articulation modulates auditory and audiovisual speech perception. Exp Brain Res 2013; 227:275-88. [DOI: 10.1007/s00221-013-3510-8]
21. Shook A, Marian V. The Bilingual Language Interaction Network for Comprehension of Speech. Bilingualism (Cambridge, England) 2013; 16. [PMID: 24363602] [PMCID: PMC3866103] [DOI: 10.1017/S1366728912000466]
Abstract
During speech comprehension, bilinguals co-activate both of their languages, resulting in cross-linguistic interaction at various levels of processing. This interaction has important consequences for both the structure of the language system and the mechanisms by which the system processes spoken language. Using computational modeling, we can examine how cross-linguistic interaction affects language processing in a controlled, simulated environment. Here we present a connectionist model of bilingual language processing, the Bilingual Language Interaction Network for Comprehension of Speech (BLINCS), wherein interconnected levels of processing are created using dynamic, self-organizing maps. BLINCS can account for a variety of psycholinguistic phenomena, including cross-linguistic interaction at and across multiple levels of processing, cognate facilitation effects, and audio-visual integration during speech comprehension. The model also provides a way to separate two languages without requiring a global language-identification system. We conclude that BLINCS serves as a promising new model of bilingual spoken language comprehension.
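BLINCS builds its interconnected levels from dynamic, self-organizing maps. The sketch below (Python; a toy illustration under stated assumptions, not the authors' implementation) shows the core self-organizing-map update such layers rely on: find the best-matching unit for an input vector and pull its grid neighborhood toward the input. The grid size and input dimensionality are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
GRID, DIM = 8, 12          # 8x8 map over 12-dimensional inputs (hypothetical)
weights = rng.random((GRID, GRID, DIM))

def som_step(x, t, n_steps, lr0=0.5, sigma0=3.0):
    """One update: locate the best-matching unit, move its neighborhood toward x."""
    frac = 1.0 - t / n_steps
    lr = lr0 * frac                         # decaying learning rate
    sigma = sigma0 * frac + 0.5             # shrinking neighborhood radius
    dists = np.linalg.norm(weights - x, axis=2)
    bi, bj = np.unravel_index(dists.argmin(), dists.shape)
    ii, jj = np.meshgrid(np.arange(GRID), np.arange(GRID), indexing="ij")
    hood = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2.0 * sigma ** 2))
    weights[:] += lr * hood[..., None] * (x - weights)

N_STEPS = 500
for t in range(N_STEPS):
    som_step(rng.random(DIM), t, N_STEPS)   # train on random toy inputs

In a full model of this kind, several such maps (e.g., phonological, lexical, semantic levels) are interconnected so that activation can flow between them, which is how BLINCS captures cross-linguistic interaction without a global language-identification system.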
22. Franz EA. The allocation of attention to learning of goal-directed actions: a cognitive neuroscience framework focusing on the Basal Ganglia. Front Psychol 2012; 3:535. [PMID: 23267335] [PMCID: PMC3527823] [DOI: 10.3389/fpsyg.2012.00535]
Abstract
The present paper builds on the idea that attention is largely in service of our actions. A framework and model that capture the allocation of attention for learning of goal-directed actions are proposed and developed. This framework highlights an evolutionary model based on the notion that rudimentary functions of the basal ganglia have become embedded into increasingly higher levels of networks, which all contribute to adaptive learning. Supporting the proposed model, background literature is presented alongside key evidence from experimental studies of the so-called "split-brain" (surgically divided cerebral hemispheres) and selected evidence from related areas of research. Although overlap with other existing findings and models is acknowledged, the proposed framework is an original synthesis of cognitive experimental findings with supporting evidence of a neural system and a carefully formulated model of attention. The hope is that this new synthesis will be informative for the cognitive and other brain sciences and will lead to new avenues for experimentation across domains.
Affiliation(s)
- E. A. Franz
- Division of Science, Department of Psychology, University of Otago, Dunedin, New Zealand
23. Sinke C, Neufeld J, Zedler M, Emrich HM, Bleich S, Münte TF, Szycik GR. Reduced audiovisual integration in synesthesia: evidence from bimodal speech perception. J Neuropsychol 2012; 8:94-106. [PMID: 23279836] [DOI: 10.1111/jnp.12006]
Abstract
Recent research suggests that synesthesia results from a hypersensitive multimodal binding mechanism. To address the question of whether multimodal integration is altered in synesthetes in general, grapheme-colour and auditory-visual synesthetes were investigated using speech-related stimulation in two behavioural experiments. First, we used the McGurk illusion to test the strength and number of illusory perceptions in synesthesia. In a second step, we analysed the gain in speech perception provided by seen articulatory movements under acoustically noisy conditions. We used disyllabic nouns as stimuli and varied the signal-to-noise ratio of the auditory stream presented concurrently with a matching video of the speaker. We hypothesized that if synesthesia is due to a general hyperbinding mechanism, this group of subjects should be more susceptible to McGurk illusions and should profit more from the visual information during audiovisual speech perception. The results indicate that there are differences between synesthetes and controls concerning multisensory integration, but in the opposite direction to that hypothesized. Synesthetes showed a reduced number of illusions and had a reduced gain in comprehension when viewing matching articulatory movements in comparison to control subjects. Our results indicate that rather than having a hypersensitive binding mechanism, synesthetes show weaker integration of vision and audition.
Affiliation(s)
- Christopher Sinke
- Department of Psychiatry, Social Psychiatry and Psychotherapy, Hannover Medical School, Hanover, Germany
- Center of Systems Neuroscience, Hanover, Germany
24. Neural correlates of interindividual differences in children's audiovisual speech perception. J Neurosci 2011; 31:13963-71. [PMID: 21957257] [DOI: 10.1523/jneurosci.2605-11.2011]
Abstract
Children use information from both the auditory and visual modalities to aid in understanding speech. A dramatic illustration of this multisensory integration is the McGurk effect, an illusion in which an auditory syllable is perceived differently when it is paired with an incongruent mouth movement. However, there are significant interindividual differences in McGurk perception: some children never perceive the illusion, while others always do. Because converging evidence suggests that the posterior superior temporal sulcus (STS) is a critical site for multisensory integration, we hypothesized that activity within the STS would predict susceptibility to the McGurk effect. To test this idea, we used BOLD fMRI in 17 children aged 6-12 years to measure brain responses to the following three audiovisual stimulus categories: McGurk incongruent, non-McGurk incongruent, and congruent syllables. Two separate analysis approaches, one using independent functional localizers and another using whole-brain voxel-based regression, showed differences in the left STS between perceivers and nonperceivers. The STS of McGurk perceivers responded significantly more than that of nonperceivers to McGurk syllables, but not to other stimuli, and perceivers' hemodynamic responses in the STS were significantly prolonged. In addition to the STS, weaker differences between perceivers and nonperceivers were observed in the fusiform face area and extrastriate visual cortex. These results suggest that the STS is an important source of interindividual variability in children's audiovisual speech perception.
25. Nath AR, Beauchamp MS. A neural basis for interindividual differences in the McGurk effect, a multisensory speech illusion. Neuroimage 2011; 59:781-7. [PMID: 21787869] [DOI: 10.1016/j.neuroimage.2011.07.024]
Abstract
The McGurk effect is a compelling illusion in which humans perceive mismatched audiovisual speech as a completely different syllable. However, some normal individuals do not experience the illusion, reporting that the stimulus sounds the same with or without visual input. Converging evidence suggests that the left superior temporal sulcus (STS) is critical for audiovisual integration during speech perception. We used blood-oxygen level dependent functional magnetic resonance imaging (BOLD fMRI) to measure brain activity as McGurk perceivers and non-perceivers were presented with congruent audiovisual syllables, McGurk audiovisual syllables, and non-McGurk incongruent syllables. The inferior frontal gyrus showed an effect of stimulus condition (greater responses for incongruent stimuli) but not susceptibility group, while the left auditory cortex showed an effect of susceptibility group (greater response in susceptible individuals) but not stimulus condition. Only one brain region, the left STS, showed a significant effect of both susceptibility and stimulus condition. The amplitude of the response in the left STS was significantly correlated with the likelihood of perceiving the McGurk effect: a weak STS response meant that a subject was less likely to perceive the McGurk effect, while a strong response meant that a subject was more likely to perceive it. These results suggest that the left STS is a key locus for interindividual differences in speech perception.
Affiliation(s)
- Audrey R Nath
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX 77030, USA
26. When flavor guides motor control: an effector independence study. Exp Brain Res 2011; 212:339-46. [PMID: 21618038] [DOI: 10.1007/s00221-011-2733-9]
Abstract
Research on multisensory integration during natural tasks has revealed how the chemical senses contribute to planning and controlling movements. One aspect that has yet to be investigated is whether the motor representations evoked by chemosensory stimuli, once established for a particular movement, can be used to control different effectors. Here, we investigated this issue by asking participants to drink a sip of flavored solution, grasp a visual target with the hand, and then bring it to the mouth, miming the action of biting. Results show that hand and lip apertures were scaled according to the size of the object evoked by the flavor. Maximum hand and lip apertures were greater when the action toward a small visual target (e.g., strawberry) was preceded by a sip of a "large" (e.g., orange) rather than a "small" (e.g., almond) flavor solution. Conversely, maximum hand and lip apertures were smaller when the action toward a large visual target (e.g., apple) was preceded by a "small" (e.g., strawberry) rather than a "large" flavor solution. These findings support previous evidence for a unique motor plan underlying the acts of grasping with the hand and with the mouth, extending the knowledge of chemosensorimotor transformations to motor equivalence.
27.
Abstract
Speech perception is inherently multimodal. Visual speech (lip-reading) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to consider that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, speech integration is a property of the input information itself. Amodal speech information could explain the reported automaticity, immediacy, and completeness of audiovisual speech integration. However, recent findings suggest that speech integration can be influenced by higher cognitive properties such as lexical status and semantic context. Proponents of amodal accounts will need to explain these results.
28. Skipper JI, van Wassenhove V, Nusbaum HC, Small SL. Hearing lips and seeing voices: how cortical areas supporting speech production mediate audiovisual speech perception. Cereb Cortex 2007; 17:2387-99. [PMID: 17218482] [PMCID: PMC2896890] [DOI: 10.1093/cercor/bhl147]
Abstract
Observing a speaker's mouth profoundly influences speech perception. For example, listeners perceive an "illusory" "ta" when the video of a face producing /ka/ is dubbed onto an audio /pa/. Here, we show how cortical areas supporting speech production mediate this illusory percept and audiovisual (AV) speech perception more generally. Specifically, cortical activity during AV speech perception occurs in many of the same areas that are active during speech production. We find that different perceptions of the same syllable and the perception of different syllables are associated with different distributions of activity in frontal motor areas involved in speech production. Activity patterns in these frontal motor areas resulting from the illusory "ta" percept are more similar to the activity patterns evoked by AV(/ta/) than they are to patterns evoked by AV(/pa/) or AV(/ka/). In contrast to the activity in frontal motor areas, stimulus-evoked activity for the illusory "ta" in auditory and somatosensory areas and visual areas initially resembles activity evoked by AV(/pa/) and AV(/ka/), respectively. Ultimately, though, activity in these regions comes to resemble activity evoked by AV(/ta/). Together, these results suggest that AV speech elicits in the listener a motor plan for the production of the phoneme that the speaker might have been attempting to produce, and that feedback in the form of efference copy from the motor system ultimately influences the phonetic interpretation.
Affiliation(s)
- Jeremy I Skipper
- Department of Neurology, The University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA
29. Gentilucci M, Bernardis P. Imitation during phoneme production. Neuropsychologia 2007; 45:608-15. [PMID: 16698051] [DOI: 10.1016/j.neuropsychologia.2006.04.004]
Abstract
Does listening to and observing the speaking interlocutor influence phoneme production? In two experiments, female participants were required to recognize and then repeat the phoneme string /aba/ presented by actors visually, acoustically, and audiovisually. In experiment 1, a male actor presented the phoneme string, and the participants' lip kinematics and voice spectra were compared with those of a reading control condition. In experiment 2, female and male actors presented the phoneme string, and the lip kinematics and voice spectra of the participants' responses to the male actors were compared with those to the female actors (control condition). In both experiments, the lip kinematics in the visual presentations and the voice spectra in the acoustical presentations changed relative to the control conditions, approaching the male actors' values, which were different from those of the female participants and actors. The variation in lip kinematics also induced changes in voice formants, but only in the visual presentation. The data suggest that features of both the lip kinematics and the voice spectra tend to be automatically imitated when repeating a phoneme string presented by a visible and/or audible speaking interlocutor. The use of imitation, in place of the usual lip kinematics and vocal features, suggests an automatic and unconscious tendency of the perceiver to interact closely with the interlocutor. This accords with the idea that resonant circuits are activated by the activity of the mirror system, which relates observation to execution of arm and mouth gestures.
Affiliation(s)
- Maurizio Gentilucci
- Dipartimento di Neuroscienze, Università di Parma, Via Volturno 39, 43100 Parma, Italy