1
Magnotti JF, Lado A, Beauchamp MS. The noisy encoding of disparity model predicts perception of the McGurk effect in native Japanese speakers. Front Neurosci 2024; 18:1421713. PMID: 38988770; PMCID: PMC11233445; DOI: 10.3389/fnins.2024.1421713.
Abstract
In the McGurk effect, visual speech from the face of the talker alters the perception of auditory speech. The diversity of human languages has prompted many intercultural studies of the effect in both Western and non-Western cultures, including native Japanese speakers. Studies of large samples of native English speakers have shown that the McGurk effect is characterized by high variability in the susceptibility of different individuals to the illusion and in the strength of different experimental stimuli to induce the illusion. The noisy encoding of disparity (NED) model of the McGurk effect uses principles from Bayesian causal inference to account for this variability, separately estimating the susceptibility and sensory noise for each individual and the strength of each stimulus. To determine whether variation in McGurk perception is similar between Western and non-Western cultures, we applied the NED model to data collected from 80 native Japanese-speaking participants. Fifteen different McGurk stimuli that varied in syllable content (unvoiced auditory "pa" + visual "ka" or voiced auditory "ba" + visual "ga") were presented interleaved with audiovisual congruent stimuli. The McGurk effect was highly variable across stimuli and participants, with the percentage of illusory fusion responses ranging from 3 to 78% across stimuli and from 0 to 91% across participants. Despite this variability, the NED model accurately predicted perception, predicting fusion rates for individual stimuli with 2.1% error and for individual participants with 2.4% error. Stimuli containing the unvoiced pa/ka pairing evoked more fusion responses than the voiced ba/ga pairing. Model estimates of sensory noise were correlated with participant age, with greater sensory noise in older participants. The NED model of the McGurk effect offers a principled way to account for individual and stimulus differences when examining the McGurk effect in different cultures.
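For readers unfamiliar with the NED model, its core computation can be sketched in a few lines. The sketch below assumes the standard Gaussian form of the model (Magnotti & Beauchamp, 2015): each stimulus has a disparity, each participant has a disparity threshold and a sensory-noise level, and an illusory fusion response occurs when the noisily encoded disparity falls below the threshold. Parameter names and values are illustrative, not the values fitted in this study.

```python
# Minimal sketch of the noisy encoding of disparity (NED) idea; illustrative
# parameters only, not those fitted to the 80 Japanese-speaking participants.
from scipy.stats import norm

def p_fusion(stimulus_disparity, threshold, sensory_noise):
    """P(encoded disparity < participant's threshold) = P(illusory fusion)."""
    return norm.cdf((threshold - stimulus_disparity) / sensory_noise)

# One participant (threshold = 0.8, noise = 0.5) tested on a strong and a weak stimulus
for d in (0.4, 1.2):
    print(f"disparity {d}: P(fusion) = {p_fusion(d, 0.8, 0.5):.2f}")
```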
Affiliation(s)
- John F Magnotti: Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Anastasia Lado: Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Michael S Beauchamp: Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
2
Dong C, Noppeney U, Wang S. Perceptual uncertainty explains activation differences between audiovisual congruent speech and McGurk stimuli. Hum Brain Mapp 2024; 45:e26653. PMID: 38488460; DOI: 10.1002/hbm.26653.
Abstract
Face-to-face communication relies on the integration of acoustic speech signals with the corresponding facial articulations. In the McGurk illusion, an auditory /ba/ phoneme presented simultaneously with the facial articulation of /ga/ (i.e., a viseme) is typically fused into an illusory 'da' percept. Despite its widespread use as an index of audiovisual speech integration, critics argue that it arises from perceptual processes that differ categorically from natural speech recognition. Conversely, Bayesian theoretical frameworks suggest that both the illusory McGurk and the veridical audiovisual congruent speech percepts result from probabilistic inference based on noisy sensory signals. According to these models, the inter-sensory conflict in McGurk stimuli may only increase observers' perceptual uncertainty. This functional magnetic resonance imaging (fMRI) study presented participants (20 male and 24 female) with audiovisual congruent, McGurk (i.e., auditory /ba/ + visual /ga/), and incongruent (i.e., auditory /ga/ + visual /ba/) stimuli along with their unisensory counterparts in a syllable categorization task. Behaviorally, observers' response entropy was greater for McGurk compared to congruent audiovisual stimuli. At the neural level, McGurk stimuli increased activations in a widespread neural system, extending from the inferior frontal sulci (IFS) to the pre-supplementary motor area (pre-SMA) and insulae, typically involved in cognitive control processes. Crucially, in line with Bayesian theories, these activation increases were fully accounted for by observers' perceptual uncertainty as measured by their response entropy. Our findings suggest that McGurk and congruent speech processing rely on shared neural mechanisms, thereby supporting the McGurk illusion as a valid measure of natural audiovisual speech perception.
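Response entropy, the study's behavioral index of perceptual uncertainty, is straightforward to compute from categorical response counts. The sketch below uses made-up response tallies; only the definition (Shannon entropy over response categories) is standard practice, not the paper's own analysis code.

```python
import numpy as np

def response_entropy(counts):
    """Shannon entropy (bits) of a distribution of categorical responses."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]                       # convention: 0 * log(0) = 0
    return float(-(p * np.log2(p)).sum())

congruent = [38, 1, 1]   # hypothetical: almost all "ba" responses
mcgurk = [15, 18, 7]     # hypothetical: responses spread over "ba"/"da"/"ga"
print(response_entropy(congruent), response_entropy(mcgurk))  # low vs. high
```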
Affiliation(s)
- Chenjie Dong: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China; Donders Institute for Brain, Cognition, and Behavior, Radboud University, Nijmegen, the Netherlands
- Uta Noppeney: Donders Institute for Brain, Cognition, and Behavior, Radboud University, Nijmegen, the Netherlands
- Suiping Wang: Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents (South China Normal University), Ministry of Education, Guangzhou, China
3
Ahn E, Majumdar A, Lee T, Brang D. Evidence for a Causal Dissociation of the McGurk Effect and Congruent Audiovisual Speech Perception via TMS. bioRxiv [Preprint] 2023:2023.11.27.568892. PMID: 38077093; PMCID: PMC10705272; DOI: 10.1101/2023.11.27.568892.
Abstract
Congruent visual speech improves speech perception accuracy, particularly in noisy environments. Conversely, mismatched visual speech can alter what is heard, leading to an illusory percept known as the McGurk effect. This illusion has been widely used to study audiovisual speech integration, illustrating that auditory and visual cues are combined in the brain to generate a single coherent percept. While prior transcranial magnetic stimulation (TMS) and neuroimaging studies have identified the left posterior superior temporal sulcus (pSTS) as a causal region involved in the generation of the McGurk effect, it remains unclear whether this region is critical only for this illusion or also for the more general benefits of congruent visual speech (e.g., increased accuracy and faster reaction times). Indeed, recent correlative research suggests that the benefits of congruent visual speech and the McGurk effect reflect largely independent mechanisms. To better understand how these different features of audiovisual integration are causally generated by the left pSTS, we used single-pulse TMS to temporarily impair processing while subjects were presented with either incongruent (McGurk) or congruent audiovisual combinations. Consistent with past research, we observed that TMS to the left pSTS significantly reduced the strength of the McGurk effect. Importantly, however, left pSTS stimulation did not affect the positive benefits of congruent audiovisual speech (increased accuracy and faster reaction times), demonstrating a causal dissociation between the two processes. Our results are consistent with models proposing that the pSTS is but one of multiple critical areas supporting audiovisual speech interactions. Moreover, these data add to a growing body of evidence suggesting that the McGurk effect is an imperfect surrogate measure for more general and ecologically valid audiovisual speech behaviors.
Affiliation(s)
- EunSeon Ahn: Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- Areti Majumdar: Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- Taraz Lee: Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- David Brang: Department of Psychology, University of Michigan, Ann Arbor, MI 48109
4
Feng X, Monzalvo K, Dehaene S, Dehaene-Lambertz G. Evolution of reading and face circuits during the first three years of reading acquisition. Neuroimage 2022; 259:119394. PMID: 35718022; DOI: 10.1016/j.neuroimage.2022.119394.
Abstract
Although words and faces activate neighboring regions in the fusiform gyrus, we lack an understanding of how such category selectivity emerges during development. To investigate the organization of reading and face circuits at the earliest stage of reading acquisition, we measured the fMRI responses to words, faces, houses, and checkerboards in three groups of 60 French children: 6-year-old pre-readers, 6-year-old beginning readers and 9-year-old advanced readers. The results showed that specific responses to written words were absent prior to reading, but emerged in beginning readers, irrespective of age. Likewise, specific responses to faces were barely visible in pre-readers and continued to evolve in the 9-year-olds, driven primarily by age rather than by schooling. Crucially, the sectors of ventral visual cortex that become specialized for words and faces harbored their own functional connectivity prior to reading acquisition: the visual word form area (VWFA) with left-hemispheric spoken language areas, and the fusiform face area (FFA) with the contralateral region and the amygdalae. The results support the view that reading acquisition occurs through the recycling of a pre-existing but plastic circuit which, in pre-readers, already connects the VWFA site to other distant language areas. We argue that reading acquisition does not compete with the face system directly, through a pruning of preexisting face responses, but indirectly, by hindering the slow growth of face responses in the left hemisphere, thus increasing a pre-existing right hemispheric bias.
Affiliation(s)
- Xiaoxia Feng: Cognitive Neuroimaging Unit, CNRS ERL 9003, INSERM U992, CEA, Université Paris-Saclay, NeuroSpin center, 91191 Gif/Yvette, France
- Karla Monzalvo: Cognitive Neuroimaging Unit, CNRS ERL 9003, INSERM U992, CEA, Université Paris-Saclay, NeuroSpin center, 91191 Gif/Yvette, France
- Stanislas Dehaene: Cognitive Neuroimaging Unit, CNRS ERL 9003, INSERM U992, CEA, Université Paris-Saclay, NeuroSpin center, 91191 Gif/Yvette, France; Collège de France, Université PSL Paris Sciences Lettres, Paris, France
- Ghislaine Dehaene-Lambertz: Cognitive Neuroimaging Unit, CNRS ERL 9003, INSERM U992, CEA, Université Paris-Saclay, NeuroSpin center, 91191 Gif/Yvette, France
5
Fiber tracing and microstructural characterization among audiovisual integration brain regions in neonates compared with young adults. Neuroimage 2022; 254:119141. PMID: 35342006; DOI: 10.1016/j.neuroimage.2022.119141.
Abstract
Audiovisual integration has been related to cognitive-processing and behavioral advantages, as well as to various socio-cognitive disorders. While some studies have identified brain regions instantiating this ability shortly after birth, little is known about the structural pathways connecting them. The goal of the present study was to reconstruct fiber tracts linking audiovisual integration (AVI) regions in the newborn in-vivo brain and assess their adult-likeness by comparing them with analogous fiber tracts of young adults. We performed probabilistic tractography and compared connective probabilities between a sample of term-born neonates (N = 311; the Developing Human Connectome Project (dHCP), http://www.developingconnectome.org) and young adults (N = 311; the Human Connectome Project, https://www.humanconnectome.org/) by means of a classification algorithm. Furthermore, we computed Dice coefficients to assess between-group spatial similarity of the reconstructed fibers and used diffusion metrics to characterize neonates' AVI brain network in terms of microstructural properties, interhemispheric differences and the association with perinatal covariates and biological sex. Overall, our results indicate that the AVI fiber bundles were successfully reconstructed in a vast majority of neonates, similarly to adults. Connective probability distributional similarities and spatial overlaps of AVI fibers between the two groups differed across the reconstructed fibers. There was a rank-order correspondence of the fibers' connective strengths across the groups. Additionally, the study revealed patterns of diffusion metrics in line with early white matter developmental trajectories and a developmental advantage for females. Altogether, these findings deliver evidence of meaningful structural connections among AVI regions in the newborn in-vivo brain.
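The Dice coefficient used here to quantify spatial similarity of the reconstructed fibers has a simple closed form, 2|A∩B| / (|A| + |B|). A generic sketch for binary tract masks follows; it mirrors the measure itself, not the dHCP processing pipeline.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Spatial overlap of two binary masks: 2|A∩B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else float("nan")
```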
6
Butera IM, Larson ED, DeFreese AJ, Lee AKC, Gifford RH, Wallace MT. Functional localization of audiovisual speech using near infrared spectroscopy. Brain Topogr 2022; 35:416-430. PMID: 35821542; PMCID: PMC9334437; DOI: 10.1007/s10548-022-00904-1.
Abstract
Visual cues are especially vital for hearing-impaired individuals such as cochlear implant (CI) users to understand speech in noise. Functional Near Infrared Spectroscopy (fNIRS) is a light-based imaging technology that is ideally suited for measuring the brain activity of CI users due to its compatibility with both the ferromagnetic and electrical components of these implants. In a preliminary step toward better elucidating the behavioral and neural correlates of audiovisual (AV) speech integration in CI users, we designed a speech-in-noise task and measured the extent to which 24 normal-hearing individuals could integrate the audio of spoken monosyllabic words with the corresponding visual signals of a female speaker. In our behavioral task, we found that audiovisual pairings provided average improvements of 103% and 197% over auditory-alone listening conditions at -6 and -9 dB signal-to-noise ratios in multi-talker background noise. In an fNIRS task using similar stimuli, we measured activity during auditory-only listening, visual-only lipreading, and AV listening conditions. We identified cortical activity in all three conditions over regions of middle and superior temporal cortex typically associated with speech processing and audiovisual integration. In addition, three channels active during the lipreading condition showed correlations (uncorrected) with behavioral measures of audiovisual gain as well as with the McGurk effect. Further work focusing primarily on the regions of interest identified in this study could test how AV speech integration may differ for CI users who rely on this mechanism for daily communication.
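The reported gains can be read as percent improvements of audiovisual over auditory-alone accuracy. The accuracy values below are hypothetical, chosen only to reproduce the 103% and 197% figures from the abstract.

```python
def av_gain_percent(av_accuracy, audio_accuracy):
    """Percent improvement of audiovisual over auditory-alone accuracy."""
    return 100.0 * (av_accuracy - audio_accuracy) / audio_accuracy

print(av_gain_percent(0.61, 0.30))   # ~103%, e.g., at -6 dB SNR (hypothetical accuracies)
print(av_gain_percent(0.445, 0.15))  # ~197%, e.g., at -9 dB SNR (hypothetical accuracies)
```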
Affiliation(s)
- Iliza M. Butera: Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Eric D. Larson: Institute for Learning & Brain Sciences, University of Washington, Seattle, Washington, USA
- Andrea J. DeFreese: Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Adrian KC Lee: Institute for Learning & Brain Sciences, University of Washington, Seattle, Washington, USA; Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, USA
- René H. Gifford: Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Mark T. Wallace: Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA; Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA; Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
7
Albonico A, Yu S, Corrow SL, Barton JJS. Facial identity and facial speech processing in developmental prosopagnosia. Neuropsychologia 2022; 168:108163. DOI: 10.1016/j.neuropsychologia.2022.108163.
8
Motor Circuit and Superior Temporal Sulcus Activities Linked to Individual Differences in Multisensory Speech Perception. Brain Topogr 2021; 34:779-792. PMID: 34480635; DOI: 10.1007/s10548-021-00869-7.
Abstract
Integrating multimodal information into a unified percept is a fundamental human capacity. The McGurk effect is a remarkable multisensory illusion in which incongruent auditory and visual syllables yield a percept that differs from both inputs. However, not all listeners perceive the McGurk illusion to the same degree. The neural basis for individual differences in the modulation of multisensory integration and syllabic perception remains largely unclear. To probe the possible involvement of specific neural circuits in individual differences in multisensory speech perception, we first implemented a behavioral experiment to examine McGurk susceptibility. Then, functional magnetic resonance imaging was performed in 63 participants to measure brain activity in response to non-McGurk audiovisual syllables. We revealed significant individual variability in McGurk illusion perception. Moreover, we found significant differential activations of the auditory and visual regions and the left superior temporal sulcus (STS), as well as multiple motor areas, between strong and weak McGurk perceivers. Importantly, the individual engagement of the STS and motor areas specifically predicted behavioral McGurk susceptibility, unlike the sensory regions. These findings suggest that distinct multimodal integration in the STS, together with coordinated phonemic modulatory processes in motor circuits, may serve as a neural substrate for interindividual differences in multisensory speech perception.
9
Siemann JK, Veenstra-VanderWeele J, Wallace MT. Approaches to Understanding Multisensory Dysfunction in Autism Spectrum Disorder. Autism Res 2020; 13:1430-1449. PMID: 32869933; PMCID: PMC7721996; DOI: 10.1002/aur.2375.
Abstract
Abnormal sensory responses are a DSM-5 symptom of autism spectrum disorder (ASD), and research findings demonstrate altered sensory processing in ASD. Beyond difficulties with processing information within single sensory domains, including both hypersensitivity and hyposensitivity, difficulties in multisensory processing are becoming a core focus in ASD. These difficulties may be targeted by treatment approaches such as "sensory integration," which is frequently applied in autism treatment but not yet based on clear evidence. Recently, psychophysical data have emerged to demonstrate multisensory deficits in some children with ASD. Unlike deficits in social communication, which are best understood in humans, sensory and multisensory changes offer a tractable marker of circuit dysfunction that is more easily translated into animal model systems to probe the underlying neurobiological mechanisms. Paralleling experimental paradigms that were previously applied in humans and larger mammals, we and others have demonstrated that multisensory function can also be examined behaviorally in rodents. Here, we review the sensory and multisensory difficulties commonly found in ASD, examining laboratory findings that relate these findings across species. Next, we discuss the known neurobiology of multisensory integration, drawing largely on experimental work in larger mammals, and extensions of these paradigms into rodents. Finally, we describe emerging investigations into multisensory processing in genetic mouse models related to autism risk. By detailing findings from humans to mice, we highlight the advantage of multisensory paradigms that can be easily translated across species, as well as the potential for rodent experimental systems to reveal opportunities for novel treatments. LAY SUMMARY: Sensory and multisensory deficits are commonly found in ASD and may result in cascading effects that impact social communication. By using similar experiments to those in humans, we discuss how studies in animal models may allow an understanding of the brain mechanisms that underlie difficulties in multisensory integration, with the ultimate goal of developing new treatments.
Affiliation(s)
- Justin K Siemann: Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, USA
- Jeremy Veenstra-VanderWeele: Department of Psychiatry, Columbia University, Center for Autism and the Developing Brain, New York Presbyterian Hospital, and New York State Psychiatric Institute, New York, New York, USA
- Mark T Wallace: Department of Psychiatry, Department of Psychology, Department of Hearing and Speech Sciences, and Kennedy Center for Research on Human Development, Vanderbilt University, Nashville, Tennessee, USA
10
Wang F, Karipidis II, Pleisch G, Fraga-González G, Brem S. Development of Print-Speech Integration in the Brain of Beginning Readers With Varying Reading Skills. Front Hum Neurosci 2020; 14:289. PMID: 32922271; PMCID: PMC7457077; DOI: 10.3389/fnhum.2020.00289.
Abstract
Learning print-speech sound correspondences is a crucial step at the beginning of reading acquisition and is often impaired in children with developmental dyslexia. Despite increasing insight into audiovisual language processing, it remains largely unclear how integration of print and speech develops at the neural level during initial learning in the first years of schooling. To investigate this development, 32 healthy, German-speaking children at varying risk for developmental dyslexia (17 typical readers and 15 poor readers) participated in a longitudinal study including behavioral and fMRI measurements in first (T1) and second (T2) grade. We used an implicit audiovisual (AV) non-word target detection task aimed at characterizing differential activation to congruent (AVc) and incongruent (AVi) audiovisual non-word pairs. While children’s brain activation did not differ between AVc and AVi pairs in first grade, an incongruency effect (AVi > AVc) emerged in bilateral inferior temporal and superior frontal gyri in second grade. Of note, improvements in pseudoword reading performance over time were associated with the development of the congruency effect (AVc > AVi) in the left posterior superior temporal gyrus (STG) from first to second grade. Finally, functional connectivity analyses indicated divergent development and reading-expertise-dependent coupling from the left occipito-temporal and superior temporal cortex to regions of the default mode (precuneus) and fronto-temporal language networks. Our results suggest that audiovisual integration areas, as well as their functional coupling to other language areas and areas of the default mode network, show a different development in poor vs. typical readers at varying familial risk for dyslexia.
Affiliation(s)
- Fang Wang: Department of Child and Adolescent Psychiatry and Psychotherapy, University Hospital of Psychiatry, University of Zurich, Zurich, Switzerland; Department of Psychology, The Chinese University of Hong Kong, Shatin, Hong Kong
- Iliana I Karipidis: Department of Child and Adolescent Psychiatry and Psychotherapy, University Hospital of Psychiatry, University of Zurich, Zurich, Switzerland; Center for Interdisciplinary Brain Sciences Research, Department of Psychiatry and Behavioral Sciences, School of Medicine, Stanford University, Stanford, CA, United States
- Georgette Pleisch: Department of Child and Adolescent Psychiatry and Psychotherapy, University Hospital of Psychiatry, University of Zurich, Zurich, Switzerland
- Gorka Fraga-González: Department of Child and Adolescent Psychiatry and Psychotherapy, University Hospital of Psychiatry, University of Zurich, Zurich, Switzerland
- Silvia Brem: Department of Child and Adolescent Psychiatry and Psychotherapy, University Hospital of Psychiatry, University of Zurich, Zurich, Switzerland; Neuroscience Center Zurich, University of Zurich and ETH Zürich, Zurich, Switzerland
11
Conant LL, Liebenthal E, Desai A, Seidenberg MS, Binder JR. Differential activation of the visual word form area during auditory phoneme perception in youth with dyslexia. Neuropsychologia 2020; 146:107543. PMID: 32598966; DOI: 10.1016/j.neuropsychologia.2020.107543.
Abstract
Developmental dyslexia is a learning disorder characterized by difficulties reading words accurately and/or fluently. Several behavioral studies have suggested the presence of anomalies at an early stage of phoneme processing, when the complex spectrotemporal patterns in the speech signal are analyzed and assigned to phonemic categories. In this study, fMRI was used to compare brain responses associated with categorical discrimination of speech syllables (P) and acoustically matched nonphonemic stimuli (N) in children and adolescents with dyslexia and in typically developing (TD) controls, aged 8-17 years. The TD group showed significantly greater activation during the P condition relative to N in an area of the left ventral occipitotemporal cortex that corresponds well with the region referred to as the "visual word form area" (VWFA). Regression analyses using reading performance as a continuous variable across the full group of participants yielded similar results. Overall, the findings are consistent with those of previous neuroimaging studies using print stimuli in individuals with dyslexia that found reduced activation in left occipitotemporal regions; however, the current study shows that these activation differences seen during reading are apparent during auditory phoneme discrimination in youth with dyslexia, suggesting that the primary deficit in at least a subset of children may lie early in the speech processing stream and that categorical perception may be an important target of early intervention in children at risk for dyslexia.
Affiliation(s)
- Lisa L Conant: Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
- Einat Liebenthal: Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA; Department of Psychiatry, McLean Hospital, Harvard Medical School, Boston, MA, USA
- Anjali Desai: Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
- Mark S Seidenberg: Department of Psychology, University of Wisconsin-Madison, Madison, WI, USA
- Jeffrey R Binder: Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
12
Ujiie Y, Kanazawa S, Yamaguchi MK. The Other-Race-Effect on Audiovisual Speech Integration in Infants: A NIRS Study. Front Psychol 2020; 11:971. PMID: 32499746; PMCID: PMC7243679; DOI: 10.3389/fpsyg.2020.00971.
Abstract
Previous studies have revealed perceptual narrowing for the own-race face in face discrimination, but this phenomenon is poorly understood in face and voice integration. We focused on infants' brain responses to the McGurk effect to examine whether the other-race effect occurs in the activation patterns. In Experiment 1, we used fNIRS to test for the presence of the McGurk effect in Japanese 8- to 9-month-old infants and to examine differences between the activation patterns in response to own-race-face and other-race-face stimuli. We used two race-face conditions, own-race-face (East Asian) and other-race-face (Caucasian), each of which contained audiovisual-matched and McGurk-type stimuli. While the infants (N = 34) observed each speech stimulus for each race, we measured cerebral hemoglobin concentrations in bilateral temporal brain regions. The results showed that in the own-race-face condition, audiovisual-matched stimuli induced activation of the left temporal region, and McGurk stimuli induced activation of the bilateral temporal regions. No significant activations were found in the other-race-face condition. These results indicate that the McGurk effect occurred only in the own-race-face condition. In Experiment 2, we used a familiarization/novelty preference procedure to confirm that the infants (N = 28) could perceive the McGurk effect in the own-race-face condition but not in the other-race-face condition. The behavioral data supported the fNIRS data, implying the presence of narrowing for the own-race face in the McGurk effect. These results suggest that narrowing of the McGurk effect may be involved in the development of relatively high-order processing, such as face-to-face communication with people surrounding the infant. We discuss the hypothesis that perceptual narrowing is a modality-general, pan-sensory process.
Affiliation(s)
- Yuta Ujiie: Graduate School of Psychology, Chukyo University, Aichi, Japan; Research and Development Initiative, Chuo University, Tokyo, Japan; Japan Society for the Promotion of Science, Tokyo, Japan
- So Kanazawa: Department of Psychology, Japan Women’s University, Kawasaki, Japan
13
Feng G, Zhou B, Zhou W, Beauchamp MS, Magnotti JF. A Laboratory Study of the McGurk Effect in 324 Monozygotic and Dizygotic Twins. Front Neurosci 2019; 13:1029. PMID: 31636529; PMCID: PMC6787151; DOI: 10.3389/fnins.2019.01029.
Abstract
Multisensory integration of information from the talker's voice and the talker's mouth facilitates human speech perception. A popular assay of audiovisual integration is the McGurk effect, an illusion in which incongruent visual speech information categorically changes the percept of auditory speech. There is substantial interindividual variability in susceptibility to the McGurk effect. To better understand possible sources of this variability, we examined the McGurk effect in 324 native Mandarin speakers, consisting of 73 monozygotic (MZ) and 89 dizygotic (DZ) twin pairs. When tested with 9 different McGurk stimuli, some participants never perceived the illusion and others always perceived it. Within participants, perception was similar across time (r = 0.55 at a 2-year retest in 150 participants) suggesting that McGurk susceptibility reflects a stable trait rather than short-term perceptual fluctuations. To examine the effects of shared genetics and prenatal environment, we compared McGurk susceptibility between MZ and DZ twins. Both twin types had significantly greater correlation than unrelated pairs (r = 0.28 for MZ twins and r = 0.21 for DZ twins) suggesting that the genes and environmental factors shared by twins contribute to individual differences in multisensory speech perception. Conversely, the existence of substantial differences within twin pairs (even MZ co-twins) and the overall low percentage of explained variance (5.5%) argues against a deterministic view of individual differences in multisensory integration.
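As a point of reference for the MZ/DZ comparison, the classical Falconer decomposition converts the two twin correlations into rough estimates of genetic and shared-environment variance. The r values below are those reported in the abstract, but the decomposition itself is our illustration; the paper argues from the correlations directly rather than from a formal heritability model.

```python
# Falconer's approximation applied to the reported twin correlations
r_mz, r_dz = 0.28, 0.21
h2 = 2 * (r_mz - r_dz)   # additive genetic variance (A): 0.14
c2 = 2 * r_dz - r_mz     # shared environment (C): 0.14
e2 = 1 - r_mz            # unshared environment + measurement error (E): 0.72
print(h2, c2, e2)
```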
Affiliation(s)
- Guo Feng: CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China; Psychological Research and Counseling Center, Southwest Jiaotong University, Chengdu, China
- Bin Zhou: CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Wen Zhou: CAS Key Laboratory of Behavioral Science, Institute of Psychology, CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Michael S. Beauchamp: Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, United States
- John F. Magnotti: Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, United States
14
Rennig J, Beauchamp MS. Free viewing of talking faces reveals mouth and eye preferring regions of the human superior temporal sulcus. Neuroimage 2018; 183:25-36. PMID: 30092347; PMCID: PMC6214361; DOI: 10.1016/j.neuroimage.2018.08.008.
Abstract
During face-to-face communication, the mouth of the talker is informative about speech content, while the eyes of the talker convey other information, such as gaze location. Viewers most often fixate either the mouth or the eyes of the talker's face, presumably allowing them to sample these different sources of information. To study the neural correlates of this process, healthy humans freely viewed talking faces while brain activity was measured with BOLD fMRI and eye movements were recorded with a video-based eye tracker. Post hoc trial sorting was used to divide the data into trials in which participants fixated the mouth of the talker and trials in which they fixated the eyes. Although the audiovisual stimulus was identical, the two trial types evoked differing responses in subregions of the posterior superior temporal sulcus (pSTS). The anterior pSTS preferred trials in which participants fixated the mouth of the talker while the posterior pSTS preferred fixations on the eyes of the talker. A second fMRI experiment demonstrated that anterior pSTS responded more strongly to auditory and audiovisual speech than posterior pSTS eye-preferring regions. These results provide evidence for functional specialization within the pSTS under more realistic viewing and stimulus conditions than in previous neuroimaging studies.
Affiliation(s)
- Johannes Rennig: Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, USA
- Michael S Beauchamp: Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, USA
15
Ozker M, Yoshor D, Beauchamp MS. Converging Evidence From Electrocorticography and BOLD fMRI for a Sharp Functional Boundary in Superior Temporal Gyrus Related to Multisensory Speech Processing. Front Hum Neurosci 2018; 12:141. PMID: 29740294; PMCID: PMC5928751; DOI: 10.3389/fnhum.2018.00141.
Abstract
Although humans can understand speech using the auditory modality alone, in noisy environments visual speech information from the talker’s mouth can rescue otherwise unintelligible auditory speech. To investigate the neural substrates of multisensory speech perception, we compared neural activity from the human superior temporal gyrus (STG) in two datasets. One dataset consisted of direct neural recordings (electrocorticography, ECoG) from surface electrodes implanted in epilepsy patients (this dataset has been previously published). The second dataset consisted of indirect measures of neural activity using blood oxygen level dependent functional magnetic resonance imaging (BOLD fMRI). Both ECoG and fMRI participants viewed the same clear and noisy audiovisual speech stimuli and performed the same speech recognition task. Both techniques demonstrated a sharp functional boundary in the STG, spatially coincident with an anatomical boundary defined by the posterior edge of Heschl’s gyrus. Cortex on the anterior side of the boundary responded more strongly to clear audiovisual speech than to noisy audiovisual speech while cortex on the posterior side of the boundary did not. For both ECoG and fMRI measurements, the transition between the functionally distinct regions happened within 10 mm of anterior-to-posterior distance along the STG. We relate this boundary to the multisensory neural code underlying speech perception and propose that it represents an important functional division within the human speech perception network.
Affiliation(s)
- Muge Ozker: Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
- Daniel Yoshor: Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States; Michael E. DeBakey Veterans Affairs Medical Center, Houston, TX, United States
- Michael S Beauchamp: Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
16
McGurk stimuli for the investigation of multisensory integration in cochlear implant users: The Oldenburg Audio Visual Speech Stimuli (OLAVS). Psychon Bull Rev 2018; 24:863-872. PMID: 27562763; DOI: 10.3758/s13423-016-1148-9.
Abstract
The concurrent presentation of different auditory and visual syllables may result in the perception of a third syllable, reflecting an illusory fusion of visual and auditory information. This well-known McGurk effect is frequently used for the study of audio-visual integration. Recently, it was shown that the McGurk effect is strongly stimulus-dependent, which complicates comparisons across perceivers and inferences across studies. To overcome this limitation, we developed the freely available Oldenburg audio-visual speech stimuli (OLAVS), consisting of 8 different talkers and 12 different syllable combinations. The quality of the OLAVS set was evaluated with 24 normal-hearing subjects. All 96 stimuli were characterized based on their stimulus disparity, which was obtained from a probabilistic model (cf. Magnotti & Beauchamp, 2015). Moreover, the McGurk effect was studied in eight adult cochlear implant (CI) users. By applying the individual, stimulus-independent parameters of the probabilistic model, the predicted effect of stronger audio-visual integration in CI users could be confirmed, demonstrating the validity of the new stimulus material.
17
Van Ackeren MJ, Barbero FM, Mattioni S, Bottini R, Collignon O. Neuronal populations in the occipital cortex of the blind synchronize to the temporal dynamics of speech. eLife 2018; 7:e31640. PMID: 29338838; PMCID: PMC5790372; DOI: 10.7554/eLife.31640.
Abstract
The occipital cortex of early blind individuals (EB) activates during speech processing, challenging the notion of a hard-wired neurobiology of language. But, at what stage of speech processing do occipital regions participate in EB? Here we demonstrate that parieto-occipital regions in EB enhance their synchronization to acoustic fluctuations in human speech in the theta-range (corresponding to syllabic rate), irrespective of speech intelligibility. Crucially, enhanced synchronization to the intelligibility of speech was selectively observed in primary visual cortex in EB, suggesting that this region is at the interface between speech perception and comprehension. Moreover, EB showed overall enhanced functional connectivity between temporal and occipital cortices that are sensitive to speech intelligibility and altered directionality when compared to the sighted group. These findings suggest that the occipital cortex of the blind adopts an architecture that allows the tracking of speech material, and therefore does not fully abstract from the reorganized sensory inputs it receives.
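Speech tracking of this kind is typically quantified as coherence between the acoustic envelope and the neural signal within the theta band. The sketch below shows the generic computation on simulated signals; it is not the authors' MEG pipeline, and the sampling rate and band edges are assumptions.

```python
import numpy as np
from scipy.signal import coherence

fs = 200.0                                   # Hz (assumed sampling rate)
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(0)
envelope = rng.standard_normal(t.size)       # stand-in for the speech amplitude envelope
neural = 0.3 * envelope + rng.standard_normal(t.size)  # stand-in for one MEG sensor

f, coh = coherence(envelope, neural, fs=fs, nperseg=1024)
theta = (f >= 4) & (f <= 8)                  # roughly the syllabic rate
print(coh[theta].mean())                     # mean theta-band coherence
```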
Affiliation(s)
- Francesca M Barbero: Institute of Research in Psychology, University of Louvain, Louvain, Belgium; Institute of Neuroscience, University of Louvain, Louvain, Belgium
- Roberto Bottini: Center for Mind/Brain Studies, University of Trento, Trento, Italy
- Olivier Collignon: Center for Mind/Brain Studies, University of Trento, Trento, Italy; Institute of Research in Psychology, University of Louvain, Louvain, Belgium; Institute of Neuroscience, University of Louvain, Louvain, Belgium
18
Ross LA, Del Bene VA, Molholm S, Woo YJ, Andrade GN, Abrahams BS, Foxe JJ. Common variation in the autism risk gene CNTNAP2, brain structural connectivity and multisensory speech integration. Brain Lang 2017; 174:50-60. PMID: 28738218; DOI: 10.1016/j.bandl.2017.07.005.
Abstract
Three lines of evidence motivated this study. 1) CNTNAP2 variation is associated with autism risk and speech-language development. 2) CNTNAP2 variations are associated with differences in white matter (WM) tracts comprising the speech-language circuitry. 3) Children with autism show impairment in multisensory speech perception. Here, we asked whether an autism risk-associated CNTNAP2 single nucleotide polymorphism in neurotypical adults was associated with multisensory speech perception performance, and whether such a genotype-phenotype association was mediated through white matter tract integrity in speech-language circuitry. Risk genotype at rs7794745 was associated with decreased benefit from visual speech and lower fractional anisotropy (FA) in several WM tracts (right precentral gyrus, left anterior corona radiata, right retrolenticular internal capsule). These structural connectivity differences were found to mediate the effect of genotype on audiovisual speech perception, shedding light on possible pathogenic pathways in autism and biological sources of inter-individual variation in audiovisual speech processing in neurotypicals.
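The mediation claim (genotype → white-matter FA → audiovisual benefit) follows the standard product-of-coefficients logic: the indirect effect is the path from genotype to the mediator times the path from the mediator to the outcome, controlling for genotype. A minimal sketch with simulated data follows; effect sizes and variable names are illustrative only, not the paper's fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
genotype = rng.integers(0, 2, n).astype(float)   # risk allele carrier (1) or not (0)
fa = -0.5 * genotype + rng.normal(0, 1, n)       # mediator: tract fractional anisotropy
av_benefit = 0.6 * fa + rng.normal(0, 1, n)      # outcome: benefit from visual speech

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
a = ols(np.column_stack([ones, genotype]), fa)[1]              # path a: genotype -> mediator
b = ols(np.column_stack([ones, genotype, fa]), av_benefit)[2]  # path b: mediator -> outcome | genotype
print("indirect effect a*b =", a * b)                          # ~ -0.3 by construction
```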
Affiliation(s)
- Lars A Ross: The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Victor A Del Bene: The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, CERC, Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA; Ferkauf Graduate School of Psychology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Sophie Molholm: The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, CERC, Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA; Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Young Jae Woo: Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Gizely N Andrade: The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, CERC, Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
- Brett S Abrahams: Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- John J Foxe: The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, CERC, Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA; Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
19
Yamamoto R, Naito Y, Tona R, Moroto S, Tamaya R, Fujiwara K, Shinohara S, Takebayashi S, Kikuchi M, Michida T. Audio-visual speech perception in prelingually deafened Japanese children following sequential bilateral cochlear implantation. Int J Pediatr Otorhinolaryngol 2017; 102:160-168. PMID: 29106867; DOI: 10.1016/j.ijporl.2017.09.022.
Abstract
OBJECTIVES An effect of audio-visual (AV) integration is observed when the auditory and visual stimuli are incongruent (the McGurk effect). In general, AV integration is especially helpful for subjects wearing hearing aids or cochlear implants (CIs). However, the influence of AV integration on spoken word recognition in individuals with bilateral CIs (Bi-CIs) has not been fully investigated. In this study, we investigated AV integration in children with Bi-CIs. METHODS The study sample included thirty-one prelingually deafened children who underwent sequential bilateral cochlear implantation. We assessed their responses to congruent and incongruent AV stimuli with three CI-listening modes: only the 1st CI, only the 2nd CI, and Bi-CIs. The responses were assessed in the whole group as well as in two sub-groups: a proficient group (syllable intelligibility ≥80% with the 1st CI) and a non-proficient group (syllable intelligibility <80% with the 1st CI). RESULTS We found evidence of the McGurk effect in each of the three CI-listening modes. AV integration responses were observed in a subset of incongruent AV stimuli, and the patterns observed with the 1st CI and with Bi-CIs were similar. In the proficient group, the responses with the 2nd CI were not significantly different from those with the 1st CI, whereas in the non-proficient group the responses with the 2nd CI were driven by visual stimuli more than those with the 1st CI. CONCLUSION Our results suggested that prelingually deafened Japanese children who underwent sequential bilateral cochlear implantation exhibit AV integration abilities, in monaural as well as binaural listening. We also observed a higher influence of visual stimuli on speech perception with the 2nd CI in the non-proficient group, suggesting that Bi-CI listeners with poorer speech recognition rely more on visual information than proficient subjects to compensate for poorer auditory input. Nevertheless, poorer-quality auditory input with the 2nd CI did not interfere with AV integration in binaural listening (with Bi-CIs). Overall, the findings of this study might inform future research to identify the best strategies for speech training that uses AV integration effectively in prelingually deafened children.
Affiliation(s)
- Ryosuke Yamamoto: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Yasushi Naito: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Risa Tona: Department of Otolaryngology, Head and Neck Surgery, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Saburo Moroto: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Rinko Tamaya: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Keizo Fujiwara: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Shogo Shinohara: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Shinji Takebayashi: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Masahiro Kikuchi: Department of Otolaryngology, Head and Neck Surgery, Kobe City Medical Center General Hospital, Kobe, Japan
- Tetsuhiko Michida: Department of Otolaryngology, Institute of Biomedical Research and Innovation, Kobe, Japan
20
Albonico A, Barton JJS. Face perception in pure alexia: Complementary contributions of the left fusiform gyrus to facial identity and facial speech processing. Cortex 2017; 96:59-72. PMID: 28964939; DOI: 10.1016/j.cortex.2017.08.029.
Abstract
Recent concepts of cerebral visual processing, based on overlapping patterns of face and word activation in cortex, predict that left fusiform lesions will not only cause pure alexia but also lead to mild impairments of face processing. Our goal was to determine whether alexic subjects have deficits in facial identity processing similar to those seen after right fusiform lesions, or complementary deficits affecting different aspects of face processing. We studied four alexic patients whose lesions involved the left fusiform gyrus and one prosopagnosic subject with a right fusiform lesion, on standard tests of face perception and recognition. We evaluated their ability first to process faces in linear contour images, and second to detect, discriminate, identify and integrate facial speech patterns into perception. We found that all five patients were impaired in face matching across viewpoint, but the alexic subjects performed worse with line-drawn faces, while the prosopagnosic subject did not. Alexic subjects could detect facial speech patterns but had trouble identifying them and did not integrate facial speech patterns with speech sounds, whereas identification and integration were intact in the prosopagnosic subject. We conclude that, in addition to their role in reading, the left-sided regions damaged in alexic subjects participate in the perception of facial identity in a non-redundant fashion, focusing on the information in linear contours at higher spatial frequencies. In addition, they have a dominant role in processing facial speech patterns, another visual aspect of language processing.
Affiliation(s)
- Andrea Albonico: Human Vision and Eye Movement Laboratory, Departments of Medicine (Neurology), Ophthalmology and Visual Sciences, Psychology, University of British Columbia, Vancouver, Canada; NeuroMI - Milan Center for Neuroscience, Milano, Italy
- Jason J S Barton: Human Vision and Eye Movement Laboratory, Departments of Medicine (Neurology), Ophthalmology and Visual Sciences, Psychology, University of British Columbia, Vancouver, Canada
21
Irwin J, DiBlasi L. Audiovisual speech perception: A new approach and implications for clinical populations. Lang Linguist Compass 2017; 11:77-91. PMID: 29520300; PMCID: PMC5839512; DOI: 10.1111/lnc3.12237.
Abstract
This selected overview of audiovisual (AV) speech perception examines the influence of visible articulatory information on what is heard. AV speech perception is thought to be a cross-cultural phenomenon that emerges early in typical language development; variables that influence it include properties of the visual and auditory signals, attentional demands, and individual differences. A brief review of the existing neurobiological evidence on how visual information influences heard speech indicates potential loci, timing, and facilitatory effects of AV over auditory-only speech. The current literature on AV speech in certain clinical populations (individuals with an autism spectrum disorder, developmental language disorder, or hearing loss) reveals differences in processing that may inform interventions. Finally, a new method of assessing AV speech that does not require obvious cross-category mismatch or auditory noise is presented as a novel approach for investigators.
Affiliation(s)
- Julia Irwin: LEARN Center, Haskins Laboratories Inc., USA
22
A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech. PLoS Comput Biol 2017; 13:e1005229. PMID: 28207734; PMCID: PMC5312805; DOI: 10.1371/journal.pcbi.1005229.
Abstract
Audiovisual speech integration combines information from auditory speech (talker’s voice) and visual speech (talker’s mouth movements) to improve perceptual accuracy. However, if the auditory and visual speech emanate from different talkers, integration decreases accuracy. Therefore, a key step in audiovisual speech perception is deciding whether auditory and visual speech have the same source, a process known as causal inference. A well-known illusion, the McGurk Effect, consists of incongruent audiovisual syllables, such as auditory “ba” + visual “ga” (AbaVga), that are integrated to produce a fused percept (“da”). This illusion raises two fundamental questions: first, given the incongruence between the auditory and visual syllables in the McGurk stimulus, why are they integrated; and second, why does the McGurk effect not occur for other, very similar syllables (e.g., AgaVba). We describe a simplified model of causal inference in multisensory speech perception (CIMS) that predicts the perception of arbitrary combinations of auditory and visual speech. We applied this model to behavioral data collected from 60 subjects perceiving both McGurk and non-McGurk incongruent speech stimuli. The CIMS model successfully predicted both the audiovisual integration observed for McGurk stimuli and the lack of integration observed for non-McGurk stimuli. An identical model without causal inference failed to accurately predict perception for either form of incongruent speech. The CIMS model uses causal inference to provide a computational framework for studying how the brain performs one of its most important tasks, integrating auditory and visual speech cues to allow us to communicate with others.

During face-to-face conversations, we seamlessly integrate information from the talker’s voice with information from the talker’s face. This multisensory integration increases speech perception accuracy and can be critical for understanding speech in noisy environments with many people talking simultaneously. A major challenge for models of multisensory speech perception is thus deciding which voices and faces should be integrated. Our solution to this problem is based on the idea of causal inference: given a particular pair of auditory and visual syllables, the brain calculates the likelihood they are from a single vs. multiple talkers and uses this likelihood to determine the final speech percept. We compared our model with an alternative model that is identical, except that it always integrated the available cues. Using behavioral speech perception data from a large number of subjects, the model with causal inference better predicted how humans would (or would not) integrate audiovisual speech syllables. Our results suggest a fundamental role for a causal-inference-type calculation in multisensory speech perception.
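The CIMS model described above is Bayesian: it weighs the likelihood that the auditory and visual cues share a single cause against the likelihood of separate causes, and integrates accordingly. As an illustration only, here is a minimal causal-inference sketch in that spirit; the one-dimensional cue representation, Gaussian noise terms, and all parameter values are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def cims_sketch(x_aud, x_vis, sigma_a=1.0, sigma_v=1.5, p_common=0.5):
    """Toy causal inference for two 1-D speech cues (hypothetical units).

    Compares the likelihood that the auditory and visual cues share one
    cause (integrate) vs. two causes (keep the auditory estimate), then
    mixes the two estimates by the posterior probability of a common cause.
    """
    # Reliability-weighted fusion, used when a single cause is assumed
    w_a = sigma_v**2 / (sigma_a**2 + sigma_v**2)
    fused = w_a * x_aud + (1 - w_a) * x_vis

    # Likelihood of the observed cue discrepancy under one vs. two causes
    # (the two-cause likelihood is flat over an assumed stimulus range R)
    var = sigma_a**2 + sigma_v**2
    like_c1 = np.exp(-(x_aud - x_vis) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    R = 10.0
    like_c2 = 1.0 / R

    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))

    # Model averaging: integrate in proportion to the common-cause posterior
    return post_c1 * fused + (1 - post_c1) * x_aud, post_c1

# Small discrepancy (McGurk-like): mostly integrated; large: integration rejected
print(cims_sketch(0.0, 1.0))   # high common-cause posterior -> fused percept
print(cims_sketch(0.0, 8.0))   # low posterior -> percept stays near the auditory cue
```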
23
Yahata I, Kawase T, Kanno A, Hidaka H, Sakamoto S, Nakasato N, Kawashima R, Katori Y. Effects of Visual Speech on Early Auditory Evoked Fields - From the Viewpoint of Individual Variance. PLoS One 2017; 12:e0170166. [PMID: 28141836 PMCID: PMC5283660 DOI: 10.1371/journal.pone.0170166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2016] [Accepted: 12/30/2016] [Indexed: 11/18/2022] Open
Abstract
The effects of visual speech (the moving image of the speaker’s face uttering a speech sound) on early auditory evoked fields (AEFs) were examined using a helmet-shaped magnetoencephalography system in 12 healthy volunteers (9 males, mean age 35.5 years). AEFs (N100m) in response to the monosyllabic sound /be/ were recorded and analyzed under three different visual stimulus conditions: the moving image of the same speaker’s face uttering /be/ (congruent visual stimuli), uttering /ge/ (incongruent visual stimuli), or visual noise (a still image processed from the speaker’s face using a strong Gaussian filter; control condition). On average, the latency of N100m was significantly shortened in both hemispheres for both congruent and incongruent auditory/visual (A/V) stimuli compared to the control A/V condition. However, the degree of N100m shortening did not differ significantly between the congruent and incongruent A/V conditions, despite the significant differences in psychophysical responses between these two conditions. Moreover, analysis of the magnitudes of these visual effects on AEFs in individuals showed that the lip-reading effects on AEFs tended to be well correlated between the two audiovisual conditions (congruent vs. incongruent visual stimuli) in both hemispheres, but were not significantly correlated between the right and left hemispheres. On the other hand, no significant correlation was observed between the magnitudes of visual speech effects and psychophysical responses. These results may indicate that the auditory-visual interaction observed on the N100m is a fundamental process that does not depend on the congruency of the visual information.
Affiliation(s)
- Izumi Yahata
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan
- Tetsuaki Kawase
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan
- Laboratory of Rehabilitative Auditory Science, Tohoku University Graduate School of Biomedical Engineering, Sendai, Miyagi, Japan
- Department of Audiology, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan
- Akitake Kanno
- Department of Functional Brain Imaging, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Miyagi, Japan
- Hiroshi Hidaka
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan
- Shuichi Sakamoto
- Research Institute of Electrical Communication, Tohoku University, Sendai, Miyagi, Japan
- Nobukazu Nakasato
- Department of Epileptology, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan
- Ryuta Kawashima
- Department of Functional Brain Imaging, Institute of Development, Aging and Cancer, Tohoku University, Sendai, Miyagi, Japan
- Yukio Katori
- Department of Otolaryngology-Head and Neck Surgery, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan
24
Mossbridge J, Zweig J, Grabowecky M, Suzuki S. An Association between Auditory-Visual Synchrony Processing and Reading Comprehension: Behavioral and Electrophysiological Evidence. J Cogn Neurosci 2017; 29:435-447. [PMID: 28129060 DOI: 10.1162/jocn_a_01052] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
The perceptual system integrates synchronized auditory-visual signals in part to promote individuation of objects in cluttered environments. The processing of auditory-visual synchrony may more generally contribute to cognition by synchronizing internally generated multimodal signals. Reading is a prime example because the ability to synchronize internal phonological and/or lexical processing with visual orthographic processing may facilitate encoding of words and meanings. Consistent with this possibility, developmental and clinical research has suggested a link between reading performance and the ability to compare visual spatial/temporal patterns with auditory temporal patterns. Here, we provide converging behavioral and electrophysiological evidence suggesting that greater behavioral ability to judge auditory-visual synchrony (Experiment 1) and greater sensitivity of an electrophysiological marker of auditory-visual synchrony processing (Experiment 2) both predict superior reading comprehension performance, accounting for 16% and 25% of the variance, respectively. These results support the idea that the mechanisms that detect auditory-visual synchrony contribute to reading comprehension.
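A quick arithmetic note on the variance figures: variance explained is the squared Pearson correlation, so 16% corresponds to r ≈ 0.40 and 25% to r ≈ 0.50. A minimal sketch of the computation (the data points and variable names are invented for illustration):

```python
import numpy as np

# Hypothetical per-participant scores: audiovisual synchrony-judgment
# accuracy vs. reading comprehension (invented numbers)
sync_acc = np.array([0.55, 0.62, 0.71, 0.64, 0.80, 0.58, 0.75, 0.69])
reading = np.array([21, 25, 30, 24, 33, 22, 31, 27])

r = np.corrcoef(sync_acc, reading)[0, 1]
print(f"r = {r:.2f}, variance explained = r^2 = {r**2:.0%}")
```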
25
Havy M, Foroud A, Fais L, Werker JF. The Role of Auditory and Visual Speech in Word Learning at 18 Months and in Adulthood. Child Dev 2017; 88:2043-2059. [PMID: 28124795 DOI: 10.1111/cdev.12715] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Visual information influences speech perception in both infants and adults. It is still unknown whether lexical representations are multisensory. To address this question, we exposed 18-month-old infants (n = 32) and adults (n = 32) to new word-object pairings: Participants either heard the acoustic form of the words or saw the talking face in silence. They were then tested on recognition in the same or the other modality. Both 18-month-old infants and adults learned the lexical mappings when the words were presented auditorily and recognized the mapping at test when the word was presented in either modality, but only adults learned new words in a visual-only presentation. These results suggest developmental changes in the sensory format of lexical representations.
Affiliation(s)
- Mélanie Havy
- University of British Columbia; Université de Genève
26
Shinozaki J, Hiroe N, Sato MA, Nagamine T, Sekiyama K. Impact of language on functional connectivity for audiovisual speech integration. Sci Rep 2016; 6:31388. [PMID: 27510407 PMCID: PMC4980767 DOI: 10.1038/srep31388] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2016] [Accepted: 07/19/2016] [Indexed: 11/17/2022] Open
Abstract
Visual information about lip and facial movements plays a role in audiovisual (AV) speech perception. Although this has been widely confirmed, previous behavioural studies have shown interlanguage differences; that is, native Japanese speakers do not integrate auditory and visual speech as closely as native English speakers. To elucidate the neural basis of such interlanguage differences, 22 native English speakers and 24 native Japanese speakers were examined in behavioural or functional Magnetic Resonance Imaging (fMRI) experiments while monosyllabic speech was presented under AV, auditory-only, or visual-only conditions for speech identification. Behavioural results indicated that the English speakers identified visual speech more quickly than the Japanese speakers, and that the temporal facilitation effect of congruent visual speech was significant in the English speakers but not in the Japanese speakers. Using fMRI data, we examined the functional connectivity among brain regions important for auditory-visual interplay. The results indicated that the English speakers had significantly stronger connectivity between the visual motion area MT and Heschl’s gyrus compared with the Japanese speakers, which may subserve lower-level visual influences on speech perception in English speakers in a multisensory environment. These results suggest that linguistic experience strongly affects the neural connectivity involved in AV speech integration.
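Seed-to-target functional connectivity of the kind compared here is commonly computed as the correlation between two regions' BOLD time courses, Fisher-z transformed and contrasted across groups. A schematic sketch under those generic assumptions (the ROI extraction, time-series simulation, and group comparison below are illustrative, not this study's pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

def roi_connectivity(bold_a, bold_b):
    """Fisher-z functional connectivity between two ROI time courses
    (1-D arrays of mean BOLD signal per volume, assumed preprocessed)."""
    r = np.corrcoef(bold_a, bold_b)[0, 1]
    return np.arctanh(r)  # Fisher z makes r values comparable across subjects

def simulate_group(coupling, n_subj=10, n_trs=200):
    """Toy subjects whose MT and Heschl's-gyrus signals share a common
    component scaled by `coupling` (purely illustrative)."""
    zs = []
    for _ in range(n_subj):
        shared = rng.standard_normal(n_trs)
        mt = shared + rng.standard_normal(n_trs)
        heschl = coupling * shared + rng.standard_normal(n_trs)
        zs.append(roi_connectivity(mt, heschl))
    return np.array(zs)

z_group1 = simulate_group(coupling=1.0)   # stronger MT-Heschl coupling
z_group2 = simulate_group(coupling=0.3)
print(z_group1.mean(), z_group2.mean())   # compare groups, e.g., with a t-test
```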
Affiliation(s)
- Jun Shinozaki
- Department of Systems Neuroscience, School of Medicine, Sapporo Medical University, Sapporo, Japan
- Nobuo Hiroe
- ATR Neural Information Analysis Laboratories, Seika-cho, Japan
- Masa-Aki Sato
- ATR Neural Information Analysis Laboratories, Seika-cho, Japan
- Takashi Nagamine
- Department of Systems Neuroscience, School of Medicine, Sapporo Medical University, Sapporo, Japan
- Kaoru Sekiyama
- Division of Cognitive Psychology, Faculty of Letters, Kumamoto University, Kumamoto, Japan
27
Skilled musicians are not subject to the McGurk effect. Sci Rep 2016; 6:30423. [PMID: 27453363 PMCID: PMC4958963 DOI: 10.1038/srep30423] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2016] [Accepted: 07/05/2016] [Indexed: 11/25/2022] Open
Abstract
The McGurk effect is a compelling illusion in which humans auditorily perceive mismatched audiovisual speech as a completely different syllable. This study provides evidence that professional musicians are not subject to the illusion, possibly because of their finer auditory or attentional abilities. Eighty healthy, age-matched graduate students volunteered for the study; 40 were musicians from the Luca Marenzio Conservatory of Music in Brescia with at least 8–13 years of academic musical training. The phonemes /la/, /da/, /ta/, /ga/, /ka/, /na/, /ba/, and /pa/ were presented to participants in audiovisual congruent and incongruent conditions, or in unimodal (visual-only or auditory-only) conditions, while they were engaged in syllable recognition tasks. Overall, musicians showed no significant McGurk effect for any of the phonemes. Controls showed a marked McGurk effect for several phonemes (including alveolar-nasal, velar-occlusive, and bilabial ones). The results indicate that early and intensive musical training might affect the way the auditory cortex processes phonetic information.
28
Variability and stability in the McGurk effect: contributions of participants, stimuli, time, and response type. Psychon Bull Rev 2016; 22:1299-307. [PMID: 25802068 DOI: 10.3758/s13423-015-0817-4] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
In the McGurk effect, pairing incongruent auditory and visual syllables produces a percept different from the component syllables. Although it is a popular assay of audiovisual speech integration, little is known about the distribution of responses to the McGurk effect in the population. In our first experiment, we measured McGurk perception using 12 different McGurk stimuli in a sample of 165 English-speaking adults, 40 of whom were retested following a one-year interval. We observed dramatic differences both in how frequently different individuals perceived the illusion (from 0% to 100%) and in how frequently the illusion was perceived across different stimuli (17% to 58%). For individual stimuli, the distributions of response frequencies deviated strongly from normality, with 77% of participants almost never or almost always perceiving the effect (≤10% or ≥90%). This deviation suggests that the mean response frequency, the most commonly reported measure of the McGurk effect, is a poor measure of individual participants' responses, and that the assumptions made by parametric statistical tests are invalid. Despite the substantial variability across individuals and stimuli, there was little change in the frequency of the effect between initial testing and a one-year retest (mean change in frequency = 2%; test-retest correlation, r = 0.91). In a second experiment, we replicated our findings of high variability using eight new McGurk stimuli and tested the effects of open-choice versus forced-choice responding. Forced-choice responding resulted in an estimated 18% greater frequency of the McGurk effect but similar levels of interindividual variability. Our results highlight the importance of examining individual differences in McGurk perception instead of relying on summary statistics averaged across a population. However, individual variability in the McGurk effect does not preclude its use as a stable measure of audiovisual integration.
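The abstract's warning that a mean fusion rate misrepresents strongly bimodal data is easy to demonstrate numerically. A toy illustration (the simulated distribution below is shaped to match the paper's description, not drawn from the study's data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate 165 participants: most near-never or near-always perceive the
# illusion, a minority in between (bimodal, as the paper describes)
p_fusion = np.concatenate([
    rng.uniform(0.00, 0.10, 70),   # near-never perceivers
    rng.uniform(0.90, 1.00, 57),   # near-always perceivers
    rng.uniform(0.10, 0.90, 38),   # intermediate perceivers
])

print(f"mean = {p_fusion.mean():.2f}")  # near 0.5, yet few individuals are near 0.5
extreme = np.mean((p_fusion <= 0.10) | (p_fusion >= 0.90))
print(f"share at the extremes = {extreme:.0%}")  # ~77%, as in the paper
```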
29
Tona R, Naito Y, Moroto S, Yamamoto R, Fujiwara K, Yamazaki H, Shinohara S, Kikuchi M. Audio-visual integration during speech perception in prelingually deafened Japanese children revealed by the McGurk effect. Int J Pediatr Otorhinolaryngol 2015; 79:2072-8. [PMID: 26455920 DOI: 10.1016/j.ijporl.2015.09.016] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/25/2015] [Revised: 09/08/2015] [Accepted: 09/13/2015] [Indexed: 10/23/2022]
Abstract
OBJECTIVE To investigate the McGurk effect in profoundly deafened Japanese children with cochlear implants (CI) and in normal-hearing children, in order to identify how children with profound deafness using CI establish audiovisual integration during the speech acquisition period. METHODS Twenty-four prelingually deafened children with CI and 12 age-matched normal-hearing children participated in this study. Responses to audiovisual stimuli were compared between deafened children and normal-hearing controls. Additionally, responses of the children with CI younger than 6 years of age were compared with those of the children with CI at least 6 years of age at the time of the test. RESULTS Responses to stimuli combining auditory labials and visual non-labials differed significantly between deafened children with CI and normal-hearing controls (p<0.05). Additionally, the McGurk effect tended to be induced more often in deafened children older than 6 years of age than in their younger counterparts. CONCLUSIONS The McGurk effect was induced significantly more often in prelingually deafened Japanese children with CI than in normal-hearing, age-matched Japanese children. Despite having good speech-perception skills and auditory input through their CI from early childhood, deafened children may use more visual information in speech perception than normal-hearing children. As children using CI need to communicate based on insufficient speech signals coded by the CI, additional higher-order brain activity may be necessary to compensate for the incomplete auditory input. This study provides information on the influence of deafness on the development of speech-related audiovisual integration, which could contribute to our further understanding of the strategies used in spoken language communication by prelingually deafened children.
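Group differences in categorical response proportions, as reported in the results above, are commonly tested with a chi-square test of independence. A generic sketch (the response counts below are invented for illustration, not the study's data):

```python
from scipy.stats import chi2_contingency

# Hypothetical counts of fused vs. non-fused responses to auditory-labial +
# visual-non-labial stimuli, one row per group (invented numbers)
#              fused  not fused
ci_children = [62, 58]
nh_children = [15, 45]

chi2, p, dof, expected = chi2_contingency([ci_children, nh_children])
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
```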
Affiliation(s)
- Risa Tona
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan; Department of Otolaryngology, Institute of Biomedical Research and Innovation, Kobe, Japan; Department of Otolaryngology, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Yasushi Naito
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan; Department of Otolaryngology, Institute of Biomedical Research and Innovation, Kobe, Japan
- Saburo Moroto
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan; Department of Otolaryngology, Institute of Biomedical Research and Innovation, Kobe, Japan
- Rinko Yamamoto
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan
- Keizo Fujiwara
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan; Department of Otolaryngology, Institute of Biomedical Research and Innovation, Kobe, Japan
- Hiroshi Yamazaki
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan; Department of Otolaryngology, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Shogo Shinohara
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan
- Masahiro Kikuchi
- Department of Otolaryngology, Kobe City Medical Center General Hospital, Kobe, Japan
30
A link between individual differences in multisensory speech perception and eye movements. Atten Percept Psychophys 2015; 77:1333-41. [PMID: 25810157 DOI: 10.3758/s13414-014-0821-1] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The McGurk effect is an illusion in which visual speech information dramatically alters the perception of auditory speech. However, there is a high degree of individual variability in how frequently the illusion is perceived: some individuals almost always perceive the McGurk effect, while others rarely do. Another axis of individual variability is the pattern of eye movements made while viewing a talking face: some individuals often fixate the mouth of the talker, while others rarely do. Since the talker's mouth carries the visual speech information necessary to induce the McGurk effect, we hypothesized that individuals who frequently perceive the McGurk effect should spend more time fixating the talker's mouth. We used infrared eye tracking to study eye movements as 40 participants viewed audiovisual speech. Frequent perceivers of the McGurk effect were more likely to fixate the mouth of the talker, and there was a significant correlation between McGurk frequency and mouth-looking time. The noisy encoding of disparity model of McGurk perception showed that individuals who frequently fixated the mouth had lower sensory noise and higher disparity thresholds than those who rarely fixated the mouth. Differences in eye movements when viewing the talker's face may be an important contributor to interindividual differences in multisensory speech perception.
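The noisy encoding of disparity (NED) model mentioned here assigns each stimulus a disparity and each participant a disparity threshold and a sensory-noise level; the fusion probability is then the probability that the noisily encoded disparity falls below the participant's threshold. A schematic sketch of that relationship (the exact parameterization is illustrative, not the published model code):

```python
from math import erf, sqrt

def p_fusion(stim_disparity, threshold, noise):
    """Probability that a disparity encoded as Normal(stim_disparity, noise)
    falls below the participant's threshold, producing a fused (McGurk)
    percept. Illustrative NED-style formulation."""
    z = (threshold - stim_disparity) / noise
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF

# Lower noise and a higher threshold (the frequent mouth-fixators in this
# study) yield more fusion for the same stimulus:
print(p_fusion(stim_disparity=1.0, threshold=1.5, noise=0.5))  # ~0.84
print(p_fusion(stim_disparity=1.0, threshold=0.8, noise=1.5))  # ~0.45
```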
31
Stevenson RA, Segers M, Ferber S, Barense MD, Camarata S, Wallace MT. Keeping time in the brain: Autism spectrum disorder and audiovisual temporal processing. Autism Res 2015; 9:720-38. [PMID: 26402725 DOI: 10.1002/aur.1566] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2015] [Revised: 08/22/2015] [Accepted: 08/29/2015] [Indexed: 12/21/2022]
Abstract
A growing area of interest and relevance in the study of autism spectrum disorder (ASD) focuses on the relationship between multisensory temporal function and the behavioral, perceptual, and cognitive impairments observed in ASD. Atypical sensory processing is becoming increasingly recognized as a core component of autism, with evidence of atypical processing across a number of sensory modalities. These deviations from typical processing underscore the value of interpreting ASD within a multisensory framework. Furthermore, converging evidence illustrates that these differences in audiovisual processing may be specifically related to temporal processing. This review seeks to bridge the connection between temporal processing and audiovisual perception, and to elaborate on emerging data showing differences in audiovisual temporal function in autism. We also discuss the consequences of such changes, the specific impact on the processing of different classes of audiovisual stimuli (e.g., speech vs. nonspeech), and the presumptive brain processes and networks underlying audiovisual temporal integration. Finally, possible downstream behavioral implications and possible remediation strategies are outlined. Autism Res 2016, 9: 720-738. © 2015 International Society for Autism Research, Wiley Periodicals, Inc.
Affiliation(s)
- Ryan A Stevenson
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada
- Magali Segers
- Department of Psychology, York University, Toronto, Ontario, Canada
- Susanne Ferber
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Rotman Research Institute, Toronto, Ontario, Canada
- Morgan D Barense
- Department of Psychology, University of Toronto, Toronto, Ontario, Canada; Rotman Research Institute, Toronto, Ontario, Canada
- Stephen Camarata
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee; Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, Tennessee
- Mark T Wallace
- Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee; Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, Tennessee; Vanderbilt Brain Institute, Vanderbilt University Medical Center, Nashville, Tennessee; Department of Psychology, Vanderbilt University, Nashville, Tennessee; Department of Psychiatry, Vanderbilt University Medical Center, Nashville, Tennessee
32
Abstract
The superior temporal sulcus (STS) is implicated in a variety of social processes, ranging from language perception to simulating the mental processes of others (theory of mind). In a new study, Deen and colleagues use functional magnetic resonance imaging (fMRI) to show a regular anterior-posterior organization in the STS for different social tasks.
Affiliation(s)
- Michael S Beauchamp
- Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, TX, USA
33
Hannagan T, Amedi A, Cohen L, Dehaene-Lambertz G, Dehaene S. Origins of the specialization for letters and numbers in ventral occipitotemporal cortex. Trends Cogn Sci 2015; 19:374-82. [DOI: 10.1016/j.tics.2015.05.006] [Citation(s) in RCA: 110] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2015] [Revised: 05/15/2015] [Accepted: 05/15/2015] [Indexed: 01/06/2023]
34
Raschle NM, Smith SA, Zuk J, Dauvermann MR, Figuccio MJ, Gaab N. Investigating the neural correlates of voice versus speech-sound directed information in pre-school children. PLoS One 2014; 9:e115549. [PMID: 25532132 PMCID: PMC4274095 DOI: 10.1371/journal.pone.0115549] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2014] [Accepted: 11/24/2014] [Indexed: 02/06/2023] Open
Abstract
Studies in sleeping newborns and infants propose that the superior temporal sulcus is involved in speech processing soon after birth. Speech processing also implicitly requires the analysis of the human voice, which conveys both linguistic and extra-linguistic information. However, due to technical and practical challenges when neuroimaging young children, evidence of neural correlates of speech and/or voice processing in toddlers and young children remains scarce. In the current study, we used functional magnetic resonance imaging (fMRI) in 20 typically developing preschool children (average age = 5.8 y; range 5.2-6.8 y) to investigate brain activation during judgments about vocal identity versus the initial speech sound of spoken object words. fMRI results reveal common brain regions responsible for voice-specific and speech-sound-specific processing of spoken object words, including bilateral primary and secondary language areas of the brain. Contrasting voice-specific with speech-sound-specific processing predominantly activates the anterior part of the right-hemispheric superior temporal sulcus. Furthermore, the right STS is functionally correlated with left-hemispheric temporal and right-hemispheric prefrontal regions. This finding underlines the importance of the right superior temporal sulcus as a temporal voice area and indicates that this brain region is specialized, and functions similarly to that of adults, by the age of five. We thus extend previous knowledge of voice-specific regions and their functional connections to the young brain, which may further our understanding of the neuronal mechanisms of speech-specific processing in children with developmental disorders, such as autism or specific language impairments.
Affiliation(s)
- Nora Maria Raschle
- Laboratories of Cognitive Neuroscience, Division of Developmental Medicine, Department of Developmental Medicine, Boston Children's Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Psychiatric University Clinics Basel, Department of Child and Adolescent Psychiatry, Basel, Switzerland
- Sara Ashley Smith
- Laboratories of Cognitive Neuroscience, Division of Developmental Medicine, Department of Developmental Medicine, Boston Children's Hospital, Boston, Massachusetts, United States of America
- Jennifer Zuk
- Laboratories of Cognitive Neuroscience, Division of Developmental Medicine, Department of Developmental Medicine, Boston Children's Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Maria Regina Dauvermann
- Laboratories of Cognitive Neuroscience, Division of Developmental Medicine, Department of Developmental Medicine, Boston Children's Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Michael Joseph Figuccio
- Laboratories of Cognitive Neuroscience, Division of Developmental Medicine, Department of Developmental Medicine, Boston Children's Hospital, Boston, Massachusetts, United States of America
- Nadine Gaab
- Laboratories of Cognitive Neuroscience, Division of Developmental Medicine, Department of Developmental Medicine, Boston Children's Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Harvard Graduate School of Education, Cambridge, Massachusetts, United States of America
35
Kaganovich N, Schumaker J. Audiovisual integration for speech during mid-childhood: electrophysiological evidence. BRAIN AND LANGUAGE 2014; 139:36-48. [PMID: 25463815 PMCID: PMC4363284 DOI: 10.1016/j.bandl.2014.09.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/22/2014] [Revised: 09/28/2014] [Accepted: 09/30/2014] [Indexed: 05/05/2023]
Abstract
Previous studies have demonstrated that the presence of visual speech cues reduces the amplitude and latency of the N1 and P2 event-related potential (ERP) components elicited by speech stimuli. However, the developmental trajectory of this effect is not yet fully mapped. We examined ERP responses to auditory, visual, and audiovisual speech in two groups of school-age children (7-8-year-olds and 10-11-year-olds) and in adults. Audiovisual speech led to the attenuation of the N1 and P2 components in all groups of participants, suggesting that the neural mechanisms underlying these effects are functional by early school years. Additionally, while the reduction in N1 was largest over the right scalp, the P2 attenuation was largest over the left and midline scalp. The difference in the hemispheric distribution of the N1 and P2 attenuation supports the idea that these components index at least somewhat disparate neural processes within the context of audiovisual speech perception.
Affiliation(s)
- Natalya Kaganovich
- Department of Speech, Language, and Hearing Sciences, Purdue University, Lyles Porter Hall, 715 Clinic Drive, West Lafayette, IN 47907-2038, United States; Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN 47907-2038, United States
- Jennifer Schumaker
- Department of Speech, Language, and Hearing Sciences, Purdue University, Lyles Porter Hall, 715 Clinic Drive, West Lafayette, IN 47907-2038, United States
36
Altvater-Mackensen N, Grossmann T. Learning to Match Auditory and Visual Speech Cues: Social Influences on Acquisition of Phonological Categories. Child Dev 2014; 86:362-78. [DOI: 10.1111/cdev.12320] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
37
Wallace MT, Stevenson RA. The construct of the multisensory temporal binding window and its dysregulation in developmental disabilities. Neuropsychologia 2014; 64:105-23. [PMID: 25128432 PMCID: PMC4326640 DOI: 10.1016/j.neuropsychologia.2014.08.005] [Citation(s) in RCA: 195] [Impact Index Per Article: 19.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2014] [Revised: 08/04/2014] [Accepted: 08/05/2014] [Indexed: 01/18/2023]
Abstract
Behavior, perception and cognition are strongly shaped by the synthesis of information across the different sensory modalities. Such multisensory integration often results in performance and perceptual benefits that reflect the additional information conferred by having cues from multiple senses providing redundant or complementary information. The spatial and temporal relationships of these cues provide powerful statistical information about how the cues should be integrated or "bound" in order to create a unified perceptual representation. Much recent work has examined the temporal factors that are integral in multisensory processing, much of it focused on the construct of the multisensory temporal binding window - the epoch of time within which stimuli from different modalities are likely to be integrated and perceptually bound. Emerging evidence suggests that this temporal window is altered in a series of neurodevelopmental disorders, including autism, dyslexia and schizophrenia. In addition to their role in sensory processing, these deficits in multisensory temporal function may play an important role in the perceptual and cognitive weaknesses that characterize these clinical disorders. Within this context, a focus on improving the acuity of multisensory temporal function may have important implications for the amelioration of the "higher-order" deficits that serve as the defining features of these disorders.
Affiliation(s)
- Mark T Wallace
- Vanderbilt Brain Institute, Vanderbilt University, 465 21st Avenue South, Nashville, TN 37232, USA; Department of Hearing & Speech Sciences, Vanderbilt University, Nashville, TN, USA; Department of Psychology, Vanderbilt University, Nashville, TN, USA; Department of Psychiatry, Vanderbilt University, Nashville, TN, USA
- Ryan A Stevenson
- Department of Psychology, University of Toronto, Toronto, ON, Canada
38
Žarić G, Fraga González G, Tijms J, van der Molen MW, Blomert L, Bonte M. Reduced neural integration of letters and speech sounds in dyslexic children scales with individual differences in reading fluency. PLoS One 2014; 9:e110337. [PMID: 25329388 PMCID: PMC4199667 DOI: 10.1371/journal.pone.0110337] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2014] [Accepted: 09/20/2014] [Indexed: 11/18/2022] Open
Abstract
The acquisition of letter-speech sound associations is one of the basic requirements for fluent reading acquisition, and its failure may contribute to reading difficulties in developmental dyslexia. Here we investigated event-related potential (ERP) measures of letter-speech sound integration in 9-year-old typical and dyslexic readers and specifically tested their relation to individual differences in reading fluency. We employed an audiovisual oddball paradigm in typical readers (n = 20), dysfluent (n = 18) and severely dysfluent (n = 18) dyslexic children. In one auditory and two audiovisual conditions, the spoken Dutch vowels /a/ and /o/ were presented as standard and deviant stimuli. In audiovisual blocks, the letter ‘a’ was presented either simultaneously (AV0) or 200 ms before (AV200) vowel sound onset. Across the three groups of children, vowel deviancy in auditory blocks elicited comparable mismatch negativity (MMN) and late negativity (LN) responses. In typical readers, both audiovisual conditions (AV0 and AV200) led to enhanced MMN and LN amplitudes. In both dyslexic groups, the audiovisual LN effects were mildly reduced. Most interestingly, individual differences in reading fluency were correlated with MMN latency in the AV0 condition. A further analysis revealed that this effect was driven by a short-lived MMN effect encompassing only the N1 window in severely dysfluent dyslexics versus a longer MMN effect encompassing both the N1 and P2 windows in the other two groups. Our results confirm and extend previous findings in dyslexic children by demonstrating a deficient pattern of letter-speech sound integration that depends on the level of reading dysfluency. These findings underscore the importance of considering individual differences across the entire spectrum of reading skills in addition to group differences between typical and dyslexic readers.
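The MMN used here is a difference wave: average the ERP across standard trials and across deviant trials, subtract the two, and read off amplitude and latency in a component window. A generic sketch of that computation (the array shapes, sampling rate, electrode, and analysis window are assumptions, not this study's parameters):

```python
import numpy as np

def mmn(epochs_std, epochs_dev, times, window=(0.10, 0.25)):
    """Mismatch negativity as the deviant-minus-standard difference wave.

    epochs_*: (n_trials, n_samples) arrays for one electrode (e.g., Fz);
    times: (n_samples,) array of seconds relative to stimulus onset.
    Returns the peak (most negative) amplitude and its latency in `window`.
    """
    diff = epochs_dev.mean(axis=0) - epochs_std.mean(axis=0)
    mask = (times >= window[0]) & (times <= window[1])
    idx = np.argmin(diff[mask])              # the MMN is a negativity
    return diff[mask][idx], times[mask][idx]

# Toy data: 512 Hz epochs from -0.1 to 0.5 s, simulated negativity near 180 ms
times = np.arange(-0.1, 0.5, 1 / 512)
rng = np.random.default_rng(2)
std = rng.standard_normal((100, times.size))
dev = rng.standard_normal((80, times.size)) - 2.0 * np.exp(-((times - 0.18) / 0.03) ** 2)

amp, lat = mmn(std, dev, times)
print(f"MMN peak {amp:.2f} microvolts at {lat * 1000:.0f} ms")
```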
Affiliation(s)
- Gojko Žarić
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, University of Maastricht, Maastricht, Netherlands
- Maastricht Brain Imaging Center (M-BIC), Maastricht, Netherlands
- Gorka Fraga González
- Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Rudolf Berlin Center, Amsterdam, Netherlands
- Jurgen Tijms
- Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- IWAL Institute for Dyslexia, Amsterdam, Netherlands
- Rudolf Berlin Center, Amsterdam, Netherlands
- Maurits W. van der Molen
- Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Rudolf Berlin Center, Amsterdam, Netherlands
- Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands
- Leo Blomert
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, University of Maastricht, Maastricht, Netherlands
- Maastricht Brain Imaging Center (M-BIC), Maastricht, Netherlands
- Milene Bonte
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, University of Maastricht, Maastricht, Netherlands
- Maastricht Brain Imaging Center (M-BIC), Maastricht, Netherlands
39
Dissociating Cortical Activity during Processing of Native and Non-Native Audiovisual Speech from Early to Late Infancy. Brain Sci 2014; 4:471-87. [PMID: 25116572 PMCID: PMC4194034 DOI: 10.3390/brainsci4030471] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2014] [Revised: 06/27/2014] [Accepted: 07/14/2014] [Indexed: 11/17/2022] Open
Abstract
Initially, infants are capable of discriminating phonetic contrasts across the world's languages. Starting between seven and ten months of age, they gradually lose this ability through a process of perceptual narrowing. Although traditionally investigated with isolated speech sounds, such narrowing occurs in a variety of perceptual domains (e.g., faces, visual speech). Thus far, tracking the developmental trajectory of this tuning process has been focused primarily on auditory speech alone, and generally using isolated sounds. But infants learn from speech produced by people talking to them, meaning they learn from a complex audiovisual signal. Here, we use near-infrared spectroscopy to measure blood concentration changes in the bilateral temporal cortices of infants in three different age groups: 3-to-6 months, 7-to-10 months, and 11-to-14 months. Critically, all three groups of infants were tested with continuous audiovisual speech in both their native and another, unfamiliar language. We found that at each age range, infants showed different patterns of cortical activity in response to the native and non-native stimuli. Infants in the youngest group showed bilateral cortical activity that was greater overall in response to non-native relative to native speech; the oldest group showed left lateralized activity in response to native relative to non-native speech. These results highlight perceptual tuning as a dynamic process that happens across modalities and at different levels of stimulus complexity.
40
Guellaï B, Streri A, Yeung HH. The development of sensorimotor influences in the audiovisual speech domain: some critical questions. Front Psychol 2014; 5:812. [PMID: 25147528 PMCID: PMC4123602 DOI: 10.3389/fpsyg.2014.00812] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2014] [Accepted: 07/09/2014] [Indexed: 11/13/2022] Open
Abstract
Speech researchers have long been interested in how auditory and visual speech signals are integrated, and recent work has revived interest in the role of speech production with respect to this process. Here, we discuss these issues from a developmental perspective. Because speech perception abilities typically outstrip speech production abilities in infancy and childhood, it is unclear how speech-like movements could influence audiovisual speech perception in development. While work on this question is still in its preliminary stages, there is nevertheless increasing evidence that sensorimotor processes (defined here as any motor or proprioceptive process related to orofacial movements) affect developmental audiovisual speech processing. We suggest three areas on which to focus in future research: (i) the relation between audiovisual speech perception and sensorimotor processes at birth, (ii) the pathways through which sensorimotor processes interact with audiovisual speech processing in infancy, and (iii) developmental change in sensorimotor pathways as speech production emerges in childhood.
Affiliation(s)
- Bahia Guellaï
- Laboratoire Ethologie, Cognition, Développement, Université Paris Ouest Nanterre La Défense, Nanterre, France
- Arlette Streri
- CNRS, Laboratoire Psychologie de la Perception, UMR 8242, Paris, France
- H. Henny Yeung
- CNRS, Laboratoire Psychologie de la Perception, UMR 8242, Paris, France
- Université Paris Descartes, Paris Sorbonne Cité, Paris, France
41
Erickson LC, Heeg E, Rauschecker JP, Turkeltaub PE. An ALE meta-analysis on the audiovisual integration of speech signals. Hum Brain Mapp 2014; 35:5587-605. [PMID: 24996043 DOI: 10.1002/hbm.22572] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2013] [Revised: 05/28/2014] [Accepted: 06/24/2014] [Indexed: 11/09/2022] Open
Abstract
The brain improves speech processing through the integration of audiovisual (AV) signals. Situations involving AV speech integration may be crudely dichotomized into those where auditory and visual inputs contain (1) equivalent, complementary signals (validating AV speech) or (2) inconsistent, different signals (conflicting AV speech). This simple framework may allow the systematic examination of broad commonalities and differences between AV neural processes engaged by various experimental paradigms frequently used to study AV speech integration. We conducted an activation likelihood estimation meta-analysis of 22 functional imaging studies comprising 33 experiments, 311 subjects, and 347 foci examining "conflicting" versus "validating" AV speech. Experimental paradigms included content congruency, timing synchrony, and perceptual measures, such as the McGurk effect or synchrony judgments, across AV speech stimulus types (sublexical to sentence). Colocalization of conflicting AV speech experiments revealed consistency across at least two contrast types (e.g., synchrony and congruency) in a network of dorsal stream regions in the frontal, parietal, and temporal lobes. There was consistency across all contrast types (synchrony, congruency, and percept) in the bilateral posterior superior/middle temporal cortex. Although fewer studies were available, validating AV speech experiments were localized to other regions, such as ventral stream visual areas in the occipital and inferior temporal cortex. These results suggest that while equivalent, complementary AV speech signals may evoke activity in regions related to the corroboration of sensory input, conflicting AV speech signals recruit widespread dorsal stream areas likely involved in the resolution of conflicting sensory signals.
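Activation likelihood estimation, the method used above, models each reported activation focus as a three-dimensional Gaussian probability blob, combines blobs within an experiment into a modeled activation (MA) map, and then takes the voxelwise union across experiments: ALE = 1 - prod(1 - MA_i). A simplified sketch on a coarse grid (the grid size, Gaussian width, and foci are arbitrary choices for illustration):

```python
import numpy as np

SHAPE = (20, 20, 20)  # toy voxel grid

def ma_map(foci, sigma=2.0):
    """Modeled activation map for one experiment: voxelwise maximum over
    Gaussian blobs centered on each reported focus (voxel coordinates)."""
    grid = np.indices(SHAPE).reshape(3, -1).T          # (n_voxels, 3)
    ma = np.zeros(len(grid))
    for focus in foci:
        d2 = ((grid - np.asarray(focus)) ** 2).sum(axis=1)
        ma = np.maximum(ma, np.exp(-d2 / (2 * sigma**2)))
    return ma.reshape(SHAPE)

def ale(experiments):
    """ALE map: probability that at least one experiment activates a voxel."""
    product = np.ones(SHAPE)
    for foci in experiments:
        product *= 1.0 - ma_map(foci)
    return 1.0 - product

# Two hypothetical experiments with nearby foci produce a high ALE value there
exp1 = [(10, 10, 10), (5, 5, 5)]
exp2 = [(11, 10, 9)]
ale_map = ale([exp1, exp2])
print(ale_map.max(), np.unravel_index(ale_map.argmax(), SHAPE))
```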
Affiliation(s)
- Laura C Erickson
- Department of Neurology, Georgetown University Medical Center, Washington, District of Columbia; Department of Neuroscience, Georgetown University Medical Center, Washington, District of Columbia
42
Erickson LC, Zielinski BA, Zielinski JEV, Liu G, Turkeltaub PE, Leaver AM, Rauschecker JP. Distinct cortical locations for integration of audiovisual speech and the McGurk effect. Front Psychol 2014; 5:534. [PMID: 24917840 PMCID: PMC4040936 DOI: 10.3389/fpsyg.2014.00534] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2014] [Accepted: 05/14/2014] [Indexed: 11/13/2022] Open
Abstract
Audiovisual (AV) speech integration is often studied using the McGurk effect, where the combination of specific incongruent auditory and visual speech cues produces the perception of a third illusory speech percept. Recently, several studies have implicated the posterior superior temporal sulcus (pSTS) in the McGurk effect; however, the exact roles of the pSTS and other brain areas in "correcting" differing AV sensory inputs remain unclear. Using functional magnetic resonance imaging (fMRI) in ten participants, we aimed to isolate brain areas specifically involved in processing congruent AV speech and the McGurk effect. Speech stimuli were composed of sounds and/or videos of consonant-vowel tokens resulting in four stimulus classes: congruent AV speech (AVCong), incongruent AV speech resulting in the McGurk effect (AVMcGurk), acoustic-only speech (AO), and visual-only speech (VO). In group- and single-subject analyses, left pSTS exhibited significantly greater fMRI signal for congruent AV speech (i.e., AVCong trials) than for both AO and VO trials. Right superior temporal gyrus, medial prefrontal cortex, and cerebellum were also identified. For McGurk speech (i.e., AVMcGurk trials), two clusters in the left posterior superior temporal gyrus (pSTG), just posterior to Heschl's gyrus or on its border, exhibited greater fMRI signal than both AO and VO trials. We propose that while some brain areas, such as left pSTS, may be more critical for the integration of AV speech, other areas, such as left pSTG, may generate the "corrected" or merged percept arising from conflicting auditory and visual cues (i.e., as in the McGurk effect). These findings are consistent with the concept that posterior superior temporal areas represent part of a "dorsal auditory stream," which is involved in multisensory integration, sensorimotor control, and optimal state estimation (Rauschecker and Scott, 2009).
Affiliation(s)
- Laura C Erickson
- Department of Neuroscience, Georgetown University Medical Center, Washington DC, USA; Department of Neurology, Georgetown University Medical Center, Washington DC, USA
- Brandon A Zielinski
- Department of Physiology and Biophysics, Georgetown University Medical Center, Washington DC, USA; Departments of Pediatrics and Neurology, Division of Child Neurology, University of Utah, Salt Lake City UT, USA
- Jennifer E V Zielinski
- Department of Physiology and Biophysics, Georgetown University Medical Center, Washington DC, USA
- Guoying Liu
- Department of Physiology and Biophysics, Georgetown University Medical Center, Washington DC, USA; National Institutes of Health, Bethesda MD, USA
- Peter E Turkeltaub
- Department of Neurology, Georgetown University Medical Center, Washington DC, USA; MedStar National Rehabilitation Hospital, Washington DC, USA
- Amber M Leaver
- Department of Neuroscience, Georgetown University Medical Center, Washington DC, USA; Department of Neurology, University of California Los Angeles, Los Angeles CA, USA
- Josef P Rauschecker
- Department of Neuroscience, Georgetown University Medical Center, Washington DC, USA; Department of Physiology and Biophysics, Georgetown University Medical Center, Washington DC, USA
43
Holtzer R, Epstein N, Mahoney JR, Izzetoglu M, Blumen HM. Neuroimaging of mobility in aging: a targeted review. J Gerontol A Biol Sci Med Sci 2014; 69:1375-88. [PMID: 24739495 DOI: 10.1093/gerona/glu052] [Citation(s) in RCA: 210] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND The relationship between mobility and cognition in aging is well established, but the relationship between mobility and the structure and function of the aging brain is relatively unknown. This, in part, is attributed to the technological limitations of most neuroimaging procedures, which require the individual to be immobile or in a supine position. Herein, we provide a targeted review of neuroimaging studies of mobility in aging to promote (i) a better understanding of this relationship, (ii) future research in this area, and (iii) development of applications for improving mobility. METHODS A systematic search of peer-reviewed studies was performed using PubMed. Search terms included (i) aging, older adults, or elderly; (ii) gait, walking, balance, or mobility; and (iii) magnetic resonance imaging, voxel-based morphometry, fluid-attenuated inversion recovery, diffusion tensor imaging, positron emission tomography, functional magnetic resonance imaging, electroencephalography, event-related potential, and functional near-infrared spectroscopy. RESULTS Poor mobility outcomes were reliably associated with reduced gray and white matter volume. Fewer studies examined the relationship between changes in task-related brain activation and mobility performance. Extant findings, however, showed that activation patterns in the cerebellum, basal ganglia, parietal and frontal cortices were related to mobility. Increased involvement of the prefrontal cortex was evident in both imagined walking conditions and conditions where the cognitive demands of locomotion were increased. CONCLUSIONS Cortical control of gait in aging is bilateral, widespread, and dependent on the integrity of both gray and white matter.
Affiliation(s)
- Roee Holtzer
- Department of Neurology, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York; Ferkauf Graduate School of Psychology of Yeshiva University, Bronx, New York
- Noah Epstein
- Ferkauf Graduate School of Psychology of Yeshiva University, Bronx, New York
- Jeannette R Mahoney
- Department of Neurology, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York
- Meltem Izzetoglu
- Drexel University School of Biomedical Engineering, Philadelphia, Pennsylvania
- Helena M Blumen
- Department of Medicine, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York
44
Knowland VCP, Mercure E, Karmiloff-Smith A, Dick F, Thomas MSC. Audio-visual speech perception: a developmental ERP investigation. Dev Sci 2014; 17:110-24. [PMID: 24176002 PMCID: PMC3995015 DOI: 10.1111/desc.12098] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2012] [Accepted: 05/14/2013] [Indexed: 11/29/2022]
Abstract
Being able to see a talking face confers a considerable advantage for speech perception in adulthood. However, behavioural data currently suggest that children fail to make full use of these available visual speech cues until age 8 or 9. This is particularly surprising given the potential utility of multiple informational cues during language learning. We therefore explored this at the neural level. The event-related potential (ERP) technique has been used to assess the mechanisms of audio-visual speech perception in adults, with visual cues reliably modulating auditory ERP responses to speech. Previous work has shown congruence-dependent shortening of auditory N1/P2 latency and congruence-independent attenuation of amplitude in the presence of auditory and visual speech signals, compared to auditory alone. The aim of this study was to chart the development of these well-established modulatory effects over mid-to-late childhood. Experiment 1 employed an adult sample to validate a child-friendly stimulus set and paradigm by replicating previously observed effects of N1/P2 amplitude and latency modulation by visual speech cues; it also revealed greater attenuation of component amplitude given incongruent audio-visual stimuli, pointing to a new interpretation of the amplitude modulation effect. Experiment 2 used the same paradigm to map cross-sectional developmental change in these ERP responses between 6 and 11 years of age. The effect of amplitude modulation by visual cues emerged over development, while the effect of latency modulation was stable over the child sample. These data suggest that auditory ERP modulation by visual speech represents separable underlying cognitive processes, some of which show earlier maturation than others over the course of development.
Affiliation(s)
- Victoria CP Knowland
- School of Health Sciences, City University London, London, UK
- Department of Psychological Sciences, Birkbeck College, London, UK
- Fred Dick
- Department of Psychological Sciences, Birkbeck College, London, UK
45
McNorgan C, Randazzo-Wagner M, Booth JR. Cross-modal integration in the brain is related to phonological awareness only in typical readers, not in those with reading difficulty. Front Hum Neurosci 2013; 7:388. [PMID: 23888137 PMCID: PMC3719029 DOI: 10.3389/fnhum.2013.00388] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2013] [Accepted: 07/04/2013] [Indexed: 11/13/2022] Open
Abstract
Fluent reading requires successfully mapping between visual orthographic and auditory phonological representations and is thus an intrinsically cross-modal process, though reading difficulty has often been characterized as a phonological deficit. However, recent evidence suggests that orthographic information influences phonological processing in typically developing (TD) readers, but that this effect may be blunted in those with reading difficulty (RD), suggesting that the core deficit underlying reading difficulties may be a failure to integrate orthographic and phonological information. Twenty-six children (13 TD and 13 RD) between 8 and 13 years of age participated in a functional magnetic resonance imaging (fMRI) experiment designed to assess the role of phonemic awareness in cross-modal processing. Participants completed a rhyme judgment task for word pairs presented unimodally (auditory only) and cross-modally (auditory followed by visual). For typically developing children, correlations between elision and neural activation were found for the cross-modal but not the unimodal task, whereas in children with RD, no correlation was found. The results suggest that elision taps both phonemic awareness and cross-modal integration in typically developing readers, and that these processes are decoupled in children with reading difficulty.
Affiliation(s)
- Chris McNorgan
- Developmental Cognitive Neuroscience Laboratory, Department of Communication Studies and Disorders, Northwestern University, Evanston, IL, USA
46
Fava E, Hull R, Baumbauer K, Bortfeld H. Hemodynamic responses to speech and music in preverbal infants. Child Neuropsychol 2013; 20:430-48. [PMID: 23777481 DOI: 10.1080/09297049.2013.803524] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
Numerous studies have provided clues about the ontogeny of lateralization of auditory processing in humans, but most have employed specific subtypes of stimuli and/or have assessed responses in discrete temporal windows. The present study used near-infrared spectroscopy (NIRS) to establish changes in hemodynamic activity in the neocortex of preverbal infants (aged 4-11 months) while they were exposed to two distinct types of complex auditory stimuli (full sentences and musical phrases). Measurements were taken from bilateral temporal regions, including both anterior and posterior superior temporal gyri. When the infant sample was treated as a homogeneous group, no significant effects emerged for stimulus type. However, when infants' hemodynamic responses were categorized according to their overall changes in volume, two very clear neurophysiological patterns emerged. A high-responder group showed a pattern of early and increasing activation, primarily in the left hemisphere, similar to that observed in comparable studies with adults. In contrast, a low-responder group showed a pattern of gradual decreases in activation over time. Although age did track with responder type, no significant differences between these groups emerged for stimulus type, suggesting that the high- versus low-responder characterization generalizes across classes of auditory stimuli. These results highlight a new way to conceptualize the variable cortical blood flow patterns that are frequently observed across infants and stimuli, with hemodynamic response volumes potentially serving as an early indicator of developmental changes in auditory-processing sensitivity.
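As an illustration of the responder categorization, the following sketch integrates a toy oxy-hemoglobin time course to obtain a response volume per infant and splits the sample at the median. All signals, the sampling rate, and the median-split rule are assumptions for illustration, not the authors' procedure:

```python
import numpy as np
from scipy.integrate import trapezoid

rng = np.random.default_rng(1)
fs = 10                                  # NIRS sampling rate in Hz (assumed)
t = np.arange(0, 20, 1 / fs)             # 20 s post-stimulus window (assumed)

def simulated_hbo(peak):
    """Toy oxy-Hb time course: gamma-like rise and fall scaled by `peak`."""
    return peak * (t / 5.0) * np.exp(1 - t / 5.0) + rng.normal(0, 0.05, t.size)

# One overall response volume per infant: area under the oxy-Hb curve.
volumes = np.array([trapezoid(simulated_hbo(p), t)
                    for p in rng.uniform(-0.2, 1.0, 16)])

high = volumes >= np.median(volumes)     # median split into responder groups
print("high responders:", high.sum(), "| low responders:", (~high).sum())
```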
Affiliation(s)
- Eswen Fava
- Department of Psychology, University of Massachusetts Amherst, Amherst, MA, USA
47
McNorgan C, Awati N, Desroches AS, Booth JR. Multimodal lexical processing in auditory cortex is literacy skill dependent. Cereb Cortex 2013; 24:2464-75. [PMID: 23588185 DOI: 10.1093/cercor/bht100] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Literacy is a uniquely human cross-modal cognitive process wherein visual orthographic representations become associated with auditory phonological representations through experience. Developmental studies provide insight into how experience-dependent changes in brain organization influence phonological processing as a function of literacy. Previous investigations show a synchrony-dependent influence of letter presentation on individual phoneme processing in the superior temporal sulcus; others demonstrate recruitment of primary and associative auditory cortex during cross-modal processing. We sought to determine whether brain regions supporting phonological processing of larger lexical units (monosyllabic words) over longer time windows are sensitive to cross-modal information, and whether such effects are literacy dependent. Twenty-two children (age 8-14 years) made rhyming judgments for sequentially presented word and pseudoword pairs presented either unimodally (auditory- or visual-only) or cross-modally (audiovisual). Regression analyses examined the relationship between literacy and congruency effects (overlapping orthography and phonology vs. overlapping phonology only). We extend previous findings by showing that higher literacy is correlated with greater congruency effects in auditory cortex (i.e., planum temporale) only for cross-modal processing. These skill effects were specific to known words and occurred over a long time window, suggesting that multimodal integration in posterior auditory cortex is critical for fluent reading.
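The regression logic can be sketched as follows, with simulated values and a hypothetical planum temporale congruency contrast; only the sample size comes from the abstract:

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(2)

# Simulated literacy scores for 22 children and a toy congruency effect
# (congruent minus incongruent activation) in planum temporale for each child.
literacy = rng.normal(100, 12, 22)
congruency_effect = 0.02 * literacy + rng.normal(0, 0.3, 22)

# Does literacy skill predict the size of the cross-modal congruency effect?
fit = linregress(literacy, congruency_effect)
print(f"slope = {fit.slope:.3f}, r = {fit.rvalue:.2f}, p = {fit.pvalue:.3f}")
```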
Affiliation(s)
- Chris McNorgan
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
- Neha Awati
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
- Amy S Desroches
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
- James R Booth
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL 60208, USA
48
Hickok G. The cortical organization of speech processing: feedback control and predictive coding in the context of a dual-stream model. J Commun Disord 2012; 45:393-402. [PMID: 22766458 PMCID: PMC3468690 DOI: 10.1016/j.jcomdis.2012.06.004] [Citation(s) in RCA: 148] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
Speech recognition is an active process that involves some form of predictive coding. This statement is relatively uncontroversial. What is less clear is the source of the prediction. The dual-stream model of speech processing suggests that there are two possible sources of predictive coding in speech perception: the motor speech system and the lexical-conceptual system. Here I provide an overview of the dual-stream model of speech processing and then discuss evidence concerning the source of predictive coding during speech recognition. I conclude that, in contrast to recent theoretical trends, the dorsal sensory-motor stream is not a source of forward prediction that can facilitate speech recognition. Rather, it is forward prediction coming out of the ventral stream that serves this function.
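As a toy illustration of the claimed direction of prediction (invented numbers, not from the paper), a context-driven lexical prior of the kind attributed to the ventral stream sharpens recognition when combined with noisy acoustic evidence:

```python
import numpy as np

words = ["cat", "cap", "can"]
likelihood = np.array([0.40, 0.35, 0.25])     # noisy acoustic evidence (toy)
flat_prior = np.array([1/3, 1/3, 1/3])        # no contextual prediction
lexical_prior = np.array([0.70, 0.15, 0.15])  # lexical-conceptual prediction for "cat"

def posterior(prior, lik):
    """Normalized product of prior and likelihood (Bayes' rule over a word list)."""
    p = prior * lik
    return p / p.sum()

print("without prediction:", posterior(flat_prior, likelihood).round(2))
print("with prediction:   ", posterior(lexical_prior, likelihood).round(2))
```

With an informative prior the target word dominates the posterior, which is one way to picture how ventral-stream forward prediction could facilitate recognition.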
Affiliation(s)
- Gregory Hickok
- Department of Cognitive Sciences, University of California, Irvine, Irvine, CA 92697, United States.
49
de Haas B, Kanai R, Jalkanen L, Rees G. Grey matter volume in early human visual cortex predicts proneness to the sound-induced flash illusion. Proc Biol Sci 2012; 279:4955-61. [PMID: 23097516 PMCID: PMC3497249 DOI: 10.1098/rspb.2012.2132] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Visual perception can be modulated by sounds. A drastic example of this is the sound-induced flash illusion: when a single flash is accompanied by two bleeps, it is sometimes perceived in an illusory fashion as two consecutive flashes. However, there are strong individual differences in proneness to this illusion. Some participants experience the illusion on almost every trial, whereas others almost never do. We investigated whether such individual differences in proneness to the sound-induced flash illusion were reflected in structural differences in brain regions whose activity is modulated by the illusion. We found that individual differences in proneness to the illusion were strongly and significantly correlated with local grey matter volume in early retinotopic visual cortex. Participants with smaller early visual cortices were more prone to the illusion. We propose that strength of auditory influences on visual perception is determined by individual differences in recurrent connections, cross-modal attention and/or optimal weighting of sensory channels.
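The reported brain-behaviour relationship amounts to a simple correlation across participants. A schematic sketch with simulated data (the sample size, effect size, and units are all invented) looks like this:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)

# Toy normalized grey matter volumes in early visual cortex for 30 participants,
# and illusion proneness (proportion of trials with the illusory double flash)
# constructed so that smaller cortices yield more illusions.
gm_volume = rng.normal(1.0, 0.1, 30)
proneness = np.clip(1.6 - gm_volume + rng.normal(0, 0.1, 30), 0, 1)

r, p = pearsonr(gm_volume, proneness)
print(f"r = {r:.2f}, p = {p:.4f}")   # expected: a strong negative correlation
```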
Affiliation(s)
- Benjamin de Haas
- University College London Institute of Cognitive Neuroscience, 17 Queen Square, London WC1N 3BG, UK.
50
Baum SH, Martin RC, Hamilton AC, Beauchamp MS. Multisensory speech perception without the left superior temporal sulcus. Neuroimage 2012; 62:1825-32. [PMID: 22634292 DOI: 10.1016/j.neuroimage.2012.05.034] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2012] [Revised: 05/10/2012] [Accepted: 05/14/2012] [Indexed: 10/28/2022] Open
Abstract
Converging evidence suggests that the left superior temporal sulcus (STS) is a critical site for multisensory integration of auditory and visual information during speech perception. We report a patient, SJ, who suffered a stroke that damaged the left temporo-parietal area, resulting in mild anomic aphasia. Structural MRI showed complete destruction of the left middle and posterior STS, as well as damage to adjacent areas in the temporal and parietal lobes. Surprisingly, SJ demonstrated preserved multisensory integration as measured with two independent tests. First, she perceived the McGurk effect, an illusion that requires integration of auditory and visual speech. Second, her perception of morphed audiovisual speech with ambiguous auditory or visual information was significantly influenced by the opposing modality. To understand the neural basis for this preserved multisensory integration, blood-oxygen level dependent functional magnetic resonance imaging (BOLD fMRI) was used to examine brain responses to audiovisual speech in SJ and 23 healthy age-matched controls. In controls, bilateral STS activity was observed. In SJ, no activity was observed in the damaged left STS, but more cortex was active in her right STS than in any of the controls. Further, the amplitude of the right STS BOLD response to McGurk stimuli was significantly greater in SJ than in controls. The simplest explanation of these results is a reorganization of SJ's cortical language networks such that the right STS now subserves multisensory integration of speech.
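A standard way to ask whether a single patient's response amplitude exceeds a control distribution is a single-case comparison such as the Crawford-Howell t-test. The sketch below uses invented beta values; only the control sample size of 23 follows the abstract, and this is not presented as the authors' analysis:

```python
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(4)
controls = rng.normal(0.30, 0.08, 23)   # right-STS McGurk betas for 23 controls (toy)
patient_beta = 0.62                     # hypothetical beta for the patient

# Crawford-Howell (1998) single-case t statistic with n - 1 degrees of freedom:
# the control SD is inflated by sqrt((n + 1) / n) to account for sample size.
n = controls.size
t_stat = (patient_beta - controls.mean()) / (controls.std(ddof=1) * np.sqrt((n + 1) / n))
p_one_tailed = t_dist.sf(t_stat, df=n - 1)
print(f"t({n - 1}) = {t_stat:.2f}, one-tailed p = {p_one_tailed:.4f}")
```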
Affiliation(s)
- Sarah H Baum
- Department of Neurobiology and Anatomy, University of Texas Medical School at Houston, TX, USA