1. Jertberg RM, Wienicke FJ, Andruszkiewicz K, Begeer S, Chakrabarti B, Geurts HM, de Vries R, Van der Burg E. Differences between autistic and non-autistic individuals in audiovisual speech integration: A systematic review and meta-analysis. Neurosci Biobehav Rev 2024; 164:105787. [PMID: 38945419] [DOI: 10.1016/j.neubiorev.2024.105787]
Abstract
Research has indicated unique challenges in audiovisual integration of speech among autistic individuals, although methodological differences have led to divergent findings. We conducted a systematic literature search to identify studies that measured audiovisual speech integration among both autistic and non-autistic individuals. Across the 18 identified studies (combined N = 952), autistic individuals showed impaired audiovisual integration compared to their non-autistic peers (g = 0.69, 95% CI [0.53, 0.85], p < .001). This difference was not found to be influenced by participants' mean ages, studies' sample sizes, risk-of-bias scores, or paradigms employed. However, a subgroup analysis suggested that studies with children may show larger between-group differences than those with adults. The prevailing pattern of impaired audiovisual speech integration in autism may have cascading effects on communicative and social behavior. However, small samples and inconsistency in designs/analyses translated into considerable heterogeneity in findings and opacity regarding the influence of underlying unisensory and attentional factors. We recommend three key directions for future research: larger samples, more research with adults, and standardization of methodology and analytical approaches.
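
To make the pooling step concrete, here is a minimal sketch of how a standardized mean difference (Hedges' g) and a random-effects (DerSimonian-Laird) pooled estimate are typically computed. All group means, SDs, and sample sizes below are invented for illustration and are not taken from the review.

```python
import numpy as np

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with small-sample (Hedges) correction."""
    sp = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)          # small-sample correction factor
    g = j * d
    var_g = j**2 * ((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return g, var_g

def random_effects(gs, vs):
    """DerSimonian-Laird random-effects pooling of per-study effect sizes."""
    gs, vs = np.asarray(gs), np.asarray(vs)
    w = 1 / vs                                # fixed-effect weights
    q = np.sum(w * (gs - np.sum(w * gs) / np.sum(w))**2)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(gs) - 1)) / c)  # between-study variance estimate
    w_star = 1 / (vs + tau2)
    g_pool = np.sum(w_star * gs) / np.sum(w_star)
    se = np.sqrt(1 / np.sum(w_star))
    return g_pool, (g_pool - 1.96 * se, g_pool + 1.96 * se)

# Hypothetical per-study summaries (non-autistic minus autistic integration scores)
studies = [hedges_g(0.62, 0.20, 30, 0.48, 0.22, 28),
           hedges_g(0.55, 0.18, 45, 0.41, 0.19, 44),
           hedges_g(0.70, 0.25, 25, 0.52, 0.24, 26)]
g, ci = random_effects(*zip(*studies))
print(f"pooled g = {g:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```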
Affiliation(s)
- Robert M Jertberg
- Department of Clinical and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands and Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
- Frederik J Wienicke
- Department of Clinical Psychology, Behavioural Science Institute, Radboud University, Nijmegen, the Netherlands
- Krystian Andruszkiewicz
- Department of Clinical and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands and Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
- Sander Begeer
- Department of Clinical and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands and Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
- Bhismadev Chakrabarti
- Centre for Autism, School of Psychology and Clinical Language Sciences, University of Reading, UK; India Autism Center, Kolkata, India; Department of Psychology, Ashoka University, India
- Hilde M Geurts
- Department of Psychology, Universiteit van Amsterdam, the Netherlands; Leo Kannerhuis (Youz/Parnassiagroup), the Netherlands
- Ralph de Vries
- Medical Library, Vrije Universiteit, Amsterdam, the Netherlands
- Erik Van der Burg
- Department of Clinical and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands and Amsterdam Public Health Research Institute, Amsterdam, the Netherlands; Department of Psychology, Universiteit van Amsterdam, the Netherlands
2. Jertberg RM, Begeer S, Geurts HM, Chakrabarti B, Van der Burg E. Age, not autism, influences multisensory integration of speech stimuli among adults in a McGurk/MacDonald paradigm. Eur J Neurosci 2024; 59:2979-2994. [PMID: 38570828] [DOI: 10.1111/ejn.16319]
Abstract
Differences between autistic and non-autistic individuals in perception of the temporal relationships between sights and sounds are theorized to underlie difficulties in integrating relevant sensory information. These, in turn, are thought to contribute to problems with speech perception and higher level social behaviour. However, the literature establishing this connection often involves limited sample sizes and focuses almost entirely on children. To determine whether these differences persist into adulthood, we compared 496 autistic and 373 non-autistic adults (aged 17 to 75 years). Participants completed an online version of the McGurk/MacDonald paradigm, which elicits a multisensory illusion indicative of the ability to integrate audiovisual speech stimuli. Audiovisual asynchrony was manipulated, and participants responded both to the syllable they perceived (revealing their susceptibility to the illusion) and to whether or not the audio and video were synchronized (allowing insight into temporal processing). In contrast to prior research with smaller, younger samples, we detected no evidence of impaired temporal or multisensory processing in autistic adults. Instead, we found that in both groups, multisensory integration correlated strongly with age. This contradicts prior presumptions that differences in multisensory perception persist and even increase in magnitude over the lifespan of autistic individuals. It also suggests that the compensatory role multisensory integration may play as the individual senses decline with age is intact. These findings challenge existing theories and provide an optimistic perspective on autistic development. They also underline the importance of expanding autism research to better reflect the age range of the autistic population.
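
As an illustration of how susceptibility to the illusion can be scored and related to age, the following sketch simulates per-participant proportions of fused ("da") responses to incongruent auditory-/ba/ + visual-/ga/ trials and correlates them with age. The data, trial counts, and effect size are hypothetical, not the study's.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def susceptibility(responses):
    """Proportion of illusory ('da') responses on incongruent
    auditory-/ba/ + visual-/ga/ trials."""
    return np.mean(np.asarray(responses) == "da")

# Hypothetical sample: 100 adults, 40 incongruent trials each.
ages = rng.uniform(17, 75, 100)
# Simulate a modest positive age effect on fusion probability.
p_fuse = np.clip(0.25 + 0.006 * (ages - 17), 0, 1)
scores = [susceptibility(rng.choice(["da", "ba"], 40, p=[p, 1 - p]))
          for p in p_fuse]

r, pval = stats.pearsonr(ages, scores)
print(f"age vs. susceptibility: r = {r:.2f}, p = {pval:.3g}")
```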
Affiliation(s)
- Robert M Jertberg
- Department of Clinical and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands and Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
- Sander Begeer
- Department of Clinical and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands and Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
- Hilde M Geurts
- Dutch Autism and ADHD Research Center (d'Arc), Brain & Cognition, Department of Psychology, Universiteit van Amsterdam, Amsterdam, the Netherlands
- Leo Kannerhuis (Youz/Parnassiagroup), Den Haag, the Netherlands
- Bhismadev Chakrabarti
- Centre for Autism, School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- India Autism Center, Kolkata, India
- Department of Psychology, Ashoka University, Sonipat, India
- Erik Van der Burg
- Dutch Autism and ADHD Research Center (d'Arc), Brain & Cognition, Department of Psychology, Universiteit van Amsterdam, Amsterdam, the Netherlands
3. Hong S, Wang R, Zeng B. Incongruent visual cues affect the perception of Mandarin vowel but not tone. Front Psychol 2023; 13:971979. [PMID: 36687891] [PMCID: PMC9846355] [DOI: 10.3389/fpsyg.2022.971979]
Abstract
Over recent decades, a large number of audiovisual speech studies have focused on the visual cues of consonants and vowels while neglecting those of lexical tones. In this study, we investigated whether incongruent audiovisual information interfered with the perception of lexical tones. We found that, for both Chinese and English speakers, incongruence between the auditory signal and visemic mouth shape (i.e., visual form information) significantly slowed reaction times and reduced identification accuracy for vowels. However, incongruent lip movements (i.e., visual timing information) did not interfere with the perception of auditory lexical tone. We conclude that, in contrast to vowel perception, auditory tone perception seems relatively impervious to incongruent visual cues, at least under these restricted laboratory conditions. The salience of visual form and timing information is discussed in light of this finding.
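
A minimal sketch of the kind of congruency analysis described, comparing vowel-identification reaction times on congruent versus incongruent trials with a paired t-test; the RTs and interference cost below are simulated, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Hypothetical per-participant mean RTs (ms) for vowel identification
n = 30
congruent = rng.normal(620, 60, n)
incongruent = congruent + rng.normal(45, 25, n)   # simulated interference cost

t, p = stats.ttest_rel(incongruent, congruent)
cost = (incongruent - congruent).mean()
print(f"congruency cost = {cost:.0f} ms, t({n - 1}) = {t:.2f}, p = {p:.3g}")
```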
Affiliation(s)
- Shanhu Hong
- Institute of Foreign Language and Tourism, Quanzhou Preschool Education College, Quanzhou, China
- Department of Psychology, Bournemouth University, Poole, United Kingdom
- Rui Wang
- School of Foreign Languages, Guangdong Pharmaceutical University, Guangzhou, China
- Biao Zeng
- Department of Psychology, Bournemouth University, Poole, United Kingdom
- EEG Lab, Department of Psychology, University of South Wales, Newport, United Kingdom
4. Van Engen KJ, Dey A, Sommers MS, Peelle JE. Audiovisual speech perception: Moving beyond McGurk. J Acoust Soc Am 2022; 152:3216. [PMID: 36586857] [PMCID: PMC9894660] [DOI: 10.1121/10.0015262]
Abstract
Although it is clear that sighted listeners use both auditory and visual cues during speech perception, the manner in which multisensory information is combined is a matter of debate. One approach to measuring multisensory integration is to use variants of the McGurk illusion, in which discrepant auditory and visual cues produce auditory percepts that differ from those based on unimodal input. Not all listeners show the same degree of susceptibility to the McGurk illusion, and these individual differences are frequently used as a measure of audiovisual integration ability. However, despite their popularity, we join the voices of others in the field to argue that McGurk tasks are ill-suited for studying real-life multisensory speech perception: McGurk stimuli are often based on isolated syllables (which are rare in conversations) and necessarily rely on audiovisual incongruence that does not occur naturally. Furthermore, recent data show that susceptibility to McGurk tasks does not correlate with performance during natural audiovisual speech perception. Although the McGurk effect is a fascinating illusion, truly understanding the combined use of auditory and visual information during speech perception requires tasks that more closely resemble everyday communication: namely, words, sentences, and narratives with congruent auditory and visual speech cues.
Affiliation(s)
- Kristin J Van Engen
- Department of Psychological and Brain Sciences, Washington University, St. Louis, Missouri 63130, USA
- Avanti Dey
- PLOS ONE, 1265 Battery Street, San Francisco, California 94111, USA
- Mitchell S Sommers
- Department of Psychological and Brain Sciences, Washington University, St. Louis, Missouri 63130, USA
- Jonathan E Peelle
- Department of Otolaryngology, Washington University, St. Louis, Missouri 63130, USA
5. Lozano I, López Pérez D, Laudańska Z, Malinowska-Korczak A, Szmytke M, Radkowska A, Tomalski P. Changes in selective attention to articulating mouth across infancy: Sex differences and associations with language outcomes. Infancy 2022; 27:1132-1153. [DOI: 10.1111/infa.12496]
Affiliation(s)
- Itziar Lozano
- Department of Cognitive Psychology and Neurocognitive Science, Faculty of Psychology, University of Warsaw, Warsaw, Poland
- Faculty of Psychology, Universidad Autónoma de Madrid, Madrid, Spain
- David López Pérez
- Neurocognitive Development Lab, Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland
- Zuzanna Laudańska
- Neurocognitive Development Lab, Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland
- Anna Malinowska-Korczak
- Neurocognitive Development Lab, Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland
- Magdalena Szmytke
- Neurocognitive Development Lab, Faculty of Psychology, University of Warsaw, Warsaw, Poland
- Alicja Radkowska
- Neurocognitive Development Lab, Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland
- Neurocognitive Development Lab, Faculty of Psychology, University of Warsaw, Warsaw, Poland
- Przemysław Tomalski
- Neurocognitive Development Lab, Institute of Psychology, Polish Academy of Sciences, Warsaw, Poland
6. Irwin J, Avery T, Kleinman D, Landi N. Audiovisual Speech Perception in Children with Autism Spectrum Disorders: Evidence from Visual Phonemic Restoration. J Autism Dev Disord 2021; 52:28-37. [DOI: 10.1007/s10803-021-04916-x]
7. Plass J, Brang D, Suzuki S, Grabowecky M. Vision perceptually restores auditory spectral dynamics in speech. Proc Natl Acad Sci U S A 2020; 117:16920-16927. [PMID: 32632010] [PMCID: PMC7382243] [DOI: 10.1073/pnas.2002887117]
Abstract
Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require lossy remapping of speech signals onto abstracted representations. Because visible articulators shape the spectral content of speech, we hypothesized that the perceptual system might exploit natural correlations between midlevel visual (oral deformations) and auditory speech features (frequency modulations) to extract detailed spectrotemporal information from visual speech without employing high-level abstractions. Consistent with this hypothesis, we found that the time-frequency dynamics of oral resonances (formants) could be predicted with unexpectedly high precision from the changing shape of the mouth during speech. When isolated from other speech cues, speech-based shape deformations improved perceptual sensitivity for corresponding frequency modulations, suggesting that listeners could exploit this cross-modal correspondence to facilitate perception. To test whether this type of correspondence could improve speech comprehension, we selectively degraded the spectral or temporal dimensions of auditory sentence spectrograms to assess how well visual speech facilitated comprehension under each degradation condition. Visual speech produced drastically larger enhancements during spectral degradation, suggesting a condition-specific facilitation effect driven by cross-modal recovery of auditory speech spectra. The perceptual system may therefore use audiovisual correlations rooted in oral acoustics to extract detailed spectrotemporal information from visual speech.
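
To illustrate the idea of predicting formant dynamics from oral shape, the sketch below fits an ordinary least-squares model mapping two hypothetical mouth-shape features to F2 frequency and reports prediction accuracy. The features, coefficients, and noise level are invented, and the authors' actual analysis was more sophisticated.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical frames: lip aperture and spread (arbitrary units) vs. F2 (Hz)
n = 500
aperture = rng.uniform(0, 1, n)
spread = rng.uniform(0, 1, n)
f2 = 1200 + 600 * spread - 300 * aperture + rng.normal(0, 50, n)

# Least-squares fit predicting F2 from the two oral shape features
X = np.column_stack([np.ones(n), aperture, spread])
beta, *_ = np.linalg.lstsq(X, f2, rcond=None)
pred = X @ beta
r = np.corrcoef(pred, f2)[0, 1]
print(f"formant prediction accuracy (r): {r:.2f}")
```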
Affiliation(s)
- John Plass
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- Department of Psychology, Northwestern University, Evanston, IL 60208
- David Brang
- Department of Psychology, University of Michigan, Ann Arbor, MI 48109
- Satoru Suzuki
- Department of Psychology, Northwestern University, Evanston, IL 60208
- Interdepartmental Neuroscience Program, Northwestern University, Chicago, IL 60611
- Marcia Grabowecky
- Department of Psychology, Northwestern University, Evanston, IL 60208
- Interdepartmental Neuroscience Program, Northwestern University, Chicago, IL 60611
8. Dunham K, Feldman JI, Liu Y, Cassidy M, Conrad JG, Santapuram P, Suzman E, Tu A, Butera I, Simon DM, Broderick N, Wallace MT, Lewkowicz D, Woynaroski TG. Stability of Variables Derived From Measures of Multisensory Function in Children With Autism Spectrum Disorder. Am J Intellect Dev Disabil 2020; 125:287-303. [PMID: 32609807] [PMCID: PMC8903073] [DOI: 10.1352/1944-7558-125.4.287]
Abstract
Children with autism spectrum disorder (ASD) display differences in multisensory function as quantified by several different measures. This study estimated the stability of variables derived from commonly used measures of multisensory function in school-aged children with ASD. Participants completed a simultaneity judgment task for audiovisual speech, tasks designed to elicit the McGurk effect, listening-in-noise tasks, electroencephalographic recordings, and eye-tracking tasks. Results indicate that the stability of indices derived from tasks tapping multisensory processing is variable. These findings have important implications for measurement in future research: averaging scores across repeated observations will often be required to obtain acceptably stable estimates and, thus, to increase the likelihood of detecting effects of interest in multisensory processing in children with ASD.
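
The recommendation to average across repeated observations follows directly from classical test theory: the Spearman-Brown prophecy formula predicts how reliability grows as k parallel measurements are averaged. A minimal sketch, assuming a hypothetical single-session reliability of .45 (not a value from the study):

```python
def spearman_brown(r_single, k):
    """Predicted reliability after averaging k parallel observations,
    given the reliability of a single observation (Spearman-Brown)."""
    return k * r_single / (1 + (k - 1) * r_single)

# Hypothetical single-session reliability of a multisensory index
r1 = 0.45
for k in (1, 2, 3, 4):
    print(f"k = {k}: predicted reliability = {spearman_brown(r1, k):.2f}")
```

Under this assumption, averaging three sessions would lift reliability from .45 to roughly .71, which is the sense in which repeated observations increase the likelihood of detecting effects of interest.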
Affiliation(s)
- Kacie Dunham
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Jacob I. Feldman
- Department of Hearing & Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Yupeng Liu
- Neuroscience Undergraduate Program, Vanderbilt University, Nashville, TN, USA
- Margaret Cassidy
- Neuroscience Undergraduate Program, Vanderbilt University, Nashville, TN, USA
- Julie G. Conrad
- Neuroscience Undergraduate Program, Vanderbilt University, Nashville, TN, USA
- Present address: College of Medicine, University of Illinois, Chicago, IL, USA
- Pooja Santapuram
- Neuroscience Undergraduate Program, Vanderbilt University, Nashville, TN, USA
- Present address: School of Medicine, Vanderbilt University, Nashville, TN, USA
- Evan Suzman
- Department of Biomedical Sciences, Vanderbilt University, Nashville, TN, USA
- Alexander Tu
- Neuroscience Undergraduate Program, Vanderbilt University, Nashville, TN, USA
- Present address: College of Medicine, University of Nebraska Medical Center, Omaha, NE, USA
- Iliza Butera
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- David M. Simon
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Present address: axialHealthcare, Nashville, TN, USA
- Neill Broderick
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
- Mark T. Wallace
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Department of Hearing & Speech Sciences, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Psychology, Vanderbilt University, Nashville, TN, USA
- Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
- David Lewkowicz
- Department of Communication Sciences & Disorders, Northeastern University, Boston, MA, USA
- Tiffany G. Woynaroski
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Hearing & Speech Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
9. Irwin J, Avery T, Turcios J, Brancazio L, Cook B, Landi N. Electrophysiological Indices of Audiovisual Speech Perception in the Broader Autism Phenotype. Brain Sci 2017; 7:E60. [PMID: 28574442] [PMCID: PMC5483633] [DOI: 10.3390/brainsci7060060]
Abstract
When a speaker talks, the consequences can be both heard (audio) and seen (visual). A novel visual phonemic restoration task was used to assess behavioral discrimination and neural signatures (event-related potentials, or ERPs) of audiovisual processing in typically developing children with a range of social and communicative skills, assessed using the Social Responsiveness Scale, a measure of traits associated with autism. An auditory oddball design presented two types of stimuli to the listener: a clear exemplar of an auditory consonant-vowel syllable /ba/ (the more frequently occurring standard stimulus) and a syllable in which the auditory cues for the consonant were substantially weakened, yielding a stimulus closer to /a/ (the infrequently presented deviant stimulus). All speech tokens were paired with a face producing /ba/ or a face with a pixelated mouth containing motion but no visual speech. In this paradigm, the visual /ba/ should cause the auditory /a/ to be perceived as /ba/, attenuating the oddball response; in contrast, a pixelated video (without articulatory information) should not have this effect. Behaviorally, participants showed visual phonemic restoration (reduced accuracy in detecting the deviant /a/) in the presence of a speaking face. In addition, ERPs in both an early time window (N100) and a later time window (P300) were sensitive to speech context (/ba/ or /a/) and modulated by face context (speaking face with visible articulation or pixelated mouth). Specifically, the N100 and P300 oddball responses were attenuated in the presence of a face producing /ba/ relative to a pixelated face, representing a possible neural correlate of the phonemic restoration effect. Notably, individuals with more traits associated with autism (yet still in the non-clinical range) had smaller P300 responses overall, regardless of face context, suggesting generally reduced phonemic discrimination.
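
As a sketch of how such an oddball attenuation might be quantified, the code below builds a difference wave (deviant minus standard) from simulated averaged ERPs and measures its mean amplitude in a P300 window. The waveforms, window bounds, and amplitudes are hypothetical, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 500                                  # sampling rate in Hz
t = np.arange(-0.1, 0.8, 1 / fs)          # epoch: -100 ms to 800 ms

def mean_amplitude(erp, t, window):
    """Mean ERP amplitude within a latency window (in seconds)."""
    mask = (t >= window[0]) & (t <= window[1])
    return erp[mask].mean()

# Hypothetical averaged ERPs (µV) for standard /ba/ and deviant /a/
# in the speaking-face condition; a P300-like deviant positivity.
standard = rng.normal(0, 0.3, t.size)
deviant = rng.normal(0, 0.3, t.size) + 2.0 * np.exp(-((t - 0.35) ** 2) / 0.005)

difference = deviant - standard            # classic oddball difference wave
p300 = mean_amplitude(difference, t, (0.25, 0.45))
print(f"P300 difference-wave amplitude: {p300:.2f} µV")
```

Comparing this amplitude between face contexts (speaking face versus pixelated mouth) would index the attenuation effect the paper describes.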
Affiliation(s)
- Julia Irwin
- Haskins Laboratories, New Haven, CT 06511, USA
- Department of Psychology, Southern Connecticut State University, New Haven, CT 06515, USA
- Trey Avery
- Haskins Laboratories, New Haven, CT 06511, USA
- Jacqueline Turcios
- Haskins Laboratories, New Haven, CT 06511, USA
- Department of Communication Disorders, Southern Connecticut State University, New Haven, CT 06515, USA
- Lawrence Brancazio
- Haskins Laboratories, New Haven, CT 06511, USA
- Department of Psychology, Southern Connecticut State University, New Haven, CT 06515, USA
- Barbara Cook
- Department of Communication Disorders, Southern Connecticut State University, New Haven, CT 06515, USA
- Nicole Landi
- Haskins Laboratories, New Haven, CT 06511, USA
- Psychological Sciences, University of Connecticut, Storrs, CT 06269, USA