1
Kaganovich N, Schumaker J, Christ S. Long-term phonemic representations become audiovisual by mid-childhood. Neuropsychologia 2023; 188:108633. [PMID: 37394134] [PMCID: PMC10530328] [DOI: 10.1016/j.neuropsychologia.2023.108633]
Abstract
In earlier work with adults, we showed that long-term phonemic representations are audiovisual, meaning that they contain information on typical mouth shape during articulation. Many aspects of audiovisual processing have a prolonged developmental course, often not reaching maturity until late adolescence. In this study, we examined the status of phonemic representations in two groups of children - 8-9-year-olds and 11-12-year-olds. We used the same audiovisual oddball paradigm as in the earlier study with adults (Kaganovich and Christ, 2021). On each trial, participants saw a face and heard one of two vowels. One vowel occurred frequently (standard), while another occurred rarely (deviant). In one condition (neutral), the face had a closed, non-articulating mouth. In the other condition (audiovisual violation), the mouth shape matched the frequent vowel. Although stimuli were audiovisual in both conditions, we hypothesized that identical auditory changes would be perceived differently by participants. Namely, in the neutral condition, deviants violated only the audiovisual pattern specific to each experimental block. By contrast, in the audiovisual violation condition, deviants additionally violated long-term representations for how a speaker's mouth looks during articulation. We compared the amplitude of MMN and P3 components elicited by deviants in the two conditions. In the 11-12-year-old group, the pattern of neural responses was similar to that in adults - namely, they had a larger MMN component in the audiovisual compared to neutral condition, with no major difference in the P3 amplitude. In contrast, in the 8-9-year-old group, we saw a posterior MMN in the neutral condition only and a larger P3 in the audiovisual violation compared to the neutral condition. The larger P3 in the audiovisual violation condition suggests that younger children did perceive deviants as being more attention-grabbing when they violated the typical combination of sound and mouth shape. Yet, at this age, the earlier, more automatic stages of phonemic processing indexed by the MMN component may not yet encode visual speech elements the same way they do in older children and adults. We conclude that phonemic representations do not become audiovisual until 11-12 years of age.
Affiliation(s)
- Natalya Kaganovich
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, IN 47907-2038, USA; Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN 47907-2038, USA
- Jennifer Schumaker
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive West Lafayette, IN, 47907-2038, USA
- Sharon Christ
- Department of Statistics, Purdue University, 250 N. University Street, West Lafayette, IN 47907-2066, USA; Department of Human Development and Family Studies, Purdue University, 1202 West State Street, West Lafayette, IN 47907-2055, USA
2
Zamuner TS, Rabideau T, McDonald M, Yeung HH. Developmental change in children's speech processing of auditory and visual cues: An eyetracking study. Journal of Child Language 2023; 50:27-51. [PMID: 36503546] [DOI: 10.1017/s0305000921000684]
Abstract
This study investigates how children aged two to eight years (N = 129) and adults (N = 29) use auditory and visual speech for word recognition. The goal was to bridge the gap between the apparent success of visual speech processing by young children in visual-looking tasks and the apparent difficulty of visual speech processing by older children on explicit behavioural measures. Participants were presented with familiar words in audio-visual (AV), audio-only (A-only) or visual-only (V-only) speech modalities, then presented with target and distractor images, and looking to targets was measured. Adults showed high accuracy, with slightly less target-image looking in the V-only modality. Developmentally, looking was above chance for both AV and A-only modalities, but not in the V-only modality until 6 years of age (earlier on /k/-initial words). Flexible use of visual cues for lexical access develops throughout childhood.
Affiliation(s)
- Margarethe McDonald
- Department of Linguistics, University of Ottawa, Canada
- School of Psychology, University of Ottawa, Canada
- H Henny Yeung
- Department of Linguistics, Simon Fraser University, Canada
- Integrative Neuroscience and Cognition Centre, UMR 8002, CNRS and University of Paris, France
3
Bastianello T, Keren-Portnoy T, Majorano M, Vihman M. Infant looking preferences towards dynamic faces: A systematic review. Infant Behav Dev 2022; 67:101709. [PMID: 35338995] [DOI: 10.1016/j.infbeh.2022.101709]
Abstract
Although the pattern of visual attention towards the region of the eyes is now well-established for infants at an early stage of development, less is known about the extent to which the mouth attracts an infant's attention. Even less is known about the extent to which these specific looking behaviours towards different regions of the talking face (i.e., the eyes or the mouth) may impact on or account for aspects of language development. The aim of the present systematic review is to synthesize and analyse (i) which factors might determine different looking patterns in infants during audio-visual tasks using dynamic faces and (ii) how these patterns have been studied in relation to aspects of the baby's development. Four bibliographic databases were explored, and the records were selected following specified inclusion criteria. The search led to the identification of 19 papers (October 2021). Some studies have tried to clarify the role played by audio-visual support in speech perception and early production based on directly related factors such as the age or language background of the participants, while others have tested the child's competence in terms of linguistic or social skills. Several hypotheses have been advanced to explain the selective attention phenomenon. The results of the selected studies have led to different lines of interpretation. Some suggestions for future research are outlined.
Affiliation(s)
- Marilyn Vihman
- Department of Language and Linguistic Science, University of York, UK
4
Gijbels L, Yeatman JD, Lalonde K, Lee AKC. Audiovisual Speech Processing in Relationship to Phonological and Vocabulary Skills in First Graders. Journal of Speech, Language, and Hearing Research 2021; 64:5022-5040. [PMID: 34735292] [PMCID: PMC9150669] [DOI: 10.1044/2021_jslhr-21-00196]
Abstract
PURPOSE It is generally accepted that adults use visual cues to improve speech intelligibility in noisy environments, but findings regarding visual speech benefit in children are mixed. We explored factors that contribute to audiovisual (AV) gain in young children's speech understanding. We examined whether there is an AV benefit to speech-in-noise recognition in children in first grade and whether the visual salience of phonemes influences their AV benefit. We explored whether individual differences in AV speech enhancement could be explained by vocabulary knowledge, phonological awareness, or general psychophysical testing performance. METHOD Thirty-seven first graders completed online psychophysical experiments. We used an online single-interval, four-alternative forced-choice picture-pointing task with age-appropriate consonant-vowel-consonant words to measure auditory-only, visual-only, and AV word recognition in noise at -2 and -8 dB SNR. We obtained standard measures of vocabulary and phonological awareness and included a general psychophysical test to examine correlations with AV benefits. RESULTS We observed a significant overall AV gain among children in first grade. This effect was mainly attributable to the benefit at -8 dB SNR for visually distinct targets. Individual differences were not explained by any of the child variables. Boys showed lower auditory-only performance, leading to significantly larger AV gains. CONCLUSIONS This study shows an AV benefit of distinctive visual cues to word recognition in challenging noisy conditions in first graders. The cognitive and linguistic constraints of the task may have minimized the impact of individual differences in vocabulary and phonological awareness on AV benefit. The gender difference should be studied in a larger sample and age range.
Affiliation(s)
- Liesbeth Gijbels
- Department of Speech & Hearing Sciences, University of Washington, Seattle
- Institute for Learning & Brain Sciences, University of Washington, Seattle
- Jason D. Yeatman
- Division of Developmental-Behavioral Pediatrics, School of Medicine, Stanford University, CA
- Graduate School of Education, Stanford University, CA
- Kaylah Lalonde
- Boys Town National Research Hospital, Center for Hearing Research, Omaha, NE
- Adrian K. C. Lee
- Department of Speech & Hearing Sciences, University of Washington, Seattle
- Institute for Learning & Brain Sciences, University of Washington, Seattle
5
Gijbels L, Cai R, Donnelly PM, Kuhl PK. Designing Virtual, Moderated Studies of Early Childhood Development. Front Psychol 2021; 12:740290. [PMID: 34707545] [PMCID: PMC8542922] [DOI: 10.3389/fpsyg.2021.740290]
Abstract
With increased public access to the Internet and digital tools, web-based research has gained prevalence over the past decades. However, digital adaptations for developmental research involving children have received relatively little attention. In 2020, as the COVID-19 pandemic led to reduced social contact, causing many developmental university research laboratories to close, the scientific community began to investigate online research methods that would allow continued work. Limited resources and documentation of factors that are essential for developmental research (e.g., caregiver involvement, informed assent, controlling environmental distractions at home for children) make the transition from in-person to online research especially difficult for developmental scientists. Recognizing this, we aim to contribute to the field by describing three separate moderated virtual behavioral assessments in children ranging from 4 to 13 years of age that were highly successful. The three studies encompass speech production, speech perception, and reading fluency. Although the studies differed in domain, target age group, and methodological approach, their successful virtual adaptations shared commonalities in how to achieve informed consent, plan parental involvement, design studies that attract and hold children's attention, and ensure valid data collection procedures. Our combined work suggests principles for future facilitation of online developmental work. Considerations derived from these studies can serve as documented points of departure that inform and encourage additional virtual adaptations in this field.
Affiliation(s)
- Liesbeth Gijbels
- Department of Speech & Hearing Sciences, University of Washington, Seattle, WA, United States
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
- Ruofan Cai
- Department of Speech & Hearing Sciences, University of Washington, Seattle, WA, United States
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
- Patrick M. Donnelly
- Department of Speech & Hearing Sciences, University of Washington, Seattle, WA, United States
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
- Patricia K. Kuhl
- Department of Speech & Hearing Sciences, University of Washington, Seattle, WA, United States
- Institute for Learning & Brain Sciences, University of Washington, Seattle, WA, United States
6
Lalonde K, McCreery RW. Audiovisual Enhancement of Speech Perception in Noise by School-Age Children Who Are Hard of Hearing. Ear Hear 2021; 41:705-719. [PMID: 32032226] [PMCID: PMC7822589] [DOI: 10.1097/aud.0000000000000830]
Abstract
OBJECTIVES The purpose of this study was to examine age- and hearing-related differences in school-age children's benefit from visual speech cues. The study addressed three questions: (1) Do age and hearing loss affect degree of audiovisual (AV) speech enhancement in school-age children? (2) Are there age- and hearing-related differences in the mechanisms underlying AV speech enhancement in school-age children? (3) What cognitive and linguistic variables predict individual differences in AV benefit among school-age children? DESIGN Forty-eight children between 6 and 13 years of age (19 with mild to severe sensorineural hearing loss; 29 with normal hearing) and 14 adults with normal hearing completed measures of auditory and AV syllable detection and/or sentence recognition in two masker types: a two-talker masker and spectrally matched noise. Children also completed standardized behavioral measures of receptive vocabulary, visuospatial working memory, and executive attention. Mixed linear modeling was used to examine effects of modality, listener group, and masker on sentence recognition accuracy and syllable detection thresholds. Pearson correlations were used to examine the relationship between individual differences in children's AV enhancement (AV minus auditory-only) and age, vocabulary, working memory, executive attention, and degree of hearing loss. RESULTS Significant AV enhancement was observed across all tasks, masker types, and listener groups. AV enhancement of sentence recognition was similar across maskers, but children with normal hearing exhibited less AV enhancement of sentence recognition than adults with normal hearing and children with hearing loss. AV enhancement of syllable detection was greater in the two-talker masker than the noise masker, but did not vary significantly across listener groups. Degree of hearing loss positively correlated with individual differences in AV benefit on the sentence recognition task in noise, but not on the detection task. None of the cognitive and linguistic variables correlated with individual differences in AV enhancement of syllable detection or sentence recognition. CONCLUSIONS Although AV benefit to syllable detection results from the use of visual speech to increase temporal expectancy, AV benefit to sentence recognition requires that an observer extract phonetic information from the visual speech signal. The findings from this study suggest that all listener groups were equally good at using temporal cues in visual speech to detect auditory speech, but that adults with normal hearing and children with hearing loss were better than children with normal hearing at extracting phonetic information from the visual signal and/or using visual speech information to access phonetic/lexical representations in long-term memory. These results suggest that standard, auditory-only clinical speech recognition measures likely underestimate the real-world speech recognition skills of children with mild to severe hearing loss.
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
- Ryan W. McCreery
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
7
Kaganovich N, Schumaker J, Christ S. Impaired Audiovisual Representation of Phonemes in Children with Developmental Language Disorder. Brain Sci 2021; 11:507. [PMID: 33923647] [PMCID: PMC8073635] [DOI: 10.3390/brainsci11040507]
Abstract
We examined whether children with developmental language disorder (DLD) differed from their peers with typical development (TD) in the degree to which they encode information about a talker’s mouth shape into long-term phonemic representations. Children watched a talker’s face and listened to rare changes from [i] to [u] or the reverse. In the neutral condition, the talker’s face had a closed mouth throughout. In the audiovisual violation condition, the mouth shape always matched the frequent vowel, even when the rare vowel was played. We hypothesized that in the neutral condition no long-term audiovisual memory traces for speech sounds would be activated. Therefore, the neural response elicited by deviants would reflect only a violation of the observed audiovisual sequence. In contrast, we expected that in the audiovisual violation condition, a long-term memory trace for the speech sound/lip configuration typical for the frequent vowel would be activated. In this condition then, the neural response elicited by rare sound changes would reflect a violation of not only observed audiovisual patterns but also of a long-term memory representation for how a given vowel looks when articulated. Children pressed a response button whenever they saw a talker’s face assume a silly expression. We found that in children with TD, rare auditory changes produced a significant mismatch negativity (MMN) event-related potential (ERP) component over the posterior scalp in the audiovisual violation condition but not in the neutral condition. In children with DLD, no MMN was present in either condition. Rare vowel changes elicited a significant P3 in both groups and conditions, indicating that all children noticed auditory changes. Our results suggest that children with TD, but not children with DLD, incorporate visual information into long-term phonemic representations and detect violations in audiovisual phonemic congruency even when they perform a task that is unrelated to phonemic processing.
Affiliation(s)
- Natalya Kaganovich
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, IN 47907-2038, USA
- Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN 47907-2038, USA
- Correspondence: Tel.: +1-(765)-494-4233; Fax: +1-(765)-494-0771
- Jennifer Schumaker
- Department of Speech, Language, and Hearing Sciences, Purdue University, 715 Clinic Drive, West Lafayette, IN 47907-2038, USA
- Sharon Christ
- Department of Statistics, Purdue University, 250 N. University Street, West Lafayette, IN 47907-2066, USA
- Department of Human Development and Family Studies, Purdue University, 1202 West State Street, West Lafayette, IN 47907-2055, USA
8
Irwin J, Avery T, Kleinman D, Landi N. Audiovisual Speech Perception in Children with Autism Spectrum Disorders: Evidence from Visual Phonemic Restoration. J Autism Dev Disord 2021; 52:28-37. [DOI: 10.1007/s10803-021-04916-x]
9
Lalonde K, Werner LA. Development of the Mechanisms Underlying Audiovisual Speech Perception Benefit. Brain Sci 2021; 11:49. [PMID: 33466253] [PMCID: PMC7824772] [DOI: 10.3390/brainsci11010049]
Abstract
The natural environments in which infants and children learn speech and language are noisy and multimodal. Adults rely on the multimodal nature of speech to compensate for noisy environments during speech communication. Multiple mechanisms underlie mature audiovisual benefit to speech perception, including reduced uncertainty as to when auditory speech will occur, use of correlations between the amplitude envelope of auditory and visual signals in fluent speech, and use of visual phonetic knowledge for lexical access. This paper reviews evidence regarding infants' and children's use of temporal and phonetic mechanisms in audiovisual speech perception benefit. The ability to use temporal cues for audiovisual speech perception benefit emerges in infancy. Although infants are sensitive to the correspondence between auditory and visual phonetic cues, the ability to use this correspondence for audiovisual benefit may not emerge until age four. A more cohesive account of the development of audiovisual speech perception may follow from a more thorough understanding of the development of sensitivity to and use of various temporal and phonetic cues.
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE 68131, USA
- Lynne A. Werner
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98105, USA
10
Thézé R, Gadiri MA, Albert L, Provost A, Giraud AL, Mégevand P. Animated virtual characters to explore audio-visual speech in controlled and naturalistic environments. Sci Rep 2020; 10:15540. [PMID: 32968127] [PMCID: PMC7511320] [DOI: 10.1038/s41598-020-72375-y]
Abstract
Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has widely been applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and the quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized with computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e. /v/) with a bilabial occlusive phoneme (i.e. /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results conclusively demonstrate that computer-generated speech stimuli are judicious, and that they can supplement natural speech with higher control over stimulus timing and content.
Affiliation(s)
- Raphaël Thézé
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland
- Mehdi Ali Gadiri
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland
- Louis Albert
- Human Neuroscience Platform, Fondation Campus Biotech Geneva, Geneva, Switzerland
- Antoine Provost
- Human Neuroscience Platform, Fondation Campus Biotech Geneva, Geneva, Switzerland
- Anne-Lise Giraud
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland
- Pierre Mégevand
- Department of Basic Neurosciences, University of Geneva, Campus Biotech, Chemin des Mines 9, 1202, Geneva, Switzerland; Division of Neurology, Geneva University Hospitals, Geneva, Switzerland
11
Havy M, Zesiger PE. Bridging ears and eyes when learning spoken words: On the effects of bilingual experience at 30 months. Dev Sci 2020; 24:e13002. [PMID: 32506622] [DOI: 10.1111/desc.13002]
Abstract
From the very first moments of their lives, infants selectively attend to the visible orofacial movements of their social partners and apply their exquisite speech perception skills to the service of lexical learning. Here we explore how early bilingual experience modulates children's ability to use visible speech as they form new lexical representations. Using a cross-modal word-learning task, bilingual children aged 30 months were tested on their ability to learn new lexical mappings in either the auditory or the visual modality. Lexical recognition was assessed either in the same modality as the one used at learning ('same modality' condition: auditory test after auditory learning, visual test after visual learning) or in the other modality ('cross-modality' condition: visual test after auditory learning, auditory test after visual learning). The results revealed that like their monolingual peers, bilingual children successfully learn new words in either the auditory or the visual modality and show cross-modal recognition of words following auditory learning. Interestingly, as opposed to monolinguals, they also demonstrate cross-modal recognition of words upon visual learning. Collectively, these findings indicate a bilingual edge in visual word learning, expressed in the capacity to form a recoverable cross-modal representation of visually learned words.
Affiliation(s)
- Mélanie Havy
- Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland
- Pascal E Zesiger
- Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland
12
Kaganovich N, Ancel E. Different neural processes underlie visual speech perception in school-age children and adults: An event-related potentials study. J Exp Child Psychol 2019; 184:98-122. [PMID: 31015101] [PMCID: PMC6857813] [DOI: 10.1016/j.jecp.2019.03.009]
Abstract
The ability to use visual speech cues does not fully develop until late adolescence. The cognitive and neural processes underlying this slow maturation are not yet understood. We examined electrophysiological responses of younger (8-9 years) and older (11-12 years) children as well as adults elicited by visually perceived articulations in an audiovisual word matching task and related them to the amount of benefit gained during a speech-in-noise (SIN) perception task when seeing the talker's face. On each trial, participants first heard a word and, after a short pause, saw a speaker silently articulate a word. In half of the trials the articulated word matched the auditory word (congruent trials), whereas in the other half it did not (incongruent trials). In all three age groups, incongruent articulations elicited the N400 component and congruent articulations elicited the late positive complex (LPC). Groups did not differ in the mean amplitude of N400. The mean amplitude of LPC was larger in younger children compared with older children and adults. Importantly, the relationship between event-related potential measures and SIN performance varied by group. In 8- and 9-year-olds, neither component was predictive of SIN gain. The LPC amplitude predicted the SIN gain in older children but not in adults. Conversely, the N400 amplitude predicted the SIN gain in adults. We argue that although all groups were able to detect correspondences between auditory and visual word onsets at the phonemic/syllabic level, only adults could use this information for lexical access.
Affiliation(s)
- Natalya Kaganovich
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907, USA; Department of Psychological Sciences, Purdue University, West Lafayette, IN 47907, USA.
- Elizabeth Ancel
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907, USA
13
Jerger S, Damian MF, Karl C, Abdi H. Developmental Shifts in Detection and Attention for Auditory, Visual, and Audiovisual Speech. Journal of Speech, Language, and Hearing Research 2018; 61:3095-3112. [PMID: 30515515] [PMCID: PMC6440305] [DOI: 10.1044/2018_jslhr-h-17-0343]
Abstract
PURPOSE Successful speech processing depends on our ability to detect and integrate multisensory cues, yet there is minimal research on multisensory speech detection and integration by children. To address this need, we studied the development of speech detection for auditory (A), visual (V), and audiovisual (AV) input. METHOD Participants were 115 typically developing children clustered into age groups between 4 and 14 years. Speech detection (quantified by response times [RTs]) was determined for 1 stimulus, /buh/, presented in A, V, and AV modes (articulating vs. static facial conditions). Performance was analyzed not only in terms of traditional mean RTs but also in terms of the faster versus slower RTs (defined by the 1st vs. 3rd quartiles of RT distributions). These time regions were conceptualized respectively as reflecting optimal detection with efficient focused attention versus less optimal detection with inefficient focused attention due to attentional lapses. RESULTS Mean RTs indicated better detection (a) of multisensory AV speech than A speech only in 4- to 5-year-olds and (b) of A and AV inputs than V input in all age groups. The faster RTs revealed that AV input did not improve detection in any group. The slower RTs indicated that (a) the processing of silent V input was significantly faster for the articulating than static face and (b) AV speech or facial input significantly minimized attentional lapses in all groups except 6- to 7-year-olds (a peaked U-shaped curve). Apparently, the AV benefit observed for mean performance in 4- to 5-year-olds arose from effects of attention. CONCLUSIONS The faster RTs indicated that AV input did not enhance detection in any group, but the slower RTs indicated that AV speech and dynamic V speech (mouthing) significantly minimized attentional lapses and thus did influence performance. Overall, A and AV inputs were detected consistently faster than V input; this result endorsed stimulus-bound auditory processing by these children.
Affiliation(s)
- Susan Jerger
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
- Callier Center for Communication Disorders, Richardson, TX
- Markus F. Damian
- School of Experimental Psychology, University of Bristol, United Kingdom
- Cassandra Karl
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
- Callier Center for Communication Disorders, Richardson, TX
- Hervé Abdi
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, Richardson
14
Abstract
This study aimed to investigate how individuals with bipolar disorder integrate auditory and visual speech information compared to healthy individuals. Furthermore, we wanted to see whether there were any differences between manic and depressive episode bipolar disorder patients with respect to auditory and visual speech integration. It was hypothesized that the bipolar group’s auditory–visual speech integration would be weaker than that of the control group. Further, it was predicted that those in the manic phase of bipolar disorder would integrate visual speech information more robustly than their depressive phase counterparts. To examine these predictions, a McGurk effect paradigm with an identification task was used with typical auditory–visual (AV) speech stimuli. Additionally, auditory-only (AO) and visual-only (VO, lip-reading) speech perceptions were also tested. The dependent variable for the AV stimuli was the amount of visual speech influence. The dependent variables for AO and VO stimuli were accurate modality-based responses. Results showed that the disordered and control groups did not differ in AV speech integration and AO speech perception. However, there was a striking difference in favour of the healthy group with respect to the VO stimuli. The results suggest the need for further research whereby both behavioural and physiological data are collected simultaneously. This will help us understand the full dynamics of how auditory and visual speech information are integrated in people with bipolar disorder.
15
Erdener D, Burnham D. Auditory-visual speech perception in three- and four-year-olds and its relationship to perceptual attunement and receptive vocabulary. Journal of Child Language 2018; 45:273-289. [PMID: 28585512] [DOI: 10.1017/s0305000917000174]
Abstract
Despite the body of research on auditory-visual speech perception in infants and schoolchildren, development in the early childhood period remains relatively uncharted. In this study, English-speaking children between three and four years of age were investigated for: (i) the development of visual speech perception - lip-reading and visual influence in auditory-visual integration; (ii) the development of auditory speech perception and native language perceptual attunement; and (iii) the relationship between these and a language skill relevant at this age, receptive vocabulary. Visual speech perception skills improved even over this relatively short time period. However, regression analyses revealed that vocabulary was predicted by auditory-only speech perception, and native language attunement, but not by visual speech perception ability. The results suggest that, in contrast to infants and schoolchildren, in three- to four-year-olds the relationship between speech perception and language ability is based on auditory and not visual or auditory-visual speech perception ability. Adding these results to existing findings allows elaboration of a more complete account of the developmental course of auditory-visual speech perception.
Affiliation(s)
- Doğu Erdener
- Psychology Program, Middle East Technical University, Northern Cyprus Campus, Güzelyurt/Morphou, Northern Cyprus
- Denis Burnham
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Australia
16
Jerger S, Damian MF, McAlpine RP, Abdi H. Visual speech fills in both discrimination and identification of non-intact auditory speech in children. Journal of Child Language 2018; 45:392-414. [PMID: 28724465] [PMCID: PMC5775942] [DOI: 10.1017/s0305000917000265]
Abstract
To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. bæz) coupled to non-intact (excised onsets) auditory speech (signified by /-b/æz). Children discriminated syllable pairs that differed in intactness (i.e. bæz:/-b/æz) and identified non-intact nonwords (/-b/æz). We predicted that visual speech would cause children to perceive the non-intact onsets as intact, resulting in more same responses for discrimination and more intact (i.e. bæz) responses for identification in the audiovisual than auditory mode. Visual speech for the easy-to-speechread /b/ but not for the difficult-to-speechread /g/ boosted discrimination and identification (about 35-45%) in children from four to fourteen years. The influence of visual speech on discrimination was uniquely associated with the influence of visual speech on identification and receptive vocabulary skills.
Affiliation(s)
- Susan Jerger
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX 75080
- Callier Center for Communication Disorders, 811 Synergy Park Blvd., Richardson, TX 75080
- Markus F. Damian
- University of Bristol, School of Experimental Psychology, 12a Priory Road, Room 1D20, Bristol BS8 1TU, United Kingdom
- Rachel P. McAlpine
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX 75080
- Callier Center for Communication Disorders, 811 Synergy Park Blvd., Richardson, TX 75080
- Hervé Abdi
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX 75080
17
Irwin J, Avery T, Brancazio L, Turcios J, Ryherd K, Landi N. Electrophysiological Indices of Audiovisual Speech Perception: Beyond the McGurk Effect and Speech in Noise. Multisens Res 2018; 31:39-56. [PMID: 31264595] [DOI: 10.1163/22134808-00002580]
Abstract
Visual information on a talker's face can influence what a listener hears. Commonly used approaches to study this include mismatched audiovisual stimuli (e.g., McGurk type stimuli) or visual speech in auditory noise. In this paper we discuss potential limitations of these approaches and introduce a novel visual phonemic restoration method. This method always presents the same visual stimulus (e.g., /ba/) dubbed with either a matched auditory stimulus (/ba/) or one that has weakened consonantal information and sounds more /a/-like. When this reduced auditory stimulus (or /a/) is dubbed with the visual /ba/, a visual influence will result in effectively 'restoring' the weakened auditory cues so that the stimulus is perceived as a /ba/. We used an oddball design in which participants were asked to detect the /a/ among a stream of more frequently occurring /ba/s while viewing either a speaking face or a face with no visual speech. In addition, the same paradigm was presented for a second contrast in which participants detected /pa/ among /ba/s, a contrast which should be unaltered by the presence of visual speech. Behavioral and some ERP findings reflect the expected phonemic restoration for the /ba/ vs. /a/ contrast; specifically, we observed reduced accuracy and P300 response in the presence of visual speech. Further, we report an unexpected finding of reduced accuracy and P300 response for both speech contrasts in the presence of visual speech, suggesting overall modulation of the auditory signal in the presence of visual speech. Consistent with this, we observed a mismatch negativity (MMN) effect for the /ba/ vs. /pa/ contrast only that was larger in the absence of visual speech. We discuss the potential utility of this paradigm for listeners who cannot respond actively, such as infants and individuals with developmental disabilities.
Affiliation(s)
- Julia Irwin
- Haskins Laboratories, New Haven, CT, USA; Southern Connecticut State University, New Haven, CT, USA
- Trey Avery
- Haskins Laboratories, New Haven, CT, USA
- Lawrence Brancazio
- Haskins Laboratories, New Haven, CT, USA; Southern Connecticut State University, New Haven, CT, USA
- Jacqueline Turcios
- Haskins Laboratories, New Haven, CT, USA; Southern Connecticut State University, New Haven, CT, USA
- Kayleigh Ryherd
- Haskins Laboratories, New Haven, CT, USA; University of Connecticut, Storrs, CT, USA
- Nicole Landi
- Haskins Laboratories, New Haven, CT, USA; University of Connecticut, Storrs, CT, USA
18
Looking Behavior and Audiovisual Speech Understanding in Children With Normal Hearing and Children With Mild Bilateral or Unilateral Hearing Loss. Ear Hear 2017; 39:783-794. [PMID: 29252979] [DOI: 10.1097/aud.0000000000000534]
Abstract
OBJECTIVES Visual information from talkers facilitates speech intelligibility for listeners when audibility is challenged by environmental noise and hearing loss. Less is known about how listeners actively process and attend to visual information from different talkers in complex multi-talker environments. This study tracked looking behavior in children with normal hearing (NH), mild bilateral hearing loss (MBHL), and unilateral hearing loss (UHL) in a complex multi-talker environment to examine the extent to which children look at talkers and whether looking patterns relate to performance on a speech-understanding task. It was hypothesized that performance would decrease as perceptual complexity increased and that children with hearing loss would perform more poorly than their peers with NH. Children with MBHL or UHL were expected to demonstrate greater attention to individual talkers during multi-talker exchanges, indicating that they were more likely to attempt to use visual information from talkers to assist in speech understanding in adverse acoustics. It also was of interest to examine whether MBHL, versus UHL, would differentially affect performance and looking behavior. DESIGN Eighteen children with NH, eight children with MBHL, and 10 children with UHL participated (8-12 years). They followed audiovisual instructions for placing objects on a mat under three conditions: a single talker providing instructions via a video monitor, four possible talkers alternately providing instructions on separate monitors in front of the listener, and the same four talkers providing both target and nontarget information. Multi-talker background noise was presented at a 5 dB signal-to-noise ratio during testing. An eye tracker monitored looking behavior while children performed the experimental task. RESULTS Behavioral task performance was higher for children with NH than for either group of children with hearing loss. There were no differences in performance between children with UHL and children with MBHL. Eye-tracker analysis revealed that children with NH looked more at the screens overall than did children with MBHL or UHL, though individual differences were greater in the groups with hearing loss. Listeners in all groups spent a small proportion of time looking at relevant screens as talkers spoke. Although looking was distributed across all screens, there was a bias toward the right side of the display. There was no relationship between overall looking behavior and performance on the task. CONCLUSIONS The present study examined the processing of audiovisual speech in the context of a naturalistic task. Results demonstrated that children distributed their looking to a variety of sources during the task, but that children with NH were more likely to look at screens than were those with MBHL/UHL. However, all groups looked at the relevant talkers as they were speaking only a small proportion of the time. Despite variability in looking behavior, listeners were able to follow the audiovisual instructions and children with NH demonstrated better performance than children with MBHL/UHL. These results suggest that performance on some challenging multi-talker audiovisual tasks is not dependent on visual fixation to relevant talkers for children with NH or with MBHL/UHL.
19
Havy M, Zesiger P. Learning Spoken Words via the Ears and Eyes: Evidence from 30-Month-Old Children. Front Psychol 2017; 8:2122. [PMID: 29276493] [PMCID: PMC5727082] [DOI: 10.3389/fpsyg.2017.02122]
Abstract
From the very first moments of their lives, infants are able to link specific movements of the visual articulators to auditory speech signals. However, recent evidence indicates that infants focus primarily on auditory speech signals when learning new words. Here, we ask whether 30-month-old children are able to learn new words based solely on visible speech information, and whether information from both auditory and visual modalities is available after learning in only one modality. To test this, children were taught new lexical mappings. One group of children experienced the words in the auditory modality (i.e., acoustic form of the word with no accompanying face). Another group experienced the words in the visual modality (seeing a silent talking face). Lexical recognition was tested in either the learning modality or in the other modality. Results revealed successful word learning in either modality. Results further showed cross-modal recognition following an auditory-only, but not a visual-only, experience of the words. Together, these findings suggest that visible speech becomes increasingly informative for the purpose of lexical learning, but that an auditory-only experience evokes a cross-modal representation of the words.
Affiliation(s)
- Mélanie Havy
- Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland
20
Irwin J, Avery T, Turcios J, Brancazio L, Cook B, Landi N. Electrophysiological Indices of Audiovisual Speech Perception in the Broader Autism Phenotype. Brain Sci 2017; 7:E60. [PMID: 28574442] [PMCID: PMC5483633] [DOI: 10.3390/brainsci7060060]
Abstract
When a speaker talks, the consequences of this can both be heard (audio) and seen (visual). A novel visual phonemic restoration task was used to assess behavioral discrimination and neural signatures (event-related potentials, or ERPs) of audiovisual processing in typically developing children with a range of social and communicative skills, as assessed using the Social Responsiveness Scale, a measure of traits associated with autism. An auditory oddball design presented two types of stimuli to the listener: a clear exemplar of an auditory consonant-vowel syllable /ba/ (the more frequently occurring standard stimulus), and a syllable in which the auditory cues for the consonant were substantially weakened, creating a stimulus which is more like /a/ (the infrequently presented deviant stimulus). All speech tokens were paired with a face producing /ba/ or a face with a pixelated mouth containing motion but no visual speech. In this paradigm, the visual /ba/ should cause the auditory /a/ to be perceived as /ba/, creating an attenuated oddball response; in contrast, a pixelated video (without articulatory information) should not have this effect. Behaviorally, participants showed visual phonemic restoration (reduced accuracy in detecting the deviant /a/) in the presence of a speaking face. In addition, ERPs were observed in both an early time window (N100) and a later time window (P300) that were sensitive to speech context (/ba/ or /a/) and modulated by face context (speaking face with visible articulation or with pixelated mouth). Specifically, the oddball responses for the N100 and P300 were attenuated in the presence of a face producing /ba/ relative to a pixelated face, representing a possible neural correlate of the phonemic restoration effect. Notably, those individuals with more traits associated with autism (yet still in the non-clinical range) had smaller P300 responses overall, regardless of face context, suggesting generally reduced phonemic discrimination.
Affiliation(s)
- Julia Irwin
- Haskins Laboratories, New Haven, CT 06511, USA.
- Department of Psychology, Southern Connecticut State University, New Haven, CT 06515, USA.
- Trey Avery
- Haskins Laboratories, New Haven, CT 06511, USA.
- Jacqueline Turcios
- Haskins Laboratories, New Haven, CT 06511, USA.
- Department of Communication Disorders, Southern Connecticut State University, New Haven, CT 06515, USA.
- Lawrence Brancazio
- Haskins Laboratories, New Haven, CT 06511, USA.
- Department of Psychology, Southern Connecticut State University, New Haven, CT 06515, USA.
- Barbara Cook
- Department of Communication Disorders, Southern Connecticut State University, New Haven, CT 06515, USA.
- Nicole Landi
- Haskins Laboratories, New Haven, CT 06511, USA.
- Psychological Sciences, University of Connecticut, Storrs, CT 06269, USA.
21
Irwin J, DiBlasi L. Audiovisual speech perception: A new approach and implications for clinical populations. Language and Linguistics Compass 2017; 11:77-91. [PMID: 29520300] [PMCID: PMC5839512] [DOI: 10.1111/lnc3.12237]
Abstract
This selected overview of audiovisual (AV) speech perception examines the influence of visible articulatory information on what is heard. AV speech perception is thought to be a cross-cultural phenomenon that emerges early in typical language development; variables that influence it include properties of the visual and the auditory signal, attentional demands, and individual differences. A brief review of the existing neurobiological evidence on how visual information influences heard speech indicates potential loci, timing, and facilitatory effects of AV over auditory-only speech. The current literature on AV speech in certain clinical populations (individuals with an autism spectrum disorder, developmental language disorder, or hearing loss) reveals differences in processing that may inform interventions. Finally, a new method of assessing AV speech that does not require obvious cross-category mismatch or auditory noise is presented as a novel approach for investigators.
Affiliation(s)
- Julia Irwin
- LEARN Center, Haskins Laboratories Inc., USA
22
Jerger S, Damian MF, McAlpine RP, Abdi H. Visual speech alters the discrimination and identification of non-intact auditory speech in children with hearing loss. Int J Pediatr Otorhinolaryngol 2017; 94:127-137. [PMID: 28167003] [PMCID: PMC5308867] [DOI: 10.1016/j.ijporl.2017.01.009]
Abstract
OBJECTIVES Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes, yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. METHODS Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets), for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to a non-intact onset/rhyme in the auditory track (/-B/aa or /-B/az). The items started with an easy-to-speechread /B/ or difficult-to-speechread /G/ onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/-B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more same- as opposed to different-responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g., /-B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more Baz- as opposed to az-responses in the audiovisual than auditory mode. RESULTS Performance in the audiovisual mode showed more same responses for the intact vs. non-intact different pairs (e.g., Baa:/-B/aa) and more intact onset responses for nonword repetition (Baz for /-B/az). Thus visual speech altered both discrimination and identification in the CHL, to a large extent for the /B/ onsets but only minimally for the /G/ onsets. The CHL identified the stimuli similarly to the CNH but did not discriminate the stimuli similarly. A bias-free measure of the children's discrimination skills (i.e., d' analysis) revealed that the CHL had greater difficulty discriminating intact from non-intact speech in both modes. As the degree of hearing loss worsened, the ability to discriminate the intact vs. non-intact onsets in the auditory mode worsened. Discrimination ability in the CHL significantly predicted their identification of the onsets, even after variation due to the other variables was controlled. CONCLUSIONS These results clearly established that visual speech can fill in non-intact auditory speech, and this effect, in turn, made the non-intact onsets more difficult to discriminate from intact speech and more likely to be perceived as intact. Such results 1) demonstrate the value of visual speech at multiple levels of linguistic processing and 2) support intervention programs that view visual speech as a powerful asset for developing spoken language in the CHL.
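Note: the d' measure referred to above is the standard signal-detection sensitivity index, not a method specific to this study. It is conventionally computed from the hit rate H and the false-alarm rate F as

d' = \Phi^{-1}(H) - \Phi^{-1}(F)

where \Phi^{-1} is the inverse of the standard normal cumulative distribution function; larger d' indicates better discrimination of intact from non-intact onsets, independent of response bias.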
Collapse
Affiliation(s)
- Susan Jerger
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX, 75080, USA; Callier Center for Communication Disorders, 811 Synergy Park Blvd., Richardson, TX, 75080, USA.
| | - Markus F Damian
- University of Bristol, School of Experimental Psychology, 12a Priory Road, Room 1D20, Bristol, BS8 1TU, United Kingdom.
| | - Rachel P McAlpine
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX, 75080, USA; Callier Center for Communication Disorders, 811 Synergy Park Blvd., Richardson, TX, 75080, USA.
| | - Hervé Abdi
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, 800 W. Campbell Rd, Richardson, TX, 75080, USA.
| |
Collapse
|
23
|
Havy M, Foroud A, Fais L, Werker JF. The Role of Auditory and Visual Speech in Word Learning at 18 Months and in Adulthood. Child Dev 2017; 88:2043-2059. [PMID: 28124795 DOI: 10.1111/cdev.12715] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Visual information influences speech perception in both infants and adults. It is still unknown whether lexical representations are multisensory. To address this question, we exposed 18-month-old infants (n = 32) and adults (n = 32) to new word-object pairings: Participants either heard the acoustic form of the words or saw the talking face in silence. They were then tested on recognition in the same or the other modality. Both 18-month-old infants and adults learned the lexical mappings when the words were presented auditorily and recognized the mapping at test when the word was presented in either modality, but only adults learned new words in a visual-only presentation. These results suggest developmental changes in the sensory format of lexical representations.
Collapse
Affiliation(s)
- Mélanie Havy
- University of British Columbia; Université de Genève
| | | | | | | |
Collapse
|
24
|
Jerger S, Damian MF, Tye-Murray N, Abdi H. Children perceive speech onsets by ear and eye. JOURNAL OF CHILD LANGUAGE 2017; 44:185-215. [PMID: 26752548 PMCID: PMC4940343 DOI: 10.1017/s030500091500077x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Adults use vision to perceive low-fidelity speech, yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize cognitive resources, not a lack of sensitivity. We investigated sensitivity to visual speech in children via the phonological priming produced by low-fidelity (non-intact onset) auditory speech presented audiovisually (see a dynamic face articulate the consonant/rhyme b/ag; hear the non-intact onset/rhyme -b/ag) vs. auditorily (see a still face; hear exactly the same auditory input). Audiovisual speech produced greater priming from four to fourteen years, indicating that visual speech filled in the non-intact auditory onsets. The influence of visual speech depended uniquely on phonology and speechreading. Children, like adults, perceive speech onsets multimodally. The findings are critical for incorporating visual speech into developmental theories of speech perception.
Collapse
Affiliation(s)
- Susan Jerger
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas, and Callier Center for Communication Disorders, Richardson, Texas
| | | | - Nancy Tye-Murray
- Department of Otolaryngology-Head and Neck Surgery, Washington University School of Medicine
| | - Hervé Abdi
- School of Behavioral and Brain Sciences, GR4.1, University of Texas at Dallas
| |
Collapse
|
25
|
Jerger S, Damian MF, Tye-Murray N, Abdi H. Phonological Priming in Children with Hearing Loss: Effect of Speech Mode, Fidelity, and Lexical Status. Ear Hear 2016; 37:623-633. [PMID: 27438867 DOI: 10.1097/aud.0000000000000334] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVES This research determined (1) how phonological priming of picture naming was affected by the mode (auditory-visual [AV] versus auditory), fidelity (intact versus nonintact auditory onsets), and lexical status (words versus nonwords) of speech stimuli in children with prelingual sensorineural hearing impairment (CHI) versus children with normal hearing (CNH), and (2) how the degree of HI, auditory word recognition, and age influenced results in the CHI. Note that the AV stimuli were not the traditional bimodal input; instead, they consisted of an intact consonant/rhyme in the visual track coupled to a nonintact onset/rhyme in the auditory track. Example stimuli for the word bag are (1) AV: intact visual (b/ag) coupled to nonintact auditory (-b/ag), and (2) auditory: static face coupled to the same nonintact auditory (-b/ag). The question was whether the intact visual speech would "restore or fill in" the nonintact auditory speech, in which case performance for the same auditory stimulus would differ depending on the presence or absence of visual speech.
DESIGN Participants were 62 CHI and 62 CNH whose ages had a group mean and group distribution akin to those of the CHI group. Ages ranged from 4 to 14 years. All participants met the following criteria: (1) spoke English as a native language, (2) communicated successfully aurally/orally, and (3) had no diagnosed or suspected disabilities other than HI and its accompanying verbal problems. The phonological priming of picture naming was assessed with the multimodal picture-word task.
RESULTS Both the CHI and CNH showed greater phonological priming from high- than low-fidelity stimuli and from AV than auditory speech. These overall fidelity and mode effects did not differ in the CHI versus the CNH; thus, these CHI appeared to have sufficiently well-specified phonological onset representations to support priming, and visual speech did not appear to be a disproportionately important source of the CHI's phonological knowledge. Two exceptions occurred, however. First, with regard to lexical status, both the CHI and CNH showed significantly greater phonological priming from the nonwords than from the words, a pattern consistent with the prediction that children are more aware of phonetics-phonology content for nonwords. This overall pattern of similarity between the groups was qualified by the finding that the CHI showed more nearly equal priming by the high- versus low-fidelity nonwords than the CNH; in other words, the CHI were less affected by the fidelity of the auditory input for nonwords. Second, auditory word recognition, but not degree of HI or age, uniquely influenced phonological priming by the AV nonwords.
CONCLUSIONS With minor exceptions, phonological priming in the CHI and CNH showed more similarities than differences. Importantly, this research documented that the addition of visual speech significantly increased phonological priming in both groups. Clinically, these data support intervention programs that view visual speech as a powerful asset for developing spoken language in CHI.
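[Editor's note] To make the priming measure concrete: in picture-word tasks of this kind, the priming effect is typically the difference in picture-naming latency between a neutral (unrelated) distractor and a phonologically related one. The sketch below is purely illustrative, with invented latencies and condition labels; it is not the study's procedure or data:

    # Invented illustration of computing priming effects from naming
    # latencies; positive values indicate facilitation (faster naming
    # after a phonologically related distractor than after a neutral one).

    def priming_effect(rt_baseline_ms, rt_primed_ms):
        """Priming = baseline latency minus primed latency, in ms."""
        return rt_baseline_ms - rt_primed_ms

    baseline = 1080  # invented neutral-distractor mean (ms)
    condition_means = {
        ("AV", "intact"): 950,
        ("AV", "nonintact"): 975,
        ("auditory", "intact"): 990,
        ("auditory", "nonintact"): 1040,
    }

    for (mode, fidelity), rt in condition_means.items():
        print(f"{mode:8s} {fidelity:9s} {priming_effect(baseline, rt):+.0f} ms")

Under these invented numbers, the AV and high-fidelity conditions show larger facilitation than the auditory and low-fidelity ones, which is the shape of the mode and fidelity effects described in the results.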
Collapse
|