1. Nidiffer AR, Cao CZ, O'Sullivan A, Lalor EC. A representation of abstract linguistic categories in the visual system underlies successful lipreading. Neuroimage 2023; 282:120391. PMID: 37757989. DOI: 10.1016/j.neuroimage.2023.120391.
Abstract
There is considerable debate over how visual speech is processed in the absence of sound and whether neural activity supporting lipreading occurs in visual brain areas. Much of the ambiguity stems from a lack of behavioral grounding and neurophysiological analyses that cannot disentangle high-level linguistic and phonetic/energetic contributions from visual speech. To address this, we recorded EEG from human observers as they watched silent videos, half of which were novel and half of which were previously rehearsed with the accompanying audio. We modeled how the EEG responses to novel and rehearsed silent speech reflected the processing of low-level visual features (motion, lip movements) and a higher-level categorical representation of linguistic units, known as visemes. The ability of these visemes to account for the EEG - beyond the motion and lip movements - was significantly enhanced for rehearsed videos in a way that correlated with participants' trial-by-trial ability to lipread that speech. Source localization of viseme processing showed clear contributions from visual cortex, with no strong evidence for the involvement of auditory areas. We interpret this as support for the idea that the visual system produces its own specialized representation of speech that is (1) well-described by categorical linguistic features, (2) dissociable from lip movements, and (3) predictive of lipreading ability. We also suggest a reinterpretation of previous findings of auditory cortical activation during silent speech that is consistent with hierarchical accounts of visual and audiovisual speech perception.
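In outline, the modelling described above is a lagged linear encoding analysis: EEG is regressed on time-lagged stimulus features, and the added predictive value of viseme features over low-level motion and lip features is measured. The sketch below only illustrates that logic on simulated data; the feature set, lag range, ridge penalty, and train/test split are assumptions, not the authors' implementation.

```python
import numpy as np

def lagged_design(features, lags):
    """Stack time-lagged copies of stimulus features (time x dims) into a design matrix."""
    T, D = features.shape
    X = np.zeros((T, D * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(features, lag, axis=0)
        shifted[:max(lag, 0), :] = 0      # zero out samples that wrapped around the edge
        X[:, i * D:(i + 1) * D] = shifted
    return X

def ridge_predict(X_train, y_train, X_test, lam=1e3):
    """Fit ridge regression on training data and predict the test responses."""
    XtX = X_train.T @ X_train + lam * np.eye(X_train.shape[1])
    w = np.linalg.solve(XtX, X_train.T @ y_train)
    return X_test @ w

# Hypothetical data: EEG (time x channels), low-level visual features, viseme features.
fs = 128
T = fs * 60
rng = np.random.default_rng(0)
eeg = rng.standard_normal((T, 32))
low_level = rng.standard_normal((T, 2))     # e.g., frame motion and lip aperture
visemes = rng.standard_normal((T, 10))      # e.g., categorical viseme indicators

lags = list(range(0, int(0.4 * fs)))        # 0-400 ms of stimulus history
half = T // 2
for name, feats in [("low-level only", low_level),
                    ("low-level + visemes", np.hstack([low_level, visemes]))]:
    X = lagged_design(feats, lags)
    pred = ridge_predict(X[:half], eeg[:half], X[half:])
    r = np.mean([np.corrcoef(pred[:, c], eeg[half:, c])[0, 1] for c in range(eeg.shape[1])])
    print(f"{name}: mean prediction r = {r:.3f}")
```

The quantity of interest is the gain in prediction accuracy when viseme features are added, which is the kind of per-trial measure that can then be related to lipreading performance.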
Affiliations
- Aaron R Nidiffer, Department of Biomedical Engineering, Department of Neuroscience, Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA
- Cody Zhewei Cao, Department of Psychology, University of Michigan, Ann Arbor, MI, USA
- Aisling O'Sullivan, School of Engineering, Trinity College Institute of Neuroscience, Trinity Centre for Biomedical Engineering, Trinity College, Dublin, Ireland
- Edmund C Lalor, Department of Biomedical Engineering, Department of Neuroscience, Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA; School of Engineering, Trinity College Institute of Neuroscience, Trinity Centre for Biomedical Engineering, Trinity College, Dublin, Ireland
2. Bernstein LE, Auer ET, Eberhardt SP. Modality-Specific Perceptual Learning of Vocoded Auditory versus Lipread Speech: Different Effects of Prior Information. Brain Sci 2023; 13:1008. PMID: 37508940. PMCID: PMC10377548. DOI: 10.3390/brainsci13071008.
Abstract
Traditionally, speech perception training paradigms have not adequately taken into account the possibility that there may be modality-specific requirements for perceptual learning with auditory-only (AO) versus visual-only (VO) speech stimuli. The study reported here investigated the hypothesis that there are modality-specific differences in how prior information is used by normal-hearing participants during vocoded versus VO speech training. Two different experiments, one with vocoded AO speech (Experiment 1) and one with VO, lipread, speech (Experiment 2), investigated the effects of giving different types of prior information to trainees on each trial during training. The training was for four ~20 min sessions, during which participants learned to label novel visual images using novel spoken words. Participants were assigned to different types of prior information during training: Word Group trainees saw a printed version of each training word (e.g., "tethon"), and Consonant Group trainees saw only its consonants (e.g., "t_th_n"). Additional groups received no prior information (i.e., Experiment 1, AO Group; Experiment 2, VO Group) or a spoken version of the stimulus in a different modality from the training stimuli (Experiment 1, Lipread Group; Experiment 2, Vocoder Group). That is, in each experiment, there was a group that received prior information in the modality of the training stimuli from the other experiment. In both experiments, the Word Groups had difficulty retaining the novel words they attempted to learn during training. However, when the training stimuli were vocoded, the Word Group improved their phoneme identification. When the training stimuli were visual speech, the Consonant Group improved their phoneme identification and their open-set sentence lipreading. The results are considered in light of theoretical accounts of perceptual learning in relationship to perceptual modality.
Affiliations
- Lynne E Bernstein, Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
- Edward T Auer, Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
- Silvio P Eberhardt, Speech, Language, and Hearing Sciences Department, George Washington University, Washington, DC 20052, USA
3. Van Bogaert L, Machart L, Gerber S, Lœvenbruck H, Vilain A. Speech rehabilitation in children with cochlear implants using a multisensory (French Cued Speech) or a hearing-focused (Auditory Verbal Therapy) approach. Front Hum Neurosci 2023; 17:1152516. PMID: 37250702. PMCID: PMC10219235. DOI: 10.3389/fnhum.2023.1152516.
Abstract
Introduction: Early exposure to a rich linguistic environment is essential as soon as the diagnosis of deafness is made. Cochlear implantation (CI) allows children to have access to speech perception in their early years. However, it provides only partial acoustic information, which can lead to difficulties in perceiving some phonetic contrasts. This study investigates the contribution of two spoken speech and language rehabilitation approaches to speech perception in children with CI, using a lexicality judgment task from the EULALIES battery. Auditory Verbal Therapy (AVT) is an early intervention program that relies on auditory learning to enhance hearing skills in deaf children with CI. French Cued Speech, also called Cued French (CF), is a multisensory communication tool that disambiguates lipreading by adding a manual gesture.
Methods: In this study, 124 children aged from 60 to 140 months were included: 90 children with typical hearing skills (TH), 9 deaf children with CI who had participated in an AVT program (AVT), 6 deaf children with CI with high Cued French reading skills (CF+), and 19 deaf children with CI with low Cued French reading skills (CF-). Speech perception was assessed as sensitivity (d'), computed from the hit and false alarm rates as defined in signal-detection theory.
Results: Children with cochlear implants from the CF- and CF+ groups performed significantly worse than children with typical hearing (TH) (p < 0.001 and p = 0.033, respectively). Children in the AVT group also tended to have lower scores than TH children (p = 0.07). However, exposure to AVT and CF seems to improve speech perception: the scores of the children in the AVT and CF+ groups are closer to typical scores than those of children in the CF- group, as evidenced by a distance measure.
Discussion: Overall, the findings of this study provide evidence for the effectiveness of these two speech and language rehabilitation approaches, and highlight the importance of using a specific approach in addition to a cochlear implant to improve speech perception in children with cochlear implants.
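Sensitivity (d') from hits and false alarms, as used in the lexicality judgment task above, is a standard signal-detection quantity, d' = z(hit rate) - z(false-alarm rate). A minimal sketch follows; the log-linear correction for rates of exactly 0 or 1 is a common convention and an assumption here, not necessarily the authors' choice.

```python
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Signal-detection sensitivity with a log-linear correction for extreme rates."""
    # Add 0.5 to each cell so hit/false-alarm rates of exactly 0 or 1 stay finite (Hautus, 1995).
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Illustrative counts: 38 hits out of 40 words, 6 false alarms out of 40 nonwords.
print(round(d_prime(38, 2, 6, 34), 2))
```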
Affiliations
- Lucie Van Bogaert, Université Grenoble Alpes, Université Savoie Mont Blanc, CNRS, LPNC, Grenoble, France; Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Laura Machart, Université Grenoble Alpes, Université Savoie Mont Blanc, CNRS, LPNC, Grenoble, France; Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Silvain Gerber, Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Hélène Lœvenbruck, Université Grenoble Alpes, Université Savoie Mont Blanc, CNRS, LPNC, Grenoble, France
- Anne Vilain, Université Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France
- Consortium EULALIES: Maud Costa, Estelle Gillet-Perret, Andrea A. N. MacLeod, Geneviève Meloni, Clarisse Puissant, Yvan Rose; Université Grenoble Alpes, France; CRTLA, Centre Hospitalier Universitaire Grenoble Alpes, France; University of Alberta, Edmonton, Canada; Université Grenoble Alpes, France and Université de Montréal, Montréal, Canada; Université Grenoble Alpes, France; Memorial University, Newfoundland, Canada
4. Influence of linguistic properties and hearing impairment on visual speech perception skills in the German language. PLoS One 2022; 17:e0275585. PMID: 36178907. PMCID: PMC9524625. DOI: 10.1371/journal.pone.0275585.
Abstract
Visual input is crucial for understanding speech under noisy conditions, but there are hardly any tools to assess the individual ability to lipread. With this study, we wanted to (1) investigate how linguistic characteristics of the language, on the one hand, and hearing impairment, on the other, affect lipreading abilities and (2) provide a tool to assess lipreading abilities for German speakers. A total of 170 participants (22 prelingually deaf) completed the online assessment, which consisted of a subjective hearing impairment scale and silent videos in which different item categories (numbers, words, and sentences) were spoken. The task for our participants was to recognize the spoken stimuli by visual inspection alone. We used different versions of one test and investigated the impact of item categories, word frequency in the spoken language, articulation, sentence frequency in the spoken language, sentence length, and differences between speakers on the recognition score. We found an effect of item categories, articulation, sentence frequency, and sentence length on the recognition score. With respect to hearing impairment, we found that higher subjective hearing impairment is associated with higher test scores. We did not find any evidence that prelingually deaf individuals show enhanced lipreading skills over people with postlingually acquired hearing impairment. However, we see an interaction with education only in the prelingually deaf, but not in the population with postlingually acquired hearing loss. This points to the fact that there are different factors contributing to enhanced lipreading abilities depending on the onset of hearing impairment (prelingual vs. postlingual). Overall, lipreading skills vary strongly in the general population, independently of hearing impairment. Based on our findings we constructed a new and efficient lipreading assessment tool (SaLT) that can be used to test behavioral lipreading abilities in the German-speaking population.
5. Bernstein LE, Jordan N, Auer ET, Eberhardt SP. Lipreading: A Review of Its Continuing Importance for Speech Recognition With an Acquired Hearing Loss and Possibilities for Effective Training. Am J Audiol 2022; 31:453-469. PMID: 35316072. DOI: 10.1044/2021_aja-21-00112.
Abstract
Purpose: The goal of this review article is to reinvigorate interest in lipreading and lipreading training for adults with acquired hearing loss. Most adults benefit from being able to see the talker when speech is degraded; however, the effect size is related to their lipreading ability, which is typically poor in adults who have experienced normal hearing through most of their lives. Lipreading training has been viewed as a possible avenue for rehabilitation of adults with an acquired hearing loss, but most training approaches have not been particularly successful. Here, we describe lipreading and theoretically motivated approaches to its training, as well as examples of successful training paradigms. We discuss some extensions to auditory-only (AO) and audiovisual (AV) speech recognition.
Method: Visual speech perception and word recognition are described. Traditional and contemporary views of training and perceptual learning are outlined. We focus on the roles of external and internal feedback and the training task in perceptual learning, and we describe results of lipreading training experiments.
Results: Lipreading is commonly characterized as limited to viseme perception. However, evidence demonstrates subvisemic perception of visual phonetic information. Lipreading words also relies on lexical constraints, not unlike auditory spoken word recognition. Lipreading has been shown to be difficult to improve through training, but under specific feedback and task conditions, training can be successful, and learning can generalize to untrained materials, including AV sentence stimuli in noise. The results on lipreading have implications for AO and AV training and for use of acoustically processed speech in face-to-face communication.
Conclusion: Given its importance for speech recognition with a hearing loss, we suggest that the research and clinical communities integrate lipreading in their efforts to improve speech recognition in adults with acquired hearing loss.
Affiliations
- Lynne E. Bernstein, Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
- Nicole Jordan, Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
- Edward T. Auer, Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
- Silvio P. Eberhardt, Department of Speech, Language & Hearing Sciences, George Washington University, Washington, DC
6. Bastianello T, Keren-Portnoy T, Majorano M, Vihman M. Infant looking preferences towards dynamic faces: A systematic review. Infant Behav Dev 2022; 67:101709. PMID: 35338995. DOI: 10.1016/j.infbeh.2022.101709.
Abstract
Although the pattern of visual attention towards the region of the eyes is now well-established for infants at an early stage of development, less is known about the extent to which the mouth attracts an infant's attention. Even less is known about the extent to which these specific looking behaviours towards different regions of the talking face (i.e., the eyes or the mouth) may impact on or account for aspects of language development. The aim of the present systematic review is to synthesize and analyse (i) which factors might determine different looking patterns in infants during audio-visual tasks using dynamic faces and (ii) how these patterns have been studied in relation to aspects of the baby's development. Four bibliographic databases were explored, and the records were selected following specified inclusion criteria. The search led to the identification of 19 papers (October 2021). Some studies have tried to clarify the role played by audio-visual support in speech perception and early production based on directly related factors such as the age or language background of the participants, while others have tested the child's competence in terms of linguistic or social skills. Several hypotheses have been advanced to explain the selective attention phenomenon. The results of the selected studies have led to different lines of interpretation. Some suggestions for future research are outlined.
Affiliations
- Marilyn Vihman, Department of Language and Linguistic Science, University of York, UK
7. Bernstein LE, Auer ET, Eberhardt SP. During Lipreading Training With Sentence Stimuli, Feedback Controls Learning and Generalization to Audiovisual Speech in Noise. Am J Audiol 2022; 31:57-77. PMID: 34965362. DOI: 10.1044/2021_aja-21-00034.
Abstract
Purpose: This study investigated the effects of external feedback on perceptual learning of visual speech during lipreading training with sentence stimuli. The goal was to improve visual-only (VO) speech recognition and increase accuracy of audiovisual (AV) speech recognition in noise. The rationale was that spoken word recognition depends on the accuracy of sublexical (phonemic/phonetic) speech perception; effective feedback during training must support sublexical perceptual learning.
Method: Normal-hearing (NH) adults were assigned to one of three types of feedback: Sentence feedback was the entire sentence printed after responding to the stimulus. Word feedback was the correct response words and perceptually near but incorrect response words. Consonant feedback was correct response words and consonants in incorrect but perceptually near response words. Six training sessions were given. Pre- and posttraining testing included an untrained control group. Test stimuli were disyllable nonsense words for forced-choice consonant identification, and isolated words and sentences for open-set identification. Words and sentences were VO, AV, and audio-only (AO) with the audio in speech-shaped noise.
Results: Lipreading accuracy increased during training. Pre- and posttraining tests of consonant identification showed no improvement beyond test-retest increases obtained by untrained controls. Isolated word recognition with a talker not seen during training showed that the control group improved more than the sentence group. Tests of untrained sentences showed that the consonant group significantly improved in all of the stimulus conditions (VO, AO, and AV). Its mean words correct scores increased by 9.2 percentage points for VO, 3.4 percentage points for AO, and 9.8 percentage points for AV stimuli.
Conclusions: Consonant feedback during training with sentence stimuli significantly increased perceptual learning. The training generalized to untrained VO, AO, and AV sentence stimuli. Lipreading training has the potential to significantly improve adults' face-to-face communication in noisy settings in which the talker can be seen.
Affiliations
- Lynne E. Bernstein, Department of Speech, Language, and Hearing Sciences, George Washington University, DC
- Edward T. Auer, Department of Speech, Language, and Hearing Sciences, George Washington University, DC
- Silvio P. Eberhardt, Department of Speech, Language, and Hearing Sciences, George Washington University, DC
8. Pragt L, van Hengel P, Grob D, Wasmann JWA. Preliminary Evaluation of Automated Speech Recognition Apps for the Hearing Impaired and Deaf. Front Digit Health 2022; 4:806076. PMID: 35252959. PMCID: PMC8889114. DOI: 10.3389/fdgth.2022.806076.
Abstract
Objective: Automated speech recognition (ASR) systems have become increasingly sophisticated, accurate, and deployable on many digital devices, including smartphones. This pilot study examines the speech recognition performance of ASR apps using audiological speech tests. In addition, we compare ASR speech recognition performance to that of normal-hearing and hearing-impaired listeners and evaluate whether standard clinical audiological tests are a meaningful and quick measure of the performance of ASR apps.
Methods: Four apps were tested on a smartphone: AVA, Earfy, Live Transcribe, and Speechy. The Dutch audiological speech tests performed were speech audiometry in quiet (Dutch CNC-test), the Digits-in-Noise (DIN) test with steady-state speech-shaped noise, and sentences in quiet and in averaged long-term speech-shaped spectrum noise (Plomp-test). For comparison, the apps' ability to transcribe a spoken dialogue (Dutch and English) was tested.
Results: All apps scored at least 50% phonemes correct on the Dutch CNC-test at a conversational speech intensity level (65 dB SPL) and achieved 90-100% phoneme recognition at higher intensity levels. On the DIN-test, AVA and Live Transcribe had the lowest (best) signal-to-noise ratio, +8 dB. The lowest signal-to-noise ratio measured with the Plomp-test was +8 to 9 dB, for Earfy (Android) and Live Transcribe (Android). Overall, the word error rate for the dialogue in English (19-34%) was lower (better) than for the Dutch dialogue (25-66%).
Conclusion: The performance of the apps was limited on audiological tests that provide little linguistic context or use low signal-to-noise levels. For Dutch audiological speech tests in quiet, ASR apps performed similarly to a person with a moderate hearing loss. In noise, the ASR apps performed more poorly than most profoundly deaf people using a hearing aid or cochlear implant. Adding new performance metrics, including the semantic difference as a function of SNR and reverberation time, could help to monitor and further improve ASR performance.
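The word error rate quoted for the dialogue transcriptions is conventionally the word-level edit distance (substitutions, insertions, and deletions) divided by the number of reference words. The sketch below shows that computation generically; it is not the scoring script used in the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / number of reference words."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("the quick brown fox", "the quick brown socks"))  # 0.25
```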
Affiliations
- Leontien Pragt (corresponding author), Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center Nijmegen, Nijmegen, Netherlands
- Peter van Hengel, Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center Nijmegen, Nijmegen, Netherlands; Pento Audiological Center Twente, Hengelo, Netherlands
- Dagmar Grob, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, Netherlands
- Jan-Willem A. Wasmann, Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center Nijmegen, Nijmegen, Netherlands
9. Ratnanather JT, Wang LC, Bae SH, O'Neill ER, Sagi E, Tward DJ. Visualization of Speech Perception Analysis via Phoneme Alignment: A Pilot Study. Front Neurol 2022; 12:724800. PMID: 35087462. PMCID: PMC8787339. DOI: 10.3389/fneur.2021.724800.
Abstract
Objective: Speech tests assess the ability of people with hearing loss to comprehend speech with a hearing aid or cochlear implant. The tests are usually at the word or sentence level. However, few tests analyze errors at the phoneme level, so there is a need for an automated program to visualize, in real time, the accuracy of phonemes in these tests.
Method: The program reads in stimulus-response pairs and obtains their phonemic representations from an open-source digital pronouncing dictionary. The stimulus phonemes are aligned with the response phonemes via a modification of the Levenshtein Minimum Edit Distance algorithm. Alignment is achieved via dynamic programming with costs for insertions, deletions, and substitutions that are modified on the basis of phonological features. The accuracy for each phoneme is based on the F1-score. Accuracy is visualized with respect to place and manner (consonants) or height (vowels). Confusion matrices for the phonemes are used in an information transfer analysis of ten phonological features. A histogram of the information transfer for the features over a frequency-like range is presented as a phonemegram.
Results: The program was applied to two datasets. One consisted of test data at the sentence and word levels. Stimulus-response sentence pairs from six volunteers with different degrees of hearing loss and modes of amplification were analyzed. Four volunteers listened to sentences from a mobile auditory training app while two listened to sentences from a clinical speech test. Stimulus-response word pairs from three lists were also analyzed. The other dataset consisted of published stimulus-response pairs from experiments in which 31 participants with cochlear implants listened to 400 Basic English Lexicon sentences spoken by different talkers at four different SNR levels. In all cases, visualization was obtained in real time. Analysis of 12,400 actual and random pairs showed that the program was robust to the nature of the pairs.
Conclusion: It is possible to automate the alignment of phonemes extracted from stimulus-response pairs from speech tests in real time. The alignment then makes it possible to visualize the accuracy of responses via phonological features in two ways. Such visualization of phoneme alignment and accuracy could aid clinicians and scientists.
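The heart of the method is a Levenshtein-style alignment of stimulus and response phoneme strings in which substitution costs shrink when the two phonemes share phonological features. The sketch below illustrates the idea with a deliberately tiny, hypothetical feature table and cost function; the paper's actual costs, feature set, and dictionary lookup are not reproduced.

```python
# Hypothetical feature table: each phoneme gets a small set of phonological features.
FEATURES = {
    "p": {"bilabial", "stop"}, "b": {"bilabial", "stop", "voiced"},
    "t": {"alveolar", "stop"}, "d": {"alveolar", "stop", "voiced"},
    "m": {"bilabial", "nasal", "voiced"}, "n": {"alveolar", "nasal", "voiced"},
}

def sub_cost(a, b):
    """Substitution cost shrinks as phonemes share more phonological features."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a, set()), FEATURES.get(b, set())
    shared, total = len(fa & fb), len(fa | fb) or 1
    return 1.0 - 0.5 * shared / total   # between 0.5 (very similar) and 1.0 (unrelated)

def align(stim, resp, indel=1.0):
    """Dynamic-programming alignment of stimulus and response phoneme sequences."""
    n, m = len(stim), len(resp)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * indel
    for j in range(1, m + 1):
        cost[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(cost[i - 1][j] + indel,                        # deletion
                             cost[i][j - 1] + indel,                        # insertion
                             cost[i - 1][j - 1] + sub_cost(stim[i - 1], resp[j - 1]))
    # Backtrace to recover aligned pairs (None marks an insertion or deletion).
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + sub_cost(stim[i - 1], resp[j - 1]):
            pairs.append((stim[i - 1], resp[j - 1])); i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + indel:
            pairs.append((stim[i - 1], None)); i -= 1
        else:
            pairs.append((None, resp[j - 1])); j -= 1
    return cost[n][m], list(reversed(pairs))

print(align(["b", "t", "m"], ["p", "d", "n"]))
```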
Affiliations
- J Tilak Ratnanather, Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Lydia C Wang, Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Seung-Ho Bae, Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States
- Erin R O'Neill, Center for Applied and Translational Sensory Sciences, University of Minnesota, Minneapolis, MN, United States
- Elad Sagi, Department of Otolaryngology, New York University School of Medicine, New York, NY, United States
- Daniel J Tward, Center for Imaging Science and Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States; Departments of Computational Medicine and Neurology, University of California, Los Angeles, Los Angeles, CA, United States
10. Alzaher M, Vannson N, Deguine O, Marx M, Barone P, Strelnikov K. Brain plasticity and hearing disorders. Rev Neurol (Paris) 2021; 177:1121-1132. PMID: 34657730. DOI: 10.1016/j.neurol.2021.09.004.
Abstract
Permanently changed sensory stimulation can modify functional connectivity patterns in the healthy brain and in pathology. In the pathological case, these adaptive modifications of the brain are referred to as compensation, and the resulting configurations of functional connectivity are called compensatory plasticity. The variability and extent of auditory deficits due to impairments in the hearing system determine the related brain reorganization and rehabilitation. In this review, we consider cross-modal and intra-modal brain plasticity related to bilateral and unilateral hearing loss and their restoration using cochlear implantation. Cross-modal brain plasticity may have both beneficial and detrimental effects on hearing disorders. It has a beneficial effect when it serves to improve a patient's adaptation to the visuo-auditory environment. However, the occupation of the auditory cortex by visual functions may be a negative factor for the restoration of hearing with cochlear implants. Regarding intra-modal plasticity, the loss of interhemispheric asymmetry in asymmetric hearing loss is deleterious for auditory spatial localization. Research on brain plasticity in hearing disorders can advance our understanding of brain plasticity and improve patient rehabilitation by combining prognostic, evidence-based approaches from cognitive neuroscience with post-rehabilitation objective neuroimaging biomarkers of this plasticity.
Affiliations
- M Alzaher, Université de Toulouse, UPS, centre de recherche cerveau et cognition, Toulouse, France; CNRS, CerCo, France
- N Vannson, Université de Toulouse, UPS, centre de recherche cerveau et cognition, Toulouse, France; CNRS, CerCo, France
- O Deguine, Université de Toulouse, UPS, centre de recherche cerveau et cognition, Toulouse, France; CNRS, CerCo, France; Faculté de médecine de Purpan, CHU Toulouse, université de Toulouse 3, France
- M Marx, Université de Toulouse, UPS, centre de recherche cerveau et cognition, Toulouse, France; CNRS, CerCo, France; Faculté de médecine de Purpan, CHU Toulouse, université de Toulouse 3, France
- P Barone, Université de Toulouse, UPS, centre de recherche cerveau et cognition, Toulouse, France; CNRS, CerCo, France
- K Strelnikov, Faculté de médecine de Purpan, CHU Toulouse, université de Toulouse 3, France
11. Dias JW, McClaskey CM, Harris KC. Early auditory cortical processing predicts auditory speech in noise identification and lipreading. Neuropsychologia 2021; 161:108012. PMID: 34474065. DOI: 10.1016/j.neuropsychologia.2021.108012.
Abstract
Individuals typically exhibit better cross-sensory perception following unisensory loss, demonstrating improved perception of information available from the remaining senses and increased cross-sensory use of neural resources. Even individuals with no sensory loss will exhibit such changes in cross-sensory processing following temporary sensory deprivation, suggesting that the brain's capacity for recruiting cross-sensory sources to compensate for degraded unisensory input is a general characteristic of the perceptual process. Many studies have investigated how auditory and visual neural structures respond to within- and cross-sensory input. However, little attention has been given to how general auditory and visual neural processing relates to within and cross-sensory perception. The current investigation examines the extent to which individual differences in general auditory neural processing accounts for variability in auditory, visual, and audiovisual speech perception in a sample of young healthy adults. Auditory neural processing was assessed using a simple click stimulus. We found that individuals with a smaller P1 peak amplitude in their auditory-evoked potential (AEP) had more difficulty identifying speech sounds in difficult listening conditions, but were better lipreaders. The results suggest that individual differences in the auditory neural processing of healthy adults can account for variability in the perception of information available from the auditory and visual modalities, similar to the cross-sensory perceptual compensation observed in individuals with sensory loss.
Affiliations
- James W Dias, Medical University of South Carolina, United States
12. van de Rijt LPH, van Opstal AJ, van Wanrooij MM. Multisensory Integration-Attention Trade-Off in Cochlear-Implanted Deaf Individuals. Front Neurosci 2021; 15:683804. PMID: 34393707. PMCID: PMC8358073. DOI: 10.3389/fnins.2021.683804.
Abstract
The cochlear implant (CI) allows profoundly deaf individuals to partially recover hearing. Still, due to the coarse acoustic information provided by the implant, CI users have considerable difficulties in recognizing speech, especially in noisy environments. CI users therefore rely heavily on visual cues to augment speech recognition, more so than normal-hearing individuals. However, it is unknown how attention to one (focused) or both (divided) modalities plays a role in multisensory speech recognition. Here we show that unisensory speech listening and reading were negatively impacted in divided-attention tasks for CI users—but not for normal-hearing individuals. Our psychophysical experiments revealed that, as expected, listening thresholds were consistently better for the normal-hearing, while lipreading thresholds were largely similar for the two groups. Moreover, audiovisual speech recognition for normal-hearing individuals could be described well by probabilistic summation of auditory and visual speech recognition, while CI users were better integrators than expected from statistical facilitation alone. Our results suggest that this benefit in integration comes at a cost. Unisensory speech recognition is degraded for CI users when attention needs to be divided across modalities. We conjecture that CI users exhibit an integration-attention trade-off. They focus solely on a single modality during focused-attention tasks, but need to divide their limited attentional resources in situations with uncertainty about the upcoming stimulus modality. We argue that in order to determine the benefit of a CI for speech recognition, situational factors need to be discounted by presenting speech in realistic or complex audiovisual environments.
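The probabilistic-summation benchmark used for the normal-hearing group is the standard independence prediction: an audiovisual trial is counted correct if either channel alone would have been correct. A minimal sketch with illustrative numbers (not data from the study):

```python
def probability_summation(p_auditory: float, p_visual: float) -> float:
    """Predicted audiovisual proportion correct if the two channels are independent."""
    return p_auditory + p_visual - p_auditory * p_visual

p_a, p_v, p_av_observed = 0.55, 0.30, 0.80            # illustrative proportions correct
p_av_predicted = probability_summation(p_a, p_v)       # 0.685
# Observed performance above the prediction suggests integration beyond statistical facilitation.
print(p_av_predicted, p_av_observed > p_av_predicted)
```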
Affiliations
- Luuk P H van de Rijt, Department of Otorhinolaryngology, Donders Institute for Brain, Cognition and Behaviour, Radboudumc, Nijmegen, Netherlands
- A John van Opstal, Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
- Marc M van Wanrooij, Department of Biophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
13.
Abstract
Objectives: When auditory and visual speech information are presented together, listeners obtain an audiovisual (AV) benefit or a speech understanding improvement compared with auditory-only (AO) or visual-only (VO) presentations. Cochlear-implant (CI) listeners, who receive degraded speech input and therefore understand speech using primarily temporal information, seem to readily use visual cues and can achieve a larger AV benefit than normal-hearing (NH) listeners. It is unclear, however, if the AV benefit remains relatively large for CI listeners when trying to understand foreign-accented speech when compared with unaccented speech. Accented speech can introduce changes to temporal auditory cues and visual cues, which could decrease the usefulness of AV information. Furthermore, we sought to determine if the AV benefit was relatively larger in CI compared with NH listeners for both unaccented and accented speech.
Design: AV benefit was investigated for unaccented and Spanish-accented speech by presenting English sentences in AO, VO, and AV conditions to 15 CI and 15 age- and performance-matched NH listeners. Performance matching between NH and CI listeners was achieved by varying the number of channels of a noise vocoder for the NH listeners. Because of the differences in age and hearing history of the CI listeners, the effects of listener-related variables on speech understanding performance and AV benefit were also examined.
Results: AV benefit was observed for both unaccented and accented conditions and for both CI and NH listeners. The two groups showed similar performance for the AO and AV conditions, and the normalized AV benefit was relatively smaller for the accented than the unaccented conditions. In the CI listeners, older age was associated with significantly poorer performance with the accented speaker compared with the unaccented speaker. The negative impact of age was somewhat reduced by a significant improvement in performance with access to AV information.
Conclusions: When auditory speech information is degraded by CI sound processing, visual cues can be used to improve speech understanding, even in the presence of a Spanish accent. The AV benefit of the CI listeners closely matched that of the NH listeners presented with vocoded speech, which was unexpected given that CI listeners appear to rely more on visual information to communicate. This result is perhaps due to the one-to-one age and performance matching of the listeners. While aging decreased CI listener performance with the accented speaker, access to visual cues boosted performance and could partially overcome the age-related speech understanding deficits for the older CI listeners.
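A normalized AV benefit is commonly computed as the visual gain scaled by the headroom above auditory-only performance, (AV - AO) / (1 - AO). The abstract does not state which convention was used here, so the sketch below should be read as one common formulation with illustrative numbers.

```python
def normalized_av_benefit(av_correct: float, ao_correct: float) -> float:
    """(AV - AO) / (1 - AO): visual gain relative to the room left for improvement."""
    if ao_correct >= 1.0:
        return 0.0                      # no headroom left to improve
    return (av_correct - ao_correct) / (1.0 - ao_correct)

# Illustrative values only, e.g., unaccented vs. accented sentence scores.
print(normalized_av_benefit(av_correct=0.85, ao_correct=0.60))  # 0.625
print(normalized_av_benefit(av_correct=0.70, ao_correct=0.45))  # ~0.455
```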
14. Buchanan-Worster E, Hulme C, Dennan R, MacSweeney M. Speechreading in hearing children can be improved by training. Dev Sci 2021; 24:e13124. PMID: 34060185. PMCID: PMC7612880. DOI: 10.1111/desc.13124.
Abstract
Visual information conveyed by a speaking face aids speech perception. In addition, children’s ability to comprehend visual-only speech (speechreading ability) is related to phonological awareness and reading skills in both deaf and hearing children. We tested whether training speechreading would improve speechreading, phoneme blending, and reading ability in hearing children. Ninety-two hearing 4- to 5-year-old children were randomised into two groups: business-as-usual controls, and an intervention group, who completed three weeks of computerised speechreading training. The intervention group showed greater improvements in speechreading than the control group at post-test both immediately after training and 3 months later. This was the case for both trained and untrained words. There were no group effects on the phonological awareness or single-word reading tasks, although those with the lowest phoneme blending scores did show greater improvements in blending as a result of training. The improvement in speechreading in hearing children following brief training is encouraging. The results are also important in suggesting a hypothesis for future investigation: that a focus on visual speech information may contribute to phonological skills, not only in deaf children but also in hearing children who are at risk of reading difficulties. A video abstract of this article can be viewed at https://www.youtube.com/watch?v=bBdpliGkbkY.
Affiliations
- Elizabeth Buchanan-Worster, Institute of Cognitive Neuroscience, University College London, London, UK; Deafness, Cognition and Language Research Centre, University College London, London, UK
- Charles Hulme, Department of Education, Oxford University, Oxford, Oxfordshire, UK
- Rachel Dennan, Institute of Cognitive Neuroscience, University College London, London, UK
- Mairéad MacSweeney, Institute of Cognitive Neuroscience, University College London, London, UK; Deafness, Cognition and Language Research Centre, University College London, London, UK
15. Errors on a Speech-in-Babble Sentence Recognition Test Reveal Individual Differences in Acoustic Phonetic Perception and Babble Misallocations. Ear Hear 2021; 42:673-690. PMID: 33928926. DOI: 10.1097/aud.0000000000001020.
Abstract
Objectives: The ability to recognize words in connected speech under noisy listening conditions is critical to everyday communication. Many processing levels contribute to the individual listener's ability to recognize words correctly against background speech, and there is clinical need for measures of individual differences at different levels. Typical listening tests of speech recognition in noise require a list of items to obtain a single threshold score. Diverse abilities measures could be obtained through mining various open-set recognition errors during multi-item tests. This study sought to demonstrate that an error mining approach using open-set responses from a clinical sentence-in-babble-noise test can be used to characterize abilities beyond signal-to-noise ratio (SNR) threshold. A stimulus-response phoneme-to-phoneme sequence alignment software system was used to achieve automatic, accurate quantitative error scores. The method was applied to a database of responses from normal-hearing (NH) adults. Relationships between two types of response errors and words correct scores were evaluated through use of mixed models regression.
Design: Two hundred thirty-three NH adults completed three lists of the Quick Speech in Noise test. Their individual open-set speech recognition responses were automatically phonemically transcribed and submitted to a phoneme-to-phoneme stimulus-response sequence alignment system. The computed alignments were mined for a measure of acoustic phonetic perception, a measure of response text that could not be attributed to the stimulus, and a count of words correct. The mined data were statistically analyzed to determine whether the response errors were significant factors beyond stimulus SNR in accounting for the number of words correct per response from each participant. This study addressed two hypotheses: (1) individuals whose perceptual errors are less severe recognize more words correctly under difficult listening conditions due to babble masking, and (2) listeners who are better able to exclude incorrect speech information, such as from background babble and filling in, recognize more stimulus words correctly.
Results: Statistical analyses showed that acoustic phonetic accuracy and exclusion of babble background were significant factors, beyond the stimulus sentence SNR, in accounting for the number of words a participant recognized. There was also evidence that poorer acoustic phonetic accuracy could occur along with higher words correct scores. This paradoxical result came from a subset of listeners who had also performed subjective accuracy judgments. Their results suggested that they recognized more words while also misallocating acoustic cues from the background into the stimulus, without realizing their errors. Because the Quick Speech in Noise test stimuli are locked to their own babble sample, misallocations of whole words from babble into the responses could be investigated in detail. The high rate of common misallocation errors for some sentences supported the view that the functional stimulus was the combination of the target sentence and its babble.
Conclusions: Individual differences among NH listeners arise both in terms of words accurately identified and errors committed during open-set recognition of sentences in babble maskers. Error mining to characterize individual listeners can be done automatically at the levels of acoustic phonetic perception and the misallocation of background babble words into open-set responses. Error mining can increase test information and the efficiency and accuracy of characterizing individual listeners.
16. Lu H, McKinney MF, Zhang T, Oxenham AJ. Investigating age, hearing loss, and background noise effects on speaker-targeted head and eye movements in three-way conversations. J Acoust Soc Am 2021; 149:1889. PMID: 33765809. DOI: 10.1121/10.0003707.
Abstract
Although beamforming algorithms for hearing aids can enhance performance, the wearer's head may not always face the target talker, potentially limiting real-world benefits. This study aimed to determine the extent to which eye tracking improves the accuracy of locating the current talker in three-way conversations and to test the hypothesis that eye movements become more likely to track the target talker with increasing background noise levels, particularly in older and/or hearing-impaired listeners. Conversations between a participant and two confederates were held around a small table in quiet and with background noise levels of 50, 60, and 70 dB sound pressure level, while the participant's eye and head movements were recorded. Ten young normal-hearing listeners were tested, along with ten older normal-hearing listeners and eight hearing-impaired listeners. Head movements generally undershot the talker's position by 10°-15°, but head and eye movements together predicted the talker's position well. Contrary to our original hypothesis, no major differences in listening behavior were observed between the groups or between noise levels, although the hearing-impaired listeners tended to spend less time looking at the current talker than the other groups, especially at the highest noise level.
Affiliations
- Hao Lu, Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Martin F McKinney, Starkey Hearing Technologies, 6700 Washington Avenue South, Eden Prairie, Minnesota 55344, USA
- Tao Zhang, Starkey Hearing Technologies, 6700 Washington Avenue South, Eden Prairie, Minnesota 55344, USA
- Andrew J Oxenham, Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
17. Buchanan-Worster E, MacSweeney M, Pimperton H, Kyle F, Harris M, Beedie I, Ralph-Lewis A, Hulme C. Speechreading Ability Is Related to Phonological Awareness and Single-Word Reading in Both Deaf and Hearing Children. J Speech Lang Hear Res 2020; 63:3775-3785. PMID: 33108258. PMCID: PMC8530507. DOI: 10.1044/2020_jslhr-20-00159.
Abstract
Purpose: Speechreading (lipreading) is a correlate of reading ability in both deaf and hearing children. We investigated whether the relationship between speechreading and single-word reading is mediated by phonological awareness in deaf and hearing children.
Method: In two separate studies, 66 deaf children and 138 hearing children, aged 5-8 years old, were assessed on measures of speechreading, phonological awareness, and single-word reading. We assessed the concurrent relationships between latent variables measuring speechreading, phonological awareness, and single-word reading.
Results: In both deaf and hearing children, there was a strong relationship between speechreading and single-word reading, which was fully mediated by phonological awareness.
Conclusions: These results are consistent with ideas from previous studies that visual speech information contributes to the development of phonological representations in both deaf and hearing children, which, in turn, support learning to read. Future longitudinal and training studies are required to establish whether these relationships reflect causal effects.
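The full-mediation claim (speechreading to phonological awareness to reading) can be illustrated with a simple regression-based mediation decomposition. The study itself used latent-variable models, so the sketch below, which runs on simulated standardized scores, is a simplified stand-in rather than the authors' analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
speechreading = rng.standard_normal(n)
phon_awareness = 0.6 * speechreading + 0.8 * rng.standard_normal(n)
reading = 0.7 * phon_awareness + 0.7 * rng.standard_normal(n)   # no direct path in this simulation

def beta(y, *predictors):
    """Standardized regression coefficients of y on the given predictors (intercept dropped)."""
    X = np.column_stack([np.ones(len(y))] + [(p - p.mean()) / p.std() for p in predictors])
    yz = (y - y.mean()) / y.std()
    return np.linalg.lstsq(X, yz, rcond=None)[0][1:]

total = beta(reading, speechreading)[0]                 # c path: total effect
a = beta(phon_awareness, speechreading)[0]              # a path
b, direct = beta(reading, phon_awareness, speechreading)  # b path and c' (direct) path
print(f"total={total:.2f}, indirect=a*b={a * b:.2f}, direct={direct:.2f}")
# Full mediation corresponds to the indirect effect carrying the total effect while c' is near zero.
```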
Affiliations
- Elizabeth Buchanan-Worster, Institute of Cognitive Neuroscience, University College London, United Kingdom; Deafness Cognition and Language Research Centre, University College London, United Kingdom
- Mairéad MacSweeney, Institute of Cognitive Neuroscience, University College London, United Kingdom; Deafness Cognition and Language Research Centre, University College London, United Kingdom
- Hannah Pimperton, Institute of Cognitive Neuroscience, University College London, United Kingdom
- Fiona Kyle, Deafness Cognition and Language Research Centre, University College London, United Kingdom
- Margaret Harris, Faculty of Health and Life Sciences, Oxford Brookes University, United Kingdom
- Indie Beedie, Deafness Cognition and Language Research Centre, University College London, United Kingdom
- Amelia Ralph-Lewis, Deafness Cognition and Language Research Centre, University College London, United Kingdom
- Charles Hulme, Department of Education, University of Oxford, United Kingdom
18. Michon M, Boncompte G, López V. Electrophysiological Dynamics of Visual Speech Processing and the Role of Orofacial Effectors for Cross-Modal Predictions. Front Hum Neurosci 2020; 14:538619. PMID: 33192386. PMCID: PMC7653187. DOI: 10.3389/fnhum.2020.538619.
Abstract
The human brain generates predictions about future events. During face-to-face conversations, visemic information is used to predict upcoming auditory input. Recent studies suggest that the speech motor system plays a role in these cross-modal predictions, however, usually only audio-visual paradigms are employed. Here we tested whether speech sounds can be predicted on the basis of visemic information only, and to what extent interfering with orofacial articulatory effectors can affect these predictions. We registered EEG and employed N400 as an index of such predictions. Our results show that N400's amplitude was strongly modulated by visemic salience, coherent with cross-modal speech predictions. Additionally, N400 ceased to be evoked when syllables' visemes were presented backwards, suggesting that predictions occur only when the observed viseme matched an existing articuleme in the observer's speech motor system (i.e., the articulatory neural sequence required to produce a particular phoneme/viseme). Importantly, we found that interfering with the motor articulatory system strongly disrupted cross-modal predictions. We also observed a late P1000 that was evoked only for syllable-related visual stimuli, but whose amplitude was not modulated by interfering with the motor system. The present study provides further evidence of the importance of the speech production system for speech sounds predictions based on visemic information at the pre-lexical level. The implications of these results are discussed in the context of a hypothesized trimodal repertoire for speech, in which speech perception is conceived as a highly interactive process that involves not only your ears but also your eyes, lips and tongue.
Affiliations
- Maëva Michon, Laboratorio de Neurociencia Cognitiva y Evolutiva, Escuela de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile; Laboratorio de Neurociencia Cognitiva y Social, Facultad de Psicología, Universidad Diego Portales, Santiago, Chile
- Gonzalo Boncompte, Laboratorio de Neurodinámicas de la Cognición, Escuela de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile
- Vladimir López, Laboratorio de Psicología Experimental, Escuela de Psicología, Pontificia Universidad Católica de Chile, Santiago, Chile
19. ten Brinke L, Weisbuch M. How verbal-nonverbal consistency shapes the truth. J Exp Soc Psychol 2020. DOI: 10.1016/j.jesp.2020.103978.
20. Maffei V, Indovina I, Mazzarella E, Giusti MA, Macaluso E, Lacquaniti F, Viviani P. Sensitivity of occipito-temporal cortex, premotor and Broca's areas to visible speech gestures in a familiar language. PLoS One 2020; 15:e0234695. PMID: 32559213. PMCID: PMC7304574. DOI: 10.1371/journal.pone.0234695.
Abstract
When looking at a speaking person, the analysis of facial kinematics contributes to language discrimination and to the decoding of the time flow of visual speech. To disentangle these two factors, we investigated behavioural and fMRI responses to familiar and unfamiliar languages when observing speech gestures with natural or reversed kinematics. Twenty Italian volunteers viewed silent video-clips of speech shown as recorded (Forward, biological motion) or reversed in time (Backward, non-biological motion), in Italian (familiar language) or Arabic (non-familiar language). fMRI revealed that language (Italian/Arabic) and time-rendering (Forward/Backward) modulated distinct areas in the ventral occipito-temporal cortex, suggesting that visual speech analysis begins in this region, earlier than previously thought. Left premotor ventral (superior subdivision) and dorsal areas were preferentially activated with the familiar language independently of time-rendering, challenging the view that the role of these regions in speech processing is purely articulatory. The left premotor ventral region in the frontal operculum, thought to include part of the Broca's area, responded to the natural familiar language, consistent with the hypothesis of motor simulation of speech gestures.
Affiliations
- Vincenzo Maffei, Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy; Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy; Data Lake & BI, DOT - Technology, Poste Italiane, Rome, Italy
- Iole Indovina, Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy; Departmental Faculty of Medicine and Surgery, Saint Camillus International University of Health and Medical Sciences, Rome, Italy
- Maria Assunta Giusti, Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
- Emiliano Macaluso, ImpAct Team, Lyon Neuroscience Research Center, Lyon, France; Laboratory of Neuroimaging, IRCCS Santa Lucia Foundation, Rome, Italy
- Francesco Lacquaniti, Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy; Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
- Paolo Viviani, Laboratory of Neuromotor Physiology, IRCCS Santa Lucia Foundation, Rome, Italy; Centre of Space BioMedicine and Department of Systems Medicine, University of Rome Tor Vergata, Rome, Italy
21. Zhou X, Innes-Brown H, McKay CM. Audio-visual integration in cochlear implant listeners and the effect of age difference. J Acoust Soc Am 2019; 146:4144. PMID: 31893708. DOI: 10.1121/1.5134783.
Abstract
This study aimed to investigate differences in audio-visual (AV) integration between cochlear implant (CI) listeners and normal-hearing (NH) adults. A secondary aim was to investigate the effect of age differences by examining AV integration in groups of older and younger NH adults. Seventeen CI listeners, 13 similarly aged NH adults, and 16 younger NH adults were recruited. Two speech identification experiments were conducted to evaluate AV integration of speech cues. In the first experiment, reaction times in audio-alone (A-alone), visual-alone (V-alone), and AV conditions were measured during a speeded task in which participants were asked to identify a target sound /aSa/ among 11 alternatives. A race model was applied to evaluate AV integration. In the second experiment, identification accuracies were measured using a closed set of consonants and an open set of consonant-nucleus-consonant words. The authors quantified AV integration using a combination of a probability model and a cue integration model (which model participants' AV accuracy by assuming no or optimal integration, respectively). The results showed that experienced CI listeners exhibited no better AV integration than similarly aged NH adults. Further, there was no significant difference in AV integration between the younger and older NH adults.
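As an illustration of the race-model logic mentioned above, the sketch below compares an empirical audiovisual reaction-time distribution against the bound formed by summing the two unimodal distributions (Miller's inequality); exceeding the bound is taken as evidence of integration beyond statistical facilitation. This is a minimal, generic Python implementation on simulated reaction times; the function names, quantile grid and simulated values are assumptions for illustration, not the analysis code used in the study.

```python
# Minimal race-model inequality sketch (illustrative; not the study's pipeline).
import numpy as np

def ecdf(rts, t_grid):
    """Empirical cumulative distribution of reaction times, evaluated on t_grid."""
    rts = np.sort(np.asarray(rts, dtype=float))
    return np.searchsorted(rts, t_grid, side="right") / rts.size

def race_model_violation(rt_a, rt_v, rt_av, n_points=20):
    """Amount by which the AV CDF exceeds the race-model bound CDF_A(t) + CDF_V(t).
    Positive values at any quantile suggest integration beyond statistical facilitation."""
    t_grid = np.quantile(np.concatenate([rt_a, rt_v, rt_av]),
                         np.linspace(0.05, 0.95, n_points))
    bound = np.minimum(ecdf(rt_a, t_grid) + ecdf(rt_v, t_grid), 1.0)
    return t_grid, ecdf(rt_av, t_grid) - bound

# Simulated reaction times (ms) for one participant
rng = np.random.default_rng(0)
rt_a = rng.normal(650, 80, 200)    # audio-alone
rt_v = rng.normal(700, 90, 200)    # visual-alone
rt_av = rng.normal(580, 70, 200)   # audiovisual, faster than either unimodal condition
t_grid, violation = race_model_violation(rt_a, rt_v, rt_av)
print(violation.max() > 0)         # True if the race-model bound is violated somewhere
```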
Collapse
Affiliation(s)
- Xin Zhou
- Bionics Institute of Australia, 384-388 East Melbourne, Melbourne, Victoria 3002, Australia
| | - Hamish Innes-Brown
- Bionics Institute of Australia, 384-388 East Melbourne, Melbourne, Victoria 3002, Australia
| | - Colette M McKay
- Bionics Institute of Australia, 384-388 East Melbourne, Melbourne, Victoria 3002, Australia
| |
Collapse
|
22
|
Anderson CA, Wiggins IM, Kitterick PT, Hartley DEH. Pre-operative Brain Imaging Using Functional Near-Infrared Spectroscopy Helps Predict Cochlear Implant Outcome in Deaf Adults. J Assoc Res Otolaryngol 2019; 20:511-528. [PMID: 31286300 PMCID: PMC6797684 DOI: 10.1007/s10162-019-00729-z] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2018] [Accepted: 06/13/2019] [Indexed: 11/26/2022] Open
Abstract
Currently, it is not possible to accurately predict how well a deaf individual will be able to understand speech when hearing is (re)introduced via a cochlear implant. Differences in brain organisation following deafness are thought to contribute to variability in speech understanding with a cochlear implant and may offer unique insights that could help to more reliably predict outcomes. An emerging optical neuroimaging technique, functional near-infrared spectroscopy (fNIRS), was used to determine whether a pre-operative measure of brain activation could explain variability in cochlear implant (CI) outcomes and offer additional prognostic value above that provided by known clinical characteristics. Cross-modal activation to visual speech was measured in bilateral superior temporal cortex of pre- and post-lingually deaf adults before cochlear implantation. Behavioural measures of auditory speech understanding were obtained in the same individuals following 6 months of cochlear implant use. The results showed that stronger pre-operative cross-modal activation of auditory brain regions by visual speech was predictive of poorer auditory speech understanding after implantation. Further investigation suggested that this relationship may have been driven primarily by the inclusion of, and group differences between, pre- and post-lingually deaf individuals. Nonetheless, pre-operative cortical imaging provided additional prognostic value above that of influential clinical characteristics, including the age-at-onset and duration of auditory deprivation, suggesting that objectively assessing the physiological status of the brain using fNIRS imaging pre-operatively may support more accurate prediction of individual CI outcomes. Whilst activation of auditory brain regions by visual speech prior to implantation was related to the CI user's clinical history of deafness, activation to visual speech did not relate to the future ability of these brain regions to respond to auditory speech stimulation with a CI. Greater pre-operative activation of left superior temporal cortex by visual speech was associated with enhanced speechreading abilities, suggesting that visual speech processing may help to maintain left temporal lobe specialisation for language processing during periods of profound deafness.
Collapse
Affiliation(s)
- Carly A Anderson
- National Institute for Health Research (NIHR), Nottingham Biomedical Research Centre, Ropewalk House, 113 The Ropewalk, Nottingham, NG1 5DU, UK.
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK.
| | - Ian M Wiggins
- National Institute for Health Research (NIHR), Nottingham Biomedical Research Centre, Ropewalk House, 113 The Ropewalk, Nottingham, NG1 5DU, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
| | - Pádraig T Kitterick
- National Institute for Health Research (NIHR), Nottingham Biomedical Research Centre, Ropewalk House, 113 The Ropewalk, Nottingham, NG1 5DU, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
| | - Douglas E H Hartley
- National Institute for Health Research (NIHR), Nottingham Biomedical Research Centre, Ropewalk House, 113 The Ropewalk, Nottingham, NG1 5DU, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
- Nottingham University Hospitals NHS Trust, Derby Road, Nottingham, NG7 2UH, UK
| |
Collapse
|
23
|
van de Rijt LPH, Roye A, Mylanus EAM, van Opstal AJ, van Wanrooij MM. The Principle of Inverse Effectiveness in Audiovisual Speech Perception. Front Hum Neurosci 2019; 13:335. [PMID: 31611780 PMCID: PMC6775866 DOI: 10.3389/fnhum.2019.00335] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2019] [Accepted: 09/11/2019] [Indexed: 11/13/2022] Open
Abstract
We assessed how synchronous speech listening and lipreading affects speech recognition in acoustic noise. In simple audiovisual perceptual tasks, inverse effectiveness is often observed, which holds that the weaker the unimodal stimuli, or the poorer their signal-to-noise ratio, the stronger the audiovisual benefit. So far, however, inverse effectiveness has not been demonstrated for complex audiovisual speech stimuli. Here we assess whether this multisensory integration effect can also be observed for the recognizability of spoken words. To that end, we presented audiovisual sentences to 18 native-Dutch normal-hearing participants, who had to identify the spoken words from a finite list. Speech-recognition performance was determined for auditory-only, visual-only (lipreading), and auditory-visual conditions. To modulate acoustic task difficulty, we systematically varied the auditory signal-to-noise ratio. In line with a commonly observed multisensory enhancement on speech recognition, audiovisual words were more easily recognized than auditory-only words (recognition thresholds of -15 and -12 dB, respectively). We here show that the difficulty of recognizing a particular word, either acoustically or visually, determines the occurrence of inverse effectiveness in audiovisual word integration. Thus, words that are better heard or recognized through lipreading, benefit less from bimodal presentation. Audiovisual performance at the lowest acoustic signal-to-noise ratios (45%) fell below the visual recognition rates (60%), reflecting an actual deterioration of lipreading in the presence of excessive acoustic noise. This suggests that the brain may adopt a strategy in which attention has to be divided between listening and lipreading.
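A minimal sketch of how inverse effectiveness can be probed at the word level, in the spirit of the analysis described above: each word's audiovisual benefit is related to how well that word is recognized unimodally, and a negative relationship is the inverse-effectiveness pattern. The simulated data, the gain definition and the variable names are assumptions for illustration, not the study's materials.

```python
# Word-level inverse-effectiveness sketch on simulated data (illustrative only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_words = 50
p_a = rng.uniform(0.1, 0.9, n_words)        # auditory-only proportion correct per word
p_v = rng.uniform(0.1, 0.9, n_words)        # visual-only (lipreading) proportion correct
p_best = np.maximum(p_a, p_v)               # best unimodal recognizability of each word

# Simulated AV scores: harder words (low p_best) receive a larger bimodal boost
p_av = np.clip(p_best + 0.4 * (1.0 - p_best) + rng.normal(0, 0.05, n_words), 0.0, 1.0)

gain = p_av - p_best                         # audiovisual benefit per word
r, p = stats.pearsonr(p_best, gain)
print(f"r = {r:.2f}, p = {p:.3g}")           # negative r is the inverse-effectiveness pattern
```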
Collapse
Affiliation(s)
- Luuk P. H. van de Rijt
- Department of Otorhinolaryngology, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Medical Center, Nijmegen, Netherlands
| | - Anja Roye
- Department of Biophysics, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands
| | - Emmanuel A. M. Mylanus
- Department of Otorhinolaryngology, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Medical Center, Nijmegen, Netherlands
| | - A. John van Opstal
- Department of Biophysics, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands
| | - Marc M. van Wanrooij
- Department of Biophysics, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands
| |
Collapse
|
24
|
Pimperton H, Kyle F, Hulme C, Harris M, Beedie I, Ralph-Lewis A, Worster E, Rees R, Donlan C, MacSweeney M. Computerized Speechreading Training for Deaf Children: A Randomized Controlled Trial. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:2882-2894. [PMID: 31336055 PMCID: PMC6839416 DOI: 10.1044/2019_jslhr-h-19-0073] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/15/2019] [Revised: 04/04/2019] [Accepted: 04/11/2019] [Indexed: 06/10/2023]
Abstract
Purpose We developed and evaluated in a randomized controlled trial a computerized speechreading training program to determine (a) whether it is possible to train speechreading in deaf children and (b) whether speechreading training results in improvements in phonological and reading skills. Previous studies indicate a relationship between speechreading and reading skill and further suggest this relationship may be mediated by improved phonological representations. This is important since many deaf children find learning to read to be very challenging. Method Sixty-six deaf 5- to 7-year-olds were randomized into speechreading and maths training arms. Each training program comprised one 10-min session a day, 4 days a week, for 12 weeks. Children were assessed on a battery of language and literacy measures before training, immediately after training, and 3 months and 11 months after training. Results We found no significant benefits for participants who completed the speechreading training, compared to those who completed the maths training, on the speechreading primary outcome measure. However, significantly greater gains were observed in the speechreading training group on one of the secondary measures of speechreading. There was also some evidence of beneficial effects of the speechreading training on phonological representations; however, these effects were weaker. No benefits were seen for word reading. Conclusions Speechreading skill is trainable in deaf children. However, to support early reading, training may need to be longer or embedded in a broader literacy program. Nevertheless, a training tool that can improve speechreading is likely to be of great interest to professionals working with deaf children. Supplemental Material https://doi.org/10.23641/asha.8856356.
Collapse
Affiliation(s)
- Hannah Pimperton
- Institute of Cognitive Neuroscience, University College London, United Kingdom
- Deafness, Cognition and Language Research Centre, University College London, United Kingdom
| | - Fiona Kyle
- Division of Language and Communication Science, City University of London, United Kingdom
| | - Charles Hulme
- Department of Education, University of Oxford, United Kingdom
| | - Margaret Harris
- Faculty of Health and Life Sciences, Oxford Brookes University, United Kingdom
| | - Indie Beedie
- Institute of Cognitive Neuroscience, University College London, United Kingdom
- Deafness, Cognition and Language Research Centre, University College London, United Kingdom
| | - Amelia Ralph-Lewis
- Institute of Cognitive Neuroscience, University College London, United Kingdom
- Deafness, Cognition and Language Research Centre, University College London, United Kingdom
| | - Elizabeth Worster
- Institute of Cognitive Neuroscience, University College London, United Kingdom
| | - Rachel Rees
- Department of Language and Cognition, University College London, United Kingdom
| | - Chris Donlan
- Department of Language and Cognition, University College London, United Kingdom
| | - Mairéad MacSweeney
- Institute of Cognitive Neuroscience, University College London, United Kingdom
- Deafness, Cognition and Language Research Centre, University College London, United Kingdom
| |
Collapse
|
25
|
Bayard C, Machart L, Strauß A, Gerber S, Aubanel V, Schwartz JL. Cued Speech Enhances Speech-in-Noise Perception. JOURNAL OF DEAF STUDIES AND DEAF EDUCATION 2019; 24:223-233. [PMID: 30809665 DOI: 10.1093/deafed/enz003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/26/2018] [Revised: 01/28/2019] [Accepted: 01/31/2019] [Indexed: 06/09/2023]
Abstract
Speech perception in noise remains challenging for Deaf/Hard of Hearing people (D/HH), even fitted with hearing aids or cochlear implants. The perception of sentences in noise by 20 implanted or aided D/HH subjects mastering Cued Speech (CS), a system of hand gestures complementing lip movements, was compared with the perception of 15 typically hearing (TH) controls in three conditions: audio only, audiovisual, and audiovisual + CS. Similar audiovisual scores were obtained for signal-to-noise ratios (SNRs) 11 dB higher in D/HH participants compared with TH ones. Adding CS information enabled D/HH participants to reach a mean score of 83% in the audiovisual + CS condition at a mean SNR of 0 dB, similar to the usual audio score for TH participants at this SNR. This confirms that the combination of lipreading and Cued Speech system remains extremely important for persons with hearing loss, particularly in adverse hearing conditions.
Collapse
Affiliation(s)
| | | | - Antje Strauß
- Zukunftskolleg, FB Sprachwissenschaft, University of Konstanz
| | | | | | | |
Collapse
|
26
|
Mastrantuono E, Burigo M, Rodríguez-Ortiz IR, Saldaña D. The Role of Multiple Articulatory Channels of Sign-Supported Speech Revealed by Visual Processing. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:1625-1656. [PMID: 31095442 DOI: 10.1044/2019_jslhr-s-17-0433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Purpose The use of sign-supported speech (SSS) in the education of deaf students has been recently discussed in relation to its usefulness with deaf children using cochlear implants. To clarify the benefits of SSS for comprehension, 2 eye-tracking experiments aimed to detect the extent to which signs are actively processed in this mode of communication. Method Participants were 36 deaf adolescents, including cochlear implant users and native deaf signers. Experiment 1 attempted to shift observers' foveal attention to the linguistic source in SSS from which most information is extracted, lip movements or signs, by magnifying the face area, thus modifying lip movements perceptual accessibility (magnified condition), and by constraining the visual field to either the face or the sign through a moving window paradigm (gaze contingent condition). Experiment 2 aimed to explore the reliance on signs in SSS by occasionally producing a mismatch between sign and speech. Participants were required to concentrate upon the orally transmitted message. Results In Experiment 1, analyses revealed a greater number of fixations toward the signs and a reduction in accuracy in the gaze contingent condition across all participants. Fixations toward signs were also increased in the magnified condition. In Experiment 2, results indicated less accuracy in the mismatching condition across all participants. Participants looked more at the sign when it was inconsistent with speech. Conclusions All participants, even those with residual hearing, rely on signs when attending SSS, either peripherally or through overt attention, depending on the perceptual conditions. Supplemental Material https://doi.org/10.23641/asha.8121191.
Collapse
Affiliation(s)
- Eliana Mastrantuono
- Departamento de Psicología Evolutiva y de la Educación, Universidad de Sevilla, Spain
| | - Michele Burigo
- Cognitive Interaction Technology, University of Bielefeld, Germany
| | | | - David Saldaña
- Departamento de Psicología Evolutiva y de la Educación, Universidad de Sevilla, Spain
| |
Collapse
|
27
|
Mastrantuono E, Saldaña D, Rodríguez-Ortiz IR. Inferencing in Deaf Adolescents during Sign-Supported Speech Comprehension. DISCOURSE PROCESSES 2019. [DOI: 10.1080/0163853x.2018.1490133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Affiliation(s)
- Eliana Mastrantuono
- Departamento de Psicología Evolutiva y de la Educación, Universidad de Sevilla, Sevilla, Spain
| | - David Saldaña
- Departamento de Psicología Evolutiva y de la Educación, Universidad de Sevilla, Sevilla, Spain
| | | |
Collapse
|
28
|
Schmitz J, Bartoli E, Maffongelli L, Fadiga L, Sebastian-Galles N, D’Ausilio A. Motor cortex compensates for lack of sensory and motor experience during auditory speech perception. Neuropsychologia 2019; 128:290-296. [DOI: 10.1016/j.neuropsychologia.2018.01.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2017] [Revised: 12/18/2017] [Accepted: 01/05/2018] [Indexed: 10/18/2022]
|
29
|
Deaf signers outperform hearing non-signers in recognizing happy facial expressions. PSYCHOLOGICAL RESEARCH 2019; 84:1485-1494. [DOI: 10.1007/s00426-019-01160-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2018] [Accepted: 02/25/2019] [Indexed: 01/21/2023]
|
30
|
Origin and evolution of human speech: Emergence from a trimodal auditory, visual and vocal network. PROGRESS IN BRAIN RESEARCH 2019; 250:345-371. [PMID: 31703907 DOI: 10.1016/bs.pbr.2019.01.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
In recent years, there have been important additions to the classical model of speech processing as originally depicted by the Broca-Wernicke model consisting of an anterior, productive region and a posterior, perceptive region, both connected via the arcuate fasciculus. The modern view implies a separation into a dorsal and a ventral pathway conveying different kinds of linguistic information, which parallels the organization of the visual system. Furthermore, this organization is highly conserved in evolution and can be seen as the neural scaffolding from which the speech networks originated. In this chapter we emphasize that the speech networks are embedded in a multimodal system encompassing audio-vocal and visuo-vocal connections, which can be traced back to an ancestral audio-visuo-motor pathway present in nonhuman primates. Likewise, we propose a trimodal repertoire for speech processing and acquisition involving auditory, visual and motor representations of the basic elements of speech: phonemes, observation of mouth movements, and articulatory processes. Finally, we discuss this proposal in the context of a scenario for early speech acquisition in infants and in human evolution.
Collapse
|
31
|
Bernstein LE, Besser J, Maidment DW, Swanepoel DW. Innovation in the Context of Audiology and in the Context of the Internet. Am J Audiol 2018; 27:376-384. [PMID: 30452742 PMCID: PMC6437706 DOI: 10.1044/2018_aja-imia3-18-0018] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2018] [Revised: 06/15/2018] [Accepted: 06/18/2018] [Indexed: 12/27/2022] Open
Abstract
PURPOSE This article explores different meanings of innovation within the context of audiology and the Internet. Case studies are used to illustrate and elaborate on the new types of innovation and their levels of impact. METHOD The article defines innovation, providing case studies illustrating a taxonomy of innovation types. RESULTS Innovation ranges from minor changes in technology implemented on existing platforms to radical or disruptive changes that provide exceptional benefits and transform markets. Innovations within the context of audiology and the Internet can be found across that range. The case studies presented demonstrate that innovations in hearing care can span across a number of innovation types and levels of impact. Considering the global need for improved access and efficiency in hearing care, innovations that demonstrate a sustainable impact on a large scale, with the potential to rapidly upscale this impact, should be prioritized. CONCLUSIONS It is unclear presently what types of innovations are likely to have the most profound impacts on audiology in the coming years. In the best case, they will lead to more efficient, effective, and widespread availability of hearing health on a global scale.
Collapse
Affiliation(s)
- Lynne E. Bernstein
- Department of Speech, Language, and Hearing Sciences, George Washington University, Washington, DC
| | - Jana Besser
- Department of Science and Technology, Sonova AG, Stäfa, Switzerland
| | - David W. Maidment
- National Institute for Health Research, Nottingham Biomedical Research Centre, Nottingham, United Kingdom
- Hearing Sciences Section, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, United Kingdom
| | - De Wet Swanepoel
- Department of Speech-Language Pathology and Audiology, University of Pretoria, Gauteng, South Africa
- Ear Sciences Centre, School of Surgery, University of Western Australia, Nedlands, Australia
- Ear Science Institute Australia, Subiaco, Western Australia
| |
Collapse
|
32
|
Chen L, Lei J, Gong H. The effect of hearing status on speechreading performance of Chinese adolescents. CLINICAL LINGUISTICS & PHONETICS 2018; 32:1090-1102. [PMID: 30183411 DOI: 10.1080/02699206.2018.1510986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The effect of hearing status on the ability to speechread is poorly understood, and current findings are inconclusive regarding differences in speechreading performance between children and adults with hearing impairment and those with normal hearing. In this study, we investigated the effect of hearing status on speechreading skills in Chinese adolescents. Thirty-seven severely deaf students with a mean pure-tone average of 93 dB hearing threshold level and 21 hearing controls aged 16 completed tasks measuring their speechreading of simplex finals (monophthongs), complex finals (diphthongs or vowel + nasal constellations) and initials (consonants) in Chinese. Both accuracy rate and response time data were collected. Results showed no significant difference in accuracy between groups. By contrast, deaf individuals were significantly faster at speechreading than their hearing controls. In addition, for both groups, performance on speechreading simplex finals was faster and more accurate than on complex finals, which in turn was better than on initial consonants. We conclude that speechreading skills in Chinese adolescents are influenced by hearing status, the characteristics of the sounds to be identified, and the measures used.
Collapse
Affiliation(s)
- Liang Chen
- Communication Sciences and Special Education, University of Georgia, Athens, GA, USA
| | - Jianghua Lei
- Department of Special Education, Central China Normal University, Wuhan, China
| | - Huina Gong
- Department of Special Education, Central China Normal University, Wuhan, China
| |
Collapse
|
33
|
Dobs K, Bülthoff I, Schultz J. Use and Usefulness of Dynamic Face Stimuli for Face Perception Studies-a Review of Behavioral Findings and Methodology. Front Psychol 2018; 9:1355. [PMID: 30123162 PMCID: PMC6085596 DOI: 10.3389/fpsyg.2018.01355] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2018] [Accepted: 07/13/2018] [Indexed: 01/01/2023] Open
Abstract
Faces that move contain rich information about facial form, such as facial features and their configuration, alongside the motion of those features. During social interactions, humans constantly decode and integrate these cues. To fully understand human face perception, it is important to investigate what information dynamic faces convey and how the human visual system extracts and processes information from this visual input. However, partly due to the difficulty of designing well-controlled dynamic face stimuli, many face perception studies still rely on static faces as stimuli. Here, we focus on evidence demonstrating the usefulness of dynamic faces as stimuli, and evaluate different types of dynamic face stimuli to study face perception. Studies based on dynamic face stimuli revealed a high sensitivity of the human visual system to natural facial motion and consistently reported dynamic advantages when static face information is insufficient for the task. These findings support the hypothesis that the human perceptual system integrates sensory cues for robust perception. In the present paper, we review the different types of dynamic face stimuli used in these studies, and assess their usefulness for several research questions. Natural videos of faces are ecological stimuli but provide limited control of facial form and motion. Point-light faces allow for good control of facial motion but are highly unnatural. Image-based morphing is a way to achieve control over facial motion while preserving the natural facial form. Synthetic facial animations allow separation of facial form and motion to study aspects such as identity-from-motion. While synthetic faces are less natural than videos of faces, recent advances in photo-realistic rendering may close this gap and provide naturalistic stimuli with full control over facial motion. We believe that many open questions, such as what dynamic advantages exist beyond emotion and identity recognition and which dynamic aspects drive these advantages, can be addressed adequately with different types of stimuli and will improve our understanding of face perception in more ecological settings.
Collapse
Affiliation(s)
- Katharina Dobs
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, United States; Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
| | - Isabelle Bülthoff
- Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
| | - Johannes Schultz
- Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Tübingen, Germany; Division of Medical Psychology and Department of Psychiatry, University of Bonn, Bonn, Germany
| |
Collapse
|
34
|
Garnier M, Ménard L, Alexandre B. Hyper-articulation in Lombard speech: An active communicative strategy to enhance visible speech cues? THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:1059. [PMID: 30180713 DOI: 10.1121/1.5051321] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/22/2017] [Accepted: 08/02/2018] [Indexed: 06/08/2023]
Abstract
This study investigates the hypothesis that speakers make active use of the visual modality in production to improve their speech intelligibility in noisy conditions. Six native speakers of Canadian French produced speech in quiet conditions and in 85 dB of babble noise, in three situations: interacting face-to-face with the experimenter (AV), using the auditory modality only (AO), or reading aloud (NI, no interaction). The audio signal was recorded with the three-dimensional movements of their lips and tongue, using electromagnetic articulography. All the speakers reacted similarly to the presence vs absence of communicative interaction, showing significant speech modifications with noise exposure in both interactive and non-interactive conditions, not only for parameters directly related to voice intensity or for lip movements (very visible) but also for tongue movements (less visible); greater adaptation was observed in interactive conditions, though. However, speakers reacted differently to the availability or unavailability of visual information: only four speakers enhanced their visible articulatory movements more in the AV condition. These results support the idea that the Lombard effect is at least partly a listener-oriented adaptation. However, to clarify their speech in noisy conditions, only some speakers appear to make active use of the visual modality.
Collapse
Affiliation(s)
- Maëva Garnier
- Centre National de la Recherche Scientifique, Laboratoire Grenoble Images Parole Signal Automatique, 11 rue des Mathématiques, Grenoble Campus, Boîte Postale 46, F-38402 Saint Martin d'Hères Cedex, France
| | - Lucie Ménard
- Département de Linguistique, Laboratoire de Phonétique, Center for Research on Brain, Language, and Music, Université du Québec à Montréal, 320, Ste-Catherine Est, Montréal, Quebec H2X 1L7, Canada
| | - Boris Alexandre
- Centre National de la Recherche Scientifique, Laboratoire Grenoble Images Parole Signal Automatique, 11 rue des Mathématiques, Grenoble Campus, Boîte Postale 46, F-38402 Saint Martin d'Hères Cedex, France
| |
Collapse
|
35
|
Worster E, Pimperton H, Ralph-Lewis A, Monroy L, Hulme C, MacSweeney M. Eye Movements During Visual Speech Perception in Deaf and Hearing Children. LANGUAGE LEARNING 2018; 68:159-179. [PMID: 29937576 PMCID: PMC6001475 DOI: 10.1111/lang.12264] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/19/2017] [Revised: 08/10/2017] [Accepted: 08/11/2017] [Indexed: 06/08/2023]
Abstract
For children who are born deaf, lipreading (speechreading) is an important source of access to spoken language. We used eye tracking to investigate the strategies used by deaf (n = 33) and hearing 5-8-year-olds (n = 59) during a sentence speechreading task. The proportion of time spent looking at the mouth during speech correlated positively with speechreading accuracy. In addition, all children showed a tendency to watch the mouth during speech and watch the eyes when the model was not speaking. The extent to which the children used this communicative pattern, which we refer to as social-tuning, positively predicted their speechreading performance, with the deaf children showing a stronger relationship than the hearing children. These data suggest that better speechreading skills are seen in those children, both deaf and hearing, who are able to guide their visual attention to the appropriate part of the image and in those who have a good understanding of conversational turn-taking.
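The dwell-time measure described above reduces to the proportion of gaze samples falling inside a mouth area of interest while the model is speaking, which can then be correlated with speechreading accuracy across children. The sketch below is a generic illustration on toy data; the AOI bounds, sample sizes and variable names are assumptions, not the study's stimuli or analysis code.

```python
# Mouth-dwell proportion and its correlation with accuracy (illustrative toy data).
import numpy as np
from scipy import stats

def mouth_dwell_proportion(gaze_xy, speaking, mouth_box):
    """Proportion of speaking-period gaze samples inside the mouth AOI.
    gaze_xy: (n, 2) screen coordinates; speaking: (n,) bool; mouth_box: (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = mouth_box
    in_mouth = ((gaze_xy[:, 0] >= x0) & (gaze_xy[:, 0] <= x1) &
                (gaze_xy[:, 1] >= y0) & (gaze_xy[:, 1] <= y1))
    return (in_mouth & speaking).sum() / max(int(speaking.sum()), 1)

rng = np.random.default_rng(4)
gaze = rng.uniform(0, 1, (500, 2))                 # normalised screen coordinates, one trial
speaking = rng.uniform(0, 1, 500) > 0.3            # samples where the model is speaking
print(mouth_dwell_proportion(gaze, speaking, (0.4, 0.55, 0.6, 0.75)))

# Toy group-level correlation between mouth dwell proportion and speechreading accuracy
dwell = rng.uniform(0.2, 0.9, 30)                                  # one value per child
accuracy = np.clip(0.3 + 0.5 * dwell + rng.normal(0, 0.08, 30), 0, 1)
print(stats.pearsonr(dwell, accuracy))
```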
Collapse
Affiliation(s)
| | | | - Amelia Ralph-Lewis
- Deafness, Cognition, and Language Research Centre, University College London
| | - Laura Monroy
- Institute of Cognitive Neuroscience, University College London
| | | | - Mairéad MacSweeney
- Institute of Cognitive Neuroscience, University College London
- Deafness, Cognition, and Language Research Centre, University College London
| |
Collapse
|
36
|
Bernstein LE. Response Errors in Females' and Males' Sentence Lipreading Necessitate Structurally Different Models for Predicting Lipreading Accuracy. LANGUAGE LEARNING 2018; 68:127-158. [PMID: 31485084 PMCID: PMC6724546 DOI: 10.1111/lang.12281] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Lipreaders recognize words with phonetically impoverished stimuli, an ability that is generally poor in normal-hearing adults. Individual sentence lipreading trials from 341 young adults were modeled to predict words and phonemes correct in terms of measures of phoneme response dissimilarity (PRD), number of inserted incorrect response phonemes, lipreader gender, and a measure of speech perception in noise. Interactions with lipreaders' gender necessitated structurally different models of males' and females' lipreading. Overall, female lipreaders are more accurate, their ability to recognize words with impoverished or degraded input is consistent across visual and auditory modalities, and they amplify their correct responding through top-down insertion of text. Males' responses suggest that individuals with poorer auditory speech perception in noise amplify their responses by shifting towards including text in their response that is more perceptually discrepant from the stimulus. Gender differences merit attention in future studies that use visual speech stimuli.
Collapse
Affiliation(s)
- Lynne E Bernstein
- Department of Speech, Language, and Hearing Science, George Washington University, 2121 I St NW, Washington, DC 20052
| |
Collapse
|
37
|
Díaz B, Blank H, von Kriegstein K. Task-dependent modulation of the visual sensory thalamus assists visual-speech recognition. Neuroimage 2018; 178:721-734. [PMID: 29772380 DOI: 10.1016/j.neuroimage.2018.05.032] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2017] [Revised: 04/12/2018] [Accepted: 05/12/2018] [Indexed: 11/19/2022] Open
Abstract
The cerebral cortex modulates early sensory processing via feed-back connections to sensory pathway nuclei. The functions of this top-down modulation for human behavior are poorly understood. Here, we show that top-down modulation of the visual sensory thalamus (the lateral geniculate body, LGN) is involved in visual-speech recognition. In two independent functional magnetic resonance imaging (fMRI) studies, LGN response increased when participants processed fast-varying features of articulatory movements required for visual-speech recognition, as compared to temporally more stable features required for face identification with the same stimulus material. The LGN response during the visual-speech task correlated positively with the visual-speech recognition scores across participants. In addition, the task-dependent modulation was present for speech movements and did not occur for control conditions involving non-speech biological movements. In face-to-face communication, visual speech recognition is used to enhance or even enable understanding what is said. Speech recognition is commonly explained in frameworks focusing on cerebral cortex areas. Our findings suggest that task-dependent modulation at subcortical sensory stages has an important role for communication: Together with similar findings in the auditory modality the findings imply that task-dependent modulation of the sensory thalami is a general mechanism to optimize speech recognition.
Collapse
Affiliation(s)
- Begoña Díaz
- Center for Brain and Cognition, Pompeu Fabra University, Barcelona, 08018, Spain; Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, 04103, Germany; Department of Basic Sciences, Faculty of Medicine and Health Sciences, International University of Catalonia, 08195 Sant Cugat del Vallès, Spain.
| | - Helen Blank
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, 04103, Germany; University Medical Center Hamburg-Eppendorf, 20246, Hamburg, Germany
| | - Katharina von Kriegstein
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, 04103, Germany; Faculty of Psychology, Technische Universität Dresden, 01187, Dresden, Germany
| |
Collapse
|
38
|
Validating a Method to Assess Lipreading, Audiovisual Gain, and Integration During Speech Reception With Cochlear-Implanted and Normal-Hearing Subjects Using a Talking Head. Ear Hear 2018; 39:503-516. [DOI: 10.1097/aud.0000000000000502] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
39
|
Hennequin A, Rochet-Capellan A, Gerber S, Dohen M. Does the Visual Channel Improve the Perception of Consonants Produced by Speakers of French With Down Syndrome? JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:957-972. [PMID: 29635399 DOI: 10.1044/2017_jslhr-h-17-0112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/30/2017] [Accepted: 12/08/2017] [Indexed: 06/08/2023]
Abstract
PURPOSE This work evaluates whether seeing the speaker's face could improve the speech intelligibility of adults with Down syndrome (DS). This is not straightforward because DS induces a number of anatomical and motor anomalies affecting the orofacial zone. METHOD A speech-in-noise perception test was used to evaluate the intelligibility of 16 consonants (Cs) produced in a vowel-consonant-vowel context (Vo = /a/) by 4 speakers with DS and 4 control speakers. Forty-eight naïve participants were asked to identify the stimuli in 3 modalities: auditory (A), visual (V), and auditory-visual (AV). The probability of correct responses was analyzed, as well as AV gain, confusions, and transmitted information as a function of modality and phonetic features. RESULTS The probability of correct response follows the trend AV > A > V, with smaller values for the DS than the control speakers in A and AV but not in V. This trend depended on the C: the V information particularly improved the transmission of place of articulation and to a lesser extent of manner, whereas voicing remained specifically altered in DS. CONCLUSIONS The results suggest that the V information is intact in the speech of people with DS and improves the perception of some phonetic features in Cs in a similar way as for control speakers. This result has implications for further studies, rehabilitation protocols, and specific training of caregivers. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.6002267.
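Transmitted information, as used above to quantify how well phonetic features survive each modality, is the mutual information between stimulus and response categories computed from a confusion matrix (in the Miller and Nicely tradition). The Python sketch below shows the computation on a toy matrix; the matrix itself is an illustrative assumption, not the study's data.

```python
# Transmitted information (mutual information in bits) from a confusion matrix.
import numpy as np

def transmitted_information(confusions):
    """Mutual information (bits) between stimulus (rows) and response (columns) categories."""
    p = confusions / confusions.sum()
    p_stim = p.sum(axis=1, keepdims=True)        # marginal over responses
    p_resp = p.sum(axis=0, keepdims=True)        # marginal over stimuli
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = p * np.log2(p / (p_stim * p_resp))
    return np.nansum(terms)                      # cells with p = 0 contribute nothing

# Toy 3-consonant confusion matrix (rows: stimuli, columns: responses)
conf = np.array([[18, 1, 1],
                 [2, 15, 3],
                 [1, 4, 15]], dtype=float)
print(f"{transmitted_information(conf):.2f} bits")
```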
Collapse
Affiliation(s)
| | | | - Silvain Gerber
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
| | - Marion Dohen
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
| |
Collapse
|
40
|
Hládek Ľ, Porr B, Brimijoin WO. Real-time estimation of horizontal gaze angle by saccade integration using in-ear electrooculography. PLoS One 2018; 13:e0190420. [PMID: 29304120 PMCID: PMC5755791 DOI: 10.1371/journal.pone.0190420] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2017] [Accepted: 12/14/2017] [Indexed: 12/04/2022] Open
Abstract
The manuscript proposes and evaluates a real-time algorithm for estimating eye gaze angle based solely on single-channel electrooculography (EOG), which can be obtained directly from the ear canal using conductive ear moulds. In contrast to conventional high-pass filtering, we used an algorithm that calculates absolute eye gaze angle via statistical analysis of detected saccades. The estimated eye positions of the new algorithm were still noisy. However, the performance in terms of Pearson product-moment correlation coefficients was significantly better than the conventional approach in some instances. The results suggest that in-ear EOG signals captured with conductive ear moulds could serve as a basis for light-weight and portable horizontal eye gaze angle estimation suitable for a broad range of applications. For instance, hearing aids could use such estimates to steer the directivity of their microphones in the direction of the user's eye gaze.
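To give a flavour of saccade-integration gaze estimation, the sketch below detects step-like saccades in a single-channel EOG trace with a simple velocity threshold and accumulates their amplitudes into a running gaze-angle estimate. The thresholds, EOG-to-degree scaling and the absence of drift correction are assumptions for illustration; they differ from the published, statistically grounded algorithm.

```python
# Toy saccade-integration gaze estimate from single-channel EOG (illustrative only).
import numpy as np

def integrate_saccades(eog, fs, vel_thresh=200.0, uv_per_deg=10.0):
    """Return a gaze-angle estimate (degrees) with the same length as `eog` (microvolts)."""
    deg = eog / uv_per_deg                        # rough EOG-to-degrees scaling
    velocity = np.gradient(deg) * fs              # deg/s
    gaze = np.zeros_like(deg)
    angle = 0.0
    in_saccade = False
    start = 0
    for i, v in enumerate(velocity):
        if not in_saccade and abs(v) > vel_thresh:
            in_saccade, start = True, i           # saccade onset
        elif in_saccade and abs(v) <= vel_thresh:
            angle += deg[i] - deg[start]          # add the saccade's amplitude
            in_saccade = False
        gaze[i] = angle
    return gaze

# Example: synthetic EOG with two saccades (10 deg right, then 5 deg left) plus noise
fs = 250
t = np.arange(0, 4, 1 / fs)
true_deg = np.where(t > 1, 10.0, 0.0) + np.where(t > 2.5, -5.0, 0.0)
eog = true_deg * 10.0 + np.random.default_rng(2).normal(0, 1.0, t.size)
print(integrate_saccades(eog, fs)[-1])            # roughly 5 degrees
```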
Collapse
Affiliation(s)
- Ľuboš Hládek
- Medical Research Council/Chief Scientist Office Institute of Hearing Research - Scottish Section, Glasgow, United Kingdom
| | - Bernd Porr
- School of Engineering, University of Glasgow, Glasgow, United Kingdom
| | - W. Owen Brimijoin
- Medical Research Council/Chief Scientist Office Institute of Hearing Research - Scottish Section, Glasgow, United Kingdom
| |
Collapse
|
41
|
Ross LA, Del Bene VA, Molholm S, Woo YJ, Andrade GN, Abrahams BS, Foxe JJ. Common variation in the autism risk gene CNTNAP2, brain structural connectivity and multisensory speech integration. BRAIN AND LANGUAGE 2017; 174:50-60. [PMID: 28738218 DOI: 10.1016/j.bandl.2017.07.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/17/2017] [Revised: 04/07/2017] [Accepted: 07/11/2017] [Indexed: 06/07/2023]
Abstract
Three lines of evidence motivated this study. 1) CNTNAP2 variation is associated with autism risk and speech-language development. 2) CNTNAP2 variations are associated with differences in white matter (WM) tracts comprising the speech-language circuitry. 3) Children with autism show impairment in multisensory speech perception. Here, we asked whether an autism risk-associated CNTNAP2 single nucleotide polymorphism in neurotypical adults was associated with multisensory speech perception performance, and whether such a genotype-phenotype association was mediated through white matter tract integrity in speech-language circuitry. Risk genotype at rs7794745 was associated with decreased benefit from visual speech and lower fractional anisotropy (FA) in several WM tracts (right precentral gyrus, left anterior corona radiata, right retrolenticular internal capsule). These structural connectivity differences were found to mediate the effect of genotype on audiovisual speech perception, shedding light on possible pathogenic pathways in autism and biological sources of inter-individual variation in audiovisual speech processing in neurotypicals.
Collapse
Affiliation(s)
- Lars A Ross
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA.
| | - Victor A Del Bene
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA; Ferkauf Graduate School of Psychology Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Sophie Molholm
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA; Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Young Jae Woo
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Gizely N Andrade
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA
| | - Brett S Abrahams
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - John J Foxe
- The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center (CERC), Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Bronx, NY 10461, USA; Department of Neuroscience, Rose F. Kennedy Intellectual and Developmental Disabilities Research Center, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Ernest J. Del Monte Institute for Neuroscience, Department of Neuroscience, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA.
| |
Collapse
|
42
|
Stropahl M, Debener S. Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration. Neuroimage Clin 2017; 16:514-523. [PMID: 28971005 PMCID: PMC5609862 DOI: 10.1016/j.nicl.2017.09.001] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2017] [Revised: 08/15/2017] [Accepted: 09/02/2017] [Indexed: 11/28/2022]
Abstract
There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensori-neural hearing loss is already sufficient to initiate corresponding cortical changes. To what extent these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influence of cross-modal reorganization on CI benefit is also still unclear. While reorganization during deafness may impede speech recovery, reorganization also has beneficial influences on face recognition and lip-reading. As CI users were observed to show differences in multisensory integration, the question arises whether cross-modal reorganization is related to audio-visual integration skills. The current electroencephalography study investigated cortical reorganization in experienced post-lingually deafened CI users (n = 18), untreated mild to moderately hearing impaired individuals (n = 18) and normal hearing controls (n = 17). Cross-modal activation of the auditory cortex by means of EEG source localization in response to human faces and audio-visual integration, quantified with the McGurk illusion, were measured. CI users revealed stronger cross-modal activations compared to age-matched normal hearing individuals. Furthermore, CI users showed a relationship between cross-modal activation and audio-visual integration strength. This may further support a beneficial relationship between cross-modal activation and daily-life communication skills that may not be fully captured by laboratory-based speech perception tests. Interestingly, hearing impaired individuals showed behavioral and neurophysiological results that were numerically between the other two groups, and they showed a moderate relationship between cross-modal activation and the degree of hearing loss. This further supports the notion that auditory deprivation evokes a reorganization of the auditory system even at early stages of hearing loss.
Collapse
Affiliation(s)
- Maren Stropahl
- Neuropsychology Lab, Department of Psychology, European Medical School, Carl von Ossietzky University Oldenburg, Germany
| | - Stefan Debener
- Neuropsychology Lab, Department of Psychology, European Medical School, Carl von Ossietzky University Oldenburg, Germany
- Cluster of Excellence Hearing4all Oldenburg, Germany
| |
Collapse
|
43
|
Mastrantuono E, Saldaña D, Rodríguez-Ortiz IR. An Eye Tracking Study on the Perception and Comprehension of Unimodal and Bimodal Linguistic Inputs by Deaf Adolescents. Front Psychol 2017; 8:1044. [PMID: 28680416 PMCID: PMC5478736 DOI: 10.3389/fpsyg.2017.01044] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2017] [Accepted: 06/07/2017] [Indexed: 11/13/2022] Open
Abstract
An eye tracking experiment explored the gaze behavior of deaf individuals when perceiving language in spoken and sign language only, and in sign-supported speech (SSS). Participants were deaf (n = 25) and hearing (n = 25) Spanish adolescents. Deaf students were prelingually profoundly deaf individuals with cochlear implants (CIs) used by age 5 or earlier, or prelingually profoundly deaf native signers with deaf parents. The effectiveness of SSS has rarely been tested within the same group of children for discourse-level comprehension. Here, video-recorded texts, including spatial descriptions, were alternately transmitted in spoken language, sign language and SSS. The capacity of these communicative systems to equalize comprehension in deaf participants with that of spoken language in hearing participants was tested. Within-group analyses of deaf participants tested if the bimodal linguistic input of SSS favored discourse comprehension compared to unimodal languages. Deaf participants with CIs achieved equal comprehension to hearing controls in all communicative systems while deaf native signers with no CIs achieved equal comprehension to hearing participants if tested in their native sign language. Comprehension of SSS was not increased compared to spoken language, even when spatial information was communicated. Eye movements of deaf and hearing participants were tracked and data of dwell times spent looking at the face or body area of the sign model were analyzed. Within-group analyses focused on differences between native and non-native signers. Dwell times of hearing participants were equally distributed across upper and lower areas of the face while deaf participants mainly looked at the mouth area; this could enable information to be obtained from mouthings in sign language and from lip-reading in SSS and spoken language. Few fixations were directed toward the signs, although these were more frequent when spatial language was transmitted. Both native and non-native signers looked mainly at the face when perceiving sign language, although non-native signers looked significantly more at the body than native signers. This distribution of gaze fixations suggested that deaf individuals – particularly native signers – mainly perceived signs through peripheral vision.
Collapse
Affiliation(s)
- Eliana Mastrantuono
- Departamento de Psicología Evolutiva y de la Educación, Universidad de Sevilla, Seville, Spain
| | - David Saldaña
- Departamento de Psicología Evolutiva y de la Educación, Universidad de Sevilla, Seville, Spain
| | | |
Collapse
|
44
|
Irwin J, DiBlasi L. Audiovisual speech perception: A new approach and implications for clinical populations. LANGUAGE AND LINGUISTICS COMPASS 2017; 11:77-91. [PMID: 29520300 PMCID: PMC5839512 DOI: 10.1111/lnc3.12237] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/18/2015] [Accepted: 01/25/2017] [Indexed: 06/01/2023]
Abstract
This selected overview of audiovisual (AV) speech perception examines the influence of visible articulatory information on what is heard. Thought to be a cross-cultural phenomenon that emerges early in typical language development, variables that influence AV speech perception include properties of the visual and the auditory signal, attentional demands, and individual differences. A brief review of the existing neurobiological evidence on how visual information influences heard speech indicates potential loci, timing, and facilitatory effects of AV over auditory only speech. The current literature on AV speech in certain clinical populations (individuals with an autism spectrum disorder, developmental language disorder, or hearing loss) reveals differences in processing that may inform interventions. Finally, a new method of assessing AV speech that does not require obvious cross-category mismatch or auditory noise was presented as a novel approach for investigators.
Collapse
Affiliation(s)
- Julia Irwin
- LEARN Center, Haskins Laboratories Inc., USA
| | | |
Collapse
|
45
|
Heikkilä J, Lonka E, Ahola S, Meronen A, Tiippana K. Lipreading Ability and Its Cognitive Correlates in Typically Developing Children and Children With Specific Language Impairment. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:485-493. [PMID: 28241193 DOI: 10.1044/2016_jslhr-s-15-0071] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/17/2015] [Accepted: 04/28/2016] [Indexed: 06/06/2023]
Abstract
PURPOSE Lipreading and its cognitive correlates were studied in school-age children with typical language development and delayed language development due to specific language impairment (SLI). METHOD Forty-two children with typical language development and 20 children with SLI were tested by using a word-level lipreading test and an extensive battery of standardized cognitive and linguistic tests. RESULTS Children with SLI were poorer lipreaders than their typically developing peers. Good phonological skills were associated with skilled lipreading in both typically developing children and in children with SLI. Lipreading was also found to correlate with several cognitive skills, for example, short-term memory capacity and verbal motor skills. CONCLUSIONS Speech processing deficits in SLI extend also to the perception of visual speech. Lipreading performance was associated with phonological skills. Poor lipreading in children with SLI may be, thus, related to problems in phonological processing.
Collapse
Affiliation(s)
- Jenni Heikkilä
- Division of Cognitive Psychology and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Finland
| | - Eila Lonka
- Division of Logopedics, Institute of Behavioural Sciences, University of Helsinki, Finland
| | - Sanna Ahola
- Division of Cognitive Psychology and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Finland
| | | | - Kaisa Tiippana
- Division of Cognitive Psychology and Neuropsychology, Institute of Behavioural Sciences, University of Helsinki, Finland
| |
Collapse
|
46
|
Pimperton H, Ralph-Lewis A, MacSweeney M. Speechreading in Deaf Adults with Cochlear Implants: Evidence for Perceptual Compensation. Front Psychol 2017; 8:106. [PMID: 28223951 PMCID: PMC5294775 DOI: 10.3389/fpsyg.2017.00106] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2016] [Accepted: 01/16/2017] [Indexed: 11/13/2022] Open
Abstract
Previous research has provided evidence for a speechreading advantage in congenitally deaf adults compared to hearing adults. A 'perceptual compensation' account of this finding proposes that prolonged early onset deafness leads to a greater reliance on visual, as opposed to auditory, information when perceiving speech, which in turn results in superior visual speech perception skills in deaf adults. In the current study we tested whether previous demonstrations of a speechreading advantage for profoundly congenitally deaf adults with hearing aids, or no amplification, were also apparent in adults with the same deafness profile but who have experienced greater access to the auditory elements of speech via a cochlear implant (CI). We also tested the prediction that, in line with the perceptual compensation account, receiving a CI at a later age is associated with superior speechreading skills due to later implanted individuals having experienced greater dependence on visual speech information. We designed a speechreading task in which participants viewed silent videos of 123 single words spoken by a model and were required to indicate which word they thought had been said via a free text response. We compared congenitally deaf adults who had received CIs in childhood or adolescence (N = 15) with a comparison group of hearing adults (N = 15) matched on age and education level. The adults with CI showed significantly better scores on the speechreading task than the hearing comparison group. Furthermore, within the group of adults with CI, there was a significant positive correlation between age at implantation and speechreading performance; earlier implantation was associated with lower speechreading scores. These results are both consistent with the hypothesis of perceptual compensation in the domain of speech perception, indicating that more prolonged dependence on visual speech information in speech perception may lead to improvements in the perception of visual speech. In addition, our study provides metrics of the 'speechreadability' of 123 words produced in British English: one derived from hearing adults (N = 61) and one from deaf adults with CI (N = 15). Evidence for the validity of these 'speechreadability' metrics comes from correlations with visual lexical competition data.
Collapse
Affiliation(s)
- Hannah Pimperton
- Institute of Cognitive Neuroscience, University College London, London, UK
| | - Amelia Ralph-Lewis
- Institute of Cognitive Neuroscience, University College London, London, UK
| | - Mairéad MacSweeney
- Institute of Cognitive Neuroscience, University College London, London, UK; Deafness, Cognition and Language Centre, University College London, London, UK
| |
Collapse
|
47
|
Francisco AA, Groen MA, Jesse A, McQueen JM. Beyond the usual cognitive suspects: The importance of speechreading and audiovisual temporal sensitivity in reading ability. Learn Individ Differ 2017. [DOI: 10.1016/j.lindif.2017.01.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
48
|
O'Sullivan AE, Crosse MJ, Di Liberto GM, Lalor EC. Visual Cortical Entrainment to Motion and Categorical Speech Features during Silent Lipreading. Front Hum Neurosci 2017; 10:679. [PMID: 28123363 PMCID: PMC5225113 DOI: 10.3389/fnhum.2016.00679] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2016] [Accepted: 12/20/2016] [Indexed: 11/13/2022] Open
Abstract
Speech is a multisensory percept, comprising auditory and visual components. While the content and processing pathways of audio speech have been well characterized, the visual component is less well understood. In this work, we expand current methodologies using system identification to introduce a framework that facilitates the study of visual speech in its natural, continuous form. Specifically, we use models based on the unheard acoustic envelope (E), the motion signal (M), and categorical visual speech features (V) to predict EEG activity during silent lipreading. Our results show that each of these models performs similarly at predicting EEG in visual regions and that respective combinations of the individual models (EV, MV, EM and EMV) provide an improved prediction of the neural activity over their constituent models. In comparing these different combinations, we find that the model incorporating all three types of features (EMV) outperforms the individual models, as well as both the EV and MV models, while it performs similarly to the EM model. Importantly, EM does not outperform EV and MV, which, considering the higher dimensionality of the V model, suggests that more data are needed to clarify this finding. Nevertheless, the performance of EMV, and comparisons of the subject performances for the three individual models, provide further evidence to suggest that visual regions are involved in both low-level processing of stimulus dynamics and categorical speech perception. This framework may prove useful for investigating modality-specific processing of visual speech under naturalistic conditions.
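The framework described above amounts to fitting stimulus-feature encoding models to EEG and comparing their cross-validated prediction accuracy across feature sets. The following is a minimal sketch of that kind of analysis, assuming pre-aligned feature and EEG matrices; the lag range, regularization value, toy data, and function names are illustrative assumptions and do not reproduce the authors' implementation.

```python
# Minimal sketch (not the authors' code): compare stimulus-feature encoding models
# of EEG by cross-validated prediction accuracy. Feature matrices E, M, V and the
# EEG array are assumed to be pre-aligned at a common sampling rate.
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def lag_features(X, n_lags):
    """Stack time-lagged copies of the feature matrix (time x features)."""
    T, F = X.shape
    Xp = np.vstack([np.zeros((n_lags - 1, F)), X])   # zero-pad so output stays length T
    W = sliding_window_view(Xp, n_lags, axis=0)       # shape (T, F, n_lags)
    return W.reshape(T, F * n_lags)

def cv_prediction_r(X, Y, n_lags=30, alpha=1e3, n_splits=5):
    """Mean cross-validated Pearson r between predicted and recorded EEG channels."""
    Xl = lag_features(X, n_lags)
    fold_rs = []
    for train, test in KFold(n_splits=n_splits).split(Xl):
        model = Ridge(alpha=alpha).fit(Xl[train], Y[train])
        pred = model.predict(Xl[test])
        r = [np.corrcoef(pred[:, c], Y[test][:, c])[0, 1] for c in range(Y.shape[1])]
        fold_rs.append(np.mean(r))
    return float(np.mean(fold_rs))

# Toy data standing in for real recordings: 10,000 samples, 32 EEG channels.
rng = np.random.default_rng(0)
T, n_chan = 10_000, 32
E = rng.standard_normal((T, 1))    # unheard acoustic envelope
M = rng.standard_normal((T, 1))    # frame-to-frame motion signal
V = rng.standard_normal((T, 12))   # categorical viseme features (one column per class, assumed)
eeg = rng.standard_normal((T, n_chan))

models = {"E": E, "M": M, "V": V,
          "EM": np.hstack([E, M]), "EV": np.hstack([E, V]),
          "MV": np.hstack([M, V]), "EMV": np.hstack([E, M, V])}
for name, X in models.items():
    print(name, round(cv_prediction_r(X, eeg), 4))
```

With real stimuli and EEG, comparing these cross-validated scores across the single-feature and combined models is what supports claims such as EMV outperforming its constituent models.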
Collapse
Affiliation(s)
- Aisling E O'Sullivan
- School of Engineering, Trinity College Dublin, Dublin, Ireland; Trinity Centre for Bioengineering, Trinity College Dublin, Dublin, Ireland
| | - Michael J Crosse
- Department of Pediatrics and Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY, USA
| | - Giovanni M Di Liberto
- School of Engineering, Trinity College Dublin, Dublin, Ireland; Trinity Centre for Bioengineering, Trinity College Dublin, Dublin, Ireland
| | - Edmund C Lalor
- School of Engineering, Trinity College Dublin, Dublin, Ireland; Trinity Centre for Bioengineering, Trinity College Dublin, Dublin, Ireland; Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland; Department of Biomedical Engineering and Department of Neuroscience, University of Rochester, Rochester, NY, USA
| |
Collapse
|
49
|
Anderson CA, Lazard DS, Hartley DEH. Plasticity in bilateral superior temporal cortex: Effects of deafness and cochlear implantation on auditory and visual speech processing. Hear Res 2017; 343:138-149. [PMID: 27473501 DOI: 10.1016/j.heares.2016.07.013] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/29/2016] [Revised: 07/20/2016] [Accepted: 07/25/2016] [Indexed: 12/01/2022]
Abstract
While many individuals can benefit substantially from cochlear implantation, the ability to perceive and understand auditory speech with a cochlear implant (CI) remains highly variable amongst adult recipients. Importantly, auditory performance with a CI cannot be reliably predicted based solely on routinely obtained information regarding clinical characteristics of the CI candidate. This review argues that central factors, notably cortical function and plasticity, should also be considered as important contributors to the observed individual variability in CI outcome. Superior temporal cortex (STC), including auditory association areas, plays a crucial role in the processing of auditory and visual speech information. The current review considers evidence of cortical plasticity within bilateral STC, and how these effects may explain variability in CI outcome. Furthermore, evidence of audio-visual interactions in temporal and occipital cortices is examined, and its relation to CI outcome is discussed. To date, longitudinal examination of changes in cortical function and plasticity over the period of rehabilitation with a CI has been restricted by methodological challenges. The application of functional near-infrared spectroscopy (fNIRS) in studying cortical function in CI users is becoming increasingly recognised as a potential solution to these problems. Here we suggest that fNIRS offers a powerful neuroimaging tool to elucidate the relationship between audio-visual interactions, cortical plasticity during deafness and following cochlear implantation, and individual variability in auditory performance with a CI.
Collapse
Affiliation(s)
- Carly A Anderson
- National Institute for Health Research (NIHR) Nottingham Hearing Biomedical Research Unit, Ropewalk House, 113 The Ropewalk, Nottingham, NG1 5DU, United Kingdom; Otology and Hearing Group, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, United Kingdom.
| | - Diane S Lazard
- Institut Arthur Vernes, ENT Surgery, Paris, 75006, France; Nottingham University Hospitals NHS Trust, Derby Road, Nottingham, NG7 2UH, United Kingdom.
| | - Douglas E H Hartley
- National Institute for Health Research (NIHR) Nottingham Hearing Biomedical Research Unit, Ropewalk House, 113 The Ropewalk, Nottingham, NG1 5DU, United Kingdom; Otology and Hearing Group, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, United Kingdom; Nottingham University Hospitals NHS Trust, Derby Road, Nottingham, NG7 2UH, United Kingdom; Medical Research Council (MRC) Institute of Hearing Research, The University of Nottingham, University Park, Nottingham, NG7 2RD, United Kingdom.
| |
Collapse
|
50
|
Wilson AH, Alsius A, Paré M, Munhall KG. Spatial Frequency Requirements and Gaze Strategy in Visual-Only and Audiovisual Speech Perception. J Speech Lang Hear Res 2016; 59:601-615. [PMID: 27537379 PMCID: PMC5280058 DOI: 10.1044/2016_jslhr-s-15-0092] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/04/2015] [Revised: 09/16/2015] [Accepted: 10/07/2015] [Indexed: 06/06/2023]
Abstract
PURPOSE The aim of this article is to examine the effects of visual image degradation on performance and gaze behavior in audiovisual and visual-only speech perception tasks. METHOD We presented vowel-consonant-vowel utterances visually filtered at a range of spatial frequencies in visual-only, audiovisual congruent, and audiovisual incongruent conditions (Experiment 1; N = 66). In Experiment 2 (N = 20), participants performed a visual-only speech perception task, and in Experiment 3 (N = 20) an audiovisual task, while having their gaze behavior monitored using eye-tracking equipment. RESULTS In the visual-only condition, increasing image resolution led to monotonic increases in performance, and proficient speechreaders were more affected by the removal of high spatial frequency information than were poor speechreaders. The McGurk effect also increased with increasing visual resolution, although it was less affected by the removal of high-frequency information. Observers tended to fixate on the mouth more in visual-only perception, but gaze toward the mouth did not correlate with accuracy of silent speechreading or with the magnitude of the McGurk effect. CONCLUSIONS The results suggest that individual differences in silent speechreading and the McGurk effect are not related. This conclusion is supported by differential influences of high-resolution visual information on the 2 tasks and by differences in the pattern of gaze.
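The degradation manipulation described above depends on removing high spatial frequencies from the talker video. The sketch below shows one way such filtering could be approximated, assuming Gaussian low-pass filtering of individual greyscale frames; the cutoff values, the cutoff-to-sigma conversion, and the toy frame are illustrative assumptions rather than the study's stimulus pipeline.

```python
# Minimal sketch (illustrative, not the study's stimulus pipeline): remove high
# spatial frequencies from a greyscale video frame by Gaussian low-pass filtering.
# The mapping from a cutoff in cycles/image to a Gaussian sigma is an approximation.
import numpy as np
from scipy.ndimage import gaussian_filter

def lowpass_frame(frame, cutoff_cycles_per_image):
    """Blur the frame so that detail above the cutoff spatial frequency is attenuated."""
    height = frame.shape[0]
    # Rough rule of thumb: sigma (pixels) ~ image size / (2 * pi * cutoff frequency).
    sigma = height / (2.0 * np.pi * cutoff_cycles_per_image)
    return gaussian_filter(frame.astype(float), sigma=sigma)

# Toy 480 x 640 frame standing in for a real talker video frame.
rng = np.random.default_rng(2)
frame = rng.random((480, 640))

# Produce a range of degradation levels, from severe blur to nearly intact detail.
for cutoff in (2, 4, 8, 16, 32):
    filtered = lowpass_frame(frame, cutoff)
    print(f"cutoff = {cutoff:2d} cycles/image, output std = {filtered.std():.3f}")
```

Applying the same filter frame by frame, and varying the cutoff across conditions, yields the kind of graded spatial-frequency manipulation the abstract describes.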
Collapse
Affiliation(s)
- Amanda H. Wilson
- Psychology Department, Queen's University, Kingston, Ontario, Canada
- Centre for Neuroscience Studies, Queen's University, Kingston, Ontario, Canada
| | - Agnès Alsius
- Psychology Department, Queen's University, Kingston, Ontario, Canada
| | - Martin Paré
- Centre for Neuroscience Studies, Queen's University, Kingston, Ontario, Canada
| | - Kevin G. Munhall
- Psychology Department, Queen's University, Kingston, Ontario, Canada
- Centre for Neuroscience Studies, Queen's University, Kingston, Ontario, Canada
| |
Collapse
|