1. Lew EC, Sares A, Gilbert AC, Zhang Y, Lehmann A, Deroche M. Differences Between French and English in the Use of Suprasegmental Cues for the Short-Term Recall of Word Lists. J Speech Lang Hear Res 2024:1-14. [PMID: 39320319] [DOI: 10.1044/2024_jslhr-23-00655]
Abstract
PURPOSE Greater recognition of the impact of hearing loss on cognitive functions has led speech/hearing clinics to focus more on auditory memory outcomes. Typically evaluated by scoring participants' recall on a list of unrelated words after they have heard the list read out loud, this method implies pitch and timing variations across words. Here, we questioned whether these variations could impact performance differentially in one language or another. METHOD In a series of online studies evaluating auditory short-term memory in normally hearing adults, we examined how pitch patterns (Experiment 1), timing patterns (Experiment 2), and interactions between the two (Experiment 3) affected free recall of words, cued recall of forgotten words, and mental demand. Note that visual memory was never directly tested; written words were only used after auditory encoding in the cued recall part. Studies were administered in both French and English, always conducted with native listeners. RESULT Confirming prior work, grouping mechanisms facilitated free recall, but not cued recall (the latter being only affected by longer presentation time) or ratings of mental demand. Critically, grouping by pitch provided more benefit for French than for English listeners, while grouping by time was equally beneficial in both languages. CONCLUSION Pitch is more useful to French- than to English-speaking listeners for encoding spoken words in short-term memory, perhaps due to the syllable-based versus stress-based rhythms inherent to each language. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.27048328.
Affiliation(s)
- Emilia C Lew
- Laboratory for Hearing and Cognition, Department of Psychology, Concordia University, Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music, Montréal, Québec, Canada
- Anastasia Sares
- Department of Psychology, Colorado State University, Fort Collins
- Annie C Gilbert
- Centre for Research on Brain, Language and Music, Montréal, Québec, Canada
- School of Communication Sciences and Disorders, McGill University, Montréal, Québec, Canada
- Alexandre Lehmann
- Centre for Research on Brain, Language and Music, Montréal, Québec, Canada
- Department of Otolaryngology - Head and Neck Surgery, Faculty of Medicine and Health Sciences, McGill University, Montréal, Québec, Canada
- Mickael Deroche
- Laboratory for Hearing and Cognition, Department of Psychology, Concordia University, Montréal, Quebec, Canada
- Centre for Research on Brain, Language and Music, Montréal, Québec, Canada
2. Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. Prelingually Deaf Children With Cochlear Implants Show Better Perception of Voice Cues and Speech in Competing Speech Than Postlingually Deaf Adults With Cochlear Implants. Ear Hear 2024; 45:952-968. [PMID: 38616318] [PMCID: PMC11175806] [DOI: 10.1097/aud.0000000000001489]
Abstract
OBJECTIVES Postlingually deaf adults with cochlear implants (CIs) have difficulties with perceiving differences in speakers' voice characteristics and benefit little from voice differences for the perception of speech in competing speech. However, not much is known yet about the perception and use of voice characteristics in prelingually deaf implanted children with CIs. Unlike CI adults, most CI children became deaf during the acquisition of language. Extensive neuroplastic changes during childhood could make CI children better at using the available acoustic cues than CI adults, or the lack of exposure to a normal acoustic speech signal could make it more difficult for them to learn which acoustic cues they should attend to. This study aimed to examine to what degree CI children can perceive voice cues and benefit from voice differences for perceiving speech in competing speech, comparing their abilities to those of normal-hearing (NH) children and CI adults. DESIGN CI children's voice cue discrimination (experiment 1), voice gender categorization (experiment 2), and benefit from target-masker voice differences for perceiving speech in competing speech (experiment 3) were examined in three experiments. The main focus was on the perception of mean fundamental frequency (F0) and vocal-tract length (VTL), the primary acoustic cues related to speakers' anatomy and perceived voice characteristics, such as voice gender. RESULTS CI children's F0 and VTL discrimination thresholds indicated lower sensitivity to differences compared with their NH-age-equivalent peers, but their mean discrimination thresholds of 5.92 semitones (st) for F0 and 4.10 st for VTL indicated higher sensitivity than postlingually deaf CI adults with mean thresholds of 9.19 st for F0 and 7.19 st for VTL. Furthermore, CI children's perceptual weighting of F0 and VTL cues for voice gender categorization closely resembled that of their NH-age-equivalent peers, in contrast with CI adults. Finally, CI children had more difficulties in perceiving speech in competing speech than their NH-age-equivalent peers, but they performed better than CI adults. Unlike CI adults, CI children showed a benefit from target-masker voice differences in F0 and VTL, similar to NH children. CONCLUSION Although CI children's F0 and VTL voice discrimination scores were overall lower than those of NH children, their weighting of F0 and VTL cues for voice gender categorization and their benefit from target-masker differences in F0 and VTL resembled that of NH children. Together, these results suggest that prelingually deaf implanted CI children can effectively utilize spectrotemporally degraded F0 and VTL cues for voice and speech perception, generally outperforming postlingually deaf CI adults in comparable tasks. These findings underscore the presence of F0 and VTL cues in the CI signal to a certain degree and suggest other factors contributing to the perception challenges faced by CI adults.
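The F0 and VTL thresholds above are reported in semitones (st), i.e., on a logarithmic frequency scale with 12 steps per octave. A minimal sketch of that conversion follows; the 200 Hz reference F0 is an illustrative assumption, not a value taken from the study.

```python
import math

def hz_ratio_to_semitones(f_ref_hz: float, f_test_hz: float) -> float:
    """Express the ratio between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_test_hz / f_ref_hz)

def shift_by_semitones(f_ref_hz: float, st: float) -> float:
    """Frequency lying `st` semitones above (negative: below) the reference."""
    return f_ref_hz * 2.0 ** (st / 12.0)

# Hypothetical 200 Hz reference F0, used only to make the reported mean
# thresholds concrete (5.92 st for CI children, 9.19 st for CI adults).
for group, jnd_st in [("CI children", 5.92), ("CI adults", 9.19)]:
    print(f"{group}: {jnd_st} st above 200 Hz is {shift_by_semitones(200.0, jnd_st):.1f} Hz")
```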
Affiliation(s)
- Leanne Nagels
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Inserm UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Deborah Vickers
- Cambridge Hearing Group, Sound Lab, Clinical Neurosciences Department, University of Cambridge, Cambridge, United Kingdom
- Petra Hendriks
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
3. Chatterjee M, Gajre S, Kulkarni AM, Barrett KC, Limb CJ. Predictors of Emotional Prosody Identification by School-Age Children With Cochlear Implants and Their Peers With Normal Hearing. Ear Hear 2024; 45:411-424. [PMID: 37811966] [PMCID: PMC10922148] [DOI: 10.1097/aud.0000000000001436]
Abstract
OBJECTIVES Children with cochlear implants (CIs) vary widely in their ability to identify emotions in speech. The causes of this variability are unknown, but this knowledge will be crucial if we are to design improvements in technological or rehabilitative interventions that are effective for individual patients. The objective of this study was to investigate how well factors such as age at implantation, duration of device experience (hearing age), nonverbal cognition, vocabulary, and socioeconomic status predict prosody-based emotion identification in children with CIs, and how the key predictors in this population compare to children with normal hearing who are listening to either normal emotional speech or to degraded speech. DESIGN We measured vocal emotion identification in 47 school-age CI recipients aged 7 to 19 years in a single-interval, 5-alternative forced-choice task. None of the participants had usable residual hearing based on parent/caregiver report. Stimuli consisted of a set of semantically emotion-neutral sentences that were recorded by 4 talkers in child-directed and adult-directed prosody corresponding to five emotions: neutral, angry, happy, sad, and scared. Twenty-one children with normal hearing were also tested in the same tasks; they listened to both original speech and to versions that had been noise-vocoded to simulate CI information processing. RESULTS Group comparison confirmed the expected deficit in CI participants' emotion identification relative to participants with normal hearing. Within the CI group, increasing hearing age (correlated with developmental age) and nonverbal cognition outcomes predicted emotion recognition scores. Stimulus-related factors such as talker and emotional category also influenced performance and were involved in interactions with hearing age and cognition. Age at implantation was not predictive of emotion identification. Unlike the CI participants, neither cognitive status nor vocabulary predicted outcomes in participants with normal hearing, whether listening to original speech or CI-simulated speech. Age-related improvements in outcomes were similar in the two groups. Participants with normal hearing listening to original speech showed the greatest differences in their scores for different talkers and emotions. Participants with normal hearing listening to CI-simulated speech showed significant deficits compared with their performance with original speech materials, and their scores also showed the least effect of talker- and emotion-based variability. CI participants showed more variation in their scores with different talkers and emotions than participants with normal hearing listening to CI-simulated speech, but less so than participants with normal hearing listening to original speech. CONCLUSIONS Taken together, these results confirm previous findings that pediatric CI recipients have deficits in emotion identification based on prosodic cues, but they improve with age and experience at a rate that is similar to peers with normal hearing. Unlike participants with normal hearing, nonverbal cognition played a significant role in CI listeners' emotion identification. Specifically, nonverbal cognition predicted the extent to which individual CI users could benefit from some talkers being more expressive of emotions than others, and this effect was greater in CI users who had less experience with their device (or were younger) than CI users who had more experience with their device (or were older). 
Thus, in young prelingually deaf children with CIs performing an emotional prosody identification task, cognitive resources may be harnessed to a greater degree than in older prelingually deaf children with CIs or than children with normal hearing.
Affiliation(s)
- Monita Chatterjee
- Auditory Prostheses & Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, 555 N 30 St., Omaha, NE 68131, USA
- Shivani Gajre
- Auditory Prostheses & Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, 555 N 30 St., Omaha, NE 68131, USA
- Aditya M Kulkarni
- Auditory Prostheses & Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, 555 N 30 St., Omaha, NE 68131, USA
- Karen C Barrett
- Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, San Francisco, California, USA
- Charles J Limb
- Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, San Francisco, California, USA
4. Babaoğlu G, Rachman L, Ertürk P, Özkişi Yazgan B, Sennaroğlu G, Gaudrain E, Başkent D. Perception of voice cues in school-age children with hearing aids. J Acoust Soc Am 2024; 155:722-741. [PMID: 38284822] [DOI: 10.1121/10.0024356]
Abstract
The just-noticeable differences (JNDs) of the voice cues of voice pitch (F0) and vocal-tract length (VTL) were measured in school-aged children with bilateral hearing aids and children and adults with normal hearing. The JNDs were larger for hearing-aided than normal-hearing children up to the age of 12 for F0 and into adulthood for all ages for VTL. Age was a significant factor for both groups for F0 JNDs, but only for the hearing-aided group for VTL JNDs. Age of maturation was later for F0 than VTL. Individual JNDs of the two groups largely overlapped for F0, but little for VTL. Hearing thresholds (unaided or aided, 500-4000 Hz, overlapping with mid-range speech frequencies) did not correlate with the JNDs. However, extended low-frequency hearing thresholds (unaided, 125-250 Hz, overlapping with voice F0 ranges) correlated with the F0 JNDs. Hence, age and hearing status differentially interact with F0 and VTL perception, and VTL perception seems challenging for hearing-aided children. On the other hand, even children with profound hearing loss could do the task, indicating a hearing aid benefit for voice perception. Given the significant age effect and that for F0 the hearing-aided children seem to be catching up with age-typical development, voice cue perception may continue developing in hearing-aided children.
Affiliation(s)
- Gizem Babaoğlu
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
- Laura Rachman
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
- Pınar Ertürk
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Başak Özkişi Yazgan
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Gonca Sennaroğlu
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Etienne Gaudrain
- Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, Université Lyon 1, Lyon, France
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
5. Wheeler HJ, Hatch DR, Moody-Antonio SA, Nie Y. Music and Speech Perception in Prelingually Deafened Young Listeners With Cochlear Implants: A Preliminary Study Using Sung Speech. J Speech Lang Hear Res 2022; 65:3951-3965. [PMID: 36179251] [DOI: 10.1044/2022_jslhr-21-00271]
Abstract
PURPOSE In the context of music and speech perception, this study aimed to assess the effect of variation in one of two auditory attributes-pitch contour and timbre-on the perception of the other in prelingually deafened young cochlear implant (CI) users, and the relationship between pitch contour perception and two cognitive functions of interest. METHOD Nine prelingually deafened CI users, aged 8.75-22.17 years, completed a melodic contour identification (MCI) task using stimuli of piano notes or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note), a speech perception task identifying matrix-styled sentences naturally intonated or sung with a fixed pitch (same pitch for each word) or a mixed pitch (different pitches for each word), a forward digit span test indexing auditory short-term memory (STM), and the matrices section of the Kaufman Brief Intelligence Test-Second Edition indexing nonverbal IQ. RESULTS MCI was significantly poorer for the mixed timbre condition. Speech perception was significantly poorer for the fixed and mixed pitch conditions than for the naturally intonated condition. Auditory STM positively correlated with MCI at 2- and 3-semitone note spacings. Relative to their normal-hearing peers from a related study using the same stimuli and tasks, the CI participants showed comparable MCI at 2- or 3-semitone note spacing, and a comparable level of significant decrement in speech perception across three pitch contour conditions. CONCLUSION Findings suggest that prelingually deafened CI users show similar trends of normal-hearing peers for the effect of variation in pitch contour or timbre on the perception of the other, and that cognitive functions may underlie these outcomes to some extent, at least for the perception of pitch contour. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21217937.
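The melodic contour identification (MCI) task used above presents short note sequences whose pitch pattern (rising, falling, flat, and combinations) must be identified, with adjacent notes separated by a fixed number of semitones. The sketch below generates note frequencies for such contours; the nine contour labels, five-note length, and 220 Hz root note are illustrative assumptions, not the exact stimulus set of this study.

```python
import numpy as np

# Step direction from one note to the next (+1 up, -1 down, 0 repeat) for
# nine commonly used five-note contours (assumed labels, for illustration only).
CONTOURS = {
    "rising":         [+1, +1, +1, +1],
    "falling":        [-1, -1, -1, -1],
    "flat":           [0, 0, 0, 0],
    "rising-flat":    [+1, +1, 0, 0],
    "falling-flat":   [-1, -1, 0, 0],
    "flat-rising":    [0, 0, +1, +1],
    "flat-falling":   [0, 0, -1, -1],
    "rising-falling": [+1, +1, -1, -1],
    "falling-rising": [-1, -1, +1, +1],
}

def contour_frequencies(name: str, root_hz: float = 220.0, spacing_st: int = 2) -> np.ndarray:
    """Note frequencies (Hz) for one contour, with `spacing_st` semitones between adjacent notes."""
    steps_st = np.cumsum([0] + [d * spacing_st for d in CONTOURS[name]])
    return root_hz * 2.0 ** (steps_st / 12.0)

print(contour_frequencies("rising-falling", spacing_st=3))  # about [220.0, 261.6, 311.1, 261.6, 220.0]
```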
Affiliation(s)
- Harley J Wheeler
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA
- Debora R Hatch
- Department of Otolaryngology, Eastern Virginia Medical School, Norfolk
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA
6. Age-Related Changes in Voice Emotion Recognition by Postlingually Deafened Listeners With Cochlear Implants. Ear Hear 2022; 43:323-334. [PMID: 34406157] [PMCID: PMC8847542] [DOI: 10.1097/aud.0000000000001095]
Abstract
OBJECTIVES Identification of emotional prosody in speech declines with age in normally hearing (NH) adults. Cochlear implant (CI) users have deficits in the perception of prosody, but the effects of age on vocal emotion recognition by adult postlingually deaf CI users are not known. The objective of the present study was to examine age-related changes in CI users' and NH listeners' emotion recognition. DESIGN Participants included 18 CI users (29.6 to 74.5 years) and 43 NH adults (25.8 to 74.8 years). Participants listened to emotion-neutral sentences spoken by a male and female talker in five emotions (happy, sad, scared, angry, neutral). NH adults heard them in four conditions: unprocessed (full spectrum) speech, 16-channel, 8-channel, and 4-channel noise-band vocoded speech. The adult CI users only listened to unprocessed (full spectrum) speech. Sensitivity (d') to emotions and Reaction Times were obtained using a single-interval, five-alternative, forced-choice paradigm. RESULTS For NH participants, results indicated age-related declines in Accuracy and d', and age-related increases in Reaction Time in all conditions. Results indicated an overall deficit, as well as age-related declines in overall d' for CI users, but Reaction Times were elevated compared with NH listeners and did not show age-related changes. Analysis of Accuracy scores (hit rates) were generally consistent with d' data. CONCLUSIONS Both CI users and NH listeners showed age-related deficits in emotion identification. The CI users' overall deficit in emotion perception, and their slower response times, suggest impaired social communication which may in turn impact overall well-being, particularly so for older CI users, as lower vocal emotion recognition scores have been associated with poorer subjective quality of life in CI patients.
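Sensitivity (d') in single-interval, multiple-alternative emotion tasks like the one above is often summarized per emotion by treating that emotion as the "signal" category: hits are trials of that emotion labeled correctly, and false alarms are other-emotion trials labeled as that emotion. The sketch below shows that common formulation; the correction for extreme rates and the trial counts in the example are assumptions, not details taken from the study.

```python
import numpy as np
from scipy.stats import norm

def dprime(hits: int, signal_trials: int, false_alarms: int, noise_trials: int) -> float:
    """d' = z(hit rate) - z(false-alarm rate), with a 1/(2N) correction for rates of 0 or 1."""
    h = np.clip(hits / signal_trials, 1 / (2 * signal_trials), 1 - 1 / (2 * signal_trials))
    fa = np.clip(false_alarms / noise_trials, 1 / (2 * noise_trials), 1 - 1 / (2 * noise_trials))
    return float(norm.ppf(h) - norm.ppf(fa))

# Hypothetical counts for one emotion ("happy") in a five-alternative task:
# 20 "happy" trials with 14 correct labels; 80 other-emotion trials, 6 labeled "happy".
print(round(dprime(14, 20, 6, 80), 2))  # about 1.96
```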
7. Huang W, Wong LLN, Chen F. Just-Noticeable Differences of Fundamental Frequency Change in Mandarin-Speaking Children with Cochlear Implants. Brain Sci 2022; 12:443. [PMID: 35447975] [PMCID: PMC9031813] [DOI: 10.3390/brainsci12040443]
Abstract
Fundamental frequency (F0) provides the primary acoustic cue for lexical tone perception in tonal languages but remains poorly represented in cochlear implant (CI) systems. Currently, there is still a lack of understanding of sensitivity to F0 change in CI users who speak tonal languages. In the present study, just-noticeable differences (JNDs) of F0 contour and F0 level changes in Mandarin-speaking children with CIs were measured and compared with those in their age-matched normal-hearing (NH) peers. Results showed that children with CIs demonstrated significantly larger JND of F0 contour (JND-C) change and F0 level (JND-L) change compared to NH children. Further within-group comparison revealed that the JND-C change was significantly smaller than the JND-L change among children with CIs, whereas the opposite pattern was observed among NH children. No significant correlations were seen between JND-C change/JND-L change and age at implantation /duration of CI use. The contrast between children with CIs and NH children in sensitivity to F0 contour and F0 level change suggests different mechanisms of F0 processing in these two groups as a result of different hearing experiences.
Affiliation(s)
- Wanting Huang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Unit of Human Communication, Development, and Information Sciences, Faculty of Education, The University of Hong Kong, Hong Kong 999077, China
- Lena L. N. Wong
- Unit of Human Communication, Development, and Information Sciences, Faculty of Education, The University of Hong Kong, Hong Kong 999077, China
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China
8. Lin Y, Wu C, Limb CJ, Lu H, Feng IJ, Peng S, Deroche MLD, Chatterjee M. Voice emotion recognition by Mandarin-speaking pediatric cochlear implant users in Taiwan. Laryngoscope Investig Otolaryngol 2022; 7:250-258. [PMID: 35155805] [PMCID: PMC8823186] [DOI: 10.1002/lio2.732]
Abstract
OBJECTIVES To explore the effects of obligatory lexical tone learning on speech emotion recognition, and the cross-cultural differences between the United States and Taiwan in speech emotion understanding by children with cochlear implants. METHODS This cohort study enrolled 60 cochlear-implanted (cCI), Mandarin-speaking, school-aged children who underwent cochlear implantation before 5 years of age and 53 normal-hearing children (cNH) in Taiwan. Emotion recognition and sensitivity to fundamental frequency (F0) changes were examined in these school-aged cNH and cCI (6-17 years old) at a tertiary referral center. RESULTS The mean emotion recognition score of the cNH group was significantly better than that of the cCI group. Female speakers' vocal emotions were more easily recognized than male speakers' emotions. There was a significant effect of age at test on voice emotion recognition performance. The average score of cCI with full-spectrum speech was close to the average score of cNH with eight-channel narrowband vocoder speech. The average voice emotion recognition performance across speakers for cCI could be predicted by their sensitivity to changes in F0. CONCLUSIONS Better pitch discrimination ability comes with better voice emotion recognition for Mandarin-speaking cCI. Beyond F0 cues, cCI are likely to adapt their voice emotion recognition by relying more on secondary cues such as intensity and duration. Although cross-cultural differences exist in the acoustic features of vocal emotion, both Mandarin-speaking cCI and their English-speaking cCI peers showed a positive effect of age at test on emotion recognition, suggesting a learning effect and brain plasticity. Therefore, further device/processor development to improve the presentation of pitch information, as well as greater rehabilitative effort, is needed to improve the transmission and perception of vocal emotion in Mandarin. LEVEL OF EVIDENCE 3.
Affiliation(s)
- Yung‐Song Lin
- Department of Otolaryngology, Chi Mei Medical Center, Tainan, Taiwan
- Department of Otolaryngology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Che‐Ming Wu
- Department of Otorhinolaryngology, New Taipei Municipal TuCheng Hospital (built and operated by Chang Gung Medical Foundation), New Taipei City, Taiwan
- Department of Otorhinolaryngology, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- School of Medicine, Chang Gung University, Taoyuan, Taiwan
- Charles J. Limb
- School of Medicine, University of California San Francisco, San Francisco, California, USA
- Hui‐Ping Lu
- Center of Speech and Hearing, Department of Otolaryngology, Chi Mei Medical Center, Tainan, Taiwan
- I. Jung Feng
- Institute of Precision Medicine, National Sun Yat‐sen University, Kaohsiung, Taiwan
- Shu‐Chen Peng
- Center for Devices and Radiological Health, United States Food and Drug Administration, Silver Spring, Maryland, USA
9. Perception of Child-Directed Versus Adult-Directed Emotional Speech in Pediatric Cochlear Implant Users. Ear Hear 2021; 41:1372-1382. [PMID: 32149924] [DOI: 10.1097/aud.0000000000000862]
Abstract
OBJECTIVES Cochlear implants (CIs) are remarkable in allowing individuals with severe to profound hearing loss to perceive speech. Despite these gains in speech understanding, however, CI users often struggle to perceive elements such as vocal emotion and prosody, as CIs are unable to transmit the spectro-temporal detail needed to decode affective cues. This issue becomes particularly important for children with CIs, but little is known about their emotional development. In a previous study, pediatric CI users showed deficits in voice emotion recognition with child-directed stimuli featuring exaggerated prosody. However, the large intersubject variability and differential developmental trajectory known in this population incited us to question the extent to which exaggerated prosody would facilitate performance in this task. Thus, the authors revisited the question with both adult-directed and child-directed stimuli. DESIGN Vocal emotion recognition was measured using both child-directed (CDS) and adult-directed (ADS) speech conditions. Pediatric CI users, aged 7-19 years old, with no cognitive or visual impairments and who communicated through oral communication with English as the primary language participated in the experiment (n = 27). Stimuli comprised 12 sentences selected from the HINT database. The sentences were spoken by male and female talkers in a CDS or ADS manner, in each of the five target emotions (happy, sad, neutral, scared, and angry). The chosen sentences were semantically emotion-neutral. Percent correct emotion recognition scores were analyzed for each participant in each condition (CDS vs. ADS). Children also completed cognitive tests of nonverbal IQ and receptive vocabulary, while parents completed questionnaires of CI and hearing history. It was predicted that the reduced prosodic variations found in the ADS condition would result in lower vocal emotion recognition scores compared with the CDS condition. Moreover, it was hypothesized that cognitive factors, perceptual sensitivity to complex pitch changes, and elements of each child's hearing history may serve as predictors of performance on vocal emotion recognition. RESULTS Consistent with our hypothesis, pediatric CI users scored higher on CDS compared with ADS speech stimuli, suggesting that speaking with an exaggerated prosody-akin to "motherese"-may be a viable way to convey emotional content. Significant talker effects were also observed in that higher scores were found for the female talker for both conditions. Multiple regression analysis showed that nonverbal IQ was a significant predictor of CDS emotion recognition scores while Years using CI was a significant predictor of ADS scores. Confusion matrix analyses revealed a dependence of results on specific emotions; for the CDS condition's female talker, participants had high sensitivity (d' scores) to happy and low sensitivity to the neutral sentences while for the ADS condition, low sensitivity was found for the scared sentences. CONCLUSIONS In general, participants had higher vocal emotion recognition to the CDS condition which also had more variability in pitch and intensity and thus more exaggerated prosody, in comparison to the ADS condition. Results suggest that pediatric CI users struggle with vocal emotion perception in general, particularly to adult-directed speech. 
The authors believe these results have broad implications for understanding how CI users perceive emotions both from an auditory communication standpoint and a socio-developmental perspective.
10. Weighting of Prosodic and Lexical-Semantic Cues for Emotion Identification in Spectrally Degraded Speech and With Cochlear Implants. Ear Hear 2021; 42:1727-1740. [PMID: 34294630] [PMCID: PMC8545870] [DOI: 10.1097/aud.0000000000001057]
Abstract
OBJECTIVES Normally-hearing (NH) listeners rely more on prosodic cues than on lexical-semantic cues for emotion perception in speech. In everyday spoken communication, the ability to decipher conflicting information between prosodic and lexical-semantic cues to emotion can be important: for example, in identifying sarcasm or irony. Speech degradation in cochlear implants (CIs) can be sufficiently overcome to identify lexical-semantic cues, but the distortion of voice pitch cues makes it particularly challenging to hear prosody with CIs. The purpose of this study was to examine changes in relative reliance on prosodic and lexical-semantic cues in NH adults listening to spectrally degraded speech and adult CI users. We hypothesized that, compared with NH counterparts, CI users would show increased reliance on lexical-semantic cues and reduced reliance on prosodic cues for emotion perception. We predicted that NH listeners would show a similar pattern when listening to CI-simulated versions of emotional speech. DESIGN Sixteen NH adults and 8 postlingually deafened adult CI users participated in the study. Sentences were created to convey five lexical-semantic emotions (angry, happy, neutral, sad, and scared), with five sentences expressing each category of emotion. Each of these 25 sentences was then recorded with the 5 (angry, happy, neutral, sad, and scared) prosodic emotions by 2 adult female talkers. The resulting stimulus set included 125 recordings (25 Sentences × 5 Prosodic Emotions) per talker, of which 25 were congruent (consistent lexical-semantic and prosodic cues to emotion) and the remaining 100 were incongruent (conflicting lexical-semantic and prosodic cues to emotion). The recordings were processed to have 3 levels of spectral degradation: full-spectrum, CI-simulated (noise-vocoded) to have 8 channels and 16 channels of spectral information, respectively. Twenty-five recordings (one sentence per lexical-semantic emotion recorded in all five prosodies) were used for a practice run in the full-spectrum condition. The remaining 100 recordings were used as test stimuli. For each talker and condition of spectral degradation, listeners indicated the emotion associated with each recording in a single-interval, five-alternative forced-choice task. The responses were scored as proportion correct, where "correct" responses corresponded to the lexical-semantic emotion. CI users heard only the full-spectrum condition. RESULTS The results showed a significant interaction between hearing status (NH, CI) and congruency in identifying the lexical-semantic emotion associated with the stimuli. This interaction was as predicted, that is, CI users showed increased reliance on lexical-semantic cues in the incongruent conditions, while NH listeners showed increased reliance on the prosodic cues in the incongruent conditions. As predicted, NH listeners showed increased reliance on lexical-semantic cues to emotion when the stimuli were spectrally degraded. CONCLUSIONS The present study confirmed previous findings of prosodic dominance for emotion perception by NH listeners in the full-spectrum condition. Further, novel findings with CI patients and NH listeners in the CI-simulated conditions showed reduced reliance on prosodic cues and increased reliance on lexical-semantic cues to emotion. These results have implications for CI listeners' ability to perceive conflicts between prosodic and lexical-semantic cues, with repercussions for their identification of sarcasm and humor. 
The ability to understand instances of sarcasm or humor can affect a person's capacity to develop relationships, follow conversation, grasp a speaker's vocal emotion and intended message, follow jokes, and manage everyday communication in general.
11. Yu J, Liao Y, Wu S, Li Y, Huang M. Discourse Prosody in Children's Rhyme Speech Produced by Prelingually Deaf Mandarin-Speaking Children With Cochlear Implants. J Speech Lang Hear Res 2020; 63:1736-1751. [PMID: 32543941] [DOI: 10.1044/2020_jslhr-19-00333]
Abstract
Purpose This study aimed to obtain a comprehensive understanding of how Mandarin-speaking children with cochlear implants (CIs) produce speech prosody in connected discourse and to what extent their prosodic patterns differ from those of their normal-hearing (NH) peers. Method Fifteen prelingually deaf Mandarin-speaking children with unilateral multichannel CIs and 15 age-matched NH controls were recruited. Speech samples were spontaneously elicited in a children's rhyme speech genre and subjected to phonetic annotation. Acoustic analysis was conducted on all speech samples, focusing mainly on measurements of duration and fundamental frequency (F0). Tempo measures included temporal fluency, syllable lengthening, and rhythm metrics, whereas melodic measures included both local and global F0 variations across different prosodic domains. Results The CI children generally achieved temporal performance comparable to that of the NH children in spontaneous discourse, except that they applied lengthening and pausing strategies somewhat arbitrarily at different prosodic boundaries. With regard to melodic performance, the CI children may not have sufficiently modulated local phonetic nuances of F0 variation, and they also performed atypically in the global F0 declination and overall F0 resetting patterns, failing to signal the specific structure of children's rhyme discourse. Earlier age at implantation and longer CI experience did not play a significant role in the temporal performance of the CI children but did, to some extent, facilitate their articulation of dynamic pitch variation in spontaneous discourse. Conclusion CI children exhibited atypical prosodic patterns in discourse context, especially in the overall mapping between prosodic manifestation and discourse structure.
Affiliation(s)
- Jue Yu
- Center for Speech-Language Processing, School of Foreign Languages, Tongji University, Shanghai, China
- Yiyuan Liao
- Center for Speech-Language Processing, School of Foreign Languages, Tongji University, Shanghai, China
- Shengyi Wu
- Lab of Phonetics and Phonology, School of Humanities, Shaoxing University, Zhejiang, China
- Yun Li
- Department of Otolaryngology Head & Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, China
- Meiping Huang
- Department of Otolaryngology Head & Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, China
12. Christensen JA, Sis J, Kulkarni AM, Chatterjee M. Effects of Age and Hearing Loss on the Recognition of Emotions in Speech. Ear Hear 2020; 40:1069-1083. [PMID: 30614835] [PMCID: PMC6606405] [DOI: 10.1097/aud.0000000000000694]
Abstract
OBJECTIVES Emotional communication is a cornerstone of social cognition and informs human interaction. Previous studies have shown deficits in facial and vocal emotion recognition in older adults, particularly for negative emotions. However, few studies have examined combined effects of aging and hearing loss on vocal emotion recognition by adults. The objective of this study was to compare vocal emotion recognition in adults with hearing loss relative to age-matched peers with normal hearing. We hypothesized that age would play a role in emotion recognition and that listeners with hearing loss would show deficits across the age range. DESIGN Thirty-two adults (22 to 74 years of age) with mild to severe, symmetrical sensorineural hearing loss, amplified with bilateral hearing aids and 30 adults (21 to 75 years of age) with normal hearing, participated in the study. Stimuli consisted of sentences spoken by 2 talkers, 1 male, 1 female, in 5 emotions (angry, happy, neutral, sad, and scared) in an adult-directed manner. The task involved a single-interval, five-alternative forced-choice paradigm, in which the participants listened to individual sentences and indicated which of the five emotions was targeted in each sentence. Reaction time was recorded as an indirect measure of cognitive load. RESULTS Results showed significant effects of age. Older listeners had reduced accuracy, increased reaction times, and reduced d' values. Normal hearing listeners showed an Age by Talker interaction where older listeners had more difficulty identifying male vocal emotion. Listeners with hearing loss showed reduced accuracy, increased reaction times, and lower d' values compared with age-matched normal-hearing listeners. Within the group with hearing loss, age and talker effects were significant, and low-frequency pure-tone averages showed a marginally significant effect. Contrary to other studies, once hearing thresholds were taken into account, no effects of listener sex were observed, nor were there effects of individual emotions on accuracy. However, reaction times and d' values showed significant differences between individual emotions. CONCLUSIONS The results of this study confirm existing findings in the literature showing that older adults show significant deficits in voice emotion recognition compared with their normally hearing peers, and that among listeners with normal hearing, age-related changes in hearing do not predict this age-related deficit. The present results also add to the literature by showing that hearing impairment contributes additionally to deficits in vocal emotion recognition, separate from deficits related to age. These effects of age and hearing loss appear to be quite robust, being evident in reduced accuracy scores and d' measures, as well as in reaction time measures.
Affiliation(s)
- Julie A Christensen
- Auditory Prostheses and Perception Lab, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
- Jenni Sis
- Auditory Prostheses and Perception Lab, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
- Aditya M Kulkarni
- Auditory Prostheses and Perception Lab, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
- Monita Chatterjee
- Auditory Prostheses and Perception Lab, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
13. Neurophysiological Differences in Emotional Processing by Cochlear Implant Users, Extending Beyond the Realm of Speech. Ear Hear 2020; 40:1197-1209. [PMID: 30762600] [DOI: 10.1097/aud.0000000000000701]
Abstract
OBJECTIVE Cochlear implants (CIs) restore a sense of hearing in deaf individuals. However, they do not transmit the acoustic signal with sufficient fidelity, leading to difficulties in recognizing emotions in voice and in music. The study aimed to explore the neurophysiological bases of these limitations. DESIGN Twenty-two adults (18 to 70 years old) with CIs and 22 age-matched controls with normal hearing participated. Event-related potentials (ERPs) were recorded in response to emotional bursts (happy, sad, or neutral) produced in each modality (voice or music) that were for the most part correctly identified behaviorally. RESULTS Compared to controls, the N1 and P2 components were attenuated and prolonged in CI users. To a smaller degree, N1 and P2 were also attenuated and prolonged in music compared to voice, in both populations. The N1-P2 complex was emotion-dependent (e.g., reduced and prolonged response to sadness), but this was also true in both populations. In contrast, the later portion of the response, between 600 and 850 ms, differentiated happy and sad from neutral stimuli in normal hearing but not in CI listeners. CONCLUSIONS The early portion of the ERP waveform reflected primarily the general reduction in sensory encoding by CI users (largely due to CI processing itself), whereas altered emotional processing (by CI users) could be found in the later portion of the ERP and extended beyond the realm of speech.
14. Mehta AH, Lu H, Oxenham AJ. The Perception of Multiple Simultaneous Pitches as a Function of Number of Spectral Channels and Spectral Spread in a Noise-Excited Envelope Vocoder. J Assoc Res Otolaryngol 2020; 21:61-72. [PMID: 32048077] [DOI: 10.1007/s10162-019-00738-y]
Abstract
Cochlear implant (CI) listeners typically perform poorly on tasks involving the pitch of complex tones. This limitation in performance is thought to be mainly due to the restricted number of active channels and the broad current spread that leads to channel interactions and subsequent loss of precise spectral information, with temporal information limited primarily to temporal-envelope cues. Little is known about the degree of spectral resolution required to perceive combinations of multiple pitches, or a single pitch in the presence of other interfering tones in the same spectral region. This study used noise-excited envelope vocoders that simulate the limited resolution of CIs to explore the perception of multiple pitches presented simultaneously. The results show that the resolution required for perceiving multiple complex pitches is comparable to that found in a previous study using single complex tones. Although relatively high performance can be achieved with 48 channels, performance remained near chance when even limited spectral spread (with filter slopes as steep as 144 dB/octave) was introduced to the simulations. Overall, these tight constraints suggest that current CI technology will not be able to convey the pitches of combinations of spectrally overlapping complex tones.
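Several entries in this list (including this one) rely on noise-excited envelope vocoding to simulate CI processing: the signal is split into spectral channels, each channel's temporal envelope is extracted, and the envelopes modulate band-limited noise carriers that are then summed. The sketch below is a generic, minimal version for illustration only; the log-spaced filters, filter order, and Hilbert-envelope extraction are assumptions and do not reproduce the specific channel counts or 144 dB/octave slope conditions of the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=8, f_lo=80.0, f_hi=8000.0, order=4):
    """Minimal noise-excited envelope vocoder (illustrative, not the study's implementation)."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges (assumption;
                                                      # Greenwood/ERB spacing is also common)
    out = np.zeros_like(x, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)                          # analysis band
        env = np.abs(hilbert(band))                         # temporal envelope of the band
        carrier = sosfiltfilt(sos, np.random.randn(len(x))) # band-limited noise carrier
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-12)              # simple peak normalization
```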
Affiliation(s)
- Anahita H Mehta
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA.
- Hao Lu
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
15. Damm SA, Sis JL, Kulkarni AM, Chatterjee M. How Vocal Emotions Produced by Children With Cochlear Implants Are Perceived by Their Hearing Peers. J Speech Lang Hear Res 2019; 62:3728-3740. [PMID: 31589545] [PMCID: PMC7201339] [DOI: 10.1044/2019_jslhr-s-18-0497]
Abstract
Purpose Cochlear implants (CIs) transmit a degraded version of the acoustic input to the listener. This impacts the perception of harmonic pitch, resulting in deficits in the perception of voice features critical to speech prosody. Such deficits may relate to changes in how children with CIs (CCIs) learn to produce vocal emotions. The purpose of this study was to investigate happy and sad emotional speech productions by school-age CCIs, compared to productions by children with normal hearing (NH), postlingually deaf adults with CIs, and adults with NH. Method All individuals recorded the same emotion-neutral sentences in a happy manner and a sad manner. These recordings were then used as stimuli in an emotion recognition task performed by child and adult listeners with NH. Their performance was taken as a measure of how well the 4 groups of talkers communicated the 2 emotions. Results Results showed high variability in the identifiability of emotions produced by CCIs, relative to other groups. Some CCIs produced highly identifiable emotions, while others showed deficits. The postlingually deaf adults with CIs produced highly identifiable emotions and relatively small intersubject variability. Age at implantation was found to be a significant predictor of performance by CCIs. In addition, the NH listeners' age predicted how well they could identify the emotions produced by CCIs. Thus, older NH child listeners were better able to identify the CCIs' intended emotions than younger NH child listeners. In contrast to the deficits in their emotion productions, CCIs produced highly intelligible words in the sentences carrying the emotions. Conclusions These results confirm previous findings showing deficits in CCIs' productions of prosodic cues and indicate that early auditory experience plays an important role in vocal emotion productions by individuals with CIs.
Affiliation(s)
- Sara A. Damm
- Auditory Prostheses and Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
- Jenni L. Sis
- Auditory Prostheses and Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
- Department of Special Education and Communication Disorders, University of Nebraska–Lincoln Barkley Memorial Center
- Aditya M. Kulkarni
- Auditory Prostheses and Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
- Monita Chatterjee
- Auditory Prostheses and Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE
16. Chatterjee M, Kulkarni AM, Siddiqui RM, Christensen JA, Hozan M, Sis JL, Damm SA. Acoustics of Emotional Prosody Produced by Prelingually Deaf Children With Cochlear Implants. Front Psychol 2019; 10:2190. [PMID: 31632320] [PMCID: PMC6779094] [DOI: 10.3389/fpsyg.2019.02190]
Abstract
Purpose: Cochlear implants (CIs) provide reasonable levels of speech recognition quietly, but voice pitch perception is severely impaired in CI users. The central question addressed here relates to how access to acoustic input pre-implantation influences vocal emotion production by individuals with CIs. The objective of this study was to compare acoustic characteristics of vocal emotions produced by prelingually deaf school-aged children with cochlear implants (CCIs) who were implanted at the age of 2 and had no usable hearing before implantation with those produced by children with normal hearing (CNH), adults with normal hearing (ANH), and postlingually deaf adults with cochlear implants (ACI) who developed with good access to acoustic information prior to losing their hearing and receiving a CI. Method: A set of 20 sentences without lexically based emotional information was recorded by 13 CCI, 9 CNH, 9 ANH, and 10 ACI, each with a happy emotion and a sad emotion, without training or guidance. The sentences were analyzed for primary acoustic characteristics of the productions. Results: Significant effects of Emotion were observed in all acoustic features analyzed (mean voice pitch, standard deviation of voice pitch, intensity, duration, and spectral centroid). ACI and ANH did not differ in any of the analyses. Of the four groups, CCI produced the smallest acoustic contrasts between the emotions in voice pitch and emotions in its standard deviation. Effects of developmental age (highly correlated with the duration of device experience) and age at implantation (moderately correlated with duration of device experience) were observed, and interactions with the children's sex were also observed. Conclusion: Although prelingually deaf CCI and postlingually deaf ACI are listening to similar degraded speech and show similar deficits in vocal emotion perception, these groups are distinct in their productions of contrastive vocal emotions. The results underscore the importance of access to acoustic hearing in early childhood for the production of speech prosody and also suggest the need for a greater role of speech therapy in this area.
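The acoustic features analyzed above (mean F0, standard deviation of F0, intensity, duration, spectral centroid) can be estimated from a recording with standard tools. The sketch below uses librosa purely for illustration; the study does not specify its analysis software, and the F0 search range and the semitone-based F0 variability measure are assumptions.

```python
import numpy as np
import librosa  # assumed toolchain, chosen only for illustration

def emotion_acoustics(wav_path: str, fmin: float = 75.0, fmax: float = 600.0) -> dict:
    """Summarize one utterance with the acoustic features analyzed in the study above."""
    y, sr = librosa.load(wav_path, sr=None)
    f0, voiced, _ = librosa.pyin(y, fmin=fmin, fmax=fmax, sr=sr)
    f0 = f0[voiced & np.isfinite(f0)]  # keep voiced, finite F0 estimates
    return {
        "mean_f0_hz": float(np.mean(f0)),
        "sd_f0_st": float(np.std(12.0 * np.log2(f0 / np.mean(f0)))),  # F0 variability, in semitones
        "mean_rms": float(np.mean(librosa.feature.rms(y=y))),         # proxy for intensity
        "duration_s": float(len(y) / sr),
        "spectral_centroid_hz": float(np.mean(librosa.feature.spectral_centroid(y=y, sr=sr))),
    }
```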
Affiliation(s)
- Monita Chatterjee
- Auditory Prostheses and Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, United States
17. Children's Recognition of Emotional Prosody in Spectrally Degraded Speech Is Predicted by Their Age and Cognitive Status. Ear Hear 2019; 39:874-880. [PMID: 29337761] [DOI: 10.1097/aud.0000000000000546]
Abstract
OBJECTIVES It is known that school-aged children with cochlear implants show deficits in voice emotion recognition relative to normal-hearing peers. Little, however, is known about normal-hearing children's processing of emotional cues in cochlear implant-simulated, spectrally degraded speech. The objective of this study was to investigate school-aged, normal-hearing children's recognition of voice emotion, and the degree to which their performance could be predicted by their age, vocabulary, and cognitive factors such as nonverbal intelligence and executive function. DESIGN Normal-hearing children (6-19 years old) and young adults were tested on a voice emotion recognition task under three different conditions of spectral degradation using cochlear implant simulations (full-spectrum, 16-channel, and 8-channel noise-vocoded speech). Measures of vocabulary, nonverbal intelligence, and executive function were obtained as well. RESULTS Adults outperformed children on all tasks, and a strong developmental effect was observed. The children's age, the degree of spectral resolution, and nonverbal intelligence were predictors of performance, but vocabulary and executive functions were not, and no interactions were observed between age and spectral resolution. CONCLUSIONS These results indicate that cognitive function and age play important roles in children's ability to process emotional prosody in spectrally degraded speech. The lack of an interaction between the degree of spectral resolution and children's age further suggests that younger and older children may benefit similarly from improvements in spectral resolution. The findings imply that younger and older children with cochlear implants may benefit similarly from technical advances that improve spectral resolution.
18. A tonal-language benefit for pitch in normally-hearing and cochlear-implanted children. Sci Rep 2019; 9:109. [PMID: 30643156] [PMCID: PMC6331606] [DOI: 10.1038/s41598-018-36393-1]
Abstract
In tonal languages, voice pitch inflections change the meaning of words, such that the brain processes pitch not merely as an acoustic characterization of sound but as semantic information. In normally-hearing (NH) adults, this linguistic pressure on pitch appears to sharpen its neural encoding and can lead to perceptual benefits, depending on the task relevance, potentially generalizing outside of the speech domain. In children, however, linguistic systems are still malleable, meaning that their encoding of voice pitch information might not receive as much neural specialization but might generalize more easily to ecologically irrelevant pitch contours. This would seem particularly true for early-deafened children wearing a cochlear implant (CI), who must exhibit great adaptability to unfamiliar sounds as their sense of pitch is severely degraded. Here, we provide the first demonstration of a tonal language benefit in dynamic pitch sensitivity among NH children (using both a sweep discrimination and labelling task) which extends partially to children with CI (i.e., in the labelling task only). Strong age effects suggest that sensitivity to pitch contours reaches adult-like levels early in tonal language speakers (possibly before 6 years of age) but continues to develop in non-tonal language speakers well into the teenage years. Overall, we conclude that language-dependent neuroplasticity can enhance behavioral sensitivity to dynamic pitch, even in extreme cases of auditory degradation, but it is most easily observable early in life.
Collapse
|
19
|
Hawthorne K. Prosody-driven syntax learning is robust to impoverished pitch and spectral cues. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:2756. [PMID: 29857717 DOI: 10.1121/1.5031130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Across languages, prosodic boundaries tend to align with syntactic boundaries, and both infant and adult language learners capitalize on these correlations to jump-start syntax acquisition. However, it is unclear which prosodic cues (pauses, final-syllable lengthening, and/or pitch resets across boundaries) are necessary for prosodic bootstrapping to occur. It is also unknown how syntax acquisition is impacted when listeners do not have access to the full range of prosodic or spectral information. These questions were addressed using 14-channel noise-vocoded (spectrally degraded) speech. While pre-boundary lengthening and pauses are well-transmitted through noise-vocoded speech, pitch is not; overall intelligibility is also decreased. In two artificial grammar experiments, adult native English speakers showed a similar ability to use English-like prosody to bootstrap unfamiliar syntactic structures from degraded speech and natural, unmanipulated speech. Contrary to previous findings that listeners may require pitch resets and final lengthening to co-occur if no pause cue is present, participants in the degraded speech conditions were able to detect prosodic boundaries from lengthening alone. Results suggest that pitch is not necessary for adult English speakers to perceive prosodic boundaries associated with syntactic structures, and that prosodic bootstrapping is robust to degraded spectral information.
Collapse
Affiliation(s)
- Kara Hawthorne
- Department of Communication Sciences and Disorders, University of Mississippi, 304 George Hall, University, Mississippi 38677, USA
| |
Collapse
|
20
|
Başkent D, Fuller CD, Galvin JJ, Schepel L, Gaudrain E, Free RH. Musician effect on perception of spectro-temporally degraded speech, vocal emotion, and music in young adolescents. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:EL311. [PMID: 29857757 DOI: 10.1121/1.5034489] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In adult normal-hearing musicians, perception of music, vocal emotion, and speech in noise has previously been shown to be better than in non-musicians, sometimes even with spectro-temporally degraded stimuli. In this study, melodic contour identification, vocal emotion identification, and speech understanding in noise were measured in young adolescent normal-hearing musicians and non-musicians listening to unprocessed or degraded signals. Unlike in adults, there was no musician effect for vocal emotion identification or for speech in noise. Melodic contour identification with degraded signals was significantly better in musicians, suggesting potential benefits of music training for young cochlear-implant users, who experience similar spectro-temporal signal degradation.
Collapse
Affiliation(s)
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Christina D Fuller
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - John J Galvin
- House Ear Institute, Los Angeles, California 90057, USA
| | - Like Schepel
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Rolien H Free
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| |
Collapse
|
21
|
Nie Y, Galvin JJ, Morikawa M, André V, Wheeler H, Fu QJ. Music and Speech Perception in Children Using Sung Speech. Trends Hear 2018; 22:2331216518766810. [PMID: 29609496 PMCID: PMC5888806 DOI: 10.1177/2331216518766810] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training, participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and for speech in noise, but not for speech in quiet. MCI performance was significantly poorer with the mixed-timbre stimuli. Speech performance in noise was significantly poorer with the fixed- or mixed-pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet were significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for younger than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.
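For readers unfamiliar with melodic contour identification, the sketch below generates simple five-note contours as pure-tone sequences. The contour shapes, semitone spacing, note duration, and pure-tone timbre are assumptions for demonstration; the study itself used piano samples and sung speech rather than pure tones.

```python
# Toy melodic-contour generator (illustrative assumptions, not the study's stimuli).
import numpy as np

FS = 44100
CONTOURS = {
    "rising":         [0, 1, 2, 3, 4],
    "falling":        [4, 3, 2, 1, 0],
    "flat":           [0, 0, 0, 0, 0],
    "rising-falling": [0, 2, 4, 2, 0],
    "falling-rising": [4, 2, 0, 2, 4],
}

def contour_to_tones(steps, base_hz=220.0, semitones_per_step=2, note_dur=0.3, gap=0.05):
    """Render a contour as a sequence of pure tones; each step moves the pitch
    by a fixed number of semitones relative to the base frequency."""
    t = np.arange(int(note_dur * FS)) / FS
    ramp = np.minimum(1.0, np.minimum(t, note_dur - t) / 0.01)  # 10 ms on/off ramps
    silence = np.zeros(int(gap * FS))
    pieces = []
    for step in steps:
        f = base_hz * 2 ** (step * semitones_per_step / 12.0)
        pieces.append(np.sin(2 * np.pi * f * t) * ramp)
        pieces.append(silence)
    return np.concatenate(pieces)

# Example: stimulus = contour_to_tones(CONTOURS["rising-falling"])
```

In an MCI trial the listener hears one such sequence and labels its shape; degrading spectral resolution (as in the vocoder sketch earlier in this listing) makes the contours harder to distinguish.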
Collapse
Affiliation(s)
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
| | | | - Michael Morikawa
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
| | - Victoria André
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
| | - Harley Wheeler
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
| | - Qian-Jie Fu
- Department of Head and Neck Surgery, University of California-Los Angeles, CA, USA
| |
Collapse
|
22
|
Jiam NT, Caldwell M, Deroche ML, Chatterjee M, Limb CJ. Voice emotion perception and production in cochlear implant users. Hear Res 2017; 352:30-39. [PMID: 28088500 DOI: 10.1016/j.heares.2017.01.006] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/05/2016] [Revised: 12/14/2016] [Accepted: 01/06/2017] [Indexed: 10/20/2022]
Abstract
Voice emotion is a fundamental component of human social interaction and social development. Unfortunately, cochlear implant (CI) users are often forced to interface with highly degraded prosodic cues as a result of device constraints in extraction, processing, and transmission. As such, individuals with cochlear implants frequently demonstrate significant difficulty in recognizing voice emotions in comparison to their normal-hearing counterparts. CI-mediated perception and production of voice emotion is an important but relatively understudied area of research, yet a richer understanding of how voice emotion is processed by the auditory system offers opportunities to improve CI design and to develop training programs that benefit CI performance. In this review, we address the current literature, outstanding issues, and future directions for improved voice emotion processing in cochlear implant users.
Collapse
Affiliation(s)
- N T Jiam
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, School of Medicine, San Francisco, CA, USA
| | - M Caldwell
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, School of Medicine, San Francisco, CA, USA
| | - M L Deroche
- Centre for Research on Brain, Language and Music, McGill University, Montreal, QC, Canada
| | - M Chatterjee
- Auditory Prostheses and Perception Laboratory, Boys Town National Research Hospital, Omaha, NE, USA
| | - C J Limb
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, School of Medicine, San Francisco, CA, USA.
| |
Collapse
|